Introduction to Python Lecture 8
Summary
• Python exceptions
• Processes and Threads
• Programming with Threads
Python Exceptions
In Python, there are two distinguishable kinds of errors: syntax errors and exceptions. Syntax errors are also called parsing errors. Exceptions are events that can modify the flow of control through a program. Python triggers exceptions automatically on errors; they can then be intercepted and processed by the program, or propagated up to the calling functions. Exceptions allow jumping directly to a routine (handler) that can process the unexpected event.
Exception roles
• Error handling
• Event notification
• Special case handling
• Termination actions
• Unusual flow control
Statements processing exceptions
try/except    Catch and recover from exceptions raised by Python, or by you.
try/finally   Perform cleanup actions, whether exceptions occur or not.
raise         Trigger an exception manually in your code.
assert        Conditionally trigger an exception in your code.
with/as       Implement context managers in Python 2.6 and 3.0 (optional in 2.5).
The general format of the try statement
    try:
        <statements>            # Run this main action first
    except name1:
        <statements>            # Run if name1 is raised during try block
    except (name2, name3):
        <statements>            # Run if any of these exceptions occur
    except name4 as value:
        <statements>            # Run if name4 is raised, and get instance raised
    except:
        <statements>            # Run for all (other) exceptions raised
    else:
        <statements>            # Run if no exception was raised during try block
try statement clauses
Clause                             Comments
except:                            Catch all (or all other) exception types
except name:                       Catch a specific exception only
except name as value:              Catch the listed exception and its instance
except (name1, name2):             Catch any of the listed exceptions
except (name1, name2) as value:    Catch any listed exception and its instance
else:                              Run if no exceptions are raised
finally:                           Always perform this block
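As a minimal sketch of how these clauses combine, the hypothetical function safe_divide below (not one of the lecture's examples) records which clauses run for a successful call and for a failing one.

```python
def safe_divide(a, b):
    """Hypothetical helper: returns (result, list of clauses that ran)."""
    trace = []
    result = None
    try:
        result = a / b
    except ZeroDivisionError as err:
        # Runs only if the division raised ZeroDivisionError
        trace.append("except " + type(err).__name__)
    else:
        # Runs only if the try block raised nothing
        trace.append("else")
    finally:
        # Runs in both cases
        trace.append("finally")
    return result, trace


print(safe_divide(6, 3))   # (2.0, ['else', 'finally'])
print(safe_divide(1, 0))   # (None, ['except ZeroDivisionError', 'finally'])
```

The else clause keeps the "no error" path separate from the code being guarded, so only the division itself sits inside try.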
Catching an exception

    def getAtIndex(str, idx):
        return str[idx]


    print(getAtIndex("python", 7))
    print("Statement after catching exception...")

Output:
    Traceback (most recent call last):
      File "exception-1.py", line 5, in <module>
        print(getAtIndex("python", 7))
      File "exception-1.py", line 2, in getAtIndex
        return str[idx]
    IndexError: string index out of range
Notice that the statement after the function call is never executed
Catching the exception

    def getAtIndex(str, idx):
        return str[idx]

    try:
        print(getAtIndex("python", 7))
    except IndexError:
        print("Exception occurred")
    print("Statement after catching exception...")

Output:
    Exception occurred
    Statement after catching exception...
After handling the exception, the execution of the program is resumed
Raising an exception

    def getAtIndex(str, idx):
        # Valid indices are 0 .. len(str) - 1
        if idx >= len(str):
            raise IndexError
        return str[idx]

    try:
        print(getAtIndex("python", 7))
    except IndexError:
        print("Exception occurred")
    print("Statement after catching exception...")

Output:
    Exception occurred
    Statement after catching exception...
One can raise exceptions using the assert statement too:

    def assert_fct(x):
        assert x != 0
        print(x)

    assert_fct(0)

Output:
    Traceback (most recent call last):
      File "exception-5.py", line 5, in <module>
        assert_fct(0)
      File "exception-5.py", line 2, in assert_fct
        assert x != 0
    AssertionError
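Because assert statements are skipped when Python runs with the -O option, input validation is usually written with an explicit raise instead. A minimal sketch, assuming a hypothetical function reciprocal that is not part of the lecture:

```python
def reciprocal(x):
    # Hypothetical function: validates its argument with an explicit raise,
    # which stays active even when assertions are disabled.
    if x == 0:
        raise ValueError("x must be non-zero")
    return 1 / x


try:
    reciprocal(0)
except ValueError as err:
    message = str(err)
    print("Invalid argument:", message)
```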
Creating a custom exception

    class CustomException(Exception):
        def __str__(self):
            return "CustomException representation..."

    def getAtIndex(str, idx):
        # Valid indices are 0 .. len(str) - 1
        if idx >= len(str):
            raise CustomException
        return str[idx]

    try:
        print(getAtIndex("python", 7))
    except CustomException as e:
        print("Exception occurred {}".format(e))

Output:
    Exception occurred CustomException representation...
Custom exceptions
The recommended way of creating custom exceptions is object-oriented: using classes and class hierarchies. The most important advantages of class-based exceptions are:
• They can be organized in categories (adding future categories will not require changes in the try block)
• They have state information attached
• They support inheritance
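These three points can be sketched with a small hierarchy. The names AppError, ConfigError and load_setting are hypothetical and serve only as illustration.

```python
class AppError(Exception):
    """Hypothetical base class: one category of errors."""


class ConfigError(AppError):
    """A more specific error that carries state (the offending key)."""

    def __init__(self, key):
        super().__init__("missing configuration key: {}".format(key))
        self.key = key


def load_setting(settings, key):
    if key not in settings:
        raise ConfigError(key)
    return settings[key]


try:
    load_setting({"host": "localhost"}, "port")
except AppError as err:   # catches ConfigError and any future subclass
    caught = err          # keep a reference; err is unbound after the block
    print("Caught {}: {} (key={})".format(type(err).__name__, err, err.key))
```

The handler names only the category (AppError), so new subclasses can be added later without touching the try statement.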
Termination Actions
Let's suppose we need to deal with a file: open it, do some processing, then close it.

    class MyException(Exception):
        pass

    def process_file(file):
        raise MyException

    try:
        f = open("../lab3/morse.txt", "r")
        process_file(f)
        print("after file processing...")
    finally:
        f.close()
        print("finally statement reached...")

Output:
    finally statement reached...
    Traceback (most recent call last):
      File "exception-4.py", line 9, in <module>
        process_file(f)
      File "exception-4.py", line 5, in process_file
        raise MyException
    __main__.MyException
The try/finally construction ensures that the statements in the finally block are executed regardless of what happens in the try block.
This is extremely useful when a program uses various resources that must be released regardless of what events occur during execution.
Unified try/except/finally
    class MyException(Exception):
        def __str__(self):
            return "MyException"

    def process_file(file):
        raise MyException

    try:
        f = open("../lab3/morse.txt", "r")
        process_file(f)
        print("after file processing...")
    except MyException as me:
        print("Exception occurred {}".format(me))
    finally:
        f.close()
        print("finally statement reached...")

Output:
    Exception occurred MyException
    finally statement reached...
In this case, the exception raised in the process_file() function is caught and processed by the handler defined in the except block, then the code in the finally block is executed. In this way we make sure that all the resources used by the program (file handles, database connections, sockets, etc.) are released.
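The same cleanup can be expressed with the with/as statement listed earlier, since file objects are context managers. This is a sketch under stated assumptions: a temporary file stands in for the lecture's input file, and the exception message is made up for illustration. Unlike the try/finally version, it does not reference f in a cleanup block if open() itself fails.

```python
import os
import tempfile


class MyException(Exception):
    pass


def process_file(file):
    raise MyException("processing failed")


# Assumption: a temporary file replaces the lecture's morse.txt
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("... --- ...\n")
    path = tmp.name

try:
    with open(path, "r") as f:   # f is closed when the block exits, even on error
        process_file(f)
except MyException as me:
    print("Exception occurred: {}".format(me))

print("file closed:", f.closed)
os.remove(path)
```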
C vs. Python error handling
C style

    doStuff()
    {
        if (doFirstThing() == ERROR)
            return ERROR;
        if (doNextThing() == ERROR)
            return ERROR;
        ...
        return doLastThing();
    }

    main()
    {
        if (doStuff() == ERROR)
            badEnding();
        else
            goodEnding();
    }

Python style

    def doStuff():
        doFirstThing()
        doNextThing()
        ...
        doLastThing()

    if __name__ == '__main__':
        try:
            doStuff()
        except:
            badEnding()
        else:
            goodEnding()
Processes and Threads
Processes context switching
How multiple processes share a CPU
(image from http://www.w3ii.com/en-US/operating_system/os_process_scheduling.html)
Two states for processes:
• Run
• Sleep
PCB - Process Control Block
Relation between threads and processes
[image taken from http://www.javamex.com/tutorials/threads/how_threads_work.shtml]
Python Threads
A thread, or thread of execution, is defined in computer science as the smallest unit that can be scheduled by an operating system. There are two different kinds of threads:
• Kernel threads
• User-space threads (user threads)
(image from http://www.python-course.eu/threads.php)
Threads
Every process has at least one thread, i.e. the process itself. A process can start multiple threads. The operating system executes these threads like parallel "processes". On a single processor machine, this parallelism is achieved by thread scheduling or timeslicing.
Advantages of threading:
• Multithreaded programs can run faster on computer systems with multiple CPUs, because these threads can be executed truly concurrently.
• A program can remain responsive to input. This is true on both single-CPU and multi-CPU systems.
• Threads of a process can share the memory of global variables. If a global variable is changed in one thread, this change is visible to all threads. A thread can also have local variables.
Threads in Python
Python has a Global Interpreter Lock (GIL). It imposes various restrictions on threads; one of the most important is that threads running Python code cannot utilize multiple CPUs at the same time.
Consider the following piece of code:

    def count(n):
        while n > 0:
            n -= 1

Two scenarios:
• Run it twice in series:

    count(100000000)
    count(100000000)

• Now, run it in parallel in two threads:

    from threading import Thread

    t1 = Thread(target=count, args=(100000000,))
    t1.start()
    t2 = Thread(target=count, args=(100000000,))
    t2.start()
    t1.join(); t2.join()

Some performance results on a dual-core MacBook:
    Sequential : 24.6s
    Threaded   : 45.5s (1.8X slower!)

And if one of the CPU cores is disabled, why does the threaded performance get better?
    Threaded   : 38.0s
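The two scenarios can be timed with a small harness such as the sketch below. The helper names time_sequential and time_threaded are hypothetical, and N is deliberately much smaller than the slide's value so the run stays short. The numbers depend on the machine and the Python version, so no expected timings are given.

```python
import time
from threading import Thread


def count(n):
    while n > 0:
        n -= 1


def time_sequential(n):
    """Run count(n) twice in series and return the elapsed seconds."""
    start = time.perf_counter()
    count(n)
    count(n)
    return time.perf_counter() - start


def time_threaded(n):
    """Run count(n) in two threads and return the elapsed seconds."""
    start = time.perf_counter()
    t1 = Thread(target=count, args=(n,))
    t2 = Thread(target=count, args=(n,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return time.perf_counter() - start


N = 1_000_000  # assumption: reduced from the slide's 100000000
print("Sequential: {:.3f}s".format(time_sequential(N)))
print("Threaded  : {:.3f}s".format(time_threaded(N)))
```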
GIL in Python
Image taken from: https://callhub.io/understanding-python-gil/
Python Threads
Thread-Specific State
• Each thread has its own interpreter-specific data structure (PyThreadState):
    - Current stack frame (for Python code)
    - Current recursion depth
    - Thread ID
    - Some per-thread exception information
    - Optional tracing/profiling/debugging hooks
• It's a small C structure (<100 bytes)
The interpreter has a global variable that simply points to the ThreadState structure of the currently running thread
Python Threads
• Only one Python thread can execute in the interpreter at once
• There is a "global interpreter lock" that carefully controls thread execution
• The GIL ensures that each thread gets exclusive access to the interpreter internals when it's running (and that call-outs to C extensions play nice)
Using _thread package

    import _thread

    def print_chars(thread_name, char):
        for i in range(5):
            print("{} {}".format(thread_name, i * char))
        print("{} successfully ended".format(thread_name))

    try:
        print("Starting main program...")
        _thread.start_new_thread(print_chars, ("Thread1 ", "*"))
        _thread.start_new_thread(print_chars, ("Thread2 ", "-"))
        _thread.start_new_thread(print_chars, ("Thread3 ", "="))
        print("Ending main program...")
    except:
        print("Threads are unable to start")

    # Keep the main thread alive so the worker threads can finish
    while 1:
        pass

A possible output (the interleaving varies between runs):
    Starting main program...
    Ending main program...
    Thread1
    Thread1  *
    Thread1  **
    Thread1  ***
    Thread1  ****
    Thread1  successfully ended
    Thread2
    Thread2  -
    Thread2  --
    Thread3
    Thread3  =
    Thread3  ==
    Thread3  ===
    Thread3  ====
    Thread3  successfully ended
    Thread2  ---
    Thread2  ----
    Thread2  successfully ended

_thread is a low-level module; the higher-level threading module, which is built on top of it, is generally preferred. The _thread.start_new_thread function was used here to start a new thread. Notice the while loop at the end of the main program, looping forever. It prevents the main program from exiting right after starting the three threads: on most systems, threads started this way are killed when the main thread exits, so their output could be lost. See the program output and notice that the threads' output is displayed after the main program prints its final message.
First threading program

    import threading

    class ThreadExample(threading.Thread):
        def __init__(self, name, char):
            threading.Thread.__init__(self)
            self.name = name
            self.char = char

        def run(self):
            print("Starting thread {} ".format(self.name))
            print_chars(self.name, self.char)
            print("Ending thread {} ".format(self.name))

    def print_chars(thread_name, char):
        for i in range(5):
            print("{} {}".format(thread_name, i * char))

    if __name__ == "__main__":
        thread_one = ThreadExample("Thread1", "*")
        thread_two = ThreadExample("Thread2", "-")
        thread_three = ThreadExample("Thread3", "=")
        thread_one.start()
        thread_two.start()
        thread_three.start()
        print("Exiting the main program...")

Output:
    Starting thread Thread1
    Thread1
    Thread1 *
    Thread1 **
    Thread1 ***
    Thread1 ****
    Ending thread Thread1
    Starting thread Thread2
    Thread2
    Thread2 -
    Thread2 --
    Thread2 ---
    Thread2 ----
    Ending thread Thread2
    Starting thread Thread3
    Exiting the main program...
    Thread3
    Thread3 =
    Thread3 ==
    Thread3 ===
    Thread3 ====
    Ending thread Thread3

What happened here?
Using join()

    import threading

    class ThreadExample(threading.Thread):
        def __init__(self, name, char):
            threading.Thread.__init__(self)
            self.name = name
            self.char = char

        def run(self):
            print("Starting thread {} ".format(self.name))
            print_chars(self.name, self.char)
            print("Ending thread {} ".format(self.name))

    def print_chars(thread_name, char):
        for i in range(5):
            print("{} {}".format(thread_name, i * char))

    if __name__ == "__main__":
        thread_one = ThreadExample("Thread1", "*")
        thread_two = ThreadExample("Thread2", "-")
        thread_three = ThreadExample("Thread3", "=")
        thread_one.start()
        thread_two.start()
        thread_three.start()
        # Wait for all three threads before continuing
        thread_one.join()
        thread_two.join()
        thread_three.join()
        print("Exiting the main program...")

Output:
    Starting thread Thread1
    Thread1
    Thread1 *
    Thread1 **
    Thread1 ***
    Thread1 ****
    Ending thread Thread1
    Starting thread Thread2
    Thread2
    Thread2 -
    Thread2 --
    Thread2 ---
    Thread2 ----
    Ending thread Thread2
    Starting thread Thread3
    Thread3
    Thread3 =
    Thread3 ==
    Thread3 ===
    Thread3 ====
    Ending thread Thread3
    Exiting the main program...

Note that now the main program ends AFTER all the threads finish!
Determining current thread

    import threading
    import random

    class ThreadExample(threading.Thread):
        def __init__(self, name = None, char = '+'):
            threading.Thread.__init__(self)
            if name is None:
                self.name = 'Thread' + str(random.randint(1, 100))
            else:
                self.name = name
            self.char = char

        def run(self):
            print("Starting thread {} ".format(self.name))
            # current_thread() returns the Thread object that is running this code
            print_chars(threading.current_thread().name, self.char)
            print("Ending thread {} ".format(self.name))

    def print_chars(thread_name, char):
        for i in range(5):
            print("{} {}".format(thread_name, i * char))

    if __name__ == "__main__":
        thread_one = ThreadExample(char = "*")
        thread_two = ThreadExample(char = "-")
        thread_three = ThreadExample(char = "=")
        thread_one.start()
        thread_two.start()
        thread_three.start()
        thread_one.join()
        thread_two.join()
        thread_three.join()
        print("Exiting the main program...")

Output:
    Starting thread Thread37
    Thread37
    Thread37 *
    Thread37 **
    Thread37 ***
    Thread37 ****
    Ending thread Thread37
    Starting thread Thread4
    Thread4
    Thread4 -
    Thread4 --
    Thread4 ---
    Thread4 ----
    Ending thread Thread4
    Starting thread Thread84
    Thread84
    Thread84 =
    Thread84 ==
    Thread84 ===
    Thread84 ====
    Ending thread Thread84
    Exiting the main program...
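threading.current_thread() also works inside a plain function, without subclassing Thread. In this minimal sketch the function report and the thread name "Worker-A" are hypothetical; the same function is called once from a worker thread and once from the main thread.

```python
import threading

seen = []


def report():
    # Record the name of whichever thread executes this call
    seen.append(threading.current_thread().name)


worker = threading.Thread(target=report, name="Worker-A")
worker.start()
worker.join()

report()       # called directly, so it runs in the calling (main) thread
print(seen)
```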
Thread synchronization
Thread synchronization is defined as a mechanism which ensures that two or more concurrent processes or threads do not simultaneously execute some particular program segment known as critical section.
Synchronizing threads
We want to increment the variable count from two distinct threads. For this, we define a function that increments that variable and pass it to the Thread constructor.

    import threading

    count = 0

    def increment_count(times):
        global count
        for i in range(1, times):
            count += 1
        return count

    if __name__ == "__main__":
        t1 = threading.Thread(target=increment_count, args=[1000001])
        t2 = threading.Thread(target=increment_count, args=[1000001])
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        print(count)

Output (three separate runs):
    1285489
    Process finished with exit code 0

    1357161
    Process finished with exit code 0

    1249069
    Process finished with exit code 0

But, wait! What happens with the output? We expect 2000000, but every time the result is different. Not really what we wanted, right?

The statement count += 1 is equivalent to count = count + 1, which is not atomic: a thread reads count from memory, increments the value by 1, and writes the result back. If the other thread reads count from memory before that write, both threads store the same value and one increment is lost.
Synchronizing threads
This time we add a lock around the critical section represented by the counter increment.

    import threading

    count = 0

    def increment_count(times):
        global count
        for i in range(1, times):
            lock.acquire()
            count += 1          # critical section
            lock.release()
        return count

    if __name__ == "__main__":
        lock = threading.Lock()
        t1 = threading.Thread(target=increment_count, args=[1000001])
        t2 = threading.Thread(target=increment_count, args=[1000001])
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        print(count)

Output (three separate runs):
    2000000
    Process finished with exit code 0

    2000000
    Process finished with exit code 0

    2000000
    Process finished with exit code 0

This time, the result is always the same and the correct one.
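A Lock is also a context manager, so the acquire/release pair can be written as a with block, which releases the lock even if the body raises. A minimal sketch with a smaller iteration count than the slide's, chosen only to keep the run short:

```python
import threading

count = 0
lock = threading.Lock()


def increment_count(times):
    global count
    for _ in range(times):
        with lock:          # acquired here, released when the block exits
            count += 1


threads = [threading.Thread(target=increment_count, args=(100000,))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(count)  # 200000
```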
Synchronizing threads
We want to print some info about the running threads: first the info about one thread, then the info about a second thread.

    import threading
    import time

    class myThread(threading.Thread):
        def __init__(self, threadID, name, counter):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.name = name
            self.counter = counter

        def run(self):
            print("Starting " + self.name)
            print_time(self.name, self.counter, 3)

    def print_time(threadName, delay, counter):
        while counter:
            time.sleep(delay)
            print("%s: %s" % (threadName, time.ctime(time.time())))
            counter -= 1

    threads = []

    # Create new threads
    thread1 = myThread(1, "Thread-1", 1)
    thread2 = myThread(2, "Thread-2", 2)

    # Start new Threads
    thread1.start()
    thread2.start()

    # Add threads to thread list
    threads.append(thread1)
    threads.append(thread2)

    # Wait for all threads to complete
    for t in threads:
        t.join()
    print("Exiting Main Thread")

Output:
    Starting Thread-1
    Starting Thread-2
    Thread-1: Thu Nov 24 18:20:49 2016
    Thread-2: Thu Nov 24 18:20:50 2016
    Thread-1: Thu Nov 24 18:20:50 2016
    Thread-1: Thu Nov 24 18:20:51 2016
    Thread-2: Thu Nov 24 18:20:52 2016
    Thread-2: Thu Nov 24 18:20:54 2016
    Exiting Main Thread

But, wait! What happens with the output? The printed lines of the two threads are interleaved. Not really what we wanted, right?
Synchronizing threads

    import threading
    import time

    class myThread(threading.Thread):
        def __init__(self, threadID, name, counter):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.name = name
            self.counter = counter

        def run(self):
            print("Starting " + self.name)
            # Get lock to synchronize threads
            threadLock.acquire()
            print_time(self.name, self.counter, 3)    # critical section
            # Free lock to release next thread
            threadLock.release()

    def print_time(threadName, delay, counter):
        while counter:
            time.sleep(delay)
            print("%s: %s" % (threadName, time.ctime(time.time())))
            counter -= 1

    threadLock = threading.Lock()
    threads = []

    # Create new threads
    thread1 = myThread(1, "Thread-1", 1)
    thread2 = myThread(2, "Thread-2", 2)

    # Start new Threads
    thread1.start()
    thread2.start()

    # Add threads to thread list
    threads.append(thread1)
    threads.append(thread2)

    # Wait for all threads to complete
    for t in threads:
        t.join()
    print("Exiting Main Thread")

Output:
    Starting Thread-1
    Starting Thread-2
    Thread-1: Thu Nov 24 18:26:59 2016
    Thread-1: Thu Nov 24 18:27:00 2016
    Thread-1: Thu Nov 24 18:27:01 2016
    Thread-2: Thu Nov 24 18:27:03 2016
    Thread-2: Thu Nov 24 18:27:05 2016
    Thread-2: Thu Nov 24 18:27:07 2016
    Exiting Main Thread

This time one can see that the output looks much better.
Queue data structure
A queue is an abstract data structure, open at both ends. One end is always used to insert data into the structure, while the other end is used to extract data from it, so items leave in the order in which they arrived (first in, first out).
The data insertion operation is called enqueueing, while the extraction of the data from the queue is called dequeueing.
Multithread Queues in Python
The queue module (named Queue in Python 2) provides a FIFO implementation suitable for multithreaded programming. It can be used to pass messages or other data between producer and consumer threads safely. Locking is handled for the caller, so it is simple to have as many threads as you want working with the same Queue instance.
• get(): removes and returns an item from the queue.
• put(): adds an item to the queue.
• qsize(): returns the number of items that are currently in the queue.
• empty(): returns True if the queue is empty; otherwise, False.
• full(): returns True if the queue is full; otherwise, False.
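Since the Queue object does its own locking, producer and consumer threads can share it without an extra Lock. The sketch below illustrates this with a blocking get() and a sentinel value; the names worker, STOP and results are hypothetical and not part of the lecture's code.

```python
import queue
import threading

work_queue = queue.Queue(maxsize=10)
results = queue.Queue()
STOP = None  # hypothetical sentinel that tells a worker to exit


def worker(name):
    while True:
        item = work_queue.get()      # blocks until an item is available
        if item is STOP:
            break
        results.put((name, item))


workers = [threading.Thread(target=worker, args=("Thread-{}".format(i),))
           for i in range(1, 4)]
for w in workers:
    w.start()

for word in ["One", "Two", "Three", "Four", "Five"]:
    work_queue.put(word)
for _ in workers:
    work_queue.put(STOP)             # one sentinel per worker

for w in workers:
    w.join()

# All workers have finished, so draining from a single thread is safe
processed = []
while not results.empty():
    processed.append(results.get())
print(sorted(item for _, item in processed))
```

Which thread handles which word varies between runs, so only the set of processed items is predictable.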
Multithread Queues

    import queue
    import threading
    import time

    exitFlag = 0

    class myThread(threading.Thread):
        def __init__(self, threadID, name, q):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.name = name
            self.q = q

        def run(self):
            print("Starting " + self.name)
            process_data(self.name, self.q)
            print("Exiting " + self.name)

    def process_data(threadName, q):
        while not exitFlag:
            queueLock.acquire()
            if not workQueue.empty():
                data = q.get()
                queueLock.release()
                print("%s processing %s" % (threadName, data))
            else:
                queueLock.release()
            time.sleep(1)

    threadList = ["Thread-1", "Thread-2", "Thread-3"]
    nameList = ["One", "Two", "Three", "Four", "Five"]
    queueLock = threading.Lock()
    workQueue = queue.Queue(10)
    threads = []
    threadID = 1

    # Create new threads
    for tName in threadList:
        thread = myThread(threadID, tName, workQueue)
        thread.start()
        threads.append(thread)
        threadID += 1

    # Fill the queue
    queueLock.acquire()
    for word in nameList:
        workQueue.put(word)
    queueLock.release()

    # Wait for queue to empty
    while not workQueue.empty():
        pass

    # Notify threads it's time to exit
    exitFlag = 1

    # Wait for all threads to complete
    for t in threads:
        t.join()
    print("Exiting Main Thread")

Output:
    Starting Thread-1
    Starting Thread-2
    Starting Thread-3
    Thread-1 processing One
    Thread-2 processing Two
    Thread-3 processing Three
    Thread-1 processing Four
    Thread-3 processing Five
    Exiting Thread-1
    Exiting Thread-2
    Exiting Thread-3
    Exiting Main Thread
References
Mark Lutz - Learning Python, O’Reilly
Queue – A thread-safe FIFO implementation - https://pymotw.com/2/Queue/
Python 3 - Multithreaded programming - https://www.tutorialspoint.com/python3/python_multithreading.htm
Inside the Python GIL - http://www.dabeaz.com/python/GIL.pdf
Operating System Tutorial - http://www.w3ii.com/en-US/operating_system/os_process_scheduling.html