- The concept of processes and threads: what a process is, what a thread is, and typical application scenarios for multithreading
- Using processes: the `fork` function, the `multiprocessing` module, process pools, and inter-process communication
- Using the `threading` module: `Thread`, `RLock`, `Condition`, and thread pools
Processes and Threads
The computers we use today have entered the era of multiple CPUs and multiple cores, and the operating systems we run all support multitasking, which allows us to run several programs at the same time, or to decompose one program into relatively independent subtasks that execute concurrently, shortening the program's execution time and giving users a better experience. Therefore, whatever language you develop in, making a program perform multiple tasks at the same time, commonly known as "concurrent programming", should be one of a programmer's essential skills. To that end, we first need to discuss two concepts: the process and the thread.
A process is a program executing in the operating system. The operating system allocates storage space in units of processes: each process has its own address space, data stack, and other auxiliary data used to track its execution. The operating system manages the execution of all processes and allocates resources to them appropriately. A process can create new processes via fork or spawn to perform other tasks, but each new process has its own independent memory space, so data sharing between processes must go through an inter-process communication (IPC, Inter-Process Communication) mechanism, such as pipes, signals, sockets, or shared memory.
A process can also have multiple concurrent threads of execution; simply put, it can contain multiple execution units that the CPU can schedule, which is what we call threads. Because threads belong to the same process, they share its context, so sharing information and communicating is easier between threads than between processes. Of course, on a single-core CPU true parallelism is impossible, because only one thread can hold the CPU at any given moment and multiple threads share the CPU's execution time. The benefits of using multithreading for concurrency are self-evident: above all, it improves the program's performance and the user experience. Almost all the software we use today employs multithreading, which you can confirm with the system's process monitoring tool ("Activity Monitor" on macOS, "Task Manager" on Windows), as shown in the figure below.
Of course, multithreading has its drawbacks. From the perspective of other processes, a multithreaded program is less friendly to its neighbors, because it occupies more CPU time and leaves other programs unable to get enough of it. From the developer's perspective, writing and debugging multithreaded programs demands more skill, and it is harder for beginners.
Python supports both multiprocessing and multithreading, so there are three main approaches to concurrent programming in Python: multiple processes, multiple threads, and a combination of the two.
Unix and Linux operating systems provide the `fork()` system call to create processes. When the parent process calls `fork()`, a child process is created as a copy of the parent, but with its own PID. The `fork()` function is special in that it returns twice: in the parent process its return value is the PID of the child, while in the child process the return value is always 0. Python's os module provides a `fork()` function. Since Windows does not have the `fork()` call, cross-platform multi-process programming should use the `Process` class of the multiprocessing module to create child processes. That module also provides higher-level facilities, such as a process pool for launching processes in batches (`Pool`), and queues (`Queue`) and pipes (`Pipe`) for inter-process communication.
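The two return values of `fork()` can be seen in a minimal sketch (POSIX only; the printed messages are our own, not part of any API):

```python
import os

pid = os.fork()  # returns twice: 0 in the child, the child's PID in the parent
if pid == 0:
    print('Child process, PID: %d' % os.getpid())
    os._exit(0)  # terminate the child without running the parent-only code below
else:
    print('Parent process, PID: %d, created child %d' % (os.getpid(), pid))
    os.waitpid(pid, 0)  # reap the child so it does not become a zombie
```

Running this prints one line from each process; the order can vary, since both run concurrently after the fork.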
Let's use an example of downloading files to illustrate the difference between using multi-process and not using multi-process. Let's take a look at the following code first.
```python
from random import randint
from time import time, sleep


def download_task(filename):
    print('开始下载%s...' % filename)
    time_to_download = randint(5, 10)
    sleep(time_to_download)
    print('%s下载完成! 耗费了%d秒' % (filename, time_to_download))


def main():
    start = time()
    download_task('Python从入门到住院.pdf')
    download_task('Peking Hot.avi')
    end = time()
    print('总共耗费了%.2f秒.' % (end - start))


if __name__ == '__main__':
    main()
```
The following is the result of running the program.
```
开始下载Python从入门到住院.pdf...
Python从入门到住院.pdf下载完成! 耗费了6秒
开始下载Peking Hot.avi...
Peking Hot.avi下载完成! 耗费了7秒
总共耗费了13.01秒.
```
As the example above shows, if the code in a program can only execute step by step in order, then even two unrelated download tasks must wait on each other: the next download cannot start until the previous one finishes, which is clearly inefficient. Next we use multiple processes to put the two download tasks into separate processes; the code is as follows.
```python
from multiprocessing import Process
from os import getpid
from random import randint
from time import time, sleep


def download_task(filename):
    print('启动下载进程，进程号[%d].' % getpid())
    print('开始下载%s...' % filename)
    time_to_download = randint(5, 10)
    sleep(time_to_download)
    print('%s下载完成! 耗费了%d秒' % (filename, time_to_download))


def main():
    start = time()
    p1 = Process(target=download_task, args=('Python从入门到住院.pdf', ))
    p1.start()
    p2 = Process(target=download_task, args=('Peking Hot.avi', ))
    p2.start()
    p1.join()
    p2.join()
    end = time()
    print('总共耗费了%.2f秒.' % (end - start))


if __name__ == '__main__':
    main()
```
In the code above, we created process objects through the `Process` class. The `target` parameter takes a function representing the code to execute after the process starts, and `args` is a tuple of arguments to pass to that function. The `start` method of a `Process` object starts the process, and the `join` method waits for the process to finish. Running the code, you can clearly see that the two download tasks start "simultaneously", and the program's running time is greatly shortened, no longer the sum of the two tasks' times. The following is the result of one run of the program.
```
启动下载进程，进程号.
开始下载Python从入门到住院.pdf...
启动下载进程，进程号.
开始下载Peking Hot.avi...
Peking Hot.avi下载完成! 耗费了7秒
Python从入门到住院.pdf下载完成! 耗费了10秒
总共耗费了10.01秒.
```
We can also use classes and functions in the subprocess module to create and launch child processes and communicate with them through pipes. We will not cover that here; interested readers can explore it on their own. Next we focus on how two processes can communicate. We start two processes, one printing Ping and the other printing Pong, such that together they print a total of 10 times. That sounds simple, but writing it as follows would be wrong.
```python
from multiprocessing import Process
from time import sleep

counter = 0


def sub_task(string):
    global counter
    while counter < 10:
        print(string, end='', flush=True)
        counter += 1
        sleep(0.01)


def main():
    Process(target=sub_task, args=('Ping', )).start()
    Process(target=sub_task, args=('Pong', )).start()


if __name__ == '__main__':
    main()
```
It looks fine, but the final result is that Ping and Pong are each printed 10 times. Why? When we create a child process, it copies the parent process and all of its data structures, and each child has its own independent memory space. That means each of the two child processes has its own `counter` variable, so the outcome is predictable. A relatively simple way to solve this problem is to use the `Queue` class from the multiprocessing module, a queue that can be shared among multiple processes, implemented underneath with pipes and semaphores. Interested readers can try it out for themselves.
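For instance, the shared counter can be replaced by 10 "tickets" placed in a shared `Queue`; the two processes together consume exactly 10 of them. This is one possible sketch (the 0.1-second timeout is an arbitrary choice for detecting that the queue is drained):

```python
from multiprocessing import Process, Queue
from queue import Empty


def sub_task(string, queue):
    # keep taking tickets from the shared queue until none are left
    while True:
        try:
            queue.get(timeout=0.1)
        except Empty:
            break
        print(string, end=' ', flush=True)


def main():
    queue = Queue()
    for _ in range(10):
        queue.put(1)  # 10 tickets shared by both processes
    p1 = Process(target=sub_task, args=('Ping', queue))
    p2 = Process(target=sub_task, args=('Pong', queue))
    p1.start()
    p2.start()
    p1.join()
    p2.join()


if __name__ == '__main__':
    main()
```

Because every ticket is removed from the queue exactly once, the combined number of Ping and Pong lines is 10, however the two processes interleave.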
Early versions of Python introduced the thread module (now named _thread) for multithreaded programming, but it is too low-level and lacks many features, so for multithreaded development today we recommend the threading module, which provides a better object-oriented encapsulation of multithreading. Let's implement the file-download example again, this time with multiple threads.
```python
from random import randint
from threading import Thread
from time import time, sleep


def download(filename):
    print('开始下载%s...' % filename)
    time_to_download = randint(5, 10)
    sleep(time_to_download)
    print('%s下载完成! 耗费了%d秒' % (filename, time_to_download))


def main():
    start = time()
    t1 = Thread(target=download, args=('Python从入门到住院.pdf',))
    t1.start()
    t2 = Thread(target=download, args=('Peking Hot.avi',))
    t2.start()
    t1.join()
    t2.join()
    end = time()
    print('总共耗费了%.3f秒' % (end - start))


if __name__ == '__main__':
    main()
```
We can create threads directly with the `Thread` class of the threading module, but recall the very important concept of inheritance: new classes can be created from existing ones. So we can also define a custom thread class by inheriting from `Thread`, then create objects of that class and start the threads. The code is shown below.
```python
from random import randint
from threading import Thread
from time import time, sleep


class DownloadTask(Thread):

    def __init__(self, filename):
        super().__init__()
        self._filename = filename

    def run(self):
        print('开始下载%s...' % self._filename)
        time_to_download = randint(5, 10)
        sleep(time_to_download)
        print('%s下载完成! 耗费了%d秒' % (self._filename, time_to_download))


def main():
    start = time()
    t1 = DownloadTask('Python从入门到住院.pdf')
    t1.start()
    t2 = DownloadTask('Peking Hot.avi')
    t2.start()
    t1.join()
    t2.join()
    end = time()
    print('总共耗费了%.2f秒.' % (end - start))


if __name__ == '__main__':
    main()
```
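The learning objectives above also mention thread pools. Instead of creating one thread per task, a pool reuses a fixed number of worker threads; here is a minimal sketch using the standard concurrent.futures module (the filenames and the 0.1-second simulated delay are our own choices):

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep


def download(filename):
    # simulated download, as in the examples above, just much shorter
    sleep(0.1)
    return '%s downloaded' % filename


def main():
    # at most 2 worker threads run the submitted tasks
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(download, name) for name in ('a.pdf', 'b.avi')]
        for future in futures:
            print(future.result())  # blocks until that task finishes


if __name__ == '__main__':
    main()
```

A pool is the usual choice when the number of tasks is large or unknown, since it bounds the number of live threads.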
Because multiple threads share their process's memory space, communicating between threads is relatively simple. The most direct approach you can think of is a global variable that all threads share. But when multiple threads share the same variable (usually called a "resource"), the result is very likely to be uncontrolled, causing the program to misbehave or even crash. A resource that multiple threads compete for is usually called a "critical resource", and access to it must be protected; otherwise the resource ends up in an inconsistent state. The following example demonstrates 100 threads transferring money (1 yuan each) into the same bank account. Here the bank account is the critical resource, and without protection we are likely to get a wrong result.
```python
from time import sleep
from threading import Thread


class Account(object):

    def __init__(self):
        self._balance = 0

    def deposit(self, money):
        # compute the balance after the deposit
        new_balance = self._balance + money
        # simulate the deposit transaction taking 0.01 seconds
        sleep(0.01)
        # update the account balance
        self._balance = new_balance

    @property
    def balance(self):
        return self._balance


class AddMoneyThread(Thread):

    def __init__(self, account, money):
        super().__init__()
        self._account = account
        self._money = money

    def run(self):
        self._account.deposit(self._money)


def main():
    account = Account()
    threads = []
    # create 100 deposit threads that all pay into the same account
    for _ in range(100):
        t = AddMoneyThread(account, 1)
        threads.append(t)
        t.start()
    # wait for all deposit threads to finish
    for t in threads:
        t.join()
    print('账户余额为: ￥%d元' % account.balance)


if __name__ == '__main__':
    main()
```
Running the above program gives a surprising result: 100 threads each deposit 1 yuan into the account, yet the final balance is far less than 100 yuan. The reason is that we have not protected the "critical resource" of the bank account. When multiple threads deposit at the same time, they all execute the line `new_balance = self._balance + money` together: each of them reads the same stale balance (initially 0) and adds 1 to it, so a wrong result is obtained. This is where a "lock" comes in handy. We can protect a critical resource with a lock: only the thread that has acquired the lock may access the resource, while threads that have not acquired it block until the holder releases the lock, at which point another thread gets the chance to acquire it and access the protected resource in turn. The following code demonstrates how to use a lock to protect operations on the bank account and obtain the correct result.
```python
from time import sleep
from threading import Thread, Lock


class Account(object):

    def __init__(self):
        self._balance = 0
        self._lock = Lock()

    def deposit(self, money):
        # the lock must be acquired before the code below can run
        self._lock.acquire()
        try:
            new_balance = self._balance + money
            sleep(0.01)
            self._balance = new_balance
        finally:
            # release the lock in finally so it is freed on both
            # normal and exceptional exits
            self._lock.release()

    @property
    def balance(self):
        return self._balance


class AddMoneyThread(Thread):

    def __init__(self, account, money):
        super().__init__()
        self._account = account
        self._money = money

    def run(self):
        self._account.deposit(self._money)


def main():
    account = Account()
    threads = []
    for _ in range(100):
        t = AddMoneyThread(account, 1)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    print('账户余额为: ￥%d元' % account.balance)


if __name__ == '__main__':
    main()
```
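Since `Lock` objects support the context manager protocol, the acquire/try/finally/release pattern above can be written more concisely with a `with` statement. A simplified sketch of the same account (the simulated delay is dropped to keep it short):

```python
from threading import Lock, Thread


class Account:

    def __init__(self):
        self._balance = 0
        self._lock = Lock()

    def deposit(self, money):
        # "with" acquires the lock on entry and guarantees its release
        # on exit, even if an exception is raised inside the block
        with self._lock:
            self._balance += money

    @property
    def balance(self):
        return self._balance


account = Account()
threads = [Thread(target=account.deposit, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(account.balance)  # → 100
```

The `with` form is generally preferred because it makes forgetting to release the lock impossible.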
Unfortunately, Python's multithreading cannot take advantage of multiple CPU cores, which you can confirm by starting a few threads that each run an infinite loop. The reason is that the CPython interpreter has a "global interpreter lock" (GIL): a thread must hold the GIL to execute, and the interpreter periodically forces the running thread to release it (in old versions after every 100 bytecode instructions; in modern versions after a time interval) so that other threads get a chance to run. This is a historical legacy, but even so, as our earlier examples showed, multithreading still pays off in execution efficiency and user experience.
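A quick way to see the effect (a sketch; the loop count is arbitrary and exact timings vary by machine): a CPU-bound countdown run in two threads typically takes about as long as, or slightly longer than, running it twice sequentially, because under the GIL only one thread executes bytecode at a time.

```python
from threading import Thread
from time import time


def count_down(n):
    # pure CPU work: no I/O, so threads cannot overlap under the GIL
    while n > 0:
        n -= 1


N = 5_000_000

start = time()
count_down(N)
count_down(N)
sequential = time() - start

start = time()
t1 = Thread(target=count_down, args=(N,))
t2 = Thread(target=count_down, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time() - start

print('sequential: %.2fs, two threads: %.2fs' % (sequential, threaded))
```

With a genuinely parallel runtime you would expect the threaded run to take roughly half the time; under the GIL it does not.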
Whether you use multiple processes or multiple threads, once their number is large enough, efficiency stops improving. Why? An analogy: suppose you are unfortunate enough to be preparing for the high school entrance exam, and every evening you have homework in five subjects: Chinese, mathematics, English, physics, and chemistry, each taking an hour. If you spend an hour on Chinese, then an hour on math, and so on until all five are done in turn, it takes five hours in total; this is the single-task model. If instead you switch to a multitask model, doing one minute of Chinese, then one minute of math, then one minute of English, and so on, then as long as the switching is fast enough, this is exactly how a single-core CPU runs multiple tasks: to an outside observer, you are doing all five assignments at the same time.
But switching between assignments has a cost. To switch from Chinese to math, you first have to clear away the Chinese books and pens (this is called saving the context), then open the math textbook and dig out your compass and ruler (this is called preparing the new environment) before you can start on math. The operating system does the same when it switches processes or threads: it saves the current execution environment (CPU register state, memory pages, etc.), then sets up the new task's environment (restoring its register state, switching memory pages, etc.) before execution can resume. The switch is fast, but it still takes time. With thousands of tasks running at once, the operating system may spend most of its time switching tasks and have little left for running them. The typical symptoms are a thrashing hard disk, windows that stop responding, and a system that appears frozen. So once multitasking passes a certain limit, system performance drops sharply, and in the end no task gets done well.
The second consideration in whether to use multitasking is the type of task, which can be divided into computation-intensive and I/O-intensive. Computation-intensive tasks are characterized by heavy calculation that consumes CPU resources, such as video encoding/decoding or format conversion; such tasks depend entirely on the CPU's computing power. The more of them run at once, the more time is spent switching between them and the lower the CPU's effective throughput. Since computation-intensive tasks mainly consume CPU, their execution efficiency in a scripting language like Python is usually very low; C is best suited to them, and as mentioned earlier, Python provides mechanisms for embedding C/C++ code.
Besides computation-intensive tasks, tasks involving network or storage I/O can be regarded as I/O-intensive. Such tasks consume very little CPU; they spend most of their time waiting for I/O operations to complete (because I/O is far slower than the CPU and memory). For I/O-intensive tasks, multitasking can put the I/O waiting time to use and keep the CPU running efficiently. A large class of tasks are I/O-intensive, including network applications and web applications, which we will cover shortly.
Note: the content and examples above come from the "Python Tutorial" on Liao Xuefeng's official website. Since we disagree with some points in that article, the original wording has been adjusted accordingly.
The most important improvement of modern operating systems to I/O operations is to support asynchronous I/O. If you make full use of the asynchronous I/O support provided by the operating system, you can use a single-process single-thread model to perform multitasking. This new model is called an event-driven model. Nginx is a web server that supports asynchronous I/O. It uses a single-process model on a single-core CPU to efficiently support multitasking. On a multi-core CPU, you can run multiple processes (the number is the same as the number of CPU cores), making full use of the multi-core CPU. Server-side programs developed with Node.js also use this working mode, which is also a popular solution for concurrent programming.
In Python, the single-thread + asynchronous I/O programming model is called a coroutine. With coroutine support, efficient event-driven multitasking programs can be written. The biggest advantage of coroutines is extremely high execution efficiency: switching between subroutines is not thread switching but is controlled by the program itself, so there is no thread-switching overhead. Their second advantage is that no multithreaded locking mechanism is needed: with only one thread there are no conflicting writes to shared variables, so shared state needs only simple checks rather than locks, making execution much more efficient than with multiple threads. To fully exploit multiple CPU cores, the simplest approach is multi-process + coroutine, which uses every core while retaining the coroutine's efficiency, yielding very high performance. We will cover this in later lessons.
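A minimal sketch of this model with the standard asyncio module (the filenames and durations are made up): both "downloads" wait concurrently inside a single thread, so the total time is close to the longest task rather than the sum.

```python
import asyncio


async def download(filename, seconds):
    print('Start downloading %s...' % filename)
    await asyncio.sleep(seconds)  # simulated I/O wait; yields to the event loop
    print('%s finished in %d seconds' % (filename, seconds))


async def main():
    # schedule both coroutines and wait for them together
    await asyncio.gather(download('a.pdf', 1), download('b.avi', 2))


asyncio.run(main())
```

The whole run takes about 2 seconds rather than 3, with no threads or locks involved.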
In the interface shown below there are two buttons, "Download" and "About". Clicking "Download" simulates a 10-second online file download with sleep. Without multithreading, you will find that once "Download" is clicked, the rest of the program is blocked by this time-consuming task and cannot run, which is obviously a very poor user experience. The code is as follows.
```python
import time
import tkinter
import tkinter.messagebox


def download():
    # simulate a download task that takes 10 seconds
    time.sleep(10)
    tkinter.messagebox.showinfo('提示', '下载完成!')


def show_about():
    tkinter.messagebox.showinfo('关于', '作者: 骆昊(v1.0)')


def main():
    top = tkinter.Tk()
    top.title('单线程')
    top.geometry('200x150')
    top.wm_attributes('-topmost', True)

    panel = tkinter.Frame(top)
    button1 = tkinter.Button(panel, text='下载', command=download)
    button1.pack(side='left')
    button2 = tkinter.Button(panel, text='关于', command=show_about)
    button2.pack(side='right')
    panel.pack(side='bottom')

    tkinter.mainloop()


if __name__ == '__main__':
    main()
```
If the time-consuming task is executed in a separate thread, the main thread will no longer be blocked by it. The modified code is as follows.
```python
import time
import tkinter
import tkinter.messagebox
from threading import Thread


def main():

    class DownloadTaskHandler(Thread):

        def run(self):
            time.sleep(10)
            tkinter.messagebox.showinfo('提示', '下载完成!')
            # re-enable the download button
            button1.config(state=tkinter.NORMAL)

    def download():
        # disable the download button
        button1.config(state=tkinter.DISABLED)
        # the daemon parameter makes this a daemon thread
        # (it is not kept alive once the main program exits);
        # the time-consuming download runs in this thread
        DownloadTaskHandler(daemon=True).start()

    def show_about():
        tkinter.messagebox.showinfo('关于', '作者: 骆昊(v1.0)')

    top = tkinter.Tk()
    top.title('单线程')
    top.geometry('200x150')
    top.wm_attributes('-topmost', 1)

    panel = tkinter.Frame(top)
    button1 = tkinter.Button(panel, text='下载', command=download)
    button1.pack(side='left')
    button2 = tkinter.Button(panel, text='关于', command=show_about)
    button2.pack(side='right')
    panel.pack(side='bottom')

    tkinter.mainloop()


if __name__ == '__main__':
    main()
```
Let's complete the computationally intensive task of summing 1 to 100000000. This problem itself is very simple and can be solved with a little knowledge of loops. The code is as follows.
```python
from time import time


def main():
    total = 0
    number_list = [x for x in range(1, 100000001)]
    start = time()
    for number in number_list:
        total += number
    print(total)
    end = time()
    print('Execution time: %.3fs' % (end - start))


if __name__ == '__main__':
    main()
```
In the code above, I deliberately created a list and filled it with 100,000,000 numbers first; that step is itself time-consuming. So for fairness, when we split the task across 8 processes, we will for now ignore the time spent slicing the list and count only the time spent on the computation and on merging the results. The code is as follows.
```python
from multiprocessing import Process, Queue
from time import time


def task_handler(curr_list, result_queue):
    total = 0
    for number in curr_list:
        total += number
    result_queue.put(total)


def main():
    processes = []
    number_list = [x for x in range(1, 100000001)]
    result_queue = Queue()
    index = 0
    # start 8 processes, each computing on a slice of the data
    for _ in range(8):
        p = Process(target=task_handler,
                    args=(number_list[index:index + 12500000], result_queue))
        index += 12500000
        processes.append(p)
        p.start()
    # start timing how long all processes take to finish
    start = time()
    for p in processes:
        p.join()
    # merge the partial results
    total = 0
    while not result_queue.empty():
        total += result_queue.get()
    print(total)
    end = time()
    print('Execution time: ', (end - start), 's', sep='')


if __name__ == '__main__':
    main()
```
Comparing the execution results of the two pieces of code (on the MacBook I am currently using, the first version takes about 6 seconds, while the second takes less than 1 second; again, we compare only the computation time, not the time spent creating and slicing the list), the multi-process version gets more CPU execution time and makes better use of the CPU's multiple cores, so the program's running time drops significantly, and the larger the computation, the more obvious the effect. If you like, you can also deploy the processes on multiple computers to form distributed processes: the manager provided in the `multiprocessing.managers` module can share objects over the network (registering a `Queue` on the network so that other computers can access it). We leave that part for the topic of web crawlers.
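A minimal single-machine sketch of that idea (the port, authkey, and registered name are all made-up values; in a real deployment the client code would run on another host): a `BaseManager` subclass exposes a queue over the network, and a client connects to it with the same address and authkey.

```python
from multiprocessing.managers import BaseManager
from queue import Queue

task_queue = Queue()


class QueueManager(BaseManager):
    """Manager that shares task_queue over a network connection."""


# register the queue under a name that remote clients can look up
QueueManager.register('get_task_queue', callable=lambda: task_queue)


def serve():
    # bind the server; calling serve_forever() on it would then block,
    # handling requests from connecting clients
    manager = QueueManager(address=('127.0.0.1', 50000), authkey=b'secret')
    return manager.get_server()


def connect():
    # a client (possibly on another machine) connects with the same
    # address and authkey, then obtains a proxy to the shared queue
    manager = QueueManager(address=('127.0.0.1', 50000), authkey=b'secret')
    manager.connect()
    return manager.get_task_queue()
```

Anything the client `put`s into the proxy appears in the server's `task_queue`, which is what makes distributing work across machines possible.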