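Python Process Pools and Thread Pools
Process pool: multiprocessing.Pool()
This is multiprocessing, not multithreading, so if the first argument to Pool() (the number of worker processes) is larger than the number of CPU cores, CPU-bound work may actually run slower!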
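One way to act on this is to cap the pool size at the core count. The sketch below is only an illustration: pick_pool_size and square are hypothetical helper names that do not come from the original post, and capping at os.cpu_count() is an assumption that suits CPU-bound tasks.

```python
import os
from multiprocessing import Pool


def pick_pool_size(requested, cpu_count=None):
    """Cap the requested worker count at the number of CPU cores (hypothetical helper)."""
    if cpu_count is None:
        # os.cpu_count() can return None when the count cannot be determined.
        cpu_count = os.cpu_count() or 1
    return max(1, min(requested, cpu_count))


def square(n):
    return n * n


if __name__ == '__main__':
    size = pick_pool_size(16)
    with Pool(size) as pool:
        print(size, pool.map(square, range(5)))
```

For tasks that mostly wait on I/O, a larger pool can still be reasonable, so treat the cap as a starting point rather than a rule.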
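I. Four methods
A. Asynchronous, non-blocking: the call returns immediately without waiting for the submitted task to finish; the operating system schedules the worker processes and switches between them as it sees fit.
1. apply_async
Example: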
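import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)  # task 0 is deliberately slow
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = mp(5)  # pool of 5 worker processes
    for i in range(3):
        pool.apply_async(run, (i,))  # returns immediately
    print('non-blocking~~~~')
    # No close()/join() here, so the main process does not wait for the tasks.
    print('end')
Output:
start
non-blocking~~~~
end
num is 0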
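Explanation: apply_async only submits the task and returns immediately, so the main process never waits for the workers. Its few remaining lines finish almost instantly and the program exits. The pool's workers are daemon processes, so they are terminated together with the main process, and any task that has not finished (or has not even started) is lost. How many worker lines get printed before that happens depends on timing.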
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)  # task 0 is deliberately slow
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = mp(5)
    for i in range(3):
        pool.apply_async(run, (i,))
    print('non-blocking~~~~')
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for all workers to finish
    print('end')
Output (the order of the worker lines may vary from run to run):
start
non-blocking~~~~
num is 0
num is 1
1 is end
num is 2
2 is end
0 is end
end
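Explanation:
pool.close()
pool.join()
These two calls make the main process wait until all worker processes have finished before running the rest of the program, that is, the code after pool.join(). close() stops the pool from accepting new tasks, and join() waits for the workers to exit.
Note: join() must be called after close() (or terminate()).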
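close() and join() only wait; they do not hand back what the tasks returned. When the return values matter, one option is to keep the AsyncResult objects that apply_async returns and call get() on each. The following is a minimal sketch under that assumption; square and collect_squares are hypothetical names, not part of the original examples.

```python
from multiprocessing import Pool


def square(n):
    return n * n


def collect_squares(pool, numbers):
    """Submit one task per number, then wait for every result (hypothetical helper)."""
    # apply_async returns an AsyncResult right away; nothing blocks here.
    pending = [pool.apply_async(square, (n,)) for n in numbers]
    # get() blocks until that task is done and returns its value
    # (or re-raises the exception the task raised).
    return [result.get(timeout=10) for result in pending]


if __name__ == '__main__':
    with Pool(3) as pool:
        print(collect_squares(pool, [0, 1, 2]))
```

Because every get() is called before the with block exits, the tasks are finished by the time the pool is torn down.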
2. map_async
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)  # task 0 is deliberately slow
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = mp(5)
    num_list = [0, 1, 2]
    pool.map_async(run, num_list)  # submits every item, returns immediately
    print('non-blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Output (the order of the worker lines may vary from run to run):
start
non-blocking~~~~
num is 0
num is 1
1 is end
num is 2
2 is end
0 is end
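end
B. Blocking: the call does not return until its work is done. With apply, each call waits for its task to finish before the next one is submitted: of the three tasks 0, 1 and 2, task 0 runs to completion, then task 1, then task 2, and only then does the main process continue with the rest of its code, as the apply output below shows. With map, the tasks still run in parallel inside the pool, but the call blocks until all of them have finished.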
1. apply
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)  # task 0 is deliberately slow
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = mp(5)
    for i in range(3):
        pool.apply(run, (i,))  # blocks until this task has finished
    print('blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Output:
start
num is 0
0 is end
num is 1
1 is end
num is 2
2 is end
blocking~~~~
end
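Put differently, apply behaves like apply_async followed immediately by get(). The sketch below shows the two spellings side by side; slow_double, run_blocking and run_blocking_by_hand are hypothetical names chosen for illustration.

```python
import time
from multiprocessing import Pool


def slow_double(n):
    time.sleep(0.1)
    return n * 2


def run_blocking(pool, numbers):
    """Each apply call waits for its task, so the tasks run one at a time."""
    return [pool.apply(slow_double, (n,)) for n in numbers]


def run_blocking_by_hand(pool, numbers):
    """Same behaviour, spelled out as apply_async followed by get()."""
    return [pool.apply_async(slow_double, (n,)).get() for n in numbers]


if __name__ == '__main__':
    with Pool(3) as pool:
        print(run_blocking(pool, [0, 1, 2]))
        print(run_blocking_by_hand(pool, [0, 1, 2]))
```

Either way the pool size does not help here: only one task is in flight at a time, so three tasks take roughly three times as long as one.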
2. map
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)  # task 0 is deliberately slow
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    num_list = [0, 1, 2]
    pool = mp(5)
    pool.map(run, num_list)  # blocks until every item has been processed
    print('blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Output (the order of the worker lines may vary from run to run):
start
num is 0
num is 1
1 is end
num is 2
2 is end
0 is end
blocking~~~~
end
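The output above shows tasks finishing out of order, yet map still returns its results in the order of the input list. Here is a small sketch of that property; delayed_square and squares_in_order are hypothetical names, and the sleep times are arbitrary values chosen so that earlier items tend to finish later.

```python
import time
from multiprocessing import Pool


def delayed_square(n):
    # Smaller inputs sleep longer, so they tend to finish last.
    time.sleep(0.05 * max(0, 3 - n))
    return n * n


def squares_in_order(pool, numbers):
    """map blocks until every task is done and returns results in input order."""
    return pool.map(delayed_square, numbers)


if __name__ == '__main__':
    with Pool(3) as pool:
        print(squares_in_order(pool, [0, 1, 2]))
```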
Thread pool: multiprocessing.dummy.Pool()
It supports the same four methods.
Example:
import time
from multiprocessing.dummy import Pool as tp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)  # task 0 is deliberately slow
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = tp(5)  # pool of 5 worker threads
    num_list = [0, 1, 2]
    pool.map_async(run, num_list)
    print('non-blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Output (the order of the worker lines may vary from run to run):
start
non-blocking~~~~
num is 0
num is 1
num is 2
2 is end
1 is end
0 is end
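end
Summary
1. For I/O-bound work, prefer multithreading; for CPU-bound work, prefer multiprocessing. In standard CPython the GIL lets only one thread execute Python bytecode at a time, so threads do not speed up pure computation.
2. With multiprocessing, the operating system schedules the processes onto the CPU cores. A core only does computation, so when the tasks involve a lot of I/O, each process spends much of its time paused, waiting. In that case it makes sense to open more processes than cores (more than 8 on an 8-core machine); otherwise, opening as many processes as cores (8), or fewer, is usually enough.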
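To see why threads suit I/O-bound work, the sketch below times the same batch of waiting tasks with one thread and with four. It is an illustration only: fake_io and timed_map are hypothetical names, and time.sleep stands in for a real blocking call such as a network request.

```python
import time
from multiprocessing.dummy import Pool as ThreadPool


def fake_io(seconds):
    """Stand-in for a blocking I/O call such as a network request."""
    time.sleep(seconds)
    return seconds


def timed_map(pool_size, delays):
    """Run fake_io over delays in a thread pool; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPool(pool_size) as pool:
        results = pool.map(fake_io, delays)
    return results, time.perf_counter() - start


if __name__ == '__main__':
    delays = [0.2] * 4
    for size in (1, 4):
        results, elapsed = timed_map(size, delays)
        print('{} thread(s): {:.2f}s'.format(size, elapsed))
```

With four threads the waits overlap, so the batch should take about as long as a single task; with one thread the waits add up.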
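Sharing data between processes
multiprocessing.Manager().list() creates a list that can be shared between processes.
Note: every call to multiprocessing.Manager() starts a new server process that manages the shared data. Creating n shared lists with n separate multiprocessing.Manager().list() calls therefore starts n manager processes; to avoid this, create one Manager and call its list() method several times.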
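A minimal sketch of sharing one list between pool workers, assuming a single Manager is enough for the program. append_square is a hypothetical helper, and passing the list proxy to each task as an argument is one common pattern, not the only one.

```python
from multiprocessing import Manager, Pool


def append_square(shared, n):
    """Append n*n to the shared list (a Manager list proxy or a plain list)."""
    shared.append(n * n)


if __name__ == '__main__':
    # One Manager (one server process) can hand out several shared lists.
    with Manager() as manager:
        shared = manager.list()
        with Pool(3) as pool:
            # Each task receives the proxy and appends through the manager process.
            pool.starmap(append_square, [(shared, n) for n in range(5)])
        # Read the contents before the manager shuts down; append order is not fixed.
        print(sorted(shared))
```

Every append goes through the manager process, so this is convenient rather than fast; for large results, returning values from the tasks is often simpler.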