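Consider this scenario: a user adds several tasks and clicks the "run task" button. The tasks are then processed in the background, and each one is reported as succeeded or failed. Because the tasks are time-consuming, they are processed with multiple threads.
Question:
Once the threads have started, how do we get each task's result?
Here is the code: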
import threading
import time


class TaskThread(threading.Thread):
    """
    Thread class that runs a task and stores its result.
    """

    def __init__(self, func, args=()):
        super(TaskThread, self).__init__()
        self.func = func  # the task function to run
        self.args = args  # positional arguments passed to the task

    def run(self):
        # Calling start() on a thread instance executes run() in the new thread;
        # the asynchronous work is defined here.
        print("start func {}".format(self.func.__name__))  # func.__name__ is the task's name
        self.result = self.func(*self.args)  # store the task's return value in self.result

    def get_result(self):
        # Returns the task function's return value; the method name is arbitrary.
        try:
            return self.result
        except Exception as ex:
            # self.result does not exist until run() has finished
            print(ex)
            return "ERROR"


def task_type1(task_id, task_name):
    print("start tasks, name:{}, id:{}".format(task_name, task_id))
    time.sleep(2)
    print("end tasks, name:{}, id:{}".format(task_name, task_id))
    return task_id


thread_pool = []  # list that holds the thread instances
for i in range(10):
    # create a thread object on each iteration
    thread = TaskThread(task_type1, args=(i + 1, 'pay'))
    # add the thread object to the pool
    thread_pool.append(thread)
    # start the thread, which runs the task
    thread.start()

for thread in thread_pool:
    # the important step: why is join() required?
    thread.join()
    # read the result from the thread in the pool
    print("result:{}".format(thread.get_result()))
Output:
start func task_type1
start tasks, name:pay, id:1
start func task_type1
start tasks, name:pay, id:2
start func task_type1
start tasks, name:pay, id:3
start func task_type1
start tasks, name:pay, id:4
start func task_type1
start tasks, name:pay, id:5
start func task_type1
start tasks, name:pay, id:6
start func task_type1
start tasks, name:pay, id:7
start func task_type1
start tasks, name:pay, id:8
start func task_type1
start tasks, name:pay, id:9
start func task_type1
start tasks, name:pay, id:10
end tasks, name:pay, id:4
end tasks, name:pay, id:2
end tasks, name:pay, id:1
end tasks, name:pay, id:5
end tasks, name:pay, id:8
end tasks, name:pay, id:3
result:1
result:2
end tasks, name:pay, id:9
result:3
result:4
result:5
end tasks, name:pay, id:10
end tasks, name:pay, id:6
result:6
end tasks, name:pay, id:7
result:7
result:8
result:9
result:10
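The code above creates threads to run the tasks and retrieves each task's result. There are two key points: the thread class implements get_result(), which returns the task function's return value, and thread.join() is called before the result is read. The join() call is required. What happens without it?
Comment out thread.join() and run the code again: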
start func task_type1
start tasks, name:pay, id:1
start func task_type1
start tasks, name:pay, id:2
start func task_type1
start tasks, name:pay, id:3
start func task_type1
start tasks, name:pay, id:4
start func task_type1
start tasks, name:pay, id:5
start func task_type1
start tasks, name:pay, id:6
start func task_type1
start tasks, name:pay, id:7
start func task_type1
start tasks, name:pay, id:8
start func task_type1
start tasks, name:pay, id:9
start func task_type1
start tasks, name:pay, id:10
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
'TaskThread' object has no attribute 'result'
result:ERROR
end tasks, name:pay, id:1
end tasks, name:pay, id:3
end tasks, name:pay, id:2
end tasks, name:pay, id:5
end tasks, name:pay, id:6
end tasks, name:pay, id:7
end tasks, name:pay, id:4
end tasks, name:pay, id:8
end tasks, name:pay, id:9
end tasks, name:pay, id:10
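Without join(), this is what we get. Why?
Because self.result is assigned inside run(), and only after self.func has returned. When get_result() is called, run() has not finished yet, so self.result has not been assigned, and reading it raises an AttributeError: 'TaskThread' object has no attribute 'result'. get_result() catches the exception, prints its message, and returns "ERROR".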
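The timing described above can be checked directly. The following is a minimal sketch, not the code from this post: ResultThread, blocked_task and the release event are hypothetical names introduced for illustration, and the event replaces time.sleep() only so that the check is deterministic.

```python
import threading


class ResultThread(threading.Thread):
    """Hypothetical helper: stores the return value of func on the thread."""

    def __init__(self, func, args=()):
        super().__init__()
        self.func = func
        self.args = args

    def run(self):
        # The attribute is created only after func has returned.
        self.result = self.func(*self.args)


release = threading.Event()  # hypothetical: keeps the task "busy" until set


def blocked_task(task_id):
    release.wait()  # stand-in for slow work; returns once release is set
    return task_id


t = ResultThread(blocked_task, args=(1,))
t.start()
before_join = hasattr(t, "result")  # run() is still waiting: no attribute yet
release.set()                       # let the task finish
t.join()                            # block until run() has returned
after_join = hasattr(t, "result")   # the assignment has happened by now
print(before_join, after_join, t.result)  # prints: False True 1
```

The attribute is missing while run() is still in progress and present once join() has returned, which is exactly the difference between the two runs shown earlier.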
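Then why can self.result be read normally after join()?
To answer that, it helps to understand how threads behave:
1. When a process starts, it gets a main thread by default; a thread is the smallest unit of execution flow. With multithreading, the main thread creates child threads. In Python, by default (daemon is False, the same as setDaemon(False)), the child threads keep running after the main thread has reached the end of its own code, and the interpreter exits only once all non-daemon threads have finished.
2. If a child thread is marked as a daemon thread (setDaemon(True), or thread.daemon = True in current Python versions, where setDaemon() is deprecated), the program exits as soon as the main thread finishes, and any daemon threads still running are stopped abruptly, possibly in the middle of their work. That is usually not what we want.
3. What we want is for the main thread to block and wait until the child threads have finished. This is what join() is for: thread.join() blocks the calling thread until thread has terminated.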
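The three points above can be sketched with a plain threading.Thread. This is an illustrative example under stated assumptions, not the code from this post: worker and log are hypothetical names, and daemon=False is passed explicitly only to make the default visible (passing daemon=True instead would mark the thread as a daemon thread).

```python
import threading
import time


def worker(log):
    time.sleep(0.1)  # stand-in for real work
    log.append("worker done")


log = []
# daemon=False is the default for threads created from the main thread;
# it is spelled out here for clarity.
t = threading.Thread(target=worker, args=(log,), daemon=False)
is_daemon = t.daemon  # False: the interpreter would wait for this thread at exit
t.start()
t.join()              # the main thread blocks here until worker() returns
log.append("main continues")
print(is_daemon, log)  # prints: False ['worker done', 'main continues']
```

Because join() returns only after the worker has finished, "worker done" is always recorded before "main continues".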
Can join() be placed directly after start(), inside the first loop?
Let's try:
thread.start()
thread.join()
Output:
start func task_type1
start tasks, name:pay, id:1
end tasks, name:pay, id:1
start func task_type1
start tasks, name:pay, id:2
end tasks, name:pay, id:2
start func task_type1
start tasks, name:pay, id:3
end tasks, name:pay, id:3
start func task_type1
start tasks, name:pay, id:4
end tasks, name:pay, id:4
start func task_type1
start tasks, name:pay, id:5
end tasks, name:pay, id:5
start func task_type1
start tasks, name:pay, id:6
end tasks, name:pay, id:6
start func task_type1
start tasks, name:pay, id:7
end tasks, name:pay, id:7
start func task_type1
start tasks, name:pay, id:8
end tasks, name:pay, id:8
start func task_type1
start tasks, name:pay, id:9
end tasks, name:pay, id:9
start func task_type1
start tasks, name:pay, id:10
end tasks, name:pay, id:10
result:1
result:2
result:3
result:4
result:5
result:6
result:7
result:8
result:9
result:10
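The results are retrieved, but calling join() right after every start() blocks the main thread until that child thread has finished, and only then does the next loop iteration begin. The efficiency is the same as running the tasks in a single-threaded loop: if one task takes 2 seconds, running 10 tasks this way takes about 20 s. This is clearly the wrong usage. join() is for thread synchronization: the calling thread (here the main thread) blocks until the joined thread has finished, and only then continues. The correct pattern is to start all the threads first and join them afterwards.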
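The difference between the two placements of join() can be measured. The sketch below is a rough illustration, not a benchmark: task, run_join_in_loop and run_join_after_start are hypothetical names, and the sleep is shortened to 0.2 s with 5 tasks so the script finishes quickly.

```python
import threading
import time


def task(task_id):
    time.sleep(0.2)  # stand-in for a slow task


def run_join_in_loop(n):
    """start() followed immediately by join(): tasks run one after another."""
    begin = time.perf_counter()
    for i in range(n):
        t = threading.Thread(target=task, args=(i,))
        t.start()
        t.join()  # blocks before the next thread is even created
    return time.perf_counter() - begin


def run_join_after_start(n):
    """Start every thread first, then join them: the sleeps overlap."""
    begin = time.perf_counter()
    threads = [threading.Thread(target=task, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - begin


serial = run_join_in_loop(5)
overlapped = run_join_after_start(5)
print("join inside the loop: {:.2f}s".format(serial))
print("join after all starts: {:.2f}s".format(overlapped))
```

With join() inside the loop the elapsed time grows with the number of tasks (about n × 0.2 s here), while with join() after all the starts it stays close to the duration of a single task. Exact figures depend on the machine.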