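Things have been busy and I haven't written anything in a while. Today was quiet, and a small requirement happened to come up.

Requirement: pull part of the backend data from the big data platform into the data warehouse (8 modules, split the same way as the web app, a few dozen tables in total).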

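1. At first I just listed the python datax.py XXX.json commands in a txt file and ran them one after another. That turned out to be painfully slow, taking ten to twenty minutes per run, so I switched to a multithreaded Python script. For convenience I put each module's JSON job files in that module's own directory. There aren't many tables, so I didn't bother with a thread pool.

Just fire off all 8 threads at once.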

The code is below (not the original code), start.py:

import os
import subprocess
import threading

DATAX = r'E:\datax\bin\datax.py'
JOB_ROOT = r'E:\datax\job'
# One directory of JSON job files per module, and one thread per module.
MODULES = ['credit', 'manage', 'zcgl', 'huresources', 'retail',
           'industclient', 'institution', 'investment']

def run_module(module):
    # Run this module's jobs one after another inside its own thread.
    job_dir = os.path.join(JOB_ROOT, module)
    for name in os.listdir(job_dir):
        # A list of arguments avoids hand-built path strings and quoting issues.
        subprocess.call(['python', DATAX, os.path.join(job_dir, name)])

def main():
    threads = [threading.Thread(target=run_module, args=(m,)) for m in MODULES]
    # Start all 8 threads at once.
    for t in threads:
        t.start()
    # Wait until every module has finished.
    for t in threads:
        t.join()

if __name__ == '__main__':
    main()
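
For comparison, here is a rough sketch of the thread pool version I skipped. It is not what I actually run. It assumes the same E:\datax layout as start.py, and the names list_jobs, run_job and run_all are made up for this example. The pool hands out individual job files rather than whole modules, so a module with many tables should no longer hold up the run, and the exit codes are collected so failed jobs can be spotted.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Assumed layout, taken from the paths used in start.py.
DATAX_PY = Path(r"E:\datax\bin\datax.py")
JOB_ROOT = Path(r"E:\datax\job")


def list_jobs(job_root):
    """Return every *.json job file one level below job_root, sorted."""
    return sorted(Path(job_root).glob("*/*.json"))


def run_job(job_file, datax_py=DATAX_PY, runner=subprocess.call):
    """Run one DataX job and return (job_file, exit_code)."""
    # runner is injectable (hypothetical parameter) so the logic can be
    # exercised without a DataX install.
    cmd = ["python", str(datax_py), str(job_file)]
    return job_file, runner(cmd)


def run_all(job_root=JOB_ROOT, max_workers=8, runner=subprocess.call):
    """Run all jobs through a thread pool; return the job files that failed."""
    jobs = list_jobs(job_root)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda j: run_job(j, runner=runner), jobs))
    # A non-zero exit code is treated as a failure.
    return [job for job, code in results if code != 0]

# Usage on the real machine: failed = run_all(); print(failed)
```

max_workers=8 only mirrors the 8 threads in start.py. How many DataX processes the machine and the source database can take at once is something to find out by trying.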

After that I wrote a bat file

with this content: python start.py

2. Windows has a built-in Task Scheduler. Point a scheduled task at the bat file and it runs on a timer.
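
The same task can probably also be registered from the command line with schtasks instead of clicking through the GUI. The sketch below only builds the command. The task name, the bat path, the daily schedule and the 02:00 start time are all placeholders picked for the example, not my real settings.

```python
import subprocess


def build_schtasks_command(task_name, bat_path, start_time):
    """Build a schtasks command that runs bat_path once a day at start_time (HH:MM)."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,    # task name shown in Task Scheduler
        "/TR", bat_path,     # program to run, here the bat file
        "/SC", "DAILY",      # assumed schedule: once a day
        "/ST", start_time,   # start time, 24-hour HH:MM
        "/F",                # overwrite a task with the same name
    ]


def register_task(task_name, bat_path, start_time):
    """Create the task and return the schtasks exit code. Windows only."""
    return subprocess.call(build_schtasks_command(task_name, bat_path, start_time))

# Example with placeholder values:
# register_task("datax_sync", r"E:\datax\start.bat", "02:00")
```

If the bat path contains spaces it needs extra quoting inside /TR, so a path without spaces is the easy way out.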

That's it.