Process-related operations
阿新 • Published: 2018-09-12
Threads in Python cannot take advantage of multiple cores (because of the GIL), so to make full use of a multi-core CPU (check the core count with os.cpu_count()), in most cases you need multiple processes.
Python provides the multiprocessing module for this. multiprocessing spawns child processes and runs the task we define in them (for example a function); its API is deliberately similar to the threading module's.
A simple process program:
```python
import multiprocessing  # import the module

def task(arg):
    print(arg)

def run():
    for i in range(10):  # create ten processes in a loop
        p = multiprocessing.Process(target=task, args=(i,))
        p.start()  # schedule the process for execution

if __name__ == "__main__":
    run()
```
Common features:
join(): with a numeric argument, wait at most that many seconds for the child process, then continue; with no argument, block until the child process has finished.
daemon: an attribute (default False), set before start(). When set to True, the main process does not wait for the child to finish, and daemon children are terminated when the main process exits.
name: the process name, settable when creating the process. name = multiprocessing.current_process().name  # gets the current process's name (not the thread's)
Creating processes (two ways):
1. Creating by subclassing:
```python
import multiprocessing

class MyProcess(multiprocessing.Process):
    def run(self):
        print("current process:", multiprocessing.current_process())

def run():
    p1 = MyProcess()  # process one
    p1.start()        # start() automatically executes the class's run() method
    p2 = MyProcess()  # process two
    p2.start()

if __name__ == "__main__":
    run()
```
2. With a plain function:
```python
import multiprocessing

def task():
    print("current process:", multiprocessing.current_process())

def run():
    for i in range(2):
        p = multiprocessing.Process(target=task)
        p.start()

if __name__ == "__main__":
    run()
```
Data sharing:
1. Queue:
```python
import multiprocessing

def task(arg, q):
    q.put(arg)

if __name__ == "__main__":
    q = multiprocessing.Queue()
    for i in range(10):
        p = multiprocessing.Process(target=task, args=(i, q))
        p.start()
    for _ in range(10):  # read exactly ten items; a bare while True would block forever
        v = q.get()
        print(v)
```
```python
# Variant using a module-level queue. This only works where processes are
# started by fork (e.g. Linux), because the child inherits the parent's queue;
# on Windows (spawn) each child re-imports the module and gets its own queue.
import multiprocessing

q = multiprocessing.Queue()

def task(arg, q):
    q.put(arg)

def run():
    for i in range(10):
        p = multiprocessing.Process(target=task, args=(i, q))
        p.start()
    for _ in range(10):
        v = q.get()
        print(v)

run()
```
2. Manager:
```python
import multiprocessing
import time

def func(arg, dic):
    time.sleep(2)
    dic[arg] = 100

if __name__ == "__main__":
    m = multiprocessing.Manager()
    dic = m.dict()  # a dict proxy shared across processes
    process_list = []
    for i in range(10):
        p = multiprocessing.Process(target=func, args=(i, dic))
        p.start()
        process_list.append(p)
    while True:  # poll until every child has finished
        count = 0
        for p in process_list:
            if not p.is_alive():
                count += 1
        if count == len(process_list):
            break
    print(dic)
```
Process locks: same usage as with threads.
```python
import time
import multiprocessing

# shared via fork; on Windows (spawn) pass the lock to the child as an argument
lock = multiprocessing.RLock()

def task(arg):
    print("contending for the lock")
    lock.acquire()
    time.sleep(2)
    print(arg)
    lock.release()

if __name__ == '__main__':
    p1 = multiprocessing.Process(target=task, args=(1,))
    p1.start()
    p2 = multiprocessing.Process(target=task, args=(2,))
    p2.start()
```
Process pools: cap the number of processes that can be created.
```python
import time
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def task():
    print("current process:", multiprocessing.current_process())
    time.sleep(1)

if __name__ == "__main__":
    pool = ProcessPoolExecutor(5)  # at most five worker processes
    for i in range(10):
        pool.submit(task)
```
The output is:
current process: <Process(Process-2, started)>
current process: <Process(Process-3, started)>
current process: <Process(Process-4, started)>
current process: <Process(Process-1, started)>
current process: <Process(Process-5, started)>
Then, one second later, the same five workers run the remaining tasks:
current process: <Process(Process-2, started)>
current process: <Process(Process-3, started)>
current process: <Process(Process-4, started)>
current process: <Process(Process-1, started)>
current process: <Process(Process-5, started)>
A simple crawler:
```python
import requests
from bs4 import BeautifulSoup
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def task(url):
    print(url)
    r1 = requests.get(
        url=url,
        headers={
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36'
        }
    )
    # inspect the downloaded text
    soup = BeautifulSoup(r1.text, 'html.parser')
    print(soup.text)
    # content_list = soup.find('div', attrs={'id': 'content_list'})
    # for item in content_list.find_all('div', attrs={'class': 'item'}):
    #     title = item.find('a').text.strip()
    #     target_url = item.find('a').get('href')
    #     print(title, target_url)

def run():
    pool = ThreadPoolExecutor(5)
    for i in range(1, 50):
        pool.submit(task, 'https://dig.chouti.com/all/hot/recent/%s' % i)

if __name__ == "__main__":
    run()
```