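A common situation at work: you need to download 100,000 images, and you are not going to write a for loop that fetches them one at a time. Or you want to run a simple HTTP stress test, which clearly calls for multiple processes or threads. In both cases each request handler takes one argument, all the arguments together form a task queue, and you map the handler and the queue onto a pool, which spreads the tasks across its worker threads or processes. In Python, the process pool multiprocessing.Pool and the thread pool multiprocessing.dummy.Pool are very handy for this. The usual imports are: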

  • from multiprocessing import Pool as ProcessPool
  • from multiprocessing.dummy import Pool as ThreadPool
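As a minimal sketch of the download scenario described above, the function below maps a fetch function over a list of URLs with a thread pool. The names download_all, fetch, workers and fake_fetch are assumptions made for illustration; a real downloader would pass a function that actually retrieves and saves each image.

```python
from multiprocessing.dummy import Pool as ThreadPool


def download_all(urls, fetch, workers=4):
    """Apply fetch to every URL using a pool of worker threads.

    download_all, fetch and workers are hypothetical names for this sketch.
    Results come back in the same order as urls.
    """
    pool = ThreadPool(workers)
    try:
        return pool.map(fetch, urls)
    finally:
        # Always release the worker threads, even if a task raised.
        pool.close()
        pool.join()


def fake_fetch(url):
    # Stand-in for a real download so the sketch runs without a network.
    return len(url)


if __name__ == '__main__':
    urls = ['http://example.com/%d.jpg' % i for i in range(10)]
    print(download_all(urls, fake_fetch))
```

Swapping the import for multiprocessing.Pool would give the process-based variant with the same map call, provided the fetch function is defined at module level so it can be pickled.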


# -*- coding: utf-8 -*-
"""
This file is a sample demo of an HTTP stress test.
"""
import time
import urllib.parse
from multiprocessing.dummy import Pool as ThreadPool

import requests


def get_ret_from_http(url):
    """
    cited from https://stackoverflow.com/questions/645312/what-is-the-quickest-way-to-http-get-in-python
    """
    ret = requests.get(url)
    # e.g. result: {"error":false,"resultMap":{"check_ret":1},"success":true}
    print(ret.content)


def multi_process_stress_test():
    """
    Start 5 threads to issue 100 HTTP requests to the server and measure the elapsed time.
    :return:
    """
    start = time.time()
    # In practice, URLs with parameters are usually built with make_url() below;
    # this example does not use it (it was written earlier and left as is).
    # Target URLs; fill these in before running.
    url = ""
    url1 = ""
    # generate the task queue: 100 URLs in total
    lst_url = [url, url1] * 50
    # use 5 threads
    pool = ThreadPool(5)
    # map the handler over the task queue
    ret = pool.map(get_ret_from_http, lst_url)
    pool.close()
    pool.join()
    print('time consumed: %s' % (time.time() - start))


def make_url():
    """
    Generate a URL with parameters, e.g.
    https://xy.com/index.php?SecretId=xy_123_move&url=http%3A//xy.xxx.com/22.jpg
    cited from https://stackoverflow.com/questions/2506379/add-params-to-given-url-in-python
    https://github.com/gruns/furl is a good utility for URL manipulation
    :return: the generated URL
    """
    para = {"SecretId": "xy_123_move", "url": "http://xy.xxx.com/22.jpg"}
    # SecretId=xy_123_move&url=http%3A%2F%2Fxy.xxx.com%2F22.jpg
    print(urllib.parse.urlencode(para))
    base_url = 'xy.com/index.php'
    return 'https://%s?%s' % (base_url, '&'.join(
        '%s=%s' % (k, urllib.parse.quote(str(v))) for k, v in para.items()))


if __name__ == '__main__':
    # get_ret_from_http()
    multi_process_stress_test()
    # print(make_url())
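The Stack Overflow question cited in make_url is about adding parameters to a URL that may already carry a query string. A minimal sketch of that idea using only the standard library could look like the following; add_params is a hypothetical helper name, not something from the original file, and the sketch assumes each query key appears at most once.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_params(url, params):
    """Return url with params merged into its query string.

    add_params is a hypothetical helper for this sketch. Repeated keys in
    the existing query collapse to their last value because of dict().
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update(params)
    # SplitResult is a namedtuple, so _replace swaps in the new query.
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == '__main__':
    print(add_params('https://xy.com/index.php?a=1',
                     {'SecretId': 'xy_123_move'}))
    # prints https://xy.com/index.php?a=1&SecretId=xy_123_move
```

Unlike the manual join in make_url, urlencode also escapes the slashes inside parameter values, so the two approaches produce slightly different but equivalent query strings.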


# -*- coding: utf-8 -*-
"""
This file is about the thread (dummy) pool and the process pool.
"""
from multiprocessing import Pool as ProcessPool
from multiprocessing.dummy import Pool as ThreadPool
import logging
from time import sleep, time
from random import randrange

logging.basicConfig(level=logging.DEBUG,
                    format='%(levelname)s %(asctime)s %(processName)s %(message)s',
                    datefmt='%Y-%m-%d %I:%M:%S')

def handler(sec):
    logging.debug('now I will sleep %s S', sec)
    sleep(sec)

def get_pool(b_dummy=True, num=4):
    """
    If b_dummy is True, return a ThreadPool; otherwise return a process pool.
    :param b_dummy: True for the dummy (thread) pool, False for the process pool
    :param num: number of threads or processes
    :return: pool object
    """
    if b_dummy:
        pool = ThreadPool(num)
    else:
        pool = ProcessPool(num)

    return pool

def test_dummy_thread_pool():
    start_time = time()
    # generate the task queue: a list of sleep durations in seconds
    lst_sleep_sec = [randrange(3, 10) for i in range(10)]
    # b_dummy=False selects the process pool; pass True to use threads instead
    pool = get_pool(b_dummy=False)

    results = pool.map(handler, lst_sleep_sec)
    pool.close()
    pool.join()
    logging.debug('time consume %s', time() - start_time)

if __name__ == '__main__':
    test_dummy_thread_pool()

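One detail of pool.map worth seeing in isolation: it returns results in the order of the input list, regardless of which task finishes first. The sketch below illustrates this with a thread pool and a handler that returns its argument; timed_handler and run_in_pool are hypothetical names chosen for this example, and the short sleep durations are arbitrary.

```python
from multiprocessing.dummy import Pool as ThreadPool
from time import sleep


def timed_handler(sec):
    """Sleep for sec seconds, then return sec."""
    sleep(sec)
    return sec


def run_in_pool(durations, num=4):
    """Map timed_handler over durations with a thread pool of size num."""
    pool = ThreadPool(num)
    try:
        return pool.map(timed_handler, durations)
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    # The 0.1 s task finishes first, but results follow the input order.
    print(run_in_pool([0.3, 0.1, 0.2]))
```

This is why the task list and the result list can be zipped together afterwards to pair each parameter with its outcome.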
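I use quite a few languages at work: I have written C++, Java, some HTML+JS, and Python. Because I use each language only intermittently, after a few months without Python, for example, I forget many of the tricks. So I keep commonly used Python snippets, organized by category, in my GitHub. Whenever I need a particular technique in practice, I put the commonly used methods together in one file so they are easy to look up.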


