Python grequests Module: Use Cases and Code Examples
阿新 • Published: 2020-08-11
Use cases:
1) Checking whether the IPs in a crawler's proxy pool are still valid
2) Sending requests in bulk, for example during load testing
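The first use case can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper names `build_proxy_kwargs`, `filter_working` and `check_proxies`, the test URL `http://httpbin.org/ip`, and the rule "status 200 means usable" are all assumptions made for the example.

```python
def build_proxy_kwargs(proxies, timeout=5):
    """Return one keyword-argument dict per proxy, suitable for grequests.get()."""
    return [
        {"proxies": {"http": p, "https": p}, "timeout": timeout}
        for p in proxies
    ]


def filter_working(proxies, responses):
    """Keep the proxies whose paired response exists and has status 200."""
    return [
        p for p, resp in zip(proxies, responses)
        if resp is not None and resp.status_code == 200
    ]


def check_proxies(proxies, test_url="http://httpbin.org/ip", size=20, timeout=5):
    """Send one request per proxy concurrently and return the usable proxies.

    test_url and the default size are assumptions chosen for illustration.
    """
    import grequests  # imported here so the helpers above work without it

    reqs = [
        grequests.get(test_url, **kw)
        for kw in build_proxy_kwargs(proxies, timeout=timeout)
    ]
    # map() returns results in request order; a failed request comes back as None
    responses = grequests.map(reqs, size=size)
    return filter_working(proxies, responses)
```

Calling `check_proxies(["http://10.0.0.1:8080", "http://10.0.0.2:8080"])` would then return only the proxies that answered with status 200 within the timeout.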
grequests is a thin wrapper around the requests and gevent libraries, which makes sending many HTTP requests concurrently very convenient.
grequests.map(requests, stream=False, size=None, exception_handler=None, gtimeout=None)
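As a rough sketch of how the `size` and `gtimeout` parameters might be used for a small load-test batch: `size` caps how many requests are in flight at once, and `gtimeout` bounds how long `map()` waits for the whole batch. The function names `summarize` and `run_batch` and the default values are assumptions for illustration only.

```python
from collections import Counter


def summarize(responses):
    """Count results by status code; requests with no response are tallied as 'failed'."""
    return Counter(
        "failed" if r is None else r.status_code
        for r in responses
    )


def run_batch(url, total=100, size=10, gtimeout=30):
    """Send `total` GET requests to `url`, at most `size` at a time."""
    import grequests  # imported here so summarize() works without it

    reqs = (grequests.get(url, timeout=5) for _ in range(total))
    # Requests that fail, or that have not finished when gtimeout expires,
    # appear as None in the returned list
    responses = grequests.map(reqs, size=size, gtimeout=gtimeout)
    return summarize(responses)
```

`run_batch("http://httpbin.org/get")` would return a `Counter` mapping each status code (or `'failed'`) to how many times it occurred.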
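In addition, because grequests uses requests under the hood, it supports
all the usual HTTP methods: GET, OPTIONS, HEAD, POST, PUT, DELETE and so on.
Requests such as the following are therefore all supported:
grequests.post(url, json={"name": "zhangsan"})
grequests.delete(url)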
The code is as follows:
import grequests

urls = [
    'http://www.baidu.com',
    'http://www.qq.com',
    'http://www.163.com',
    'http://www.zhihu.com',
    'http://www.toutiao.com',
    'http://www.douban.com',
]

# Build the requests lazily; nothing is sent until map() is called
rs = (grequests.get(u) for u in urls)
print(grequests.map(rs))
# Example output (a failed request appears as None):
# [<Response [200]>, None, <Response [200]>, <Response [418]>]


def exception_handler(request, exception):
    # Called for each request that raised an exception
    print("Request failed")


reqs = [
    grequests.get('http://httpbin.org/delay/1', timeout=0.001),
    grequests.get('http://fakedomain/'),
    grequests.get('http://httpbin.org/status/500'),
]
print(grequests.map(reqs, exception_handler=exception_handler))
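Whatever the exception handler returns is placed in the result list at the position of the failed request, so a handler can also be used to record what went wrong. A minimal sketch, assuming hypothetical helper names `make_failure_recorder` and `fetch_all`:

```python
def make_failure_recorder():
    """Return (handler, failures); the handler appends (url, exception name) pairs."""
    failures = []

    def handler(request, exception):
        failures.append((request.url, type(exception).__name__))
        return None  # this value is what map() puts in the result list

    return handler, failures


def fetch_all(urls, timeout=3):
    """Fetch all URLs concurrently and return (responses, failures)."""
    import grequests  # imported here so the recorder works without it

    handler, failures = make_failure_recorder()
    reqs = [grequests.get(u, timeout=timeout) for u in urls]
    responses = grequests.map(reqs, exception_handler=handler)
    return responses, failures
```

After `responses, failures = fetch_all(urls)`, `failures` lists each URL that raised together with the exception class name, while `responses` keeps the original request order.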
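In practice, you can also customize the results that are returned
by modifying the grequests source file.
For example,
add an extract_item() function and modify the map() function: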
def extract_item(request):
    """
    Extract the content of a request.
    :param request: a request whose response has been received
    :return: dict with the url, text and status_code
    """
    item = dict()
    item["url"] = request.url
    item["text"] = request.response.text or ""
    item["status_code"] = request.response.status_code or 0
    return item


def map(requests, stream=False, size=None, exception_handler=None, gtimeout=None):
    """Concurrently converts a list of Requests to Responses.

    :param requests: a collection of Request objects.
    :param stream: If True, the content will not be downloaded immediately.
    :param size: Specifies the number of requests to make at a time. If None, no throttling occurs.
    :param exception_handler: Callback function, called when exception occurred. Params: Request, Exception
    :param gtimeout: Gevent joinall timeout in seconds. (Note: unrelated to requests timeout)
    """
    requests = list(requests)

    pool = Pool(size) if size else None
    jobs = [send(r, pool, stream=stream) for r in requests]
    gevent.joinall(jobs, timeout=gtimeout)

    ret = []

    for request in requests:
        if request.response is not None:
            # Store the extracted dict instead of the Response object
            ret.append(extract_item(request))
        elif exception_handler and hasattr(request, 'exception'):
            ret.append(exception_handler(request, request.exception))
        else:
            ret.append(None)

    # yield turns map() into a generator that produces the whole list once
    yield ret
You can then call it directly:
import grequests

urls = [
    'http://www.baidu.com',
    'http://www.douban.com',
]

rs = (grequests.get(u) for u in urls)
response_list = grequests.map(rs, gtimeout=10)
# The modified map() is a generator, so next() fetches the single list it yields
for response in next(response_list):
    print(response)
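Edits made to an installed library are overwritten when it is reinstalled or upgraded, so a similar result can also be obtained without touching the grequests source, by post-processing the list that the stock `map()` returns. The following is only a sketch of that alternative, not the approach described above. The name `map_items` is hypothetical, and this `extract_item` takes a response rather than a request.

```python
def extract_item(response):
    """Convert a response into a plain dict; None (a failed request) stays None."""
    if response is None:
        return None
    return {
        # Note: response.url is the final URL, which may differ from the
        # requested one if redirects were followed
        "url": response.url,
        "text": response.text or "",
        "status_code": response.status_code or 0,
    }


def map_items(urls, gtimeout=10):
    """Fetch the URLs with the unmodified grequests.map() and return dicts."""
    import grequests  # imported here so extract_item() works without it

    reqs = (grequests.get(u) for u in urls)
    return [extract_item(r) for r in grequests.map(reqs, gtimeout=gtimeout)]
```

`map_items(['http://www.baidu.com', 'http://www.douban.com'])` returns an ordinary list, so no `next()` call is needed.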
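Event hooks are supported
import grequests  # imported first so gevent's patching is applied before requests loads
import requests


def print_url(r, *args, **kwargs):
    # Response hook: called with the Response object once it arrives
    print(r.url)


url = "http://www.baidu.com"

# Plain requests: register the hook through the hooks argument
res = requests.get(url, hooks={"response": print_url})

# grequests: the callback argument registers the same kind of response hook
tasks = []
req = grequests.get(url, callback=print_url)
tasks.append(req)
ress = grequests.map(tasks)
print(ress)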
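A hook does not have to print; it can also collect information as responses arrive. A minimal sketch, assuming hypothetical helper names `make_status_collector` and `fetch_with_hook`:

```python
def make_status_collector():
    """Return (hook, seen); the hook records each response's status code by URL."""
    seen = {}

    def hook(response, *args, **kwargs):
        seen[response.url] = response.status_code
        # Returning None leaves the response unchanged

    return hook, seen


def fetch_with_hook(urls):
    """Fetch the URLs concurrently and return the statuses gathered by the hook."""
    import grequests  # imported here so the collector works without it

    hook, seen = make_status_collector()
    tasks = [grequests.get(u, callback=hook) for u in urls]
    grequests.map(tasks)
    return seen
```

`fetch_with_hook(["http://www.baidu.com"])` would return a dict mapping each final response URL to its status code.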