Full-Stack Testing, Part 2 | API Automation: 2. Basics and Usage of the requests Module
阿新 • Published: 2020-08-01
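Step 1: What is requests
requests is an HTTP library written in Python, built on top of urllib3 and released as open source under the Apache 2.0 license. It is more convenient than the standard-library urllib, saves a good deal of work, and fully covers the needs of HTTP testing. In short, it is a simple, easy-to-use HTTP library.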
Step 2: A first example
# -*- coding:utf-8 -*-
import requests

response = requests.get('http://www.baidu.com')
print(type(response))
print(response.content)
print(response.status_code)
print(response.text)
print(type(response.text))
print(response.cookies)
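Important (both are attributes, not methods):
- response.content: the data exactly as fetched from the network, with no decoding applied. Its type is bytes.
- response.text: a str, produced by requests decoding response.content. Decoding needs an encoding, and requests guesses one. The guess is sometimes wrong, so the safest approach is to decode manually with an explicit encoding, for example response.content.decode("utf-8").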
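To see why the guessed encoding matters, here is a minimal sketch that works on a hand-made byte string rather than a live response. The sample text and the choice of ISO-8859-1 as the "wrong" encoding are assumptions for illustration only.

```python
# Bytes as they might arrive over the network
# (hypothetical sample, not a real response body)
raw = "測試 requests".encode("utf-8")

# Decoding with the wrong encoding does not fail; it silently yields garbage
wrong = raw.decode("iso-8859-1")

# Decoding with the encoding the server really used recovers the text
right = raw.decode("utf-8")

print(type(raw), type(right))
print(repr(wrong))
print(right)
```

With a real response, setting response.encoding = 'utf-8' before reading response.text should have the same effect as decoding response.content by hand.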
Step 3: Request methods
# -*- coding:utf-8 -*-
import requests

requests.post('http://httpbin.org/post')
requests.put('http://httpbin.org/put')
requests.delete('http://httpbin.org/delete')
requests.head('http://httpbin.org/get')
requests.options('http://httpbin.org/get')
- GET requests
① Basic usage
# -*- coding:utf-8 -*-
import requests

response = requests.get('http://httpbin.org/get')
print(response.text)
Output:
{ "args": {}, "headers": { "Accept": "*/*", "Accept-Encoding": "gzip, deflate", "Connection": "close", "Host": "httpbin.org", "User-Agent": "python-requests/2.18.4" }, "origin": "222.94.50.178", "url": "http://httpbin.org/get" }
② GET request with parameters
import requests

data = {
    'name': 'python',
    'age': 17
}
response = requests.get('http://httpbin.org/get', params=data)
print(response.text)
Output:
{ "args": { "age": "17", "name": "python" }, "headers": { "Accept": "*/*", "Accept-Encoding": "gzip, deflate", "Connection": "close", "Host": "httpbin.org", "User-Agent": "python-requests/2.18.4" }, "origin": "222.94.50.178", "url": "http://httpbin.org/get?name=python&age=17" }
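Differences between GET and POST requests:
- GET retrieves data from the server; POST sends data to the server.
- GET parameters are visible: they appear in the browser's address bar, and the HTTP server builds its response from the parameters contained in the request URL. In other words, the parameters of a GET request are part of the URL. For example:
http://www.baidu.com/s?wd=Chinese
- POST parameters travel in the request body. The protocol sets no limit on the body length and the parameters do not show up in the URL, so POST is typically used to submit larger amounts of data to the HTTP server (a request with many parameters, a file upload, and so on). The "Content-Type" header does not carry the parameters themselves; it declares the media type and encoding of the body.
Note: avoid submitting forms with GET, because it can cause security problems. For example, if a login form uses GET, the user name and password the user types are fully exposed in the address bar. (POST alone does not encrypt anything; that requires HTTPS.)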
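The difference can be observed without sending anything. This is a minimal sketch that builds both requests with requests.Request(...).prepare() and inspects where the parameters end up; the httpbin.org URLs are reused from the examples in this article, and the variable names are made up for the demo.

```python
import requests

params = {'name': 'python', 'age': 17}

# prepare() builds the request exactly as it would be sent, without sending it
get_req = requests.Request('GET', 'http://httpbin.org/get', params=params).prepare()
post_req = requests.Request('POST', 'http://httpbin.org/post', data=params).prepare()

# GET: the parameters are appended to the URL and there is no body
print(get_req.url)
print(get_req.body)

# POST: the URL is unchanged, the parameters are form-encoded into the body,
# and Content-Type describes how the body is encoded
print(post_req.url)
print(post_req.body)
print(post_req.headers['Content-Type'])
```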
③ Parsing JSON
import requests

response = requests.get('http://httpbin.org/get')
# response.json() parses the body, so the json module is not needed here
print(response.json())
print(type(response.json()))
④ Fetching binary data
# -*- coding:utf-8 -*-
'''
Save the Baidu logo
'''
import requests

response = requests.get('https://www.baidu.com/img/bd_logo1.png')
# The with block closes the file automatically; no explicit close() is needed
with open('baidu.png', 'wb') as f:
    f.write(response.content)
⑤ Adding headers
Scraping Zhihu directly returns an error, for example:
import requests

response = requests.get('https://www.zhihu.com/explore')
print(response.text)
Output:
<html><body><h1>500 Server Error</h1> An internal server error occured. </body></html>
Solution:
import requests

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36'
}
response = requests.get('https://www.zhihu.com/explore', headers=headers)
print(response.text)
Adding a headers dict is enough for the page to be fetched normally. The header value here was copied from the developer tools built into the Chrome browser.
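A likely reason the bare request is rejected is the default User-Agent, which identifies the client as python-requests. The sketch below, which sends nothing over the network, compares the header requests would send by default with the overridden one; the shortened user-agent string and the variable names are assumptions for the demo.

```python
import requests

url = 'https://www.zhihu.com/explore'
session = requests.Session()

# Without custom headers, requests identifies itself as python-requests/<version>
default_req = session.prepare_request(requests.Request('GET', url))

# A custom header replaces the default; header names are case-insensitive
custom_headers = {'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)'}
custom_req = session.prepare_request(
    requests.Request('GET', url, headers=custom_headers)
)

print(default_req.headers['User-Agent'])
print(custom_req.headers['User-Agent'])
```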
- Basic POST request
import requests

data = {
    'name': 'python',
    'age': 18
}
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36'
}
response = requests.post('http://httpbin.org/post', data=data, headers=headers)
print(response.json())
Example: scraping Python job listings from Lagou and storing the data as a dict
# -*- coding:utf-8 -*-
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/69.0.3497.100 Safari/537.36',
    'Referer': "https://www.lagou.com/jobs/list_python?labelWords=&fromSearch=true&suginput="
}
data = {
    'first': "True",
    'pn': "1",
    'kd': "python"
}
url = 'https://www.lagou.com/jobs/positionAjax.json?needAddtionalResult=false'
response = requests.get(url, headers=headers, params=data)
# response.json() returns the parsed body as a dict
print(response.json())
- Responses
import requests

'''
Response attributes
'''
response = requests.get('http://www.baidu.com')
print(response.status_code, type(response.status_code))
print(response.history, type(response.history))
print(response.cookies, type(response.cookies))
print(response.url, type(response.url))
print(response.headers, type(response.headers))
Output:
200 <class 'int'>
[] <class 'list'>
<RequestsCookieJar[<Cookie BDORZ=27315 for .baidu.com/>]> <class 'requests.cookies.RequestsCookieJar'>
http://www.baidu.com/ <class 'str'>
{'Server': 'bfe/1.0.8.18', 'Date': 'Thu, 05 Apr 2018 06:27:33 GMT', 'Content-Type': 'text/html', 'Last-Modified': 'Mon, 23 Jan 2017 13:28:24 GMT', 'Transfer-Encoding': 'chunked', 'Connection': 'Keep-Alive', 'Cache-Control': 'private, no-cache, no-store, proxy-revalidate, no-transform', 'Pragma': 'no-cache', 'Set-Cookie': 'BDORZ=27315; max-age=86400; domain=.baidu.com; path=/', 'Content-Encoding': 'gzip'} <class 'requests.structures.CaseInsensitiveDict'>
- Checking status codes
Status code reference table: http://www.cnblogs.com/wuzhiming/p/8722422.html
# -*- coding:utf-8 -*-
import sys
import requests

# Exit unless the page is missing, as expected for this URL
response = requests.get('http://www.cnblogs.com/hello.html')
if response.status_code != requests.codes.not_found:
    sys.exit()
print('404 not found')

# Exit unless the request succeeded
response1 = requests.get('http://www.baidu.com')
if response1.status_code != requests.codes.ok:
    sys.exit()
print('Request successful')
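Besides comparing against requests.codes by hand, a response can report error statuses itself through raise_for_status(). The following is a minimal offline sketch: the Response object is built by hand purely for the demo (a real one comes back from requests.get), and check is a hypothetical helper name.

```python
import requests


def check(status_code):
    """Return None for a non-error status, otherwise the HTTPError raised."""
    response = requests.Response()   # hand-built for the demo; nothing is sent
    response.status_code = status_code
    try:
        response.raise_for_status()  # raises HTTPError for 4xx and 5xx codes
    except requests.exceptions.HTTPError as exc:
        return exc
    return None


# requests.codes maps readable names to the numeric status codes
print(requests.codes.ok, requests.codes.not_found)
print(check(200))
print(check(404))
```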
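- Advanced usage
① File upload
import requests

# Open the file in binary mode; the with block closes it after the upload
with open('baidu.png', 'rb') as f:
    file = {'file': f}
    response = requests.post('http://httpbin.org/post', files=file)
print(response.text)
Output not shown here.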
② Getting cookies
import requests

response = requests.get('http://www.baidu.com')
cookies = response.cookies
print(cookies)
for key, value in cookies.items():
    print(key + '=' + value)
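③ Session persistence
import requests

s = requests.Session()
# httpbin sets a cookie through /cookies/set/<name>/<value>
s.get('http://httpbin.org/cookies/set/number/123456789')
# The same Session sends the cookie back on the next request
response = s.get('http://httpbin.org/cookies')
print(response.text)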
④ Certificate verification
import requests

# verify=False disables certificate verification
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)
Specifying a certificate manually:
response1 = requests.get('https://www.12306.cn', cert=('/path/server.crt', '/path/key'))
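⑤ Proxy settings
import requests

# Usage example; replace ip and port with a real proxy (free proxies can be found by searching online)
proxies = {
    'http': 'http://127.0.0.1:port',
    'https': 'https://ip:port',
    # A dict cannot hold two 'http' keys. For a proxy that needs credentials, use this form instead:
    # 'http': 'http://username:password@ip:port',
}
response = requests.get('http://www.baidu.com', proxies=proxies)
print(response.status_code)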
⑥ Timeout settings
import requests

# timeout is in seconds
response = requests.get('http://httpbin.org/get', timeout=1)
print(response.status_code)
⑦ Authentication
import requests
from requests.auth import HTTPBasicAuth

# A (user, password) tuple is shorthand for HTTPBasicAuth
response = requests.get('http://127.0.0.1:8888', auth=('user', 'password'))
response1 = requests.get('http://127.0.0.1:8888', auth=HTTPBasicAuth('user', 'password'))
print(response.status_code)
print(response1.status_code)
PS: 127.0.0.1:8888 is only an example address.
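Both forms produce the same Authorization header. This can be checked without a server listening on 127.0.0.1:8888 by preparing the requests instead of sending them. This is a minimal sketch; the variable names are made up for the demo.

```python
import base64

import requests
from requests.auth import HTTPBasicAuth

url = 'http://127.0.0.1:8888'

# prepare() applies the auth handler without sending the request
tuple_req = requests.Request('GET', url, auth=('user', 'password')).prepare()
class_req = requests.Request('GET', url, auth=HTTPBasicAuth('user', 'password')).prepare()

# Basic auth is just "user:password" base64-encoded, so it is not encrypted
expected = 'Basic ' + base64.b64encode(b'user:password').decode('ascii')

print(tuple_req.headers['Authorization'])
print(tuple_req.headers['Authorization'] == class_req.headers['Authorization'])
```

Because the credentials are only encoded, not encrypted, basic authentication is normally combined with HTTPS.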
⑧ Exception handling
import requests
from requests.exceptions import ReadTimeout, HTTPError, RequestException

try:
    response = requests.get('http://httpbin.org/get', timeout=0.01)
    # Without this call a 4xx/5xx response never raises HTTPError
    response.raise_for_status()
    print(response.status_code)
except ReadTimeout:
    print("TIME OUT")
except HTTPError:
    print('HTTP ERROR')
except RequestException:
    print("ERROR")
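The order of the except clauses matters because the requests exceptions form a hierarchy with RequestException at the root: a clause for the base class placed first would swallow everything. The sketch below mirrors that order with a hypothetical classify helper and runs offline on hand-made exception instances.

```python
from requests.exceptions import (
    ConnectionError as RequestsConnectionError,  # alias avoids shadowing the builtin
    HTTPError,
    ReadTimeout,
    RequestException,
    Timeout,
)


def classify(exc):
    """Mirror the except-clause order: most specific class first."""
    if isinstance(exc, ReadTimeout):
        return 'TIME OUT'
    if isinstance(exc, HTTPError):
        return 'HTTP ERROR'
    if isinstance(exc, RequestException):
        return 'ERROR'
    return 'NOT A REQUESTS ERROR'


# ReadTimeout is a Timeout, and every requests exception is a RequestException
print(issubclass(ReadTimeout, Timeout), issubclass(Timeout, RequestException))
print(classify(ReadTimeout()))
print(classify(HTTPError()))
print(classify(RequestsConnectionError()))
```

A connection that cannot be established at all (for example a connect timeout) is not a ReadTimeout, so in the handler above it would fall through to the final RequestException clause.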