A Brief Look at the requests Library

This is an original article by ShyButHandsome on cnblogs; please credit the source when reposting

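Installation

pip install requests  # note: requests, not request (with the trailing s)

Basic usage of requests

# Basic usage of requests; just get a feel for it here, the details come later
import requests


r = requests.get("https://www.baidu.com")  # builds a Request object and returns a Response object

print(r.status_code)  # status code; 200 means the request succeeded

print(r.encoding)  # current encoding, taken from the charset field of the response header

print(r.apparent_encoding)  # encoding that requests guesses from the content itself (fallback encoding)

r.encoding = 'utf-8'  # we write this only because we already know the page's encoding

# More generally, r.encoding = r.apparent_encoding lets requests pick the encoding for us

print(r.text)  # the page content that was fetched

A quick overview (covered in more detail later):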

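Method	Description	`HTTP` method
`requests.request()`	Builds a request; the base method underlying all the methods below	(all)
`requests.get()`	The main method for fetching a resource such as an `HTML` page	`GET`
`requests.head()`	Fetches only the response headers of a resource	`HEAD`
`requests.post()`	Submits a `POST` request to a `URL`	`POST`
`requests.put()`	Submits a `PUT` request to a `URL`	`PUT`
`requests.patch()`	Submits a partial-modification request to a `URL`	`PATCH`
`requests.delete()`	Submits a delete request to a `URL`	`DELETE`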

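Understanding the `requests` library through `requests.get()`

r = requests.get(url)

Calling requests.get() will:

construct a Request object that asks the server for a resource
return a Response object that contains what the server sent back

The Response object here is r

type(r)
# In the Python console this returns <class 'requests.models.Response'>
# i.e. r is a Response object

r holds everything the server returned, and it also carries information about the Request we sent

Full signature of requests.get():

requests.get(url, params=None, **kwargs)

url: the URL of the target page
params: extra parameters appended to the URL, as a dict or bytes (optional)
**kwargs: 12 further arguments that control the request (optional)

A look inside get():

The following is an excerpt from requests/api.py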

def get(url, params=None, **kwargs):
    """Sends a GET request.

    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary, list of tuples or bytes to send in the query string for the :class:`Request`.
    :param \*\*kwargs: Optional arguments that ``request`` takes.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response
    """

    kwargs.setdefault('allow_redirects', True)
    return request('get', url, params=params, **kwargs)

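Looking at the final return statement above, we can see that:

get() is just a thin wrapper around request()

The requests library offers 7 commonly used methods (see the overview table above)

In fact:

apart from request() itself,

the other 6 methods are all implemented by calling request()

As mentioned earlier, requests.get() will:

construct a Request object that asks the server for a resource

return a Response object that contains what the server sent back

The Response object returned by requests.get() is actually produced by request()

You could even say the requests library has only one method: requests.request()

requests.request() involves two important objects:

the Request object (which sends the request), and

the Response object (which stores what came back)

The Response object holds everything the crawler gets back, so it is the most important of all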
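One way to check the "everything goes through request()" claim without touching the network is to patch request() out and look at what each helper passes to it. This is only a sketch: the helper name verbs_passed_to_request and the placeholder URL are made up for illustration, and it assumes the helpers live in requests/api.py as in the excerpt above.

```python
from unittest import mock

import requests


def verbs_passed_to_request(url="https://example.invalid/"):
    """Call each helper with request() patched out and record the verb it passes."""
    helpers = [requests.get, requests.head, requests.post,
               requests.put, requests.patch, requests.delete]
    seen = []
    # Patch the module-level request() that the helpers call, so nothing is sent.
    with mock.patch("requests.api.request") as fake_request:
        for helper in helpers:
            helper(url)
            # The first positional argument of request() is the method name.
            seen.append(fake_request.call_args[0][0])
    return seen


print(verbs_passed_to_request())
```

Each helper hands its own verb to request() and adds nothing else of substance, which is why the article treats request() as the real entry point.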

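Some common and important attributes of the Response object:

Attribute	Description
`r.status_code`	Status code of the `HTTP` response; 200 means success, while codes such as 404 indicate failure
`r.text`	The `HTTP` response body as a string, i.e. the page content at the `url`
`r.encoding`	Encoding of the response body guessed from the `HTTP header`
`r.apparent_encoding`	Encoding of the response body inferred from the content itself (fallback encoding)
`r.content`	The `HTTP` response body as raw bytes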

About encoding:

r.encoding is read from the charset field of the Content-Type header,

and if a text response has no charset field, the encoding defaults to ISO-8859-1
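The fallback rule can be tried offline. The sketch below calls requests.utils.get_encoding_from_headers on hand-written header dicts (the dicts and the sample string are assumptions for illustration, not real responses), then shows why UTF-8 bytes read as ISO-8859-1 come out garbled.

```python
import requests

# With a charset in Content-Type, that charset is used.
with_charset = requests.utils.get_encoding_from_headers(
    {"content-type": "text/html; charset=utf-8"})
print(with_charset)

# A text/* response without a charset falls back to ISO-8859-1.
without_charset = requests.utils.get_encoding_from_headers(
    {"content-type": "text/html"})
print(without_charset)

# Decoding UTF-8 bytes as ISO-8859-1 yields mojibake instead of the original text.
original = "百度一下"
raw = original.encode("utf-8")
garbled = raw.decode("iso-8859-1")
print(garbled == original)

# Decoding the same bytes as UTF-8 restores the text.
print(raw.decode("utf-8") == original)
```

This is the same thing that happens to r.text when r.encoding is left at ISO-8859-1 for a UTF-8 page.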

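A few small examples about the Response object:

r.status_code: the same idea as the familiar "404, page not found"
r.text: holds the page source as a string; the encoding used to decode it can be changed
r.content: if you get() an image, it is stored in binary form, and r.content gives you the bytes needed to restore it

Basic flow for fetching a page:

graph TD
    A[requests.get] --> B(r.status_code)
    B --> C{200}
    C -->|YES| D[Decode r.text with a readable encoding]
    C -->|NO| E[Something went wrong]
    D --> F[Further processing]
    E --> G[Debug]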

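# Trying it out in the Python console (practice is the best teacher)
>>> import requests
>>> 
>>> r = requests.get("https://www.baidu.com")
>>> 
>>> type(r)
<class 'requests.models.Response'>
>>> 
>>> r.status_code  # check the status code; anything other than 200 means something went wrong
200
>>> # Status code 200, all good, on to the next step
... 
>>> r.encoding
'ISO-8859-1'
>>> # The encoding here is 'ISO-8859-1', which cannot represent Chinese; let's print the text and see
...    
>>> r.text  # the console wraps this output; pasted here it is one very long line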
'<!DOCTYPE html>\r\n<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/bdorz/baidu.min.css><title>ç\x99¾åº¦ä¸\x80ä¸\x8bï¼\x8cä½\xa0å°±ç\x9f¥é\x81\x93</title></head> <body link=#0000cc> <div id=wrapper> <div id=head> <div class=head_wrapper> <div class=s_form> <div class=s_form_wrapper> <div id=lg> <img hidefocus=true src=//www.baidu.com/img/bd_logo1.png width=270 height=129> </div> <form id=form name=f action=//www.baidu.com/s class=fm> <input type=hidden name=bdorz_come value=1> <input type=hidden name=ie value=utf-8> <input type=hidden name=f value=8> <input type=hidden name=rsv_bp value=1> <input type=hidden name=rsv_idx value=1> <input type=hidden name=tn value=baidu><span class="bg s_ipt_wr"><input id=kw name=wd class=s_ipt value maxlength=255 autocomplete=off autofocus=autofocus></span><span class="bg s_btn_wr"><input type=submit id=su value=ç\x99¾åº¦ä¸\x80ä¸\x8b class="bg s_btn" autofocus></span> </form> </div> </div> <div id=u1> <a href=http://news.baidu.com name=tj_trnews class=mnav>æ\x96°é\x97»</a> <a href=https://www.hao123.com name=tj_trhao123 class=mnav>hao123</a> <a href=http://map.baidu.com name=tj_trmap class=mnav>å\x9c°å\x9b¾</a> <a href=http://v.baidu.com name=tj_trvideo class=mnav>è§\x86é¢\x91</a> <a href=http://tieba.baidu.com name=tj_trtieba class=mnav>è´´å\x90§</a> <noscript> <a href=http://www.baidu.com/bdorz/login.gif?login&amp;tpl=mn&amp;u=http%3A%2F%2Fwww.baidu.com%2f%3fbdorz_come%3d1 name=tj_login class=lb>ç\x99»å½\x95</a> </noscript> <script>document.write(\'<a href="http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u=\'+ encodeURIComponent(window.location.href+ (window.location.search === "" ? "?" 
: "&")+ "bdorz_come=1")+ \'" name="tj_login" class="lb">ç\x99»å½\x95</a>\');\r\n                </script> <a href=//www.baidu.com/more/ name=tj_briicon class=bri style="display: block;">æ\x9b´å¤\x9aäº§å\x93\x81</a> </div> </div> </div> <div id=ftCon> <div id=ftConw> <p id=lh> <a href=http://home.baidu.com>å\x85³äº\x8eç\x99¾åº¦</a> <a href=http://ir.baidu.com>About Baidu</a> </p> <p id=cp>&copy;2017&nbsp;Baidu&nbsp;<a href=http://www.baidu.com/duty/>ä½¿ç\x94¨ç\x99¾åº¦å\x89\x8då¿\x85è¯»</a>&nbsp; <a href=http://jianyi.baidu.com/ class=cp-feedback>æ\x84\x8fè§\x81å\x8f\x8dé¦\x88</a>&nbsp;äº¬ICPè¯\x81030173å\x8f·&nbsp; <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>\r\n'
>>> 
>>> # The text is garbled (no surprise)
... # Next we make r.text readable, which is where r.encoding and r.apparent_encoding come in
... 
>>> r.apparent_encoding  # check the fallback encoding
'utf-8'
>>> r.encoding = 'utf-8'  # or simply r.encoding = r.apparent_encoding
>>> 
>>> r.encoding  # confirm that the encoding was changed
'utf-8'
>>> # Changed successfully; now let's see the result
... 
>>> r.text  # the console wraps this output; pasted here it is one very long line
'<!DOCTYPE html>\r\n<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/bdorz/baidu.min.css><title>百度一下，你就知道</title></head> <body link=#0000cc> <div id=wrapper> <div id=head> <div class=head_wrapper> <div class=s_form> <div class=s_form_wrapper> <div id=lg> <img hidefocus=true src=//www.baidu.com/img/bd_logo1.png width=270 height=129> </div> <form id=form name=f action=//www.baidu.com/s class=fm> <input type=hidden name=bdorz_come value=1> <input type=hidden name=ie value=utf-8> <input type=hidden name=f value=8> <input type=hidden name=rsv_bp value=1> <input type=hidden name=rsv_idx value=1> <input type=hidden name=tn value=baidu><span class="bg s_ipt_wr"><input id=kw name=wd class=s_ipt value maxlength=255 autocomplete=off autofocus=autofocus></span><span class="bg s_btn_wr"><input type=submit id=su value=百度一下 class="bg s_btn" autofocus></span> </form> </div> </div> <div id=u1> <a href=http://news.baidu.com name=tj_trnews class=mnav>新聞</a> <a href=https://www.hao123.com name=tj_trhao123 class=mnav>hao123</a> <a href=http://map.baidu.com name=tj_trmap class=mnav>地圖</a> <a href=http://v.baidu.com name=tj_trvideo class=mnav>視訊</a> <a href=http://tieba.baidu.com name=tj_trtieba class=mnav>貼吧</a> <noscript> <a href=http://www.baidu.com/bdorz/login.gif?login&amp;tpl=mn&amp;u=http%3A%2F%2Fwww.baidu.com%2f%3fbdorz_come%3d1 name=tj_login class=lb>登入</a> </noscript> <script>document.write(\'<a href="http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u=\'+ encodeURIComponent(window.location.href+ (window.location.search === "" ? "?" 
: "&")+ "bdorz_come=1")+ \'" name="tj_login" class="lb">登入</a>\');\r\n                </script> <a href=//www.baidu.com/more/ name=tj_briicon class=bri style="display: block;">更多產品</a> </div> </div> </div> <div id=ftCon> <div id=ftConw> <p id=lh> <a href=http://home.baidu.com>關於百度</a> <a href=http://ir.baidu.com>About Baidu</a> </p> <p id=cp>&copy;2017&nbsp;Baidu&nbsp;<a href=http://www.baidu.com/duty/>使用百度前必讀</a>&nbsp; <a href=http://jianyi.baidu.com/ class=cp-feedback>意見反饋</a>&nbsp;京ICP證030173號&nbsp; <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>\r\n'
>>> 
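>>> # This time the Chinese characters show up correctly
... 
>>> # Next, let's try a status code other than 200
... # Deliberately making mistakes is a good way to deepen your understanding of a new technology
... 
>>> # For example, mistype the URL on purpose
...
>>> r = requests.get("http://www.fkbaidu.com")
>>> r.status_code
503
>>> # Or visit some "ruins" that no longer exist
... 
>>> r = requests.get("https://www.github.com/breakwa11/shadowsocksr")
>>> r.status_code
404
>>> 
>>> # Try more cases on your own
... # Network connections are risky, so exception handling matters
... # Deliberate trial and error is how you learn the common exceptions
... 
>>> exit()  # leave the console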

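A general code framework for fetching pages

Network connections are risky, so exception handling matters

And if exception handling matters, we had better know which exceptions exist, right?

Common exceptions in the Requests library

Exception	Description
`requests.ConnectionError`	Network connection error, e.g. `DNS` lookup failure or connection refused
`requests.HTTPError`	`HTTP` error
`requests.URLRequired`	A `URL` is missing
`requests.TooManyRedirects`	The maximum number of redirects was exceeded
`requests.ConnectTimeout`	Timed out while connecting to the remote server
`requests.Timeout`	The request to the `URL` timed out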

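The difference between requests.ConnectTimeout and requests.Timeout:

Timeout is the general timeout exception for a request; it covers both timing out while connecting and timing out while waiting for the server to send data
ConnectTimeout only covers timing out while establishing the connection to the remote server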
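The relationship between these exceptions can be read straight off the class hierarchy. The sketch below only inspects classes and sends nothing; requests.ReadTimeout is not mentioned in the table above and is brought in here purely for comparison.

```python
import requests

# Timeout is the common base class of the two more specific timeouts.
connect_is_timeout = issubclass(requests.ConnectTimeout, requests.Timeout)
read_is_timeout = issubclass(requests.ReadTimeout, requests.Timeout)
print(connect_is_timeout, read_is_timeout)

# ConnectTimeout is also a ConnectionError; ReadTimeout is not.
connect_is_conn_error = issubclass(requests.ConnectTimeout, requests.ConnectionError)
read_is_conn_error = issubclass(requests.ReadTimeout, requests.ConnectionError)
print(connect_is_conn_error, read_is_conn_error)

# Timeout itself derives from the library's base exception.
timeout_is_request_exc = issubclass(requests.Timeout, requests.RequestException)
print(timeout_is_request_exc)
```

So catching requests.Timeout handles both kinds of timeout, while catching requests.ConnectTimeout handles only the connection phase.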

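The raise_for_status() method:

Recall the flowchart from earlier:

graph TD
    A[requests.get] --> B(r.status_code)
    B --> C{200}
    C -->|YES| D[Decode r.text with a readable encoding]
    C -->|NO| E[Something went wrong]
    D --> F[Further processing]
    E --> G[Debug]

We should only move on to the next step when the request succeeded (status_code 200 in the flowchart)

The raise_for_status() method performs this check for us

If status_code is a 4xx or 5xx error code, it raises requests.HTTPError; otherwise it does nothing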
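To see exactly which status codes trigger the exception, here is a small offline sketch. It builds an empty requests.Response by hand and sets status_code on it, which is an illustration trick rather than how responses normally come about; the helper name raises_http_error is made up.

```python
import requests


def raises_http_error(status_code):
    """Return True if raise_for_status() raises for the given status code."""
    r = requests.Response()  # hand-built, empty Response; no network involved
    r.status_code = status_code
    try:
        r.raise_for_status()
    except requests.HTTPError:
        return True
    return False


for code in (200, 301, 404, 503):
    print(code, raises_http_error(code))
```

Note that a redirect code such as 301 does not raise; only client errors (4xx) and server errors (5xx) do.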

Now that we know a few common exceptions, we can look at the general code framework

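import requests


def main():
    url = "https://www.baidu.com"
    print(get_HTML_text(url))


def get_HTML_text(url):
    try:
        r = requests.get(url, timeout=30)
        # raises requests.HTTPError on a 4xx/5xx status code
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        # handles any error raised by requests (connection, timeout, HTTP error, ...)
        return "An exception occurred!!!"


if __name__ == "__main__":
    main()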

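This code framework follows directly from the flowchart:

if an exception is raised, handle it;

if there is none, carry on with the work

Of course, crawler code does not have to be written exactly like this

It is just one way of doing it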
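As one possible variation, the exceptions listed earlier can be told apart instead of being caught in one block. This is a sketch, not the article's framework: the function name fetch_text, the (text, error) return shape, and the timeout values are all assumptions, and the last lines simulate a timeout with unittest.mock so nothing is actually sent.

```python
from unittest import mock

import requests


def fetch_text(url, timeout=(3.05, 30)):
    """Return (text, error); exactly one of the two is None."""
    try:
        r = requests.get(url, timeout=timeout)  # (connect timeout, read timeout)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text, None
    except requests.Timeout:
        # Checked first because ConnectTimeout is also a ConnectionError.
        return None, "timeout"
    except requests.ConnectionError:
        return None, "connection error"
    except requests.HTTPError as exc:
        return None, f"HTTP error {exc.response.status_code}"
    except requests.RequestException as exc:
        return None, f"other requests error: {exc}"


# Simulate a connect timeout without touching the network.
with mock.patch("requests.get", side_effect=requests.ConnectTimeout):
    print(fetch_text("https://www.baidu.com"))
```

The order of the except clauses matters: the more specific or overlapping classes come before requests.RequestException, which catches whatever is left.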

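The `requests.request()` method

In the basic-usage section we mentioned the 7 commonly used methods of the Requests library

Method	Description	`HTTP` method
`requests.request()`	Builds a request; the base method underlying all the methods below	(all)
`requests.get()`	The main method for fetching a resource such as an `HTML` page	`GET`
`requests.head()`	Fetches only the response headers of a resource	`HEAD`
`requests.post()`	Submits a `POST` request to a `URL`	`POST`
`requests.put()`	Submits a `PUT` request to a `URL`	`PUT`
`requests.patch()`	Submits a partial-modification request to a `URL`	`PATCH`
`requests.delete()`	Submits a delete request to a `URL`	`DELETE`

And in the section on requests.get() we said:

You could even say the requests library has only one method: requests.request()

So now let's look at requests.request() on its own

Once again, I went and dug up its source code

Source code is the best manual, with the official docs a close second

And, luckily,

I found that it exposes a well-documented interface

Here is the code from requests/api.py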

def request(method, url, **kwargs):
    """Constructs and sends a :class:`Request <Request>`.

    :param method: method for the new :class:`Request` object: ``GET``, ``OPTIONS``, ``HEAD``, ``POST``, ``PUT``, ``PATCH``, or ``DELETE``.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary, list of tuples or bytes to send in the query string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the :class:`Request`.
    :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload. ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')`` or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How many seconds to wait for the server to send data before giving up, as a float, or a :ref:`(connect timeout, read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify the server's TLS certificate, or a string, in which case it must be a path to a CA bundle to use. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

    Usage::

      >>> import requests
      >>> req = requests.request('GET', 'https://httpbin.org/get')
      >>> req
      <Response [200]>
    """

    # By using the 'with' statement we are sure the session is closed, thus we
    # avoid leaving sockets open which can trigger a ResourceWarning in some
    # cases, and look like a memory leak in others.
    with sessions.Session() as session:
        return session.request(method=method, url=url, **kwargs)

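Here too, the source code contains very detailed usage notes

I'd rather not act as a translation machine;

you're a grown-up programmer now, so learn to read it yourself (and avoid searching Baidu when you can)

So let me just go over it roughly in my own words!

Starting from its head: def request(method, url, **kwargs):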

`method`

method: method for the new :class:Request object: "GET", "OPTIONS", "HEAD", "POST", "PUT", "PATCH", or "DELETE".

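This argument is used to build the Request object (i.e. to submit a request)

It decides the request method

There are 7 possible values (case-insensitive, as found by experiment):

"GET"
"OPTIONS"
"HEAD"
"POST"
"PUT"
"PATCH"
"DELETE"

Each corresponds to the HTTP method of the same name (how thoughtful)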
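The case-insensitivity can be checked without sending anything: preparing a Request shows the method as it would go on the wire. A minimal sketch, using the httpbin URL from the library's own docstring:

```python
import requests

# prepare() builds the request that would be sent, without sending it.
for name in ("get", "Get", "GET"):
    prepared = requests.Request(name, "https://httpbin.org/get").prepare()
    print(name, "->", prepared.method)
```

Whatever the spelling, the prepared request carries the upper-case method name.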

`url`

url: URL for the new :class:Request object.

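url decides which page the request is sent to

Its value is the URL of the target site

Here is the example given in the source code:

>>> import requests
>>> req = requests.request('GET', 'https://httpbin.org/get')
>>> req
<Response [200]>

Here 'GET' is the method

and 'https://httpbin.org/get' is the url

We can also see that

req is a Response object

All of this was covered above, so I won't repeat it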

`**kwargs`

These are the optional arguments, 13 in total:

params
data
json
headers
cookies
files
auth
timeout
allow_redirects
proxies
verify
stream
cert

params

params: (optional) Dictionary, list of tuples or bytes to send in the query string for the :class:Request.

Its value is a dict, a list of tuples, or bytes

It is added to the url as query-string parameters

For example:

>>> # Demonstrating params in the Python console
... 
>>> import requests
>>> 
>>> dic = {'wd': 'ShyButHandsome'}
>>> 
>>> r = requests.get("https://www.baidu.com/s", params=dic)  # pass params explicitly by keyword
>>> 
>>> r.url  # this has the effect of searching Baidu for "ShyButHandsome"
'https://www.baidu.com/s?wd=ShyButHandsome'
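The same URL building can be reproduced offline by preparing the request instead of sending it. The sketch below also shows the list-of-tuples form, which the docstring allows but the console example does not use; the values "a" and "b" are made up.

```python
import requests

url = "https://www.baidu.com/s"

# A dict becomes key=value pairs in the query string.
from_dict = requests.Request("GET", url, params={"wd": "ShyButHandsome"}).prepare()
print(from_dict.url)

# A list of tuples allows the same key to appear more than once.
from_pairs = requests.Request("GET", url, params=[("wd", "a"), ("wd", "b")]).prepare()
print(from_pairs.url)
```

Use a dict for ordinary parameters and a list of tuples when a key has to be repeated.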

data

data: (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the :class:Request.

Its value is a dict, a list of tuples, bytes, or a file-like object

It is sent to the server as the request body (this is how a login form can be submitted, for example)

For example:

>>> # Demonstrating data in the Python console
... 
>>> import requests
>>> 
>>> url = "http://httpbin.org/post"  # I wanted to POST to Baidu, but that fails; httpbin.org is a site meant for practice
>>> dic = {'key': 'value'}
>>> r = requests.post(url, data=dic)
>>> r.text
'{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "key": "value"  // here
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "9", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.23.0"
  }, 
  "json": null, 
  "url": "http://httpbin.org/post"
}
'

json

json: (optional) A JSON serializable Python object to send in the body of the :class:Request.

Its value is a Python object

It is encoded as JSON and sent in the request body

At first I wondered: can't data do the same thing?

It can, but json is simpler: you don't have to import json yourself, and requests also sets the Content-Type header to application/json

>>> # Without importing json
...
>>> import requests
>>> url = "http://httpbin.org/post" 
>>> dic = {'key': 'value'}
>>> r = requests.post(url, json=dic)
>>> r.text
'{
  "args": {}, 
  "data": "{\\"key\\": \\"value\\"}", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "16", 
    "Content-Type": "application/json", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.23.0"
  }, 
  "json": {
    "key": "value"  // here
  }, 
  "url": "http://httpbin.org/post"
}
'

>>> # With json imported
... 
>>> import json  # here
>>> import requests  
>>> url = "http://httpbin.org/post" 
>>> dic = {'key': 'value'}
>>> r = requests.post(url, data=json.dumps(dic))  # here
>>> r.text
'{
  "args": {}, 
  "data": "{\\"key\\": \\"value\\"}", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "16", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.23.0"
  }, 
  "json": {
    "key": "value"  // here
  }, 
  "url": "http://httpbin.org/post"
}
'
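The three ways of sending the dict can be compared side by side without a server by preparing the requests and looking at the body and the Content-Type header. A sketch only: the helper name describe is made up, and the body is normalised to text because it may be str or bytes depending on how it was built.

```python
import json

import requests

url = "http://httpbin.org/post"
dic = {"key": "value"}


def describe(prepared):
    """Return (Content-Type, body as text) of a prepared request."""
    body = prepared.body
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    return prepared.headers.get("Content-Type"), body


form = describe(requests.Request("POST", url, data=dic).prepare())
as_json = describe(requests.Request("POST", url, json=dic).prepare())
dumped = describe(requests.Request("POST", url, data=json.dumps(dic)).prepare())

print(form)     # form-encoded body
print(as_json)  # JSON body with a JSON Content-Type
print(dumped)   # same JSON text, but no Content-Type is set
```

This matches the httpbin outputs above: only the json= variant carries Content-Type: application/json, which some servers require before they will parse the body as JSON.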

headers

headers: (optional) Dictionary of HTTP Headers to send with the :class:Request.

Its value is a dict

It sets custom HTTP headers (some sites only allow certain browsers, in which case you can set user-agent in the headers to imitate such a browser)

For example:

>>> # Imitate Chrome 10 by changing user-agent in the headers
>>> import requests  
>>> url = "http://httpbin.org/post" 
>>> hd = {'user-agent': 'Chrome/10'}
>>> r = requests.post(url, headers=hd)
>>> r.text
'{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "0", 
    "Host": "httpbin.org", 
    "User-Agent": "Chrome/10"  // here
  }, 
  "json": null, 
  "url": "http://httpbin.org/post"
}
'

cookies

cookies: (optional) Dict or CookieJar object to send with the :class:Request.

Its value is a dict or a CookieJar

It provides the cookies sent with the Request
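A dict passed as cookies ends up in the Cookie header of the outgoing request. The sketch below prepares a request offline to show that; the cookie name and value and the httpbin path are made up for illustration.

```python
import requests

prepared = requests.Request(
    "GET", "https://httpbin.org/cookies", cookies={"session": "abc123"}
).prepare()

# The dict is serialised into a single Cookie header.
cookie_header = prepared.headers["Cookie"]
print(cookie_header)
```

For cookies that should persist across several requests, a requests.Session is usually the more convenient tool, since it stores cookies set by the server automatically.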

files

files: (optional) Dictionary of 'name': file-like-objects (or {'name': file-tuple}) for multipart encoding upload. file-tuple can be a 2-tuple ('filename', fileobj), 3-tuple ('filename', fileobj, 'content_type') or a 4-tuple ('filename', fileobj, 'content_type', custom_headers), where 'content-type' is a string defining the content type of the given file and custom_headers a dict-like object containing additional headers to add for the file.

Its value is a dict

It is used to upload files (i.e. to submit a file to a given URL)

auth

auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.

Its value is a tuple

It enables HTTP authentication
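For the simplest case, a (username, password) tuple is treated as HTTP Basic auth. The sketch below prepares such a request offline and compares the resulting header with a hand-computed value; the credentials and the httpbin path are placeholders.

```python
import base64

import requests

prepared = requests.Request(
    "GET", "https://httpbin.org/basic-auth/user/passwd", auth=("user", "passwd")
).prepare()

# Basic auth is "user:password" encoded with base64 in the Authorization header.
expected = "Basic " + base64.b64encode(b"user:passwd").decode("ascii")
auth_header = prepared.headers["Authorization"]
print(auth_header == expected)
```

Because base64 is an encoding and not encryption, Basic auth should only be sent over HTTPS.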

timeout

timeout: (optional) How many seconds to wait for the server to send data before giving up, as a float, or a :ref:(connect timeout, read timeout) <timeouts> tuple.

Its value is a float or a tuple

If the server does not respond within the given time (in seconds), a timeout exception is raised

proxies

proxies: (optional) Dictionary mapping protocol to the URL of the proxy.

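Its value is a dict

It sets the proxy servers used for the request (login credentials for the proxy can be included)

Going through a proxy hides the crawler's own address from the target site, which makes the crawler harder to trace back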

allow_redirects

allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to True.

Boolean, defaults to True

It switches redirect following on or off

stream

stream: (optional) if False, the response content will be immediately downloaded.

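Boolean, defaults to False

It controls whether the response body is downloaded immediately (False) or only when it is read (True)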
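With stream=True the body is usually read piece by piece through iter_content(), which is handy for images and other large binary files. The sketch below shows that loop offline: the helper name save_in_chunks is made up, and the "response" is a hand-built requests.Response whose raw stream is an in-memory buffer standing in for a real connection.

```python
import io

import requests


def save_in_chunks(response, fileobj, chunk_size=8192):
    """Copy a response body to fileobj piece by piece; return the bytes written."""
    written = 0
    for chunk in response.iter_content(chunk_size=chunk_size):
        written += fileobj.write(chunk)
    return written


# Offline stand-in for a streamed response (an assumption for illustration).
payload = b"\x89PNG fake image bytes"
fake = requests.Response()
fake.status_code = 200
fake.raw = io.BytesIO(payload)

out = io.BytesIO()
written = save_in_chunks(fake, out, chunk_size=4)
print(written == len(payload), out.getvalue() == payload)
```

Against a real server the call would look like r = requests.get(url, stream=True) followed by save_in_chunks(r, f) with f opened in "wb" mode, so the whole file never has to sit in memory at once.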

verify

verify: (optional) Either a boolean, in which case it controls whether we verify the server's TLS certificate, or a string, in which case it must be a path to a CA bundle to use. Defaults to True.

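Boolean or string, defaults to True

It switches verification of the server's TLS/SSL certificate on or off; a string is taken as the path to a CA bundle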

cert

cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.

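Its value is a string or a tuple

It is the path to a local SSL client certificate

The title of this article says "a brief look", so writing more would only be a burden

If you want to dig deeper, I recommend reading the requests documentation carefully and taking notes (there is a Chinese version too!)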

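Final words

Reading the official documentation is not enough on its own; practice matters just as much

Otherwise you learn fast and forget just as fast

I'm ShyButHandsome, a very handsome guy who is endlessly curious about technology. If you think this article is decent, how about giving it a recommendation?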
