Web Scraping: Using the Requests Module (Part 2)

阿新 • Published: 2018-10-15


Requests

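The Requests module

Requests is a module for making network requests. There are many similar modules, such as urllib, urllib2, httplib, and httplib2, and they all provide broadly similar functionality.

In the previous post we already used the urllib module.

Requests is more convenient than urllib and saves a great deal of work. It is also more powerful, so Requests is the recommended choice.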

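Request methods

requests provides a variety of request methods.

HTTP defines several ways of interacting with a server. The four most basic methods are GET, POST, PUT, and DELETE. A URL identifies a resource on the network, and these four methods correspond to retrieving, creating, updating, and deleting that resource. requests.get(), used in the examples in this post, reads the content of a given page without modifying it, so it is effectively "read-only". The requests library provides all of the basic HTTP request methods, each available as a one-line call.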


All of the methods above are built on top of this one:

requests.request(method, url, **kwargs)
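To make the relationship concrete: requests.get(url) is shorthand for requests.request("GET", url), and the same holds for the other methods. The sketch below does not send anything. It uses requests.Request(...).prepare() to build the request that would be sent, so the method and URL can be inspected offline. The variable name prepared_methods is only for illustration.

```python
import requests

url = "http://httpbin.org/get"  # endpoint taken from the examples in this post

# Prepare (but do not send) one request per HTTP method to inspect it offline.
prepared_methods = []
for method in ("get", "post", "put", "delete"):
    prepared = requests.Request(method, url).prepare()
    prepared_methods.append(prepared.method)  # requests upper-cases the method
    print(prepared.method, prepared.url)
```

To actually send one of these, call requests.request(method, url) or the matching shortcut such as requests.put(url).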

GET request: requests.get(url)

import requests

response = requests.get('http://httpbin.org/get')  # returns a Response instance carrying a lot of information

print(response.text)  # the content of the requested page

GET request with parameters: requests.get(url, params=None)

Query parameters are usually passed in the form httpbin.org/get?key=val. Requests lets you supply them through the params keyword argument as a dictionary.

For example, suppose we want to pass key1=value1 and key2=value2 to http://httpbin.org/get.

The constructed URL: http://httpbin.org/get?key1=value1&key2=value2

import requests
data = {
    "key1": "value1",
    "key2": "value2"
}
response = requests.get("http://httpbin.org/get", params=data)
print(response.url)

The output is:

C:\Pycham\venv\Scripts\python.exe C:/Pycham/demoe3.py
http://httpbin.org/get?key1=value1&key2=value2

Process finished with exit code 0

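As you can see, parameters are separated by & and each name is joined to its value with =.

Both approaches give the same result: passing a dictionary through params builds the URL for you.

Note: when parameters are passed as a dictionary, any entry whose value is None is not added to the URL.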
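The None rule can be checked without a network connection. This minimal sketch prepares a GET request and prints the URL that would be used. The key2 entry is set to None deliberately, and the dictionary contents are illustrative only.

```python
import requests

# key2 is None on purpose, to show that it is left out of the query string.
params = {"key1": "value1", "key2": None}

prepared = requests.Request(
    "GET", "http://httpbin.org/get", params=params
).prepare()

print(prepared.url)  # http://httpbin.org/get?key1=value1
```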

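POST request: requests.post(url, data=data)

requests.post() is used in much the same way as requests.get(). The difference is that requests.post() takes a data argument, which holds the request body.

Note: as with GET, you can also pass a dictionary through the headers parameter when sending a POST request.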

import requests

data = {
    "name":"zhaofan",
    "age":23
}
response = requests.post("http://httpbin.org/post",data=data)
print(response.text)

The output is shown below.

You can see that the parameters were sent successfully and that the server echoed back the data we sent.

C:\Pycham\venv\Scripts\python.exe C:/Pycham/demoe3.py
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "age": "23", 
    "name": "zhaofan"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "19", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.19.1"
  }, 
  "json": null, 
  "origin": "218.200.145.68", 
  "url": "http://httpbin.org/post"
}
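The headers in that output can be reproduced locally. The sketch below prepares the same POST without sending it and shows how the dictionary becomes a form-encoded body. It also passes a custom headers dictionary; the User-Agent value "my-crawler/0.1" is a made-up example, not something from the original post.

```python
import requests

data = {"name": "zhaofan", "age": 23}
headers = {"User-Agent": "my-crawler/0.1"}  # hypothetical value for illustration

prepared = requests.Request(
    "POST", "http://httpbin.org/post", data=data, headers=headers
).prepare()

print(prepared.body)                       # form-encoded request body
print(prepared.headers["Content-Type"])    # set automatically for dict data
print(prepared.headers["Content-Length"])  # length of the encoded body
print(prepared.headers["User-Agent"])      # our custom header
```

The body is 19 characters long, which matches the Content-Length of 19 that httpbin reported above.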

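Sending JSON data

Sometimes the data we need to send is not form-encoded and must be sent as JSON instead, so we can use json.dumps() to serialize the dictionary.

import json
import requests

url = 'http://httpbin.org/post'
data = {'some': 'data'}
r = requests.post(url, data=json.dumps(data))  # send the serialized JSON string as the body
print(r.text)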

The output is:

C:\Pycham\venv\Scripts\python.exe C:/Pycham/demoe3.py
{
  "args": {}, 
  "data": "{\"some\": \"data\"}", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Connection": "close", 
    "Content-Length": "16", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.19.1"
  }, 
  "json": {
    "some": "data"
  }, 
  "origin": "218.200.145.68", 
  "url": "http://httpbin.org/post"
}
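Note that the request above carries no Content-Type header. As an alternative, requests also accepts a json keyword argument that serializes the dictionary and sets Content-Type to application/json. The following is a small offline sketch of that option, prepared but not sent; the variable names are illustrative.

```python
import json
import requests

payload = {"some": "data"}

# json= serializes the dict and sets the Content-Type header for us.
prepared = requests.Request(
    "POST", "http://httpbin.org/post", json=payload
).prepare()

print(prepared.headers["Content-Type"])
print(json.loads(prepared.body))  # decode the body again to confirm its content
```

In real code the equivalent call would be requests.post(url, json=payload).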

Uploading files

To upload a file, simply use the files parameter.

import requests

url = 'http://httpbin.org/post'
# Open in binary mode so the bytes are sent unchanged
with open('test.txt', 'rb') as f:
    r = requests.post(url, files={'file': f})
print(r.text)

The output is:

{
  "args": {}, 
  "data": "", 
  "files": {
    "file": "Hello World!"
  }, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "156", 
    "Content-Type": "multipart/form-data; boundary=7d8eb5ff99a04c11bb3e862ce78d7000", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.9.1"
  }, 
  "json": null, 
  "url": "http://httpbin.org/post"
}
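The files parameter also accepts a (filename, file object, content type) tuple, which gives control over the name and type reported to the server. The sketch below uses an in-memory io.BytesIO object in place of test.txt so that it runs without a file on disk, and it only prepares the request instead of sending it.

```python
import io
import requests

# In-memory stand-in for test.txt, so no real file is needed.
fake_file = io.BytesIO(b"Hello World!")
files = {"file": ("test.txt", fake_file, "text/plain")}

prepared = requests.Request(
    "POST", "http://httpbin.org/post", files=files
).prepare()

print(prepared.headers["Content-Type"])   # multipart/form-data with a boundary
print(b"Hello World!" in prepared.body)   # the file bytes are inside the body
```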

Parsing JSON

import requests
import json
response = requests.get('http://httpbin.org/get')

res1 = json.loads(response.text)  # works, but verbose

res2 = response.json()  # decode the JSON body directly

print(res1 == res2)  # True
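response.json() raises an error when the body is not valid JSON, for example when a server returns an HTML error page. The helper below, parse_json_or_none (a hypothetical name), is one possible way to guard against that. To keep the sketch runnable offline, the Response objects are built by hand and their private _content attribute is set directly; that shortcut is an assumption for illustration only, and real code would pass the object returned by requests.get().

```python
from requests.models import Response


def parse_json_or_none(response):
    """Return the decoded JSON body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:  # requests' JSON decode errors derive from ValueError
        return None


# Hand-built responses for offline illustration only (not a public pattern).
good = Response()
good._content = b'{"some": "data"}'

bad = Response()
bad._content = b"<html>not json</html>"

print(parse_json_or_none(good))
print(parse_json_or_none(bad))
```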

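The Response object

import requests

response = requests.get('http://www.jianshu.com')
# Attributes of the response

print(response.text)  # content of the requested page, as text

print(response.content)  # content of the requested page, as bytes

print(response.status_code)  # HTTP status code

print(response.headers)  # response headers

print(response.cookies)  # cookies set by the page

print(response.cookies.get_dict())  # cookies as a plain dict

print(response.cookies.items())  # cookies as (name, value) pairs

print(response.url)  # the final URL actually requested

print(response.history)  # list of redirect responses, if any

print(response.encoding)  # encoding used to decode the page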

So what is the difference between r.text and r.content?

r.text is the content of the response in unicode, that is, a decoded string.

r.content is the content of the response in bytes.

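The text attribute decodes the response body using the encoding attribute and returns the result. If encoding is None, requests guesses the encoding with a character-detection library (chardet in the version used here).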

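If you want text, use r.text. If you want an image or another file, use r.content.

When the response body is a binary file such as an image, the content attribute gives you the raw body as bytes.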
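The difference is easy to see in a small experiment. The sketch below builds a Response by hand and sets its private _content attribute directly so that it runs offline; that is an assumption for illustration only, and real code would use the object returned by requests.get(). It shows that content stays the same while text depends on encoding, and that binary data should be written from content in "wb" mode. The file name body.bin is arbitrary.

```python
import os
import tempfile

from requests.models import Response

# Hand-built response for offline illustration only.
response = Response()
response.status_code = 200
response._content = "café".encode("utf-8")

response.encoding = "utf-8"
text_utf8 = response.text                     # decoded with the right encoding
print(len(response.content), len(text_utf8))  # 5 bytes, 4 characters

response.encoding = "latin-1"                 # wrong encoding on purpose
garbled = response.text                       # same bytes, decoded differently
print(text_utf8 == garbled)                   # False

# Binary data (images, archives) is written from .content in "wb" mode.
path = os.path.join(tempfile.mkdtemp(), "body.bin")
with open(path, "wb") as f:
    f.write(response.content)
```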
