
Python crawler raises ValueError: Invalid header name b':authority'

I. Scraping a site whose requests use the :authority header

1. The request fails with ValueError: Invalid header name b':authority'.

2. The error means the request headers contain a name that requests does not accept, so the headers cannot be processed.

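3. This is an HTTP/2 request. Header names that start with a colon, such as :authority, are HTTP/2 pseudo-header fields; under the HTTP/1.1 RFCs a header name cannot contain a colon, so requests rejects them.

  Install hyper to handle the request, because hyper understands these headers: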

pip install hyper
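To see why a name like :authority is refused, the sketch below checks header names against the HTTP/1.1 "token" character set from RFC 7230. It is an illustration of the rule only: the helper name is_http1_header_name is hypothetical, and this is not the exact pattern that requests applies internally.

```python
import re

# Characters allowed in an HTTP/1.1 header field name ("token" in RFC 7230).
# The colon is not among them, because it separates the name from the value.
_TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_http1_header_name(name: str) -> bool:
    """Return True if `name` is a syntactically valid HTTP/1.1 field name."""
    return _TOKEN_RE.fullmatch(name) is not None


if __name__ == "__main__":
    # Pseudo-headers fail the check; ordinary headers pass it.
    for name in (":authority", ":method", "accept", "user-agent"):
        print(name, is_http1_header_name(name))
```

Any name that fails this check cannot be written on an HTTP/1.1 header line, which is why an HTTP/2-aware transport is needed to send it.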

4. Modify the code:

import json

import requests
from hyper.contrib import HTTP20Adapter

url = 'https://www.qcc.com/api/bigsearch/judgementList'

# Headers copied from the browser; the values here are placeholders.
headers = {
    ':authority': 'www.a.a',
    ':method': 'POST',
    ':path': '/api/bigsearch/judgementList',
    ':scheme': 'https',
    'accept': 'aa/pa*',
    'accept-encoding': 'ae, br',
    'accept-language': 'a0.9',
    'content-length': '294',
    'content-type': 'at=UTF-8',
    'cookie': 'aa',
    'origin': 'xx',
    'referer': 'xx',
    'sec-fetch-dest': 'x',
    'sec-fetch-mode': 'x',
    'sec-fetch-site': 'x-x',
    'user-agent': 'xxx',
    'x-requested-with': 'xx'
}

payload = {
    "caseName": "",
    "caseNo": "",
    "caseReason": "",
    "content": "",
    "courtName": "",
    "involvedAmtMax": "",
    "involvedAmtMin": "",
    "isExactlySearch": "",
    "judgeDateBegin": "",
    "judgeDateEnd": "",
    "pageSize": "20",
    "party": "xxx",
    "publishDateBegin": "",
    "publishDateEnd": "",
    "searchKey": ""
}

data = json.dumps(payload)

# Mount hyper's HTTP/2 adapter; it is used for URLs that start with this prefix.
sessions = requests.session()
sessions.mount('https://xxxxx', HTTP20Adapter())
res = sessions.post(url, headers=headers, data=data)
print(res.text)
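When headers are pasted from the browser's developer tools, it can help to see which entries are HTTP/2 pseudo-headers, since those are the ones that trigger the ValueError. The following is a minimal sketch; split_pseudo_headers is a hypothetical helper, not part of requests or hyper, and the sample values are placeholders.

```python
from typing import Dict, Tuple


def split_pseudo_headers(
    headers: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a header dict into (pseudo, regular) parts by the leading ':'."""
    pseudo = {k: v for k, v in headers.items() if k.startswith(":")}
    regular = {k: v for k, v in headers.items() if not k.startswith(":")}
    return pseudo, regular


if __name__ == "__main__":
    copied = {
        ":authority": "www.a.a",
        ":method": "POST",
        "accept": "aa/pa*",
        "user-agent": "xxx",
    }
    pseudo, regular = split_pseudo_headers(copied)
    print("pseudo-headers:", sorted(pseudo))
    print("regular headers:", sorted(regular))
```

The regular part contains only names that an HTTP/1.1 client would accept syntactically; whether the server answers a request sent without the pseudo-headers is a separate question that this sketch does not address.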

5. Finally, send the request with the HTTP method named in the :method header (POST in this example).
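The pseudo-headers already describe the method and target of the request, so they can be read back out of the dict instead of being hard-coded. A minimal sketch, assuming the dict uses the four standard HTTP/2 request pseudo-headers; request_target is a hypothetical helper, and the GET, https and "/" fallbacks are assumptions of this sketch.

```python
from typing import Dict, Tuple


def request_target(headers: Dict[str, str]) -> Tuple[str, str]:
    """Return (method, url) rebuilt from HTTP/2 pseudo-headers.

    Missing :method, :scheme or :path fall back to GET, https and "/"
    (an assumption of this sketch); :authority is required.
    """
    method = headers.get(":method", "GET").upper()
    scheme = headers.get(":scheme", "https")
    authority = headers[":authority"]
    path = headers.get(":path", "/")
    return method, f"{scheme}://{authority}{path}"


if __name__ == "__main__":
    sample = {
        ":authority": "www.a.a",
        ":method": "POST",
        ":path": "/api/bigsearch/judgementList",
        ":scheme": "https",
    }
    method, url = request_target(sample)
    print(method, url)
```

The returned method can then be passed to sessions.request(method, url, headers=headers, data=data), which is the generic form of sessions.post in requests.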