
Python Crawlers: Spoofing the Request Headers

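Python also fetches data by sending requests to a server, and those requests carry headers. If the headers are not disguised, the target server can recognize the request as coming from a script, and the crawl may fail.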
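To see why the disguise matters, here is a minimal sketch that prints the User-Agent urllib sends when none is supplied. It only inspects the default opener and makes no network request; the variable names are illustrative.

```python
import urllib.request

# build_opener() returns an OpenerDirector whose default headers
# include the User-Agent urllib sends when the caller sets none.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
default_user_agent = default_headers["User-agent"]

# Prints something like "Python-urllib/3.x", which a server can
# easily tell apart from a real browser.
print(default_user_agent)
```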

import urllib.error
import urllib.parse
import urllib.request


def askURL(url):
    # Form fields sent in the POST body, URL-encoded and converted to bytes
    data = bytes(urllib.parse.urlencode({
        "setAction": "classroomQuery",
        "PageAction": "Query",
        "day_time_text": "0011000000000",
        "school_area_code": "1",
        "building": "13",
        "week_no": "256",
        "day_no": "2",
        "day_time1": "ON",
        "B1": "查詢"
    }), encoding="utf-8")
    headers = {  # Imitate a browser's headers when sending the request to the server
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:99.0) Gecko/20100101 Firefox/99.0"
    }
    # The User-Agent tells the server what kind of client (browser and OS) is
    # making the request, which it can use to decide what content to return
    request = urllib.request.Request(url, headers=headers, data=data, method="POST")
    html = ""
    try:
        response = urllib.request.urlopen(request)
        html = response.read().decode("utf-8")
        # print(html)
    except urllib.error.URLError as e:
        # HTTPError carries a status code; a plain URLError only carries a reason
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, "reason"):
            print(e.reason)
    return html
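Before sending anything, it can help to check that the request really carries the spoofed header. The sketch below builds a similar POST request and inspects it offline. The helper name `build_request`, the URL `http://example.com/query`, and the shortened form are placeholders for illustration, not part of the original code.

```python
import urllib.parse
import urllib.request


def build_request(url, form, user_agent):
    """Build (but do not send) a POST request with a custom User-Agent."""
    data = urllib.parse.urlencode(form).encode("utf-8")
    headers = {"User-Agent": user_agent}
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


req = build_request(
    "http://example.com/query",  # placeholder URL, an assumption for this sketch
    {"building": "13", "day_no": "2"},  # shortened, illustrative form
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:99.0) Gecko/20100101 Firefox/99.0",
)

print(req.get_method())
# urllib stores header names via str.capitalize(), hence "User-agent" here
print(req.get_header("User-agent"))
print(req.data)
```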

Spoofing the request headers

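Open any website in a browser, open the Network tab of the developer console, and copy the request headers from one of the requests listed there.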
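Headers copied from the Network tab arrive as plain "Name: value" lines, while urllib expects a dict. One possible way to convert them is sketched below. The function name `parse_raw_headers` and the sample header values are assumptions for illustration.

```python
def parse_raw_headers(raw):
    """Turn copied 'Name: value' lines into a dict usable as headers=..."""
    headers = {}
    for line in raw.strip().splitlines():
        line = line.strip()
        # Skip blanks, malformed lines, and HTTP/2 pseudo-headers such as ":authority"
        if not line or line.startswith(":") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return headers


# Illustrative sample, standing in for text pasted from the browser
copied = """
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:99.0) Gecko/20100101 Firefox/99.0
Accept: text/html,application/xhtml+xml
Accept-Language: en-US,en;q=0.5
"""

headers = parse_raw_headers(copied)
print(headers["User-Agent"])
```

The resulting dict can be passed as `headers=` to `urllib.request.Request`. If you copy an `Accept-Encoding` header as well, note that urllib does not decompress gzip responses automatically, so it is usually simpler to leave that one out.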