Example: Logging In to a Website with the Scrapy Framework
阿新 • Published: 2020-02-06
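This article shows, by example, how to log in to a website with the Scrapy framework. It is shared here for reference; the details are as follows: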
1. Logging in with cookies
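import scrapy


class LoginSpider(scrapy.Spider):
    name = 'login'
    allowed_domains = ['xxx.com']
    start_urls = ['https://www.xxx.com/xx/']
    # Cookies copied from a logged-in browser session, as a dict of name -> value
    cookies = {}

    def start_requests(self):
        # Attach the cookies to every start request so the site sees a logged-in session
        for url in self.start_urls:
            yield scrapy.Request(url, cookies=self.cookies, callback=self.parse)

    def parse(self, response):
        # Save the page so you can check that it is the logged-in view
        with open("01login.html", "wb") as f:
            f.write(response.body)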
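The cookies attribute above is where the values from a logged-in browser session go. Browsers usually show them as a single Cookie header string, while scrapy.Request takes its cookies argument as a dict (or a list of dicts). A minimal sketch of the conversion, assuming the string was copied from the browser's developer tools; the helper name cookie_header_to_dict and the sample cookie names are hypothetical and not part of the original article:

```python
def cookie_header_to_dict(raw_cookie):
    """Turn a raw 'Cookie' header string into a dict usable as Request(cookies=...)."""
    cookies = {}
    for pair in raw_cookie.split(";"):
        # Split on the first '=' only, because cookie values may contain '=' themselves
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


# Hypothetical cookie string, for illustration only
sample = "sessionid=abc123; csrftoken=xyz=1"
parsed = cookie_header_to_dict(sample)
```

The resulting dict could then be assigned to the spider's cookies attribute.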
2. Logging in with a POST request, parsing the page by hand to get the login parameters
import scrapy


class LoginSpider(scrapy.Spider):
    name = 'login_code'
    allowed_domains = ['xxx.com']
    # 1. Login page
    start_urls = ['https://www.xxx.com/login/']

    def parse(self, response):
        # 2. Build the login form; the hidden fields are read from the login page
        login_url = 'https://www.xxx.com/login'
        formdata = {
            "username": "xxx",
            "pwd": "xxx",
            "formhash": response.xpath("//input[@id='formhash']/@value").extract_first(),
            "backurl": response.xpath("//input[@id='backurl']/@value").extract_first(),
        }
        # 3. Send the login POST request
        yield scrapy.FormRequest(login_url, formdata=formdata, callback=self.parse_login)

    def parse_login(self, response):
        # 4. Visit the target page, now with the session established by the login
        member_url = "https://www.xxx.com/member"
        yield scrapy.Request(member_url, callback=self.parse_member)

    def parse_member(self, response):
        with open("02login.html", 'wb') as f:
            f.write(response.body)
3. Logging in with a POST request, letting Scrapy extract the login parameters automatically
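import scrapy


class LoginSpider(scrapy.Spider):
    name = 'login_code2'
    allowed_domains = ['xxx.com']
    # 1. Login page
    start_urls = ['https://www.xxx.com/login/']

    def parse(self, response):
        # 2. Only the credentials are given by hand; from_response() fills in the other form fields
        formdata = {"username": "xxx", "pwd": "xxx"}
        # 3. Send the login POST request
        yield scrapy.FormRequest.from_response(
            response,
            formxpath="//*[@id='login_pc']",
            formdata=formdata,
            method="POST",  # override the form's original GET method
            callback=self.parse_login,
        )

    def parse_login(self, response):
        with open("03login.html", 'wb') as f:
            f.write(response.body)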
I hope this article helps with your Python programming based on the Scrapy framework.