Python crawler notes - day 4

阿新 • Published: 2018-11-26

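CAPTCHA recognition
URL stays the same and the CAPTCHA does not change
Request the CAPTCHA URL, get the response, and recognize the image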

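URL stays the same but the CAPTCHA changes
Idea: when the server returns a CAPTCHA, it associates that CAPTCHA with the requesting user's information. Later, when the user sends the POST request, the server compares the CAPTCHA submitted in the POST request with the one it stored for that user.
1. Instantiate a session
2. Use the session to request the login page and get the CAPTCHA URL
3. Use the session to request the CAPTCHA and recognize it
4. Use the session to send the POST request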
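The four steps above can be sketched as one function. This is only a sketch under assumptions: `find_captcha_url` and `recognize` are hypothetical callables you supply (one parses the login page, the other does the actual recognition), and the form field name `captcha` is a placeholder for whatever the target site expects. The `session` argument is expected to behave like a `requests.Session`.

```python
def login_with_captcha(session, login_url, post_url, find_captcha_url,
                       recognize, form, captcha_field="captcha"):
    """Run steps 2-4 on one shared session so the cookies carry over."""
    # Step 2: request the login page; the server ties its cookie to this session
    login_page = session.get(login_url)
    captcha_url = find_captcha_url(login_page.text)

    # Step 3: fetch the CAPTCHA with the same session, then recognize it
    captcha_image = session.get(captcha_url).content
    captcha_text = recognize(captcha_image)

    # Step 4: send the form plus the recognized CAPTCHA, still on the same session
    data = dict(form)
    data[captcha_field] = captcha_text
    return session.post(post_url, data=data)
```

With the requests library, step 1 would be `session = requests.Session()`, and that object is passed in as the first argument. Because every request goes through the same session, the server sees the same user for the CAPTCHA request and the POST.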

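Logging in with selenium when there is a CAPTCHA
URL stays the same and the CAPTCHA does not change: same as above
URL stays the same but the CAPTCHA changes
1. Use selenium to request the login page and get the CAPTCHA URL at the same time
2. Take the cookies the driver holds for the login page and hand them to the requests module to request the CAPTCHA, then recognize it
3. Enter the CAPTCHA and click log in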
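Step 2 is the part that usually needs code: the cookies have to move from the browser to requests. A minimal sketch, assuming `driver` is a selenium WebDriver and `http_get` is something like `requests.get`; both helper names here are made up for illustration.

```python
def cookies_from_driver(driver):
    """Convert selenium's list of cookie dicts into a plain name -> value dict."""
    return {cookie["name"]: cookie["value"] for cookie in driver.get_cookies()}


def fetch_captcha_bytes(driver, captcha_url, http_get):
    """Request the CAPTCHA with the browser's cookies so the server sees the same user."""
    response = http_get(captcha_url, cookies=cookies_from_driver(driver))
    return response.content
```

A call would look like `fetch_captcha_bytes(driver, captcha_url, requests.get)`. The returned bytes then go to whatever recognition step is in use, and the result is typed into the form through selenium.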

Things to note when using selenium
Getting text and getting attributes
Locate the element first, then call .text or the get_attribute method to read the value
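As a small illustration, assuming the elements have already been located and are links: `.text` is a property and `get_attribute` is a method that takes the attribute name. The helper name `link_records` is hypothetical.

```python
def link_records(elements):
    """Read the visible text and the href attribute from already-located elements."""
    return [
        {"text": element.text, "href": element.get_attribute("href")}
        for element in elements
    ]
```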

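The page data selenium returns is the content of the Elements panel in the browser
Difference between find_element and find_elements
find_element returns a single element and raises an error if none is found
find_elements returns a list, which is empty if nothing is found
When checking whether there is a next page, use find_elements and decide based on the length of the returned list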
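The next-page check can be sketched like this. The locator is passed in as `by` and `value` (for example `By.CSS_SELECTOR` from `selenium.webdriver.common.by` plus a selector string); the helper name and whatever selector is used are assumptions, not something the site dictates.

```python
def find_next_button(driver, by, value):
    """Return the next-page element, or None when find_elements gives an empty list."""
    matches = driver.find_elements(by, value)
    # An empty list means there is no next page; no exception handling is needed
    return matches[0] if len(matches) > 0 else None
```

A crawl loop can then keep clicking while `find_next_button(...)` is not None, instead of wrapping find_element in try/except.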

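If the page contains an iframe or frame, call driver.switch_to.frame first to switch into the frame before locating elements
When selenium requests the first page, it waits for the page to finish loading before getting data. After clicking to the next page, however, it gets the data immediately, which may raise an error because the data has not loaded yet; add time.sleep(3)
find_element_by_class_name in selenium accepts only a single class value; multiple class names cannot be passed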
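Two of these notes lend themselves to small helpers. The first is a polling wait, swapped in here as an alternative to a fixed time.sleep(3): it returns as soon as the elements show up. The second turns a multi-class attribute value into a CSS selector, which is one way around the single-class limit. Both function names are hypothetical. Selenium also ships its own WebDriverWait in selenium.webdriver.support.ui, and in Selenium 4.3 and later the find_element_by_* methods are gone in favour of find_element(by, value).

```python
import time


def wait_for_elements(driver, by, value, timeout=10.0, poll=0.5):
    """Poll find_elements until it returns something or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        found = driver.find_elements(by, value)
        # Return early on success; on timeout return the (empty) list as-is
        if found or time.monotonic() >= deadline:
            return found
        time.sleep(poll)


def classes_to_css(class_attr):
    """Turn a class attribute such as "btn primary" into the selector ".btn.primary"."""
    return "".join("." + name for name in class_attr.split())
```

For an element with class="btn primary", the selector from `classes_to_css` can be used with a CSS-selector lookup rather than a class-name lookup.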

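// Group by name; $sum:2 adds 2 per document, so counter is twice the document count
db.stu.aggregate([{$group:{_id:"$name",counter:{$sum:2}}}])

// _id:null puts every document in one group, so counter is the total document count
db.stu.aggregate([{$group:{_id:null,counter:{$sum:1}}}])
// Group by gender and collect the names of each group into an array
db.stu.aggregate([{$group:{_id:"$gender",name:{$push:"$name"}}}])
// $$ROOT pushes the whole document instead of a single field
db.stu.aggregate([{$group:{_id:"$gender",name:{$push:"$$ROOT"}}}])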
// Deduplicate on (country, province, userid), then count distinct users per (country, province)
db.tv3.aggregate([
  {$group:{_id:{country:"$country",province:"$province",userid:"$userid"}}},
  {$group:{_id:{country:"$_id.country",province:"$_id.province"},count:{$sum:1}}},
  {$project:{country:"$_id.country",province:"$_id.province",count:"$count",_id:0}}
])
// Keep students older than 20, then count them per gender
db.stu.aggregate([
  {$match:{age:{$gt:20}}},
  {$group:{_id:"$gender",count:{$sum:1}}}
])
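// $unwind emits one document per element of the size array
db.t2.aggregate([
  {$unwind:"$size"}
])
// Unwind tags, then count the resulting documents (the total number of tags)
db.t3.aggregate([
  {$unwind:"$tags"},
  {$group:{_id:null,count:{$sum:1}}}
])
// preserveNullAndEmptyArrays:true keeps documents whose size is null, missing, or an empty array
db.t3.aggregate([
  {$unwind:{path:"$size",preserveNullAndEmptyArrays:true}}
])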
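Since the rest of these notes are in Python, here is a sketch of how the tv3 pipeline might look from Python. It assumes pymongo; in Python every key, including the $ operators, has to be a quoted string. The function names are hypothetical, and `collection` is expected to behave like a pymongo Collection.

```python
def users_per_province_pipeline():
    """Build the same three stages as the tv3 shell command, as Python dicts."""
    return [
        # Stage 1: one group per distinct (country, province, userid)
        {"$group": {"_id": {"country": "$country",
                            "province": "$province",
                            "userid": "$userid"}}},
        # Stage 2: count those distinct users per (country, province)
        {"$group": {"_id": {"country": "$_id.country",
                            "province": "$_id.province"},
                    "count": {"$sum": 1}}},
        # Stage 3: flatten the _id fields into top-level fields
        {"$project": {"country": "$_id.country",
                      "province": "$_id.province",
                      "count": "$count",
                      "_id": 0}},
    ]


def run_pipeline(collection, pipeline):
    """Run the pipeline and materialise the cursor into a list."""
    return list(collection.aggregate(pipeline))
```

With a local server this could be called as `run_pipeline(MongoClient()["test"]["tv3"], users_per_province_pipeline())`, where the database name `test` is only a placeholder.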
