
Scraping Douban Book Short Reviews (the First Ten Pages for One Book)

The original article is on my CSDN blog: https://blog.csdn.net/Thefreelittle/article/details/117574096

```python
import time

import requests
from bs4 import BeautifulSoup

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36'}

# Douban serves the rating titles in Simplified Chinese.
STAR_BY_TITLE = {"力荐": "5 stars", "推荐": "4 stars", "还行": "3 stars",
                 "较差": "2 stars", "很差": "1 star"}

print("Douban book scraper: The Wandering Earth.")
num = 1  # current page number
for i in range(0, 199, 20):  # offsets 0, 20, ..., 180 cover the first ten pages
    time.sleep(3)  # pause between requests
    if i == 0:
        url = 'https://book.douban.com/subject/3266609/comments/?limit=20&status=P&sort=new_score'
    else:
        url = 'https://book.douban.com/subject/3266609/comments/?start=' + str(i) + '&limit=20&status=P&sort=new_score'
    resp = requests.get(url, headers=headers)
    bs = BeautifulSoup(resp.text, 'html.parser')
    grid_view = bs.find_all('li', class_="comment-item")  # each li holds one review
    print(f"------------------Page {num} reviews. Fields: votes, user name, review time, review text------------------")
    cishu = 1  # index of the review within the current page
    for item in grid_view:
        piaoshu = item.find('span', class_="vote-count").text
        tzuozhe = item.find('span', class_="comment-info")
        zuozhe = tzuozhe.find('a').text
        shijian = item.find('span', class_="comment-time").text
        comment = item.find('span', class_="short").text
        # The first span in comment-info is the star rating when the user gave one;
        # a review without a rating has no matching title.
        ping = tzuozhe.find('span')
        title = ping.get('title') if ping is not None else None
        pingfen = STAR_BY_TITLE.get(title, "no rating")
        print(f"Page {num}, review {cishu} --- votes: {piaoshu} author: {zuozhe} "
              f"time: {shijian} rating: {pingfen} text: {comment}\n")
        cishu += 1
    num += 1
```
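The paging in the script comes down to a `start` offset that grows by 20 per page and is left out on the first page. Here is a minimal sketch of just that URL construction, so it can be checked without touching the network. The helper name `comment_page_urls` and its parameters are my own illustration, not part of the original script; the query parameters are the ones the script already uses.

```python
from urllib.parse import urlencode

# URL pattern taken from the script; subject_id identifies the book.
BASE = "https://book.douban.com/subject/{subject_id}/comments/"


def comment_page_urls(subject_id, pages=10, per_page=20):
    """Return the short-review page URLs for one book (hypothetical helper)."""
    urls = []
    for page in range(pages):
        params = {}
        start = page * per_page
        if start:  # like the script, omit `start` on the first page
            params["start"] = start
        params.update(limit=per_page, status="P", sort="new_score")
        urls.append(BASE.format(subject_id=subject_id) + "?" + urlencode(params))
    return urls


urls = comment_page_urls("3266609")
print(len(urls))
print(urls[0])
print(urls[-1])
```

Building the list up front keeps the request loop down to fetching and parsing, and changing the number of pages becomes a single argument.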