Web Scraping: A Simple BeautifulSoup Example
阿新 • Published: 2018-11-11
1. Example: scraping article titles from the Jianshu home page
# coding:utf-8
import requests
from bs4 import BeautifulSoup


# Scrape the article titles from the Jianshu home page
class SoupSpider:
    def __init__(self):
        # Reuse one session so connections and cookies persist across requests
        self.session = requests.Session()

    def jian_shu_spider(self, url, headers):
        # Fetch the page through the session created above
        response = self.session.get(url, headers=headers).text
        # Parse the returned HTML into a BeautifulSoup object
        soup = BeautifulSoup(response, "lxml")
        # Find every element whose class attribute is "title"
        title_list = soup.find_all(class_="title")
        for tit in title_list:
            title = tit.text
            print("Article title: {}".format(title))


if __name__ == '__main__':
    soup_spider = SoupSpider()
    soup_spider.jian_shu_spider(
        "http://www.jianshu.com",
        {
            "Referer": "https://www.jianshu.com/",
            "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36"
        }
    )
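To see what `find_all(class_="title")` selects without making a network request, the same lookup can be tried on a small inline HTML string. This is only a sketch: the `SAMPLE_HTML` fragment and the `extract_titles` helper are made up for illustration and are not the real Jianshu markup, and it uses Python's built-in `html.parser` instead of `lxml` so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment for illustration; not the real Jianshu markup.
SAMPLE_HTML = """
<ul>
  <li><a class="title" href="/p/1">First post</a></li>
  <li><a class="title" href="/p/2">  Second post </a></li>
  <li><a class="abstract" href="/p/3">Not a title</a></li>
</ul>
"""


def extract_titles(html):
    """Return the stripped text of every element with class="title"."""
    # "html.parser" ships with Python; "lxml" would also work if installed.
    soup = BeautifulSoup(html, "html.parser")
    # class_ has a trailing underscore because "class" is a Python keyword.
    return [tag.get_text(strip=True) for tag in soup.find_all(class_="title")]


titles = extract_titles(SAMPLE_HTML)
for title in titles:
    print("Article title: {}".format(title))
```

Returning a list instead of printing inside the loop makes the parsing step easy to check on its own, separately from the HTTP request.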
2. Scraping results