A Simple Scraper for Zhihu's Q&A Section
阿新 • Published: 2019-01-06
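This post uses requests and pyquery, together with basic file storage, to scrape the Q&A entries on Zhihu's Explore page and save each question, author, and answer to a text file.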
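import requests
from pyquery import PyQuery as pq

url = "https://www.zhihu.com/explore"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36",
}

# Fetch the Explore page with a browser-like User-Agent and parse it
html = requests.get(url, headers=headers).text
doc = pq(html)

# Each .feed-item inside .explore-tab is one question/answer entry
items = doc(".explore-tab .feed-item").items()

# Open the output file once, in append mode, and write every entry to it
with open("explore.txt", "a", encoding="utf-8") as file:
    for item in items:
        question = item.find("h2").text()
        # Class selectors need a leading dot
        author = item.find(".author-link-line").text()
        # Re-parse the inner HTML of .content to get the answer as plain text
        answer = pq(item.find(".content").html()).text()
        file.write("\n".join([question, author, answer]))
        # A line of 80 "=" characters separates one entry from the next
        file.write("\n" + "=" * 80 + "\n")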
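The script writes each entry as question, author, and answer on separate lines, followed by a line of 80 "=" characters. As a rough sketch of how that file could be read back, the snippet below splits on the separator line. The names `SEPARATOR`, `format_entry`, and `parse_entries` are hypothetical helpers introduced here for illustration, and the sketch assumes the question and the author each fit on a single line and that no answer contains the separator line itself.

```python
# Hypothetical helpers for illustration; they are not part of the original script.
SEPARATOR = "\n" + "=" * 80 + "\n"


def format_entry(question, author, answer):
    """Build the same text the scraper writes for one entry."""
    return "\n".join([question, author, answer]) + SEPARATOR


def parse_entries(text):
    """Split file contents back into (question, author, answer) tuples.

    Assumes question and author are single lines; the answer may span
    several lines.
    """
    entries = []
    for chunk in text.split(SEPARATOR):
        if not chunk.strip():
            continue  # skip the empty piece after the last separator
        parts = chunk.split("\n", 2)  # at most 3 pieces: question, author, rest
        if len(parts) == 3:
            entries.append(tuple(parts))
    return entries


# Small in-memory sample standing in for the contents of explore.txt
sample = (
    format_entry("Q1", "Alice", "line one\nline two")
    + format_entry("Q2", "Bob", "short")
)
for question, author, answer in parse_entries(sample):
    print(question, "|", author, "|", len(answer), "chars")
```

To use this on real output, read `explore.txt` with `encoding="utf-8"` and pass the text to `parse_entries`. If answers might contain the separator line, a structured format such as one JSON object per line would be a safer choice.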