
Multi-page web scraping in Python with BeautifulSoup


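This example uses BeautifulSoup to scrape a site across several pages: it builds the URL of each page, parses the page, and appends the extracted titles to a text file.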


Source code:

from urllib.request import urlopen

from bs4 import BeautifulSoup

# Append the scraped titles to a UTF-8 text file ("熱門標題" means "hot titles").
with open("熱門標題.txt", "a", encoding="utf-8") as f:
    for i in range(2):  # pages wtfy-0.html and wtfy-1.html
        url = "http://www.ltaaa.com/wtfy-{}.html".format(i)
        html = urlopen(url).read()
        soup = BeautifulSoup(html, "html.parser")
        # CSS selector: every <a> inside a <div class="dtop">
        titles = soup.select("div[class='dtop'] a")
        for title in titles:
            # Print the tag's text and its href attribute
            print(title.get_text(), title.get("href"))
            f.write("標題:{}\n".format(title.get_text()))  # "標題" means "title"
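
The selector step can be tried without touching the network. The following is a minimal sketch, assuming a made-up HTML fragment (`SAMPLE_HTML`) and a hypothetical helper `extract_titles`. Neither comes from the original post, and the real markup on ltaaa.com may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real site may be structured differently.
SAMPLE_HTML = """
<div class="dtop"><a href="/wtfy/1.html">First title</a></div>
<div class="dtop"><a href="/wtfy/2.html">Second title</a></div>
<div class="other"><a href="/about.html">Not a title</a></div>
"""


def extract_titles(html):
    """Return (text, href) pairs for every <a> inside a <div class="dtop">."""
    soup = BeautifulSoup(html, "html.parser")
    links = soup.select("div[class='dtop'] a")
    return [(a.get_text(strip=True), a.get("href")) for a in links]


if __name__ == "__main__":
    for text, href in extract_titles(SAMPLE_HTML):
        print(text, href)
```

The attribute selector `div[class='dtop']` matches only when the class attribute is exactly `dtop`. If the target elements might carry additional classes, `div.dtop a` is the more forgiving form.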

  
