
Scraping Douban Movie Titles


import urllib.request
from bs4 import BeautifulSoup

url = "https://movie.douban.com/chart"
req = urllib.request.Request(url)
req.add_header("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:54.0) Gecko/20100101 Firefox/54.0")
response = urllib.request.urlopen(req)  # pass req (not url) so the User-Agent header is actually sent
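bsObj = BeautifulSoup(response, "html.parser")
# Each chart entry keeps its title inside a <div class="pl2">.
bsObj = bsObj.find_all("div", {"class": "pl2"})
# contents[1] is expected to be the <a> tag holding the title
# (contents[0] being a whitespace text node).
print(bsObj[0].contents[1].get_text())
for tag in bsObj:
    div_tag = tag.contents[1].get_text()
    # Trim surrounding newlines and remove spaces from the title text.
    name = div_tag.strip("\n").replace(" ", "") + "\n"
    print(name)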
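The extraction step can be checked offline, without sending a request to Douban. The following is a minimal sketch that uses only the standard library's html.parser instead of BeautifulSoup. The names Pl2TitleParser, extract_titles, and SAMPLE_HTML are hypothetical, and the sample markup is an assumed simplification of the chart page, not a copy of it. Note that this version removes all whitespace from a title, which is slightly stricter than the strip/replace cleanup in the script.

```python
from html.parser import HTMLParser


class Pl2TitleParser(HTMLParser):
    """Collect the text of the first <a> inside each <div class="pl2">."""

    def __init__(self):
        super().__init__()
        self.titles = []         # cleaned titles, one per pl2 block
        self._div_depth = 0      # > 0 while inside a <div class="pl2">
        self._seen_link = False  # True once the block's first <a> was found
        self._in_link = False    # True while inside that first <a>
        self._chunks = []        # text fragments of the current <a>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self._div_depth > 0:
                self._div_depth += 1  # nested div inside a pl2 block
            elif "pl2" in (dict(attrs).get("class") or "").split():
                self._div_depth = 1
                self._seen_link = False
        elif tag == "a" and self._div_depth > 0 and not self._seen_link:
            self._seen_link = True
            self._in_link = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self._in_link = False
            # Remove all whitespace (spaces and newlines) from the title.
            self.titles.append("".join("".join(self._chunks).split()))
        elif tag == "div" and self._div_depth > 0:
            self._div_depth -= 1

    def handle_data(self, data):
        if self._in_link:
            self._chunks.append(data)


def extract_titles(html):
    """Return the cleaned title text of every pl2 block in an HTML string."""
    parser = Pl2TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


# Assumed, simplified stand-in for the chart page markup.
SAMPLE_HTML = """
<div class="pl2">
  <a href="https://example.com/1">
    Movie One
    / <span>Alias One</span>
  </a>
  <p class="pl">2017 / 100 min</p>
</div>
<div class="pl2">
  <a href="https://example.com/2">Movie Two</a>
</div>
"""

titles = extract_titles(SAMPLE_HTML)
print(titles)  # ['MovieOne/AliasOne', 'MovieTwo']
```

To run it against the live page, the decoded response body (for example response.read().decode("utf-8"), assuming the page is UTF-8) could be passed to extract_titles in place of SAMPLE_HTML.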

