Scraping Douban Movie Titles
阿新 • Published: 2017-08-30
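import urllib.request

from bs4 import BeautifulSoup

url = "https://movie.douban.com/chart"

# Build a request that carries a browser-like User-Agent header.
req = urllib.request.Request(url)
req.add_header("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:54.0) Gecko/20100101 Firefox/54.0")
# Open the Request object (not the bare URL) so the header is actually sent.
response = urllib.request.urlopen(req)

bsObj = BeautifulSoup(response, 'html.parser')
# The script treats each <div class="pl2"> as one movie entry.
bsObj = bsObj.find_all('div', {'class': 'pl2'})

# Print the first entry's raw text as a quick check.
print(bsObj[0].contents[1].get_text())

for tag in bsObj:
    # contents[1] is the div's second child node; its text holds the title.
    div_tag = tag.contents[1].get_text()
    # Strip leading/trailing newlines and remove all spaces.
    name = div_tag.strip('\n').replace(' ', '') + '\n'
    print(name)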
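The clean-up step in the script above strips only leading and trailing newlines and deletes every space, so newlines inside the text survive and multi-word titles are run together. A minimal sketch of an alternative is shown here, assuming the extracted text lists alternate titles separated by "/". The helper name `clean_title` and the sample string are hypothetical and are not part of the original script.

```python
def clean_title(raw):
    """Normalize whitespace in a scraped title string.

    Hypothetical helper: splits on "/" (assumed to separate alternate
    titles), collapses runs of whitespace inside each part to a single
    space, and drops parts that end up empty.
    """
    parts = [" ".join(part.split()) for part in raw.split("/")]
    return " / ".join(part for part in parts if part)


# Made-up sample text that imitates indented, multi-line link text.
sample = "\n    Example Movie\n    / Another Name\n"
print(clean_title(sample))  # Example Movie / Another Name
```

In the loop above, this would be used as `name = clean_title(tag.contents[1].get_text())`.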