Installing and Using the BeautifulSoup Library on macOS
阿新 • Published: 2018-11-07
Introduction to BeautifulSoup
BeautifulSoup is a powerful third-party Python library that parses HTML and extracts information from it.
Installing BeautifulSoup
- Open a terminal and enter the command:
pip3 install beautifulsoup4
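A quick way to confirm the installation worked is to import the package from Python and print its version. This is only a minimal sketch; it assumes pip3 installed into the same interpreter that python3 runs.

```python
# Sanity check after `pip3 install beautifulsoup4`.
# Note: the package is installed as beautifulsoup4 but imported as bs4.
import bs4
from bs4 import BeautifulSoup

print(bs4.__version__)  # the installed version string

# Parse a trivial fragment with the parser bundled with Python.
soup = BeautifulSoup("<p>ok</p>", "html.parser")
print(soup.p.string)
```

If the import fails with ModuleNotFoundError, pip3 and python3 are probably pointing at different Python installations.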
BeautifulSoup quick test
- HTML page used for the test: http://python123.io/ws/demo.html
- Fetch its source with the requests library (stored in the variable demo):
>>> import requests
>>> r = requests.get("http://python123.io/ws/demo.html")
>>> r.text
'<html><head><title>This is a python demo page</title></head>\r\n<body>\r\n<p class="title"><b>The demo python introduces several python courses.</b></p>\r\n<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:\r\n<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\r\n</body></html>'
>>> demo = r.text
- Import the BeautifulSoup class from the bs4 package:
>>> from bs4 import BeautifulSoup
- Parse the HTML with BeautifulSoup:
>>> demo = r.text
>>> soup = BeautifulSoup(demo, 'html.parser')
>>> print(soup.prettify())
Note that prettify is a method and needs parentheses: print(soup.prettify) prints only the bound method's repr (<bound method Tag.prettify of ...>) rather than the indented HTML.
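Once the page is parsed, information can be pulled out of the soup object. The sketch below embeds a copy of the demo page's source as a string so that it runs without network access; the variable names links and pretty are illustrative only.

```python
from bs4 import BeautifulSoup

# The demo page's HTML, embedded so the example runs offline.
demo = (
    '<html><head><title>This is a python demo page</title></head>\r\n'
    '<body>\r\n'
    '<p class="title"><b>The demo python introduces several python courses.</b></p>\r\n'
    '<p class="course">Python is a wonderful general-purpose programming language. '
    'You can learn Python from novice to professional by tracking the following courses:\r\n'
    '<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and '
    '<a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\r\n'
    '</body></html>'
)

soup = BeautifulSoup(demo, "html.parser")

# Accessing a tag by name returns the first matching tag.
print(soup.title.string)   # text inside <title>
print(soup.a["href"])      # href attribute of the first <a>

# find_all returns every matching tag.
links = [(a.get_text(), a.get("href")) for a in soup.find_all("a")]
for text, href in links:
    print(text, "->", href)

# prettify() is a method: calling it returns the indented markup as a string.
pretty = soup.prettify()
print(pretty)
```

Tag access such as soup.a is convenient for a first match, while find_all is the usual choice when every occurrence is needed.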
How do you use the BeautifulSoup library?
- Code skeleton:
from bs4 import BeautifulSoup
soup = BeautifulSoup('<p>data</p>','html.parser')
- The two arguments to BeautifulSoup:
- The first is the HTML-formatted data to parse.
- The second is the parser used to parse it.
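To make the two arguments concrete, the sketch below names each one before passing it in. It uses only 'html.parser', which ships with Python; other parsers such as 'lxml' or 'html5lib' can be named in the same position, but they have to be installed separately and are not used here. The variable names markup and parser are illustrative.

```python
from bs4 import BeautifulSoup

markup = "<p>data</p>"    # first argument: the HTML to parse
parser = "html.parser"    # second argument: the name of the parser to use

soup = BeautifulSoup(markup, parser)

print(soup.p.name)    # tag name of the first <p>: p
print(soup.p.string)  # text inside that tag: data
```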