
How to Scrape Web Pages with PyQt 5.6

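From PyQt 5.6 onward, the browser engine is Chromium-based (Qt WebEngine), the new-generation engine for Qt, and it differs considerably from the earlier WebKit. After a long stretch of testing, I finally got it working!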

# -*- coding: utf-8 -*-
import sys

from PyQt5.QtCore import QUrl
from PyQt5.QtWidgets import QApplication
from PyQt5.QtWebEngineWidgets import QWebEnginePage, QWebEngineView


class Render(QWebEngineView):
    def __init__(self, url):
        # The QApplication must exist before any widget is constructed
        self.app = QApplication(sys.argv)
        QWebEngineView.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.load(QUrl(url))
        # Blocks here until callable() quits the event loop
        self.app.exec_()

    def _loadFinished(self, result):
        # This is an async call, you need to wait for this
        # to be called before closing the app
        self.page().toHtml(self.callable)

    def callable(self, data):
        self.html = data
        # Data has been stored, it's safe to quit the app
        self.app.quit()


import lxml.html

# Define the address of the web page to fetch
url = 'https://xxxxxxxxxxxx'
r = Render(url)
result = r.html
tree = lxml.html.fromstring(result)
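Once the rendered HTML has been parsed with lxml.html.fromstring, the resulting tree can be queried with XPath. The following is only a minimal sketch of that step: sample_html is a made-up stand-in for the string stored in r.html, and extract_title_and_links is a hypothetical helper name, not part of the original script.

```python
import lxml.html

# Hypothetical stand-in for the rendered HTML string stored in r.html
sample_html = """
<html><head><title>Example page</title></head>
<body>
  <a href="/first">First</a>
  <a href="https://example.com/second">Second</a>
</body></html>
"""


def extract_title_and_links(html):
    """Return the page title and every link target found in an HTML string."""
    tree = lxml.html.fromstring(html)
    # findtext returns the text of the first matching element, or None
    title = tree.findtext('.//title')
    # The XPath attribute selector yields the href values as strings
    links = tree.xpath('//a/@href')
    return title, links


title, links = extract_title_and_links(sample_html)
print(title)
print(links)
```

In the real script, the same helper could be called with r.html in place of sample_html; the XPath expressions would need adjusting to the structure of the page being scraped.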

Reference article:

https://stackoverflow.com/questions/37754138/how-to-render-html-with-pyqt5s-qwebengineview