Python crawlers: setting an IP proxy in two situations (requests, and selenium with Chrome or PhantomJS)
阿新 • Published: 2018-11-21
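With the requests library, a proxy is set like this:
import requests

proxy = '127.0.0.1:9743'
# Assuming the local proxy speaks plain HTTP, both entries use an http:// proxy URL;
# the dict keys select the scheme of the target site, not of the proxy.
proxies = {
    'http': 'http://' + proxy,
    'https': 'http://' + proxy,
}
try:
    response = requests.get('http://httpbin.org/get', proxies=proxies)
    print(response.text)
except requests.exceptions.ConnectionError as e:
    print('Error', e.args)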
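The proxies mapping above can also be built by a small helper and attached to a Session so that every request reuses it. The following is a minimal sketch, not part of the original post: `build_proxies` and `fetch_via_proxy` are hypothetical names, the username and password parameters are an assumption for proxies that require authentication, and the proxy is assumed to speak plain HTTP.

```python
from urllib.parse import quote


def build_proxies(proxy, username=None, password=None):
    """Return a requests-style proxies dict for an HTTP proxy given as 'host:port'.

    Hypothetical helper for illustration; credentials are optional.
    """
    auth = ''
    if username is not None:
        # Percent-encode credentials so characters such as '@' or ':' stay valid in the URL.
        auth = quote(username, safe='') + ':' + quote(password or '', safe='') + '@'
    proxy_url = 'http://' + auth + proxy
    # Both target schemes are routed through the same HTTP proxy (an assumption).
    return {'http': proxy_url, 'https': proxy_url}


def fetch_via_proxy(url, proxy, timeout=10):
    """Fetch url through the proxy with a Session and return the response body."""
    # Imported here so build_proxies can be used even where requests is not installed.
    import requests

    session = requests.Session()
    session.proxies.update(build_proxies(proxy))
    return session.get(url, timeout=timeout).text
```

Calling `fetch_via_proxy('http://httpbin.org/get', '127.0.0.1:9743')` should return the same JSON as the snippet above, provided a proxy is actually listening on that address.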
With the selenium module, taking the Chrome browser as an example:
from selenium import webdriver
proxy = '127.0.0.1:9743'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=http://' + proxy)
# Newer Selenium releases expect options=; the chrome_options= keyword is deprecated.
chrome = webdriver.Chrome(options=chrome_options)
chrome.get('http://httpbin.org/get')
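The Chrome setup above comes down to passing a `--proxy-server` argument, so the argument list can be prepared separately from the driver. This is a rough sketch under assumptions: `chrome_proxy_args` and `make_chrome` are hypothetical helper names, and the `scheme` and `headless` parameters are additions for illustration (Chrome also accepts, for example, a `socks5://` proxy URL and a `--headless` flag).

```python
def chrome_proxy_args(proxy, scheme='http', headless=False):
    """Build the Chrome command-line arguments for a proxy given as 'host:port'."""
    args = ['--proxy-server={}://{}'.format(scheme, proxy)]
    if headless:
        # Run Chrome without a visible window.
        args.append('--headless')
    return args


def make_chrome(proxy, scheme='http', headless=False):
    """Create a Chrome driver whose traffic goes through the proxy."""
    # Imported here so chrome_proxy_args can be used without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in chrome_proxy_args(proxy, scheme=scheme, headless=headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

For instance, `make_chrome('127.0.0.1:9743').get('http://httpbin.org/get')` would load the test page through the proxy; the `origin` field in httpbin's response shows which IP the site saw.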
With the selenium module, taking the PhantomJS browser as an example:
from selenium import webdriver
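# Command-line switches passed to the PhantomJS process
service_args = [
    '--proxy=127.0.0.1:9743',
    '--proxy-type=http'
]
# webdriver.PhantomJS exists only in older Selenium releases (it was removed in Selenium 4).
browser = webdriver.PhantomJS(service_args=service_args)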
browser.get('http://httpbin.org/get')
print(browser.page_source)
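The `service_args` list above can be generated from the proxy settings instead of being written by hand. A minimal sketch follows, assuming the PhantomJS command-line switches `--proxy`, `--proxy-type` and `--proxy-auth`; the function name `phantomjs_service_args` and the optional credentials are hypothetical and not part of the original post.

```python
def phantomjs_service_args(proxy, proxy_type='http', username=None, password=None):
    """Build the service_args list for a PhantomJS proxy given as 'host:port'."""
    if proxy_type not in ('http', 'socks5', 'none'):
        raise ValueError('unsupported proxy type: ' + proxy_type)
    args = ['--proxy=' + proxy, '--proxy-type=' + proxy_type]
    if username is not None:
        # Credentials for a proxy that requires authentication (an assumption).
        args.append('--proxy-auth={}:{}'.format(username, password or ''))
    return args
```

The result can be passed directly as `webdriver.PhantomJS(service_args=phantomjs_service_args('127.0.0.1:9743'))` on a Selenium release that still ships the PhantomJS driver.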