urllib.error.URLError when running a web crawler on macOS

阿新 • Published: 2018-12-08

urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1045)>

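I hit the error above on macOS, although the same script runs without it on Windows 10. I found a fix on the blog of 一許流星. It makes the problem go away, but I still do not understand why the error occurs in the first place.

His blog post says: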

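Possible cause:

Python 2.7.9 introduced a new behavior.
When you call urllib.urlopen on an https URL, the SSL certificate is now verified.
If the target uses a self-signed certificate, the call fails with
urllib2.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)>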
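To make the difference concrete, the sketch below compares a default (verifying) SSL context with the unverified one, using only the standard `ssl` module. The helper name `describe` is made up for illustration. One plausible explanation for the macOS-only failure, which the quoted post does not confirm, is that the Python installation cannot find a trusted CA bundle; the last line prints where this interpreter looks for one.

```python
import ssl

# A verifying context, equivalent in effect to what urlopen uses for https
# URLs when no context is passed.
verified = ssl.create_default_context()

# The unverified context created by the private helper used in the workaround.
unverified = ssl._create_unverified_context()


def describe(ctx):
    """Return (verifies the certificate chain, checks the hostname)."""
    return ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname


print("default   :", describe(verified))
print("unverified:", describe(unverified))

# Locations this interpreter searches for trusted CA certificates.
print(ssl.get_default_verify_paths())
```

If the paths printed at the end point to files or directories that do not exist, certificate verification can fail even for sites with perfectly valid certificates.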

The fix is as follows:

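import ssl
import urllib.request

# Skip certificate verification, restoring the behavior from before Python 2.7.9.
# _create_unverified_context is a private helper; use it only for hosts you trust.
context = ssl._create_unverified_context()
response = urllib.request.urlopen("https://no-valid-cert", context=context)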
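Disabling verification for every request is a blunt tool, so here is a minimal sketch of one way to keep it opt-in. The function names `build_context`, `is_cert_failure` and `fetch` are hypothetical, not part of urllib; the sketch assumes Python 3 and only wraps the same two `ssl` calls shown above.

```python
import ssl
import urllib.error
import urllib.request


def build_context(verify=True):
    """Return a verifying context by default, an unverified one only on request."""
    if verify:
        return ssl.create_default_context()
    return ssl._create_unverified_context()


def is_cert_failure(error):
    """True if a URLError was caused by a failed certificate check."""
    reason = error.reason
    return isinstance(reason, ssl.SSLError) and "CERTIFICATE_VERIFY_FAILED" in str(reason)


def fetch(url, verify=True, timeout=10):
    """Fetch url and return the response body as bytes."""
    context = build_context(verify)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

A caller could try `fetch(url)` first, catch `urllib.error.URLError`, and retry with `verify=False` only when `is_cert_failure` returns True and the host is one it already trusts.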

With this change, the code I copied from the web (it scrapes the League of Legends streamer ranking on Panda TV) runs perfectly:

import re
from urllib import request
import ssl

context = ssl._create_unverified_context()


class Spider():
    # [\w\W]*? matches any run of characters, non-greedily ([\s\S]*? works too)
    url = 'https://www.panda.tv/cate/lol'
    root_pattern = r'<div class="video-info">([\w\W]*?)</div>'
    name_pattern = r'</i>([\w\W]*?)</span>'
    number_pattern = r'<span class="video-number">([\w\W]*?)</span>'

    def __fetch_content(self):

        r = request.urlopen(Spider.url, context=context)
        # read() returns raw bytes, decoded to text on the next line
        htmls = r.read()
        htmls = str(htmls, encoding='utf-8')

        return htmls

    def __analysis(self, htmls):
        root_html = re.findall(Spider.root_pattern, htmls)

        anchors = []
        for html in root_html:
            name = re.findall(Spider.name_pattern, html)
            number = re.findall(Spider.number_pattern, html)
            anchor = {'name': name, 'number': number}
            anchors.append(anchor)
        return anchors

    def __refine(self, anchors):

        # Anonymous function (lambda): keep the first match and strip whitespace from the name
        l = lambda anchor: {'name': anchor['name'][0].strip(), 'number': anchor['number'][0]}
        return map(l, anchors)

    def __sort(self, anchors):

        # sorted() is ascending by default; reverse=True puts the largest count first
        anchors = sorted(anchors, key=self.__sort_seed, reverse=True)

        return anchors

    def __sort_seed(self, anchor):
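        # Take the leading number, decimals included (e.g. 3.5 from '3.5萬')
        r = re.findall(r'\d+(?:\.\d+)?', anchor['number'])
        number = float(r[0]) if r else 0.0
        # Accept both the Traditional and Simplified form of "ten thousand"
        if '萬' in anchor['number'] or '万' in anchor['number']:
            number *= 10000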

        return number

    def __show(self, anchors):
        for rank in range(0, len(anchors)):
            print('rank' + str(rank + 1) + ':' + anchors[rank]['name'] + ' ' + anchors[rank]['number'])

    def go(self):
        htmls = self.__fetch_content()
        anchors = self.__analysis(htmls)
        anchors = list(self.__refine(anchors))
        anchors = self.__sort(anchors)
        self.__show(anchors)


spider = Spider()
spider.go()
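Because the live page may change or be unreachable, it can help to try the parsing and sorting steps offline. The sketch below reuses the three patterns from the class on a made-up HTML snippet. The markup in `SAMPLE`, the streamer names, and the helpers `viewers` and `rank` are all assumptions for illustration; the real page may be structured differently.

```python
import re

# Hypothetical markup shaped to fit the patterns; not taken from the real site.
SAMPLE = '''
<div class="video-info">
  <span class="video-nickname"><i class="icon"></i> Alice </span>
  <span class="video-number">3.5万</span>
</div>
<div class="video-info">
  <span class="video-nickname"><i class="icon"></i> Bob </span>
  <span class="video-number">8000</span>
</div>
'''

ROOT = r'<div class="video-info">([\w\W]*?)</div>'
NAME = r'</i>([\w\W]*?)</span>'
NUMBER = r'<span class="video-number">([\w\W]*?)</span>'


def viewers(text):
    """Convert a count such as '3.5万', '3.5萬' or '8000' to a float."""
    match = re.search(r'\d+(?:\.\d+)?', text)
    value = float(match.group()) if match else 0.0
    return value * 10000 if ('万' in text or '萬' in text) else value


def rank(html):
    """Extract name/number pairs and sort them by viewer count, highest first."""
    anchors = []
    for block in re.findall(ROOT, html):
        names = re.findall(NAME, block)
        numbers = re.findall(NUMBER, block)
        if names and numbers:  # skip blocks where either field is missing
            anchors.append({'name': names[0].strip(), 'number': numbers[0]})
    return sorted(anchors, key=lambda a: viewers(a['number']), reverse=True)


for position, anchor in enumerate(rank(SAMPLE), start=1):
    print('rank' + str(position) + ':' + anchor['name'] + ' ' + anchor['number'])
```

Skipping blocks with a missing name or number avoids the IndexError that indexing `[0]` on an empty match list would raise.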

Finally, congratulations to IG on winning the S8 championship today!
