decode raises UnicodeDecodeError: 'gb2312' codec can't decode byte 0x8f in position 6018: illegal multib
阿新 • Published: 2019-02-07
After fetching a web page with Python, decoding it with decode raised the following error:
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    html = html.decode("gb2312")
UnicodeDecodeError: 'gb2312' codec can't decode byte 0x8f in position 6018: illegal multibyte sequence
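My initial guess is that some bytes in the page are invalid, or are not in the encoding declared by charset in the <meta> tag. In that case the error can be worked around by passing the second argument of decode, errors. With errors='ignore', bytes that cannot be decoded are dropped silently instead of raising an exception.
Example:
html = html.decode("gb2312", errors="ignore")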
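To make the effect of the errors argument concrete, here is a minimal sketch. The byte string raw is a made-up sample (valid GB2312 text with one stray 0x8f byte in the middle), not the page from the traceback, and the variable names are only illustrative.

```python
# Hypothetical sample: two GB2312 characters, one stray byte, two more characters.
raw = "你好".encode("gb2312") + b"\x8f" + "世界".encode("gb2312")

# The default handler is 'strict', so the stray byte raises UnicodeDecodeError.
try:
    raw.decode("gb2312")
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc.reason)

# 'ignore' drops the undecodable byte; 'replace' substitutes U+FFFD for it.
ignored = raw.decode("gb2312", errors="ignore")
replaced = raw.decode("gb2312", errors="replace")

print(ignored)
print(replaced)
```

Note that 'ignore' loses data without any warning, so 'replace' can be easier to debug because the damaged spots stay visible. If the page is really encoded in a superset such as GBK or GB18030 (a guess about the cause, not something confirmed here), decoding with that codec may recover the characters instead of discarding them.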
Note: do not mistype 'ignore' as 'ignone', or an error will be raised!
Error message:
LookupError: unknown error handler name 'ignone'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Personal\Desktop\測試.py", line 8, in <module>
    html = rep.read().decode("gb2312",errors="ignone")
LookupError: decoding with 'gb2312' codec failed (LookupError: unknown error handler name 'ignone')
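One way to catch such a typo early is to check the handler name with codecs.lookup_error before decoding. The helper safe_decode below is a hypothetical name used only for illustration, and its defaults are assumptions chosen to match the example above.

```python
import codecs


def safe_decode(data: bytes, encoding: str = "gb2312", errors: str = "ignore") -> str:
    """Decode bytes after checking that the error handler name is registered."""
    # codecs.lookup_error raises LookupError for unknown names such as 'ignone'.
    codecs.lookup_error(errors)
    return data.decode(encoding, errors=errors)


try:
    safe_decode(b"abc", errors="ignone")
except LookupError as exc:
    print("bad handler name:", exc)
```

Called with a valid handler name, the helper behaves like a plain decode call; with a misspelled one, it fails immediately regardless of what the data contains.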