Python 2 encode and decode
https://docs.python.org/2/howto/unicode.html
A Unicode string is a sequence of code points, which are numbers from 0 to 0x10ffff. This sequence needs to be represented in memory as a sequence of bytes (values from 0 to 255). The rules for translating a Unicode string into a sequence of bytes are called an encoding.
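As a rough illustration of the definition above, the following sketch turns a short Unicode string into its code points and then into bytes under two encodings. The sample text and the variable names are made up for this note; the `u''` literals and `bytearray` are used so that the snippet should run on Python 2.7 as well as Python 3.

```python
# -*- coding: utf-8 -*-
# Illustrative sample text; u'\xe9' is the code point U+00E9 (e with acute accent).
text = u'caf\xe9'

# A Unicode string is a sequence of code points (integers).
code_points = [ord(ch) for ch in text]

# An encoding maps those code points to bytes; different encodings give different bytes.
utf8_bytes = text.encode('utf-8')
latin1_bytes = text.encode('latin-1')

# bytearray exposes each byte as an integer in the range 0-255.
utf8_values = list(bytearray(utf8_bytes))
latin1_values = list(bytearray(latin1_bytes))

print(code_points)    # [99, 97, 102, 233]
print(utf8_values)    # [99, 97, 102, 195, 169]
print(latin1_values)  # [99, 97, 102, 233]
```

The same four code points produce five bytes in UTF-8 but four in Latin-1, which is why the encoding has to be known before bytes can be read back as text.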
UTF-8 is probably the most commonly supported encoding. UTF stands for “Unicode Transformation Format”, and the ‘8’ means that 8-bit numbers are used in the encoding.
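To make the "8-bit numbers" point concrete, here is a small sketch that encodes a few sample characters as UTF-8 and counts the 8-bit units each one needs. The characters were chosen for this note only; a code point takes between one and four bytes depending on its value.

```python
# -*- coding: utf-8 -*-
# Sample characters picked for illustration: U+0041, U+00E9, U+20AC, U+1F600.
samples = [u'A', u'\xe9', u'\u20ac', u'\U0001F600']

# Number of 8-bit units (bytes) UTF-8 uses for each character.
lengths = [len(ch.encode('utf-8')) for ch in samples]

# The individual byte values for the euro sign, each in the range 0-255.
euro_bytes = list(bytearray(u'\u20ac'.encode('utf-8')))

print(lengths)     # [1, 2, 3, 4]
print(euro_bytes)  # [226, 130, 172]
```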
Python’s 8-bit strings have a .decode([encoding], [errors]) method that interprets the string using the given encoding.
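A minimal sketch of .decode() follows, including the optional errors argument. The byte strings are examples invented for this note, and the arguments are passed positionally because older Python 2 releases may not accept them as keywords.

```python
# -*- coding: utf-8 -*-
# Valid UTF-8 bytes for u'caf\xe9'.
raw = b'caf\xc3\xa9'
text = raw.decode('utf-8')

# 0x80 on its own is not valid UTF-8, so strict decoding (the default) fails.
bad = b'\x80abc'
try:
    bad.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'replace' substitutes U+FFFD for the undecodable byte; 'ignore' drops it.
replaced = bad.decode('utf-8', 'replace')
ignored = bad.decode('utf-8', 'ignore')

print(repr(text))
print(strict_failed)
print(repr(replaced))
print(repr(ignored))
```

With the default 'strict' handler a bad byte raises UnicodeDecodeError, so 'replace' or 'ignore' is only appropriate when losing information is acceptable.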
The unicode() constructor has the signature unicode(string[, encoding, errors]). All of its arguments should be 8-bit strings. The first argument is converted to Unicode using the specified encoding; if you leave off the encoding argument, the ASCII encoding is used for the conversion, so bytes with values greater than 127 will be treated as errors.
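The behaviour of unicode() can be sketched as below. Since unicode() exists only in Python 2, the snippet falls back to str under Python 3, where str(bytes, encoding, errors) plays a comparable role; that fallback is an assumption added for this note, not part of the HOWTO. The encoding is passed explicitly as 'ascii' so both versions behave the same; per the text above, Python 2 uses ASCII when the argument is omitted.

```python
# -*- coding: utf-8 -*-
try:
    unicode  # built-in on Python 2
except NameError:
    # Assumption for this sketch: on Python 3, str(bytes, encoding, errors) stands in.
    unicode = str

# Pure ASCII bytes convert cleanly.
ok = unicode(b'abc', 'ascii')

# A byte greater than 127 is an error under the ASCII encoding.
try:
    unicode(b'\x80abc', 'ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# The third argument chooses the error handler, as with .decode().
replaced = unicode(b'\x80abc', 'ascii', 'replace')

print(repr(ok))
print(ascii_failed)
print(repr(replaced))
```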