python2 unicode str

阿新 • Published: 2019-01-19

unicode

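Unicode is a character encoding standard that assigns a code point to every character; UTF-8 is one of the ways to encode those code points as bytes.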
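To make the distinction concrete, here is a minimal sketch (not part of the original post) that shows one character's code point next to two different byte encodings of it. It is written so that it should run on both Python 2.7 and Python 3; the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# Unicode assigns each character a number (its code point);
# UTF-8 and UTF-16 are two ways to serialize that number as bytes.
ch = u'\u554a'  # the character 啊

code_point = ord(ch)                   # the Unicode code point, an integer
utf8_bytes = ch.encode('utf-8')        # three bytes in UTF-8
utf16_bytes = ch.encode('utf-16-be')   # two bytes in big-endian UTF-16

print(hex(code_point))
print(repr(utf8_bytes))
print(repr(utf16_bytes))
```

The same code point yields different byte sequences depending on the encoding, which is why bytes alone do not say what text they represent.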

Encoding in Python 2

In [1]: a = '啊哈哈'
In [2]: a
Out[2]: '\xe5\x95\x8a\xe5\x93\x88\xe5\x93\x88'
In [4]: type(a)
Out[4]: str
In [5]: len(a)
Out[5]: 9
In [6]: b = u'姚赫赫'
In [7]: type(b)
Out[7]: unicode
In [8]: len(b)
Out[8]: 3
In [9]: a.decode('utf-8')
Out[9]: u'\u554a\u54c8\u54c8' 

In [10]: b
Out[10]: u'\u59da\u8d6b\u8d6b'

In [11]: b.encode('utf-8')
Out[11]: '\xe5\xa7\x9a\xe8\xb5\xab\xe8\xb5\xab'

In [12]: c = '姚赫赫'

In [13]: c
Out[13]: '\xe5\xa7\x9a\xe8\xb5\xab\xe8\xb5\xab'

In [14]: import sys

In [15]: sys.getdefaultencoding()
Out[15]: 'ascii'

In [16]: b + c
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-16-c6b7c7e5694f> in <module>()
----> 1 b + c

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)

In [17]: import sys

In [18]: relaod(sys)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-18-f73449e725b6> in <module>()
----> 1 relaod(sys)

NameError: name 'relaod' is not defined

In [19]: reload(sys)
Out[19]: <module 'sys' (built-in)>

In [20]: sys.setdefaultencoding('utf-8')

In [21]: b + c
Out[21]: u'\u59da\u8d6b\u8d6b\u59da\u8d6b\u8d6b'

In [22]: type(b + c)
Out[22]: unicode

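In Python 2, after a = '啊哈哈', a is a str: an already-encoded byte sequence, so len(a) is the number of bytes (9 here, because the session encodes the literal as UTF-8). b is a unicode object, which stores text, so len(b) is the number of characters (3). Mixing the two, as in b + c, makes Python 2 implicitly decode the str with the default encoding, ascii. That is why the first b + c raises UnicodeDecodeError and succeeds only after the default encoding is switched to utf-8.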
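The implicit ASCII decoding in b + c can also be avoided without changing the default encoding, by decoding the byte string explicitly. The following is a minimal sketch under that assumption, not the original author's code. It should run on both Python 2.7 and Python 3; because a plain '...' literal is already text in Python 3, the byte string is built here with encode(). The names text, raw and combined are illustrative.

```python
# -*- coding: utf-8 -*-
text = u'\u59da\u8d6b\u8d6b'   # unicode text: 姚赫赫
raw = text.encode('utf-8')     # the UTF-8 bytes (what a Python 2 str literal
                               # holds in a UTF-8 session or source file)

# The two lengths differ: characters versus bytes.
print(len(text))
print(len(raw))

# Decode explicitly before concatenating, instead of relying on
# the interpreter's default encoding.
combined = text + raw.decode('utf-8')
print(repr(combined))
```

Explicit decoding keeps the conversion visible at the point where it happens, whereas reload(sys) followed by sys.setdefaultencoding() changes behaviour for the whole process and has no equivalent in Python 3.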

Converting between the two

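str -> decode('utf-8') -> unicode
unicode -> encode('utf-8') -> str
When writing to a file opened with the built-in open(), a str can be written as is, but a unicode object has to be encoded first. Otherwise Python 2 tries to encode it with the default ascii codec and raises UnicodeEncodeError on non-ASCII text.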
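As a rough illustration of the file-writing rule, the sketch below writes the same text twice: once by encoding it by hand, and once through io.open, which takes an encoding argument and accepts unicode text directly. This is an assumption-laden example rather than the author's code; the temporary file and the variable names are made up for the demonstration, and it should run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

text = u'\u59da\u8d6b\u8d6b'  # 姚赫赫

# A temporary path, used only for this illustration.
fd, path = tempfile.mkstemp()
os.close(fd)

# Option 1: encode by hand and write the bytes to a binary-mode file.
with open(path, 'wb') as f:
    f.write(text.encode('utf-8'))

# Option 2: let io.open do the encoding; it accepts unicode text directly.
with io.open(path, 'a', encoding='utf-8') as f:
    f.write(text)

# Read the raw bytes back and decode them to recover the text.
with open(path, 'rb') as f:
    data = f.read()
os.remove(path)

print(repr(data.decode('utf-8')))
```

Both options put identical UTF-8 bytes on disk; the second keeps the encoding choice in one place, at the point where the file is opened.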
