
Fixing the UnicodeEncodeError when writing Chinese unicode output to a file in Python

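Problem: when parsing JSON strings with the json package in Python 2, a field whose value is Chinese prints to the screen without trouble, but redirecting the output to a file, or writing it to a file, raises UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128).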

Contents of the JSON file:

jsonstr.txt:
{"QN48":"tc_39282f82912aa6cc_14ef7fe562a_6f5d","city":"滄州","firstCategory":null,"secondCategor":null,"url":"http://touch.piao.qunar.com/touch/detail.htm?id=8928&in_track=t_qudaoyy_tuijian_menpiao&bd_source=xiaomiliulanqi"}
{"QN48":"tc_038c9119e228c705_14dbc12f7e4_dbe4","city":"北京","firstCategory":"親子","secondCategor":"動植物園","url":null}
{"QN48":"tc_48a929ab20cade19_147484fd593_f7ba","city":"廣州","firstCategory":"親子","secondCategor":"動植物園","url":null}
{"QN48":"tc_884a929ab20cade19_147484fd593_f7ba","city":"","firstCategory":"親子","secondCategor":"動植物園","url":""}
jsontest.py:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2 script: parse one JSON object per line and print the city field.
import json
with open('jsonstr.txt','rt') as f:
        for line in f:
                obj = json.loads(line)
                # JSON strings are decoded to unicode objects
                city=obj['city']
                print city,type(city)


Running it prints to the screen:



So far everything works. But when the output is redirected to a file, the UnicodeEncodeError quoted above is raised.



Writing to a file from within Python raises the same error.

Cause:

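The loads method returns the original object, but some data type conversions still take place along the way (the original post illustrates this with a figure). A JSON string becomes a unicode object in Python 2, not a str.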
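The conversion is easy to check directly. The sketch below is only an illustration: the record is a shortened, made-up line in the same shape as jsonstr.txt, and the variable names are not from the original script. It is written to run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import json

# One record in the same shape as jsonstr.txt (shortened for illustration).
# The \u escapes inside the JSON text spell the city name 北京.
record = '{"city": "\\u5317\\u4eac", "firstCategory": null}'
obj = json.loads(record)

city = obj["city"]
category = obj["firstCategory"]  # JSON null is converted to None

# Python 2 reports "unicode" here; Python 3 reports "str".
city_type_name = type(city).__name__
print(city_type_name)
```

On Python 2 the type name is unicode, which is exactly the type that later fails to encode as ASCII. On Python 3 it is str, which is already Unicode text.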

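When writing to a file, Python 2 tries to encode the unicode string with the ASCII codec, which fails for Chinese characters. English strings go through because every character falls within the ASCII range.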
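The failure can be reproduced without any file by requesting the ASCII encoding explicitly, which is what Python 2 does implicitly when standard output is redirected. This is a minimal sketch; the variable names are illustrative and not part of the original script.

```python
# -*- coding: utf-8 -*-

city = u"滄州"       # two non-ASCII characters, as in the first record
english = u"Beijing"  # ASCII-only text for comparison

# ASCII-only text encodes without complaint.
english_bytes = english.encode("ascii")

# Chinese text cannot be represented in ASCII, so encoding raises.
try:
    city.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

print(message)
```

The captured message matches the one in the problem description: positions 0-1 are the two Chinese characters, neither of which is in range(128).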

Solution:

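Convert the unicode object to a byte string (str). Calling str() directly does not work, because it performs the same implicit ASCII encoding and raises the same error. Instead, change the encoding explicitly by calling encode("utf-8") on the unicode string: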

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import json
with open('jsonstr.txt','rt') as f:
        for line in f:
                obj = json.loads(line)
                # Encode the unicode object to a UTF-8 byte string (str)
                city=obj['city'].encode("utf-8")
                print city,type(city)

The result:


With this change the output can be written to a file without error.
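For writing the file from within Python rather than through shell redirection, the same fix can be sketched as follows. The first half applies the encode("utf-8") approach described above. The second half is an alternative not covered in the original post: opening the file with io.open and an explicit encoding so the file object does the encoding. The records are shortened from jsonstr.txt, and the output paths and variable names are assumptions made for illustration.

```python
# -*- coding: utf-8 -*-
import io
import json
import os
import tempfile

# Two records shortened from jsonstr.txt; the u'' prefix keeps this
# valid on both Python 2 and Python 3.
lines = [
    u'{"QN48":"tc_39282f82912aa6cc_14ef7fe562a_6f5d","city":"滄州"}',
    u'{"QN48":"tc_038c9119e228c705_14dbc12f7e4_dbe4","city":"北京"}',
]
cities = [json.loads(line)["city"] for line in lines]

out_dir = tempfile.mkdtemp()  # hypothetical output location

# Approach from the article: encode to UTF-8 bytes, then write the bytes.
encoded_path = os.path.join(out_dir, "cities_encoded.txt")
with open(encoded_path, "wb") as out:
    for city in cities:
        out.write(city.encode("utf-8") + b"\n")

# Alternative (an assumption, not in the original post): let the file
# object encode the text by opening it with an explicit encoding.
text_path = os.path.join(out_dir, "cities_text.txt")
with io.open(text_path, "w", encoding="utf-8", newline="\n") as out:
    for city in cities:
        out.write(city + u"\n")

# Read both files back as raw bytes to compare them.
with open(encoded_path, "rb") as f:
    encoded_bytes = f.read()
with open(text_path, "rb") as f:
    text_bytes = f.read()
```

Both files end up with identical UTF-8 bytes. The io.open variant keeps the data as unicode until the moment it is written, which avoids scattering encode calls through the code.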

Reference:

http://www.cnblogs.com/coser/archive/2011/12/14/2287739.html