Fixing the error when writing Chinese unicode text to a file in Python
阿新 • Published: 2019-02-11
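Problem description: when parsing a JSON string with Python's json package, a field whose value is Chinese prints to the screen without trouble, but redirecting the output to a file, or writing it to a file, raises UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128).
Contents of the JSON file: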
<pre name="code" class="python">jsonstr.txt:
{"QN48":"tc_39282f82912aa6cc_14ef7fe562a_6f5d","city":"滄州","firstCategory":null,"secondCategor":null,"url":"http://touch.piao.qunar.com/touch/detail.htm?id=8928&in_track=t_qudaoyy_tuijian_menpiao&bd_source=xiaomiliulanqi"}
{"QN48":"tc_038c9119e228c705_14dbc12f7e4_dbe4","city":"北京","firstCategory":"親子","secondCategor":"動植物園","url":null}
{"QN48":"tc_48a929ab20cade19_147484fd593_f7ba","city":"廣州","firstCategory":"親子","secondCategor":"動植物園","url":null}
{"QN48":"tc_884a929ab20cade19_147484fd593_f7ba","city":"","firstCategory":"親子","secondCategor":"動植物園","url":""}
</pre>
jsontest.py:
<pre name="code" class="python">#!/usr/bin/env python
# -*- coding: utf-8 -*-
import json

# Read one JSON object per line and print the city with its Python type
with open('jsonstr.txt', 'rt') as f:
    for line in f:
        obj = json.loads(line)
        city = obj['city']
        print city, type(city)
</pre>
Running it, the screen shows:
Writing to a file from within Python raises the same error.
Cause:
The loads method returns the original object, but some data type conversions still take place (see the figure below). A string in JSON becomes a unicode object in Python.
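To make the type conversion concrete, here is a minimal sketch that parses one line from jsonstr.txt and records the Python type of each value. The `type_names` variable is only an illustrative helper, not part of the original script. The snippet is written to run on both Python 2 and Python 3, so the string fields should show up as unicode on Python 2 and as str on Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import json

# One line taken from jsonstr.txt above
line = u'{"QN48":"tc_038c9119e228c705_14dbc12f7e4_dbe4","city":"北京","firstCategory":"親子","secondCategor":"動植物園","url":null}'

obj = json.loads(line)

# Illustrative helper: map each key to the name of its Python type
type_names = dict((key, type(value).__name__) for key, value in obj.items())

for key in sorted(type_names):
    print(key, type_names[key])
```

The JSON null in the url field comes back as None, while every JSON string comes back as a text object rather than a byte string, which is where the trouble starts on Python 2.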
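When writing to a file, Python 2 tries to convert the unicode string to ASCII characters. Chinese characters fall outside the ASCII range (0-127), so the conversion fails, whereas English strings are pure ASCII and convert without error.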
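The failing conversion can be reproduced on its own, without any file involved. The sketch below is an illustration rather than the exact code path Python takes internally: it encodes a Chinese city name and an English one with the ascii codec and with utf-8. `try_encode` is a hypothetical helper introduced only for this example.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

def try_encode(text, codec):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return exc

city = u"滄州"

ascii_result = try_encode(city, "ascii")          # fails: characters are not ASCII
utf8_result = try_encode(city, "utf-8")           # succeeds: 3 bytes per character
english_result = try_encode(u"Beijing", "ascii")  # succeeds: pure ASCII

print(type(ascii_result).__name__)
print(len(utf8_result))
print(english_result)
```

The two characters of the city name occupy positions 0 and 1, which matches the "position 0-1" reported in the error message from the problem description.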
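Solution:
Convert the unicode object into a str (byte string). Calling str() on it directly does not work, though; it raises the same error. What is needed is an explicit encoding conversion: call encode("utf-8") on the unicode string: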
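<pre name="code" class="python">#!/usr/bin/env python
# -*- coding: utf-8 -*-
import json

with open('jsonstr.txt', 'rt') as f:
    for line in f:
        obj = json.loads(line)
        # Encode the unicode object into a UTF-8 byte string before printing
        city = obj['city'].encode("utf-8")
        print city, type(city)
</pre>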
The result is as follows:
With this change, the file can be written without problems.
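For the case of writing the file from within Python, a slightly different approach is to let the file object do the encoding. This is a sketch of an alternative, not the method used in this post: it opens the output with io.open and encoding="utf-8", so the unicode values from json.loads can be written as they are. The sample lines, the temporary directory, and the cities.txt file name are all assumptions made up for the example.

```python
# -*- coding: utf-8 -*-
import io
import json
import os
import tempfile

# Assumed sample input: one JSON object per line, as in jsonstr.txt
sample_lines = [
    u'{"city": "滄州", "url": null}',
    u'{"city": "北京", "url": null}',
]

workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "cities.txt")  # hypothetical output file

# io.open encodes text to UTF-8 on write, so no manual encode() call is needed
with io.open(out_path, "w", encoding="utf-8") as out:
    for line in sample_lines:
        city = json.loads(line)["city"]
        out.write(city + u"\n")

# Read the file back to check what was written
with io.open(out_path, "r", encoding="utf-8") as check:
    written = check.read().splitlines()
```

Either way the principle is the same: the text has to be encoded as UTF-8 at some point before it reaches the file, whether by calling encode("utf-8") yourself or by telling the file object which encoding to use.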
Reference:
http://www.cnblogs.com/coser/archive/2011/12/14/2287739.html