Python 3 Crawler: Persisting Data to a File
阿新 • Published: 2018-11-07
Chinese text is not displayed correctly when data is saved to a file with json.dumps()
import json


def write_to_file(content):
    '''
    Persist a record to a txt file.
    :param content: dict object
    :return: None
    '''
    # 'a': append mode. ensure_ascii is not set here, so json.dumps() escapes Chinese as \uXXXX.
    with open('maoyanTop100.txt', 'a', encoding='utf8') as f:
        f.write(json.dumps(content) + '\n')
The file contents look like this:
{"the_index": "21", "image_url": "http://p0.meituan.net/movie/[email protected]_220h_1e_1c", "title": "\u6307\u73af\u738b3\uff1a\u738b\u8005\u65e0\u654c", "actor": "\u4f0a\u83b1\u8d3e\u00b7\u4f0d\u5fb7,\u4f0a\u6069\u00b7\u9ea6\u514b\u83b1\u6069,\u4e3d\u8299\u00b7\u6cf0\u52d2", "the_time": "2004-03-15", "score": "9.2"} ...
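When serializing, json.dumps defaults to ensure_ascii=True, which escapes every non-ASCII character as a \uXXXX sequence. That is why the Chinese text appears as escape codes. To write the actual Chinese characters, pass ensure_ascii=False.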
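The difference can be seen without touching a file. The following is a minimal sketch; the `record` dictionary is a made-up sample that only mirrors two of the fields in the output above.

```python
import json

# Hypothetical sample record, mirroring two fields of the scraped data.
record = {"title": "霸王別姬", "score": "9.6"}

# Default behaviour (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(record)

# ensure_ascii=False: non-ASCII characters are kept as-is in the output string.
readable = json.dumps(record, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings are valid JSON and decode back to the same dictionary.
assert json.loads(escaped) == json.loads(readable) == record
```

The escaped form is not corrupted data; it is just harder for a person to read. Any JSON parser restores the original characters from it.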
Adding ensure_ascii=False
import json


def write_to_file(content):
    '''
    Persist a record to a txt file.
    :param content: dict object
    :return: None
    '''
    # encoding='utf8' and ensure_ascii=False together make the Chinese text readable in the file
    with open('maoyanTop100.txt', 'a', encoding='utf8') as f:
        f.write(json.dumps(content, ensure_ascii=False) + '\n')
The file contents now look like this:
{"the_index": "1", "image_url": "http://p1.meituan.net/movie/[email protected]_220h_1e_1c", "title": "霸王別姬", "actor": "張國榮,張豐毅,鞏俐", "the_time": "1993-01-01", "score": "9.6"}
...
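Because each call appends one JSON object followed by a newline, the file can be read back one line at a time. The sketch below is only an illustration under some assumptions: `read_from_file` is a hypothetical helper that is not part of the original post, `write_to_file` takes an extra `path` parameter, and the records are written to a temporary directory so the example does not touch a real maoyanTop100.txt.

```python
import json
import os
import tempfile


def write_to_file(content, path):
    # Same logic as above, with the path passed in (an assumption for this sketch).
    with open(path, 'a', encoding='utf8') as f:
        f.write(json.dumps(content, ensure_ascii=False) + '\n')


def read_from_file(path):
    # Hypothetical helper: parse one JSON object per non-empty line.
    with open(path, encoding='utf8') as f:
        return [json.loads(line) for line in f if line.strip()]


# Write two made-up records to a temporary file, then load them again.
path = os.path.join(tempfile.mkdtemp(), 'maoyanTop100.txt')
write_to_file({"title": "霸王別姬", "score": "9.6"}, path)
write_to_file({"title": "Example", "score": "9.0"}, path)

movies = read_from_file(path)
print(len(movies), movies[0]["title"])
```

Reading works the same way whether or not the file was written with ensure_ascii=False, since json.loads decodes \uXXXX escapes back into the original characters.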