Python 3: encoding error when reading a file
阿新 • Published: 2018-11-22
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/yangguang/machineLearning/learn_machineLearning/Tensorflow_learning/cnn_own/data_prepare/src/tfrecord.py", line 129, in _process_image_files_batch
    image_buffer, height, width = _process_image(filename, coder)
  File "/home/yangguang/machineLearning/learn_machineLearning/Tensorflow_learning/cnn_own/data_prepare/src/tfrecord.py", line 71, in _process_image
    image_data = f.read()
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

INFO:root:2018-05-25 11:11:55.104869: Finished writing all 4800 images in data set.
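Running the code produced the error above. The cause is that the file being read is binary (here an image; the leading byte 0xff is consistent with a JPEG), but it was opened in text mode, so Python tried to decode its contents as UTF-8. The offending code:

    with open(filename, 'r') as f:
        image_data = f.read()

The fix is to change the mode passed to open() from 'r' to 'rb' (b stands for binary), so that read() returns the raw bytes without decoding.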
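The difference between the two modes can be reproduced without a real image. The following is a minimal sketch, assuming a temporary file that holds only the first bytes of a JPEG header; the file, its contents, and the variable names are made up for illustration and are not taken from tfrecord.py.

```python
import os
import tempfile

# Hypothetical stand-in for an image file: a JPEG starts with bytes FF D8.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 16

with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
    tmp.write(fake_jpeg)
    filename = tmp.name

try:
    # Text mode: Python tries to decode the bytes as UTF-8 and fails.
    try:
        with open(filename, "r", encoding="utf-8") as f:
            f.read()
        text_mode_error = None
    except UnicodeDecodeError as exc:
        text_mode_error = exc

    # Binary mode: the raw bytes come back unchanged, no decoding involved.
    with open(filename, "rb") as f:
        image_data = f.read()
finally:
    os.remove(filename)

print(type(text_mode_error).__name__)
print(image_data == fake_jpeg)
```

In text mode the read fails on the very first byte, which matches "position 0" in the traceback; in binary mode read() returns a bytes object that can be handed to an image decoder as-is.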
A related issue: in Python 2, TensorFlow's BytesList accepts a string directly, but in Python 3 the string must first be converted to bytes:
text_b = bytes(text, encoding='utf-8')
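The conversion can be wrapped so that values that are already bytes pass through untouched. This is a rough sketch using only the standard library; the helper name to_bytes and the sample value are hypothetical, and the TensorFlow call is shown only as a comment on the assumption that TensorFlow is installed.

```python
text = "cat_001.jpg"  # hypothetical sample value

# Two equivalent ways to turn a str into UTF-8 bytes in Python 3.
text_b = bytes(text, encoding="utf-8")
same_b = text.encode("utf-8")

# Decoding with the same codec recovers the original string.
round_trip = text_b.decode("utf-8")


def to_bytes(value):
    """Return value as bytes, encoding str as UTF-8 (hypothetical helper)."""
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


# With TensorFlow installed, the result would typically be passed as:
#   tf.train.BytesList(value=[to_bytes(text)])
print(text_b == same_b, round_trip == text)
```

Checking for bytes first means the same helper can be used for image data read with 'rb' and for text fields such as file names or labels.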