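Beginner pitfalls when running models on TensorFlow
I. from ._ellip_harm_2 import _ellipsoid, _ellipsoid_norm ImportError: cannot import name '_ellipsoid'
In Python you may run into a cannot import name 'XXX' error. One possible cause is the order in which modules are imported. For example, when the top of file A executes from B import XXX, the interpreter immediately switches to file B and executes it from top to bottom so that XXX can be defined, while file A is paused until XXX is available. But the top of file B may have imports of its own. If B in turn imports a function from A, that function has not been defined yet, because A is still paused at its import statement, so the import fails with the error above.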
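The import-order problem described above can be reproduced with a minimal sketch. The module names mod_a and mod_b and their helper functions are hypothetical, made up purely for illustration; the script writes them to a temporary directory and then imports one of them.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_circular_import():
    """Create two modules that import from each other and return the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        # mod_a needs helper_b from mod_b before it defines helper_a ...
        Path(tmp, "mod_a.py").write_text(
            "from mod_b import helper_b\n\ndef helper_a():\n    return 'a'\n"
        )
        # ... while mod_b needs helper_a from mod_a, which is still paused.
        Path(tmp, "mod_b.py").write_text(
            "from mod_a import helper_a\n\ndef helper_b():\n    return 'b'\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("mod_a")
        except ImportError as exc:
            return str(exc)
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("mod_a", None)
            sys.modules.pop("mod_b", None)
    return None


if __name__ == "__main__":
    print(demo_circular_import())
```

The returned message begins with cannot import name 'helper_a'; the exact wording after that differs between Python versions. Moving one of the imports inside the function that needs it, or moving the shared function into a third module, is a common way to break such a cycle.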
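II. libstdc++.so.6: version `CXXABI_1.3.9' not found (required by..... )
libstdc++.so.6 is found on the system at /usr/lib/libstdc++.so.6 or /usr/lib/x86_64-linux-gnu/libstdc++.so.6. This problem can occur when another installation (for example anaconda) ships its own copy of this shared library.
1. Inspect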
strings /usr/lib/libstdc++.so.6 | grep 'CXXABI'
or
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep 'CXXABI'
The output looks like this:
CXXABI_1.3
CXXABI_1.3.1
CXXABI_1.3.2
CXXABI_1.3.3
CXXABI_1.3.4
CXXABI_1.3.5
CXXABI_1.3.6
CXXABI_1.3.7
CXXABI_1.3.8
CXXABI_1.3.9
CXXABI_TM_1
CXXABI_FLOAT128
CXXABI_1.3.9 is present. Inspecting the libstdc++.so.6 file under anaconda3/lib/ the same way
strings anaconda3/lib/libstdc++.so.6 | grep 'CXXABI'
shows that the highest version there is only CXXABI_1.3.7.
2. Copy the shared library
# Remove the original libstdc++.so.6
sudo rm -rf anaconda3/lib/libstdc++.so.6
# Copy the newer shared library; check which version you actually have
sudo cp /usr/lib/libstdc++.so.6.0.21 /home/ubuntu/anaconda3/lib/
3. Create the symlink
cd anaconda3/lib/
sudo chmod +r libstdc++.so.6.0.21
sudo ln -sf libstdc++.so.6.0.21 libstdc++.so.6
sudo ldconfig
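The comparison done by hand with strings and grep in step 1 can also be sketched in Python. This is only an illustration: the function names are hypothetical, and the regular expression assumes the version tags appear as plain ASCII in the library file, as they do in the strings output above.

```python
import re

# Matches tags such as CXXABI_1.3.9, CXXABI_TM_1 or CXXABI_FLOAT128.
_TAG_RE = re.compile(rb"CXXABI_[A-Za-z0-9_.]+")


def cxxabi_tags(blob):
    """Return the set of CXXABI_* tags found in the raw bytes of a library."""
    return {tag.decode("ascii") for tag in _TAG_RE.findall(blob)}


def tags_missing_from(system_blob, other_blob):
    """Tags present in the system library but absent from the other copy."""
    return sorted(cxxabi_tags(system_blob) - cxxabi_tags(other_blob))
```

To use it, read both files in binary mode, for example open("/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "rb").read() and the copy under anaconda3/lib/, and pass the two byte strings to tags_missing_from. If CXXABI_1.3.9 is in the result, the anaconda copy is the older one.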
How to handle the causes above depends on the specific situation; analyze each case on its own.
/home/jdmdx/anaconda3/envs/tensorflow/bin/python "/home/jdmdx/tensorflow code/MobileNetV2-master/train_mobilenetv2.py"
2018-04-18 01:01:04.704523: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-18 01:01:04.744722: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Feature: image/encoded (data type: string) is required but could not be found.
2018-04-18 01:01:04.750963: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Feature: image/encoded (data type: string) is required but could not be found.
2018-04-18 01:01:04.751038: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Feature: image/encoded (data type: string) is required but could not be found.
2018-04-18 01:01:04.751253: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Feature: image/encoded (data type: string) is required but could not be found.
WARNING:tensorflow:From /home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
[*] Try to load trained model...
[*] Reading checkpoints...
Traceback (most recent call last):
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1327, in _do_call
return fn(*args)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1312, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1420, in _call_tf_sessionrun
status, run_metadata)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1,1,1280,9] rhs shape= [1,1,1280,10]
[[Node: save/Assign_24 = Assign[T=DT_FLOAT, _class=["loc:@mobilenetv2/logits/w"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mobilenetv2/logits/w, save/RestoreV2:24)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/jdmdx/tensorflow code/MobileNetV2-master/train_mobilenetv2.py", line 118, in <module>
main()
File "/home/jdmdx/tensorflow code/MobileNetV2-master/train_mobilenetv2.py", line 86, in main
could_load, step = load(sess, saver, args.checkpoint_dir)
File "/home/jdmdx/tensorflow code/MobileNetV2-master/train_mobilenetv2.py", line 18, in load
saver.restore(sess, os.path.join(checkpoint_dir, ckpt_name))
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1775, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
run_metadata_ptr)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1140, in _run
feed_dict_tensor, options, run_metadata)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
run_metadata)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1,1,1280,9] rhs shape= [1,1,1280,10]
[[Node: save/Assign_24 = Assign[T=DT_FLOAT, _class=["loc:@mobilenetv2/logits/w"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mobilenetv2/logits/w, save/RestoreV2:24)]]
Caused by op 'save/Assign_24', defined at:
File "/home/jdmdx/tensorflow code/MobileNetV2-master/train_mobilenetv2.py", line 118, in <module>
main()
File "/home/jdmdx/tensorflow code/MobileNetV2-master/train_mobilenetv2.py", line 81, in main
saver = tf.train.Saver()
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1311, in __init__
self.build()
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1320, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1357, in _build
build_save=build_save, build_restore=build_restore)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 809, in _build_internal
restore_sequentially, reshape)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 470, in _AddRestoreOps
assign_ops.append(saveable.restore(saveable_tensors, shapes))
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 162, in restore
self.op.get_shape().is_fully_defined())
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 281, in assign
validate_shape=validate_shape)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 61, in assign
use_locking=use_locking, name=name)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3290, in create_op
op_def=op_def)
File "/home/jdmdx/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1654, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [1,1,1280,9] rhs shape= [1,1,1280,10]
[[Node: save/Assign_24 = Assign[T=DT_FLOAT, _class=["loc:@mobilenetv2/logits/w"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mobilenetv2/logits/w, save/RestoreV2:24)]]
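I did not know what this error was. Adding sess = tf.Session() on top of the existing sess caused MobileNetV2 training to hang. The traceback itself points to a shape mismatch: mobilenetv2/logits/w is reported as [1,1,1280,9] on the left-hand side of the assignment (the variable in the current graph) and [1,1,1280,10] on the right-hand side (the value restored from the checkpoint), which usually means the number of classes configured for this run differs from the one the checkpoint was trained with.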
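The traceback above reports mobilenetv2/logits/w as [1,1,1280,9] on one side and [1,1,1280,10] on the other. A small sketch of how such mismatches could be listed before calling restore follows. It works on plain dictionaries that map a variable name to its shape; how those dictionaries are obtained is an assumption (in TensorFlow 1.x, tf.train.list_variables on the checkpoint and get_shape().as_list() on the graph variables are one option). The variable name mobilenetv2/conv1/w is hypothetical.

```python
def split_by_shape(graph_shapes, ckpt_shapes):
    """Split graph variables into those safe to restore and those that are not.

    Both arguments map a variable name to its shape (a list of ints).
    Returns (restorable, skipped), each a sorted list of names.
    """
    restorable, skipped = [], []
    for name, shape in graph_shapes.items():
        if name in ckpt_shapes and list(ckpt_shapes[name]) == list(shape):
            restorable.append(name)
        else:
            # Either missing from the checkpoint or stored with another shape.
            skipped.append(name)
    return sorted(restorable), sorted(skipped)


graph = {
    "mobilenetv2/logits/w": [1, 1, 1280, 9],   # lhs shape from the traceback
    "mobilenetv2/conv1/w": [3, 3, 3, 32],      # hypothetical variable
}
ckpt = {
    "mobilenetv2/logits/w": [1, 1, 1280, 10],  # rhs shape from the traceback
    "mobilenetv2/conv1/w": [3, 3, 3, 32],      # hypothetical variable
}
restorable, skipped = split_by_shape(graph, ckpt)
print(restorable)  # ['mobilenetv2/conv1/w']
print(skipped)     # ['mobilenetv2/logits/w']
```

The variables whose names end up in restorable could then be passed as var_list to tf.train.Saver, leaving the mismatched logits layer at its fresh initialization. Whether that is appropriate depends on whether the change in class count was intended; if it was not, fixing the class count or pointing checkpoint_dir at a matching checkpoint is the simpler route.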
IndexError: list index out of range
When this error is raised from the config file, it is because not all of the expected input arguments were supplied.
Tensorflow FIFOQueue '_4_batch_join/fifo_queue' is closed and has insufficient elements
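This error appeared while running the MobileNet model. In my case the cause was that sess = tf.InteractiveSession() had not been written before sess.run. More generally, the message says the input queue was closed before it held enough elements, so the input pipeline is worth checking as well.
Fixing the Expected "required", "optional", or "repeated". error when generating Python code with protobuf
Google Protocol Buffers, Protobuf for short, provides a flexible, efficient, automated mechanism for serializing structured data. Think of XML, but smaller, faster, and simpler. You define your data format once, and the Protobuf compiler then generates source code in various languages for reading and writing data in that format. It is language-neutral and platform-neutral, and an existing format can be extended without breaking data written in the old format.
Many of Google's open-source projects use protobuf; the recently released object_detection, for example, contains such definitions. While trying to build this object detection program, I found that protobuf reported the following error: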
.proto:386:3: Expected "required", "optional", or "repeated".
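Looking into this, I found that the protobuf on my machine was version 2.5 (check with protoc --version). The likely cause is that newer versions added syntax the old compiler does not understand. The requirements also turned out to call for version 2.6, so the only option was to build a newer protobuf.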
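The version check can be scripted so that a build fails early with a clear message. The sketch below assumes that protoc --version prints a line such as libprotoc 2.5.0; the function names are hypothetical, and the 2.6 minimum is the requirement mentioned above.

```python
import re
import subprocess


def parse_protoc_version(text):
    """Extract a version tuple from output such as 'libprotoc 2.5.0'."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def protoc_is_new_enough(text, minimum=(2, 6)):
    """True if the version in `text` is at least `minimum`."""
    return parse_protoc_version(text) >= minimum


def installed_protoc_version():
    """Run `protoc --version` and return its output; needs protoc on PATH."""
    return subprocess.check_output(["protoc", "--version"]).decode().strip()
```

Calling protoc_is_new_enough(installed_protoc_version()) before running the code generation step would flag a 2.5 compiler before it reaches the .proto files.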
So I downloaded the 2.6 package from https://github.com/google/protobuf/releases?after=v2.6.1.
Then I installed it the traditional way: ./configure, make -j4, sudo make install.
One more step is needed at this point: set the following in /etc/profile:
export LD_LIBRARY_PATH=/usr/local/lib
Otherwise the following error is reported:
protoc: error while loading shared libraries: libprotoc.so.9: cannot open shared object file: No such file or directory
Once the installation is complete, run the code generation again and it succeeds.
protoc ./object_detection/protos/*.proto --python_out=.
dscbigdata-Lenovo-Product:~/work/tensorflow/models-master$
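At this point the corresponding Python files have been generated.
When training mobilev1 in object_detection, the error failed to find any matching files for .......................(checkpoint path) appears.
This training uses transfer learning, so a checkpoint trained by someone else must be downloaded first to speed up training.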
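Before starting a fine-tuning run, it can help to confirm that the checkpoint path in the config actually matches files on disk. The sketch below assumes the usual TensorFlow layout in which a checkpoint prefix such as model.ckpt is stored as files whose names begin with that prefix (for example model.ckpt.index). The function names and the example file names are hypothetical.

```python
import os


def matching_names(prefix_name, names):
    """Return the names that belong to the checkpoint prefix `prefix_name`."""
    return sorted(
        n for n in names
        if n == prefix_name or n.startswith(prefix_name + ".")
    )


def find_checkpoint_files(prefix):
    """List files on disk that match a checkpoint prefix; [] if none exist."""
    directory, prefix_name = os.path.split(prefix)
    directory = directory or "."
    if not os.path.isdir(directory):
        return []
    return [
        os.path.join(directory, n)
        for n in matching_names(prefix_name, os.listdir(directory))
    ]
```

An empty result from find_checkpoint_files for the path given in the config suggests the pretrained checkpoint has not been downloaded or the path is wrong, which is the situation the error message describes.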
/////////////////////////////////////////////////////////////////////////////////////////////////////////////
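After an abnormal shutdown, my Ubuntu 16.04 machine came back up at a very low resolution; Settings offered only 800*600. Searching around showed that the abnormal shutdown had left the driver in a broken state.
The fix: in Settings, choose the update option:
Then choose the drivers section:
The computer used for the screenshots has no GPU, so nothing is listed there. On a machine with a GPU, select the NVIDIA driver, update it, and reboot.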