Pitfalls I hit while installing CUDA and cuDNN

Tags: troubleshooting notes, tensorflow, cuda, cudnn

My dataset had grown too large, so it was time to switch to tensorflow-gpu. I opened https://www.tensorflow.org/install/source_windows

I was installing TensorFlow 2.2, so I looked up its entry on that page.

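In other words, I needed CUDA 10.1 and cuDNN 7.6. (This combination requires GPU driver 418.x or newer. My GTX 1080 has compute capability 6.1 and I had driver 457.x installed, so no problem there~)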
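The version requirements above can be written down as a small lookup, which makes it harder to grab the wrong package later. This is only a sketch: the dictionary layout and the helper names are made up for illustration (they are not a TensorFlow API), the driver string "457.30" is a hypothetical stand-in for "457.x", and the `cudnn64_<major>.dll` naming is inferred from the DLL names that appear in the logs of this post.

```python
# Requirements for TensorFlow 2.2 as stated in this post.
# The dict layout and helper names are illustrative, not a TensorFlow API.
REQUIREMENTS = {
    "2.2": {"cuda": "10.1", "cudnn": "7.6", "min_driver": 418},
}


def driver_ok(tf_version, driver_version):
    """Return True if the driver's major version meets the minimum."""
    major = int(driver_version.split(".")[0])
    return major >= REQUIREMENTS[tf_version]["min_driver"]


def expected_cudnn_dll(tf_version):
    """Name of the cuDNN DLL to expect on Windows, based on the cuDNN major version."""
    cudnn_major = REQUIREMENTS[tf_version]["cudnn"].split(".")[0]
    return f"cudnn64_{cudnn_major}.dll"


print(driver_ok("2.2", "457.30"))  # True ("457.30" is a hypothetical 457.x driver)
print(expected_cudnn_dll("2.2"))   # cudnn64_7.dll
```

Knowing the expected DLL name up front makes a mismatch such as a `cudnn64_8.dll` on disk easy to spot.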

So I went to the CUDA and cuDNN websites and downloaded CUDA along with a cuDNN package.

After installing, I verified the setup by starting ipython from cmd:

import tensorflow as tf
tf.config.list_physical_devices('GPU')

The result:

In [1]: import tensorflow as tf
2020-12-08 09:28:33.192053: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll

In [2]: tf.config.list_physical_devices('GPU')
2020-12-08 09:28:38.076395: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-12-08 09:28:38.102058: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7845GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2020-12-08 09:28:38.108714: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-12-08 09:28:38.150620: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-12-08 09:28:38.195522: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-12-08 09:28:38.205895: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-12-08 09:28:38.257251: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-12-08 09:28:38.281755: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-12-08 09:28:38.285713: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
2020-12-08 09:28:38.288173: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Out[2]: []
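In a log this long the one line that matters is easy to miss. A minimal sketch for pulling out the libraries TensorFlow failed to load is shown here. The function name is hypothetical, the regular expression is based only on the warning format visible in the log above, and the sample text is an abridged copy of two of those lines.

```python
import re

# Matches warnings such as:
#   Could not load dynamic library 'cudnn64_7.dll'; dlerror: ...
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names reported as not loadable, in log order."""
    return MISSING_LIB.findall(log_text)


# Abridged from the log above (timestamps and full source paths dropped).
sample = (
    "I dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll\n"
    "W dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; "
    "dlerror: cudnn64_7.dll not found\n"
)
print(missing_libraries(sample))  # ['cudnn64_7.dll']
```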

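Huh??? How can cudnn64_7.dll be missing??? I looked at my cuDNN DLL and found that I had somehow installed version 8 (clumsy clicking, probably). So I opened the cuDNN site again and made sure I clicked the right download this time.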
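One way to see which cuDNN DLL a Windows install actually exposes is to scan the directories on PATH for files named `cudnn64_*.dll`. The sketch below uses only the standard library. The function name is hypothetical, and it assumes the DLL is expected to be reachable through PATH, which is how the log above suggests the lookup failed.

```python
import glob
import os


def find_cudnn_dlls(path_value=None):
    """List every cudnn64_*.dll found in the directories of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    found = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue
        # glob.escape keeps characters like [ ] in folder names from acting as wildcards.
        pattern = os.path.join(glob.escape(directory), "cudnn64_*.dll")
        found.extend(sorted(glob.glob(pattern)))
    return found


for dll in find_cudnn_dlls():
    print(dll)
```

If this prints a `cudnn64_8.dll` while TensorFlow is asking for `cudnn64_7.dll`, the wrong cuDNN major version is installed. If it prints nothing, no cuDNN DLL is on PATH at all.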

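Once that was installed, I added the environment variable and ran the verification again.

This time the GPU list was simply empty??? The previous attempt at least found the device and just couldn't use it. This time there was nothing at all???

Fine. I took another look at my cuDNN folder and found that each directory contained only a single file (the version 8 package from earlier had several files in each directory).

I wondered whether the cause was that I hadn't restarted ipython, or whether the file depended on some relative path.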

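So I copied the cuDNN files straight into the directories of the same name under the CUDA directory, and deleted the environment variable I had just added.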
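The copy step can be sketched in a few lines of Python. This is an illustration, not an official installer: the function name is made up, both paths in the usage comment are hypothetical and must be adjusted, and writing under Program Files normally needs an elevated prompt.

```python
import os
import shutil


def merge_cudnn_into_cuda(cudnn_root, cuda_root):
    """Copy every file under cudnn_root into the same relative folder under cuda_root."""
    copied = []
    for folder, _subdirs, files in os.walk(cudnn_root):
        relative = os.path.relpath(folder, cudnn_root)
        target_dir = os.path.normpath(os.path.join(cuda_root, relative))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            target = os.path.join(target_dir, name)
            shutil.copy2(os.path.join(folder, name), target)
            copied.append(target)
    return copied


# Hypothetical paths: point the first at the unzipped cuDNN folder and the
# second at the CUDA 10.1 install directory on your machine.
# merge_cudnn_into_cuda(
#     r"C:\Downloads\cudnn\cuda",
#     r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1",
# )
```

With the files sitting next to the CUDA DLLs, no separate environment variable for cuDNN is needed, because the CUDA directory is already on the search path (the log shows the other CUDA DLLs loading from it).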

Then I tried it in the ipython session that was still open:

It raised an error right away.

After restarting ipython:

In [1]: import tensorflow as tf
2020-12-08 09:46:13.198478: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll

In [2]: tf.config.list_physical_devices('GPU')
2020-12-08 09:46:16.595417: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-12-08 09:46:16.615413: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 computeCapability: 6.1
coreClock: 1.7845GHz coreCount: 20 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 298.32GiB/s
2020-12-08 09:46:16.620825: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-12-08 09:46:16.625676: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-12-08 09:46:16.630241: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-12-08 09:46:16.633591: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-12-08 09:46:16.638724: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-12-08 09:46:16.643290: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-12-08 09:46:16.660019: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-12-08 09:46:16.662985: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
Out[2]: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Problem solved! Hooray~