System Installation Notes and Deep Learning Environment Setup

阿新 • Published: 2018-12-03

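1. Problems installing Ubuntu 16.04 on a Dell AL

1) The SSD is not detected

Dell ships its machines with the SATA mode in the BIOS set to RAID On. In this mode the SSD may not be detected correctly, or may not reach its full performance. The steps below switch the mode from RAID to AHCI.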

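In Windows, press Win+R, type msconfig, open the Boot tab, and tick Safe boot with the Minimal option.

Click Restart. During startup, press F2 to enter the BIOS, open the Advanced page, select SATA Operation, press Enter, and choose AHCI. The BIOS warns that the operating system will have to be reinstalled; ignore the warning and choose YES. Then press F10, choose YES, and let the machine reboot. Windows now starts in safe mode. Press Win+R again, run msconfig, and on the Boot tab clear all the Safe boot options ticked earlier.

Click OK and choose Restart. If Windows boots normally, AHCI mode is now active.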

2) Wrong touchscreen driver

sudo su
echo 'blacklist i2c_hid' >> /etc/modprobe.d/blacklist.conf
depmod -a
update-initramfs -u

and reboot
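
Running the echo command above twice appends the blacklist line twice. As a minimal sketch (not part of the original procedure), the same edit can be made idempotent by working on the file contents as text. The function name ensure_blacklisted and the sample contents are hypothetical, chosen only for illustration.

```python
def ensure_blacklisted(conf_text: str, module: str) -> str:
    """Return conf_text with a 'blacklist <module>' line, added only if missing."""
    wanted = f"blacklist {module}"
    for line in conf_text.splitlines():
        # Only an active (uncommented) line with exactly this content counts.
        if line.strip() == wanted:
            return conf_text
    # Make sure the new entry starts on its own line.
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + wanted + "\n"


# Hypothetical file contents, for illustration only.
sample = "# existing entries\nblacklist evbug\n"
once = ensure_blacklisted(sample, "i2c_hid")
twice = ensure_blacklisted(once, "i2c_hid")
print(once, end="")
print("unchanged on second call:", once == twice)
```

The function only transforms text; writing the result back to /etc/modprobe.d/blacklist.conf still needs root privileges, followed by depmod -a and update-initramfs -u as shown above.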

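3) Black screen

After installing Ubuntu 16.04 the machine may boot to a black screen. There are two ways to fix it.

Method 1:

At the boot menu, press "e" to enter GRUB's edit mode.
Find "quiet splash" and append "nomodeset" after it. This turns off kernel mode setting, which lets the system boot until a proper NVIDIA driver is installed.

Press Ctrl+X or F10 to boot.
Edit /etc/default/grub with administrator privileges.
Find GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" and change it to GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset".
Update GRUB with sudo update-grub, then reboot.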

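Method 2: if the system does manage to boot after installation, log in and run:

sudo nano /etc/default/grub
Find this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" and change it to GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset"
Save with Ctrl+O and exit with Ctrl+X (follow the hints at the bottom of the nano window), then update GRUB: sudo update-grub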
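
The GRUB edit can also be expressed as a small text transformation, which makes it easy to check before touching the real file. This is only a sketch under the assumption that the file contains a double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, as on a stock Ubuntu 16.04 install; the function name add_kernel_param is made up for this example.

```python
import re


def add_kernel_param(grub_text: str, param: str = "nomodeset") -> str:
    """Append param to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there."""
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE
    )

    def patch(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return pattern.sub(patch, grub_text)


# Hypothetical excerpt of /etc/default/grub, for illustration only.
excerpt = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(excerpt), end="")
```

The change still has no effect on the next boot until sudo update-grub has been run.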

Environment setup

1. Install the dependency packages

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler  
  
sudo apt-get install --no-install-recommends libboost-all-dev  
  
sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev  
  
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev  
  
sudo apt-get install git cmake build-essential

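2. Install the graphics driver

Ubuntu 16.04 installs the nouveau graphics driver by default. nouveau cannot be used with CUDA, so it has to be disabled and replaced with the NVIDIA driver.

1) First disable nouveau, the driver bundled with Ubuntu 16.04, by adding blacklist entries to /etc/modprobe.d/blacklist-nouveau.conf: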

sudo gedit /etc/modprobe.d/blacklist-nouveau.conf

The file is empty when first opened. Write:

    blacklist nouveau  
    options nouveau modeset=0

Save and close the file. For the blacklist to actually take effect, the initramfs must also be rebuilt:

 sudo update-initramfs -u

Check whether nouveau is really disabled (after a reboot, the command below should print nothing):

lsmod | grep nouveau
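
A plain grep also matches lines where nouveau appears only in the "Used by" column of another module. The sketch below checks the first column only. It assumes the usual three-column lsmod layout; the function name module_loaded and the sample output are illustrative, not taken from the original post.

```python
def module_loaded(lsmod_output: str, module: str) -> bool:
    """Return True if module appears in the first column of lsmod output."""
    for line in lsmod_output.splitlines():
        fields = line.split()
        # The header row starts with "Module", so it never matches a real name.
        if fields and fields[0] == module:
            return True
    return False


# Hypothetical lsmod output, for illustration only.
sample = (
    "Module                  Size  Used by\n"
    "ttm                   106496  1 nouveau\n"
    "nouveau              1716224  1\n"
)
print(module_loaded(sample, "nouveau"))
print(module_loaded(sample, "nvidia"))
```

On a real machine the text could come from subprocess.run(["lsmod"], capture_output=True, text=True).stdout.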

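Now reinstall the graphics driver.

The driver I downloaded is NVIDIA_Linux-x86_64-415.13.run, placed in my home directory.

Switch to a text console with Ctrl+Alt+F1 and stop the desktop service: sudo service lightdm stop. (To remove a previously installed NVIDIA driver, use sudo apt-get purge nvidia*.) Then change to the directory that holds the driver and run: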

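sudo sh NVIDIA_Linux-x86_64-415.13.run --no-opengl-files    # change the file name to match the .run file you downloaded; the driver installer's option is --no-opengl-files (--no-opengl-libs belongs to the CUDA runfile)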

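Prompts like the following appear during installation (this transcript appears to come from a CUDA 9.1 runfile, so the version numbers and the CUDA questions will differ for a standalone driver package):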

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 387.26?
(y)es/(n)o/(q)uit: y

Do you want to run nvidia-xconfig?
(y)es/(n)o/(q)uit: n

Install the CUDA 9.1 Samples?
(y)es/(n)o/(q)uit: n

Install the CUDA 9.1 Toolkit?
(y)es/(n)o/(q)uit: n

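Then reboot the system (sudo reboot); the driver installation is complete. Verify it with nvidia-settings and nvidia-smi.

Next install CUDA 10 (the CUDA version shown by nvidia-smi). Download the runfile, which is named cuda_10.0.130_410.48_linux.run.

Run it as follows. The command and transcript below were captured with the CUDA 9.1 runfile; substitute the name of the file you downloaded, and expect 10.0 wherever the transcript shows 9.1: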

sudo sh cuda_9.1.85_387.26_linux.run --no-opengl-libs

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 387.26?  
(y)es/(n)o/(q)uit: n  
  
Install the CUDA 9.1 Toolkit?  
(y)es/(n)o/(q)uit: y  
  
Enter Toolkit Location  
 [ default is /usr/local/cuda-9.1 ]:   
  
Do you want to install a symbolic link at /usr/local/cuda?  
(y)es/(n)o/(q)uit: y  
  
Install the CUDA 9.1 Samples?  
(y)es/(n)o/(q)uit: y  
  
Enter CUDA Samples Location  
 [ default is /home/ccem ]:   
  
Installing the CUDA Toolkit in /usr/local/cuda-9.1 ...  
Installing the CUDA Samples in /home/ccem ...  
Copying samples to /home/ccem/NVIDIA_CUDA-9.1_Samples now...  
Finished copying samples.  
  
===========  
= Summary =  
===========  
  
Driver:   Not Selected  
Toolkit:  Installed in /usr/local/cuda-9.1  
Samples:  Installed in /home/ccem  
  
Please make sure that  
 -   PATH includes /usr/local/cuda-9.1/bin  
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.1/lib64, or, add /usr/local/cuda-9.1/lib64 to /etc/ld.so.conf and run ldconfig as root  
  
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.1/bin  
  
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.1/doc/pdf for detailed information on setting up CUDA.  
  
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.1 functionality to work.  
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:  
 sudo <CudaInstaller>.run -silent -driver  
  
Logfile is /tmp/cuda_install_36731.log
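
The warning in the transcript states a minimum driver version (384.00 for CUDA 9.1). Comparing version strings as text gives wrong answers, so the sketch below compares them numerically. It assumes driver versions are dot-separated integers, such as the value printed by nvidia-smi --query-gpu=driver_version --format=csv,noheader; the helper names are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Turn '415.13' into (415, 13) so that versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(installed: str, required: str) -> bool:
    """Return True if the installed driver is at least the required version."""
    return parse_version(installed) >= parse_version(required)


# 384.00 is the minimum quoted in the CUDA 9.1 installer warning above.
print(driver_is_sufficient("415.13", "384.00"))
print(driver_is_sufficient("375.66", "384.00"))
```

The minimum differs between CUDA releases, so take the required value from the warning printed by the installer you actually ran.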

If output like the following appears, some dependency libraries are missing:

Installing the CUDA Toolkit in /usr/local/cuda-9.1 …   
Missing recommended library: libGLU.so   
Missing recommended library: libX11.so   
Missing recommended library: libXi.so   
Missing recommended library: libXmu.so
In that case install the corresponding libraries:
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev

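After installation, configure the CUDA environment variables. To set them for the current user only:

gedit ~/.bashrc
export PATH=/usr/local/cuda/bin:$PATH     # /usr/local/cuda is a symbolic link to /usr/local/cuda-10.0, so both name the same directory

export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH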

Run source ~/.bashrc to apply the change. To set the variables for all users instead:

$ sudo vim /etc/profile

export PATH=/usr/local/cuda/bin:${PATH} # required
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH} # optional; the method described earlier can be used instead
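
Sourcing a file with these export lines repeatedly keeps prepending the same directory, and an empty LD_LIBRARY_PATH leaves a trailing colon. As an illustration only, the following sketch shows the intended result of the exports: the CUDA directory first, no duplicates, no empty entries. The function name prepend_path is made up for this example.

```python
def prepend_path(current: str, new_dir: str) -> str:
    """Put new_dir first in a colon-separated path list, without duplicates."""
    # Drop empty entries and any earlier copy of new_dir.
    parts = [p for p in current.split(":") if p and p != new_dir]
    return ":".join([new_dir] + parts)


print(prepend_path("/usr/bin:/bin", "/usr/local/cuda/bin"))
print(prepend_path("", "/usr/local/cuda/lib64"))
```

In a Python launcher script the result could be assigned to os.environ["PATH"] before starting a child process; for interactive shells the export lines above remain the usual approach.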

To check that CUDA was installed successfully, run:

cd /usr/local/cuda-10.0/samples/1_Utilities/deviceQuery  
  
sudo make  
  
./deviceQuery
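
The deviceQuery sample ends its output with a line reading "Result = PASS" when it can talk to the GPU. When the check is scripted, that line can be looked for directly, as in the rough sketch below. The sample output is abbreviated and illustrative, and the function name device_query_passed is hypothetical.

```python
def device_query_passed(output: str) -> bool:
    """Return True if deviceQuery output contains the 'Result = PASS' line."""
    return any(line.strip() == "Result = PASS" for line in output.splitlines())


# Abbreviated, illustrative outputs; real deviceQuery output is much longer.
ok_output = "Detected 1 CUDA Capable device(s)\n...\nResult = PASS\n"
bad_output = "cudaGetDeviceCount returned 35\nResult = FAIL\n"
print(device_query_passed(ok_output))
print(device_query_passed(bad_output))
```

A failing result usually points back at the driver installation step rather than at the toolkit itself.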

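Next install cuDNN v7. I downloaded cudnn-10.0-linux-x64-v7.4.1.5.tgz. It can be extracted anywhere; I extracted it under /usr/local/cudnn. The extracted folder is named cuda and contains two subfolders, include and lib64. Add the lib64 folder to the library search path. This step is important: open ~/.bashrc (gedit ~/.bashrc) and add the following line: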

export LD_LIBRARY_PATH=/your/path/to/cudnn/lib64:$LD_LIBRARY_PATH

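Here /your/path/to/cudnn/lib64 is the lib64 folder inside the directory extracted from the .tgz. Save, exit, source the file, and reopen the terminal; this makes the cuDNN libraries visible. The last step is to copy cudnn.h from the include folder of the extracted cuDNN directory (usually named cuda), that is /your/path/to/cudnn/include, into /usr/local/cuda/include. Because that is a system path, the copy needs administrator privileges.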

   cd cuda/include
   sudo cp *.h /usr/local/cuda/include/

Then make cudnn.h readable by all users: sudo chmod a+r /usr/local/cuda/include/cudnn.h. cuDNN is now fully set up.
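
To confirm which cuDNN version ended up in /usr/local/cuda/include, the version macros in cudnn.h can be read back. The sketch below assumes the cuDNN 7 header layout, where CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are plain #define lines; the function name cudnn_version is made up, and the sample text stands in for the real header.

```python
import re


def cudnn_version(header_text: str) -> str:
    """Extract 'major.minor.patch' from the #define lines of cudnn.h."""
    names = ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL")
    values = []
    for name in names:
        match = re.search(
            rf"^#define\s+{name}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            raise ValueError(f"{name} not found in header")
        values.append(match.group(1))
    return ".".join(values)


# Stand-in for the relevant lines of cudnn.h (assumed layout).
sample_header = (
    "#define CUDNN_MAJOR 7\n"
    "#define CUDNN_MINOR 4\n"
    "#define CUDNN_PATCHLEVEL 1\n"
    "#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)\n"
)
print(cudnn_version(sample_header))
```

For a real check, pass the contents of /usr/local/cuda/include/cudnn.h, read with open(...).read().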

Next install TensorFlow. I chose to build it from source, following https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/get_started/os_setup.md and https://blog.csdn.net/a446712385/article/details/79149977

In a terminal, run:

$ git clone --recurse-submodules https://github.com/tensorflow/tensorflow

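The --recurse-submodules option is required; it fetches the protobuf library that TensorFlow depends on. Clone the repository into the home directory. Next, download and install Bazel.

The installer I downloaded is named bazel-0.15.2-installer-linux-x86_64.sh.

Install the other dependencies: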

sudo apt-get update
sudo apt-get install python-pip python-numpy swig python-dev python-wheel
sudo apt-get install pkg-config zip g++ zlib1g-dev unzip

sudo apt-get install default-jdk

# For Python 2.7:
sudo apt-get install python-numpy swig python-dev python-wheel

# For Python 3.x:
sudo apt-get install python3-numpy swig python3-dev python3-wheel

Python 3 is used here.

Set up the PATH for Python: export PATH=/usr/bin:$PATH

./bazel-0.15.2-installer-linux-x86_64.sh --user
The --user flag installs Bazel into $HOME/bin. Once that directory is on $PATH the bazel command is available.
Add the following line to ~/.bashrc:
export PATH=$HOME/bin:$PATH

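Next, configure TensorFlow.

Change into its source directory and run ./configure

This step configures the build; afterwards a whl package is generated and TensorFlow is installed from it.
Installing with pip directly uses the prebuilt whl package from the official site, whereas a source install compiles with Bazel, generates a whl package, and then installs that.

(If GPU support is needed, CUDA and cuDNN must be configured here. Because this computer's graphics card did not have enough compute capability, the GPU could not be enabled, so CUDA and cuDNN had not been installed earlier.)

1) Configuration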

You have bazel 0.17.2 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3.5


Found possible Python library paths:
  /usr/local/lib/python3.5/dist-packages
  /usr/lib/python3/dist-packages
Please input the desired Python library path to use.  Default is [/usr/local/lib/python3.5/dist-packages]

Do you wish to build TensorFlow with Apache Ignite support? [Y/n]: n
No Apache Ignite support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 


Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 


Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 


Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 


Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.

Please specify the locally installed NCCL version you want to use. [Default is to use https://github.com/nvidia/nccl]: 


Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]: 


Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 


Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 


Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl            # Build with MKL support.
    --config=monolithic     # Config for mostly static monolithic build.
    --config=gdr            # Build with GDR support.
    --config=verbs          # Build with libverbs support.
    --config=ngraph         # Build with Intel nGraph support.
Configuration finished


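Part of the transcript above is taken from https://www.cnblogs.com/seniusen/p/9756302.html

Errors may occur during configuration. Here I changed the system default Python from Python 2 to Python 3.5, as follows.

Back up the original python2 symlink with sudo mv /usr/bin/python /usr/bin/python.2-bak, then run sudo ln -s /usr/local/bin/python3.5 /usr/bin/python and confirm the change with python --version. Building TensorFlow then failed with No module named keras.preprocessing, which sudo pip install keras fixes. That in turn produced another error, ModuleNotFoundError: No module named 'pip._internal', which is fixed by reinstalling pip: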

 wget https://bootstrap.pypa.io/get-pip.py  --no-check-certificate
sudo python get-pip.py

Then check with pip -V; the problem should be resolved.

Now build.

In the tensorflow directory, run the following three commands in turn:

bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

The build takes a long time. When it finishes, run:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

The whl package is written to /tmp/tensorflow_pkg. Its name may differ depending on the machine and the TensorFlow version; mine is tensorflow-1.12.0rc0-cp35-cp35m-linux_x86_64.whl

Copy it to the home folder so it can be installed:

sudo pip install tensorflow-1.12.0rc0-cp35-cp35m-linux_x86_64.whl
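
The parts of the whl file name explain why it differs between machines: they encode the package version, the Python version, the ABI, and the platform. The sketch below splits a name into those parts. It assumes the common five-field form of the wheel naming convention (no optional build tag), and the function name parse_wheel_name is hypothetical.

```python
def parse_wheel_name(filename: str) -> dict:
    """Split a wheel file name into its five dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        # Names carrying the optional build tag have six fields; not handled here.
        raise ValueError("expected name-version-python-abi-platform")
    keys = ("name", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, parts))


info = parse_wheel_name("tensorflow-1.12.0rc0-cp35-cp35m-linux_x86_64.whl")
for key, value in info.items():
    print(key, "=", value)
```

Here cp35 means the package was built for CPython 3.5, so pip refuses to install it into a different Python version.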

When the installation finishes, run the commands below; if no error is reported, the installation succeeded.

Test the installation:

python
# the Python version banner is printed here
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
# GPU information is printed here, showing that TensorFlow is running on the GPU
>>> sess.run(hello)
b'Hello, TensorFlow!'
>>> a = tf.constant(10)
>>> b = tf.constant(22)
>>> sess.run(a+b)
32
>>>

For setting up the TensorFlow C++ interface, see https://www.cnblogs.com/seniusen/p/9756302.html
