Building a caffe-gpu environment with a Dockerfile

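My GPU is a Tesla T4 with compute capability 7.5, which requires some special handling. In particular, I do not compile caffe inside the Dockerfile; it is built afterwards inside the running container (step 5).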

The environment is set up through a Dockerfile. To follow this tutorial, install docker and nvidia-docker first.
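
Before starting, it can help to confirm that the required commands are actually on PATH. The following is a minimal sketch; the function name check_prereqs is hypothetical and not part of docker or caffe.

```shell
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent command and
# returns non-zero if at least one is absent.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

For this tutorial the call would be: check_prereqs docker nvidia-docker git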

1. Download the caffe source code

git clone https://github.com/BVLC/caffe.git

2. Replace the Dockerfile in docker/gpu under the caffe directory

Use the following contents:

FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
LABEL maintainer [email protected]

RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        vim \
        python-opencv \
        python-tk \
        cmake \
        git \
        wget \
        libatlas-base-dev \
        libboost-all-dev \
        libgflags-dev \
        libgoogle-glog-dev \
        libhdf5-serial-dev \
        libleveldb-dev \
        liblmdb-dev \
        libopencv-dev \
        libprotobuf-dev \
        libsnappy-dev \
        protobuf-compiler \
        python-dev \
        python-numpy \
        python-pip \
        python-setuptools \
        python-scipy && \
    rm -rf /var/lib/apt/lists/*

ENV CAFFE_ROOT=/opt/caffe
WORKDIR $CAFFE_ROOT

# FIXME: use ARG instead of ENV once DockerHub supports this
# https://github.com/docker/hub-feedback/issues/460
ENV CLONE_TAG=1.0

RUN git clone -b ${CLONE_TAG} --depth 1 https://github.com/BVLC/caffe.git .

RUN pip install --upgrade pip && \
    pip install pip -U && \
    pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple && \
    pip install python-dateutil==2.5.0

# git clone https://github.com/NVIDIA/nccl.git && cd nccl && make -j install && cd .. && rm -rf nccl &&
RUN cd python && for req in $(cat requirements.txt) pydot; do pip install $req; done && cd ..
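# caffe is not compiled here; only the build directory is created.
# The build itself runs inside the container (see step 5):
#   cd build && cmake -DUSE_CUDNN=1 .. && make -j"$(nproc)"
RUN mkdir build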

ENV PYCAFFE_ROOT $CAFFE_ROOT/python
ENV PYTHONPATH $PYCAFFE_ROOT:$PYTHONPATH
ENV PATH $CAFFE_ROOT/build/tools:$PYCAFFE_ROOT:$PATH
RUN echo "$CAFFE_ROOT/build/lib" >> /etc/ld.so.conf.d/caffe.conf && ldconfig

WORKDIR /workspace

3. Build the image (run from the caffe/docker directory, so that gpu is the build context)

nvidia-docker build -t caffe-cuda10:gpu gpu

Once the build succeeds, you can check the image with:

docker images

4. Create the container

nvidia-docker run -it caffe-cuda10:gpu /bin/bash

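Run

nvidia-docker ps -a

to find the container_id of the container you just created, then attach to it (if the container has exited, start it first with docker start container_id):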

nvidia-docker exec -it container_id /bin/bash
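
As an alternative to looking up the container_id each time, the container can be given a fixed name when it is created. This is only a sketch: the name caffe-dev, the variable names, and the two helper functions are hypothetical, and DOCKER_CMD exists so the commands can be dry-run with echo.

```shell
# DOCKER_CMD can be overridden, e.g. DOCKER_CMD=echo for a dry run.
DOCKER_CMD="${DOCKER_CMD:-nvidia-docker}"
CAFFE_IMAGE="caffe-cuda10:gpu"   # image tag built in step 3
CAFFE_CONTAINER="caffe-dev"      # hypothetical container name

# Create the container once, under a fixed name.
create_container() {
    $DOCKER_CMD run -it --name "$CAFFE_CONTAINER" "$CAFFE_IMAGE" /bin/bash
}

# Re-enter later: start the container if it has stopped, then open a shell.
enter_container() {
    $DOCKER_CMD start "$CAFFE_CONTAINER" &&
        $DOCKER_CMD exec -it "$CAFFE_CONTAINER" /bin/bash
}
```

Call create_container the first time and enter_container on later sessions.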

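5. Compile caffe

Because this GPU has compute capability 7.5, caffe was not compiled in the Dockerfile.

Edit Cuda.cmake in /opt/caffe/cmake/ and append 75 to the architecture list on line 7: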

set(Caffe_known_gpu_archs "20 21(20) 30 35 50 60 61 75")
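
The same edit can be scripted instead of made by hand. The sketch below assumes the file contains exactly one line beginning with set(Caffe_known_gpu_archs, as in the line shown above; the function name add_gpu_arch is hypothetical.

```shell
# Append an architecture (e.g. 75) to Caffe_known_gpu_archs in Cuda.cmake.
# Does nothing if the value is already listed, so it is safe to re-run.
# Returns 1 if the file has no Caffe_known_gpu_archs line.
add_gpu_arch() {
    file="$1"
    arch="$2"
    line=$(grep '^set(Caffe_known_gpu_archs ' "$file") || return 1
    # Extract the quoted list and check whether arch is already present.
    list=$(echo "$line" | sed 's/.*"\(.*\)".*/\1/')
    case " $list " in
        *" $arch "*) return 0 ;;
    esac
    # Rewrite via a temporary file rather than relying on sed -i.
    sed "s/^\(set(Caffe_known_gpu_archs \"[^\"]*\)\")/\1 $arch\")/" "$file" \
        > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Inside the container this would be run as: add_gpu_arch /opt/caffe/cmake/Cuda.cmake 75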

Then cd into the build directory (/opt/caffe/build) and run:

cmake -DUSE_CUDNN=1 .. 
make -j"$(nproc)"
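
After make finishes, a quick sanity check can confirm that the binary exists and can see the GPU. This is a sketch, not part of the original procedure: the function name verify_caffe_build is hypothetical, and the Python import assumes pycaffe was built and PYTHONPATH is set as in the Dockerfile above.

```shell
# Check that the build produced the caffe binary, then query GPU 0.
# CAFFE_ROOT defaults to /opt/caffe, the value set in the Dockerfile.
verify_caffe_build() {
    caffe_bin="${CAFFE_ROOT:-/opt/caffe}/build/tools/caffe"
    if [ ! -x "$caffe_bin" ]; then
        echo "caffe binary not found at $caffe_bin" >&2
        return 1
    fi
    # device_query prints the properties of the selected GPU.
    "$caffe_bin" device_query -gpu 0 || return 1
    # Assumption: pycaffe is importable through PYTHONPATH.
    python -c 'import caffe; print(caffe.__file__)'
}
```

Run verify_caffe_build inside the container; a non-zero exit status means one of the checks failed.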

Done!

Notes

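If the compute capability of your GPU is no higher than 6.1, you can use the Dockerfile below instead. It is based on the CUDA 8.0 / cuDNN 6 image, and the main difference is that caffe is compiled while the image is built.
"No higher than 6.1" means the target does not exceed -gencode arch=compute_61,code=compute_61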
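
The rule in this note can be written as a small check. This is only a sketch of the comparison described here; the function name needs_manual_build is hypothetical, and the compute capability is passed in as an argument in MAJOR.MINOR form rather than detected automatically.

```shell
# Return 0 (true) when the compute capability is above 6.1, i.e. when
# caffe should be built inside the container as in step 5.
# The argument must look like MAJOR.MINOR, for example 7.5.
needs_manual_build() {
    major="${1%%.*}"   # text before the first dot
    minor="${1#*.}"    # text after the first dot
    [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -gt 1 ]; }
}
```

For example, needs_manual_build 7.5 succeeds for the Tesla T4, while needs_manual_build 6.1 fails, so the Dockerfile below would apply.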

Note 1

FROM nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04
LABEL maintainer [email protected]

RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        vim \
        python-opencv \
        python-tk \
        cmake \
        git \
        wget \
        libatlas-base-dev \
        libboost-all-dev \
        libgflags-dev \
        libgoogle-glog-dev \
        libhdf5-serial-dev \
        libleveldb-dev \
        liblmdb-dev \
        libopencv-dev \
        libprotobuf-dev \
        libsnappy-dev \
        protobuf-compiler \
        python-dev \
        python-numpy \
        python-pip \
        python-setuptools \
        python-scipy && \
    rm -rf /var/lib/apt/lists/*

ENV CAFFE_ROOT=/opt/caffe
WORKDIR $CAFFE_ROOT

# FIXME: use ARG instead of ENV once DockerHub supports this
# https://github.com/docker/hub-feedback/issues/460
ENV CLONE_TAG=1.0

RUN git clone -b ${CLONE_TAG} --depth 1 https://github.com/BVLC/caffe.git .

RUN pip install --upgrade pip && \
    pip install pip -U && \
    pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple && \
    pip install python-dateutil==2.5.0

# git clone https://github.com/NVIDIA/nccl.git && cd nccl && make -j install && cd .. && rm -rf nccl &&
RUN cd python && for req in $(cat requirements.txt) pydot; do pip install $req; done && cd ..
RUN mkdir build && cd build && \
    cmake -DUSE_CUDNN=1 .. && \
    make -j"$(nproc)"

ENV PYCAFFE_ROOT $CAFFE_ROOT/python
ENV PYTHONPATH $PYCAFFE_ROOT:$PYTHONPATH
ENV PATH $CAFFE_ROOT/build/tools:$PYCAFFE_ROOT:$PATH
RUN echo "$CAFFE_ROOT/build/lib" >> /etc/ld.so.conf.d/caffe.conf && ldconfig

WORKDIR /workspace

Note 2

These packages are only needed for training; skip them if you do not plan to train.

apt-get update
apt-get install -y python-skimage
pip install python-dateutil==2.1 -i https://pypi.tuna.tsinghua.edu.cn/simple