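1、Check whether gcc is installed
gcc --version
If gcc is not installed, the shell reports "command not found". In that case, install gcc first.
sudo apt-get build-dep gcc
The command apt-get build-dep (packagename) installs the build dependencies needed to compile that package.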
sudo apt-get install build-essential
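The check-then-install logic of this step can be sketched as a pair of shell functions. The names have_cmd and ensure_gcc are made up for this illustration, and the -y flag is an assumption that unattended installation is wanted.

```shell
# Succeeds if the command named in $1 is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: print the gcc version if gcc exists,
# otherwise install the build-essential toolchain.
ensure_gcc() {
    if have_cmd gcc; then
        gcc --version | head -n 1
    else
        sudo apt-get install -y build-essential
    fi
}
```

Calling ensure_gcc runs the check; nothing is installed when gcc is already present.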
2、Verify the System has the Correct Kernel Headers and Development Packages Installed
The kernel headers and development packages for the currently running kernel can be installed with:
$ sudo apt-get install linux-headers-$(uname -r)
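A quick way to see whether headers for the running kernel are already present is to look for the build directory under /lib/modules. The sketch below assumes the usual Ubuntu layout, where /lib/modules/<kernel>/build points at the installed headers; headers_dir and check_headers are illustrative names.

```shell
# Print the expected header directory for kernel release $1
# (defaults to the running kernel reported by uname -r).
headers_dir() {
    echo "/lib/modules/${1:-$(uname -r)}/build"
}

# Hypothetical helper: report whether that directory exists.
check_headers() {
    if [ -d "$(headers_dir)" ]; then
        echo "kernel headers present"
    else
        echo "kernel headers missing: install linux-headers-$(uname -r)"
    fi
}
```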
3、Download CUDA Toolkit 9.0 (version 9.2 does not work)
Note: download the deb package, in the 16.04 "local" variant.
The CUDA Toolkit can be installed using either of two different installation mechanisms: distribution-specific packages (RPM and Deb packages), or a distribution-independent package (runfile packages). The distribution-independent package has the advantage of working across a wider set of Linux distributions, but does not update the distribution's native package management system. The distribution-specific packages interface with the distribution's native package management system. It is recommended to use the distribution-specific packages, where possible.
4、Disable the nouveau driver
Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:
blacklist nouveau
options nouveau modeset=0
This can be done from the command line:
sudo sh -c 'echo "blacklist nouveau" >> /etc/modprobe.d/blacklist-nouveau.conf'
sudo sh -c 'echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist-nouveau.conf'
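Because the two commands above append with >>, running them twice leaves duplicate lines in the file. A minimal sketch of a repeatable variant writes the whole file in one go; write_blacklist is a made-up name, and staging the file in /tmp before copying it is just one possible approach.

```shell
# Write the two blacklist lines to the file given as $1, replacing any
# previous contents so repeated runs give the same result.
write_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}
```

For example: write_blacklist /tmp/blacklist-nouveau.conf, then sudo cp /tmp/blacklist-nouveau.conf /etc/modprobe.d/.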
Regenerate the kernel initramfs:
$ sudo update-initramfs -u
$ sudo reboot
After the reboot, run:
lsmod | grep nouveau
If there is no output, nouveau has been disabled successfully.
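The same check can be wrapped in a function that gives a clear yes/no answer. This is only a sketch: nouveau_absent is an illustrative name, and it matches on the first column of lsmod output so that other modules mentioning nouveau in their "Used by" column do not cause a false alarm.

```shell
# Reads lsmod-style output on stdin; succeeds only if no loaded
# module is named exactly "nouveau".
nouveau_absent() {
    if awk '{ print $1 }' | grep -qx nouveau; then
        return 1
    fi
    return 0
}
```

Typical use: lsmod | nouveau_absent && echo "nouveau is disabled".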
5、Install repository meta-data
On machines other than 216, remount the shared folder after rebooting:
sudo mount -t cifs -o username=ai,password=fs95536! //172.19.62.216/download216 /home/ai/download216
$ sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb
For example, in our case:
$ sudo dpkg -i /home/ai/download216/cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
6、Installing the CUDA public GPG key
When installing using the local repo:
$ sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
For example, in our case:
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
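The key path follows from the toolkit version: 9.0 becomes 9-0, and the local repo adds a -local suffix. A small sketch of that mapping is below; cuda_key_path is a hypothetical helper, and the -local suffix is assumed to apply only to the local-repo deb used in these notes.

```shell
# Print the GPG key path of the local CUDA repo for version $1
# (for example 9.0), assuming the /var/cuda-repo-<x>-<y>-local layout.
cuda_key_path() {
    echo "/var/cuda-repo-$(echo "$1" | tr . -)-local/7fa2af80.pub"
}
```

It could be used as: sudo apt-key add "$(cuda_key_path 9.0)".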
7、Update the Apt repository cache
$ sudo apt-get update
8、Install CUDA
$ sudo apt-get install cuda-libraries-9-0
9、Install the CUDA driver
Install the packages the driver may depend on
sudo apt-get update
sudo apt-get install dkms build-essential linux-headers-generic gcc-multilib
Install the driver
$ sudo chmod u+x /home/ai/download216/NVIDIA-Linux-x86_64-390.87.run
$ sudo /home/ai/download216/NVIDIA-Linux-x86_64-390.87.run --dkms -s
The following warning can be ignored:
WARNING: nvidia-installer was forced to guess the X library path '/usr/lib' and X module path '/usr/lib/xorg/modules'; these paths were not queryable from the system. If X fails to find the NVIDIA X driver module, please install the pkg-config utility and the X.Org SDK/development package for your distribution and reinstall the driver.
After installation, test the driver
nvidia-smi
If it prints GPU information, the driver was installed successfully.
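Beyond checking that nvidia-smi prints something, the driver version can be compared against the minimum the toolkit needs. The sketch below is illustrative only: the default minimum of 384.81 for CUDA 9.0 is an assumption that should be checked against NVIDIA's release notes, and version_ge relies on sort -V from GNU coreutils.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: read the installed driver version from nvidia-smi
# and compare it with an assumed minimum ($1, default 384.81).
check_driver() {
    min="${1:-384.81}"
    cur="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$cur" "$min"; then
        echo "driver $cur is new enough (>= $min)"
    else
        echo "driver '$cur' is older than $min or could not be read" >&2
        return 1
    fi
}
```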
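Reboot the system (in our tests the driver worked without rebooting, so this step can be skipped)
On machines other than 216, remount the shared folder after rebooting: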
sudo mount -t cifs -o username=ai,password=fs95536! //172.19.62.216/download216 /home/ai/download216
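To avoid mounting the share twice, it can help to check first whether it is already mounted. A minimal sketch, assuming the mount table is read from /proc/mounts; is_mounted is an illustrative name.

```shell
# Reads /proc/mounts-style lines on stdin and succeeds if the directory
# given as $1 appears as a mount point (second field of a line).
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }'
}

# Example use (not run here):
#   is_mounted /home/ai/download216 < /proc/mounts || echo "share is not mounted"
```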
10、Environment Setup
The PATH variable needs to include /usr/local/cuda-9.0/bin
To add this path to the PATH variable:
$ export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
In addition, when using the runfile installation method, the LD_LIBRARY_PATH variable needs to contain /usr/local/cuda-9.0/lib64 on a 64-bit system, or /usr/local/cuda-9.0/lib on a 32-bit system
To change the environment variables for 64-bit operating systems:
$ export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
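When these exports go into ~/.bashrc, every new shell adds the directory again. The following sketch prepends a directory only if it is not already in the list, using the same ${var:+:...} idea to avoid a trailing colon when the list is empty. prepend_path is a made-up helper name.

```shell
# Print the colon-separated list $2 with directory $1 prepended,
# unless $1 is already one of its entries.
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}
```

For example: export PATH="$(prepend_path /usr/local/cuda-9.0/bin "$PATH")".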
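11、Install the CUDA patches
On machines other than 216, the download directory in the commands below must be download216 (as already written here) rather than download.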
sudo dpkg -i /home/ai/download216/cuda-repo-ubuntu1604-9-0-local-cublas-performance-update_1.0-1_amd64.deb
sudo dpkg -i /home/ai/download216/cuda-repo-ubuntu1604-9-0-local-cublas-performance-update-2_1.0-1_amd64.deb
sudo dpkg -i /home/ai/download216/cuda-repo-ubuntu1604-9-0-local-cublas-performance-update-3_1.0-1_amd64.deb
sudo dpkg -i /home/ai/download216/cuda-repo-ubuntu1604-9-0-176-local-patch-4_1.0-1_amd64.deb
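12、Install cuDNN
The cuDNN package is downloaded with a .solitairetheme8 suffix. It is a gzipped tarball that works on all Linux platforms, so copy it to a .tgz name and extract it.
On machines other than 216, the download directory in the commands below must be download216 (as already written here) rather than download.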
cp /home/ai/download216/cudnn-9.0-linux-x64-v7.1.solitairetheme8 cudnn-9.0-linux-x64-v7.1.tgz
tar -xzvf cudnn-9.0-linux-x64-v7.1.tgz
Copy the following files into the CUDA Toolkit directory.
sudo cp cuda/include/cudnn.h /usr/local/cuda-9.0/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64
sudo chmod a+r /usr/local/cuda-9.0/include/cudnn.h (this line is not needed)
sudo chmod a+r /usr/local/cuda-9.0/lib64/libcudnn*
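To confirm which cuDNN version ended up in the toolkit directory, the version macros in cudnn.h can be read back. This is a sketch that assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL on separate #define lines, as cuDNN 7 headers do; cudnn_version is an illustrative name.

```shell
# Print the cuDNN version (major.minor.patch) found in the header file $1.
cudnn_version() {
    awk '/^#define CUDNN_(MAJOR|MINOR|PATCHLEVEL) / { v = v sep $3; sep = "." }
         END { print v }' "$1"
}
```

For example: cudnn_version /usr/local/cuda-9.0/include/cudnn.h.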
13、Install the libcupti-dev library
sudo apt-get install cuda-command-line-tools-9-0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.0/extras/CUPTI/lib64
14、Install Anaconda
sudo bash /home/ai/download216/Anaconda3-5.2.0-Linux-x86_64.sh
Reload the shell configuration so the new PATH takes effect
source ~/.bashrc
Upgrade conda to the latest version
sudo chown -R ai:ai /home/ai/anaconda3
conda update -n base conda
Upgrade the installed packages to their latest versions
conda update --all
15、Create a conda environment named tensorflow that runs a specific Python version
conda create -n tensorflow pip python=3.5
16、Activate the conda environment
source activate tensorflow
17、Install TensorFlow
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.6.0-cp35-cp35m-linux_x86_64.whl (not recommended because of network problems)
Recommended:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --ignore-installed --upgrade tensorflow-gpu
pip install ipykernel
conda install jupyter notebook
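Once the packages are installed, a short check inside the activated tensorflow environment shows whether TensorFlow can see the GPU. This is a sketch for the TensorFlow 1.x API used in these notes (tf.test.is_gpu_available); check_tf_gpu is a made-up name, and python is assumed to resolve to the interpreter of the active conda environment.

```shell
# Hypothetical helper: print the TensorFlow version and whether a GPU
# is available to it. Run it after "source activate tensorflow".
check_tf_gpu() {
    python -c 'import tensorflow as tf; print(tf.__version__); print(tf.test.is_gpu_available())'
}
```

If the second line printed is False, the driver, CUDA and cuDNN steps above are the first places to look.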
18、Configure Jupyter Notebook for remote access
jupyter notebook --generate-config
$ jupyter notebook password
Enter password: ****
Verify password: ****
[NotebookPasswordApp] Wrote hashed password to /home/ai/.jupyter/jupyter_notebook_config.json
In jupyter_notebook_config.py, find the following lines, uncomment them, and edit them:
c.NotebookApp.ip = '*'
c.NotebookApp.password = u'sha:ce...'  # the hashed password stored in /home/ai/.jupyter/jupyter_notebook_config.json
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8888  # any port may be chosen; use it when connecting
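Instead of editing the file by hand, the non-secret settings can be appended from the shell. The sketch below leaves the password line out, on the assumption that the notebook server reads the hashed password from jupyter_notebook_config.json by itself; if it does not, add the c.NotebookApp.password line manually. write_jupyter_config is an illustrative name.

```shell
# Append the remote-access settings to the config file given as $1.
write_jupyter_config() {
    cat >> "$1" <<'EOF'
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8888
EOF
}
```

For example: write_jupyter_config /home/ai/.jupyter/jupyter_notebook_config.py. Run it only once, since each call appends the three lines again.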