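Prerequisite: root privileges are required for the steps below.
Changing the root password:
1. $ sudo passwd    (enter the new password twice)
2. $ su root        (log in as the root account)
GPU driver installation:
Step 1: First, detect the model of your NVIDIA graphics card and the recommended driver. Run:
$ ubuntu-drivers devices
The output looks like this: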
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001180sv00001458sd0000353Cbc03sc00i00
vendor : NVIDIA Corporation
model : GK104 [GeForce GTX 680]
driver : nvidia-304 - distro non-free
driver : nvidia-340 - distro non-free
driver : nvidia-384 - distro non-free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
== cpu-microcode.py ==
driver : intel-microcode - distro free
The output shows one device, a GTX 680, with three candidate NVIDIA drivers: nvidia-304, nvidia-340, and nvidia-384. The recommended one is nvidia-384.
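As a minimal sketch, the recommended driver can also be picked out of this output programmatically. The helper name recommended_driver is hypothetical, and the code assumes the "driver : <package> - <tags>" line layout shown in the sample above; other versions of ubuntu-drivers may format the output differently.

```python
import re

# Sample copied from the `ubuntu-drivers devices` output shown above.
SAMPLE_OUTPUT = """\
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001180sv00001458sd0000353Cbc03sc00i00
vendor : NVIDIA Corporation
model : GK104 [GeForce GTX 680]
driver : nvidia-304 - distro non-free
driver : nvidia-340 - distro non-free
driver : nvidia-384 - distro non-free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
"""


def recommended_driver(text):
    """Return the package on the 'driver' line tagged 'recommended', or None."""
    for line in text.splitlines():
        match = re.match(r"\s*driver\s*:\s*(\S+)\s+-\s+(.*)", line)
        if match and "recommended" in match.group(2).split():
            return match.group(1)
    return None


if __name__ == "__main__":
    print(recommended_driver(SAMPLE_OUTPUT))
```

To use it on a real machine, the text would come from running ubuntu-drivers devices and capturing its standard output instead of the embedded sample.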
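Install the recommended driver automatically:
$ sudo ubuntu-drivers autoinstall
Reboot once the installation finishes.
Then check that the driver was installed successfully.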
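The original notes do not say how to perform this check. One possible approach, sketched here as an assumption rather than the author's method, is to ask nvidia-smi for the driver version and compare it with the minimum of 384.00 that the CUDA 9.0 installer log later in this article requires. The function names are hypothetical.

```python
import shutil
import subprocess

# Minimum driver version quoted in the CUDA 9.0 installer warning in this article.
MIN_DRIVER = "384.00"


def version_tuple(version):
    """Turn '384.81' into (384, 81) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_meets_minimum(version, minimum=MIN_DRIVER):
    return version_tuple(version) >= version_tuple(minimum)


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # nvidia-smi is not on PATH, so the driver is probably missing
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("No working NVIDIA driver detected")
    else:
        print(version, "OK" if driver_meets_minimum(version) else "too old for CUDA 9.0")
```

Running plain nvidia-smi in a terminal gives the same information in table form.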
CUDA Toolkit 9.0
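At the time of writing, TensorFlow only supports CUDA Toolkit 9.0.
Download address:
There is currently no CUDA Toolkit 9.0 build for Ubuntu 18.04. Choose the 17.10 version instead; installing only the base installer is usually enough.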
--------------------------------------------------------------------------------------------------------------------------------------------
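The downloaded "cuda_8.0.27_linux.run" is 1.4 GB. Install CUDA 8 following the method given by NVIDIA:
sudo sh cuda_8.0.27_linux.run --tmpdir=/opt/temp/
The --tmpdir option is added because running "sudo sh cuda_8.0.27_linux.run" directly fails with an insufficient-space error, even though this is a brand-new machine with a large disk. A Google search showed that adding --tmpdir fixes it. The error message was: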
Not enough space on parition mounted at /.
Need 5091561472 bytes.Disk space check has failed. Installation cannot continue.
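Before choosing a directory for --tmpdir, it can help to confirm that it has at least the number of bytes the installer asked for. This is a rough sketch: the helper name has_enough_space is hypothetical, and it assumes the installer's check is simply about free space on the filesystem that holds the temporary directory.

```python
import shutil

# Byte count taken from the installer error message above.
NEEDED_BYTES = 5091561472


def has_enough_space(path, needed=NEEDED_BYTES):
    """Return True if the filesystem containing `path` has `needed` free bytes."""
    return shutil.disk_usage(path).free >= needed


if __name__ == "__main__":
    # "/tmp" is only an illustrative candidate; the directory must already exist.
    for candidate in ("/", "/tmp"):
        print(candidate, has_enough_space(candidate))
```

The directory passed to --tmpdir (here /opt/temp/) has to exist before the installer runs.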
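Once it starts, the installer asks a series of questions. The critical one is whether to install the older 361 driver:
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?
The answer must be n. Otherwise the GTX 1080 driver installed earlier is replaced, which causes many problems.
# The license terms are displayed first. Read them if you need to; once you have read enough, press q to skip the rest.
1]. If the installation reports a failure, check the log file named in the message to troubleshoot.
2]. Log of a successful installation: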
Do you accept the previously read EULA?
accept/decline/quit: accept
You are attempting to install on an unsupported configuration. Do you wish to continue?
(y)es/(n)o [ default is no ]: y
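# 384.81 here is the driver version bundled with the installer. If the driver already installed on this machine is newer, there is no need to install it.
# No was chosen here mainly because CUDA 9.2 had been installed earlier while troubleshooting.
# Normally the answer would be yes.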
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
(y)es/(n)o/(q)uit: n
Install the CUDA 9.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-9.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 9.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
[ default is /root ]:
Installing the CUDA Toolkit in /usr/local/cuda-9.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Missing recommended library: libGL.so
Installing the CUDA Samples in /root ...
Copying samples to /root/NVIDIA_CUDA-9.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-9.0
Samples: Installed in /root, but missing recommended libraries
Please make sure that
- PATH includes /usr/local/cuda-9.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-9.0/lib64, or, add /usr/local/cuda-9.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.0/bin
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.0/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver
Logfile is /tmp/cuda_install_7657.log
/root/NVIDIA_CUDA-9.0_Samples
2. Set the environment variables
Run: vi /etc/ld.so.conf.d/cuda.conf
# Add these two lines
/usr/local/cuda/lib64
/usr/local/cuda/extras/CUPTI/lib64
Run: vi /etc/profile
# Add these two lines
export CUDA_HOME=/usr/local/cuda
export PATH=$PATH:$CUDA_HOME/bin
3. Reboot with the reboot command.
Testing the installation
If both samples run without errors, the installation succeeded.
cd /root/NVIDIA_CUDA-9.0_Samples/samples/1_Utilities/deviceQuery
make
./deviceQuery
# "Result = PASS" means success
cd ../bandwidthTest
make
./bandwidthTest
# "Result = PASS" means success
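The same check can be scripted. The sketch below runs an already-built sample binary and looks for the "Result = PASS" line mentioned above; the function names are hypothetical, and the paths in the usage section assume the default sample location from the installer log.

```python
import subprocess


def sample_passed(output):
    """Return True if the sample's output contains the PASS marker."""
    return "Result = PASS" in output


def run_sample(binary_path):
    """Run one compiled CUDA sample and report whether it passed."""
    result = subprocess.run(
        [binary_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode == 0 and sample_passed(result.stdout)


if __name__ == "__main__":
    base = "/root/NVIDIA_CUDA-9.0_Samples/samples/1_Utilities"
    for name in ("deviceQuery", "bandwidthTest"):
        print(name, run_sample(base + "/" + name + "/" + name))
```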
--------------------------------------------------------------------------------------------------------------------------------------------
cuDNN 7.0
Download: https://developer.nvidia.com/rdp/cudnn-archive
# Extract the archive
tar -zxvf cudnn-9.0-linux-x64-v7.tgz
# Copy the files into the CUDA installation
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64/
sudo cp cuda/include/cudnn.h /usr/local/cuda-9.0/include/
# Make them readable by all users
sudo chmod a+r /usr/local/cuda-9.0/include/cudnn.h /usr/local/cuda-9.0/lib64/libcudnn*
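To confirm which cuDNN version ended up in place, one option is to read the version macros from the copied header. This sketch assumes that cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as cuDNN 7 headers do; the function name is hypothetical and the patch level in the sample text is made up for illustration.

```python
import re

# Illustrative header fragment; the patch level is not taken from the article.
SAMPLE_HEADER = """\
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 5
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cudnn.h text, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + name + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    print(cudnn_version(SAMPLE_HEADER))
    # On the real system, read the header copied in the step above:
    # with open("/usr/local/cuda-9.0/include/cudnn.h") as handle:
    #     print(cudnn_version(handle.read()))
```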
libcupti
sudo apt-get install libcupti-dev
Configuration
Add the following to ~/.bashrc:
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
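The ${VAR:+:${VAR}} expansion in these export lines appends a colon plus the old value only when the variable is already set and non-empty. The intent is to avoid an empty entry in the search path, which the dynamic linker treats as the current directory. The Python sketch below mirrors that logic for illustration; both function names are hypothetical.

```python
def prepend_path(new_entry, existing):
    """Mimic NEW${VAR:+:${VAR}}: add ':' + existing only if existing is non-empty."""
    return new_entry + (":" + existing if existing else "")


def has_empty_entry(path_value):
    """Return True if a colon-separated path list contains an empty entry."""
    return "" in path_value.split(":")


if __name__ == "__main__":
    print(prepend_path("/usr/local/cuda/lib64", ""))
    print(prepend_path("/usr/local/cuda/lib64", "/opt/lib"))
    # A literal colon placed before the expansion would leave an empty entry:
    print(has_empty_entry("/usr/local/cuda/lib64:" + ""))
```

After opening a new shell, echo $LD_LIBRARY_PATH shows the resulting value.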
Installing tensorflow-gpu
Adjust the command as needed:
pip install --upgrade tensorflow-gpu
Testing
# A quick test script; save it as, for example, test.py
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
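# Run it with: python3 test.py
# The first run is a bit slow.
# If there are no errors, the GPU information is printed, and the output includes b'Hello, TensorFlow!', the setup works.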
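Instead of scanning the log for GPU information by eye, the devices TensorFlow sees can be listed directly. This is a sketch that assumes a TensorFlow 1.x style installation where tensorflow.python.client.device_lib is importable; the function name gpu_device_names is hypothetical, and the import is guarded so the script still runs when TensorFlow is missing.

```python
def gpu_device_names():
    """Return the names of GPU devices visible to TensorFlow (empty if none)."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return []  # TensorFlow is not installed or its native libraries failed to load
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]


if __name__ == "__main__":
    names = gpu_device_names()
    print(names if names else "No GPU visible to TensorFlow")
```

An empty result with TensorFlow installed usually points back to the driver, CUDA, or cuDNN steps earlier in this article.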
That is the end of this article; now it is time to keep learning TensorFlow.
Success!