Setting Up Ubuntu 16.04 + tensorflow-gpu, Following the Official Documentation

刘小米_思聪
Published 2017/12/21 23:56

I. Prepare for CUDA installation

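Official documentation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html This guide targets CUDA 9.1, whereas we are installing CUDA 8.0, so the version numbers in the CUDA install commands differ slightly. Everything else can be followed as written.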

This section covers the preparatory checks: the operating system version, the GPU model, and so on.

1.1 Verify You Have a CUDA-Capable GPU

To verify that your GPU is CUDA-capable, go to your distribution's equivalent of System Properties, or, from the command line, enter:

$ lspci | grep -i nvidia

The GPU models and product families that CUDA currently supports are listed at: https://developer.nvidia.com/cuda-gpus

1.2 Verify You Have a Supported Version of Linux

The CUDA Development Tools are only supported on some specific distributions of Linux. These are listed in the CUDA Toolkit release notes. To determine which distribution and release number you're running, type the following at the command line:

$ uname -m && cat /etc/*release

1.3 Verify the System Has gcc Installed, and Check Its Version

The gcc compiler is required for development using the CUDA Toolkit. gcc is the GNU Compiler Collection (GCC), a suite of compilers that handles several languages, such as C, C++, Ada, and, in older releases, Java. It is not required for running CUDA applications. It is generally installed as part of the Linux installation, and in most cases the version of gcc installed with a supported version of Linux will work correctly. To verify the version of gcc installed on your system, type the following on the command line:

$ gcc --version

1.4 Verify the System has the Correct Kernel Headers and Development Packages Installed (they only need to match the running kernel version)

The CUDA Driver requires that the kernel headers and development packages for the running version of the kernel be installed at the time of the driver installation, as well as whenever the driver is rebuilt. For example, if your system is running kernel version 3.17.4-301, the 3.17.4-301 kernel headers and development packages must also be installed.

While the Runfile installation performs no package validation, the RPM and Deb installations of the driver will make an attempt to install the kernel header and development packages if no version of these packages is currently installed. However, it will install the latest version of these packages, which may or may not match the version of the kernel your system is using. Therefore, it is best to manually ensure the correct version of the kernel headers and development packages are installed prior to installing the CUDA Drivers, as well as whenever you change the kernel version.

The version of the kernel your system is running can be found by running the following command:

$ uname -r

The kernel headers and development packages for the currently running kernel can be installed with:

$ sudo apt-get install linux-headers-$(uname -r)
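Before installing the driver, it can help to check whether the headers package for the running kernel is already present. The sketch below is only an illustration: headers_pkg is a made-up helper name, and the check relies on the standard dpkg-query status string on Debian/Ubuntu systems.

```sh
#!/bin/sh
# headers_pkg is a hypothetical helper (not part of any NVIDIA tooling):
# it builds the headers package name for a given kernel release.
headers_pkg() {
    printf 'linux-headers-%s\n' "$1"
}

pkg=$(headers_pkg "$(uname -r)")

# dpkg-query prints "install ok installed" for a fully installed package.
# If dpkg-query is unavailable, status stays empty and we report "missing".
status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null || true)

case $status in
    "install ok installed")
        echo "$pkg is installed" ;;
    *)
        echo "$pkg is missing; install it with: sudo apt-get install $pkg" ;;
esac
```

Running this again after every kernel upgrade is a cheap way to catch a header mismatch before the driver has to be rebuilt.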

 

II. Download and Install CUDA Toolkit 8.0

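(Note: at the time of writing, tensorflow 1.3 supports only CUDA Toolkit 8.0 + cuDNN 6.0.)

Before installing, check the TensorFlow website for the CUDA and cuDNN versions it currently supports. If you install the newest release and TensorFlow does not support it, you will have to uninstall everything and start over.

TensorFlow website: https://www.tensorflow.org/install/install_linux?hl=zh-cn#prepare_your_environment The supported versions are listed below; newer versions will not work: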

  • CUDA® Toolkit 8.0. For details, see NVIDIA's documentation. Ensure that you append the relevant Cuda pathnames to the LD_LIBRARY_PATH environment variable as described in the NVIDIA documentation.
  • The NVIDIA drivers associated with CUDA Toolkit 8.0.
  • cuDNN v6.0. For details, see NVIDIA's documentation. Ensure that you create the CUDA_HOME environment variable as described in the NVIDIA documentation.

2.1 Download the CUDA Toolkit (make sure to download CUDA 8.0)

https://developer.nvidia.com/cuda-80-ga2-download-archive   

Select Linux > x86_64 > Ubuntu > 16.04 > deb (local)

2.2 Install CUDA Toolkit 8.0

In a terminal, cd into the folder that contains the downloaded file, then run the following installation commands in order:
  1. `$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb`
  2. `$ sudo apt-get update`
  3. `$ sudo apt-get install cuda`

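******** If the commands above install a newer version such as cuda-9-0 instead of cuda-8-0, the fix is as follows ********

I had previously installed the newer cuda-9.1, found that TensorFlow did not support it, and then uninstalled and purged it. Reinstalling CUDA with the three commands above still automatically installed cuda-9.0 rather than the cuda-8 I wanted.

Reference solution: https://devtalk.nvidia.com/default/topic/1024342/cuda-setup-and-installation/unable-to-uninstall-cuda-9-0-completely-and-install-8-0-instead/

In summary:

First, uninstall the newer CUDA 9.1 that is already installed: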

$ sudo apt-get --purge remove cuda

$ sudo apt autoremove

Then clean the apt cache:

$ sudo apt-get clean

Finally, reinstall, this time specifying the CUDA version explicitly:

$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb

$ sudo apt-get update

$ sudo apt-get install cuda-8-0

Done!

******************************************
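The naming rule behind that fix can be sketched as follows. The versioned meta-package replaces the dot in the toolkit version with a hyphen (8.0 becomes cuda-8-0), while the unversioned cuda package follows the newest release available in the configured repositories. cuda_pkg is a hypothetical helper written for this illustration; apt-cache policy is the standard apt command for showing which version would be installed.

```sh
#!/bin/sh
# cuda_pkg is a hypothetical helper: it maps a toolkit version such as 8.0
# to the versioned meta-package name used above (cuda-8-0).
cuda_pkg() {
    printf 'cuda-%s\n' "$(printf '%s' "$1" | tr '.' '-')"
}

pkg=$(cuda_pkg 8.0)
echo "versioned package: $pkg"

# Show the candidate versions apt would choose for both package names.
# Skipped silently on systems without apt-cache.
if command -v apt-cache >/dev/null 2>&1; then
    apt-cache policy cuda "$pkg" || true
fi
```

If the candidate shown for cuda is a 9.x release, installing the versioned name is the safer choice.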

2.3 Environment setup

Open the .bashrc file in your home directory (it is a hidden file, so in the file manager press Ctrl+H to show hidden files first) and append the following lines at the end:

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}

export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# Note: the nvidia-387 path below must match your NVIDIA driver version. Run $ cat /proc/driver/nvidia/version in a terminal to check the driver version.

export LPATH=/usr/lib/nvidia-387:$LPATH

export LIBRARY_PATH=/usr/lib/nvidia-387:$LIBRARY_PATH

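Note: apart from the space after export, these lines must not contain any extra spaces. The shell is whitespace-sensitive here, and a space around = or inside the value breaks the assignment.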
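The ${PATH:+:${PATH}} form in the first two export lines is standard shell parameter expansion: ${VAR:+word} expands to word only when VAR is set and non-empty. The colon separator is therefore added only when there is an existing value, which avoids a trailing colon (an empty PATH entry is treated as the current directory). The sketch below reproduces that behavior with a hypothetical helper, prepend_dir, so the two cases can be seen side by side.

```sh
#!/bin/sh
# prepend_dir is a hypothetical helper that mirrors the export lines above:
# it puts new_dir in front of an existing colon-separated list.
prepend_dir() {
    new_dir=$1
    current=$2
    # ${current:+:${current}} yields ":$current" only if current is non-empty.
    printf '%s\n' "${new_dir}${current:+:${current}}"
}

# Existing value empty: no separator is added.
prepend_dir /usr/local/cuda-8.0/bin ""
# prints /usr/local/cuda-8.0/bin

# Existing value present: joined with a single colon.
prepend_dir /usr/local/cuda-8.0/bin /usr/bin:/bin
# prints /usr/local/cuda-8.0/bin:/usr/bin:/bin
```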

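2.4 Test whether CUDA installed successfully by checking the nvcc compiler version

Open a new terminal, or run source ~/.bashrc, first so that the updated PATH takes effect.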

$ nvcc -V
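To turn that check into a yes/no answer, the release number can be pulled out of the nvcc output. This is a minimal sketch, assuming the output contains a line of the form "Cuda compilation tools, release 8.0, V8.0.61"; nvcc_release is a hypothetical helper name.

```sh
#!/bin/sh
# nvcc_release is a hypothetical helper: it reads `nvcc -V` output on stdin
# and prints the toolkit release number (for example 8.0).
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    found=$(nvcc -V | nvcc_release)
    if [ "$found" = "8.0" ]; then
        echo "CUDA $found toolkit found"
    else
        echo "expected CUDA 8.0 but nvcc reports: ${found:-unknown}"
    fi
else
    echo "nvcc is not on PATH; re-check the export lines in ~/.bashrc"
fi
```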

 

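III. Install cuDNN (Deep Neural Network library)

Official documentation: http://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html

3.1 Download cuDNN (make sure to download cuDNN 6.0)

Do not be put off by the registration step: sign up (join) and the download is free. When downloading, pick the cuDNN 6.0 build that matches your Ubuntu version and your CUDA version.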

https://developer.nvidia.com/rdp/form/cudnn-download-survey

3.2 Install cuDNN

  • Navigate to your <cudnnpath> directory containing the three cuDNN Debian files, then install them in order. Install the runtime library first, for example:
$ sudo dpkg -i libcudnn6_6.0.3.11-1+cuda8.0_amd64.deb
  • Install the developer library, for example:
$ sudo dpkg -i libcudnn6-dev_6.0.3.11-1+cuda8.0_amd64.deb
  • Install the code samples and the cuDNN Library User Guide, for example:
$ sudo dpkg -i libcudnn6-doc_6.0.3.11-1+cuda8.0_amd64.deb

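The libcudnn6 file names and version numbers after sudo dpkg -i should match the files you actually downloaded.

Summary: cuDNN is installed simply by dropping files onto your system; no environment variables need to be configured.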
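To confirm which cuDNN version actually landed on the system, one option is to read the version macros from the installed header. The sketch below rests on two assumptions: that the header is reachable at /usr/include/cudnn.h (the location may differ on your system), and that it defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL. cudnn_version is a hypothetical helper name.

```sh
#!/bin/sh
# cudnn_version is a hypothetical helper: it reads a cudnn.h header on stdin
# and prints MAJOR.MINOR.PATCHLEVEL taken from the #define lines.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }'
}

header=/usr/include/cudnn.h   # assumed location; adjust if yours differs
if [ -r "$header" ]; then
    echo "cuDNN $(cudnn_version < "$header")"
else
    echo "no cuDNN header found at $header"
fi
```

A major version other than 6 would indicate that a different cuDNN release is installed than the one tensorflow 1.3 expects.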

 

IV. Install tensorflow-gpu

Official documentation: https://www.tensorflow.org/install/install_linux?hl=zh-cn#prepare_your_environment

4.1 Prepare

TensorFlow also needs the libcupti-dev library, the NVIDIA CUDA Profile Tools Interface, which provides advanced profiling support. To install this library, issue the following command:

sudo apt-get install libcupti-dev

4.2 Install tensorflow-gpu with native pip

sudo apt-get install python3-pip python3-dev # for Python 3.n

pip3 install tensorflow-gpu # Python 3.n; GPU support 

(Optional.) If the step above ($ pip3 install tensorflow-gpu) fails, install the latest version of TensorFlow by issuing a command of the following format:

sudo pip3 install --upgrade tfBinaryURL   # Python 3.n 

where tfBinaryURL identifies the URL of the TensorFlow Python package. The appropriate value of tfBinaryURL depends on the operating system, Python version, and GPU support; the available values are listed in the TensorFlow installation documentation. For example, to install TensorFlow for Linux, Python 3.4, and CPU-only support, issue the following command:

sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.4.0-cp34-cp34m-linux_x86_64.whl

4.3 As in section 2.3, append one more environment variable to .bashrc

# Environment variable required by TensorFlow

export CUDA_HOME=/usr/local/cuda-8.0

4.4 Test whether tensorflow-gpu is configured correctly by running a short program

$ python3

# Inside the Python interpreter:

>>> import tensorflow as tf

>>> hello = tf.constant("hello, tensorflow")

>>> sess = tf.Session()
>>> print(sess.run(hello))

If it prints "hello, tensorflow" (shown as b'hello, tensorflow' under Python 3), the setup works. Congratulations!

 

Appendix: Errors Encountered and Their Solutions

1. Everything was installed, but importing TensorFlow failed at runtime because the native runtime could not be loaded:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
    _pywrap_tensorflow = swig_import_helper()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)

ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

Cause: I had installed CUDA 9.0, but tensorflow 1.3 does not yet support it. The TensorFlow build looks for libcudart.so.8.0, which a CUDA 9.0 installation does not provide.
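A rough way to see whether the CUDA 8.0 runtime is reachable is to search the directories in LD_LIBRARY_PATH for libcudart.so.8.0. find_lib below is a hypothetical helper written for this illustration. It inspects only the colon-separated list it is given and ignores the system linker cache, so a miss is a hint rather than proof.

```sh
#!/bin/sh
# find_lib is a hypothetical helper: it prints the first match of lib_name
# in a colon-separated directory list and returns non-zero if none is found.
find_lib() {
    lib_name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib_name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$lib_name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

if found=$(find_lib libcudart.so.8.0 "${LD_LIBRARY_PATH:-}"); then
    echo "found: $found"
else
    echo "libcudart.so.8.0 is not in any LD_LIBRARY_PATH directory"
fi
```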

Fix:

# I took the following steps to remove CUDA 9.0

$ sudo apt-get --purge remove cuda

$ sudo apt autoremove

# Then clear apt-cache

$ sudo apt-get clean

# Then I tried the following steps to reinstall CUDA 8.0

$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb

$ sudo apt-get update

$ sudo apt-get install cuda

A further problem: I uninstalled CUDA 9.0, but when I try to install 8.0, CUDA 9.0 keeps getting installed instead. How do I prevent this from happening and install 8.0?

NVIDIA's answer: uninstall once more, and when reinstalling, specify the CUDA version in the last of the three commands above:

$ sudo apt-get install cuda-8-0

Other references:

https://segmentfault.com/a/1190000008234390

 

© Copyright belongs to the author
