Installing Matlab 2014a + Anaconda2 + OpenCV 3.1 + Caffe on Ubuntu 16.04 (2)

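Next, replace the cuDNN shared libraries, which gives faster computation. After downloading cuDNN 5.0, extract it, cd into the include directory of the extracted archive, and run the following on the command line: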

$ sudo cp cudnn.h /usr/local/cuda/include/ # copy the header file
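Since the symlink names in the next step depend on the exact cuDNN version, it may help to read the version back from the copied header. The sketch below assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integers; cudnn_version is a hypothetical helper name, not part of cuDNN.

```shell
# Print MAJOR.MINOR.PATCHLEVEL as declared in a cudnn.h header.
# Assumption: the three macros are defined as plain integers on their own lines.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}

# Example:
#   cudnn_version /usr/local/cuda/include/cudnn.h
```

If the printed version differs from 5.0.5, the library file names used below would need to be adjusted to match.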

Then copy the shared libraries from the lib64 directory and recreate their symbolic links:

$ sudo cp lib* /usr/local/cuda/lib64/ # copy the shared libraries
$ cd /usr/local/cuda/lib64/
$ sudo rm -rf libcudnn.so libcudnn.so.5 # remove the existing links
$ sudo ln -s libcudnn.so.5.0.5 libcudnn.so.5
$ sudo ln -s libcudnn.so.5 libcudnn.so
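The remove-and-relink steps above can be wrapped in a small function, which makes it easier to repeat them for another cuDNN version. This is only a sketch: relink_cudnn is a hypothetical helper, and the directory and version are passed in as arguments rather than hard-coded.

```shell
# Recreate the libcudnn symlink chain inside a library directory.
# $1: library directory, $2: full cuDNN version (for example 5.0.5).
relink_cudnn() {
    lib_dir=$1
    version=$2
    major=${version%%.*}            # 5.0.5 -> 5
    [ -f "$lib_dir/libcudnn.so.$version" ] || {
        echo "missing $lib_dir/libcudnn.so.$version" >&2
        return 1
    }
    # -f replaces an existing file or link; -n does not follow an existing link
    ln -sfn "libcudnn.so.$version" "$lib_dir/libcudnn.so.$major" &&
    ln -sfn "libcudnn.so.$major" "$lib_dir/libcudnn.so"
}

# Example (run from a root shell, since the target directory is root-owned):
#   relink_cudnn /usr/local/cuda/lib64 5.0.5
```

Because ln -sf overwrites the old entries, the separate rm step is not needed in this variant.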

Then set the environment variable and register the shared libraries. On the command line, enter:

$ sudo gedit /etc/profile

At the end of the file that opens, add:

export PATH=/usr/local/cuda/bin:$PATH
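Note that the shell does not allow spaces around the = sign in an assignment. If the profile may be sourced more than once, a guarded variant avoids adding the same directory repeatedly. The following is a minimal sketch; path_prepend is a hypothetical helper name.

```shell
# Prepend a directory to PATH only if it is not already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

path_prepend /usr/local/cuda/bin
```

After logging in again (or sourcing /etc/profile), nvcc should be found on the PATH if CUDA is installed under /usr/local/cuda.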

After saving, create a configuration file for the dynamic linker:

$ sudo vim /etc/ld.so.conf.d/cuda.conf

Press i to enter insert mode and type the library location:

/usr/local/cuda/lib64

Then press Esc and type :wq to save and quit. In the terminal, run:

$ sudo ldconfig

so that the new library path takes effect immediately.
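To confirm that the dynamic linker now sees cuDNN, the cache listing printed by ldconfig -p can be searched. The sketch below reads that listing from standard input so it can be tried on saved output as well; lib_in_cache is a hypothetical helper name.

```shell
# Succeed if a library name appears in "ldconfig -p" style output read from stdin.
# Each cache line looks like: <tab>libname.so.N (flags) => /path/to/libname.so.N
lib_in_cache() {
    grep -q "^[[:space:]]*$1\.so"
}

# Example:
#   ldconfig -p | lib_in_cache libcudnn && echo "libcudnn is in the linker cache"
```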

3. Building and testing the CUDA samples

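When running the CUDA .run installer we already chose to install the samples, but they still have to be compiled. Because this CUDA release does not yet support gcc 5.0 or later, a header file has to be modified before compiling, otherwise the build fails. In the terminal, enter: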

$ cd /usr/local/cuda-7.5/include
$ sudo cp host_config.h host_config.h.bak # back up the header file
$ sudo gedit host_config.h

Then, at line 115, change the compiler versions that the header accepts:

#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9)
#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!
#endif /* __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9) */
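Whether this guard actually triggers depends on the installed compiler. As a rough check, the version string printed by gcc -dumpversion can be compared against 4.9 with the same logic as the preprocessor condition. This is a sketch; gcc_newer_than_4_9 is a hypothetical helper name.

```shell
# Succeed if the given gcc version string is newer than 4.9,
# mirroring: __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9)
gcc_newer_than_4_9() {
    major=${1%%.*}                  # text before the first dot
    rest=${1#*.}                    # text after the first dot
    minor=${rest%%.*}               # second version component
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -gt 9 ]; }
}

# Example:
#   gcc_newer_than_4_9 "$(gcc -dumpversion)" && echo "host_config.h needs the edit"
```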

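In the #if line of this version check, change the two consecutive 4s to 5. Then go to the samples directory, build the samples, and run deviceQuery: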

$ cd /usr/local/cuda/samples
$ sudo make all -j4
$ cd /usr/local/cuda/samples/bin/x86_64/linux/release
$ sudo ./deviceQuery
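For repeatable setups, the manual edit to host_config.h described earlier could instead be scripted with sed and run before make. The sketch below assumes GNU sed and that the guard line reads exactly as quoted in this post; patch_gcc_guard is a hypothetical helper name.

```shell
# Raise the gcc version guard in host_config.h from 4.9 to 5.9.
# Keeps a .bak copy next to the file, then edits the file in place.
patch_gcc_guard() {
    cp "$1" "$1.bak" &&
    sed -E -i \
        's/__GNUC__ > 4 \|\| \(__GNUC__ == 4 && __GNUC_MINOR__ > 9\)/__GNUC__ > 5 || (__GNUC__ == 5 \&\& __GNUC_MINOR__ > 9)/' \
        "$1"
}

# Example (run from a root shell):
#   patch_gcc_guard /usr/local/cuda-7.5/include/host_config.h
```

The substitution also updates the trailing comment on the #endif line, because it contains the same condition text; the wording of the #error message is left as it is.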

If the samples build and deviceQuery runs successfully, it prints output like the following:

CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 960M"
CUDA Driver Version / Runtime Version 8.0 / 7.5
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 4044 MBytes (4240375808 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 1176 MHz (1.18 GHz)
Memory Clock rate: 2505 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 7.5, NumDevs = 1, Device0 = GeForce GTX 960M
Result = PASS

Passing this sample test shows that both the graphics driver and CUDA were installed successfully.
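When the check is run from a script rather than read by eye, the final Result line is the part worth testing. A minimal sketch, reading the deviceQuery output from standard input (device_query_passed is a hypothetical helper name):

```shell
# Succeed only if the output read from stdin contains the line "Result = PASS".
device_query_passed() {
    grep -q '^Result = PASS$'
}

# Example:
#   ./deviceQuery | device_query_passed && echo "driver and CUDA runtime look fine"
```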

II. BLAS installation and configuration

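BLAS (Basic Linear Algebra Subprograms) is a standard application programming interface. The Caffe website recommends three implementations: ATLAS, MKL, or OpenBLAS. ATLAS can be installed directly from the command line, so it is not covered here. I use Intel's MKL library. First, apply for the student edition of Parallel Studio XE Cluster Edition on the Intel website through the link above. Once the download finishes, cd to the download directory and install it: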

$ tar zxvf parallel_studio_xe_2016_update3.tgz # extract the downloaded archive
$ chmod 777 parallel_studio_xe_2016_update3 -R # grant permissions on the files
$ cd parallel_studio_xe_2016_update3/
$ sudo ./install_GUI.sh

After the installation finishes, register the library directories with the dynamic linker:

$ sudo gedit /etc/ld.so.conf.d/intel_mkl.conf

In the file that opens, add the library directories:

/opt/intel/lib/intel64
/opt/intel/mkl/lib/intel64
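The same file can also be written without opening an editor, which is handy when scripting the setup. The sketch below writes one directory per line and warns about directories that do not exist yet; write_ld_conf is a hypothetical helper name, and the paths are the ones used in this post.

```shell
# Write one library directory per line into a linker config file.
# $1: target file; remaining arguments: library directories.
write_ld_conf() {
    conf=$1
    shift
    for dir in "$@"; do
        [ -d "$dir" ] || echo "warning: $dir does not exist yet" >&2
    done
    printf '%s\n' "$@" > "$conf"
}

# Example (run from a root shell):
#   write_ld_conf /etc/ld.so.conf.d/intel_mkl.conf /opt/intel/lib/intel64 /opt/intel/mkl/lib/intel64
```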

After adding them, refresh the linker cache so that the libraries take effect immediately:

$ sudo ldconfig

III. OpenCV 3.1.0 installation and configuration
