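Table of Contents
- Preface
- I. Installation
- II. Key Version Requirements
- 1. Matching driver and CUDA versions
- 1. Best-matched version
- 2. Minimum supported version
- 2. Matching CUDA and cuDNN versions
- III. Verifying the Installation
- 1. Verifying the driver
- 2. Verifying CUDA
- 1. nvcc
- 2. Samples
- 3. Verifying cuDNN
- 1. Header-file method
- 2. Sample method
- Summary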
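Preface
Any AI project built on CUDA starts with installing CUDA and cuDNN. This article shows how to verify that both were installed successfully and that the installed versions are the right ones.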
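I. Installation
I won't cover how to install CUDA and cuDNN here; guides are easy to find. In short, the order is GPU driver -> CUDA -> cuDNN. The GPU driver itself can be installed in several ways: a standalone install, the driver bundled with the CUDA installer, or a DKMS install. See other authors' posts for the details.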
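II. Key Version Requirements
The GPU driver version is tied to the CUDA version, and the CUDA version is in turn tied to the cuDNN version. I'll use CUDA 11.2 to illustrate these relationships so that newcomers can avoid the usual pitfalls. Read on carefully: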
1. Matching driver and CUDA versions
Once on the CUDA documentation page, click the Release Notes tab, then scroll down to the driver entry.
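1. Best-matched version
Look closely: the table mainly covers Windows and Linux. Windows needs little comment: Windows 10 or Windows 11. Linux includes the distributions we commonly use, such as Ubuntu, Debian, and CentOS. I only discuss Ubuntu here because I haven't installed the others; in principle the process is the same, and you only need to pick the matching download in the operating system selector.
This is only a demonstration. Choose the options that match your OS version, wait for the download to finish, and install it; there is nothing difficult about it.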
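Back to the driver version: CUDA 11.2 requires driver >=460.27.04 on Linux and >=460.89 on Windows. This is the best-matched driver version, though any newer version works as well. Sometimes you will run into a more unusual kind of version requirement. Read on: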
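2. Minimum supported version
Take CUDA 11.8 as an example. For this CUDA version, NVIDIA officially lists a minimum required driver version:
CUDA 11.8 official documentation
As before, click the Release Notes tab on that page, then scroll down to the driver entry.
The figure shows that the whole family from CUDA 11.0 to CUDA 11.8 supports this minimum driver requirement. That means our programs will run, but does it also mean there is no difference between Minimum and Compatible? Of course not. Here is the official statement: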
From CUDA 11 onwards, applications compiled with a CUDA Toolkit release from within a CUDA major release family can run, with limited feature-set, on systems having at least the minimum required driver version as indicated below. This minimum required driver can be different from the driver packaged with the CUDA Toolkit but should belong to the same major release.
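Don't worry if this is hard to follow; the key phrase is: with limited feature-set. In short, the core functionality works, but some newer features may not. Install the Compatible driver version whenever possible; Minimum is a last resort. How to upgrade the driver is not covered here; you can look it up yourself.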
In summary: newer drivers remain compatible with older CUDA versions, so install a recent driver whenever you can.
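The two thresholds discussed above can be turned into a small check. The following is a minimal sketch, not an official tool: `parse_version` and `classify_driver` are hypothetical helper names, 460.27.04 is the Linux figure quoted above for CUDA 11.2, and 450.80.02 is an assumed minimum used only for illustration that should be confirmed against the release notes.

```python
def parse_version(text):
    """Turn a dotted version such as '460.27.04' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def classify_driver(installed, matched, minimum):
    """Classify an installed driver against two thresholds.

    matched: driver version the toolkit release notes list as best matched
    minimum: lowest driver of the same CUDA major family (limited feature set)
    """
    current = parse_version(installed)
    if current >= parse_version(matched):
        return "full"
    if current >= parse_version(minimum):
        return "limited"
    return "unsupported"


# 460.27.04 is the figure quoted in the text for CUDA 11.2 on Linux.
# 450.80.02 is an ASSUMED minimum for illustration; check the release notes.
print(classify_driver("460.32.03", "460.27.04", "450.80.02"))
print(classify_driver("455.23.05", "460.27.04", "450.80.02"))
print(classify_driver("440.33.01", "460.27.04", "450.80.02"))
```

Comparing tuples of integers avoids the trap of comparing version strings character by character, where "460.9" would sort after "460.27".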
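2. Matching CUDA and cuDNN versions
Generally, every CUDA version has at least one matching cuDNN version. Note: at least one, and possibly several. Taking CUDA 11.2 as an example, let's look at the official site: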
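The figure is rather long, so let's just skim it. cuDNN supports CUDA 11.2 starting from version 8.1.0. In short, if you installed CUDA 11.2, install cuDNN 8.1.0 or later; you can of course move to a later supported release as needed, including the builds for 11.x. Still, unless you have a specific requirement, I suggest not chasing the latest version, since it may bring no real performance gain.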
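As a rough illustration of the rule above, the lower bound can be kept in a lookup table. This is only a sketch: `MIN_CUDNN_FOR_CUDA` and `cudnn_is_supported` are hypothetical names, and the table contains just the one pair stated in the text (CUDA 11.2 needs cuDNN 8.1.0 or later); any further entries would have to come from NVIDIA's support matrix.

```python
# Minimum cuDNN version per CUDA version. Only the 11.2 entry comes from the
# text above; extend the table from NVIDIA's support matrix as needed.
MIN_CUDNN_FOR_CUDA = {
    (11, 2): (8, 1, 0),
}


def cudnn_is_supported(cuda, cudnn):
    """Return True/False if the CUDA version is known, or None if it is not."""
    minimum = MIN_CUDNN_FOR_CUDA.get(cuda)
    if minimum is None:
        return None
    return cudnn >= minimum


print(cudnn_is_supported((11, 2), (8, 1, 0)))
print(cudnn_is_supported((11, 2), (8, 0, 5)))
```

A lower bound is necessary but not sufficient: a much later cuDNN release may have dropped support for an older CUDA version, so the support matrix remains the final word.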
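Good, that covers the version mapping. Next, how to verify the installation.
III. Verifying the Installation
The GPU driver, CUDA, and cuDNN are all required; none can be missing. There is more than one way to verify them, so I'll describe a few for reference.
1. Verifying the driver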
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04    Driver Version: 460.27.04    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:3B:00.0 Off |                  N/A |
| 31%   36C    P0    53W / 250W |      0MiB / 11019MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:86:00.0 Off |                  N/A |
| 36%   35C    P0    28W / 250W |      0MiB / 11019MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
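If this prints normally and the GPUs are detected, the driver is fine.
The main fields are:
NVIDIA-SMI 460.27.04: version of the nvidia-smi tool
Driver Version: 460.27.04: GPU driver version
CUDA Version: 11.2: the highest CUDA version this driver supports, for reference only; it is not necessarily the installed toolkit version, so defer to the official documentation
GPU: GPU index; 0 is the first card, and so on. Different models can be mixed (uncommon); what you see depends on your machine
Name: GPU name, e.g. GeForce RTX 2080 Ti
Fan: fan speed as a percentage; the fan spins faster as the core temperature rises
Temp: GPU core temperature in degrees Celsius; if it gets too hot the GPU throttles its clocks, which reduces performance
Pwr: current power draw / power limit; the draw normally stays under the limit unless you adjust it manually
Memory-Usage: GPU memory used / total; running out of GPU memory severely hurts performance or gets the job killed
GPU-Util: GPU utilization percentage; 100% means the GPU is fully busy
Processes: applications currently using the GPU, including AI and video encode/decode workloads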
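When this check has to run in a script, the version fields can be pulled out of the banner line with a regular expression. The sketch below is an illustration only: `parse_smi_banner` is a hypothetical helper, and it assumes the banner layout shown in the output above.

```python
import re

# Matches the header line of nvidia-smi as shown in the output above.
BANNER = re.compile(
    r"NVIDIA-SMI\s+(?P<smi>[\d.]+)\s+"
    r"Driver Version:\s+(?P<driver>[\d.]+)\s+"
    r"CUDA Version:\s+(?P<cuda>[\d.]+)"
)


def parse_smi_banner(text):
    """Extract tool, driver, and CUDA versions from nvidia-smi's header line."""
    match = BANNER.search(text)
    if match is None:
        raise ValueError("nvidia-smi banner not found")
    return match.groupdict()


sample = "| NVIDIA-SMI 460.27.04 Driver Version: 460.27.04 CUDA Version: 11.2 |"
print(parse_smi_banner(sample))
```

To feed it live data, the text could come from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`; a `ValueError` then suggests the driver is not responding or the banner format differs.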
2. Verifying CUDA
This one is simple. The most common method is nvcc.
1. nvcc
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
As long as it prints and the version is the one you expect, you're fine.
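"The version is the one you expect" can also be checked mechanically. This is a small sketch under the assumption that the release line has the format shown in the output above; `parse_nvcc_release` is a hypothetical helper name.

```python
import re


def parse_nvcc_release(text):
    """Return (major, minor, build) from the 'release' line of `nvcc -V`."""
    match = re.search(r"release (\d+)\.(\d+), V([\d.]+)", text)
    if match is None:
        raise ValueError("no release line found in nvcc output")
    return int(match.group(1)), int(match.group(2)), match.group(3)


sample = "Cuda compilation tools, release 11.2, V11.2.67"
major, minor, build = parse_nvcc_release(sample)
print(major, minor, build)
```

Comparing `(major, minor)` with the version you meant to install catches the common case where an older toolkit is still first on the PATH.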
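2. Samples
This method is more involved and usually unnecessary. It requires selecting CUDA Samples when installing CUDA.
Compile the source code and run the sample programs; if they run without errors, the installation works. The method takes more effort, but it is more reliable than nvcc, because nvcc -V only reports the compiler version while the samples actually run code on the GPU.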
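If building the samples is too much effort, a lighter runtime check is to ask the CUDA runtime library for its version. This is not the sample-based method described above, only a sketch of an alternative. It assumes the runtime is loadable as `libcudart.so` (the library name and search path vary by system) and relies on the runtime API functions `cudaRuntimeGetVersion` and `cudaDriverGetVersion`; `decode_cuda_version` and `query_cuda_versions` are hypothetical helper names.

```python
import ctypes


def decode_cuda_version(value):
    """CUDA encodes versions as 1000 * major + 10 * minor (11.2 -> 11020)."""
    return value // 1000, (value % 1000) // 10


def query_cuda_versions(library="libcudart.so"):
    """Ask the CUDA runtime for the runtime and driver-supported versions.

    Returns None when the library cannot be loaded or a call fails.
    The library name is an assumption; adjust it for your system.
    """
    try:
        cudart = ctypes.CDLL(library)
    except OSError:
        return None
    runtime = ctypes.c_int(0)
    driver = ctypes.c_int(0)
    if cudart.cudaRuntimeGetVersion(ctypes.byref(runtime)) != 0:
        return None
    if cudart.cudaDriverGetVersion(ctypes.byref(driver)) != 0:
        return None
    return decode_cuda_version(runtime.value), decode_cuda_version(driver.value)


print(decode_cuda_version(11020))
print(query_cuda_versions())
```

Unlike the samples, this does not launch a kernel, so it confirms that the library loads and can talk to the driver but not that computation on the GPU works.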
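3. Verifying cuDNN
cuDNN is purely a library and ships no executables by default, so it is not as easy to verify as CUDA. That doesn't mean it can't be done; again there are two methods:
1. Header-file method
Possible install paths are /usr/include/cudnn.h or /usr/local/include/cudnn.h.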
whereis cudnn_version.h
cudnn: /usr/include/cudnn_version.h
Mine is in this directory. Next, confirm the cuDNN version with a piped command.
cat /usr/include/cudnn_version.h | grep -A 5 MAJOR
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 0
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#endif /* CUDNN_VERSION_H */
grep -A 5 prints the line matching MAJOR plus the 5 lines after it. The cuDNN version is made up of three parts, which combine to 8.1.0.
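Note that this method has a gap: it only shows that the cuDNN header exists in this directory, not that cuDNN is installed correctly. To nitpick, what if the library itself was deleted by mistake? Could you still say cuDNN is intact?
That said, in the vast majority of cases a normally installed cuDNN will be fine.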
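The same header check can be scripted. The sketch below parses header text rather than a fixed path, so it works wherever cudnn_version.h lives; `parse_cudnn_header` is a hypothetical helper, and the sample text is copied from the output above.

```python
import re


def parse_cudnn_header(text):
    """Read CUDNN_MAJOR / CUDNN_MINOR / CUDNN_PATCHLEVEL from header text."""
    fields = {}
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(r"#define\s+CUDNN_" + name + r"\s+(\d+)", text)
        if match is None:
            raise ValueError("CUDNN_" + name + " not found")
        fields[name] = int(match.group(1))
    return fields["MAJOR"], fields["MINOR"], fields["PATCHLEVEL"]


header = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 0
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""
print(parse_cudnn_header(header))
```

In practice the text would come from reading the file found by whereis. To narrow the gap mentioned above, one could additionally check that the shared library resolves, for example with `ctypes.util.find_library("cudnn")`, though that still does not prove the library works.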
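2. Sample method
The prerequisite is that the cuDNN samples are installed; see the figure below:
The cuDNN 8.1.0 package comes in three parts: Runtime is required, Dev is the development environment, and Sample is the example code. I recommend installing all three in one go. Then proceed as follows: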
cd /usr/src/
ls
cudnn_samples_v8
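The samples are either under /usr/src/ or under /usr/local/src. If the mnistCUDNN binary has not been built yet, run make in its directory first (copy the samples somewhere writable if needed). Let's continue: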
cd cudnn_samples_v8/
cd mnistCUDNN/
./mnistCUDNN
Executing: mnistCUDNN
cudnnGetVersion() : 8100 , CUDNN_VERSION from cudnn.h : 8005 (8.0.5)
Host compiler version : GCC 7.5.0
There are 2 CUDA capable devices on your machine :
device 0 : sms 68 Capabilities 7.5, SmClock 1545.0 Mhz, MemSize (Mb) 11019, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=0
device 1 : sms 68 Capabilities 7.5, SmClock 1545.0 Mhz, MemSize (Mb) 11019, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=1
Using device 0
Testing single precision
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.018240 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.018432 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.036736 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.067552 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.308416 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.486848 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.044928 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.062112 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.076672 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.077632 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.086336 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.087488 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.017248 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.017472 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.018144 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.048384 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.048640 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.059392 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.043040 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.049152 time requiring 2000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.061632 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.063232 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.073088 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.085568 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
Testing half precision (math in single precision)
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.018304 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.019232 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.019360 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.051200 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.053792 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.061696 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.055008 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.056064 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.071552 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.078848 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.079648 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.098304 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.018176 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.019520 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.019552 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.048704 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.055488 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.061504 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.053248 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.055584 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.063968 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.069664 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.070560 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.079520 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
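The key line is cudnnGetVersion() : 8100. This is the cuDNN version the program detected at run time, so treat it as authoritative.
Finally, Test passed! means success: the driver, CUDA, and cuDNN are all working. If the test fails, recheck your installation environment.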
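For automated checks, the two things to look for in the log (the version line and the pass markers) can be extracted as follows. This is a sketch with hypothetical helper names; the decoding inverts the MAJOR * 1000 + MINOR * 100 + PATCHLEVEL formula from the cuDNN 8.x header shown earlier and may not apply to other major versions.

```python
import re


def decode_cudnn_version(value):
    """Invert MAJOR * 1000 + MINOR * 100 + PATCHLEVEL (cuDNN 8.x header formula)."""
    return value // 1000, (value % 1000) // 100, value % 100


def summarize_mnist_output(text):
    """Pull the runtime/header versions and pass count out of mnistCUDNN output."""
    match = re.search(
        r"cudnnGetVersion\(\) : (\d+) , CUDNN_VERSION from cudnn\.h : (\d+)", text
    )
    if match is None:
        raise ValueError("version line not found")
    return {
        "runtime": decode_cudnn_version(int(match.group(1))),
        "header": decode_cudnn_version(int(match.group(2))),
        "passed": text.count("Test passed!"),
    }


# Lines copied from the sample output above.
log = (
    "cudnnGetVersion() : 8100 , CUDNN_VERSION from cudnn.h : 8005 (8.0.5)\n"
    "Test passed!\n"
    "Test passed!\n"
)
print(summarize_mnist_output(log))
```

In the log above the runtime value (8100) and the header value (8005) differ, which is why the runtime value is the one to trust; a full run prints Test passed! twice, once for single precision and once for half precision.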
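Summary
1. There are many other methods, but these two are enough.
2. If the environment is broken, the test program will most likely not run to completion.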