References:

https://zhuanlan.zhihu.com/p/361545761

https://github.com/NVIDIA/nvidia-docker/issues/1551

 

 

=======================================================

Problem description:

In Ubuntu running under WSL, I used nvidia-docker to start a container from an image with the following command:

sudo docker run -it -v /home/devil/shareData:/shareData -p 127.0.0.1:3333:22  --runtime=nvidia --gpus all  30acf12ceadb /bin/bash

 

 

The command fails with the following error:

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/76aebda714a598487d6ec2615bfbc8729722e3138a846830a407d07f929128c4/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown.
ERRO[0000] error waiting for container:

 

 

Note that starting a container from the same image with plain docker does not produce the error:

sudo docker run -it -v /home/devil/shareData:/shareData -p 127.0.0.1:3333:22  30acf12ceadb /bin/bash

In other words, the error occurs only when the container is started through nvidia-docker.

 

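Key point:

Starting a container from the same image with nvidia-docker on a non-WSL host, that is, Ubuntu installed directly on a physical machine, does not produce the error.

So the error appears only when nvidia-docker starts a container from such an image under WSL.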

 

 

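Cause:

The earliest form of nvidia-docker required an NVIDIA driver to be installed inside the image (or container), in a version compatible with the driver on the host. This is hard to guarantee: if the host driver is even slightly older than the driver in the container, forward compatibility errors appear easily, so in practice the container's driver had to be slightly older than the host's.

Because matching driver versions between host and container was so fragile, current nvidia-docker takes a different approach. The image is built without the NVIDIA driver, and when the container starts, nvidia-docker automatically maps the host's NVIDIA driver libraries (including the CUDA driver library libcuda) into the container. The corresponding options are --runtime=nvidia --gpus all. Some people are not aware of this mechanism and still package the NVIDIA driver and CUDA into their images.

Under WSL, the physical machine's NVIDIA GPU is not driven by a native Linux driver but exposed through virtualization, and the driver in use inside WSL is the WSL NVIDIA driver. Because of how this driver behaves, using nvidia-docker under WSL to start a container whose image already bundles the NVIDIA driver and CUDA fails at startup.

Specifically, when nvidia-docker under WSL creates /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 and /usr/lib/x86_64-linux-gnu/libcuda.so.1 in the container, it checks whether the image already contains files at those paths and, if so, fails with the error shown at the top of this post. On a physical Ubuntu machine, nvidia-docker instead overwrites (force-maps over) these files when it first starts the container, even if they exist in the image. This overwrite does not affect what gets saved from the container: after docker commit, the files in the new image are still the ones from the original image, not the host files mapped in by nvidia-docker.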
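
To check ahead of time whether an image is likely to hit this conflict, the two paths can be tested directly. The following is a minimal sketch and not part of the original troubleshooting: list_conflicting_libs is a hypothetical helper name, and it assumes it is run either inside a container started with plain docker (root "/") or against a directory that holds an unpacked copy of the image's filesystem.

```shell
# Hypothetical helper: print which of the two conflicting libraries exist
# under a root directory. With no argument it checks "/", i.e. it is meant
# to be run inside a container started with plain docker.
list_conflicting_libs() {
    root="${1:-/}"
    for rel in usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 \
               usr/lib/x86_64-linux-gnu/libcuda.so.1; do
        candidate="${root%/}/$rel"
        # -e is false for dangling symlinks, so test -L as well
        if [ -e "$candidate" ] || [ -L "$candidate" ]; then
            printf '%s\n' "/$rel"
        fi
    done
}
```

Any output from list_conflicting_libs suggests the image bundles driver libraries at the paths named in the error; empty output suggests those two paths are free.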

 

 

 

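Solution:

1. Start a container from the original image with plain docker rather than nvidia-docker, rename (or delete) /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 and /usr/lib/x86_64-linux-gnu/libcuda.so.1 inside it, and then commit the container as a new image. The steps are: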

sudo docker run --rm -it  14.14.15.100:5000/pytorch/pytorch:20.08-py3-cuda11

mv /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1  /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1.bak

mv /usr/lib/x86_64-linux-gnu/libcuda.so.1  /usr/lib/x86_64-linux-gnu/libcuda.so.1.bak

sudo docker commit  72f081acebae  new:v1   # run in a separate terminal on the host, while the container is still running

2. Start the image created in the previous step with nvidia-docker:

sudo docker run -it -v /home/devil/shareData:/shareData -p 127.0.0.1:3333:22  --runtime=nvidia --gpus all  new:v1  /bin/bash

 

The container starts successfully and the problem is solved.

[Screenshot of the running container]

 

 

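3. Install and configure whatever is needed inside this working GPU-enabled container. To export the environment to another machine, run docker commit on it again. If the image is going back to the environment where the original image was built, the earlier file changes can be reverted before docker commit: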

rm  /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1

rm /usr/lib/x86_64-linux-gnu/libcuda.so.1

mv /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1.bak  /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1

mv /usr/lib/x86_64-linux-gnu/libcuda.so.1.bak  /usr/lib/x86_64-linux-gnu/libcuda.so.1
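
The rename and restore steps above can also be wrapped in two small functions. This is only a sketch under stated assumptions: hide_driver_libs and restore_driver_libs are hypothetical names, the root-directory argument is added for illustration (it defaults to "/", matching the manual commands run inside the container), and the behavior on files mapped in by nvidia-docker has not been verified here.

```shell
# Hypothetical helpers mirroring the manual mv / rm commands.
# The optional argument is the root directory to operate on (default "/").

hide_driver_libs() {
    root="${1:-/}"
    for rel in usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 \
               usr/lib/x86_64-linux-gnu/libcuda.so.1; do
        target="${root%/}/$rel"
        # Rename only if the library is actually present in the image
        if [ -e "$target" ] || [ -L "$target" ]; then
            mv "$target" "$target.bak"
        fi
    done
}

restore_driver_libs() {
    root="${1:-/}"
    for rel in usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 \
               usr/lib/x86_64-linux-gnu/libcuda.so.1; do
        target="${root%/}/$rel"
        # Restore only where a backup exists; drop whatever sits at the path now
        if [ -e "$target.bak" ] || [ -L "$target.bak" ]; then
            rm -f "$target"
            mv "$target.bak" "$target"
        fi
    done
}
```

Calling hide_driver_libs before the first docker commit and restore_driver_libs before the final one corresponds to steps 1 and 3.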

 

 

 

 

========================================================

 

 

 

Additional notes:

The official nvidia-docker test image is started like this:

sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi

Inside a container started from this image, search for nvidia files:

[Screenshot: output of the search for nvidia files]

 

Search for cuda files:

root@f543d46c991d:/# find / -name *cuda*
/var/lib/dpkg/info/cuda-cudart-11-6.list
/var/lib/dpkg/info/cuda-toolkit-11-6-config-common.list
/var/lib/dpkg/info/cuda-toolkit-11-config-common.postinst
/var/lib/dpkg/info/cuda-toolkit-config-common.postrm
/var/lib/dpkg/info/cuda-toolkit-config-common.conffiles
/var/lib/dpkg/info/cuda-toolkit-11-config-common.list
/var/lib/dpkg/info/cuda-toolkit-11-config-common.md5sums
/var/lib/dpkg/info/cuda-toolkit-11-6-config-common.postinst
/var/lib/dpkg/info/cuda-toolkit-config-common.md5sums
/var/lib/dpkg/info/cuda-compat-11-6.list
/var/lib/dpkg/info/cuda-compat-11-6.md5sums
/var/lib/dpkg/info/cuda-compat-11-6.shlibs
/var/lib/dpkg/info/cuda-toolkit-config-common.postinst
/var/lib/dpkg/info/cuda-toolkit-11-config-common.conffiles
/var/lib/dpkg/info/cuda-toolkit-11-6-config-common.md5sums
/var/lib/dpkg/info/cuda-toolkit-config-common.list
/var/lib/dpkg/info/cuda-toolkit-11-6-config-common.postrm
/var/lib/dpkg/info/cuda-cudart-11-6.md5sums
/var/lib/dpkg/info/cuda-compat-11-6.triggers
/var/lib/dpkg/info/cuda-toolkit-11-config-common.postrm
/var/lib/dpkg/alternatives/cuda-11
/var/lib/dpkg/alternatives/cuda
/usr/share/doc/cuda-cudart-11-6
/usr/share/doc/cuda-toolkit-11-6-config-common
/usr/share/doc/cuda-compat-11-6
/usr/share/doc/cuda-toolkit-config-common
/usr/share/doc/cuda-toolkit-11-config-common
/usr/local/cuda-11
/usr/local/cuda
/usr/local/cuda-11.6
/usr/local/cuda-11.6/compat/libcuda.so
/usr/local/cuda-11.6/compat/libcuda.so.510.108.03
/usr/local/cuda-11.6/compat/libcuda.so.1
/usr/local/cuda-11.6/targets/x86_64-linux/lib/libcudart.so.11.0
/usr/local/cuda-11.6/targets/x86_64-linux/lib/libcudart.so.11.6.55
/etc/alternatives/cuda-11
/etc/alternatives/cuda
/etc/apt/sources.list.d/cuda.list
/etc/ld.so.conf.d/989_cuda-11.conf
/etc/ld.so.conf.d/000_cuda.conf

 

-------------------------------------------------------------------------

 

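As shown above, current GPU-enabled images do not ship the NVIDIA driver themselves. Searching a container started from the failing image gives a different picture (the listing was captured after the rename step, so the .bak files also appear):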

root@c3840fee8f26:~# find / -name *nvidia*

/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1.bak
/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.470.82.01
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.82.01
/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.470.82.01
/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.470.82.01
/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.470.82.01
/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.470.82.01
/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.450.102.04
/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.450.102.04
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.450.102.04
/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.450.102.04
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.450.102.04
/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.450.102.04
/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.418.67
/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.418.67
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.418.67
/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.418.67
/usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.418.67
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.67
/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.455.23.05
/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.455.23.05
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.455.23.05
/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.455.23.05
/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.455.23.05
/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.450.80.02
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.455.23.05
/usr/lib/wsl/drivers/nvhm.inf_amd64_4a2f8a62d5686839/libnvidia-ptxjitcompiler.so.1
/usr/lib/wsl/drivers/nvhm.inf_amd64_4a2f8a62d5686839/nvidia-smi
/usr/lib/wsl/drivers/nvhm.inf_amd64_4a2f8a62d5686839/libnvidia-ml.so.1
/usr/lib/wsl/drivers/nvhm.inf_amd64_4a2f8a62d5686839/libnvidia-ml_loader.so
/usr/lib/pkgconfig/nvidia-ml-11.0.pc
/usr/bin/nvidia-debugdump
/usr/bin/nvidia-smi
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/bin/nvidia-persistenced
/usr/local/cuda-11.0/compat/libnvidia-ptxjitcompiler.so.1
/usr/local/cuda-11.0/compat/libnvidia-ptxjitcompiler.so.450.80.02
/usr/local/cuda-11.0/targets/x86_64-linux/lib/stubs/libnvidia-ml.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/stubs/libnvidia-ml.so.1
/usr/local/cuda-11.0/Sanitizer/docs/common/formatting/nvidia.png
/usr/local/cuda-11.0/extras/CUPTI/doc/common/formatting/nvidia.png
/etc/apt/sources.list.d/nvidia-ml.list
/etc/ld.so.conf.d/nvidia.conf

 

root@c3840fee8f26:~# find / -name *cuda*
/tmp/libnccl-dev_2.12.10-1+cuda11.0_amd64.deb
/tmp/libnccl2_2.12.10-1+cuda11.0_amd64.deb
/var/lib/dpkg/info/cuda-nvdisasm-11-0.md5sums
/var/lib/dpkg/info/cuda-cudart-dev-11-0.md5sums
/var/lib/dpkg/info/cuda-compiler-11-0.md5sums
/var/lib/dpkg/info/cuda-libraries-dev-11-0.list
/var/lib/dpkg/info/cuda-nvrtc-dev-11-0.md5sums
/var/lib/dpkg/info/cuda-command-line-tools-11-0.md5sums
/var/lib/dpkg/info/cuda-nvml-dev-11-0.md5sums
/var/lib/dpkg/info/cuda-driver-dev-11-0.md5sums
/var/lib/dpkg/info/cuda-compiler-11-0.list
/var/lib/dpkg/info/cuda-gdb-11-0.list
/var/lib/dpkg/info/cuda-nvml-dev-11-0.list
/var/lib/dpkg/info/cuda-nvdisasm-11-0.list
/var/lib/dpkg/info/cuda-cudart-dev-11-0.list
/var/lib/dpkg/info/cuda-cupti-dev-11-0.list
/var/lib/dpkg/info/cuda-minimal-build-11-0.list
/var/lib/dpkg/info/cuda-cuobjdump-11-0.md5sums
/var/lib/dpkg/info/cuda-libraries-dev-11-0.md5sums
/var/lib/dpkg/info/cuda-memcheck-11-0.list
/var/lib/dpkg/info/cuda-minimal-build-11-0.md5sums
/var/lib/dpkg/info/cuda-cupti-11-0.list
/var/lib/dpkg/info/cuda-cupti-dev-11-0.md5sums
/var/lib/dpkg/info/cuda-driver-dev-11-0.list
/var/lib/dpkg/info/cuda-nvcc-11-0.md5sums
/var/lib/dpkg/info/cuda-gdb-11-0.md5sums
/var/lib/dpkg/info/cuda-cuobjdump-11-0.list
/var/lib/dpkg/info/cuda-cupti-11-0.md5sums
/var/lib/dpkg/info/cuda-nvprof-11-0.list
/var/lib/dpkg/info/cuda-command-line-tools-11-0.list
/var/lib/dpkg/info/cuda-nvrtc-dev-11-0.list
/var/lib/dpkg/info/cuda-sanitizer-11-0.list
/var/lib/dpkg/info/cuda-nvprof-11-0.md5sums
/var/lib/dpkg/info/cuda-memcheck-11-0.md5sums
/var/lib/dpkg/info/cuda-nvcc-11-0.list
/var/lib/dpkg/info/cuda-nvprune-11-0.md5sums
/var/lib/dpkg/info/cuda-sanitizer-11-0.md5sums
/var/lib/dpkg/info/cuda-nvprune-11-0.list
/var/lib/dpkg/info/cuda-libraries-11-0.md5sums
/var/lib/dpkg/info/cuda-nvrtc-11-0.list
/var/lib/dpkg/info/cuda-libraries-11-0.list
/var/lib/dpkg/info/cuda-nvtx-11-0.list
/var/lib/dpkg/info/cuda-nvtx-11-0.md5sums
/var/lib/dpkg/info/cuda-nvrtc-11-0.md5sums
/var/lib/dpkg/info/cuda-compat-11-0.md5sums
/var/lib/dpkg/info/cuda-cudart-11-0.list
/var/lib/dpkg/info/cuda-compat-11-0.shlibs
/var/lib/dpkg/info/cuda-cudart-11-0.md5sums
/var/lib/dpkg/info/cuda-compat-11-0.triggers
/var/lib/dpkg/info/cuda-compat-11-0.list
/var/lib/dpkg/info/cuda-cudart-11-0.conffiles
/usr/include/linux/cuda.h
/usr/lib/x86_64-linux-gnu/libcuda.so.1
/usr/lib/x86_64-linux-gnu/libcuda.so.1.bak
/usr/lib/x86_64-linux-gnu/libcuda.so.470.82.01
/usr/lib/x86_64-linux-gnu/libcuda.so.450.102.04
/usr/lib/x86_64-linux-gnu/libcuda.so.418.67
/usr/lib/x86_64-linux-gnu/libcuda.so.455.23.05
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so.450.80.02
/usr/lib/x86_64-linux-gnu/libicudata.so.60.2
/usr/lib/x86_64-linux-gnu/libicudata.so.60
/usr/lib/wsl/drivers/nvhm.inf_amd64_4a2f8a62d5686839/libcuda_loader.so
/usr/lib/wsl/drivers/nvhm.inf_amd64_4a2f8a62d5686839/libcuda.so.1.1
/usr/lib/pkgconfig/cudart-11.0.pc
/usr/lib/pkgconfig/cuda-11.0.pc
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/share/doc/cuda-command-line-tools-11-0
/usr/share/doc/cuda-nvprof-11-0
/usr/share/doc/cuda-minimal-build-11-0
/usr/share/doc/cuda-nvcc-11-0
/usr/share/doc/cuda-cupti-11-0
/usr/share/doc/cuda-nvrtc-dev-11-0
/usr/share/doc/cuda-libraries-dev-11-0
/usr/share/doc/cuda-driver-dev-11-0
/usr/share/doc/cuda-gdb-11-0
/usr/share/doc/cuda-cudart-dev-11-0
/usr/share/doc/cuda-nvdisasm-11-0
/usr/share/doc/cuda-compiler-11-0
/usr/share/doc/cuda-nvprune-11-0
/usr/share/doc/cuda-cupti-dev-11-0
/usr/share/doc/cuda-nvml-dev-11-0
/usr/share/doc/cuda-memcheck-11-0
/usr/share/doc/cuda-sanitizer-11-0
/usr/share/doc/cuda-cuobjdump-11-0
/usr/share/doc/cuda-nvrtc-11-0
/usr/share/doc/cuda-nvtx-11-0
/usr/share/doc/cuda-libraries-11-0
/usr/share/doc/cuda-compat-11-0
/usr/share/doc/cuda-cudart-11-0
/usr/share/vim/vim80/syntax/cuda.vim
/usr/share/vim/vim80/indent/cuda.vim
/usr/local/include/openmpi/mpiext/mpiext_cuda_c.h
/usr/local/lib/python3.6/dist-packages/torch/utils/hipify/cuda_to_hip_mappings.py
/usr/local/lib/python3.6/dist-packages/torch/utils/hipify/__pycache__/cuda_to_hip_mappings.cpython-36.pyc
/usr/local/lib/python3.6/dist-packages/torch/backends/cuda
/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/utils/cuda_enabled.h
/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/utils/cuda_lazy_init.h
/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include/torch/cuda.h
/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/cuda
/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/jit/passes/cuda_graph_fuser.h
/usr/local/lib/python3.6/dist-packages/torch/include/ATen/cuda
/usr/local/lib/python3.6/dist-packages/torch/include/ATen/native/cuda
/usr/local/lib/python3.6/dist-packages/torch/include/c10/cuda
/usr/local/lib/python3.6/dist-packages/torch/include/c10/cuda/impl/cuda_cmake_macros.h
/usr/local/lib/python3.6/dist-packages/torch/include/caffe2/cuda_rtc
/usr/local/lib/python3.6/dist-packages/torch/lib/libcudart-3f3c6934.so.11.0
/usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_cuda.so
/usr/local/lib/python3.6/dist-packages/torch/lib/libc10_cuda.so
/usr/local/lib/python3.6/dist-packages/torch/cuda
/usr/local/lib/python3.6/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake
/usr/local/lib/python3.6/dist-packages/torch/testing/_internal/__pycache__/common_cuda.cpython-36.pyc
/usr/local/lib/python3.6/dist-packages/torch/testing/_internal/common_cuda.py
/usr/local/lib/python3.6/dist-packages/caffe2/contrib/prof/__pycache__/cuda_profile_ops_test.cpython-36.pyc
/usr/local/lib/python3.6/dist-packages/caffe2/contrib/prof/cuda_profile_ops_test.py
/usr/local/lib/python3.6/dist-packages/torchvision.libs/libcudart.1372cad0.so.11.0
/usr/local/lib/libmca_common_cuda.la
/usr/local/lib/libmca_common_cuda.so.40.20.0
/usr/local/lib/libmca_common_cuda.so
/usr/local/lib/libmca_common_cuda.so.40
/usr/local/lib/openmpi/mca_coll_cuda.so
/usr/local/lib/openmpi/mca_btl_smcuda.la
/usr/local/lib/openmpi/mca_coll_cuda.la
/usr/local/lib/openmpi/mca_btl_smcuda.so
/usr/local/share/man/man3/MPIX_Query_cuda_support.3
/usr/local/share/openmpi/help-mpi-btl-smcuda.txt
/usr/local/share/openmpi/help-mpi-common-cuda.txt
/usr/local/share/openmpi/help-mpi-coll-cuda.txt
/usr/local/cuda-11.0
/usr/local/cuda-11.0/compat/libcuda.so
/usr/local/cuda-11.0/compat/libcuda.so.1
/usr/local/cuda-11.0/compat/libcuda.so.450.80.02
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudart.so.11.0
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudart.so.11.0.221
/usr/local/cuda-11.0/targets/x86_64-linux/lib/stubs/libcuda.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/stubs/libcuda.so.1
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudart.so
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudadevrt.a
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudart_static.a
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_awbarrier.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/generated_cuda_runtime_api_meta.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/generated_cuda_meta.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/generated_cudaVDPAU_meta.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cudalibxt.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_runtime_api.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_occupancy.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_texture_types.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_awbarrier_helpers.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_fp16.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_device_runtime_api.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_vdpau_interop.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_surface_types.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cudaEGL.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_fp16.hpp
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_pipeline_helpers.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/generated_cudaGL_meta.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cudart_platform.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/generated_cuda_vdpau_interop_meta.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_runtime.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_gl_interop.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/thrust/system/cuda
/usr/local/cuda-11.0/targets/x86_64-linux/include/thrust/system/cuda/detail/guarded_cuda_runtime_api.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_profiler_api.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cudaGL.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_bf16.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/nvperf_cuda_host.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cudaVDPAU.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cudaProfiler.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_pipeline.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_bf16.hpp
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_egl_interop.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_pipeline_primitives.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_stdint.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/generated_cuda_gl_interop_meta.h
/usr/local/cuda-11.0/targets/x86_64-linux/include/cuda_awbarrier_primitives.h
/usr/local/cuda-11.0/nvvm/libnvvm-samples/cuda-c-linking
/usr/local/cuda-11.0/nvvm/libnvvm-samples/cuda-c-linking/cuda-c-linking.cpp
/usr/local/cuda-11.0/Sanitizer/include/generated_cuda_runtime_api_meta.h
/usr/local/cuda-11.0/Sanitizer/include/generated_cuda_meta.h
/usr/local/cuda-11.0/Sanitizer/include/generated_cudaVDPAU_meta.h
/usr/local/cuda-11.0/Sanitizer/include/generated_cudaGL_meta.h
/usr/local/cuda-11.0/Sanitizer/include/generated_cuda_vdpau_interop_meta.h
/usr/local/cuda-11.0/Sanitizer/include/generated_cuda_profiler_api_meta.h
/usr/local/cuda-11.0/Sanitizer/include/generated_cuda_gl_interop_meta.h
/usr/local/cuda-11.0/Sanitizer/docs/common/formatting/cuda-toolkit-documentation.png
/usr/local/cuda-11.0/bin/cudafe++
/usr/local/cuda-11.0/bin/cuda-gdb
/usr/local/cuda-11.0/bin/cuda-gdbserver
/usr/local/cuda-11.0/bin/cuda-memcheck
/usr/local/cuda-11.0/extras/Debugger/lib64/libcudacore.a
/usr/local/cuda-11.0/extras/Debugger/include/libcudacore.h
/usr/local/cuda-11.0/extras/Debugger/include/cudacoredump.h
/usr/local/cuda-11.0/extras/Debugger/include/cuda_stdint.h
/usr/local/cuda-11.0/extras/Debugger/include/cudadebugger.h
/usr/local/cuda-11.0/extras/CUPTI/doc/common/formatting/cuda-toolkit-documentation.png
/usr/local/cuda-11.0/extras/CUPTI/samples/userrange_profiling/simplecuda.cu
/usr/local/cuda-11.0/extras/CUPTI/samples/autorange_profiling/simplecuda.cu
/usr/local/cuda
/etc/apt/sources.list.d/cuda.list
/etc/ld.so.conf.d/cuda-11-0.conf

 

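The failing image has several versions of the NVIDIA driver installed directly in it. Because nvidia-docker under WSL does not overwrite these files when they already exist:

/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1

/usr/lib/x86_64-linux-gnu/libcuda.so.1

starting a container under WSL from an image that has the NVIDIA driver installed fails.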
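
The long listings above can be condensed into just the driver versions bundled in an image. The following is a rough sketch, assuming the libraries sit in /usr/lib/x86_64-linux-gnu as in the listings and that find supports -maxdepth (GNU and BSD find do); list_bundled_driver_versions is a hypothetical name. The -name pattern is quoted so that the shell passes it to find unchanged.

```shell
# Hypothetical helper: print the version suffixes of libcuda.so.<version>
# files found under <root>/usr/lib/x86_64-linux-gnu (root defaults to "/").
list_bundled_driver_versions() {
    root="${1:-/}"
    # The pattern needs at least two numeric components, so libcuda.so.1
    # and libcuda.so.1.bak are skipped.
    find "${root%/}/usr/lib/x86_64-linux-gnu" -maxdepth 1 \
        -name 'libcuda.so.[0-9]*.[0-9]*' 2>/dev/null |
        sed 's/.*libcuda\.so\.//' |
        sort -u
}
```

More than one line of output indicates that several driver versions were installed into the image, as is the case for the failing image here.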

 

 

 ========================================================