Running the Python program while deploying a large model on a local machine fails with: RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
The full traceback is:
Traceback (most recent call last):
  File "cli_demo.py", line 11, in <module>
    webglm = load_model(args)
  File "/root/WebGLM/model/modeling_webglm.py", line 66, in load_model
    webglm = WebGLM(webglm_ckpt_path, retiever_ckpt_path, args.device, args.filter_max_batch_size, args.searcher)
  File "/root/WebGLM/model/modeling_webglm.py", line 8, in __init__
    self.ref_retriever = ReferenceRetiever(retriever_ckpt_path, device, filter_max_batch_size, searcher_name)
  File "/root/WebGLM/model/retriever/__init__.py", line 14, in __init__
    self.filter = ReferenceFilter(retriever_ckpt_path, device, filter_max_batch_size)
  File "/root/WebGLM/model/retriever/filtering/contriver.py", line 76, in __init__
    self.scorer = ContrieverScorer(retriever_ckpt_path, device, max_batch_size)
  File "/root/WebGLM/model/retriever/filtering/contriver.py", line 16, in __init__
    self.query_encoder = self.query_encoder.to(self.device).eval()
  File "/usr/local/python3/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1878, in to
    return super().to(*args, **kwargs)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 927, in to
    return self._apply(convert)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply
    module._apply(fn)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply
    module._apply(fn)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 602, in _apply
    param_applied = fn(param)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 925, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "/usr/local/python3/lib/python3.8/site-packages/torch/cuda/__init__.py", line 217, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
Check whether PyTorch can see a GPU:
python38 -c "import torch; print(torch.zeros(1).cuda()); print(torch.cuda.is_available())"
Or:
CUDA_VISIBLE_DEVICES=0 LD_PRELOAD=./dummy-uvm.so python38 -c 'import torch; print(torch.cuda.get_device_name(0))'
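The same check can also be made without importing PyTorch at all, which helps tell a missing driver apart from a PyTorch installation problem. The following is a minimal sketch, assuming the `nvidia-smi` tool that normally ships with the NVIDIA driver is the thing to look for; the function name `nvidia_driver_visible` is made up for this example.

```python
import shutil
import subprocess


def nvidia_driver_visible(timeout: float = 10.0) -> bool:
    """Return True if `nvidia-smi` is on PATH and exits successfully.

    Hypothetical helper: a False result suggests the driver is missing
    or not working, but it is only a heuristic.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The tool is not installed, so the driver is most likely absent.
        return False
    try:
        # `nvidia-smi -L` lists the GPUs the driver can see.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("NVIDIA driver visible:", nvidia_driver_visible())
```

If this prints False, the machine probably has no working NVIDIA driver, and the model has to run on the CPU until one is installed.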
Find the place where the model is moved to the GPU and edit it, for example:
vim ./model/retriever/filtering/contriver.py
Change the device selection there so that it falls back to the CPU when CUDA is unavailable:
self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
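The fallback can also be kept in one small helper so that a device string passed on the command line is respected when it is usable. This is only a sketch under assumptions: `resolve_device` is a hypothetical name, not part of WebGLM or PyTorch, and the demo at the bottom uses a toy `torch.nn.Linear` layer rather than the real retriever.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Return `requested`, or "cpu" if CUDA was requested but is unusable.

    Hypothetical helper for illustration only.
    """
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; skipping the demo.")
    else:
        # torch.cuda.is_available() returns False instead of raising
        # when no NVIDIA driver is present.
        name = resolve_device("cuda:0", torch.cuda.is_available())
        device = torch.device(name)
        # Toy module standing in for the query encoder.
        layer = torch.nn.Linear(4, 2).to(device).eval()
        out = layer(torch.zeros(1, 4, device=device))
        print(device, tuple(out.shape))
```

Keeping the decision in a plain function that takes a string and a boolean makes it easy to test on a machine without a GPU.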