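Bug 6149 - The kernel-devel dependency check fails during nvidia-driver installation: the driver installs normally when kernel-devel is not installed, but does not take effect afterwards
Summary: The kernel-devel dependency check fails during nvidia-driver installation: the driver installs normally when kernel-devel is not installed, but does not take effect afterwards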
Status: RESOLVED FIXED
Alias: None
Product: Anolis OS 23
Classification: Anolis OS
Component: BaseOS Modules
Version: 23.0
Hardware: x86_64 Linux
Importance: P3-Medium S2-major
Target Milestone: ---
Assignee: xuchunmei
Reported: 2023-08-11 10:19 UTC by feitian200603
Modified: 2023-08-17 14:49 UTC
Description feitian200603 alibaba_cloud_group 2023-08-11 10:19:50 UTC
[root@localhost ~]# nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
[root@localhost ~]# lsmod | grep nvidia
[root@localhost ~]#
[root@localhost ~]# rpm -qa |grep nvidia
nvidia-driver-NVML-530.30.02-4.an23.x86_64
nvidia-driver-libs-530.30.02-4.an23.x86_64
nvidia-driver-cuda-libs-530.30.02-4.an23.x86_64
nvidia-driver-NvFBCOpenGL-530.30.02-4.an23.x86_64
nvidia-persistenced-530.30.02-4.an23.x86_64
nvidia-driver-cuda-530.30.02-4.an23.x86_64
nvidia-driver-devel-530.30.02-4.an23.x86_64
nvidia-driver-530.30.02-4.an23.x86_64
nvidia-kmod-common-530.30.02-4.an23.x86_64
kmod-nvidia-latest-dkms-530.30.02-4.an23.x86_64
nvidia-modprobe-530.30.02-4.an23.x86_64
nvidia-settings-530.30.02-4.an23.x86_64
nvidia-xconfig-530.30.02-4.an23.x86_64
[root@localhost ~]# rpm -qa |grep cuda
cuda-toolkit-config-common-12.1.105-4.an23.noarch
cuda-toolkit-12-config-common-12.1.105-4.an23.noarch
cuda-toolkit-12-1-config-common-12.1.105-4.an23.noarch
cuda-nvml-devel-12-1-12.1.105-4.an23.x86_64
cuda-driver-devel-12-1-12.1.105-4.an23.x86_64
cuda-cudart-12-1-12.1.105-4.an23.x86_64
cuda-opencl-12-1-12.1.105-4.an23.x86_64
cuda-nvrtc-12-1-12.1.105-4.an23.x86_64
cuda-nvdisasm-12-1-12.1.105-4.an23.x86_64
cuda-cupti-12-1-12.1.105-4.an23.x86_64
cuda-nvprof-12-1-12.1.105-4.an23.x86_64
cuda-cccl-12-1-12.1.109-4.an23.x86_64
cuda-libraries-12-1-12.1.1-4.an23.x86_64
cuda-cudart-devel-12-1-12.1.105-4.an23.x86_64
cuda-nvrtc-devel-12-1-12.1.105-4.an23.x86_64
cuda-opencl-devel-12-1-12.1.105-4.an23.x86_64
cuda-nsight-compute-12-1-12.1.1-4.an23.x86_64
cuda-profiler-api-12-1-12.1.105-4.an23.x86_64
cuda-libraries-devel-12-1-12.1.1-4.an23.x86_64
cuda-nvtx-12-1-12.1.105-4.an23.x86_64
cuda-nvprune-12-1-12.1.105-4.an23.x86_64
cuda-nvcc-12-1-12.1.105-4.an23.x86_64
cuda-gdb-12-1-12.1.105-4.an23.x86_64
cuda-documentation-12-1-12.1.105-4.an23.x86_64
cuda-cuxxfilt-12-1-12.1.105-4.an23.x86_64
cuda-cuobjdump-12-1-12.1.111-4.an23.x86_64
cuda-compiler-12-1-12.1.1-4.an23.x86_64
cuda-nsight-systems-12-1-12.1.1-4.an23.x86_64
cuda-nsight-12-1-12.1.105-4.an23.x86_64
cuda-nvvp-12-1-12.1.105-4.an23.x86_64
cuda-visual-tools-12-1-12.1.1-4.an23.x86_64
cuda-demo-suite-12-1-12.1.105-4.an23.x86_64
nvidia-driver-cuda-libs-530.30.02-4.an23.x86_64
nvidia-driver-cuda-530.30.02-4.an23.x86_64
cuda-drivers-530.30.02-4.an23.x86_64
cuda-runtime-12-1-12.1.1-4.an23.x86_64
cuda-sanitizer-12-1-12.1.105-4.an23.x86_64
cuda-command-line-tools-12-1-12.1.1-4.an23.x86_64
cuda-tools-12-1-12.1.1-4.an23.x86_64
cuda-toolkit-12-1-12.1.1-4.an23.x86_64
cuda-12-1-12.1.1-4.an23.x86_64
cuda-12.1.1-4.an23.x86_64
Comment 1 xuchunmei alibaba_cloud_group 2023-08-11 11:02:25 UTC
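The kernel and kernel-devel versions in the installation environment do not match, so dkms fails when nvidia-kmod is installed.

Consider printing a message when dkms fails during nvidia-kmod installation.

In addition, because multiple versions of the kernel-devel package can be installed side by side while the dkms build depends on the currently running kernel version, a guide document is still needed to explain how to install nvidia-driver correctly.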
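
As a starting point for such a guide, the following is a minimal sketch of a pre-install check. It assumes an rpm-based system and that the required package is named kernel-devel-$(uname -r), as in this report. The function names kernel_devel_pkg and has_matching_kernel_devel are hypothetical and do not come from any package.

```shell
#!/bin/sh
# Hypothetical helper: build the name of the kernel-devel package
# that matches a given kernel release string.
kernel_devel_pkg() {
    printf 'kernel-devel-%s\n' "$1"
}

# Hypothetical helper: succeed only if rpm reports that the matching
# kernel-devel package is installed.
has_matching_kernel_devel() {
    rpm -q "$(kernel_devel_pkg "$1")" >/dev/null 2>&1
}

# Compare against the running kernel, since dkms builds for it.
running="$(uname -r)"
if has_matching_kernel_devel "$running"; then
    echo "$(kernel_devel_pkg "$running") is installed"
else
    echo "missing $(kernel_devel_pkg "$running"); install it before nvidia-driver"
fi
```

If the package is reported missing, installing kernel-devel-$(uname -r) first and then installing nvidia-driver should let dkms build the module for the running kernel. Afterwards, lsmod | grep nvidia and nvidia-smi can be used to confirm that the driver is loaded.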
Comment 2 xuchunmei alibaba_cloud_group 2023-08-17 14:49:39 UTC
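A check has been added to the kmod-nvidia-latest-dkms installation: if kernel-devel-$(uname -r) is not installed on the OS, a message is printed during installation.
The message looks like the following: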

  Running scriptlet: kmod-nvidia-latest-dkms-3:470.199.02-3.an8.x86_64                                                                                                                                 32/34
You should install kernel-devel-5.10.134-14.1.an8.x86_64 first to build nvidia module for current kernel.
warning: %post(kmod-nvidia-latest-dkms-3:470.199.02-3.an8.x86_64) scriptlet failed, exit status 1

Error in POSTIN scriptlet in rpm package kmod-nvidia-latest-dkms
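
The behaviour above could be produced by a check along these lines. This is only a sketch under assumptions, not the actual %post scriptlet shipped in kmod-nvidia-latest-dkms. The function name check_kernel_devel is hypothetical, and the message text is copied from the output above.

```shell
# Hypothetical stand-in for the packaged check; the real scriptlet may differ.
# Prints the hint and returns 1 when the matching kernel-devel is absent.
check_kernel_devel() {
    kver="$1"
    if ! rpm -q "kernel-devel-${kver}" >/dev/null 2>&1; then
        echo "You should install kernel-devel-${kver} first to build nvidia module for current kernel."
        return 1
    fi
    return 0
}
```

Inside a scriptlet it would be called as check_kernel_devel "$(uname -r)" || exit 1, which matches the "scriptlet failed, exit status 1" warning shown above.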