[elrepo] after kmod-nvidia el7 update during boot still use the old kmod

Phil Perry phil at elrepo.org
Sun Oct 25 10:27:08 EDT 2015


On 25/10/15 10:38, Farkas Levente wrote:
> hi,
> after upgrading to kmod-nvidia-352.55-1.el7.elrepo.x86_64 after boot
> i've only got black screen. and this happened with the previous version
> also. the reason is that during boot the kernel's initramfs still
> contains the old kmod. the relevant part of dmesg:
> -------------------------------
> [    1.080456] Request for unknown module key 'The ELRepo Project
> (http://elrepo.org): ELRepo.org Secure Boot Key:
> f365ad3481a7b20e3427b61b2a26635b83fe427b' err -11
> [    1.080716] nvidia: module license 'NVIDIA' taints kernel.
> [    1.081210] Disabling lock debugging due to kernel taint
> [    1.085063] nvidia: module verification failed: signature and/or
> required key missing - tainting kernel
> [    1.088709] vgaarb: device changed decodes:
> PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
> [    1.090184] [drm] Initialized nvidia-drm 0.0.0 20150116 for
> 0000:01:00.0 on minor 0
> [    1.090188] libata version 3.00 loaded.
> [    1.090389] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  352.41
> Fri Aug 21 23:09:52 PDT 2015
> -------------------------------
> and even if i run manually kmod's postinstall scripts i only got one error:
> -------------------------------
> [root@eagle ~]# modules=( $(find
> /lib/modules/3.10.0-229.el7.x86_64/extra/nvidia | grep '\.ko$') )
> [root@eagle ~]# printf '%s\n' "${modules[@]}" | /sbin/weak-modules
> --add-modules
> 
> gzip: /boot/initramfs-3.10.0-229.14.1.el7.x86_64.img: not in gzip format
> 
> gzip: /boot/initramfs-3.10.0-229.14.1.el7.x86_64.tmp: not in gzip format
> -------------------------------
> and probably it's failed to create the new initramfs.
> all what can i do is:
> 
> rmmod nvidia; modprobe nvidia; systemctl restart graphical.target
> 
> and everything works again.
> may be it's a bug in rhel/centos-7's kmod or dracut, but it's strange
> that no one else has the same problem?
> thank you for your help in advance.
> regards.
> 

As you say, the only explanation is that the initramfs image still
contains the old module. This would imply that it's not being updated
during the package update.

As you are probably aware, when a kmod gets updated, kmodtool calls the
weak-modules script which in turn calls dracut to update the initramfs.

So I'll try to replicate this on a fully updated RHEL7 system.

I start off with kmod-nvidia-352.41-1.el7.elrepo installed.

Checking the initramfs image I see the nvidia driver is there, as a
symlink to the actual module in /extra/nvidia/nvidia.ko:

# lsinitrd | grep nvidia
-rw-r--r--   1 root     root          128 May 23  2014
etc/ld.so.conf.d/nvidia.conf
drwxr-xr-x   3 root     root            0 Oct 25 13:24 usr/lib64/nvidia
drwxr-xr-x   2 root     root            0 Oct 25 13:24 usr/lib64/nvidia/tls
drwxr-xr-x   2 root     root            0 Oct 25 13:24
usr/lib/modules/3.10.0-229.14.1.el7.x86_64/weak-updates/nvidia
lrwxrwxrwx   1 root     root           53 Oct 25 13:24
usr/lib/modules/3.10.0-229.14.1.el7.x86_64/weak-updates/nvidia/nvidia.ko
-> ../../../3.10.0-229.el7.x86_64/extra/nvidia/nvidia.ko
drwxr-xr-x   2 root     root            0 Oct 25 13:24
usr/lib/modules/3.10.0-229.el7.x86_64/extra/nvidia
-rw-r--r--   1 root     root     17325151 Aug 29 15:18
usr/lib/modules/3.10.0-229.el7.x86_64/extra/nvidia/nvidia.ko


But because it's not versioned this won't tell us if the initramfs image
gets updated on updating the driver or not.
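
One indirect way to get at the version is to compare what the kernel
reports loading at boot with what is installed on disk. The following is
only a rough sketch: the helper name nvrm_version is made up for
illustration, and it assumes the "NVRM: loading ..." line has the same
format as in your dmesg output above.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the driver
# version from any "NVRM: loading NVIDIA UNIX x86_64 Kernel Module" line.
nvrm_version() {
    sed -n 's/.*NVRM: loading NVIDIA UNIX x86_64 Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}

# Example usage on the affected machine (not executed here):
#   dmesg | nvrm_version          # version(s) the kernel loaded
#   modinfo -F version nvidia     # version of the module installed on disk
```

If the first version printed is older than the one modinfo reports, the
module loaded at boot most likely came from a stale initramfs.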

After updating to kmod-nvidia-352.55-1.el7.elrepo we can check if dracut
has been run on the initramfs images:

# cat /var/log/messages | grep 'Executing: /sbin/dracut'

Oct 25 13:41:46 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-123.el7.x86_64.tmp 3.10.0-123.el7.x86_64
Oct 25 13:42:26 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-229.11.1.el7.x86_64.tmp 3.10.0-229.11.1.el7.x86_64
Oct 25 13:43:17 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-229.14.1.el7.x86_64.tmp 3.10.0-229.14.1.el7.x86_64
Oct 25 13:44:08 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-229.7.2.el7.x86_64.tmp 3.10.0-229.7.2.el7.x86_64
Oct 25 13:44:59 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-229.el7.x86_64.tmp 3.10.0-229.el7.x86_64
Oct 25 13:46:04 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-123.el7.x86_64.tmp 3.10.0-123.el7.x86_64
Oct 25 13:46:42 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-229.11.1.el7.x86_64.tmp 3.10.0-229.11.1.el7.x86_64
Oct 25 13:47:33 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-229.14.1.el7.x86_64.tmp 3.10.0-229.14.1.el7.x86_64
Oct 25 13:48:24 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-229.7.2.el7.x86_64.tmp 3.10.0-229.7.2.el7.x86_64
Oct 25 13:49:15 rhel7 dracut: Executing: /sbin/dracut -f
/boot/initramfs-3.10.0-229.el7.x86_64.tmp 3.10.0-229.el7.x86_64

So as expected we see dracut has been run twice on all kernels, once
during %post as the new kmod package is installed and once during
%postun as the old kmod is uninstalled.
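
If you want to make the same check on your system without counting lines
by eye, something like the following should do. It is a sketch only: the
function name count_dracut_runs is made up, and it assumes each log entry
is a single line ending in the kernel version, as in /var/log/messages
above (the lines there are only wrapped by my mail client).

```shell
# Hypothetical helper: read syslog lines on stdin and print how many times
# dracut was executed for each kernel version (the last field of the entry).
count_dracut_runs() {
    awk '/Executing: \/sbin\/dracut/ { runs[$NF]++ }
         END { for (k in runs) print runs[k], k }' | sort -k2
}

# Example usage (not executed here):
#   count_dracut_runs < /var/log/messages
```

A kernel that is missing from the output, or that shows a lower count
than the others, would be the place to start looking.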

Rebooting after the kmod update proceeds as expected.

So I'm unable to replicate the issue. /var/log/messages clearly shows
dracut has been run during the package update and has rebuilt the
initramfs for each kernel to incorporate the updated nvidia.ko module.

I would suggest you uninstall kmod-nvidia, revert back to nouveau, and
reboot. Then do a fresh install of kmod-nvidia-352.41-1.el7.elrepo and
reboot.

Check the initramfs as I did above. Then update to
kmod-nvidia-352.55-1.el7.elrepo, watching what dracut does, and see if
you can replicate the issue.

If we can work out what's happening, and why, then maybe we can come up
with a solution to make the process more robust.

Note that in the %post script, on first install only, we do:

/usr/bin/dracut --add-drivers nvidia -f /boot/initramfs-$KERNEL.img $KERNEL

to add the nvidia.ko module for each kABI-compatible kernel. Once the
nvidia module is in the initramfs it should then be updated without issue.

Otherwise, you may just be able to fix the issue by manually updating
your initramfs with 'dracut --add-drivers nvidia'.
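
For the manual route, here is a small dry-run sketch that only prints the
dracut command for each kernel version you pass it, so you can review the
commands before running them as root. The function name dracut_cmds is
made up for illustration; the command it prints is the same one used in
%post above. Which kernels to pass is up to you; it does not attempt to
work out kABI compatibility.

```shell
# Hypothetical helper: print (do not run) one dracut command per kernel
# version given as an argument.
dracut_cmds() {
    for k in "$@"; do
        printf '/usr/bin/dracut --add-drivers nvidia -f /boot/initramfs-%s.img %s\n' "$k" "$k"
    done
}

# Example usage (not executed here):
#   dracut_cmds "$(uname -r)"
```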



