[elrepo] Problem with CUDA since 331.67.elrepo

Michael Lampe mlampe0 at googlemail.com
Sat Apr 26 13:45:23 EDT 2014


Hi Phil,

Phil Perry wrote:
> Is this anything nvidia-modprobe can handle? Should we be packaging this
> file (I don't believe we are at present)?

nvidia-modprobe is a setuid-root wrapper around modprobe, plus a few
other things. It's called whenever the nvidia driver finds a problem
with modules or device files, and it will surely rectify everything for
whoever calls it. Packaging it would be more or less equivalent to
packaging a setuid-root shell as bash-nvidia.

The other device files are created by udev. They are listed in 
/etc/udev/makedev.d/60-nvidia.nodes for static creation and the rules 
for doing so are already supplied by the distro in 
/etc/makedev.d/01linux-2.6.x:

c $CONSOLE             195   0  1 255 nvidia%d
c $CONSOLE             195 255  1   1 nvidiactl

Finally, nvidia.ko gets loaded because it registers with alias 
char-major-195-*. Permissions of the device files are then further 
handled by pam with the glob /dev/nvidia* in 
/etc/security/console.perms.d/50-default.perms so that whoever is logged 
in to the console has r/w access.

The pam stuff continues to work automatically with the new device file,
but device file creation and module loading cannot be done in this way
because there is no fixed major number: nvidia-uvm is assigned a free
one dynamically, and only _after_ loading. Also, the only udev event
generated after loading is add@/module/nvidia-uvm, which is discarded in
/etc/udev/rules.d/05-udev-early.rules because nothing useful can be done
with it anyway. So udev is completely out of the game.
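
Since the major number is only known once the module is loaded, it has
to be read back from /proc/devices. A minimal sketch of that lookup
follows; the function name chr_major and its optional FILE argument are
made up for illustration and are not part of any package. It matches the
device name exactly and tolerates the leading padding that /proc/devices
puts in front of majors below 100.

```shell
#!/bin/sh
# Print the character-device major number registered under NAME.
# Usage: chr_major NAME [FILE]   (FILE defaults to /proc/devices)
# Returns non-zero if NAME is not found in the character section.
chr_major() {
    awk -v name="$1" '
        /^Block devices:/ { exit }                  # stop before the block section
        $2 == name { print $1; found = 1; exit }    # exact name match
        END { exit !found }
    ' "${2:-/proc/devices}"
}
```

With nvidia-uvm loaded, "chr_major nvidia-uvm" should print whatever
major the kernel handed out for this boot.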

Hence:

$ cat /etc/modprobe.d/nvidia
options nvidia NVreg_ModifyDeviceFiles=0
install nvidia /sbin/modprobe --ignore-install nvidia && /sbin/modprobe nvidia-uvm && /bin/mknod -m 0600 /dev/nvidia-uvm c `/bin/grep nvidia-uvm /proc/devices | /usr/bin/cut -d' ' -f1` 0 && /sbin/pam_console_apply /dev/nvidia-uvm
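
For readability, the node-creation part of that install line can also be
written out as a small shell function. This is only a sketch under
assumptions: the function name make_uvm_node and the DEV, PROC_DEVICES
and RUN variables are invented here, and the "rm -f" step is an addition
that the one-liner does not have (mknod fails if the node already
exists, e.g. after the module has been unloaded and loaded again).

```shell
#!/bin/sh
# Hypothetical helper: create /dev/nvidia-uvm once nvidia-uvm is loaded.
# DEV, PROC_DEVICES and RUN are illustrative knobs, not standard settings.
# Set RUN=echo to print the commands instead of running them.
DEV=${DEV:-/dev/nvidia-uvm}
PROC_DEVICES=${PROC_DEVICES:-/proc/devices}
RUN=${RUN:-}

make_uvm_node() {
    # The major is assigned at load time, so look it up by name.
    major=$(awk '$2 == "nvidia-uvm" { print $1; exit }' "$PROC_DEVICES")
    if [ -z "$major" ]; then
        echo "nvidia-uvm not listed in $PROC_DEVICES" >&2
        return 1
    fi
    # Drop a stale node first (assumption: the major may have changed),
    # then create the node and let pam_console set the permissions.
    $RUN rm -f "$DEV" &&
    $RUN /bin/mknod -m 0600 "$DEV" c "$major" 0 &&
    $RUN /sbin/pam_console_apply "$DEV"
}
```

Calling make_uvm_node with RUN=echo is a harmless way to see which
commands it would run on a given machine.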

-Michael

PS: I also strongly suggest 'options nvidia
NVreg_ModifyDeviceFiles=0', because up to now the X driver/old kernel
module _has been_ tinkering with device file permissions behind pam's
back.


