[elrepo] Problem with CUDA since 331.67.elrepo

Thu Apr 24 19:31:36 EDT 2014

NVIDIA now ships a second kernel module, 'nvidia-uvm', that needs to be
built and packaged too. Both older CUDA versions and the new 6.0 are
affected.

While that's easy to fix, there is an annoying little problem. The new
driver module comes with a new device file, '/dev/nvidia-uvm', and it
picks an arbitrary unused major number at load time. So you cannot have
modprobe load it automatically when the device node is accessed, because
you cannot create the device node before the module is loaded. You also
cannot use udev in any way, because this module creates no data in sysfs
and therefore generates no udev events.

But the usage pattern is that CUDA programs will access 
'/dev/nvidia-uvm' and expect everything to be already in place.

The only solution I found is to piggyback the new module + device 
creation on the old one:

$ cat /etc/modprobe.d/nvidia
options nvidia NVreg_ModifyDeviceFiles=0
install nvidia /sbin/modprobe --ignore-install nvidia; \
  /sbin/modprobe nvidia-uvm; \
  /bin/mknod -m 0600 /dev/nvidia-uvm c \
    `/bin/grep nvidia-uvm /proc/devices | /usr/bin/cut -d' ' -f1` 0; \
  /sbin/pam_console_apply /dev/nvidia-uvm

This works because '/proc/devices' reveals which major number the new
module picked. Permissions are set and (automatically) maintained with
the same logic that applies to the other /dev/nvidia* files.
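
For illustration, the major lookup and node creation could also live in
a small helper script instead of a one-liner. This is only a sketch
under assumptions: the function names find_major and create_uvm_node are
made up for this example, and it matches the driver name exactly in the
second column of '/proc/devices' rather than using grep, so that a
lookup for 'nvidia' would not accidentally hit 'nvidia-uvm'.

```shell
#!/bin/sh
# Sketch only: find_major and create_uvm_node are hypothetical helper
# names, not part of the NVIDIA driver or the elrepo packages.

# Print the major number registered under driver name $1.
# $2 is the file to read; it defaults to /proc/devices.
find_major() {
    name=$1
    devices_file=${2:-/proc/devices}
    # Lines look like "248 nvidia-uvm"; compare the name column exactly.
    awk -v n="$name" '$2 == n { print $1; exit }' "$devices_file"
}

# Create /dev/nvidia-uvm with the major the module picked.
# Needs root, and the nvidia-uvm module must already be loaded.
create_uvm_node() {
    major=$(find_major nvidia-uvm)
    if [ -z "$major" ]; then
        echo "nvidia-uvm not listed in /proc/devices" >&2
        return 1
    fi
    # Skip mknod if the character device node is already there.
    [ -c /dev/nvidia-uvm ] || /bin/mknod -m 0600 /dev/nvidia-uvm c "$major" 0
}
```

The install line would then call such a script after 'modprobe
nvidia-uvm'. Unlike the one-liner, it bails out with a message if the
module did not register a major, instead of running mknod with an empty
argument.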

What do you think? If nobody comes up with a better idea, I'll post the 
complete patch.

-Michael