[elrepo] Problem with CUDA since 331.67.elrepo

Phil Perry phil at elrepo.org
Fri May 2 09:13:04 EDT 2014


On 26/04/14 18:45, Michael Lampe wrote:
> Hi Phil,
>
> Phil Perry wrote:
>> Is this anything nvidia-modprobe can handle? Should we be packaging this
>> file (I don't believe we are at present)?
>
> nvidia-modprobe is a suid root wrapper for modprobe + other things. It's
> called whenever nvidia finds a problem with modules or device files. It
> will surely rectify everything for whoever calls it. Packaging this
> should be more or less equivalent to packaging a setuid root shell as
> bash-nvidia.
>
> The other device files are created by udev. They are listed in
> /etc/udev/makedev.d/60-nvidia.nodes for static creation and the rules
> for doing so are already supplied by the distro in
> /etc/makedev.d/01linux-2.6.x:
>
> c $CONSOLE             195   0  1 255 nvidia%d
> c $CONSOLE             195 255  1   1 nvidiactl
>
> Finally, nvidia.ko gets loaded because it registers with alias
> char-major-195-*. Permissions of the device files are then further
> handled by pam with the glob /dev/nvidia* in
> /etc/security/console.perms.d/50-default.perms so that whoever is logged
> in to the console has r/w access.
>
> The pam stuff continues to work automatically with the new device file,
> but device file creation and module loading cannot be done in this way
> because there is no fixed major number: nvidia-uvm will pick a free one at
> random _after_ loading. Also the only udev event created after loading
> is add@/module/nvidia-uvm which goes down the drain in
> /etc/udev/rules.d/05-udev-early.rules because you cannot do anything
> with it anyway. So udev is completely out of the game.
>
> Hence:
>
> $ cat /etc/modprobe.d/nvidia
> options nvidia NVreg_ModifyDeviceFiles=0
> install nvidia /sbin/modprobe --ignore-install nvidia && /sbin/modprobe
> nvidia-uvm && /bin/mknod -m 0600 /dev/nvidia-uvm c `/bin/grep nvidia-uvm
> /proc/devices | /usr/bin/cut -d' ' -f1` 0 && /sbin/pam_console_apply
> /dev/nvidia-uvm
>
> -Michael
>
> PS: I'm also strongly suggesting 'options nvidia
> NVreg_ModifyDeviceFiles=0', because up to now the X driver/old kernel
> module _is_ tinkering with device file permissions behind pam's back.
>
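
The major-number lookup in the install line above can be sketched as a small shell helper. This is only an illustration, not part of the merged patches: the function name device_major is hypothetical, and awk is swapped in for the grep/cut pair. /proc/devices right-aligns the major number in a three-character column, so cut -d' ' -f1 returns an empty field for majors below 100, and grep matches substrings rather than the exact driver name.

```shell
#!/bin/sh
# Hypothetical helper (not from the original patch): print the major
# number registered for an exact driver name.
#   $1: driver name as it appears in /proc/devices, e.g. nvidia-uvm
#   $2: devices table to read (assumed to default to /proc/devices)
device_major() {
    # awk splits on runs of whitespace, so the leading padding that
    # /proc/devices uses for short numbers does not matter.
    awk -v name="$1" '$2 == name { print $1; exit }' "${2:-/proc/devices}"
}

# Possible use as root, after nvidia-uvm has been loaded:
#   major=$(device_major nvidia-uvm)
#   [ -n "$major" ] && mknod -m 0600 /dev/nvidia-uvm c "$major" 0
```

The helper prints nothing when the driver is not listed, so a caller can test for an empty result before calling mknod.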

Michael,

I've merged your patches, and built some testing packages (331.67-2) 
which I can release to the testing repo, but I've come across a small 
issue whilst doing some quick pre-release testing.

On RHEL6, when running glxgears the animation noticeably stutters; it is 
no longer smooth. The fps count is still reported as ~60fps, apparently 
linked to the refresh rate of my panel, but the animation "looks" more 
like 5-10 fps!

Downgrading to 331.67-1 confirmed we appear to have introduced a glitch.

Unloading the nvidia-uvm module had no effect so that does not appear to 
be the cause.

Commenting out the 'NVreg_ModifyDeviceFiles=0' in 
/etc/modprobe.d/nvidia.conf fixed the issue.
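
When repeating this comparison, it may help to confirm whether the option line is currently active or commented out before reloading the module. A minimal sketch follows; the function name option_is_active is hypothetical and the pattern assumes the usual "options nvidia NAME=value" layout.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an uncommented "options nvidia" line
# in the given modprobe config file sets the named parameter.
#   $1: parameter name, e.g. NVreg_ModifyDeviceFiles
#   $2: config file, e.g. /etc/modprobe.d/nvidia.conf
option_is_active() {
    # A line starting with '#' does not match because the pattern is
    # anchored and allows only whitespace before "options".
    grep -Eq "^[[:space:]]*options[[:space:]]+nvidia[[:space:]]+(.*[[:space:]])?$1=" "$2"
}

# Possible use:
#   if option_is_active NVreg_ModifyDeviceFiles /etc/modprobe.d/nvidia.conf
#   then echo "option set"; else echo "option not set"; fi
```

This only inspects the file; the module still has to be reloaded for a change to take effect.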

Are you able to observe similar behaviour?

I don't observe any issues on RHEL5 where glxgears reports ~11,000fps 
with or without 'NVreg_ModifyDeviceFiles=0'.

Phil





