[elrepo] Problem with CUDA since 331.67.elrepo
Michael Lampe
mlampe0 at googlemail.com
Fri May 2 13:00:55 EDT 2014
Phil Perry wrote:
> I've merged your patches, and built some testing packages (331.67-2)
> which I can release to the testing repo, but I've come across a small
> issue whilst doing some quick pre-release testing.
>
> On RHEL6, when running glxgears the animation noticeably stutters, it is
> no longer smooth. The fps count is still reported as ~60fps, apparently
> linked to the refresh rate of my panel, but the animation "looks" more
> like 5-10 fps!
>
> Downgrading to 331.67-1 confirmed we appear to have introduced a glitch.
>
> Unloading the nvidia-uvm module had no effect so that does not appear to
> be the cause.
>
> Commenting out the 'NVreg_ModifyDeviceFiles=0' in
> /etc/modprobe.d/nvidia.conf fixed the issue.
>
> Are you able to observe similar behaviour?
>
> I don't observe any issues on RHEL5 where glxgears reports ~11,000fps
> with or without 'NVreg_ModifyDeviceFiles=0'.
Well, I admit to having tested mostly on el5, which works as I
described; see
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Deployment_Guide/s1-pam-console.html
The only el6 machine with nvidia hardware I have available here at work
is a GPU server. It has no video out, so I cannot log in at the console.
It also uses a different mechanism for device file permissions (actually
none: r/w for everyone).
El6 doesn't have /etc/security/console.perms.d/50-default.perms. Instead
it would want to use something like /lib/udev/rules.d/70-acl.rules to add
an ACL for _all_ locally logged-in users.
Options/ideas:
1) Write a udev rule for nvidia's modules. I'm 99% sure this won't work,
because the nvidia driver doesn't populate sysfs and never generates udev
events.
2) Create /etc/security/console.perms.d/50-nvidia.perms with a line like
this:
<console> 0600 /dev/nvidia* 0600 root
Then permissions should be handled as in el5. Multiple simultaneous
logins via X won't work, I guess, because permissions cannot be
accumulated the way entries in an ACL can.
3) Create all devices once r/w for everyone and stick to that.
4) Admit defeat. Remove NVreg_ModifyDeviceFiles=0, install the suid-root
nvidia-modprobe, and let nvidia have their bloody way.
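To make option 3 a bit more concrete, here is a rough sketch that only
prints the mknod commands one would run once as root; it does not create
anything itself. It assumes the usual nvidia numbering (character major
195, /dev/nvidiaN on minor N, /dev/nvidiactl on minor 255). The function
name, the mode 0666 and the GPU count of 2 are made up for illustration.
/dev/nvidia-uvm is left out because, as far as I know, its major number
is assigned dynamically.

```python
# Sketch only: builds mknod command lines for option 3, it never runs them.
# Assumption: nvidia character devices use major 195, /dev/nvidiaN has
# minor N and /dev/nvidiactl has minor 255.
NVIDIA_MAJOR = 195
NVIDIACTL_MINOR = 255


def mknod_commands(gpu_count, mode="0666"):
    """Return mknod commands for /dev/nvidia0..N-1 plus /dev/nvidiactl.

    Hypothetical helper; mode 0666 means r/w for everyone.
    """
    if gpu_count < 0:
        raise ValueError("gpu_count must not be negative")
    cmds = []
    for minor in range(gpu_count):
        # One device node per GPU, minor number equals the GPU index.
        cmds.append("mknod -m %s /dev/nvidia%d c %d %d"
                    % (mode, minor, NVIDIA_MAJOR, minor))
    # The control device is shared by all GPUs.
    cmds.append("mknod -m %s /dev/nvidiactl c %d %d"
                % (mode, NVIDIA_MAJOR, NVIDIACTL_MINOR))
    return cmds


if __name__ == "__main__":
    # gpu_count=2 is an example value, not taken from a real machine.
    for line in mknod_commands(2):
        print(line)
```

With NVreg_ModifyDeviceFiles=0 the driver should then leave these nodes
alone, but they would still have to be recreated on every boot, for
example from an init script.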
Better ideas?
-Michael