[elrepo] Nvidia Driver Performance: El Repo vs. NVidia's .run file
Manuel Wolfshant
wolfy at nobugconsulting.ro
Tue Apr 26 04:47:09 EDT 2016
On 04/25/2016 10:21 PM, Phil Perry wrote:
> On 25/04/16 18:09, EBradley at williams-int.com wrote:
>> Hello Manuel and Phil,
>>
>> Thank you both for the prompt replies! Before I provide the info
>> requested by Manuel I wanted to share something I discovered in the
>> interim. On my test machine (still using the el repo drivers) I
>> realized that the terminal window from which Ansys Mechanical was
>> launched contained an error message I hadn't seen earlier:
>>
>> libGL error: failed to load driver: swrast
>>
>> This error message is not present when launching the program from the
>> user's machine mentioned earlier, as it is now using the .run driver
>> from Nvidia.com. Googling led me to a few different sites, one of
>> which recommended checking out the symbolic links in /usr/lib64;
>> specifically those regarding libGL.so.1. Upon doing so, and in
>> comparing my machine to the user's machine, I realized there were
>> some pretty obvious differences between the two. To start,
>> /usr/lib64/libGL.so.1 on the user's machine is a symbolic link to
>> libGL.so.361.42 whereas on my machine it links to libGL.so.1.2.0. The
>> libGL.so.361.42 file on my machine is contained in /usr/lib64/nvidia,
>> a folder which is not present on the user's machine. This folder
>> contains, as one would expect, many Nvidia-specific library files and
>> symbolic links that appear to be contained directly in /usr/lib64 on
>> the user's machine. When I modified my /usr/lib64/libGL.so.1 symbolic
>> link to point to /usr/lib64/nvidia/libGL.so.361.42, the part
>> was displayed as expected in Ansys Mechanical and the error message
>> re: swrast was also absent from the terminal window.
>>
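For comparing two machines as described above, a small sketch like the
following may help. It resolves each libGL.so.1 symlink to the real file it
ends at. The helper name resolve_lib is made up for illustration, and the two
paths are simply the ones mentioned in this thread; adjust them as needed.

```shell
# Resolve a symlink chain to the real file it ends at.
# (resolve_lib is a hypothetical helper name, not part of any package.)
resolve_lib() {
    readlink -f "$1"
}

# Paths taken from this thread; either one may be absent on a given machine.
for lib in /usr/lib64/libGL.so.1 /usr/lib64/nvidia/libGL.so.1; do
    if [ -e "$lib" ]; then
        printf '%s -> %s\n' "$lib" "$(resolve_lib "$lib")"
    fi
done
```

Running it on both machines and diffing the output should make the kind of
difference Evan describes easy to spot.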
>> While this is good, I have to assume that my fix of reconfiguring the
>> symbolic link for libGL.so.1 isn't suitable for long-term production
>> use as the next update to the el repo driver or kernel will most
>> likely overwrite my changes. I am also a bit concerned that there are
>> other applications waiting to call upon some other library or
>> symbolic link that is missing/misconfigured. The differences between
>> /usr/lib64 on my machine and the user's seem too great to be
>> ignored. I'm not sure if this gives you enough to go on in
>> identifying a bug so please let me know if there's anything else I
>> can provide, or if you'd still like to see the info originally
>> requested by Manuel.
>>
>> Thanks,
>>
>>
>> Evan
>
> Ah, the way we install and handle libGL is one major difference
> between the elrepo driver and the nvidia installer.
>
> The nvidia drivers use their own libGL library. The nvidia installer
> backs up the original distro file in /usr/lib{64}/ and replaces it
> with the nvidia version of libGL. This approach works fine until the
> distro updates the mesa-libGL package thus overwriting the nvidia
> library and breaking the installation.
>
> To better solve this issue we install all the nvidia libs to a
> separate nvidia dir /usr/lib{64}/nvidia/ and then update the lib path
> in /etc/ld.so.conf.d/nvidia.conf
>
> cat /etc/ld.so.conf.d/nvidia.conf
> /usr/lib64/nvidia
> /usr/lib64/vdpau
> /usr/lib/nvidia
> /usr/lib/vdpau
>
> So your system should be using the nvidia copy of libGL in
> /usr/lib{64}/nvidia/, not the distro copy in /usr/lib{64}/ (assuming
> you have the elrepo drivers installed)
>
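One way to check that the directories listed in nvidia.conf were actually
picked up is to look at the dynamic linker cache. The sketch below filters
the output of "ldconfig -p" for libGL.so.1 entries. The function name
libgl_cache_entries is made up for illustration, and the entry format in the
comment is what ldconfig typically prints, so treat this as a rough check
rather than a definitive test.

```shell
# Extract the file path of every libGL.so.1 entry from `ldconfig -p` output.
# Entries typically look like:
#   libGL.so.1 (libc6,x86-64) => /usr/lib64/nvidia/libGL.so.1
# (libgl_cache_entries is a hypothetical helper name.)
libgl_cache_entries() {
    awk '$1 == "libGL.so.1" { print $NF }'
}

# ldconfig usually lives in /sbin, which may not be on a normal user's PATH.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | libgl_cache_entries
fi
```

If no path under /usr/lib64/nvidia shows up, running ldconfig as root to
rebuild the cache and checking again would be a reasonable next step.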
> You can confirm this by using the ldd command to see which version a
> program is linked against. For example,
>
>
> # ldd /usr/bin/glxgears | grep libGL
> libGL.so.1 => /usr/lib64/nvidia/libGL.so.1 (0x0000003496200000)
>
> and we see it's correctly linked against the nvidia libGL.
>
> Please try running the above on a few programs including the program
> you are having issues with and let us know which copy of libGL is
> being used.
>
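To run the ldd check above over several programs in one go, something along
these lines could be used. It is only a sketch: the function name libgl_path
is made up for illustration, and the programs to check are passed as
arguments, so nothing here assumes where Ansys Mechanical is installed.

```shell
# Pull the resolved libGL.so.1 path out of ldd output read from stdin.
# ldd prints lines such as:
#   libGL.so.1 => /usr/lib64/nvidia/libGL.so.1 (0x0000003496200000)
# (libgl_path is a hypothetical helper name.)
libgl_path() {
    awk '$1 == "libGL.so.1" && $2 == "=>" { print $3 }'
}

# Report, for each program given on the command line, which libGL it uses.
# The part after the colon is empty if the program does not link to libGL.
for prog in "$@"; do
    printf '%s: %s\n' "$prog" "$(ldd "$prog" 2>/dev/null | libgl_path)"
done
```

Saved as a script, it can be given /usr/bin/glxgears plus the binary of the
application that misbehaves. Note that a launcher script will show nothing
useful under ldd; the real ELF binary has to be passed instead.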
> Given your workaround fixes the issue, it would appear to be a bug in
> your application which appears to be using the wrong libGL (the above
> should confirm this).
>
> As you correctly identify, your workaround will work fine until the
> distro updates the libGL.so.1 lib and symlinks, which is exactly the
> case for the nvidia .run installer and the very reason we don't
> package the files that way.
..., and now we all know the reason for my request to see the Xorg.0.logs :)