[elrepo] Nvidia Driver Performance: El Repo vs. NVidia's .run file

Manuel Wolfshant wolfy at nobugconsulting.ro
Tue Apr 26 04:47:09 EDT 2016


On 04/25/2016 10:21 PM, Phil Perry wrote:
> On 25/04/16 18:09, EBradley at williams-int.com wrote:
>> Hello Manuel and Phil,
>>
>> Thank you both for the prompt replies! Before I provide the info 
>> requested by Manuel I wanted to share something I discovered in the 
>> interim. On my test machine (still using the el repo drivers) I 
>> realized that the terminal window from which Ansys Mechanical was 
>> launched contained an error message I hadn't seen earlier:
>>
>> libGL error: failed to load driver: swrast
>>
>> This error message is not present when launching the program from the 
>> user's machine mentioned earlier, as it is now using the .run driver 
>> from Nvidia.com. Googling led me to a few different sites, one of 
>> which recommended checking out the symbolic links in /usr/lib64; 
>> specifically those regarding libGL.so.1. Upon doing so, and in 
>> comparing my machine to the user's machine, I realized there were 
>> some pretty obvious differences between the two. To start, 
>> /usr/lib64/libGL.so.1 on the user's machine is a symbolic link to 
>> libGL.so.361.42 whereas on my machine it links to libGL.so.1.2.0. The 
>> libGL.so.361.42 file on my machine is contained in /usr/lib64/nvidia, 
>> a folder which is not present on the user's machine. This folder 
>> contains, as one would expect, many Nvidia-specific library files and 
>> symbolic links that appear to be contained directly in /usr/lib64 on 
>> the user's machine. When I modified my /usr/lib64/libGL.so.1 symbolic 
>> link to point to /usr/lib64/nvidia/libGL.so.361.42, the part was 
>> displayed as expected in Ansys Mechanical and the error message re: 
>> swrast was also absent from the terminal window.
>>
>> While this is good, I have to assume that my fix of reconfiguring the 
>> symbolic link for libGL.so.1 isn't suitable for long-term production 
>> use as the next update to the elrepo driver or kernel will most 
>> likely overwrite my changes. I am also a bit concerned that there are 
>> other applications waiting to call upon some other library or 
>> symbolic link that is missing/misconfigured. The differences between 
>> /usr/lib64 on my machine and the user's seem too great to be 
>> ignored. I'm not sure if this gives you enough to go on in 
>> identifying a bug so please let me know if there's anything else I 
>> can provide, or if you'd still like to see the info originally 
>> requested by Manuel.
>>
>> Thanks,
>>
>>
>> Evan
>
> Ah, the way we install and handle libGL is one major difference 
> between the elrepo driver and the nvidia installer.
>
> The nvidia drivers use their own libGL library. The nvidia installer 
> backs up the original distro file in /usr/lib{64}/ and replaces it 
> with the nvidia version of libGL. This approach works fine until the 
> distro updates the mesa-libGL package thus overwriting the nvidia 
> library and breaking the installation.
>
> To better solve this issue we install all the nvidia libs to a 
> separate nvidia dir /usr/lib{64}/nvidia/ and then update the lib path 
> in /etc/ld.so.conf.d/nvidia.conf
>
> cat /etc/ld.so.conf.d/nvidia.conf
> /usr/lib64/nvidia
> /usr/lib64/vdpau
> /usr/lib/nvidia
> /usr/lib/vdpau
>
> So your system should be using the nvidia copy of libGL in 
> /usr/lib{64}/nvidia/, not the distro copy in /usr/lib{64}/ (assuming 
> you have the elrepo drivers installed)
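
One way to see what the linker cache actually holds for libGL is to filter the output of "ldconfig -p". The helper below is only a sketch: the function name cached_paths is made up for illustration, and it assumes the usual "soname (flags) => path" layout that ldconfig prints. With the elrepo packages installed, the /usr/lib64/nvidia entry would typically be listed ahead of the distro copy.

```shell
# Hypothetical helper: print every path the linker cache lists for a
# given soname. Expects `ldconfig -p` style output on stdin.
cached_paths() {
    # Field 1 is the soname; the last field is the resolved path.
    awk -v lib="$1" '$1 == lib { print $NF }'
}

# Usage (as root, or with /sbin in PATH):
#   ldconfig -p | cached_paths libGL.so.1
```

If only the /usr/lib64 copy shows up, re-running ldconfig after checking /etc/ld.so.conf.d/nvidia.conf would be the first thing to try.
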
>
> You can confirm this by using the ldd command to see which version a 
> program is linked against. For example,
>
>
> # ldd /usr/bin/glxgears | grep libGL
>         libGL.so.1 => /usr/lib64/nvidia/libGL.so.1 (0x0000003496200000)
>
> and we see it's correctly linked against the nvidia libGL.
>
> Please try running the above on a few programs including the program 
> you are having issues with and let us know which copy of libGL is 
> being used.
>
> Given your workaround fixes the issue, it would appear to be a bug in 
> your application which appears to be using the wrong libGL (the above 
> should confirm this).
>
> As you correctly identify, your workaround will work fine until the 
> distro updates the libGL.so.1 lib and symlinks, which is exactly the 
> case for the nvidia .run installer and the very reason we don't 
> package the files that way. 
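
To run the ldd check Phil describes over several binaries in one go, something along these lines may help. It is a rough sketch, not a tested tool: the function names libgl_path and check_libgl are invented here, and it relies on ldd printing lines in the "libGL.so.1 => /path (address)" form shown above.

```shell
# Hypothetical helper: pull the resolved libGL path out of ldd output
# read on stdin.
libgl_path() {
    awk '$1 ~ /^libGL\.so/ && $2 == "=>" { print $3 }'
}

# Hypothetical helper: report which libGL each given binary resolves to.
check_libgl() {
    for bin in "$@"; do
        path=$(ldd "$bin" 2>/dev/null | libgl_path)
        printf '%s: %s\n' "$bin" "${path:-no libGL dependency found}"
    done
}

# Usage:
#   check_libgl /usr/bin/glxgears /path/to/your/application
```

Note that ldd only works on ELF binaries, so if the application is started through a wrapper script the check has to be pointed at the real executable behind it.
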



..., and now we all know the reason for my request to see the Xorg.0.logs :)



