[elrepo-devel] We have fglrx packages and X does not behave. How to proceed ?

Manuel Wolfshant wolfy at nobugconsulting.ro
Thu Dec 24 03:43:58 EST 2015


On 12/23/2015 11:59 PM, Phil Perry wrote:
> On 23/12/15 12:48, Manuel Wolfshant wrote:
>
> Hi Wolfy,
>
>> Hello
>>
>>      As some of you already know, I have been the maintainer of the
>> fglrx (AMD video drivers ) packages for some time.
>>      Things look fine on EL6 ( at least I have not heard any
>> complaints ) but on EL7 there is a problem which seems to occur more or
>> less randomly. I for one was not able to identify the pattern triggering
>> it ( and to make things worse, I still do not have an actual EL7 system
>> to test with my own set of hands and eyes ).  To cut a long story short,
>> despite the presence of a file named /etc/X11/xorg.conf.d/20-fglrx.conf
>> which contains
>>      Section "Files"
>>          ModulePath   "/usr/lib64/xorg/modules/extensions/fglrx"
>>          ModulePath   "/usr/lib64/xorg/modules"
>>      EndSection
>> Xorg randomly decides to load
>> /usr/lib64/xorg/modules/extensions/libglx.so rather than
>> /usr/lib64/xorg/modules/extensions/fglrx/libglx.so . This leads to:
>> [    38.173] (II) "glx" will be loaded by default.
>> [    38.173] (II) LoadModule: "glx"
>> [    38.174] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so
>> which later on leads to a SIG11 given that the fglrx.so hates stock
>> libglx.so
>>
> If this were an Xorg bug then you'd expect the same (or similar)
> behaviour with the nvidia driver, which potentially has the exact same
> issue?
Right, I would. And I know that it does not happen, which puzzles me


> I install /etc/X11/xorg.conf.d/99-nvidia.conf:
>
> Section "Files"
> 	ModulePath   "/usr/lib64/xorg/modules/extensions/nvidia"
> 	ModulePath   "/usr/lib64/xorg/modules"
> EndSection
To be honest, my 20-fglrx.conf is the younger son ( or daughter? ) of 
your 99-nvidia.conf. I simply modified your file, after cloning the 
github repo.


> so the only difference there being the numbering, 99-nvidia.conf vs
> 20-fglrx.conf
Right. But given that the only other existing file is 00-keyboard.conf, 
one would expect the same behaviour of 20-fglrx.conf and 99-fglrx.conf, 
right ?


>
> For what it's worth I have never seen this issue with the nvidia driver.
> If you do think you have a reproducer then I have real (nvidia) hardware
> and could certainly test.
Unfortunately I do not have a reproducer. I have seen the behaviour I 
reported on several occasions (on my system, on CentOS 7.0 exactly one 
year ago but at the time I did not pinpoint the issue; during the last 
2 months it also happened to several people, some of whom reported it on 
the other elrepo list); however the same config worked on other 
systems, including on my test PC. It's as if the config files AND the 
defaults are evaluated in parallel and there is a race "somewhere". 
However given the complexity of the X code and my lack of time ( and 
interest, to be fully honest ) I will not dig in the source.
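
To tell the two outcomes apart on an affected machine, a small check of the X server log can help. This is only a minimal sketch and not part of the packages: the function name which_glx is made up for illustration, and the default log path /var/log/Xorg.0.log is an assumption that may not hold on every setup.

```shell
# Report which libglx.so the X server loaded, based on its log.
# Prints "fglrx", "stock", or "unknown".
# Assumption: the log lives at /var/log/Xorg.0.log unless a path is given.
which_glx() {
    log="${1:-/var/log/Xorg.0.log}"
    # Use the last "Loading .../libglx.so" line in case there are several.
    line=$(grep 'Loading .*libglx\.so' "$log" 2>/dev/null | tail -n 1)
    case "$line" in
        */extensions/fglrx/libglx.so*) echo fglrx ;;
        */extensions/libglx.so*)       echo stock ;;
        *)                             echo unknown ;;
    esac
}
```

Running which_glx with no argument on a system showing the crash should print "stock" if the log looks like the excerpt quoted above.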


>>      Now, the obvious solution would be the one originally implemented by
>> AMD (and disabled in elrepo packages ): renaming the stock libglx.so to
>> something else when the package is installed and restoring it at removal
>> time. However this leaves the system prone to issues if/when
>> xorg-x11-server-Xorg is updated given that the file will get recreated.
>>
>>      Therefore I come to you with this question: beside trying to file a
>> bug against RHEL/xorg  and asking for their help in identifying the
>> reason why the behaviour of X is erratic, what can I ( emphasis on I )
>> do to make the users of the fglrx packages on EL7 happier  ? I've
>> never used it before in my 9 years of packaging but I wonder if
>> something along
>>
>>              %triggerin  -- xorg-x11-server-Xorg
>>                      mv /usr/lib64/xorg/modules/extensions/libglx.so
>> /usr/lib64/xorg/modules/extensions/libglx.so.elrepo
>>
>> (and the corresponding %triggerun to restore the file ) would be fine
>> for this case.
>>
>>      Thoughts, anyone ?
>>
> This is exactly the type of case you would use %triggerin and I would
> expect it to work as described. It's not ideal (IMHO) and almost a last
> resort, but if it's the kind of brute force approach required to get it
> to work then feel free to use it.
>
> Refining a little:
>
> %triggerin -- xorg-x11-server-Xorg
> [ -f %{_libdir}/xorg/modules/extensions/libglx.so ] && \
>    mv %{_libdir}/xorg/modules/extensions/libglx.so \
>    %{_libdir}/xorg/modules/extensions/libglx.so.elrepo &>/dev/null
Modulo an ending " || : ", that is exactly what I had in mind. But I am 
trying to avoid this just because I was taught it is a "last resort, 
brute force" approach and I was looking for a polite alternative.
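
For illustration, here is the same rename with the trailing " || : " written as a plain shell function, so it can be tried outside a spec file. This is a sketch only: the function name hide_stock_glx and its directory argument are hypothetical, and in the spec the path would be %{_libdir}/xorg/modules/extensions. The " || : " makes the whole line finish with exit status 0 even when the file is absent or mv fails.

```shell
# Hypothetical helper: move the stock libglx.so out of the way.
# $1 is the extensions directory (an assumption made for testing; the
# spec file would use %{_libdir}/xorg/modules/extensions instead).
hide_stock_glx() {
    dir="$1"
    # Rename only if the stock file exists; "|| :" forces a zero exit
    # status whether the test or the mv fails.
    [ -f "$dir/libglx.so" ] && \
        mv "$dir/libglx.so" "$dir/libglx.so.elrepo" >/dev/null 2>&1 || :
}
```

Calling it a second time, when libglx.so is already gone, still returns 0, which is the point of the guard.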


>
> I wouldn't worry about using %triggerun - if a user is uninstalling Xorg
> then you don't need to worry about it,
On second thought: unless I remove it at that moment, we'd leave garbage 
behind ( the renamed .so would be left on the system), which is not 
nice. I curse Brother and their linux packages almost daily because both 
at print time and at scan time they fill my /tmp with temporary 
files which they do not remove. I'd rather not do the same.


>   but you *would* need to restore
> the file upon uninstalling the fglrx package(s) in either %preun or
> %postun. But I'm guessing that's probably what you meant :-)
Yes, you are spot on. And "%postun when $1 == 0" would be my choice, if 
I remember correctly the order in which the scriptlets are executed. 
I need to verify, though.
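
A rough sketch of that restore step, again as a plain shell function rather than a finished scriptlet. In %postun, RPM passes in $1 the number of instances of the package that remain after the operation, so 0 means a full erase rather than an upgrade. The function name restore_stock_glx and the directory argument are hypothetical and exist only so the logic can be tried outside rpm.

```shell
# Hypothetical helper mirroring a %postun body.
# $1: instance count as RPM would pass it (0 = package fully erased).
# $2: extensions directory (assumption; the spec would use
#     %{_libdir}/xorg/modules/extensions).
restore_stock_glx() {
    remaining="$1"
    dir="$2"
    # Restore only on a full erase, and only if the renamed file exists.
    if [ "$remaining" -eq 0 ] && [ -f "$dir/libglx.so.elrepo" ]; then
        mv -f "$dir/libglx.so.elrepo" "$dir/libglx.so" >/dev/null 2>&1
    fi
    # Never report failure from the scriptlet.
    return 0
}
```

With a count of 1 (upgrade) the renamed file is left alone; with 0 it is moved back to libglx.so.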



> As a point of note, the nvidia packages I inherited from rpmforge when
> we first started elrepo used a %triggerin script as apparently
> xorg-x11-server-Xorg had a tendency to empty the "Files" section of
> xorg.conf so it needed to be recreated each time. It's still there in
> the current el5 driver:
>
> https://github.com/elrepo/packages/blob/master/nvidia-x11-drv/el5/nvidia-x11-drv.spec
>
> but got removed when I ported the package to el6 upon its release. I
> have no idea if it's still needed, it's been there over 6 years and does
> no harm.
>
good to know. if others did it in the past, I guess that at least I am 
not a lunatic :)
based on your input I'll give the package another spin, using 99-fglrx 
instead of 20-fglrx for the config file name ( just to make sure 
%triggerin is the last resort -- unless others' ideas pop in ) and if 
this fails... %triggerin it is.


thanks for the hint(s).


