[elrepo-devel] We have fglrx packages and X does not behave. How to proceed ?
Phil Perry
phil at elrepo.org
Thu Dec 24 05:32:52 EST 2015
On 24/12/15 08:43, Manuel Wolfshant wrote:
> On 12/23/2015 11:59 PM, Phil Perry wrote:
>> On 23/12/15 12:48, Manuel Wolfshant wrote:
>>
>> Hi Wolfy,
>>
>>> Hello
>>>
>>> As some of you already know, for some time I have been the maintainer
>>> of the fglrx (AMD video drivers) packages.
>>> Things look fine on EL6 ( at least I have not heard about any
>>> complaints ) but on EL7 there is a problem which seems to occur more or
>>> less randomly. I for one was not able to identify the pattern triggering
>>> it ( and to make things worse, I still do not have an actual EL7 system
>>> to test with my own set of hands and eyes ). To cut a long story short,
>>> despite the presence of a file named /etc/X11/xorg.conf.d/20-fglrx.conf
>>> which contains
>>> Section "Files"
>>> ModulePath "/usr/lib64/xorg/modules/extensions/fglrx"
>>> ModulePath "/usr/lib64/xorg/modules"
>>> EndSection
>>> Xorg randomly decides to load
>>> /usr/lib64/xorg/modules/extensions/libglx.so rather than
>>> /usr/lib64/xorg/modules/extensions/fglrx/libglx.so . This leads to:
>>> [ 38.173] (II) "glx" will be loaded by default.
>>> [ 38.173] (II) LoadModule: "glx"
>>> [ 38.174] (II) Loading /usr/lib64/xorg/modules/extensions/libglx.so
>>> which later on leads to a SIG11 given that the fglrx.so hates stock
>>> libglx.so
>>>
>> If this were an Xorg bug then you'd expect the same (or similar)
>> behaviour with the nvidia driver, which potentially has the exact same
>> issue?
> Right, I would. And I know that it does not happen, which puzzles me
>
>
>> I install /etc/X11/xorg.conf.d/99-nvidia.conf:
>>
>> Section "Files"
>> ModulePath "/usr/lib64/xorg/modules/extensions/nvidia"
>> ModulePath "/usr/lib64/xorg/modules"
>> EndSection
> To be honest, my 20-fglrx.conf is the younger son ( or daughter? ) of
> your 99-nvidia.conf. I simply modified your file, after cloning the
> github repo.
>
>
>> so the only difference there being the numbering, 99-nvidia.conf vs
>> 20-fglrx.conf
> Right. But given that the only other existing file is 00-keyboard.conf,
> one would expect the same behaviour of 20-fglrx.conf and 99-fglrx.conf,
> right ?
>
Yes. I'm clutching at straws :-)
>
>>
>> For what it's worth I have never seen this issue with the nvidia driver.
>> If you do think you have a reproducer then I have real (nvidia) hardware
>> and could certainly test.
> Unfortunately I do not have a reproducer. I have seen the behaviour I
> reported on several occasions (on my system, on CentOS 7.0 exactly one
> year ago but at the time I did not pinpoint the issue ; during the last
> 2 months it also happened to several people, some of whom reported on
> the other elrepo list ); however the same config worked on other
> systems, including on my test PC. It's as if the config files AND the
> defaults are evaluated in parallel and there is a race "somewhere".
> However given the complexity of the X code and my lack of time ( and
> interest, to be fully honest ) I will not dig in the source.
>
Nice theory, and would explain your observations :-)
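
If it helps to collect data points from affected users, a rough sketch
like the one below would report which libglx.so X actually picked on a
given start. loaded_glx is just a name I made up, and the default log
path is an assumption that may not hold on every system.

```shell
#!/bin/sh
# Sketch only: loaded_glx is a hypothetical helper name. The default log
# path is an assumption and may differ on a given system.
loaded_glx() {
    log="${1:-/var/log/Xorg.0.log}"
    # Print the path from the last "(II) Loading .../libglx.so" line.
    sed -n 's/.*(II) Loading \(.*\/libglx\.so\).*/\1/p' "$log" | tail -n 1
}
```

A path ending in extensions/fglrx/libglx.so would mean the ModulePath
from the config file was honoured; the plain extensions/libglx.so is
the failing case from your log.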
>
>>> Now, the obvious solution would be the one originally
>>> implemented by
>>> AMD (and disabled in elrepo packages ): renaming the stock libglx.so to
>>> something else when the package is installed and restoring it at removal
>>> time. However this leaves the system prone to issues if/when
>>> xorg-x11-server-Xorg is updated given that the file will get recreated.
>>>
>>> Therefore I come to you with this question: beside trying to file a
>>> bug against RHEL/xorg and asking for their help in identifying the
>>> reason why the behaviour of X is erratic, what can I ( emphasis on I )
>>> do to make the users of the fglrx packages on EL7 happier ? I've
>>> never used it before in my 9 years of packaging but I wonder if
>>> something along
>>>
>>> %triggerin -- xorg-x11-server-Xorg
>>> mv /usr/lib64/xorg/modules/extensions/libglx.so \
>>> /usr/lib64/xorg/modules/extensions/libglx.so.elrepo
>>>
>>> (and the corresponding %triggerun to restore the file ) would be fine
>>> for this case.
>>>
>>> Thoughts, anyone ?
>>>
>> This is exactly the type of case you would use %triggerin and I would
>> expect it to work as described. It's not ideal (IMHO) and almost a last
>> resort, but if it's the kind of brute force approach required to get it
>> to work then feel free to use it.
>>
>> Refining a little:
>>
>> %triggerin -- xorg-x11-server-Xorg
>> [ -f %{_libdir}/xorg/modules/extensions/libglx.so ] && \
>> mv %{_libdir}/xorg/modules/extensions/libglx.so \
>> %{_libdir}/xorg/modules/extensions/libglx.so.elrepo &>/dev/null
> modulo an ending " || : " that's exactly what I had in mind. but I am
> trying to avoid this just because I was taught it's a "last resort -
> brute force" and I was looking for a polite alternative.
>
>
>>
>> I wouldn't worry about using %triggerun - if a user is uninstalling Xorg
>> then you don't need to worry about it,
> 2nd thought: unless I remove it in that moment, we'd leave garbage
> behind ( the renamed .so would be left on the system), which is not
> nice. I curse Brother and their linux packages almost daily because both
> at print time as well as at scan time they fill my /tmp with temporary
> files which they do not remove. I'd rather not do the same.
>
>
>> but you *would* need to restore
>> the file upon uninstalling the fglrx package(s) in either %preun or
>> %postun. But I'm guessing that's probably what you meant :-)
> yes, you are spot on. and "%postun when $1 == 0" would be my choice, if
> I remember correctly the order in which the scriptlets are executed.
> need to verify, though.
>
Yes, as long as you clean up after yourself upon uninstall of your
package - doesn't really matter if you do it at %preun or %postun in
this particular case.
The important thing is that the original libglx.so file is restored and
can be used once the fglrx drivers are uninstalled, otherwise we would
leave Xorg in a broken state.
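
To make the rename and restore halves concrete, here is a minimal,
untested sketch of the two scriptlet bodies as plain shell. The function
names and the EXT_DIR variable are mine, purely for illustration; the
paths are the ones quoted in this thread. Removing a stale .elrepo copy
when a fresh libglx.so is already present is my own assumption about
what you would want, not something we have agreed on.

```shell
#!/bin/sh
# Sketch only: hide_stock_glx, restore_stock_glx and EXT_DIR are
# hypothetical names; the default path is the one quoted in this thread.
EXT_DIR="${EXT_DIR:-/usr/lib64/xorg/modules/extensions}"

# Move the stock libglx.so aside so that X can only find the fglrx copy.
hide_stock_glx() {
    if [ -f "$EXT_DIR/libglx.so" ]; then
        mv -f "$EXT_DIR/libglx.so" "$EXT_DIR/libglx.so.elrepo" >/dev/null 2>&1
    fi
    return 0   # same effect as a trailing "|| :" in a scriptlet
}

# Put the stock libglx.so back. If a libglx.so is already in place
# (assumed to be a newer copy), keep it and drop the stale renamed file.
restore_stock_glx() {
    if [ -f "$EXT_DIR/libglx.so.elrepo" ]; then
        if [ -f "$EXT_DIR/libglx.so" ]; then
            rm -f "$EXT_DIR/libglx.so.elrepo"
        else
            mv -f "$EXT_DIR/libglx.so.elrepo" "$EXT_DIR/libglx.so"
        fi
    fi
    return 0
}
```

In the SPEC file the first body would sit under
"%triggerin -- xorg-x11-server-Xorg" and the second under %postun (or
%preun), guarded by a test such as [ "$1" -eq 0 ] so that it only runs
on a real uninstall and not on an upgrade of the fglrx package.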
>
>
>> As a point of note, the nvidia packages I inherited from rpmforge when
>> we first started elrepo used a %triggerin script as apparently
>> xorg-x11-server-Xorg had a tendency to empty the "Files" section of
>> xorg.conf so it needed to be recreated each time. It's still there in
>> the current el5 driver:
>>
>> https://github.com/elrepo/packages/blob/master/nvidia-x11-drv/el5/nvidia-x11-drv.spec
>>
>>
>> but got removed when I ported the package to el6 upon its release. I
>> have no idea if it's still needed, it's been there over 6 years and does
>> no harm.
>>
> good to know. if others did it in the past, I guess that at least I am
> not a lunatic :)
> based on your input I'll give the package another spin, using 99-fglrx
> instead of 20-fglrx for the config file name ( just to make sure
> %triggerin is the last resort -- unless others' ideas pop in ) and if
> this fails... %triggerin it is.
>
Agreed - sounds like a good plan. The only thing I would add is to maybe
comment why you've added the %triggerin in the SPEC file.
>
> thanks for the hint(s).
>