[elrepo] Should I be able to get a kernel dump from kernel-lt on CentOS 5.X

Dag Wieers dag at wieers.com
Thu Apr 17 12:31:37 EDT 2014


On Wed, 16 Apr 2014, Antonio Dupont wrote:

> I've installed  kernel-lt-3.2.57-1.el5.elrepo.i686.rpm on my CentOS 5.5
> system because I'm trying to troubleshoot a third party USB hardware issue
> that causes the system to hang.  If I issue the commands:
>
> echo 1 > /proc/sys/kernel/sysrq
> echo "c" > /proc/sysrq-trigger
>
> The system looks like it's starting to capture the dump information, but
> it complains about many kernel-module version mismatches, and the kernel
> version it reports for the module is cryptic.  For example:
>
> Mounting sysfs filesystem
> Creating /dev
> Creating initial device nodes
> Loading scsi_mod.ko module
> insmod: kernel-module version mismatch
> /lib/modules/3.2.57-1.el5.elrepo/scsi_mod.ko was compiled for kernel
> version M?**=a while this kernel is version 3.2.57-1.el5.elrepo
>
> The M?**=a is the cryptic part.  The ** are actually supposed to be boxes,
> but I don't know how to make that symbol.
>
> With the latest CentOS 5.X debug kernel (kernel-debug-2.6.18-371.6.1.el5)
> when I issue those commands kdump starts and captures system information in
> a vmcore file.
>
> Does kernel-lt-3.2.57-1.el5.elrepo have the capability to capture kernel
> dump information?  If so, are there any apparent indications of what I am
> doing wrong?  If not, I will work on compiling my own 3.2.57 kernel with
> the necessary parameters.  Any suggestions are appreciated.

Reading up on the details, this does not strike me as a problem related 
to the lack of a debug kernel. kdump works fine with normal kernels on 
RHEL. (Debug kernels are just normal kernels with additional debug 
functionality added, which may help you track down some strange 
kernel-related bug, but which also slows the kernel down.) Do not confuse 
this with kernel-debuginfo, which BTW is also not needed on RHEL for a 
working kdump, though it may be useful for post-mortem analysis of your 
vmcore files.

For whatever reason, the scsi_mod.ko kernel module inside the kdump 
initrd does not seem to match the kernel that is booted once the system 
crashes. That would be weird, as I assume the kernel boots fine and loads 
the same module without a glitch on a normal boot. The cryptic part does 
look really odd.

I haven't tried -lt kernels myself, but it is possible that these kernels 
need different parameters to enable kdump. (On RHEL6 you can, for example, 
use crashkernel=auto, whereas RHEL5 needs a specific offset and size based 
on your physical RAM.)
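
To see which crashkernel= value a kernel was actually booted with, 
something like the following sketch may help. The function name 
crashkernel_arg is made up for illustration, and the 128M@16M value in the 
usage comment is only an example of the size@offset form, not a 
recommendation for this system.

```shell
# Print the value of the crashkernel= parameter found in a kernel command
# line string. Returns non-zero (and prints nothing) if it is absent.
# Note: the string is split on whitespace, which is fine for typical
# kernel command lines.
crashkernel_arg() {
    for word in $1; do
        case "$word" in
            crashkernel=*)
                printf '%s\n' "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Usage on a live system (reads the running kernel's command line):
#   crashkernel_arg "$(cat /proc/cmdline)" || echo "no crashkernel= set"
#
# Usage on a literal string (illustrative values only):
#   crashkernel_arg 'ro root=/dev/sda1 crashkernel=128M@16M'
```

If nothing is printed for the running kernel, no memory was reserved for 
the capture kernel and kdump cannot work regardless of the initrd contents.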

So I would investigate three things: what the kdump documentation for 
this specific kernel version instructs you to do; the version information 
of this specific scsi_mod kernel module (try modinfo); and whether 
something unusual is going on with the kdump initrd that is created when 
you run /etc/init.d/kdump restart (you can delete it from /boot and have 
it recreated if need be).
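
The module check suggested above could be sketched roughly as follows. 
The helper name vermagic_matches is hypothetical. The initrd file name and 
its gzip-compressed cpio format in the comments are assumptions about how 
the CentOS 5 kdump service names and packs its image, so verify both on 
the actual system.

```shell
# Succeed if the kernel release embedded in a module's vermagic string
# (its first space-separated field) equals the given kernel release.
vermagic_matches() {
    vermagic=$1
    release=$2
    [ "${vermagic%% *}" = "$release" ]
}

# Usage on a live system:
#   vermagic_matches "$(modinfo -F vermagic scsi_mod)" "$(uname -r)" \
#       && echo "scsi_mod matches the running kernel" \
#       || echo "scsi_mod does NOT match the running kernel"
#
# Listing the modules packed into the kdump initrd (file name and format
# are assumptions, see above):
#   zcat "/boot/initrd-$(uname -r)kdump.img" | cpio -it | grep '\.ko'
#
# Forcing the initrd to be rebuilt:
#   rm "/boot/initrd-$(uname -r)kdump.img"
#   /etc/init.d/kdump restart
```

If the installed module matches the running kernel, the mismatch is more 
likely to come from how the kdump initrd loads the module than from the 
module file itself.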

BTW, comparing my -ml kernel to the official RHEL kernel, I can see that 
Red Hat's crashkernel=auto implementation adds a specific 
CONFIG_KEXEC_AUTO_RESERVE config option:

----
[root@moria ~]# grep -C5 CONFIG_CRASH_DUMP /boot/config-2.6.32-431.11.2.el6.x86_64
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_KEXEC_AUTO_RESERVE=y
CONFIG_CRASH_DUMP=y
CONFIG_KEXEC_JUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
[root@moria ~]# grep -C5 CONFIG_CRASH_DUMP /boot/config-3.14.0-1.el6.elrepo.x86_64
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_KEXEC_JUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
----

And the documentation at 
/usr/share/doc/kernel-ml-doc-3.14.0/Documentation/kdump/kdump.txt does not 
indicate that crashkernel=auto is a valid option in kernel 3.14.0 (not 
sure if that is still correct though). At least it never was a valid 
option in RHEL5.
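
The config comparison above can be narrowed to just the kexec/kdump 
options with a small helper. This is only a sketch: config_report is a 
made-up name, and it assumes the usual /boot/config-* format of one 
OPTION=value per line.

```shell
# For each requested option, print its "OPTION=value" line from a kernel
# config file, or "OPTION is not set" if no such line exists.
config_report() {
    config=$1
    shift
    for opt in "$@"; do
        if line=$(grep "^${opt}=" "$config"); then
            printf '%s\n' "$line"
        else
            printf '%s is not set\n' "$opt"
        fi
    done
}

# Usage with the two config files compared above:
#   for cfg in /boot/config-2.6.32-431.11.2.el6.x86_64 \
#              /boot/config-3.14.0-1.el6.elrepo.x86_64; do
#       echo "== $cfg"
#       config_report "$cfg" CONFIG_KEXEC CONFIG_KEXEC_AUTO_RESERVE \
#                            CONFIG_CRASH_DUMP
#   done
```

The same call against the config file of the -lt kernel in question would 
show whether CONFIG_KEXEC and CONFIG_CRASH_DUMP are enabled there at all.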

-- 
-- dag wieers, dag at wieers.com, http://dag.wieers.com/
-- dagit linux solutions, contact at dagit.net, http://dagit.net/

[Any errors in spelling, tact or fact are transmission errors]

