SLES10 and megaraid_sas dkms

This is a quick write-up of the problem I had (and subsequent solution) with installing the latest version of the megaraid_sas driver (as used by the Dell PowerEdge 1950/2950/etc family for their PERC5 RAID controllers) on SLES 10.

I was recently bitten by a problem with the RAID container on one of my PowerEdge 2950s, made worse by the fact that I had failed to apply some rather urgent firmware updates for the PERC5/i controller card. The PERC5/i is just a Dell-badged LSI SAS RAID card and thus uses the megaraid_sas driver, but I am jumping ahead a bit here. For the sake of those who follow the same naive course as I did, may you find this quickly with Google and save yourself some grief.

Having fallen foul of a failed RAID container, I quickly went out and found the latest firmware updates, which I knew existed but had yet to apply to the failed system. I applied them to that system and to a couple of others that I knew were also in need. Run from a bootable flash drive, the updates went by rather quickly. Yay.

I reboot the system and all seems well, that is until I notice that syslog is running at full load on one CPU. So I look at my /var/log/messages and it is chock full of errors like this, at a rate of roughly 100 lines per second:

May 10 09:30:03 doomed kernel:   status = 1, message = 00, host = 0, driver = 08
May 10 09:30:03 doomed kernel:   <6>sd: Current: sense key: Illegal Request
May 10 09:30:03 doomed kernel:     Additional sense: Invalid command operation code
May 10 09:30:03 doomed kernel: FAILED

So this is very rapidly filling up my /var partition and making me unhappy. Very unhappy.

I thought this was perhaps related to a failed rebuild of my newly initialized RAID container… so I waited for it to finish 1.43TB of RAID building, and it still spewed away. Googling the error message didn’t yield a whole lot in results. Eventually I looked on the Dell site to see if there were any other updates I was missing, and lo and behold, a driver update had been released the day after the PERC5/i firmware update.

I downloaded the driver. It requires Dell’s DKMS (Dynamic Kernel Module Support) package, so I downloaded that as well. Install the DKMS rpm, no problem. Install the megaraid_sas rpm, no problem there either. It installs the module, does a mkinitrd, and waits for me to reboot. I reboot and all is hunky dory. No more crazy errors. I am happy and decide it is now time to rinse and repeat with my other two servers exhibiting the same problem.

No joy. When I go to the next box and install the megaraid_sas rpm I get:

Preparing...                ########################################### [100%]
1:megaraid_sas           ########################################### [100%]
Loading tarball for module: megaraid_sas / version: v00.00.03.09
Loading /usr/src/megaraid_sas-v00.00.03.09...
Creating /var/lib/dkms/megaraid_sas/v00.00.03.09/source symlink...
DKMS: ldtarball Completed.
Kernel preparation unnecessary for this kernel.  Skipping...
Building module:
cleaning build area....
make KERNELRELEASE=2.6.16.27-0.9-smp -C /lib/modules/2.6.16.27-0.9-smp/build SUBDIRS=/var/lib/dkms/megaraid_sas/v00.00.03.09/build modules....(bad exit status: 2)
Error! Bad return status for module build on kernel: 2.6.16.27-0.9-smp (x86_64)
Consult the make.log in the build directory
/var/lib/dkms/megaraid_sas/v00.00.03.09/build/ for more information.
Error! Could not locate megaraid_sas.ko for module megaraid_sas in the DKMS tree.
You must run a dkms build for kernel 2.6.16.27-0.9-smp (x86_64) first.
error: %post(megaraid_sas-v00.00.03.09-1.noarch) scriptlet failed, exit status 4

WTF?!?

It turns out that the first box I updated was actually a pristine install of SLES10 as released. No kernel updates, no nothing. So all the rpm ended up doing there, it seems, was unpacking a precompiled module from somewhere and stuffing it in the right place. Hell, I didn’t even have the kernel-source rpm installed on it.
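On a box where DKMS really does have to compile the module, the build tree that `make -C /lib/modules/<release>/build` points at has to exist. The following is only a rough sketch of a pre-flight check, not something the Dell rpm does; the function name `has_build_tree` and its second parameter (an alternate modules root, added so the check can be tried against a scratch directory) are my own inventions.

```shell
#!/bin/sh
# Hypothetical helper: check that a kernel build tree is reachable.
# $1 = kernel release string (e.g. the output of `uname -r`)
# $2 = modules root, defaults to /lib/modules (assumption: standard layout)
has_build_tree() {
    build="${2:-/lib/modules}/$1/build"
    # `make -C` needs this path to resolve to a directory with a Makefile;
    # the -f test follows the build symlink if there is one.
    if [ -f "$build/Makefile" ]; then
        echo "ok: $build"
    else
        echo "missing: $build"
        return 1
    fi
}

# Check the running kernel and print a hint if the tree is absent.
has_build_tree "$(uname -r)" || echo "install the matching kernel-source package first"
```

If this reports a missing tree, a module build is going to fail long before it gets anywhere near the driver source.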

The referenced make.log looks like:

DKMS make.log for megaraid_sas-v00.00.03.09 for kernel 2.6.16.27-0.9-smp (x86_64)
Thu May 10 21:19:18 EDT 2007
make: Entering directory `/usr/src/linux-2.6.16.27-0.9-obj/x86_64/smp'
make -C ../../../linux-2.6.16.27-0.9 O=../linux-2.6.16.27-0.9-obj/x86_64/smp modules
CC [M]  /var/lib/dkms/megaraid_sas/v00.00.03.09/build/megaraid_sas.o
/var/lib/dkms/megaraid_sas/v00.00.03.09/build/megaraid_sas.c: In function 'megasas_probe_one':
/var/lib/dkms/megaraid_sas/v00.00.03.09/build/megaraid_sas.c:2629: error: 'IRQF_SHARED' undeclared (first use in this function)
/var/lib/dkms/megaraid_sas/v00.00.03.09/build/megaraid_sas.c:2629: error: (Each undeclared identifier is reported only once
/var/lib/dkms/megaraid_sas/v00.00.03.09/build/megaraid_sas.c:2629: error: for each function it appears in.)
make[3]: *** [/var/lib/dkms/megaraid_sas/v00.00.03.09/build/megaraid_sas.o] Error 1
make[2]: *** [_module_/var/lib/dkms/megaraid_sas/v00.00.03.09/build] Error 2
make[1]: *** [modules] Error 2
make: *** [modules] Error 2
make: Leaving directory `/usr/src/linux-2.6.16.27-0.9-obj/x86_64/smp'
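The compiler is saying that the kernel headers it was given do not declare IRQF_SHARED. One way to confirm what a given header tree actually offers is to grep it for the two flag names a driver might pass to request_irq. This is a minimal sketch; the function name `irq_flag_name` is made up, and the header location used at the bottom is an assumption about where the kernel source link lives on your system.

```shell
#!/bin/sh
# Hypothetical helper: report which shared-IRQ flag name a header tree defines.
# $1 = directory containing kernel headers (searched recursively)
irq_flag_name() {
    if grep -rqE '#[[:space:]]*define[[:space:]]+IRQF_SHARED' "$1" 2>/dev/null; then
        echo "IRQF_SHARED"      # newer naming, what the driver source expects
    elif grep -rqE '#[[:space:]]*define[[:space:]]+SA_SHIRQ' "$1" 2>/dev/null; then
        echo "SA_SHIRQ"         # older naming only
    else
        echo "unknown"
    fi
}

# Assumed location of the headers for the running kernel; adjust as needed.
hdr="/lib/modules/$(uname -r)/source/include"
if [ -d "$hdr" ]; then
    irq_flag_name "$hdr"
fi
```

If the answer is SA_SHIRQ, the driver source needs adjusting before it will compile against that kernel.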

If I take a look in /var/lib/dkms/megaraid_sas/v00.00.03.09 I see that there is a patches directory, and in there a sles10-ga.patch. Part of the patch file says:

-       if (request_irq(pdev->irq, megasas_isr, IRQF_SHARED, "megasas", instance)) {
+       if (request_irq(pdev->irq, megasas_isr, SA_SHIRQ, "megasas", instance)) {

Well, that is odd, because the make log was complaining about IRQF_SHARED… so it seems the patch was not getting applied to the source at all. K-lame. Whatever detection mechanism there is for the distro you are running has obviously failed here. So a little of this:

# patch < patches/sles10-ga.patch
patching file megaraid_sas.c
# dkms build -m megaraid_sas -v v00.00.03.09
Kernel preparation unnecessary for this kernel.  Skipping...
Building module:
cleaning build area....
make KERNELRELEASE=2.6.16.27-0.9-smp -C /lib/modules/2.6.16.27-0.9-smp/build SUBDIRS=/var/lib/dkms/megaraid_sas/v00.00.03.09/build modules....
cleaning build area....
DKMS: build Completed.

and presto, the module builds. Complete the task with a

# dkms install -m megaraid_sas -v v00.00.03.09
Running module version sanity check.
megaraid_sas.ko:
- Original module
- Found /lib/modules/2.6.16.27-0.9-smp/kernel/drivers/scsi/megaraid//megaraid_sas.ko
- Storing in /var/lib/dkms/megaraid_sas/original_module/2.6.16.27-0.9-smp/x86_64/
- Archiving for uninstallation purposes
- Installation
- Installing to /lib/modules/2.6.16.27-0.9-smp/kernel/drivers/scsi/megaraid//
/etc/modprobe.conf: added alias reference for 'megaraid_sas'
depmod.....
Saving old initrd as /boot/initrd-2.6.16.27-0.9-smp_old
Making new initrd as /boot/initrd-2.6.16.27-0.9-smp
(If next boot fails, revert to the _old initrd image)
mkinitrd.....
DKMS: install Completed.

and everything is good to go. Yay. Reboot. Doom averted. Go home.


A co-worker of mine subsequently came up with a much neater solution to this:

The fix is to make dkms aware of newer SLES 10 kernel versions by doing the following:
vi /usr/src/megaraid_sas-v00.00.03.09/dkms.conf

Change the following lines:

PATCH[6]="sles10-ga.patch"
PATCH_MATCH[6]="2\.6\.16\.21-0\.8"

to:

PATCH[6]="sles10-ga.patch"
PATCH_MATCH[6]="2\.6\.16\.2.-0\..*"
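To see why the original pattern skipped the patch on the updated kernel, the two expressions can be tried against kernel release strings. This is a sketch only: it uses `grep -E` as a stand-in for however DKMS itself evaluates PATCH_MATCH, and the helper name `matches` is made up for the illustration.

```shell
#!/bin/sh
# Hypothetical helper: does a kernel release match a PATCH_MATCH-style pattern?
# $1 = kernel release string, $2 = extended regular expression
# Assumption: grep -E (unanchored) approximates the matching DKMS performs.
matches() {
    if printf '%s\n' "$1" | grep -Eq "$2"; then
        echo "match"
    else
        echo "no match"
    fi
}

old='2\.6\.16\.21-0\.8'      # pattern shipped in dkms.conf
new='2\.6\.16\.2.-0\..*'     # co-worker's relaxed pattern

matches "2.6.16.21-0.8-smp" "$old"   # GA kernel: prints "match"
matches "2.6.16.27-0.9-smp" "$old"   # updated kernel: prints "no match"
matches "2.6.16.27-0.9-smp" "$new"   # updated kernel: prints "match"
```

Note that the relaxed pattern still only covers releases of the form 2.6.16.2x-0.y; a release string such as 2.6.16.46-0.12-smp would not match it, so the pattern may need widening again after a bigger kernel update.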

3 Responses to “SLES10 and megaraid_sas dkms”

  1. Bill says:

    Thank you so much for the elegant write-up. I was about going out of my mind trying to google a solution. I had a custom kernel installed, so I also had trouble building the module. Needed to get the kernel source rpm installed and ensure that the /lib/modules//build and /lib/modules//source soft links were pointed correctly to the sources. Thanks again.

  2. Here at work, we have varying dell poweredges. PowerEdge 750, 860, 2650, SC1435..etc. They all primarily run Enterprise Linux. I like the utility “afacli”, which allows manipulation of the hardware RAID controller and such. It was semi standardized until we bought new servers. Is there a general linux solution that can do the same thing or an alternative program/script to install that would work with most raid controllers or just with the Perc 5/SAS on the new servers. Thanks!

  3. Jorden says:

    Thanx man, it works. Saved me a lot(!) of time.
    Although the purpose of Dell’s dkms project is very elegant, I hope the patch will also be in the mainstream kernel and not only in a separate module which must be dkms-based (and also suffers the patch error). Perhaps it already is, but I use the SLES10 kernel for now.
