check_megaraid_sas Nagios plugin

This is somewhat related to my earlier posting about updating the megaraid drivers. I use Nagios at work for system monitoring, and one thing I like to check is the status of the volumes managed by the RAID controller. When I first started configuring Nagios on my new PowerEdge 1950 and 2950 systems, I found a check_perc5i plugin over on Nagios Exchange.

Unfortunately the plugin only looked like it worked properly. It would correctly report things like the number of volumes online, the number of disks, and the number of failed disks, but if you had a failed disk it would not actually return the proper error status. It just kept going, blindly saying OK : Bad Disks=3.
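To illustrate why that matters: Nagios decides a service's state from the plugin's exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), not from the text it prints. The following is a minimal Python sketch of keeping the two in sync. It is not code from check_perc5i or check_megaraid_sas; the function names and the choice to treat any bad disk as CRITICAL are assumptions made for the example.

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(bad_disks):
    """Return (exit_code, status_line) for a bad-disk count (hypothetical helper)."""
    # Assumption for this sketch: any failed disk is CRITICAL.
    code = CRITICAL if bad_disks > 0 else OK
    # Derive the label from the same code so text and exit status cannot disagree.
    return code, "%s: Bad Disks=%d" % (LABELS[code], bad_disks)


def main(bad_disks):
    code, line = evaluate(bad_disks)
    print(line)
    sys.exit(code)  # Nagios reads this exit status, not the printed text
```

Deriving both the label and the exit status from one value is what rules out the "OK : Bad Disks=3" situation described above.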

So I have written my own script to check the RAID controller status, check_megaraid_sas. It is somewhat similar to the work I did for the PERC3Di with afacli and Nagios quite a while back.

To use it you need LSI’s MegaCli utility installed, and the user running the script needs passwordless sudo privileges to execute MegaCli. You will then get output like:
OK: 0:0:RAID-1:2 drives:68GB:Optimal 1:0:RAID-5:7 drives:2792GB:Optimal Drives:10 Hotspare(s):1
or (less good)
WARNING: 0:0:RAID-1:2 drives:74GB:Optimal 0:1:RAID-5:4 drives:1396GB:Optimal Drives:6 (3 Errors)

The warning is due to the detection of “other” disk errors on the drive. I am trying to find out from Dell whether I can reset this count in the controller. If I cannot, and the count is cumulative, I will probably modify my code to take an argument for a threshold under which non-fatal errors are ignored. The output above is basically in the form:
<status> <controller #>:<volume #>:<RAID level>:<volume drive count>:<volume size>:<volume status> ... Drives:<total drives attached to controller(s)>
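As a rough illustration of that format and of the threshold idea, here is a small Python sketch that assembles such a status line. It is not the plugin's actual code: the function names, the `error_threshold` parameter, and the rule that a non-Optimal volume is CRITICAL are all assumptions for the example, and the hot spare count is left out for brevity.

```python
def format_volume(controller, volume, raid_level, drive_count, size, state):
    """Format one volume as <controller>:<volume>:RAID-<level>:<n> drives:<size>:<state>."""
    return "%d:%d:RAID-%s:%d drives:%s:%s" % (
        controller, volume, raid_level, drive_count, size, state)


def status_line(volumes, total_drives, other_errors=0, error_threshold=1):
    """Build the full status line from a list of volume tuples.

    error_threshold is hypothetical: error counts below it are ignored.
    """
    if any(v[5] != "Optimal" for v in volumes):
        status = "CRITICAL"  # assumption: a non-Optimal volume is critical
    elif other_errors >= error_threshold:
        status = "WARNING"
    else:
        status = "OK"
    parts = [format_volume(*v) for v in volumes]
    parts.append("Drives:%d" % total_drives)
    if other_errors > 0:
        parts.append("(%d Errors)" % other_errors)
    return "%s: %s" % (status, " ".join(parts))


volumes = [(0, 0, "1", 2, "74GB", "Optimal"), (0, 1, "5", 4, "1396GB", "Optimal")]
print(status_line(volumes, 6, other_errors=3))
print(status_line(volumes, 6, other_errors=3, error_threshold=5))
```

With the default threshold the first call reproduces the WARNING line shown earlier; raising the threshold to 5 reports OK while still showing the error count.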


7 Responses to “check_megaraid_sas Nagios plugin”

  1. Bozz says:

    Continual False Warnings after RAID Rebuild

    Hi,

    I’ve got a Server with a Raid 5 configuration installed with a Megaraid Raid Controller. I had to replace a faulty disk. It has rebuilt fine and the Raid array is back online.

    I’ve got a Nagios Monitoring server checking this box. It has NRPE 2.12 installed. The Remote Servers have Nagios 3.12, NRPE 2.12, nagios-plugins-1.4.13, check_megaraid_sas and MegaCli-4.00.11.rpm installed.

    Nagios is constantly reporting the LSI RAID STATUS as WARNING, even though all is OK. I assume it is referring to the history of the failure somehow…

    Can anyone help ?

    Thanks
    Bozz

  2. Klaba says:

    Hello,
    I had to change line 132 in your script to
    if ( m/Size:\s*(([\d\.]+)\s*(MB|GB|TB))/ ) {
    because my MegaCli (MegaRAID SAS 8880EM2) returns the size of a logical disk like this:
    Size:837.312 GB
    The original regular expression does not allow a decimal point, or a space between the size and the unit.

    regards Klaba

  3. Ah, that is actually fixed in my latest version (rev 9) of the plugin, only it seems that I don’t have that latest version uploaded here on my own site (which I will fix). The latest version is (usually) up on the Nagios Exchange (or whatever they are calling it now, I keep forgetting) site as soon as I make any changes. I will dig up the links for them and edit this post for better future reference.

    And here are the links to the two sites I try to keep it updated at:

    MonitoringExchange (this seems to be the more popular source, it actually gets some comments there)
    Nagios Exchange

    -Jonathan

  4. Jason Dobyns says:

    I am getting this error when trying to use this with one of the newer LSI 9260-8i cards.

    This is the error

    /usr/lib/nagios/plugins/check_megaraid_sas
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib/nagios/plugins/check_megaraid_sas line 163.
    OK: 0:0:RAID-: drives:: 0:1:RAID-: drives:: Drives:6

    Here is the output of the commands the script appears to be running.

    MegaCli -LdInfo -L0 -a0

    Adapter 0 -- Virtual Drive Information:
    Virtual Drive: 0 (Target Id: 0)
    Name :
    RAID Level : Primary-6, Secondary-0, RAID Level Qualifier-3
    Size : 40.0 GB
    State : Optimal
    Stripe Size : 64 KB
    Number Of Drives : 6
    Span Depth : 1
    Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
    Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
    Access Policy : Read/Write
    Disk Cache Policy : Disabled
    Encryption Type : None

    MegaCli -LdInfo -L1 -a0

    Adapter 0 -- Virtual Drive Information:
    Virtual Drive: 1 (Target Id: 1)
    Name :
    RAID Level : Primary-6, Secondary-0, RAID Level Qualifier-3
    Size : 516.875 GB
    State : Optimal
    Stripe Size : 64 KB
    Number Of Drives : 6
    Span Depth : 1
    Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
    Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
    Access Policy : Read/Write
    Disk Cache Policy : Disabled
    Encryption Type : None

    Any help would be greatly appreciated.

    Thank You,
    Jason D.

  5. PJF says:

    Thanks for putting this together, having the same issue as Jason D.

    LSI 9260-4i

    MegaCLI 8.00.23

    CentOS 5.5 x86_64

    /usr/lib64/nagios/plugins/check_megaraid_sas
    Use of uninitialized value in concatenation (.) or string at /usr/lib64/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib64/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib64/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib64/nagios/plugins/check_megaraid_sas line 163.
    Use of uninitialized value in concatenation (.) or string at /usr/lib64/nagios/plugins/check_megaraid_sas line 163.
    OK: 0:0:RAID-: drives:: Drives:4

    Thanks for your help!

  6. It seems the latest version of MegaCli adds extra whitespace to its output. Minor adjustments were made to the script to cope with this, and it is now available as revision 10 from Nagios Exchange and Monitoring Exchange (linked to in an earlier comment).

  7. PJF says:

    Thanks Jonathan, much appreciated :)
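
For anyone hitting similar parsing problems, the issues in the comments above (decimal sizes, a space before the unit, and extra whitespace around the colon) all come down to how strictly the MegaCli output is matched. The Python sketch below shows one whitespace-tolerant way to pull the fields out of the -LdInfo output quoted in comment 4. It is not the plugin's code (the plugin appears to be Perl); the function name and the patterns are assumptions for illustration, including the assumption that the number after "Primary-" is the RAID level.

```python
import re

# Sample taken from the -LdInfo output quoted in the comments above.
SAMPLE = """\
Virtual Drive: 1 (Target Id: 1)
RAID Level : Primary-6, Secondary-0, RAID Level Qualifier-3
Size : 516.875 GB
State : Optimal
Stripe Size : 64 KB
Number Of Drives : 6
"""


def parse_ldinfo(text):
    """Extract volume fields, tolerating optional whitespace around ':'."""
    def find(pattern):
        m = re.search(pattern, text, re.MULTILINE)
        return m.group(1) if m else None

    return {
        # Assumption: the number after "Primary-" is the RAID level.
        "raid_level": find(r"^\s*RAID Level\s*:\s*Primary-(\d+)"),
        # Anchored at line start so "Stripe Size" is not mistaken for "Size".
        "size": find(r"^\s*Size\s*:\s*([\d.]+\s*(?:MB|GB|TB))"),
        "state": find(r"^\s*State\s*:\s*(.+?)\s*$"),
        "drives": find(r"^\s*Number Of Drives\s*:\s*(\d+)"),
    }


print(parse_ldinfo(SAMPLE))
```

A field that is not found comes back as None, so the caller can report UNKNOWN rather than print an empty value such as "RAID-:".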
