# Hardware RAID Notification (solved)

## Bigun

We have a machine set up here at work with a hardware RAID 5, using the "3ware 9xxx support" option in the kernel.  Is there any way to report via SNMP when a drive drops out?

I've read through what I could find on SNMP reporting, and I saw nothing about reporting when a hardware RAID experiences a warning or failure.  (The wiki pages being down didn't help either.)

Anyone have some documentation I could read through?

----------

## snIP3r

hi!

Perhaps a workaround via smartmontools can help. You can monitor the hard drives behind a 3ware RAID controller with smartmontools and then expose that info through the SNMP interface.

Here's a thread about SNMP support for smartmontools:

http://marc2.theaimsgroup.com/?l=smartmontools-support&m=112091479501037&w=2
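As a rough illustration of the smartmontools route, here is a minimal Python sketch. It assumes per-drive health is queried with `smartctl -H -d 3ware,N /dev/twa0` and that the exit status is interpreted as the bitmask described in the smartctl(8) man page. The helper names `smartctl_command` and `decode_exit_status` are hypothetical, and wiring the result into an SNMP agent is not shown.

```python
# Meaning of each bit in smartctl's exit status, paraphrased from smartctl(8).
# Index in the list == bit number.
EXIT_BITS = [
    "command line did not parse",
    "device open failed",
    "a SMART or ATA command failed",
    "SMART status check returned DISK FAILING",
    "prefail attributes at or below threshold",
    "attributes were at or below threshold in the past",
    "device error log contains errors",
    "self-test log contains errors",
]


def smartctl_command(port, device="/dev/twa0"):
    """Build the smartctl health-check command for one 3ware port.

    The default device node is the one used in this thread; adjust as needed.
    """
    return ["smartctl", "-H", "-d", f"3ware,{port}", device]


def decode_exit_status(status):
    """Return the list of conditions flagged in a smartctl exit status."""
    return [msg for bit, msg in enumerate(EXIT_BITS) if status & (1 << bit)]
```

One way to use it would be `decode_exit_status(subprocess.run(smartctl_command(5)).returncode)`, with an empty list meaning smartctl reported nothing wrong for that port.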

HTH

snIP3r

----------

## Bigun

I've got smartd running and monitoring all my disks.  

My /etc/smartd.conf:

```
/dev/twa0 -d 3ware,0 -m some@email.com
/dev/twa0 -d 3ware,1 -m some@email.com
/dev/twa0 -d 3ware,2 -m some@email.com
/dev/twa0 -d 3ware,3 -m some@email.com
/dev/twa0 -d 3ware,4 -m some@email.com
/dev/twa0 -d 3ware,5 -m some@email.com
/dev/twa0 -d 3ware,6 -m some@email.com
/dev/twa0 -d 3ware,7 -m some@email.com
```

However, I popped out one of the drives to see if I could generate an e-mail and nothing came through.  I checked /var/log/messages and smartd didn't see anything happen.  Here is a clip from the messages log when I popped out the drive:

```
Dec 22 14:35:49 hs-vsbackup1 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0019): Drive removed:port=5.
Dec 22 14:35:49 hs-vsbackup1 3w-9xxx: scsi0: AEN: ERROR (0x04:0x0002): Degraded unit:unit=0, port=5.
Dec 22 14:40:01 hs-vsbackup1 cron[17509]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )
Dec 22 14:44:22 hs-vsbackup1 3w-9xxx: scsi0: AEN: INFO (0x04:0x001A): Drive inserted:port=5.
Dec 22 14:44:22 hs-vsbackup1 3w-9xxx: scsi0: AEN: INFO (0x04:0x001F): Unit operational:unit=0.
Dec 22 14:44:22 hs-vsbackup1 3w-9xxx: scsi0: AEN: INFO (0x04:0x000C): Initialize started:unit=0.
```

The 3w-9xxx driver went nuts, but smartd didn't seem to care.  
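Since the driver does log the event even when smartd stays quiet, one option is to watch the log for AEN lines directly. The following is a minimal Python sketch under stated assumptions: the regular expression is inferred only from the log excerpt above, and the names `parse_aen` and `alerts` are hypothetical, not part of any 3ware tool.

```python
import re

# Matches AEN lines as they appear in the excerpt above, e.g.
# "... 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0019): Drive removed:port=5."
# Assumption: other driver versions may format these lines differently.
AEN_RE = re.compile(
    r"3w-9xxx: (?P<host>scsi\d+): AEN: (?P<severity>[A-Z]+) "
    r"\((?P<code>0x[0-9A-Fa-f]+:0x[0-9A-Fa-f]+)\): (?P<message>.*)$"
)


def parse_aen(line):
    """Return a dict describing one AEN log line, or None for other lines."""
    match = AEN_RE.search(line.rstrip("\n"))
    return match.groupdict() if match else None


def alerts(lines, severities=("WARNING", "ERROR")):
    """Collect the AEN events whose severity is worth a notification."""
    found = []
    for line in lines:
        event = parse_aen(line)
        if event and event["severity"] in severities:
            found.append(event)
    return found
```

Calling `alerts(open("/var/log/messages"))` would return the WARNING and ERROR events and skip the INFO ones; what to do with them (mail, SNMP trap, and so on) is left open.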

On a side note, I grabbed the tw_cli utility, and it shows a *LOT* more detail about the controller:

```
hs-vsbackup1 ~ # tw_cli /c0 show all
/c0 Driver Version = 2.26.02.010
/c0 Model = 9550SX-12MI
/c0 Memory Installed  = 224MB
/c0 Firmware Version = FE9X 3.04.00.005
/c0 Bios Version = BE9X 3.04.00.002
/c0 Monitor Version = BL9X 3.02.00.001
/c0 Serial Number = L021503A6180141
/c0 PCB Version = Rev 032
/c0 PCHIP Version = 1.60
/c0 ACHIP Version = 1.70
/c0 Number of Ports = 12
/c0 Number of Units = 1
/c0 Number of Drives = 8
/c0 Total Optimal Units = 0
/c0 Not Optimal Units = 1
/c0 JBOD Export Policy = off
/c0 Disk Spinup Policy = 1
/c0 Spinup Stagger Time Policy (sec) = 2
/c0 Auto-Carving Policy = off
/c0 Auto-Carving Size = 2048 GB
/c0 Auto-Rebuild Policy = on
/c0 Controller Bus Type = PCIX
/c0 Controller Bus Width = 64 bits
/c0 Controller Bus Speed = 133 Mhz

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-10   REBUILDING     32      -       64K     1490.07   OFF    OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     465.76 GB   976773168     6QG1PPBR
p1     OK               u0     372.61 GB   781422768     WD-WMAMY1574401
p2     OK               u0     372.61 GB   781422768     WD-WMAMY1573699
p3     OK               u0     372.61 GB   781422768     WD-WMAMY1574052
p4     OK               u0     372.61 GB   781422768     WD-WMAMY1574207
p5     DEGRADED         u0     372.61 GB   781422768     WD-WMAMY1574411
p6     OK               u0     372.61 GB   781422768     WD-WMAMY1574185
p7     OK               u0     372.61 GB   781422768     WD-WMAMY1574332
p8     NOT-PRESENT      -      -           -             -
p9     NOT-PRESENT      -      -           -             -
p10    NOT-PRESENT      -      -           -             -
p11    NOT-PRESENT      -      -           -             -
```
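Because tw_cli already reports unit and port status, its output could also be polled by a script. Below is a minimal Python sketch that picks out any unit or port whose status is not healthy. The column layout is assumed from the output above only, the function name `unhealthy` is hypothetical, and treating `NOT-PRESENT` as healthy (an empty port) is an assumption that may not suit every setup.

```python
import re

# Unit rows look like: "u0    RAID-10   REBUILDING     32 ..."
UNIT_RE = re.compile(r"^(?P<name>u\d+)\s+(?P<type>\S+)\s+(?P<status>[A-Z][A-Z-]*)\s")
# Port rows look like: "p5     DEGRADED         u0     372.61 GB ..."
PORT_RE = re.compile(r"^(?P<name>p\d+)\s+(?P<status>[A-Z][A-Z-]*)\s+(?P<unit>u\d+|-)\s")


def unhealthy(text, healthy=("OK", "NOT-PRESENT")):
    """Return (name, status) pairs for units/ports not in a healthy state.

    `text` is the captured output of `tw_cli /c0 show all`.
    """
    problems = []
    for raw in text.splitlines():
        line = raw.strip()
        match = UNIT_RE.match(line) or PORT_RE.match(line)
        if match and match.group("status") not in healthy:
            problems.append((match.group("name"), match.group("status")))
    return problems
```

The text could come from `subprocess.run(["tw_cli", "/c0", "show", "all"], capture_output=True, text=True).stdout`. A script like this might then be hooked into an SNMP agent, for example through net-snmp's `extend` directive, though that wiring is untested here.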

Was I wrong in assuming that popping out a drive would generate an e-mail via smartd?  Or is smartd not enough?

----------

## snIP3r

hi!

If you only want email notification, you can install the 3dm2 web interface. There you can set the level of notification you want. Here's a screenshot of mine:

http://area52.kicks-ass.org/sniper/images/3dm2.jpg

You can install 3dm2 via a Portage overlay:

http://ge.mine.nu/3dm2.html

Mail notification via 3dm2 has worked perfectly for me since 9.5.1.1. Here's an example from my system:

*Quote:*

> Oct 05, 2008 02:03.46AM - Controller 0
>
> ERROR - Drive power on reset detected: port=3
> ...

*Quote:*

> Oct 05, 2008 02:03.46AM - Controller 0
>
> WARNING - Drive removed: port=3
> ...

*Quote:*

> Oct 05, 2008 02:03.46AM - Controller 0
>
> ERROR - Degraded unit: unit=0, port=3
> ...

Each message arrived in its own email... (BTW: everything runs perfectly again with the RAID now ;) )

HTH

snIP3r

----------

## Bigun

I'll try the overlay tomorrow.

----------

## Bigun

I just set it up, awesome, thanks a bunch!

----------

## -Craig-

No problem! :)

----------

