# SCSI error!

## Taoub

Hello, I have a machine with two SCSI drives (73 GB each). The first one works fine, but the second one gives me an error. I've tried different filesystems (ext3, XFS) because I thought it could be a filesystem error, but I still get the same error.

/proc/scsi/scsi

Attached devices:

Host: scsi1 Channel: 00 Id: 01 Lun: 00

  Vendor: SEAGATE  Model: ST373307LC       Rev: 0006

  Type:   Direct-Access                    ANSI SCSI revision: 03

Host: scsi1 Channel: 00 Id: 03 Lun: 00

  Vendor: SEAGATE  Model: ST373307LC       Rev: 0006

  Type:   Direct-Access                    ANSI SCSI revision: 03

Does anyone know what could be wrong? 

Jan 14 14:06:45 [kernel] SCSI error : <1 0 3 0> return code = 0x8000002

 What does that mean??
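The return code can be unpacked by hand. The sketch below assumes the layout that Linux 2.6-era kernels use for the SCSI result word (status byte lowest, then message byte, host byte, and driver byte); the name tables are partial and only cover values relevant here. The function name `decode_result` is made up for illustration.

```python
# Assumed layout of the 32-bit SCSI result word (Linux 2.6-era mid-layer):
#   bits  0-7   status byte  (reported by the target)
#   bits  8-15  message byte
#   bits 16-23  host byte    (reported by the host adapter driver)
#   bits 24-31  driver byte  (set by the mid-layer)

# Partial lookup tables, enough for the value in the log above.
STATUS_NAMES = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY"}
DRIVER_NAMES = {0x00: "DRIVER_OK", 0x08: "DRIVER_SENSE"}


def decode_result(code):
    """Split a SCSI result word into its four byte-wide fields."""
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": (code >> 16) & 0xFF,
        "driver": (code >> 24) & 0xFF,
    }


fields = decode_result(0x8000002)
print(fields)
print("status:", STATUS_NAMES.get(fields["status"], "unknown"))
print("driver:", DRIVER_NAMES.get(fields["driver"], "unknown"))
```

Under that assumption, `0x8000002` reads as status CHECK CONDITION with the driver byte flagging that sense data is available. In other words, the drive itself rejected the command, and the actual reason would be in the sense data rather than in this code.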

I'm posting some logs from /var/log/kernel/

----EXT3

Jan 12 10:57:44 [kernel] scsi1: ERROR on channel 0, id 2, lun 0, CDB: 0x28 00 01 2c 91 07 00 01 00 00

Jan 12 10:57:51 [kernel] scsi1: ERROR on channel 0, id 2, lun 0, CDB: 0x28 00 01 2c 91 b7 00 00 08 00

----AFTER CHANGING TO XFS

Jan 14 14:05:25 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646c80: 0x2a 0x0 0x3 0x0 0x57 0x77 0x0 0x1 0xb0 0x0

Jan 14 14:05:35 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646c80: 0x0 0x0 0x0 0x0 0x0 0x0

Jan 14 14:05:45 [kernel] scsi1:0:3:0: Attempting to abort cmd f7d76b00: 0x0 0x0 0x0 0x0 0x0 0x0

Jan 14 14:05:45 [kernel]  22 FIFO_USE[0x0] SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x37]

Jan 14 14:05:45 [kernel]  26 FIFO_USE[0x0] SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x37]

Jan 14 14:05:45 [kernel]  19 FIFO_USE[0x0] SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x37]

Jan 14 14:05:45 [kernel]   3 FIFO_USE[0x0] SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x37]

Jan 14 14:05:55 [kernel] scsi1:0:3:0: Attempting to abort cmd e75afe00: 0x0 0x0 0x0 0x0 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75af380: 0x2a 0x0 0x4 0x0 0x15 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75af200: 0x2a 0x0 0x4 0x0 0x19 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646080: 0x2a 0x0 0x4 0x0 0x1d 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75afc80: 0x2a 0x0 0x4 0x0 0x21 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75afb00: 0x2a 0x0 0x4 0x0 0x25 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75af980: 0x2a 0x0 0x4 0x0 0x29 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75af800: 0x2a 0x0 0x4 0x0 0x2d 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75af680: 0x2a 0x0 0x4 0x0 0x31 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75a9c80: 0x2a 0x0 0x4 0x0 0x35 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75af080: 0x2a 0x0 0x4 0x0 0x39 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd f7d76500: 0x2a 0x0 0x4 0x0 0x3d 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646e00: 0x2a 0x0 0x4 0x0 0x41 0x3f 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75a9b00: 0x2a 0x0 0x4 0x0 0x45 0x3f 0x0 0x0 0xe0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd f7d76200: 0x2a 0x0 0x4 0x81 0x8b 0xd7 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646980: 0x2a 0x0 0x4 0x81 0xb3 0xd7 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646b00: 0x2a 0x0 0x4 0x81 0xb7 0xd7 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd f7d76080: 0x2a 0x0 0x4 0x81 0xbb 0xd7 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd f7d76680: 0x2a 0x0 0x4 0x81 0xbf 0xd7 0x0 0x3 0xb8 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75a9e00: 0x2a 0x0 0x4 0x80 0xab 0x2b 0x0 0x0 0x40 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646800: 0x2a 0x0 0x4 0x80 0xab 0x6b 0x0 0x0 0x40 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646380: 0x2a 0x0 0x4 0x80 0xab 0xab 0x0 0x0 0x40 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646500: 0x2a 0x0 0x4 0x80 0xab 0xeb 0x0 0x0 0x40 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646200: 0x2a 0x0 0x4 0x80 0xac 0x2b 0x0 0x0 0x40 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75a9800: 0x2a 0x0 0x4 0x80 0xac 0x6b 0x0 0x0 0x40 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e7646680: 0x2a 0x0 0x4 0x80 0xac 0xab 0x0 0x0 0x40 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75a9680: 0x2a 0x0 0x4 0x81 0x2c 0x7f 0x0 0x0 0x10 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd e75a9980: 0x2a 0x0 0x4 0x81 0xc3 0x97 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] scsi1:0:3:0: Attempting to abort cmd f7d76800: 0x2a 0x0 0x4 0x81 0xc7 0x97 0x0 0x4 0x0 0x0

Jan 14 14:05:56 [kernel] Recovery code sleeping

Jan 14 14:05:56 [kernel] (scsi1:A:3:0): Bus Device Reset Message Sent

Jan 14 14:06:45 [kernel] SCSI error : <1 0 3 0> return code = 0x8000002

                - Last output repeated twice -

Jan 14 14:09:46 [kernel] I/O error in filesystem ("sdb1") meta-data dev sdb1 block 0x48066df       ("xlog_iodone") error 5 buf count 32768
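The CDB bytes in the logs above can be decoded to see which blocks the failing commands touched. This is a minimal sketch for 10-byte READ(10) and WRITE(10) commands only, following the standard layout (opcode, flags, 4-byte big-endian LBA, group, 2-byte transfer length, control). The helper name `decode_cdb10` is hypothetical.

```python
import struct

# Only the two opcodes that appear in the 10-byte CDBs above.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb10(text):
    """Decode a 10-byte CDB written as hex tokens, e.g. '0x2a 0x0 0x3 ...'."""
    # int(tok, 16) accepts both '0x2a' and '2a' style tokens.
    raw = bytes(int(tok, 16) for tok in text.split())
    if len(raw) != 10:
        raise ValueError("expected a 10-byte CDB, got %d bytes" % len(raw))
    # Big-endian: opcode, flags, LBA (32 bit), group, length (16 bit), control.
    opcode, _flags, lba, _group, length, _control = struct.unpack(">BBIBHB", raw)
    return OPCODES.get(opcode, hex(opcode)), lba, length


# One line from the ext3 log and one from the XFS log.
print(decode_cdb10("0x28 00 01 2c 91 07 00 01 00 00"))
print(decode_cdb10("0x2a 0x0 0x3 0x0 0x57 0x77 0x0 0x1 0xb0 0x0"))
```

Read this way, the failures are spread over both reads and writes at widely separated block addresses, which looks more like a problem with the drive or the bus than with one bad spot in a filesystem. The 6-byte all-zero commands in the abort list are not handled by this sketch and raise `ValueError`.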

----------

## agent_jdh

Have you tried running SeaTools http://www.seagate.com/support/seatools/ ?  There should be a long/comprehensive test you can do which will scan the whole drive for problems, and also do some basic tests to see if the drive electronics are OK.  You'll probably want the Enterprise version, although the Desktop version is very handy because you can burn an ISO/create a boot disk which might be more useful in the future if things go belly-up and you can't boot your OS.

You could also use the SCSI card (if it's an Adaptec, I don't know about other brands) to low-level format the drive and then repartition/reformat it from the OS.

----------

## Janne Pikkarainen

Is the drive terminated properly? The most common reason for SCSI problems is misconfigured terminators...

----------

## Taoub

To agent_jdh: Thanks, that's a good idea, I'll try it. Any other advice?

To Janne Pikkarainen: Because it is a brand new server, I believe it has automatic internal termination.

----------

## Janne Pikkarainen

If it's a brand new server, it may also have its own diagnostic tool. At least IBM xSeries servers can be self-checked by pressing F2 during boot. The server then enters a very comprehensive self-diagnostics program which can be used to check every component in the server.

----------

## agent_jdh

 *Taoub wrote:*   

> To agent_jdh: Thanks, that's a good idea, I'll try it. Any other advice?
> 
> To Janne Pikkarainen: Because it is a brand new server, I believe it has automatic internal termination.

 

What I've advised is really a starting point in trying to find out what's wrong ... try SeaTools and report back what it finds (even if it finds nothing).

As Janne also points out, if it's a new (decent) server it probably has comprehensive onboard diagnostics available, so you'll want to run them as well.

Let us know how you get on.

----------

## Taoub

To agent_jdh: I've tried SeaTools, and right now I think it was a bad idea.

The SCSI drive couldn't pass the FULL TEST (no reason was given),

so I tried a low-level format. When it finished, I got this:

/dev/sg1 [=/dev/sdb  bus1 ch=0 target=3 lun=0]

        /dev/sg1

        Vendor = SEAGATE

        Product = ST373307LC

        Version = 0006

        Serial Number = 3HZ1P5YK

        Copyright = Copyright (c) 2003 Seagate All rights reserved

        SCSI Firmware = 05160006

        Servo RAM Release = 2002C907

        Servo ROM Release = 00000000

        Servo RAM Date = C907

        Servo ROM Date = 2002

        -Cannot read capacity  (Sense data = 03/31/00)

        -this is a Seagate drive

        -this drive supports DST

                -short DST time = 120 seconds

                -long DST time = 1488 seconds

        -Mode Page Settings [current value (default)]:

                -WCE bit = 1 (1)

                -RCD bit = 0 (0)

                -AWRE bit = 1 (1)

                -ARRE bit = 1 (1)

I tried to set the capacity manually using SeaTools, but it didn't help.
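The `Sense data = 03/31/00` line in the output above can be looked up in the standard SCSI tables. The sketch below assumes the tool prints the triple as sense key / ASC / ASCQ in hexadecimal; the lookup tables are deliberately partial and the function name `decode_sense` is made up for illustration.

```python
# Partial tables taken from the standard SCSI sense key and ASC/ASCQ lists.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}
ADDITIONAL_SENSE = {
    (0x31, 0x00): "MEDIUM FORMAT CORRUPTED",
    (0x31, 0x01): "FORMAT COMMAND FAILED",
}


def decode_sense(triple):
    """Decode 'key/asc/ascq' (hex, assumed order) into readable names."""
    key, asc, ascq = (int(part, 16) for part in triple.split("/"))
    key_name = SENSE_KEYS.get(key, "unknown sense key 0x%x" % key)
    asc_name = ADDITIONAL_SENSE.get((asc, ascq), "unknown ASC/ASCQ %02x/%02x" % (asc, ascq))
    return key_name, asc_name


print(decode_sense("03/31/00"))
```

If that reading is right, the drive is reporting a medium error with a corrupted format, which would fit a low-level format that did not complete successfully.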

/var/log/kernel/

Jan 15 17:16:05 [kernel] scsi1: ERROR on channel 0, id 3, lun 0, CDB: Read (10) 00 00 00 00 00 00 00 08 00

Jan 15 17:16:05 [kernel] scsi1: ERROR on channel 0, id 3, lun 0, CDB: Read (10) 00 00 00 00 01 00 00 07 00

Jan 15 17:16:05 [kernel] scsi1: ERROR on channel 0, id 3, lun 0, CDB: Read (10) 00 00 00 00 02 00 00 06 00

Jan 15 17:16:05 [kernel] scsi1: ERROR on channel 0, id 3, lun 0, CDB: Read (10) 00 00 00 00 03 00 00 05 00

Jan 15 17:16:05 [kernel] scsi1: ERROR on channel 0, id 3, lun 0, CDB: Read (10) 00 00 00 00 04 00 00 04 00

Jan 15 17:16:05 [kernel] scsi1: ERROR on channel 0, id 3, lun 0, CDB: Read (10) 00 00 00 00 05 00 00 03 00

Jan 15 17:16:05 [kernel] scsi1: ERROR on channel 0, id 3, lun 0, CDB: Read (10) 00 00 00 00 06 00 00 02 00

Jan 15 17:16:05 [kernel] scsi1: ERROR on channel 0, id 3, lun 0, CDB: Read (10) 00 00 00 00 07 00 00 01 00

:(

----------

## agent_jdh

 *Taoub wrote:*   

> To agent_jdh: I've tried SeaTools, and right now I think it was a bad idea.
> 
> The SCSI drive couldn't pass the FULL TEST (no reason was given),
> 
> so I tried a low-level format. When it finished, I got this:
> ...

 

If the drive couldn't pass the full SeaTools test then the drive is defective.  As it's a new server, it should be under guarantee.  Contact the server vendor for a replacement unit.

You probably shouldn't have tried to change the drive's capacity using SeaTools - if it failed the initial test, that should have been enough to show the drive as defective.  The vendor may accuse you of causing the problem by tampering with the drive.  You shouldn't try these things unless you really know what you're doing.

I trust you've opened the box up (assuming you're allowed to in the warranty), and checked the cables?  If the SCSI cable has lots of connectors on it, you should try connecting the HDD to a different one, in case there is a damaged pin on the connector being used.

----------

## Taoub

I've tried everything you suggested about cabling and slots, but it didn't help. I was trying to set the capacity back to the same value, because after the low-level format I got a message that it can't read the capacity. Is it possible to check whether the drive was actually low-level formatted?

----------

## agent_jdh

 *Taoub wrote:*   

> I've tried everything you suggested about cabling and slots, but it didn't help. I was trying to set the capacity back to the same value, because after the low-level format I got a message that it can't read the capacity. Is it possible to check whether the drive was actually low-level formatted?

 

Seriously, don't play around with this drive any more.  It's defective.  Get it replaced.

----------

