# Disabling IRQ #5: Promise PDC20518 SATAII150 TX4 card

## Gherald

My new SATA card ceases to function ~20 minutes after boot.  How can I determine the cause?

gentoo-dev-sources-2.6.11:

```
pylon root # uname -a

Linux pylon 2.6.11-gentoo #2 SMP Wed Mar 2 12:18:53 CST 2005 x86_64 AMD Athlon(tm) 64 Processor 3200+ AuthenticAMD GNU/Linux
```

Relevant /var/log/messages:

```
Mar  3 04:49:18 pylon ACPI: PCI Interrupt Link [LNK3] enabled at IRQ 5

Mar  3 04:49:18 pylon PCI: setting IRQ 5 as level-triggered

Mar  3 04:49:18 pylon ACPI: PCI interrupt 0000:02:08.0[A] -> GSI 5 (level, low) -> IRQ 5

Mar  3 04:49:18 pylon ata1: SATA max UDMA/133 cmd 0xFFFFC20000014200 ctl 0xFFFFC20000014238 bmdma 0x0 irq 5

Mar  3 04:49:18 pylon ata2: SATA max UDMA/133 cmd 0xFFFFC20000014280 ctl 0xFFFFC200000142B8 bmdma 0x0 irq 5

Mar  3 04:49:18 pylon ata3: SATA max UDMA/133 cmd 0xFFFFC20000014300 ctl 0xFFFFC20000014338 bmdma 0x0 irq 5

Mar  3 04:49:18 pylon ata4: SATA max UDMA/133 cmd 0xFFFFC20000014380 ctl 0xFFFFC200000143B8 bmdma 0x0 irq 5

Mar  3 04:49:18 pylon ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f

Mar  3 04:49:18 pylon ata1: dev 0 ATA, max UDMA/133, 390721968 sectors: lba48

Mar  3 04:49:18 pylon ata1: dev 0 configured for UDMA/133

Mar  3 04:49:18 pylon scsi0 : sata_promise

Mar  3 04:49:18 pylon ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f

Mar  3 04:49:18 pylon ata2: dev 0 ATA, max UDMA/133, 390721968 sectors: lba48

Mar  3 04:49:18 pylon ata2: dev 0 configured for UDMA/133

Mar  3 04:49:18 pylon scsi1 : sata_promise

Mar  3 04:49:18 pylon ata3: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f

Mar  3 04:49:18 pylon ata3: dev 0 ATA, max UDMA/133, 390721968 sectors: lba48

Mar  3 04:49:18 pylon ata3: dev 0 configured for UDMA/133

Mar  3 04:49:18 pylon scsi2 : sata_promise

Mar  3 04:49:18 pylon ata4: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f

Mar  3 04:49:18 pylon ata4: dev 0 ATA, max UDMA/133, 390721968 sectors: lba48

Mar  3 04:49:18 pylon ata4: dev 0 configured for UDMA/133

Mar  3 04:49:18 pylon scsi3 : sata_promise

Mar  3 04:49:18 pylon Vendor: ATA       Model: ST3200822AS       Rev: 3.01

Mar  3 04:49:18 pylon Type:   Direct-Access                      ANSI SCSI revision: 05

Mar  3 04:49:18 pylon Vendor: ATA       Model: ST3200822AS       Rev: 3.01

Mar  3 04:49:18 pylon Type:   Direct-Access                      ANSI SCSI revision: 05

Mar  3 04:49:18 pylon Vendor: ATA       Model: ST3200822AS       Rev: 3.01

Mar  3 04:49:18 pylon Type:   Direct-Access                      ANSI SCSI revision: 05

Mar  3 04:49:18 pylon Vendor: ATA       Model: ST3200822AS       Rev: 3.01

Mar  3 04:49:18 pylon Type:   Direct-Access                      ANSI SCSI revision: 05

Mar  3 04:49:18 pylon SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)

Mar  3 04:49:18 pylon SCSI device sda: drive cache: write back

Mar  3 04:49:18 pylon SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)

Mar  3 04:49:18 pylon SCSI device sda: drive cache: write back

Mar  3 04:49:18 pylon /dev/scsi/host0/bus0/target0/lun0: p1 p2 p3 < p5 p6 >

Mar  3 04:49:18 pylon Attached scsi disk sda at scsi0, channel 0, id 0, lun 0

Mar  3 04:49:18 pylon SCSI device sdb: 390721968 512-byte hdwr sectors (200050 MB)

Mar  3 04:49:18 pylon SCSI device sdb: drive cache: write back

Mar  3 04:49:18 pylon SCSI device sdb: 390721968 512-byte hdwr sectors (200050 MB)

Mar  3 04:49:18 pylon SCSI device sdb: drive cache: write back

Mar  3 04:49:18 pylon /dev/scsi/host1/bus0/target0/lun0: p1 p2 p3 < p5 p6 >

Mar  3 04:49:18 pylon Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0

Mar  3 04:49:18 pylon SCSI device sdc: 390721968 512-byte hdwr sectors (200050 MB)

Mar  3 04:49:18 pylon SCSI device sdc: drive cache: write back

Mar  3 04:49:18 pylon SCSI device sdc: 390721968 512-byte hdwr sectors (200050 MB)

Mar  3 04:49:18 pylon SCSI device sdc: drive cache: write back

Mar  3 04:49:18 pylon /dev/scsi/host2/bus0/target0/lun0: p1 p2 p3 < p5 p6 >

Mar  3 04:49:18 pylon Attached scsi disk sdc at scsi2, channel 0, id 0, lun 0

Mar  3 04:49:18 pylon SCSI device sdd: 390721968 512-byte hdwr sectors (200050 MB)

Mar  3 04:49:18 pylon SCSI device sdd: drive cache: write back

Mar  3 04:49:18 pylon SCSI device sdd: 390721968 512-byte hdwr sectors (200050 MB)

Mar  3 04:49:18 pylon SCSI device sdd: drive cache: write back

Mar  3 04:49:18 pylon /dev/scsi/host3/bus0/target0/lun0: p1 p2 p3 < p5 p6 >

Mar  3 04:49:18 pylon Attached scsi disk sdd at scsi3, channel 0, id 0, lun 0

-- sometime later --

Mar  3 05:07:09 pylon irq 5: nobody cared!

Mar  3 05:07:09 pylon 

Mar  3 05:07:09 pylon Call Trace:<IRQ> <ffffffff801560e0>{__report_bad_irq+48} <ffffffff801561b9>{note_interrupt+89} 

Mar  3 05:07:09 pylon <ffffffff80155a89>{__do_IRQ+281} <ffffffff8011130a>{do_IRQ+74} 

Mar  3 05:07:09 pylon <ffffffff8010e931>{ret_from_intr+0}  <EOI> <ffffffff8010e9ae>{retint_careful+13} 

Mar  3 05:07:09 pylon 

Mar  3 05:07:09 pylon handlers:

Mar  3 05:07:09 pylon [<ffffffff803bc400>] (pdc_interrupt+0x0/0x1f0)

Mar  3 05:07:09 pylon Disabling IRQ #5

Mar  3 05:07:39 pylon ata3: command timeout

Mar  3 05:07:39 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:07:39 pylon ata3: called with no error (51)!

Mar  3 05:07:39 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:07:39 pylon sdc: Current: sense key=0x3

Mar  3 05:07:39 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:07:39 pylon end_request: I/O error, dev sdc, sector 206442807

Mar  3 05:07:39 pylon raid5: Disk failure on sdc6, disabling device. Operation continuing on 2 devices

Mar  3 05:08:09 pylon ata3: command timeout

Mar  3 05:08:09 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:08:09 pylon ata3: called with no error (51)!

Mar  3 05:08:09 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:08:09 pylon sdc: Current: sense key=0x3

Mar  3 05:08:09 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:08:09 pylon end_request: I/O error, dev sdc, sector 206442815

Mar  3 05:08:09 pylon ata2: command timeout

Mar  3 05:08:09 pylon ata2: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:08:09 pylon ata2: called with no error (51)!

Mar  3 05:08:09 pylon SCSI error : <1 0 0 0> return code = 0x8000002

Mar  3 05:08:09 pylon sdb: Current: sense key=0x3

Mar  3 05:08:09 pylon ASC=0xc ASCQ=0x2

Mar  3 05:08:09 pylon end_request: I/O error, dev sdb, sector 390716727

Mar  3 05:08:09 pylon md: write_disk_sb failed for device sdb6

Mar  3 05:08:39 pylon ata3: command timeout

Mar  3 05:08:39 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:08:39 pylon ata3: called with no error (51)!

Mar  3 05:08:39 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:08:39 pylon sdc: Current: sense key=0x3

Mar  3 05:08:39 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:08:39 pylon end_request: I/O error, dev sdc, sector 206442823

Mar  3 05:08:39 pylon ata1: command timeout

Mar  3 05:08:39 pylon ata1: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:08:39 pylon ata1: called with no error (51)!

Mar  3 05:08:39 pylon SCSI error : <0 0 0 0> return code = 0x8000002

Mar  3 05:08:39 pylon sda: Current: sense key=0x3

Mar  3 05:08:39 pylon ASC=0xc ASCQ=0x2

Mar  3 05:08:39 pylon end_request: I/O error, dev sda, sector 390716727

Mar  3 05:08:39 pylon md: write_disk_sb failed for device sda6

Mar  3 05:08:39 pylon md: errors occurred during superblock update, repeating

Mar  3 05:09:09 pylon ata3: command timeout

Mar  3 05:09:09 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:09:09 pylon ata3: called with no error (51)!

Mar  3 05:09:09 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:09:09 pylon sdc: Current: sense key=0x3

Mar  3 05:09:09 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:09:09 pylon end_request: I/O error, dev sdc, sector 206442831

Mar  3 05:09:09 pylon ata2: command timeout

Mar  3 05:09:09 pylon ata2: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:09:09 pylon ata2: called with no error (51)!

Mar  3 05:09:09 pylon SCSI error : <1 0 0 0> return code = 0x8000002

Mar  3 05:09:09 pylon sdb: Current: sense key=0x3

Mar  3 05:09:09 pylon ASC=0xc ASCQ=0x2

Mar  3 05:09:09 pylon end_request: I/O error, dev sdb, sector 390716727

Mar  3 05:09:09 pylon md: write_disk_sb failed for device sdb6

Mar  3 05:09:25 pylon /usr/sbin/cron[7360]: (root) MAIL (mailed 53048 bytes of output but got status 0x0001 )

Mar  3 05:09:39 pylon ata3: command timeout

Mar  3 05:09:39 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:09:39 pylon ata3: called with no error (51)!

Mar  3 05:09:39 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:09:39 pylon sdc: Current: sense key=0x3

Mar  3 05:09:39 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:09:39 pylon end_request: I/O error, dev sdc, sector 206442839

Mar  3 05:09:39 pylon ata1: command timeout

Mar  3 05:09:39 pylon ata1: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:09:39 pylon ata1: called with no error (51)!

Mar  3 05:09:39 pylon SCSI error : <0 0 0 0> return code = 0x8000002

Mar  3 05:09:39 pylon sda: Current: sense key=0x3

Mar  3 05:09:39 pylon ASC=0xc ASCQ=0x2

Mar  3 05:09:39 pylon end_request: I/O error, dev sda, sector 390716727

Mar  3 05:09:39 pylon md: write_disk_sb failed for device sda6

Mar  3 05:09:39 pylon md: errors occurred during superblock update, repeating

Mar  3 05:10:01 pylon /usr/sbin/cron[7734]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons )

Mar  3 05:10:09 pylon ata3: command timeout

Mar  3 05:10:09 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:10:09 pylon ata3: called with no error (51)!

Mar  3 05:10:09 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:10:09 pylon sdc: Current: sense key=0x3

Mar  3 05:10:09 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:10:09 pylon end_request: I/O error, dev sdc, sector 206442847

Mar  3 05:10:09 pylon ata2: command timeout

Mar  3 05:10:09 pylon ata2: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:10:09 pylon ata2: called with no error (51)!

Mar  3 05:10:09 pylon SCSI error : <1 0 0 0> return code = 0x8000002

Mar  3 05:10:09 pylon sdb: Current: sense key=0x3

Mar  3 05:10:09 pylon ASC=0xc ASCQ=0x2

Mar  3 05:10:09 pylon end_request: I/O error, dev sdb, sector 390716727

Mar  3 05:10:09 pylon md: write_disk_sb failed for device sdb6

Mar  3 05:10:39 pylon ata3: command timeout

Mar  3 05:10:39 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:10:39 pylon ata3: called with no error (51)!

Mar  3 05:10:39 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:10:39 pylon sdc: Current: sense key=0x3

Mar  3 05:10:39 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:10:39 pylon end_request: I/O error, dev sdc, sector 206442855

Mar  3 05:10:39 pylon ata1: command timeout

Mar  3 05:10:39 pylon ata1: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:10:39 pylon ata1: called with no error (51)!

Mar  3 05:10:39 pylon SCSI error : <0 0 0 0> return code = 0x8000002

Mar  3 05:10:39 pylon sda: Current: sense key=0x3

Mar  3 05:10:39 pylon ASC=0xc ASCQ=0x2

Mar  3 05:10:39 pylon end_request: I/O error, dev sda, sector 390716727

Mar  3 05:10:39 pylon md: write_disk_sb failed for device sda6

Mar  3 05:10:39 pylon md: errors occurred during superblock update, repeating

Mar  3 05:11:09 pylon ata3: command timeout

Mar  3 05:11:09 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:11:09 pylon ata3: called with no error (51)!

Mar  3 05:11:09 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:11:09 pylon sdc: Current: sense key=0x3

Mar  3 05:11:09 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:11:09 pylon end_request: I/O error, dev sdc, sector 206442863

Mar  3 05:11:09 pylon ata2: command timeout

Mar  3 05:11:09 pylon ata2: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:11:09 pylon ata2: called with no error (51)!

Mar  3 05:11:09 pylon SCSI error : <1 0 0 0> return code = 0x8000002

Mar  3 05:11:09 pylon sdb: Current: sense key=0x3

Mar  3 05:11:09 pylon ASC=0xc ASCQ=0x2

Mar  3 05:11:09 pylon end_request: I/O error, dev sdb, sector 390716727

Mar  3 05:11:09 pylon md: write_disk_sb failed for device sdb6

Mar  3 05:11:38 pylon ata3: command timeout

Mar  3 05:11:38 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:11:38 pylon ata3: called with no error (51)!

Mar  3 05:11:38 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:11:38 pylon sdc: Current: sense key=0x3

Mar  3 05:11:38 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:11:38 pylon end_request: I/O error, dev sdc, sector 206442871

Mar  3 05:11:38 pylon ata1: command timeout

Mar  3 05:11:38 pylon ata1: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:11:38 pylon ata1: called with no error (51)!

Mar  3 05:11:38 pylon SCSI error : <0 0 0 0> return code = 0x8000002

Mar  3 05:11:38 pylon sda: Current: sense key=0x3

Mar  3 05:11:38 pylon ASC=0xc ASCQ=0x2

Mar  3 05:11:38 pylon end_request: I/O error, dev sda, sector 390716727

Mar  3 05:11:38 pylon md: write_disk_sb failed for device sda6

Mar  3 05:11:38 pylon md: errors occurred during superblock update, repeating

Mar  3 05:12:08 pylon ata3: command timeout

Mar  3 05:12:08 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:12:08 pylon ata3: called with no error (51)!

Mar  3 05:12:08 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:12:08 pylon sdc: Current: sense key=0x3

Mar  3 05:12:08 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:12:08 pylon end_request: I/O error, dev sdc, sector 206442879

Mar  3 05:12:08 pylon ata2: command timeout

Mar  3 05:12:08 pylon ata2: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:12:08 pylon ata2: called with no error (51)!

Mar  3 05:12:08 pylon SCSI error : <1 0 0 0> return code = 0x8000002

Mar  3 05:12:08 pylon sdb: Current: sense key=0x3

Mar  3 05:12:08 pylon ASC=0xc ASCQ=0x2

Mar  3 05:12:08 pylon end_request: I/O error, dev sdb, sector 390716727

Mar  3 05:12:08 pylon md: write_disk_sb failed for device sdb6

Mar  3 05:12:38 pylon ata3: command timeout

Mar  3 05:12:38 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:12:38 pylon ata3: called with no error (51)!

Mar  3 05:12:38 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:12:38 pylon sdc: Current: sense key=0x3

Mar  3 05:12:38 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:12:38 pylon end_request: I/O error, dev sdc, sector 206442887

Mar  3 05:12:38 pylon ata1: command timeout

Mar  3 05:12:38 pylon ata1: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:12:38 pylon ata1: called with no error (51)!

Mar  3 05:12:38 pylon SCSI error : <0 0 0 0> return code = 0x8000002

Mar  3 05:12:38 pylon sda: Current: sense key=0x3

Mar  3 05:12:38 pylon ASC=0xc ASCQ=0x2

Mar  3 05:12:38 pylon end_request: I/O error, dev sda, sector 390716727

Mar  3 05:12:38 pylon md: write_disk_sb failed for device sda6

Mar  3 05:12:38 pylon md: errors occurred during superblock update, repeating

Mar  3 05:13:08 pylon ata3: command timeout

Mar  3 05:13:08 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:13:08 pylon ata3: called with no error (51)!

Mar  3 05:13:08 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:13:08 pylon sdc: Current: sense key=0x3

Mar  3 05:13:08 pylon ASC=0x11 ASCQ=0x4

Mar  3 05:13:08 pylon end_request: I/O error, dev sdc, sector 206442895

Mar  3 05:13:08 pylon ata2: command timeout

Mar  3 05:13:08 pylon ata2: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:13:08 pylon ata2: called with no error (51)!

Mar  3 05:13:08 pylon SCSI error : <1 0 0 0> return code = 0x8000002

Mar  3 05:13:08 pylon sdb: Current: sense key=0x3

Mar  3 05:13:08 pylon ASC=0xc ASCQ=0x2

Mar  3 05:13:08 pylon end_request: I/O error, dev sdb, sector 390716727

Mar  3 05:13:08 pylon md: write_disk_sb failed for device sdb6

Mar  3 05:13:38 pylon ata3: command timeout

Mar  3 05:13:38 pylon ata3: status=0x51 { DriveReady SeekComplete Error }

Mar  3 05:13:38 pylon ata3: called with no error (51)!

Mar  3 05:13:38 pylon SCSI error : <2 0 0 0> return code = 0x8000002

Mar  3 05:13:38 pylon sdc: Current: sense key=0x3

Mar  3 05:13:38 pylon ASC=0x11 ASCQ=0x4
```
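
A log this long is easier to reason about once it is reduced to a few numbers. The following is a minimal Python sketch, not part of the original post, that pulls out the disabled IRQ, the handler registered on it, and the number of command timeouts per port. The `SAMPLE` text is a handful of lines copied from the log above; the regular expressions are assumptions based only on the line formats visible there.

```python
import re
from collections import Counter

# A few lines copied from the log above; for a real run, read the
# text from /var/log/messages instead.
SAMPLE = """\
Mar  3 05:07:09 pylon irq 5: nobody cared!
Mar  3 05:07:09 pylon handlers:
Mar  3 05:07:09 pylon [<ffffffff803bc400>] (pdc_interrupt+0x0/0x1f0)
Mar  3 05:07:09 pylon Disabling IRQ #5
Mar  3 05:07:39 pylon ata3: command timeout
Mar  3 05:08:09 pylon ata3: command timeout
Mar  3 05:08:09 pylon ata2: command timeout
Mar  3 05:08:39 pylon ata1: command timeout
"""

# Patterns are guesses derived from the log lines shown in this post.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(r"\[<[0-9a-f]+>\] \((\w+)\+0x")
TIMEOUT = re.compile(r"(ata\d+): command timeout")


def summarize(text):
    """Return (disabled IRQs, handler names, timeout count per port)."""
    irqs, handlers, timeouts = [], [], Counter()
    for line in text.splitlines():
        if m := NOBODY_CARED.search(line):
            irqs.append(int(m.group(1)))
        elif m := HANDLER.search(line):
            handlers.append(m.group(1))
        elif m := TIMEOUT.search(line):
            timeouts[m.group(1)] += 1
    return irqs, handlers, dict(timeouts)


irqs, handlers, timeouts = summarize(SAMPLE)
print("disabled IRQs:", irqs)
print("handlers:", handlers)
print("timeouts per port:", timeouts)
```

In the log above the first timeout is logged 30 seconds after "Disabling IRQ #5", which is consistent with commands waiting for an interrupt that is no longer delivered.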

----------

## Gherald

Extracting data from the two logical arrays has proven very painful.  I have had to do a hard shutdown every ~20 minutes, reboot, and restart an rsync to separate drives countless times.  But I finally had everything backed up and was able to deactivate the md devices and zero the md superblocks.

Now for the interesting part.  I had suspected, and have now been able to conclusively determine, that the "Disabling IRQ #5" (and hence the crashing, or whatever the correct term is) /only/ occurs when md is running.  I ran badblocks on each individual drive without any issues.  Each run took ~70 minutes; does that seem right for 200GB drives?
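
As a rough sanity check on the 70-minute figure, the sketch below (Python, added for illustration) turns the sector count reported in the boot log into an average throughput. It assumes a single pass over the whole drive, which the post does not state.

```python
SECTORS = 390721968   # per-drive sector count from the boot log
SECTOR_BYTES = 512    # "512-byte hdwr sectors" in the same log
MINUTES = 70          # approximate badblocks run time quoted above

total_bytes = SECTORS * SECTOR_BYTES
mb_per_s = total_bytes / (MINUTES * 60) / 1e6  # decimal megabytes per second

print(f"{total_bytes / 1e9:.1f} GB in {MINUTES} min = {mb_per_s:.1f} MB/s")
```

That works out to roughly 48 MB/s averaged over the whole disk, which seems plausible for a single sequential pass on a desktop drive of this generation.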

I was surprised that badblocks ran fine, and needed something new to try.  So I rebooted again and attempted to run mkreiserfs on each partition.  This succeeded on the following 94.5GB partitions:

sda5 sdb5 sdc5 sdd5

sda6 sdb6 sdc6

but sdd6 caused mkreiserfs and a later fdisk -ls to completely hang at the kernel level.  Hmm.

Yet another hard reboot later (to single user, this time), and now mkreiserfs to sdd6 was successful.  *sigh...

So how does one go about torture-testing eight 94GB reiserfs partitions?

time dd if=/dev/zero of=/mnt/sda5/zero is all I can think of atm... I'll follow up with the results.
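
Writing zeros with dd exercises the write path but never checks that the data reads back intact. One possible alternative, sketched here in Python under the assumption that a file-level test on the mounted partition is acceptable, writes a reproducible pattern and then verifies it. The function names, chunk size, and the example path in the comment are all hypothetical.

```python
import hashlib
import os
import tempfile

CHUNK = 1 << 20  # 1 MiB per write


def pattern(index):
    """Deterministic 1 MiB block derived from the chunk index."""
    seed = hashlib.sha256(index.to_bytes(8, "little")).digest()  # 32 bytes
    return seed * (CHUNK // len(seed))


def write_pattern(path, chunks):
    """Write `chunks` pattern blocks to `path` and flush them to disk."""
    with open(path, "wb") as f:
        for i in range(chunks):
            f.write(pattern(i))
        f.flush()
        os.fsync(f.fileno())  # push data to the device before verifying


def verify_pattern(path, chunks):
    """Return indices of chunks that do not read back as written."""
    bad = []
    with open(path, "rb") as f:
        for i in range(chunks):
            if f.read(CHUNK) != pattern(i):
                bad.append(i)
    return bad


# Demo on a small temporary file. For a real run, point `path` at
# something like /mnt/sda5/testfile and raise `chunks` (hypothetical values).
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "testfile")
    write_pattern(path, chunks=4)
    bad_chunks = verify_pattern(path, chunks=4)
    print("mismatched chunks:", bad_chunks)
```

A file much smaller than RAM may be verified from the page cache rather than from the disk, so a real run would need a file larger than memory (or a cache drop between the two phases).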

But after those finish, I'm not sure what to try next.  Got any bright ideas, folks?

I'm using gentoo-dev-sources-2.6.11-r2 now, not that I think that changes anything.

The system is perfectly stable when the drives attached to this controller are not being used (well, I for one think running folding@home and 8 torrents overnight is stable).  The system had no problems before I got this controller.  And previously, the 4 drives had been heavily used together in a hardware RAID0 on a WinXP video editing workstation before I moved them to this server, so I think they are fine, especially in light of the low-level badblocks not encountering any problems...

----------

## Tiger683

For the problem described, try the following patch:

```
--- linux-2.5.68-bk6/drivers/ide/ide-io.c Fri Apr 25 16:08:53 2003
+++ linux/drivers/ide/ide-io.c Fri Apr 25 16:13:37 2003
@@ -850,14 +850,14 @@
  * happens anyway when any interrupt comes in, IDE or otherwise
  * the kernel masks the IRQ while it is being handled.
  */
- if (hwif->irq != masked_irq)
+ if (masked_irq != IDE_NO_IRQ && hwif->irq != masked_irq)
  disable_irq_nosync(hwif->irq);
  spin_unlock(&ide_lock);
  local_irq_enable();
 /* allow other IRQs while we start this request */
  startstop = start_request(drive, rq);
  spin_lock_irq(&ide_lock);
- if (hwif->irq != masked_irq)
+ if (masked_irq != IDE_NO_IRQ && hwif->irq != masked_irq)
  enable_irq(hwif->irq);
  if (startstop == ide_released)
  goto queue_next;
_
```

Cheers

T

----------

## Gherald

I am really confused as to how a 2.5.68 IDE patch is supposed to help a 2.6.11 kernel handle SATA drives through the SCSI layer, but here's a log of how the patch fails:

```
pylon src # cat idepatch
--- linux-2.5.68-bk6/drivers/ide/ide-io.c Fri Apr 25 16:08:53 2003
+++ linux/drivers/ide/ide-io.c Fri Apr 25 16:13:37 2003
@@ -850,14 +850,14 @@
  * happens anyway when any interrupt comes in, IDE or otherwise
  * the kernel masks the IRQ while it is being handled.
  */
- if (hwif->irq != masked_irq)
+ if (masked_irq != IDE_NO_IRQ && hwif->irq != masked_irq)
  disable_irq_nosync(hwif->irq);
  spin_unlock(&ide_lock);
  local_irq_enable();
 /* allow other IRQs while we start this request */
  startstop = start_request(drive, rq);
  spin_lock_irq(&ide_lock);
- if (hwif->irq != masked_irq)
+ if (masked_irq != IDE_NO_IRQ && hwif->irq != masked_irq)
  enable_irq(hwif->irq);
  if (startstop == ide_released)
  goto queue_next;
pylon src # cd linux-2.6.11-gentoo-r2/
pylon linux-2.6.11-gentoo-r2 # cat ../idepatch | patch -p1
(Stripping trailing CRs from patch.)
patching file drivers/ide/ide-io.c
Hunk #1 FAILED at 850.
1 out of 1 hunk FAILED -- saving rejects to file drivers/ide/ide-io.c.rej
```

The same thing happens with vanilla-2.6.11.

And before anyone asks, I cannot use anything older than 2.6.11 because it is the first to support this controller.

Back to what I was doing: the eight "time dd if=/dev/zero of=...." runs I left going overnight finished fine, taking about 90 minutes to fill each drive entirely.  I let the machine idle for a while in single user, then noticed a "Disabling IRQ #18".  This is the first time I'd seen an IRQ other than #5 be disabled, and the first time I'd seen it happen w/o active md devices.  Here's the log:

```
Mar  6 03:49:19 pylon irq 18: nobody cared!

Mar  6 03:49:19 pylon

Mar  6 03:49:19 pylon Call Trace:<IRQ> <ffffffff80156120>{__report_bad_irq+48} <ffffffff801561f9>{note_interrupt+89}

Mar  6 03:49:19 pylon <ffffffff80155ac9>{__do_IRQ+281} <ffffffff8011130a>{do_IRQ+74}

Mar  6 03:49:19 pylon <ffffffff8010e931>{ret_from_intr+0}  <EOI> <ffffffff8010c520>{default_idle+0}

Mar  6 03:49:19 pylon <ffffffff8010c540>{default_idle+32} <ffffffff8010c68f>{cpu_idle+63}

Mar  6 03:49:19 pylon <ffffffff8073a7b5>{start_kernel+453} <ffffffff8073a28b>{_sinittext+651}

Mar  6 03:49:19 pylon

Mar  6 03:49:19 pylon handlers:

Mar  6 03:49:19 pylon [<ffffffff803c9000>] (pdc_interrupt+0x0/0x1f0)

Mar  6 03:49:19 pylon Disabling IRQ #18
```

Looking in /proc/irq/18 I see the empty directory "libata", whereas /proc/irq/5 has nothing now (other than smp_affinity, of course)... and, naturally, "fdisk -ls" crashed my single user bash, so I'm in for another hard reboot, and out of ideas.
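
Another place to look is /proc/interrupts, which lists every IRQ with its per-CPU counts and the handlers registered on it. The sketch below parses that layout in Python. The `SAMPLE` text is a made-up illustration (the counts and controller names are not from this machine), and `parse_interrupts` is a hypothetical helper.

```python
# Made-up sample in the general layout of /proc/interrupts;
# none of these numbers come from the machine in this thread.
SAMPLE = """\
           CPU0       CPU1
  0:     123456          0   IO-APIC-edge   timer
  5:     100000          0   IO-APIC-level  libata
 18:          0          0   IO-APIC-level  libata
NMI:          0          0
"""


def parse_interrupts(text):
    """Map IRQ number -> (total count across CPUs, description)."""
    lines = text.splitlines()
    n_cpus = len(lines[0].split())  # header row names one column per CPU
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # skip NMI, ERR and other non-numeric rows
        fields = rest.split()
        counts = [int(x) for x in fields[:n_cpus]]
        # Everything after the counts: controller type and handler names.
        desc = " ".join(fields[n_cpus:])
        table[int(label)] = (sum(counts), desc)
    return table


table = parse_interrupts(SAMPLE)
for irq, (count, desc) in sorted(table.items()):
    print(irq, count, desc)
```

On a live system the input would come from open("/proc/interrupts").read(); comparing two snapshots taken a few seconds apart shows which lines are actually firing.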

----------

## rasmussen

Hi!

I have the same controller and, I'm afraid, a very similar problem.

I have just barely managed to rescue my RAID-5 array after I tried to add a new drive sitting on the Promise controller. Odd thing is that the drive that borked out was not on the Promise controller, but on the mainboard (sata_sil).

I haven't seen the IRQ messages that you get, but the other errors are extremely similar to yours. Do you have some debugging enabled in your kernel?
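
The status bytes quoted in the logs in this thread can be decoded by hand, which makes it easier to compare one report with another. The sketch below (Python, added for illustration) uses the standard ATA status register bit assignments and SCSI sense key names; the helper name `decode_status` is made up for this example.

```python
# ATA status register bits, most significant first.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete)
    (0x02, "IDX"),   # index (obsolete)
    (0x01, "ERR"),   # error
]

# Only the SCSI sense keys that appear in this thread.
SCSI_SENSE_KEYS = {
    0x3: "MEDIUM ERROR",
    0xB: "ABORTED COMMAND",
}


def decode_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS if status & bit]


for value in (0x51, 0x58, 0xD8):
    print(hex(value), decode_status(value))
for key in (0x3, 0xB):
    print("sense key", hex(key), SCSI_SENSE_KEYS[key])
```

Decoded this way, 0x51 has the error bit set while 0x58 has the data request bit set, so the two reports are similar but not identical. When BSY is set (as in 0xd8) the other bits are not meaningful, which is presumably why the kernel prints only "Busy" for that value.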

I also tested all drives with badblocks (taking a long time), but none of the drives come up with errors.

Have you in the meantime found a solution?

Anyway, here are my specs/logs:

development-sources:

```

odin root # uname -a                                                                                                         

Linux odin 2.6.11.4y #1 Thu Mar 24 15:37:48 CET 2005 i686 AMD Athlon(tm) XP 3000+ AuthenticAMD GNU/Linux

```

```

Mar 24 20:31:34 odin ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA

Mar 24 20:31:34 odin ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA

Mar 24 20:31:34 odin Probing IDE interface ide0...

Mar 24 20:31:34 odin hda: Maxtor 6Y080L0, ATA DISK drive

Mar 24 20:31:34 odin ide0 at 0x1f0-0x1f7,0x3f6 on irq 14

Mar 24 20:31:34 odin Probing IDE interface ide1...

Mar 24 20:31:34 odin hdc: Maxtor 6Y120P0, ATA DISK drive

Mar 24 20:31:34 odin ide1 at 0x170-0x177,0x376 on irq 15

Mar 24 20:31:34 odin Probing IDE interface ide2...

Mar 24 20:31:34 odin Probing IDE interface ide3...

Mar 24 20:31:34 odin Probing IDE interface ide4...

Mar 24 20:31:34 odin Probing IDE interface ide5...

Mar 24 20:31:34 odin hda: max request size: 128KiB

Mar 24 20:31:34 odin hda: 160086528 sectors (81964 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(133)

Mar 24 20:31:34 odin hda: cache flushes supported

Mar 24 20:31:34 odin /dev/ide/host0/bus0/target0/lun0: p1 p2 p3 p4

Mar 24 20:31:34 odin hdc: max request size: 128KiB

Mar 24 20:31:34 odin hdc: 240121728 sectors (122942 MB) w/7936KiB Cache, CHS=65535/16/63, UDMA(133)

Mar 24 20:31:34 odin hdc: cache flushes supported

Mar 24 20:31:34 odin /dev/ide/host0/bus1/target0/lun0: p1

Mar 24 20:31:34 odin libata version 1.10 loaded.

Mar 24 20:31:34 odin sata_promise version 1.01

Mar 24 20:31:34 odin ata1: SATA max UDMA/133 cmd 0xF8802200 ctl 0xF8802238 bmdma 0x0 irq 10

Mar 24 20:31:34 odin ata2: SATA max UDMA/133 cmd 0xF8802280 ctl 0xF88022B8 bmdma 0x0 irq 10

Mar 24 20:31:34 odin ata3: SATA max UDMA/133 cmd 0xF8802300 ctl 0xF8802338 bmdma 0x0 irq 10

Mar 24 20:31:34 odin ata4: SATA max UDMA/133 cmd 0xF8802380 ctl 0xF88023B8 bmdma 0x0 irq 10

Mar 24 20:31:34 odin ata1: no device found (phy stat 00000000)

Mar 24 20:31:34 odin scsi0 : sata_promise

Mar 24 20:31:34 odin ata2: no device found (phy stat 00000000)

Mar 24 20:31:34 odin scsi1 : sata_promise

Mar 24 20:31:34 odin ata3: no device found (phy stat 00000000)

Mar 24 20:31:34 odin scsi2 : sata_promise

Mar 24 20:31:34 odin ata4: dev 0 cfg 49:2f00 82:7c6b 83:7b09 84:4003 85:7c69 86:3a01 87:4003 88:407f

Mar 24 20:31:34 odin ata4: dev 0 ATA, max UDMA/133, 240121728 sectors:

Mar 24 20:31:34 odin ata4: dev 0 configured for UDMA/133

Mar 24 20:31:34 odin scsi3 : sata_promise

Mar 24 20:31:34 odin Vendor: ATA       Model: Maxtor 6Y120M0    Rev: YAR5

Mar 24 20:31:34 odin Type:   Direct-Access                      ANSI SCSI revision: 05

Mar 24 20:31:34 odin sata_sil version 0.8

Mar 24 20:31:34 odin ata5: SATA max UDMA/100 cmd 0xF8804080 ctl 0xF880408A bmdma 0xF8804000 irq 11

Mar 24 20:31:34 odin ata6: SATA max UDMA/100 cmd 0xF88040C0 ctl 0xF88040CA bmdma 0xF8804008 irq 11

Mar 24 20:31:34 odin ata5: dev 0 cfg 49:2f00 82:7c6b 83:7b09 84:4003 85:7c69 86:3a01 87:4003 88:207f

Mar 24 20:31:34 odin ata5: dev 0 ATA, max UDMA/133, 234375000 sectors:

Mar 24 20:31:34 odin ata5: dev 0 configured for UDMA/100

Mar 24 20:31:34 odin scsi4 : sata_sil

Mar 24 20:31:34 odin ata6: dev 0 cfg 49:2f00 82:7c6b 83:7b09 84:4003 85:7c69 86:3a01 87:4003 88:207f

Mar 24 20:31:34 odin ata6: dev 0 ATA, max UDMA/133, 234375000 sectors:

Mar 24 20:31:34 odin ata6: dev 0 configured for UDMA/100

Mar 24 20:31:34 odin scsi5 : sata_sil

Mar 24 20:31:34 odin Vendor: ATA       Model: Maxtor 6Y120M0    Rev: YAR5

Mar 24 20:31:34 odin Type:   Direct-Access                      ANSI SCSI revision: 05

Mar 24 20:31:34 odin Vendor: ATA       Model: Maxtor 6Y120M0    Rev: YAR5

Mar 24 20:31:34 odin Type:   Direct-Access                      ANSI SCSI revision: 05

Mar 24 20:31:34 odin SCSI device sda: 240121728 512-byte hdwr sectors (122942 MB)

Mar 24 20:31:34 odin SCSI device sda: drive cache: write back

Mar 24 20:31:34 odin SCSI device sda: 240121728 512-byte hdwr sectors (122942 MB)

Mar 24 20:31:34 odin SCSI device sda: drive cache: write back

Mar 24 20:31:34 odin /dev/scsi/host3/bus0/target0/lun0: unknown partition table

Mar 24 20:31:34 odin Attached scsi disk sda at scsi3, channel 0, id 0, lun 0

Mar 24 20:31:34 odin SCSI device sdb: 234375000 512-byte hdwr sectors (120000 MB)

Mar 24 20:31:34 odin SCSI device sdb: drive cache: write back

Mar 24 20:31:34 odin SCSI device sdb: 234375000 512-byte hdwr sectors (120000 MB)

Mar 24 20:31:34 odin SCSI device sdb: drive cache: write back

Mar 24 20:31:34 odin /dev/scsi/host4/bus0/target0/lun0: p1

Mar 24 20:31:34 odin Attached scsi disk sdb at scsi4, channel 0, id 0, lun 0

Mar 24 20:31:34 odin SCSI device sdc: 234375000 512-byte hdwr sectors (120000 MB)

Mar 24 20:31:34 odin SCSI device sdc: drive cache: write back

Mar 24 20:31:34 odin SCSI device sdc: 234375000 512-byte hdwr sectors (120000 MB)

Mar 24 20:31:34 odin SCSI device sdc: drive cache: write back

Mar 24 20:31:34 odin /dev/scsi/host5/bus0/target0/lun0: p1

Mar 24 20:31:34 odin Attached scsi disk sdc at scsi5, channel 0, id 0, lun 0

Mar 24 20:31:34 odin Attached scsi generic sg0 at scsi3, channel 0, id 0, lun 0,  type 0

Mar 24 20:31:34 odin Attached scsi generic sg1 at scsi4, channel 0, id 0, lun 0,  type 0

Mar 24 20:31:34 odin Attached scsi generic sg2 at scsi5, channel 0, id 0, lun 0,  type 0

Mar 24 20:31:34 odin mice: PS/2 mouse device common for all mice

Mar 24 20:31:34 odin input: AT Translated Set 2 keyboard on isa0060/serio0

Mar 24 20:31:34 odin md: raid0 personality registered as nr 2

Mar 24 20:31:34 odin md: raid5 personality registered as nr 4

Mar 24 20:31:34 odin raid5: automatically using best checksumming function: pIII_sse

Mar 24 20:31:34 odin pIII_sse  :  5640.000 MB/sec

Mar 24 20:31:34 odin raid5: using function: pIII_sse (5640.000 MB/sec)

Mar 24 20:31:34 odin md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27

```

And here's what happened as I took one drive offline (not faulty, but needed elsewhere) to replace it with a new drive on the Promise controller...

```

Mar 24 21:43:54 odin raid5: Disk failure on hdc1, disabling device. Operation continuing on 2 devices

Mar 24 21:43:54 odin RAID5 conf printout:

Mar 24 21:43:54 odin --- rd:3 wd:2 fd:1

Mar 24 21:43:54 odin disk 0, o:1, dev:sdb1

Mar 24 21:43:54 odin disk 1, o:1, dev:sdc1

Mar 24 21:43:54 odin disk 2, o:0, dev:hdc1

Mar 24 21:43:54 odin RAID5 conf printout:

Mar 24 21:43:54 odin --- rd:3 wd:2 fd:1

Mar 24 21:43:54 odin disk 0, o:1, dev:sdb1

Mar 24 21:43:54 odin disk 1, o:1, dev:sdc1

Mar 24 21:44:30 odin md: unbind<hdc1>

Mar 24 21:44:30 odin md: export_rdev(hdc1)

Mar 24 21:45:00 odin md: bind<sda1>

Mar 24 21:45:00 odin RAID5 conf printout:

Mar 24 21:45:00 odin --- rd:3 wd:2 fd:1

Mar 24 21:45:00 odin disk 0, o:1, dev:sdb1

Mar 24 21:45:00 odin disk 1, o:1, dev:sdc1

Mar 24 21:45:00 odin disk 2, o:1, dev:sda1

Mar 24 21:45:00 odin .<6>md: syncing RAID array md0

Mar 24 21:45:00 odin md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.

Mar 24 21:45:00 odin md: using maximum available idle IO bandwith (but not more than 200000 KB/sec) for reconstruction.

Mar 24 21:45:00 odin md: using 128k window, over a total of 117185792 blocks.

Mar 24 21:48:08 odin sshd[6432]: Accepted password for root from ::ffff:172.16.100.11 port 1142 ssh2

Mar 24 21:49:20 odin ata6: command 0xc8 timeout, stat 0x58 host_stat 0x0

Mar 24 21:49:20 odin ata6: status=0x58 { DriveReady SeekComplete DataRequest }

Mar 24 21:49:20 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:49:20 odin sdc: Current: sense key=0xb

Mar 24 21:49:20 odin ASC=0x47 ASCQ=0x0

Mar 24 21:49:20 odin end_request: I/O error, dev sdc, sector 15926991

Mar 24 21:49:20 odin raid5: Disk failure on sdc1, disabling device. Operation continuing on 1 devices

Mar 24 21:49:20 odin ATA: abnormal status 0x58 on port 0xF88040C7

Mar 24 21:49:20 odin ATA: abnormal status 0x58 on port 0xF88040C7

Mar 24 21:49:20 odin ATA: abnormal status 0x58 on port 0xF88040C7

Mar 24 21:49:50 odin ata6: command 0xc8 timeout, stat 0xd8 host_stat 0x1

Mar 24 21:49:50 odin ata6: status=0xd8 { Busy }

Mar 24 21:49:50 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:49:50 odin sdc: Current: sense key=0xb

Mar 24 21:49:50 odin ASC=0x47 ASCQ=0x0

Mar 24 21:49:50 odin end_request: I/O error, dev sdc, sector 15926999

Mar 24 21:49:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:49:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:49:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:50:01 odin /usr/sbin/cron[6450]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons)

Mar 24 21:50:20 odin ata6: command 0xc8 timeout, stat 0xd8 host_stat 0x1

Mar 24 21:50:20 odin ata6: status=0xd8 { Busy }

Mar 24 21:50:20 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:50:20 odin sdc: Current: sense key=0xb

Mar 24 21:50:20 odin ASC=0x47 ASCQ=0x0

Mar 24 21:50:20 odin end_request: I/O error, dev sdc, sector 15927007

Mar 24 21:50:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:50:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:50:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:50:50 odin ata6: command 0xc8 timeout, stat 0xd8 host_stat 0x1

Mar 24 21:50:50 odin ata6: status=0xd8 { Busy }

Mar 24 21:50:50 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:50:50 odin sdc: Current: sense key=0xb

Mar 24 21:50:50 odin ASC=0x47 ASCQ=0x0

Mar 24 21:50:50 odin end_request: I/O error, dev sdc, sector 15927015

Mar 24 21:50:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:50:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:50:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:51:20 odin ata6: command 0xc8 timeout, stat 0xd8 host_stat 0x1

Mar 24 21:51:20 odin ata6: status=0xd8 { Busy }

Mar 24 21:51:20 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:51:20 odin sdc: Current: sense key=0xb

Mar 24 21:51:20 odin ASC=0x47 ASCQ=0x0

Mar 24 21:51:20 odin end_request: I/O error, dev sdc, sector 15927023

Mar 24 21:51:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:51:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:51:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:51:50 odin ata6: command 0xc8 timeout, stat 0xd8 host_stat 0x1

Mar 24 21:51:50 odin ata6: status=0xd8 { Busy }

Mar 24 21:51:50 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:51:50 odin sdc: Current: sense key=0xb

Mar 24 21:51:50 odin ASC=0x47 ASCQ=0x0

Mar 24 21:51:50 odin end_request: I/O error, dev sdc, sector 15927031

Mar 24 21:51:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:51:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:51:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:52:20 odin ata6: command 0xc8 timeout, stat 0xd8 host_stat 0x1

Mar 24 21:52:20 odin ata6: status=0xd8 { Busy }

Mar 24 21:52:20 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:52:20 odin sdc: Current: sense key=0xb

Mar 24 21:52:20 odin ASC=0x47 ASCQ=0x0

Mar 24 21:52:20 odin end_request: I/O error, dev sdc, sector 15927039

Mar 24 21:52:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:52:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:52:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:52:50 odin ata6: command 0xc8 timeout, stat 0xd8 host_stat 0x1

Mar 24 21:52:50 odin ata6: status=0xd8 { Busy }

Mar 24 21:52:50 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:52:50 odin sdc: Current: sense key=0xb

Mar 24 21:52:50 odin ASC=0x47 ASCQ=0x0

Mar 24 21:52:50 odin end_request: I/O error, dev sdc, sector 15927047

Mar 24 21:52:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:52:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:52:50 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:53:05 odin sshd[6469]: Accepted password for root from ::ffff:172.16.100.11 port 1143 ssh2

Mar 24 21:53:20 odin ata6: command 0xc8 timeout, stat 0xd8 host_stat 0x1

Mar 24 21:53:20 odin ata6: status=0xd8 { Busy }

Mar 24 21:53:20 odin SCSI error : <5 0 0 0> return code = 0x8000002

Mar 24 21:53:20 odin sdc: Current: sense key=0xb

Mar 24 21:53:20 odin ASC=0x47 ASCQ=0x0

Mar 24 21:53:20 odin end_request: I/O error, dev sdc, sector 15927055

Mar 24 21:53:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:53:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

Mar 24 21:53:20 odin ATA: abnormal status 0xD8 on port 0xF88040C7

...

```

At this point I rebooted in panic and forced a reassembly of the 2 remaining disks of the array. At least that way I got my data back.
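For anyone trying to read the `stat` / `status=` bytes in the log above: they are the ATA status register. The following is a minimal Python sketch, not something from the original post, that decodes them using the standard ATA bit assignments. The helper name `decode_ata_status` and the `busy_hides_rest` switch are made up for illustration; the switch mimics the way the kernel messages above only print `Busy` when the busy bit is set.

```python
# Standard ATA status register bits, highest bit first.
# The names follow the spelling used in the kernel messages in this thread.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(stat, busy_hides_rest=True):
    """Return the names of the bits set in an ATA status byte.

    While the busy bit is set the other bits are not meaningful, so by
    default only "Busy" is reported (hypothetical switch, for illustration).
    """
    if busy_hides_rest and stat & 0x80:
        return ["Busy"]
    return [name for mask, name in ATA_STATUS_BITS if stat & mask]


if __name__ == "__main__":
    for value in (0x58, 0xD8):
        print(hex(value), decode_ata_status(value))
```

Read this way, `0xd8` is just `0x58` with the busy bit added: the drive never finishes the command it was given.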

----------

## Richard Morris

Hi freeix,

I've just had a problem similar to yours, and I think I have solved it.

I have a computer built from the following: Gigabyte GA-7DXR+, Athlon XP, 1 GB RAM, 4 Maxtor HDDs, 2 NEC DVD drives. It usually runs services for me. Occasionally, usually when I was emerging new software packages or using it as a print server, it would just stop. Because I almost always use the computer remotely via ssh, no messages were displayed on the ssh terminal, and I couldn't un-blank the display when I went to the computer. I could, however, still ping it.

I reinstalled the computer today, and whilst it was installing I started to get the same messages as you. As I was sitting at the keyboard this time, I could actually see the error message appear  :Smile: 

```
Disabling IRQ #19
```

that was used by the Promise PDC20276 controller, and it promptly dropped the disks and stopped working, although as before I could ping it.

After some playing around, I found that I could trigger it on demand by plugging in a USB device. From there I discovered that my problems were caused by ACPI and the way IRQs were being handled by the BIOS in the PC. I've flashed the BIOS and that has fixed it for me.

A new BIOS possibly won't fix it for you, but it is certainly worth looking at your BIOS settings and putting your controller on a reserved IRQ.
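To see which devices currently share an IRQ line before changing BIOS settings, `/proc/interrupts` can be parsed. This is a rough Python sketch that assumes the 2.6-era layout posted in this thread (a header with one column per CPU, then counts, a controller type and a comma-separated device list). The function names are made up for illustration, and newer kernels print more columns, so treat it as a starting point only.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq: (total_count, [device, ...])}."""
    lines = [l for l in text.splitlines() if l.strip()]
    ncpu = len(lines[0].split())                 # header: one column per CPU
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():       # skip NMI, LOC, ERR, MIS rows
            continue
        fields = rest.split()
        counts = [int(c) for c in fields[:ncpu]]
        # fields[ncpu] is the controller type (e.g. IO-APIC-level);
        # everything after it is the device list.
        devices = " ".join(fields[ncpu + 1:])
        names = [d.strip() for d in devices.split(",") if d.strip()]
        table[int(label)] = (sum(counts), names)
    return table


def shared_irqs(table):
    """Return only the IRQ lines that have more than one handler registered."""
    return {irq: names for irq, (_, names) in table.items() if len(names) > 1}


# Sample taken from a /proc/interrupts listing posted in this thread.
SAMPLE = """\
           CPU0       CPU1
  0:    1081651       2848    IO-APIC-edge  timer
  2:          0          0          XT-PIC  cascade
 16:     200000          0   IO-APIC-level  libata
 17:      24166        122   IO-APIC-level  aic7xxx, aic7xxx, eth0
 18:    3825199          1   IO-APIC-level  zaphfc
 19:      13100          0   IO-APIC-level  uhci_hcd:usb1, eth1
NMI:          0          0
"""

if __name__ == "__main__":
    for irq, names in sorted(shared_irqs(parse_interrupts(SAMPLE)).items()):
        print(irq, names)
```

On a live system the input would be the contents of `/proc/interrupts` instead of `SAMPLE`.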

I hope this helps.

----------

## rasmussen

 *Richard Morris wrote:*   

> 
> 
> that was used by the Promise PDC20276 controller, and it promptly dropped the disks and stopped working, although as before I could ping it.
> 
> After some playing around, I found that I could trigger it on demand by plugging in a USB device. From there I discovered that my problems were caused by ACPI and the way IRQs were being handled by the BIOS in the PC. I've flashed the BIOS and that has fixed it for me.
> ...

 

I have tried just about all combinations of ACPI and APIC on/off, but the Promise controller is still crapping out on me. Every combination I have tried ends with the kernel logging "IRQ 18: nobody cared", followed by "Disabling IRQ #18", which effectively drops the controller and the drives on the floor and makes a hard reboot necessary.

A few questions:

Do you have (IO-)APIC enabled in your kernel?

Do you have ACPI enabled in your kernel?

When you write "flashed the BIOS", do you mean a BIOS upgrade or just clearing the settings?

I'll try your idea of putting the controller on a reserved IRQ (if the ASUS BIOS allows me to).
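To compare reports from different machines, it can help to pull the disabled IRQ number and the handlers registered on it out of the kernel log. The Python sketch below assumes the message layout seen in the traces posted in this thread (a `handlers:` line, one line per handler, then `Disabling IRQ #N`). The function name `disabled_irqs` is hypothetical, and the regular expressions are only as good as that assumption.

```python
import re

# "[<c0313200>] (nv_interrupt+0x0/0xc0)" -> captures "nv_interrupt"
HANDLER_RE = re.compile(r"\[<[0-9a-f]+>\]\s+\((\w+)\+0x")
DISABLED_RE = re.compile(r"Disabling IRQ #(\d+)")


def disabled_irqs(log_text):
    """Return a list of (irq_number, [handler names]) for each disabled IRQ."""
    events = []
    handlers = None                      # None until a "handlers:" line is seen
    for line in log_text.splitlines():
        if line.rstrip().endswith("handlers:"):
            handlers = []
            continue
        match = DISABLED_RE.search(line)
        if match:
            events.append((int(match.group(1)), handlers or []))
            handlers = None
            continue
        if handlers is not None:
            found = HANDLER_RE.search(line)
            if found:
                handlers.append(found.group(1))
    return events


# Lines copied from a trace posted in this thread.
SAMPLE = """\
irq 11: nobody cared!
 [<c01369ea>] __report_bad_irq+0x2a/0x90
 [<c0136adc>] note_interrupt+0x6c/0xd0
handlers:
[<c0313200>] (nv_interrupt+0x0/0xc0)
[<c0325de0>] (usb_hcd_irq+0x0/0x70)
Disabling IRQ #11
"""

if __name__ == "__main__":
    print(disabled_irqs(SAMPLE))
```

Because the patterns use `search` rather than `match`, lines carrying a syslog prefix (date, host name) should be handled as well.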

----------

## Richard Morris

 *rasmussen wrote:*   

> 
> 
> A few questions:
> 
> Do you have (IO-)APIC enabled in your kernel?
> ...

 

Hi,

I have the following APIC settings in my kernel:

```
CONFIG_X86_GOOD_APIC=y

CONFIG_X86_LOCAL_APIC=y

CONFIG_X86_IO_APIC=y
```

The ACPI options are 

```
# Power management options (ACPI, APM)

# ACPI (Advanced Configuration and Power Interface) Support

CONFIG_ACPI=y

CONFIG_ACPI_BOOT=y

CONFIG_ACPI_INTERPRETER=y

CONFIG_ACPI_SLEEP=y

CONFIG_ACPI_SLEEP_PROC_FS=y

CONFIG_ACPI_AC=y

CONFIG_ACPI_BATTERY=y

CONFIG_ACPI_BUTTON=y

CONFIG_ACPI_VIDEO=m

CONFIG_ACPI_FAN=y

CONFIG_ACPI_PROCESSOR=y

CONFIG_ACPI_THERMAL=y

# CONFIG_ACPI_ASUS is not set

CONFIG_ACPI_IBM=m

# CONFIG_ACPI_TOSHIBA is not set

CONFIG_ACPI_BLACKLIST_YEAR=0

# CONFIG_ACPI_DEBUG is not set

CONFIG_ACPI_BUS=y

CONFIG_ACPI_EC=y

CONFIG_ACPI_POWER=y

CONFIG_ACPI_PCI=y

CONFIG_ACPI_SYSTEM=y

# CONFIG_ACPI_CONTAINER is not set

CONFIG_PNPACPI=y

# CONFIG_SERIAL_8250_ACPI is not set

```

I flashed the BIOS with a new version and reset it to the optimised defaults.
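Comparing long option lists like the ones above by eye is error-prone, so here is a small Python sketch that reads kernel `.config` text and reports where two configurations differ for a given prefix. It relies only on the two line forms visible above (`CONFIG_X=value` and `# CONFIG_X is not set`). The helper names `read_kconfig` and `differing` are made up for illustration.

```python
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def read_kconfig(text):
    """Map option name -> value; options marked 'is not set' become 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = SET_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = UNSET_RE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def differing(a, b, prefix):
    """Options starting with prefix whose values differ; missing counts as 'n'."""
    keys = sorted(k for k in set(a) | set(b) if k.startswith(prefix))
    return {k: (a.get(k, "n"), b.get(k, "n"))
            for k in keys if a.get(k, "n") != b.get(k, "n")}


# A few lines taken from the listing above.
SAMPLE = """\
CONFIG_X86_IO_APIC=y
CONFIG_ACPI=y
CONFIG_ACPI_VIDEO=m
# CONFIG_ACPI_ASUS is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
"""

if __name__ == "__main__":
    mine = read_kconfig(SAMPLE)
    yours = dict(mine, CONFIG_ACPI_ASUS="y")   # pretend the other box sets it
    print(differing(mine, yours, "CONFIG_ACPI"))
```

On a real system the two inputs would be the contents of each machine's `.config` file.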

----------

## Gherald

My ACPI is enabled.  APIC is disabled.  I've been trying 2.6.12-rc1 for a couple days and haven't gotten an IRQ disabled message yet, but I don't actually have any drives plugged into the controller right now...

I'm just gonna wait for rc2 and test then, as the machine is pretty busy atm.

But thank you both for confirming that it is most likely not a hardware problem.  It must be the kernel driver at fault.

----------

## rasmussen

 *Richard Morris wrote:*   

> 
> 
> I have the following APIC settings in my kernel:
> 
> ```
> ...

 

Thanks for the info. I have the same APIC settings and almost the same ACPI settings.

Regarding BIOS update -- trouble is that it makes no difference which of the last three versions I use.

----------

## rasmussen

 *freeix wrote:*   

> My ACPI is enabled.  APIC is disabled.  I've been trying 2.6.12-rc1 for a couple days and haven't gotten an IRQ disabled message yet, but I don't actually have any drives plugged into the controller right now...
> 
> 

 

Hmm, yes, trying the 2.6.12-rc1 might be an idea.

 *freeix wrote:*   

> But thank you both for confirming that it is most likely not a hardware problem.  It must be the kernel driver at fault.

 

I too have a nagging feeling that the kernel driver is the problem.

Unfortunately, the usual answer to reports of the "disabling irq" problem sent to the linux-ide mailing list is "replace your cables and disable ACPI". Well, surprise, the same cables and drives work fine when attached to the Sil3112 controller on my mobo.

I also noticed something a bit odd after the crashes (3 crashes shown):

```

# grep libata /proc/interrupts

 17:    490000   IO-APIC-level  libata

 18:         0   IO-APIC-level  libata

```

```

# grep libata /proc/interrupts

 17:    400000   IO-APIC-level  libata

 18:         0   IO-APIC-level  libata

```

```

# grep libata /proc/interrupts

 17:    420000   IO-APIC-level  libata

 18:         0   IO-APIC-level  libata

```

Surprisingly round, those numbers...

BTW, I tried putting only 2 drives on the Promise SATAII controller and a third on mobo-PATA. Running RAID-5 on this array has not yet produced the "disabling irq" bug, even though I have been massively stress testing the array with iozone (app-benchmarks/iozone) for the last few hours.

Sigh.

----------

## rasmussen

In an attempt to try something different to solve the problems with the Promise SATA controller, I yesterday replaced one of the Maxtor drives with a Samsung drive.

Guess what... the problem has gone away.

In plain terms: since yesterday I have not had a single "disabling IRQ" error or anything else related to the controller. I have seriously tortured the RAID array using iozone and haven't been able to provoke a single error. Before the drive swap it would crash within minutes of doing just a single

```
iozone -s 1g -t 4
```

The Maxtor drive is dated March 10th 2004. Perhaps there's a problem between the firmware of the Maxtor drive, the Promise controller and the kernel driver.

Can't say that I feel any wiser, but at least it's now working  :Rolling Eyes: 

----------

## Vlurk

I've got a drive that has exactly that kind of problem.  :Sad: 

```

irq 11: nobody cared!

 [<c01369ea>] __report_bad_irq+0x2a/0x90

 [<c0136320>] handle_IRQ_event+0x30/0x70

 [<c0136adc>] note_interrupt+0x6c/0xd0

 [<c01364a6>] __do_IRQ+0x146/0x160

 [<c01043a7>] do_IRQ+0x47/0x70

 =======================

 [<c010294a>] common_interrupt+0x1a/0x20

 [<c011bab0>] __do_softirq+0x30/0x90

 [<c01044b1>] do_softirq+0x41/0x50

 =======================

 [<c011bbd3>] irq_exit+0x33/0x40

 [<c01043ae>] do_IRQ+0x4e/0x70

 [<c010294a>] common_interrupt+0x1a/0x20

handlers:

[<c0313200>] (nv_interrupt+0x0/0xc0)

[<c0325de0>] (usb_hcd_irq+0x0/0x70)

Disabling IRQ #11

ata1: command 0x35 timeout, stat 0xd0 host_stat 0x24

ata1: status=0xd0 { Busy }

SCSI error : <0 0 0 0> return code = 0x8000002

sda: Current: sense key=0xb

    ASC=0x47 ASCQ=0x0

end_request: I/O error, dev sda, sector 452047335

Buffer I/O error on device sda1, logical block 56505909

lost page write due to I/O error on sda1

ATA: abnormal status 0xD0 on port 0x9F7

ATA: abnormal status 0xD0 on port 0x9F7

ATA: abnormal status 0xD0 on port 0x9F7

ata1: command 0x35 timeout, stat 0xd0 host_stat 0x20

ata1: status=0xd0 { Busy }

SCSI error : <0 0 0 0> return code = 0x8000002

sda: Current: sense key=0xb

    ASC=0x47 ASCQ=0x0

end_request: I/O error, dev sda, sector 452047343

Buffer I/O error on device sda1, logical block 56505910

lost page write due to I/O error on sda1

ATA: abnormal status 0xD0 on port 0x9F7

ATA: abnormal status 0xD0 on port 0x9F7

ATA: abnormal status 0xD0 on port 0x9F7

ata1: command 0x35 timeout, stat 0x50 host_stat 0x24

ata1: command 0x35 timeout, stat 0x50 host_stat 0x24

ata1: command 0x25 timeout, stat 0x50 host_stat 0x24

ata1: command 0x35 timeout, stat 0x50 host_stat 0x24

```

However, it is a brand new Maxtor 6B250R0. It is Ultra ATA/133, but plugged in via SATA: the disk itself is installed in a SATA+USB2 external enclosure from BYTECC. That enclosure rocks, by the way... My SATA controller is the one built into the NForce2 chipset, and my motherboard is a K7N2 Delta2 (MS-6570NMS). Without any doubt, the drive functions properly: my enclosure works over USB while running Linux, and both SATA and USB seem to work with Windows XP. I really suspect a bug in sata_nv at this point. Oh well, after all it is still "Experimental".  :Rolling Eyes: 

----------

## rasmussen

 *Vlurk wrote:*   

> I got a drive that does exactly that kind of problem. 
> 
> ...
> 
> However, it is a brand new Maxtor 6B250R0. 
> ...

 

Yeah, although I got my array up and running by replacing a Maxtor drive with a Samsung drive I still suspect that the real problem is in the kernel.

Why? 'Cause there's another Maxtor in the array with the same production date as the one I removed. And the drive I removed works perfectly well on a Silicon Image controller. And finally, after rigorous drive testing I can say for sure that there are no bad sectors or similar on the Maxtor drive.  :Rolling Eyes: 

----------

## George_ch

I have the same problems using a Promise SATAII 150 TX4 (PDC40518) controller, an Asus A8V and 4 ST3200822AS drives. It looks like the controller hangs whenever something unexpected happens with the disks, such as manually powering off a disk or a disk restarting due to a bad backplane.

```
irq 193: nobody cared!

Call Trace:<IRQ> <ffffffff80155d50>{__report_bad_irq+48} <ffffffff80155e29>{note_interrupt+89}

       <ffffffff801556f9>{__do_IRQ+281} <ffffffff801111f2>{do_IRQ+66}

       <ffffffff8010e8b1>{ret_from_intr+0}  <EOI> <ffffffff8010c520>{default_idle+0}

       <ffffffff8010c540>{default_idle+32} <ffffffff8010c68f>{cpu_idle+63}

       <ffffffff805207a5>{start_kernel+437} <ffffffff8052028b>{_sinittext+651}

handlers:

[<ffffffff880524b0>] (pdc_interrupt+0x0/0x1f0 [sata_promise])

Disabling IRQ #193
```

So far I have tried the following kernel versions:

2.6.9-gentoo-r14  http://marc.theaimsgroup.com/?l=linux-ide&m=109814647928388&w=2

2.6.11ac1

2.6.11.6

2.6.11.6  patched with 2.6.11-bk6-libata-dev1.patch.gz

So I'm out of ideas about what else I could try. Kernel 2.4.x? Is 2.4 supported by Gentoo for AMD64?

----------

## rasmussen

 *George_ch wrote:*   

> So I'm out of ideas about what else I could try.

 

I believe that there's enough "evidence" in this thread to warrant a bug report on the linux-ide mailing list.

(But I suppose you already did that George_ch?)

It looks like there's something fishy about the interrupt handling in the sata_promise driver. Perhaps related to this.

----------

## George_ch

Rasmussen, that's not my post on the linux-ide mailing list. I have just subscribed to the list and hope to see good news soon.

----------

## rasmussen

 *George_ch wrote:*   

> Rasmussen, that's not my post on the linux-ide mailing list. I have just subscribed to the list and hope to see good news soon.

 

Okay, I suppose there's more than one George with a Promise controller  :Wink: 

BTW, I've posted the problem with the Promise controller on the linux-ide list. Let's hope that somebody there has a clue.

----------

## Kailee

So has there been any help over at linux-ide? I am having the exact same problem with 250 GB Maxtor drives: one Maxtor 7Y250P0 (PATA) and three Maxtor 7Y250M0 (SATA), the SATA drives hanging off (surprise) a Promise 40518 SATAII TX4 (it could also be a 20518, I can't remember which). These are bound together in a RAID5 MD setup with two md devices (md0 and md1). md0 is 64 MB on /boot using ext2; md1 is split up using lvm2 into / (8 GB), /export (330 GB), swap (1 GB) and /backup (350 GB).

I get these messages when pushing the /export logical volume, and it's always /dev/sda2 that drops out of the RAID5. I have no problems removing and re-adding the partition, but it's a fairly lengthy process each time (a couple of hours). I tested the disks and they are fine individually.
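Since a dropped member can go unnoticed for a while, a small check of `/proc/mdstat` may be useful. The Python sketch below assumes the usual two-line layout per array (a device line with members, then a status line ending in something like `[4/3] [U_UU]`). The sample text is invented for illustration and is not output from the machine described here, and the function name `degraded_arrays` is hypothetical.

```python
import re

ARRAY_RE = re.compile(r"^(md\d+) : active (\S+) (.*)$")
STATE_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
MEMBER_RE = re.compile(r"(\w+)\[\d+\](\(F\))?")


def degraded_arrays(mdstat_text):
    """Return {array: {"missing": n, "failed": [members marked (F)]}}."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            failed = [name for name, flag in MEMBER_RE.findall(array.group(3)) if flag]
            current = (array.group(1), failed)
            continue
        state = STATE_RE.search(line)
        if state and current:
            wanted, working = int(state.group(1)), int(state.group(2))
            if working < wanted:
                result[current[0]] = {"missing": wanted - working, "failed": current[1]}
            current = None
    return result


# Invented example input, for illustration only.
SAMPLE = """\
Personalities : [raid1] [raid5]
md1 : active raid5 sdc2[3] sdb2[2] sda2[4](F) hda2[0]
      732587712 blocks level 5, 64k chunk, algorithm 2 [4/3] [U_UU]
md0 : active raid1 sdc1[3] sdb1[2] sda1[1] hda1[0]
      64128 blocks [4/4] [UUUU]
unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

Run from cron against the real `/proc/mdstat`, something like this could at least flag the degraded array before a second member drops.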

Is there *any* news on what this might be?

Thanks,

Kailee.

----------

## rasmussen

 *Kailee wrote:*   

> So has there been any help over at linux-ide?

 

Unfortunately not.

It turned out that some people have already repeatedly reported very similar/identical problems on the list, but it's almost always written off as "you have a cable problem" ... which at least for me definitely isn't the problem.

It would be really good if you could make a posting to the list too, reporting the problems you have.

Perhaps with enough bug reports and a concerted effort we can figure out what the hell is going on.

----------

## Kailee

Bummer. Where do I subscribe?

K.

----------

## rasmussen

 *Kailee wrote:*   

> Bummer. Where do I subscribe?

 

Yeah, it's really a bummer  :Sad: 

To subscribe, just send a mail to

```

majordomo@vger.kernel.org

```

and put

```

subscribe linux-ide

```

in the body. More general info here.

----------

## Gherald

The problem seems to have been solved for me since 2.6.12-rc1.

Has anyone else tried a .12-rc?

----------

## Grooby

Has anyone figured out this problem?  I've not yet tried RC1 or RC2, but I am having a similar problem with my SATAII 150 TX2.  I'm trying to build a RAID5 array across the onboard IDE and the Promise card, and it dies a couple of minutes after the resync of the RAID device starts.

----------

## rasmussen

 *Grooby wrote:*   

> Has anyone figured out this problem?  I've not yet tried RC1 or RC2, but I am having a similar problem with my SATAII 150 TX2.

 

AFAIK, no.

However, it seems that an almost scary number of bugs have been fixed in the SATA layer recently. But, after I got my RAID somehow running (see posts above) I'm too chicken to try out the newer kernels.

In any case, it would be great if you could post a bug report to the linux-ide mailing list.

----------

## kulesa

Just to throw in my frustrations on this issue: the Promise TX4 card is dropping off on 2.6.12-gentoo-r10 and -r9.

Sep 28 18:57:43 xtc irq 21: nobody cared (try booting with the "irqpoll" option)

Sep 28 18:57:43 xtc [<c013c4da>] __report_bad_irq+0x2a/0x90

Sep 28 18:57:43 xtc [<c013bd60>] handle_IRQ_event+0x30/0x70

Sep 28 18:57:43 xtc [<c013c5d9>] note_interrupt+0x79/0xe0

Sep 28 18:57:43 xtc [<c013beb7>] __do_IRQ+0x117/0x120

Sep 28 18:57:43 xtc [<c0105699>] do_IRQ+0x19/0x30

Sep 28 18:57:43 xtc [<c0103922>] common_interrupt+0x1a/0x20

Sep 28 18:57:43 xtc [<c0100bf0>] default_idle+0x0/0x30

Sep 28 18:57:43 xtc [<c0100c13>] default_idle+0x23/0x30

Sep 28 18:57:43 xtc [<c0100cc7>] cpu_idle+0x67/0x70

Sep 28 18:57:43 xtc handlers:

Sep 28 18:57:43 xtc [<c0307030>] (pdc_interrupt+0x0/0x1b0)

Sep 28 18:57:43 xtc Disabling IRQ #21

And from there the SATA drives on the TX4 are no longer functioning.  The boot volumes are on PATA on the AMD 768 MP bridge.

Sep 28 18:59:41 xtc end_request: I/O error, dev sda, sector 786663

Sep 28 18:59:41 xtc Buffer I/O error on device sda1, logical block 98325

Sep 28 18:59:41 xtc lost page write due to I/O error on sda1

It doesn't happen very often, but often enough to toast my server from time to time.

lspci

0000:00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 20)

0000:00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge

0000:00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ISA (rev 05)

0000:00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE (rev 04)

0000:00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ACPI (rev 03)

0000:00:09.0 Mass storage controller: Promise Technology, Inc. PDC20518/PDC40518 (SATAII 150 TX4) (rev 02)

0000:00:10.0 PCI bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] PCI (rev 05)

0000:01:05.0 VGA compatible controller: nVidia Corporation NV4 [RIVA TNT] (rev 04)

0000:02:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-768 [Opus] USB (rev 07)

0000:02:04.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 20)

0000:02:06.0 SCSI storage controller: Adaptec AIC-7892B U160/m (rev 02)

0000:02:08.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)

----------

## Neo_0815

Any solution out there?

I've got 2 300tx4 controllers; mine is bailing out the same way with kernel 2.6.15-r6: irq 17 nobody cared, disabled ...

kind regards

----------

## rasmussen

I have the Promise PDC20518 SATAII150 TX4 card (which is what the thread originally was about) and I somehow managed to fix my problem.

I upgraded the kernel and applied a BIOS update for the card (found on the Promise website). I was so relieved to finally have it working that I didn't bother going back to figure out whether it was the kernel or the BIOS upgrade (or both) that did the trick. That was sometime before X-mas, I think. I haven't had a single "nobody cared" interrupt since.

BTW, my kernel is:

```
Linux odin 2.6.14y #1 Mon Oct 31 01:47:05 CET 2005 i686 AMD Athlon(tm) XP 3000+ AuthenticAMD GNU/Linux
```

----------

## Neo_0815

Hm, the controller already has the newest BIOS available. I am using 2.6.16.7 with the irqpoll option. So far no error has occurred, but I don't know whether it's really working, and a crashed RAID5 isn't what I want to end up with  :Wink: .

kind regards

----------

## I.C.Wiener

I purchased a Promise SATAII 150 TX4 nine months ago when I figured out that my Sil3112 doesn't like my Seagate discs and was thus working in some sort of compatibility mode only.

However, so far I haven't had much luck with the TX4 either. First I used it in my desktop system and had 1-2 crashes every week, usually when watching TV with my bttv card while having lots of disc activity at the same time. The disc suddenly became inaccessible and the system crashed.

I thought this was somehow related to the bttv card, so I decided to buy a new Samsung disc for my desktop system (which works perfectly with the onboard Sil3112) and move the Seagate discs with the TX4 into my DSL router. But there was no way to get it working. I've been trying since Monday (5 wasted holidays!) and am currently trying one last time with gentoo-sources-2.6.16-r9. It looks good so far; it never kept copying data this long before. I'll let you know as soon as it crashes   :Confused: 

Here is what I get:

```

libata version 1.20 loaded.

sata_promise 0000:00:10.0: version 1.03

ata1: SATA max UDMA/133 cmd 0xD8872200 ctl 0xD8872238 bmdma 0x0 irq 16

ata2: SATA max UDMA/133 cmd 0xD8872280 ctl 0xD88722B8 bmdma 0x0 irq 16

ata3: SATA max UDMA/133 cmd 0xD8872300 ctl 0xD8872338 bmdma 0x0 irq 16

ata4: SATA max UDMA/133 cmd 0xD8872380 ctl 0xD88723B8 bmdma 0x0 irq 16

ata1: SATA link down (SStatus 0)

scsi2 : sata_promise

ata2: SATA link up 1.5 Gbps (SStatus 113)

ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f

ata2: dev 0 ATA-6, max UDMA/133, 390721968 sectors: LBA48

ata2: dev 0 configured for UDMA/133

scsi3 : sata_promise

ata3: SATA link down (SStatus 0)

scsi4 : sata_promise

ata4: SATA link up 1.5 Gbps (SStatus 113)

ata4: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f

ata4: dev 0 ATA-6, max UDMA/133, 390721968 sectors: LBA48

ata4: dev 0 configured for UDMA/133

scsi5 : sata_promise

  Vendor: ATA       Model: ST3200822AS       Rev: 3.01

  Type:   Direct-Access                      ANSI SCSI revision: 05

SCSI device sdc: 390721968 512-byte hdwr sectors (200050 MB)

sdc: Write Protect is off

sdc: Mode Sense: 00 3a 00 00

SCSI device sdc: drive cache: write back

SCSI device sdc: 390721968 512-byte hdwr sectors (200050 MB)

sdc: Write Protect is off

sdc: Mode Sense: 00 3a 00 00

SCSI device sdc: drive cache: write back

 sdc: sdc1

sd 3:0:0:0: Attached scsi disk sdc

sd 3:0:0:0: Attached scsi generic sg3 type 0

  Vendor: ATA       Model: ST3200822AS       Rev: 3.01

  Type:   Direct-Access                      ANSI SCSI revision: 05

SCSI device sdd: 390721968 512-byte hdwr sectors (200050 MB)

sdd: Write Protect is off

sdd: Mode Sense: 00 3a 00 00

SCSI device sdd: drive cache: write back

SCSI device sdd: 390721968 512-byte hdwr sectors (200050 MB)

sdd: Write Protect is off

sdd: Mode Sense: 00 3a 00 00

SCSI device sdd: drive cache: write back

 sdd: sdd1

sd 5:0:0:0: Attached scsi disk sdd

sd 5:0:0:0: Attached scsi generic sg4 type 0

kjournald starting.  Commit interval 5 seconds

EXT3 FS on dm-0, internal journal

EXT3-fs: mounted filesystem with ordered data mode.

kjournald starting.  Commit interval 5 seconds

EXT3 FS on sdd1, internal journal

EXT3-fs: recovery complete.

EXT3-fs: mounted filesystem with ordered data mode.

irq 16: nobody cared (try booting with the "irqpoll" option)

 [<c01351d6>] __report_bad_irq+0x31/0x74

 [<c01352ae>] note_interrupt+0x7d/0xa3

 [<c0134cc0>] __do_IRQ+0x95/0xc5

 [<c0104749>] do_IRQ+0x19/0x24

 [<c01030d6>] common_interrupt+0x1a/0x20

 [<c0100b3c>] default_idle+0x0/0x55

 [<c0100b68>] default_idle+0x2c/0x55

 [<c0100bff>] cpu_idle+0x5a/0x6f

 [<c032a718>] start_kernel+0x148/0x14a

handlers:

[<d88a0459>] (pdc_interrupt+0x0/0x19b [sata_promise])

Disabling IRQ #16

ata2: command timeout

ATA: abnormal status 0xFF on port 0xD887229C

ata2: translated ATA stat/err 0xff/00 to SCSI SK/ASC/ASCQ 0xb/47/00

ata2: status=0xff { Busy }

sd 3:0:0:0: SCSI error: return code = 0x8000002

sdc: Current: sense key=0xb

    ASC=0x47 ASCQ=0x0

end_request: I/O error, dev sdc, sector 27363335

EXT3-fs error (device dm-0): ext3_free_branches: Read failure, inode=720915, block=3420216

Aborting journal on device dm-0.

ata2: command timeout

ATA: abnormal status 0xFF on port 0xD887229C

ata2: translated ATA stat/err 0xff/00 to SCSI SK/ASC/ASCQ 0xb/47/00

ata2: status=0xff { Busy }

sd 3:0:0:0: SCSI error: return code = 0x8000002

sdc: Current: sense key=0xb

    ASC=0x47 ASCQ=0x0

end_request: I/O error, dev sdc, sector 1655

...same errors a few dozen times...

ATA: abnormal status 0xFF on port 0xD887229C

ata2: translated ATA stat/err 0xff/00 to SCSI SK/ASC/ASCQ 0xb/47/00

ata2: status=0xff { Busy }

sd 3:0:0:0: SCSI error: return code = 0x8000002

sdc: Current: sense key=0xb

    ASC=0x47 ASCQ=0x0

end_request: I/O error, dev sdc, sector 170919495

Buffer I/O error on device dm-0, logical block 21364736

lost page write due to I/O error on dm-0

```

I tried almost all combinations of acpi/apm on/off, (no)apic, irqpoll, (non-)preemptible kernel, irq load balancing on/off,...

The only difference I noticed was that with irqpoll I get no error message; the kernel just freezes completely.
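When working through combinations like these it is easy to lose track of which ones have actually been tested. A short Python sketch that enumerates every combination of a few boot parameters could serve as a checklist. `acpi=off`, `noapic` and `irqpoll` are the kernel parameters already mentioned in this thread; the dictionary layout and the function name `boot_matrix` are my own assumptions, and kernel build options such as preemption would still have to be tracked separately.

```python
from itertools import product

# Each toggle maps to (default, alternative); "" means "leave at default".
TOGGLES = {
    "acpi": ("", "acpi=off"),
    "apic": ("", "noapic"),
    "irqpoll": ("", "irqpoll"),
}


def boot_matrix(toggles):
    """Yield one kernel command-line fragment per combination of toggles."""
    names = list(toggles)
    for combo in product(*(toggles[name] for name in names)):
        yield " ".join(param for param in combo if param)


if __name__ == "__main__":
    for number, params in enumerate(boot_matrix(TOGGLES), start=1):
        print(number, params or "(defaults)")
```

Three two-way toggles give eight boots to try, which is tedious but at least finite.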

cat /proc/interrupts :

```

           CPU0       CPU1       

  0:    1081651       2848    IO-APIC-edge  timer

  2:          0          0          XT-PIC  cascade

 16:     200000          0   IO-APIC-level  libata

 17:      24166        122   IO-APIC-level  aic7xxx, aic7xxx, eth0

 18:    3825199          1   IO-APIC-level  zaphfc

 19:      13100          0   IO-APIC-level  uhci_hcd:usb1, eth1

NMI:          0          0 

LOC:    1084410    1084423 

ERR:          0

MIS:          0

```

Btw: I have those surprisingly round interrupt counters, too.
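One possible explanation for the round counters, offered as an assumption rather than anything established in this thread: as far as I know, the 2.6-era spurious interrupt check (the `note_interrupt` function visible in the traces) only evaluates an IRQ line once every 100,000 interrupts, and disables the line if more than 99,900 of them in that window went unhandled. If that is right, a line that gets stuck would be shut off exactly at a multiple of 100,000, which would be consistent with the 200000 shown above. The Python model below is a simplified sketch of that logic, not the kernel code; the class and function names are mine.

```python
CHECK_WINDOW = 100_000      # interrupts between two checks (assumed kernel value)
UNHANDLED_LIMIT = 99_900    # more unhandled than this per window => disable


class IrqLine:
    """Toy model of the per-IRQ bookkeeping done by the spurious IRQ check."""

    def __init__(self):
        self.total = 0              # what /proc/interrupts would show
        self.irq_count = 0          # interrupts seen in the current window
        self.irqs_unhandled = 0     # of those, how many no handler claimed
        self.disabled = False

    def note_interrupt(self, handled):
        if self.disabled:
            return
        self.total += 1
        if not handled:
            self.irqs_unhandled += 1
        self.irq_count += 1
        if self.irq_count < CHECK_WINDOW:
            return
        # End of a window: decide, then start counting afresh.
        self.irq_count = 0
        if self.irqs_unhandled > UNHANDLED_LIMIT:
            self.disabled = True    # "Disabling IRQ #N"
        self.irqs_unhandled = 0


def interrupts_until_disabled(healthy_interrupts):
    """Total count at which the line is disabled once it starts screaming."""
    line = IrqLine()
    for _ in range(healthy_interrupts):
        line.note_interrupt(handled=True)
    while not line.disabled:
        line.note_interrupt(handled=False)
    return line.total


if __name__ == "__main__":
    print(interrupts_until_disabled(0))
    print(interrupts_until_disabled(150_000))
```

In this model the final count is always a multiple of the window size, regardless of how many interrupts were handled properly before the line got stuck.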

/* edit */

...crashed again  :Sad: 

I just found a cheap Sil3114 card on ebay. I'll try that one before wasting any more time.

----------

## punch_0k

Hi everybody!

I'm having the exact same problem here.

```

irq 17: nobody cared (try booting with the "irqpoll" option)

[<c014175a>] __report_bad_irq+0x2a/0x90

[<c0140f79>] handle_IRQ_event+0x39/0x70

[<c0141880>] note_interrupt+0xa0/0x100

[<c01410b8>] __do_IRQ+0x108/0x120

[<c0105969>] do_IRQ+0x19/0x30

[<c0103ad2>] common_interrupt+0x1a/0x20

[<c0100dbe>] default_idle+0x2e/0x60

[<c0100e75>] cpu_idle+0x65/0x80

[<c042e94a>] start_kernel+0x16a/0x190

[<c042e300>] unknown_bootoption+0x0/0x1e0

handlers:

[<c02ce250>] (pdc_interrupt+0x0/0x1b0)

[<c02ce250>] (pdc_interrupt+0x0/0x1b0)

Disabling IRQ #17

[ata7: command timeout

[ATA: abnormal status 0xFF on port 0xE080A31C

[ata7: translated ATA stat/err 0xff/00 to SCSI SK/ASC/ASCQ 0xb/47/00

[ata7: status=0xff { Busy }

[sd 6:0:0:0: SCSI error: return code = 0x8000002

[sdf: Current: sense key=0xb

    ASC=0x47 ASCQ=0x0

end_request: I/O error, dev sdf, sector 442830655

```

This always happens under heavy I/O on the controllers.

I'm using kernel v2.6.16-gentoo-r9.

Is there anyone who has solved the problem?

I'm not sure what to do anymore   :Crying or Very sad: 

Thanks in advance

----------

## I.C.Wiener

I had the Promise card plugged into a PCI slot where it has to share its IRQ with the onboard SCSI controller. However, I loaded neither libata nor the sata_promise module; the card was just plugged in, unused.

But guess what: the "disabling interrupt ..." message came anyway! All devices on this interrupt stopped working. So I wonder how this can be a driver issue if the driver wasn't even loaded at the time of the crash.

I need to admit, the hardware I'm using is not quite common:

It's a rather old dual PII-300 with an Intel LX/EX chipset, and the PCI slots are version 2.0 or 2.1 (5V, I guess).

Perhaps it's a problem with the BIOS, or PCI 2.2 cards do not work with old 5V PCI slots.

Still waiting for the Sil3114 card. I'll let you know if that controller works better...

Btw.: It would be mentioned somewhere if the driver wasn't SMP-safe, wouldn't it?

----------

## I.C.Wiener

OK, finally the Sil3114 arrived. It didn't work either. But it gave me some interesting I/O error messages instead of the weird "irq disabled".

So the reason for all my trouble turned out to be a faulty SATA backplane. 2 of 3 slots are not working properly. If it had been just one, I would have realised within 1-2 hours. But 2 of them working "sometimes only" made me go nuts. However, I need to say the Promise controller just sucks. With a disc in one of the faulty slots, the system won't even boot, or crashes at some point during boot before the controller gets initialized (Gentoo is installed on a real SCSI disc and shouldn't try to access any of the SATA discs during boot). And if it survives boot, it'll crash during modprobe or shortly after and display the well-known "disabling irq - nobody cared" message.

The sil3114 does not show such bad behaviour but gives adequate error-msg instead. This really helped a lot.

Oh, I almost forgot to mention, this is the product you should not buy:

http://www.icydockusa.com/product/mb453.htm

This damn thing caused me 2 weeks of trouble, cost me 90Euro + 25Euro for the sil3114,... not to mention the rma process I'll have to walk through now.

----------

## punch_0k

Hi folks,

I.C.Wiener, it seems we have almost the same struggle. I've been working on this problem for a week, and it has cost a lot of nerves.

Now I have tried something which _seems_ to have solved the problem.

Please try editing your .config and adding this line:

"CONFIG_PCI_MSI=y". It should be around line 230-240 of your .config.

Recompile the kernel and try.

I'm still testing and have had no error yet.

Hopefully this solves it.

----------

## I.C.Wiener

First I need to say, my problem was clearly a hardware problem. I removed the faulty SATA backplane yesterday, and since then everything works fine. So far I have copied 90GB from an unencrypted SATA disc to an encrypted (192-bit AES) LUKS partition. I guess this must be really stressing the hardware, as both CPUs are at 100% usage and asterisk starts lagging, damn :(

I just set asterisk's niceness to -10, which seems to have some effect, but it's still not running smoothly. So I will definitely give this "Message Signaled Interrupt" stuff a try. Thanks for the hint, punch. Actually, this could also solve some other problems with my hfc-isdn card, which is raising a few thousand (I guess it was 8000?) interrupts per second during a call.

Though I hope you don't mind that I won't reinstall the SATA backplane just to see whether the system keeps crashing with faulty hardware and MSI enabled. This is actually supposed to be a production system and has been down for far too long already.

----------

