# [solved] SATA power problem after starting XFCE nxserver

## mbar

Hi all, after some months of trying to track down this issue, I'm asking for help :)

I have a home file server running Gentoo (~amd keyword) with 9 SATA drives. It boots to console only, but from time to time I need to run some GUI applications (like brasero / xfburn), so I have XFCE and nxserver installed.

Everything is OK until I log in from another computer using nxclient. Then the following shows up in dmesg:

```
ata1.00: configured for UDMA/133
ata1: EH complete
ata3.00: configured for UDMA/133
ata3: EH complete
ata4.00: configured for UDMA/133
ata4: EH complete
ata7.00: configured for UDMA/100
ata7: EH complete
ata8.00: configured for UDMA/100
ata8: EH complete
ata9.00: configured for UDMA/100
ata9: EH complete
ata10.00: configured for UDMA/100
ata10: EH complete
ata13.00: configured for UDMA/100
ata13: EH complete
ata14.00: configured for UDMA/100
ata14: EH complete
EXT4-fs (sda3): re-mounted. Opts: commit=30,commit=0
```
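For anyone wanting to track how often this happens, here is a minimal Python sketch that counts `EH complete` lines per ATA port in kernel log text. The function name `count_eh_events` and the regular expression are my own illustration, not part of any existing tool; the pattern is based only on the log lines shown above, with an optional timestamp prefix tolerated.

```python
import re
from collections import Counter

# Matches libata lines such as "ata1: EH complete". An optional
# "[  123.456] " timestamp prefix is tolerated but not required.
EH_RE = re.compile(r"^(?:\[\s*\d+\.\d+\]\s*)?(ata\d+): EH complete\s*$")


def count_eh_events(log_text):
    """Return a Counter mapping port name (e.g. 'ata1') to EH completions."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EH_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the saved output of `dmesg` before and after an nxclient login should show whether the counts jump at login time.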

It started happening like 4 months ago.

Because my server is built from desktop components, I thought the mainboard was failing (it had been running almost non-stop since 2009), so I bought a new mainboard and RAM and swapped them in yesterday. Unfortunately that did not solve my problem.

Right now I have:

```
00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RX780/RX790 Chipset Host Bridge
00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD790 PCI to PCI bridge (external gfx0 port A)
00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD790 PCI to PCI bridge (PCI express gpp port A)
00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD790 PCI to PCI bridge (PCI express gpp port B)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD790 PCI to PCI bridge (PCI express gpp port F)
00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
00:12.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.1 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0 USB OHCI1 Controller
00:12.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.1 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0 USB OHCI1 Controller
00:13.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices [AMD] nee ATI SBx00 SMBus Controller (rev 3c)
00:14.1 IDE interface: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 IDE Controller
00:14.2 Audio device: Advanced Micro Devices [AMD] nee ATI SBx00 Azalia (Intel HDA)
00:14.3 ISA bridge: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 LPC host controller
00:14.4 PCI bridge: Advanced Micro Devices [AMD] nee ATI SBx00 PCI to PCI Bridge
00:14.5 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RV370 5B60 [Radeon X300 (PCIE)]
01:00.1 Display controller: Advanced Micro Devices [AMD] nee ATI RV370 [Radeon X300SE]
02:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
03:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
05:07.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
```

My main problem with the ATA error handling is that after these dmesg messages appear, HDD power saving stops completely until reboot! My drives never spin down again, even after restarting the hdparm service.
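To check whether a drive has actually spun down, `hdparm -C` reports the current power state. Below is a small Python sketch that parses that output; the helper names `parse_drive_state` and `query_drive_state` are hypothetical, the output format is assumed to be the usual `drive state is:  standby` line, and running `hdparm` normally requires root.

```python
import re
import subprocess

# hdparm -C prints a line like " drive state is:  active/idle"
# (other typical values: standby, sleeping, unknown).
STATE_RE = re.compile(r"drive state is:\s*(\S+)")


def parse_drive_state(hdparm_output):
    """Extract the power state string from `hdparm -C` output, or None."""
    match = STATE_RE.search(hdparm_output)
    return match.group(1) if match else None


def query_drive_state(device):
    """Run `hdparm -C <device>` and return the parsed state (needs root)."""
    result = subprocess.run(
        ["hdparm", "-C", device], capture_output=True, text=True
    )
    return parse_drive_state(result.stdout)
```

Calling `query_drive_state("/dev/sda")` some time after the configured spindown timeout should return `standby` if power saving still works.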

What can I do now? PSU is relatively new, 400W Chieftec. If you need any other info, I'll post it.

Last edited by mbar on Tue Nov 06, 2012 1:13 pm; edited 1 time in total

----------

## NeddySeagoon

mbar,

I suspect that with 9 drives on a 400w PSU you are pushing your luck.

Your CPU core and all of your drive spin motors will run off the 12v. The data sheet I linked to is far from clear.

It claims a 12v combined power of 336w, but 336w is far more than can be provided by the 18A max output current (216w).

I don't know your drives, but 1A at 12v each is reasonable, so half of your 18A is gone already.

Your CPU core will run off the 12v too. A modern CPU working hard can easily soak up your other 9A on its own.

On top of that you have your case fans and graphics card.
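The arithmetic above can be written out as a short Python sketch. The 18A limit and the 1A-per-drive figure come from this post; the latter is a rough estimate, and spin-up surges, which are typically higher, are not modelled here.

```python
RAIL_VOLTAGE = 12.0       # volts
RAIL_MAX_CURRENT = 18.0   # amps, max 12v output current quoted in this post
DRIVE_COUNT = 9
AMPS_PER_DRIVE = 1.0      # rough per-drive estimate from this post


def rail_power_watts(voltage, current):
    """Power available on a rail: P = V * I."""
    return voltage * current


rail_watts = rail_power_watts(RAIL_VOLTAGE, RAIL_MAX_CURRENT)
drive_amps = DRIVE_COUNT * AMPS_PER_DRIVE
remaining_amps = RAIL_MAX_CURRENT - drive_amps

print(f"12v rail limit: {rail_watts:.0f} W")
print(f"Drives draw about {drive_amps:.0f} A, leaving {remaining_amps:.0f} A "
      f"for CPU, fans and graphics card")
```

With these numbers the rail tops out at 216 W, and the drives alone take half of the available current.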

I suspect the HDDs reset because of PSU issues.  For testing can you disconnect some of the drives?

At 34 euros, that's not a particularly good PSU. You cannot get good quality parts at that price.

Do you have another higher power PSU you can switch to for testing?

----------

## mbar

I suspect that you may be right. The problem started about the time I added another 2 drives to my RAID. The drives are almost all Samsungs (6 x 1.5 TB and 2 x 1 TB) plus a single Seagate 320 GB for the system. The Phenom II X3 and 16 GB of RAM don't help the case either...

I will try removing some HDDs. I guess it is also time to try a 500W PSU.

But have you any idea why ATA EH happens during the start of nxserver/xorg/xfce combo?

----------

## NeddySeagoon

mbar,

The CPU load is high while the program loads and initialises.  That pushes up the power consumption.

It's not just output power you need to look at in a PSU. With so many HDDs, you want one that can supply a lot of power on the 12v.

Choose one that has two separate 12v supplies. One for your HDD and one for your CPU. 

Something like this

Notice it has 12v1 and 12v2 for a total of 27A at 12v.  

It's a poor choice; that link is just to show the feature. At that price you will be lucky if it lasts a month past the end of the warranty period.

----------

## mbar

Guess it's PSU hunting time...

----------

## mbar

Corsair CX600 V2 http://www.corsair.com/builder-series-cx600-v2-80plus-certified-power-supply.html

or

XFX Core 550W http://xfxforce.com/en-gb/Products/Power-Supply/XFX/Pro-Series/ProSeries-550W-PSU/550W-Core-Edition-Full-Wired-Bronze.aspx

are within my budget. Which one?

I started getting CRC errors on files copied to my file server. Is this consistent with PSU problems? Fortunately the filesystem itself (XFS) is healthy; I just ran an xfs_check.

----------

## NeddySeagoon

mbar

Corsair is known to be good. I don't know the other one.

----------

## mbar

Did some testing today. Right now I'm not sure that the PSU is the cause.

I left only one HDD connected (system drive), and the problem is still here:

```
ata7: SATA link down (SStatus 0 SControl 0)
ata9: SATA link down (SStatus 0 SControl 0)
ata8: SATA link down (SStatus 0 SControl 0)
ata10: SATA link down (SStatus 0 SControl 0)
ata12: SATA link down (SStatus 0 SControl 310)
ata13: SATA link down (SStatus 0 SControl 310)
ata14: SATA link down (SStatus 0 SControl 310)
md: Skipping autodetection of RAID arrays. (raid=autodetect will force)
EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
VFS: Mounted root (ext4 filesystem) readonly on device 8:3.
devtmpfs: mounted
Freeing unused kernel memory: 460k freed
BFS CPU scheduler v0.420 by Con Kolivas.
scsi 14:0:0:0: Direct-Access     USB2.0   Mobile Disk      1.00 PQ: 0 ANSI: 2
sd 14:0:0:0: Attached scsi generic sg2 type 0
sd 14:0:0:0: [sdb] 983808 512-byte logical blocks: (503 MB/480 MiB)
sd 14:0:0:0: [sdb] Write Protect is on
sd 14:0:0:0: [sdb] Mode Sense: 0b 00 80 00
sd 14:0:0:0: [sdb] No Caching mode page present
sd 14:0:0:0: [sdb] Assuming drive cache: write through
sd 14:0:0:0: [sdb] No Caching mode page present
sd 14:0:0:0: [sdb] Assuming drive cache: write through
 sdb: sdb1
sd 14:0:0:0: [sdb] No Caching mode page present
sd 14:0:0:0: [sdb] Assuming drive cache: write through
sd 14:0:0:0: [sdb] Attached SCSI removable disk
udevd[1177]: starting version 182
EXT4-fs (sdb1): mounting ext2 file system using the ext4 subsystem
EXT4-fs (sdb1): mounted filesystem without journal. Opts: (null)
EXT4-fs (sdb1): mounting ext2 file system using the ext4 subsystem
EXT4-fs (sdb1): mounted filesystem without journal. Opts: (null)
EXT4-fs (sda3): re-mounted. Opts: commit=30
Adding 8388604k swap on /dev/sda1.  Priority:-1 extents:1 across:8388604k 
EXT4-fs (sda2): mounting ext2 file system using the ext4 subsystem
EXT4-fs (sda2): mounted filesystem without journal. Opts: (null)
EXT4-fs (sdb1): mounting ext2 file system using the ext4 subsystem
EXT4-fs (sdb1): mounted filesystem without journal. Opts: (null)
EXT4-fs (sdb1): mounting ext2 file system using the ext4 subsystem
EXT4-fs (sdb1): mounted filesystem without journal. Opts: (null)
r8169 0000:04:00.0: eth0: link down
r8169 0000:04:00.0: eth0: link down
r8169 0000:04:00.0: eth0: link up
ata1.00: configured for UDMA/133
ata1: EH complete
EXT4-fs (sda3): re-mounted. Opts: commit=30,commit=0
```

I noticed (refreshing dmesg every moment) that ata EH happens at the end of xfce/nxserver load, at the (almost) precise moment that xfce desktop is shown in nxclient window. Maybe there is some kind of rogue service/process that tries to detect something? Mount something?

What are my options for debugging this, considering the problem is software in nature?
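One possible way to narrow this down is to list the most recently started processes right after the EH messages appear and see what was launched at that moment. The sketch below is only an illustration under assumptions: it reads `/proc/<pid>/stat` on Linux, the function names `parse_stat` and `recent_processes` are made up for this post, and comparing the results with dmesg only works if kernel timestamps are enabled.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, starttime_ticks) from /proc/<pid>/stat contents."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left, _, rest = stat_text.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()
    # `tail` begins at field 3 (state); starttime is field 22 overall.
    return int(left), comm, int(fields[22 - 3])


def recent_processes(proc_root="/proc", limit=15):
    """Return up to `limit` (seconds_since_boot, pid, comm), newest first."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    procs = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, start = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry was unreadable
        procs.append((start / ticks_per_second, pid, comm))
    return sorted(procs, reverse=True)[:limit]
```

Running `recent_processes()` immediately after the desktop appears in the nxclient window gives a short list of candidates to disable one at a time.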

----------

## mbar

Solved with help from user fraser!

This was caused by the sys-power/upower package. After modifying USE flags, unmerging upower and a poweroff / poweron cycle, everything is back to normal. No more ATA EH in dmesg and no more blocked HDD sleep.

Seems to be a problem with the udev flag for xfce-base/xfce4-session?
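For anyone hitting the same symptom, a quick way to confirm that the upower daemon is no longer running after a new login is to scan `/proc` for it. This is a minimal sketch: `find_processes` is a hypothetical helper, and the daemon name `upowerd` is an assumption about how sys-power/upower names its process.

```python
import os


def find_processes(name, proc_root="/proc"):
    """Return sorted PIDs whose /proc/<pid>/comm equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                if handle.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return sorted(pids)


# Assumed daemon name; an empty list means no such process was found.
upower_pids = find_processes("upowerd")
print("upowerd PIDs:", upower_pids)
```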

----------

