# How do I monitor *what* spins my disks up?

## Jarjar

TL;DR: How do I monitor what accesses my disks (/dev/sd* directly, and/or /dev/md*) in real-time?

I posted about a problem with udev a few weeks ago: my RAID disks now refuse to spin down, but only while udev is active (if I run "udevadm control --stop-exec-queue", they spin down fine until I run --start-exec-queue again).

Very oddly, I noticed a week ago that if the md array consisting of the three disks (sdb-sdd) is still assembled - i.e. I have it unmounted, but don't run mdadm --stop - the disks do spin down!  :Confused: 

However, they start up (exactly) once a night, and not at the same times every night, either. I've tried to find a cron job that might be causing it, but I'm totally lost. No cron job matches the spin up times, and since they vary (see below) that just confuses me even more.

I run a script that checks their status (hdparm -C; does not spin them up) and logs it every 5 minutes (also via cron); here's the output, filtered through uniq and grep -v standby, so it shows when each disk spun up, with a maximum error of 5 minutes (the polling interval).
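For reference, the polling script is roughly this (drive list and log path here are just examples, not my exact setup) - hdparm -C reads the power state without waking a sleeping drive, so it's safe from cron:

```shell
#!/bin/sh
# Sketch of the 5-minute cron poller. The log path and the sdb-sdd
# drive list are examples; adjust for your own box.
LOG=/tmp/diskstate.log
for dev in sdb sdc sdd; do
    # hdparm -C prints a line like " drive state is:  standby";
    # field 4 is the state itself. Does not spin the drive up.
    state=$(hdparm -C /dev/$dev 2>/dev/null | awk '/drive state/ {print $4}')
    echo "$(date '+%Y-%m-%d %H:%M:%S') $dev: $state" >> "$LOG"
done
```

Piping the accumulated log through uniq and grep -v standby then leaves only the transitions.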

 *Quote:*   

> 
> 
> sdb:
> 
> 2010-01-24 02:55:01 active/idle
> ...

 

What's weird is:

1) They all spin up every night (never daytime, except when I do it manually)

2) They DON'T always spin up at the same time (see sdb/sdc, the second line of each - 03:00 vs 03:45).

Any ideas on how to track this down?

----------

## RedSquirrel

sys-process/iotop ?

----------

## Jarjar

Hmm, does it do logging?

I'll be asleep at the time, not to mention that I don't want to stare at a top screen for *literally* every second between 2am and 4am  :Smile: 

----------

## RedSquirrel

According to the man page it does, yes.

```
       -b, --batch

              Turn on non-interactive mode.  Useful for logging I/O usage over time.

```

Be aware you'll need a couple of kernel options which you may not have enabled, TASK_DELAY_ACCT and TASK_IO_ACCOUNTING. You can have a look at the project's homepage.
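A quick way to check, assuming your kernel exposes its config at /proc/config.gz (otherwise grep your /usr/src/linux/.config):

```shell
# Look for the two options iotop needs in the running kernel's config.
if zcat /proc/config.gz 2>/dev/null \
     | grep -E 'CONFIG_TASK_DELAY_ACCT|CONFIG_TASK_IO_ACCOUNTING'; then
    echo "check the lines above for =y"
else
    echo "not found (or /proc/config.gz not enabled) - grep your .config instead"
fi
```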

----------

## Jarjar

Thanks!  :Smile:  I didn't see any man page on the homepage... but it turned out I had the package installed already, anyway.  :Razz: 

I made a minor edit to the source code (damn, I love open source) so that batch mode prints a timestamp with every refresh.

If anyone else wants it:

(/usr/lib/python2.6/site-packages/iotop/ui.py)

```
--- ui.py	2008-07-07 21:23:39.000000000 +0200
+++ ui.py	2010-01-30 19:40:56.000000000 +0100
@@ -7,6 +7,7 @@
 import select
 import struct
 import sys
+import time
 
 from iotop.data import find_uids, TaskStatsNetlink, ProcessList
 from iotop.version import VERSION
@@ -172,7 +173,7 @@
         return map(format, processes)
 
     def refresh_display(self, total_read, total_write, duration):
-        summary = 'Total DISK READ: %s | Total DISK WRITE: %s' % (
+        summary = '[%s] Total DISK READ: %s | Total DISK WRITE: %s' % (time.strftime('%Y-%m-%d %H:%M:%S'),
                                         human_bandwidth(total_read, duration),
                                         human_bandwidth(total_write, duration))
         titles = ['  PID', ' USER', '      DISK READ', '  DISK WRITE',
```

Edit: It looks like I'll have some data to review in the morning, heh. Unfortunately it doesn't show which file(s) were accessed, which makes things a bit harder. At 60-second intervals, I get ~13 processes/threads doing IO. Most can be dismissed since they run 24/7, of course, so this may indeed help me along the way.  :Smile: 

Here's my beautiful hack:

```
sleep 21600; iotop -o -b -d 60 -n 180 > IOTOP_20100131_NIGHT # 01:56 - 04:56
```

----------

## Jarjar

OK, so I cut down everything, removed stuff that appeared 24/7, discarded processes that couldn't be responsible... and I likely found the culprit: that damned udev again!!

"udevd --daemon" performed IO a total of three times during the three hours: twice in the same minute as the disks started, and once at 03:40 (which just happens to be just before 03:45, when they often spin up, yet after the check at 03:40:00).

Why won't udev let my disks be?!

Edit: "Both" udevd IOs occurred while rsync was running... That's odd. I'm going to try moving the rsync jobs around in my crontab and see what happens. Since the array isn't mounted, and rsync doesn't touch /dev, I don't see how it can be responsible, though.  :Confused: 
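If it spins up again tonight, I might also leave udev's own event monitor running to see which events udevd is actually reacting to. Something like this sketch (the log path and the 10-second duration are placeholders - I'd use 21600 for a whole night, and it assumes coreutils timeout); udevadm monitor only prints relative timestamps, so the loop prefixes wall-clock time:

```shell
# Log kernel and udev events with wall-clock timestamps.
# 10s is a placeholder duration; bump it to e.g. 21600 for a night.
timeout 10 udevadm monitor --kernel --udev 2>&1 \
  | while read -r line; do
        echo "$(date '+%Y-%m-%d %H:%M:%S') $line"
    done > /tmp/udev_night.log
echo "done, log in /tmp/udev_night.log"
```

Correlating that log with the hdparm spin-up times should show whether udevd is reacting to an event or generating the IO on its own.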

----------

