# [SOLVED] boot disk problem "not a valid root device"

## ferreirafm

Hi List,

I have been having intermittent boot disk problems. Most of the time a reboot solves it, but I don't want to count on that every time.

I halt or reboot my box very rarely, but sometimes the message below appears on the screen:

```
...

Determining the root device

Block device /dev/sda1 is not a valid root device...

Could not find the root block device in .

...

```

It seems this problem has been filed as a bug before: https://bugs.gentoo.org/224895. But there is no clear solution there.

Any help is appreciated.

Here is some information:

```
mephisto ~ # emerge --info

Portage 2.1.8.3 (default/linux/amd64/10.0/desktop/kde, gcc-4.4.3, glibc-2.11.2-r0, 2.6.34-gentoo-r1 x86_64)

=================================================================

System uname: Linux-2.6.34-gentoo-r1-x86_64-Intel-R-_Core-TM-_i7_CPU_930_@_2.80GHz-with-gentoo-1.12.13

Timestamp of tree: Tue, 28 Sep 2010 18:15:02 +0000

app-shells/bash:     4.0_p37

dev-java/java-config: 2.1.11

dev-lang/python:     2.5.4-r4, 2.6.5-r3, 3.1.2-r4

dev-util/cmake:      2.8.1-r2

sys-apps/baselayout: 1.12.13

sys-apps/sandbox:    2.3-r1

sys-devel/autoconf:  2.65-r1

sys-devel/automake:  1.8.5-r4, 1.9.6-r3, 1.10.3, 1.11.1

sys-devel/binutils:  2.20.1-r1

sys-devel/gcc:       4.4.3-r2

sys-devel/gcc-config: 1.4.1

sys-devel/libtool:   2.2.10

sys-devel/make:      3.81-r2

virtual/os-headers:  2.6.30-r1

ACCEPT_KEYWORDS="amd64"

ACCEPT_LICENSE="*"

CBUILD="x86_64-pc-linux-gnu"

CFLAGS="-march=native"

CHOST="x86_64-pc-linux-gnu"

CONFIG_PROTECT="/etc /usr/share/X11/xkb /usr/share/config /var/lib/hsqldb"

CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"

CXXFLAGS="-march=native"

DISTDIR="/usr/portage/distfiles"

FEATURES="assume-digests collision-protect distlocks fixpackages news parallel-fetch protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch"

GENTOO_MIRRORS="http://distfiles.gentoo.org"

LANG="pt_BR.UTF-8"

LDFLAGS="-Wl,--as-needed"

LINGUAS="pt_BR en"

PKGDIR="/usr/portage/packages"

PORTAGE_CONFIGROOT="/"

PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"

PORTAGE_TMPDIR="/var/tmp"

PORTDIR="/usr/portage"

SYNC="rsync://rsync.br.gentoo.org/gentoo-portage"

USE="X a52 aac acl acpi alsa amd64 bash-completion berkdb bluetooth branding bzip2 cairo cdr cli consolekit cracklib crypt cups cxx dbus dri dts dvd dvdr dvdread emacs embedded emboss encode exif extras fam ffmpeg firefox flac fortran ftp gcj gdbm ggc gif gimp gpm gtk gzip hal iconv imagemagick ipv6 java jpeg kde kpathsea latex lcms ldap libnotify mad midi mikmod mmx mng modules motif mp3 mp4 mpeg mudflap multilib ncurses nls nptl nptlonly nsplugin nss nvidia ogg opengl openmp pam pango pcre pdf perl png ppds pppd python qt3support qt4 readline reflection reiserfs samba sdl semantic-desktop session spell sse sse2 ssl startup-notification svg sysfs tar tcpd tetex threads tiff tk truetype type1 unicode usb vorbis x264 xcb xinerama xml xorg xulrunner xv xvid zip zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="pt_BR en" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"

Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
```

```
mephisto ~ # cat /proc/cpuinfo 

processor       : 0

vendor_id       : GenuineIntel

cpu family      : 6

model           : 26

model name      : Intel(R) Core(TM) i7 CPU         930  @ 2.80GHz

stepping        : 5

cpu MHz         : 2799.990

cache size      : 8192 KB

physical id     : 0

siblings        : 8

core id         : 0

cpu cores       : 4

apicid          : 0

initial apicid  : 0

fpu             : yes

fpu_exception   : yes

cpuid level     : 11

wp              : yes

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid

bogomips        : 5599.98

clflush size    : 64

cache_alignment : 64

address sizes   : 36 bits physical, 48 bits virtual

power management:

...
```

The grub.conf entry is as follows:

```
# For booting GNU/Linux

title Gentoo-Linux-2.6.34

root (hd0,0)

kernel /boot/kernel-genkernel-x86_64-2.6.34-gentoo-r1 root=/dev/ram0 init=/linuxrc ramdisk=8192 real_root=/dev/sda1 video=vesafb:mtrr:3,ywrap vga=792

initrd /boot/initramfs-genkernel-x86_64-2.6.34-gentoo-r1
```

Last edited by ferreirafm on Thu Sep 30, 2010 2:48 pm; edited 1 time in total

----------

## NeddySeagoon

ferreirafm,

Intermittent usually means hardware.

The drive could be very slow to spin up, so sometimes it's not ready when the kernel goes to mount real_root.

It could be generating lots of errors, causing the disk controller to reset, with the same effect as the above.

Do you see any sda-related errors in dmesg?

Install smartmontools and look at the drive's internal error log.
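
The two checks suggested above can be sketched roughly as follows. This is only a minimal sketch: the function names `filter_disk_errors` and `smart_report` and the keyword list are illustrative assumptions, not part of any tool. The `smartctl` options used (`-H`, `-l error`, `-l selftest`) are standard smartmontools options and need root.

```shell
#!/bin/sh
# Print kernel log lines that mention the given disk (or an ATA link)
# together with a typical failure keyword. Reads the log from stdin.
# The keyword list is an assumption; extend it as needed.
filter_disk_errors() {
    disk="$1"
    grep -E "(${disk}|ata[0-9]+)" \
        | grep -iE '(error|fail|reset|timeout|exception)' || true
}

# Query the drive's own SMART logs (needs root and smartmontools).
smart_report() {
    dev="$1"
    smartctl -H "$dev"            # overall health verdict
    smartctl -l error "$dev"      # the drive's internal error log
    smartctl -l selftest "$dev"   # results of earlier self-tests
}

# Typical use:
#   dmesg | filter_disk_errors sda
#   smart_report /dev/sda
```

An empty result from the filter only means nothing matched those keywords; it does not prove the drive or controller is healthy.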

----------

## ferreirafm

Hi NeddySeagoon,

Thanks for helping. I can't see anything wrong in dmesg.

```
mephisto init.d # dmesg | grep -A 2 sd

sd 3:0:1:0: [sda] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)

sd 3:0:1:0: [sda] Write Protect is off

sd 3:0:1:0: [sda] Mode Sense: 00 3a 00 00

sd 3:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

 sda: sda1 sda2 sda3

sd 3:0:1:0: [sda] Attached SCSI disk

sr 2:0:1:0: Attached scsi generic sg0 type 5

sd 3:0:1:0: Attached scsi generic sg1 type 0

scsi: <fdomain> Detection failed (no card)

GDT-HA: Storage RAID Controller Driver. Version: 3.05

--

REISERFS (device sda1): found reiserfs format "3.6" with standard journal

REISERFS (device sda1): using ordered data mode

REISERFS (device sda1): journal params: device sda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30

REISERFS (device sda1): checking transaction log (sda1)

REISERFS (device sda1): Using r5 hash to sort names

udev: starting version 151

input: Sleep Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input2

--

REISERFS (device sda3): found reiserfs format "3.6" with standard journal

REISERFS (device sda3): using ordered data mode

REISERFS (device sda3): journal params: device sda3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30

REISERFS (device sda3): checking transaction log (sda3)

REISERFS (device sda3): Using r5 hash to sort names

Adding 8393956k swap on /dev/sda2.  Priority:-1 extents:1 across:8393956k 

e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX

0000:00:19.0: eth0: 10/100 speed: disabling TSO
```

As for smartmontools, a long self-test of /dev/sda is currently under way.

It's intriguing, because this problem has been reported in several posts but, as I said, none of them offers a clear solution.

Here is all the SMART information.

```
mephisto / # smartctl --all /dev/sda

smartctl version 5.38 [x86_64-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===

Device Model:     ST31500541AS

Serial Number:    6XW1MMLF

Firmware Version: CC34

User Capacity:    1,500,301,910,016 bytes

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:   8

ATA Standard is:  ATA-8-ACS revision 4

Local Time is:    Wed Sep 29 16:45:06 2010 BRT

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

General SMART Values:

Offline data collection status:  (0x00) Offline data collection activity

                                        was never started.

                                        Auto Offline Data Collection: Disabled.

Self-test execution status:      ( 249) Self-test routine in progress...

                                        90% of test remaining.

Total time to complete Offline 

data collection:                 ( 643) seconds.

Offline data collection

capabilities:                    (0x73) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        No Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine 

recommended polling time:        (   1) minutes.

Extended self-test routine

recommended polling time:        ( 255) minutes.

Conveyance self-test routine

recommended polling time:        (   2) minutes.

SCT capabilities:              (0x103f) SCT Status supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate     0x000f   114   100   006    Pre-fail  Always       -       59228893

  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0

  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       43

  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0

  7 Seek_Error_Rate         0x000f   063   060   030    Pre-fail  Always       -       2288666

  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       566

 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0

 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       43

183 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always       -       0

187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0

188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0

190 Airflow_Temperature_Cel 0x0022   069   067   045    Old_age   Always       -       31 (Lifetime Min/Max 25/31)

194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 18 0 0)

195 Hardware_ECC_Recovered  0x001a   055   047   000    Old_age   Always       -       59228893

197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0

198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0

199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       162512972546701

241 Unknown_Attribute       0x0000   100   253   000    Old_age   Offline      -

242 Unknown_Attribute       0x0000   100   253   000    Old_age   Offline      -

SMART Error Log Version: 1

No Errors Logged

SMART Self-test log structure revision number 1

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA

# 1  Extended offline    Self-test routine in progress 90%       566         -

SMART Selective self-test log data structure revision number 1

 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.
```

Regards,

ferreirafm
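
A quick way to scan an attribute table like the one above is sketched below. It is a rough sketch under assumptions: the function name `check_smart_attrs` is made up for illustration, and the choice of attribute IDs to watch (5, 187, 197, 198, 199) is a common rule of thumb rather than anything stated in this thread. It reads `smartctl -A` or `smartctl --all` output on stdin and flags attributes whose normalized value has dropped to the threshold, or whose raw count is non-zero for the watched IDs.

```shell
#!/bin/sh
# Scan a SMART attribute table on stdin and print anything suspicious.
# Attribute rows start with a numeric ID and have at least 10 fields:
# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
check_smart_attrs() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            id = $1; name = $2
            value = $4 + 0; thresh = $6 + 0; raw = $10 + 0
            if (thresh > 0 && value <= thresh)
                printf "FAILING %s (value %d <= threshold %d)\n", name, value, thresh
            else if ((id == 5 || id == 187 || id == 197 || id == 198 || id == 199) && raw > 0)
                printf "WARN %s raw=%d\n", name, raw
        }
    '
}

# Typical use (as root):
#   smartctl -A /dev/sda | check_smart_attrs
```

Rows without a raw value, such as attributes 241 and 242 in the listing above, are skipped because they have fewer than 10 fields.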

----------

## ferreirafm

Hi NeddySeagoon,

There is another discussion on this topic. It concerns the Live CD kernel, though, and I don't know whether it applies to my problem.

Any help is appreciated.

Regards,

ferreirafm

----------

## NeddySeagoon

ferreirafm,

Your drive looks sound and that bug isn't relevant.

If you want to investigate the kernel further, roll your own kernel, so that you do not need an initrd.

If configuring the kernel is new to you, see Pappy's Seeds.

That gives you a lean mean kernel configuration that you only add your hardware to.

If you do use Pappy's Seeds, post kernel questions in that thread.
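
For a kernel to boot without an initrd, the disk controller, SCSI disk and root filesystem drivers must be built in (`=y`), not modules. A minimal sketch of how to check a `.config` for that follows; the helper name `check_builtin` is hypothetical, and `CONFIG_SATA_AHCI` in the usage comment is an assumption about the controller, which the thread never names. `CONFIG_REISERFS_FS` is listed because the dmesg output shows sda1 is ReiserFS.

```shell
#!/bin/sh
# Report which of the listed kernel options are built in (=y).
# $1 is a kernel .config file; the remaining arguments are option names.
# Returns 1 if any option is missing or only built as a module.
check_builtin() {
    config="$1"; shift
    missing=0
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$config"; then
            echo "ok      $opt"
        else
            echo "MISSING $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use (controller driver is an assumption, adjust to your hardware):
#   check_builtin /usr/src/linux/.config \
#       CONFIG_ATA CONFIG_SATA_AHCI CONFIG_BLK_DEV_SD CONFIG_REISERFS_FS
```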

----------

## ferreirafm

Hi NeddySeagoon,

Here is a great video by cach0rr0 on working with kernel seeds.

I have compiled the SATA drivers into the kernel. So far, at least, everything seems to be OK.

Thanks for helping.

All the Best,

ferreirafm
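
For reference, a grub.conf entry for a self-built kernel with no initramfs might look roughly like the sketch below. The kernel file name is hypothetical, and the thread does not say whether the initrd line was actually dropped. The root device and video options are taken from the original entry; `root=` now points straight at the real root partition instead of `/dev/ram0`.

```
# Hypothetical entry for a self-built kernel with no initramfs
title Gentoo-Linux-2.6.34 (no initrd)
root (hd0,0)
# The kernel image name is an assumption; use the file actually copied to /boot
kernel /boot/kernel-2.6.34-gentoo-r1 root=/dev/sda1 video=vesafb:mtrr:3,ywrap vga=792
```

Keeping the old genkernel entry in grub.conf as a fallback is a cheap safeguard while testing the new kernel.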

----------

