# Udev can't see root disk - Server won't boot

## jonathanross

Hello all,

Here's a strange one. The system is SPARC64 running 2.6.30-gentoo-r5 with all the latest packages.

```
 * Mounting /dev ...                                                      [ ok ]
/etc/init.d/udev: line 40: /proc/sys/kernel/hotplug: Read-only file system
 * Starting udevd .../sbin/udevd already running.
                                                     [ !! ]
start-stop-daemon: warning: failed to kill 406: No such process
1 pids were not killed
No /sbin/udevd found running; none killed.
 * Checking root filesystem ...fsck.ext2: No such file or directory while trying to open /dev/hda1
/dev/hda1:
The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
```

Any ideas might save my sanity!

Thanks,

JR   :Embarassed:

Last edited by jonathanross on Thu Jan 07, 2010 8:20 am; edited 2 times in total

----------

## tkhemili78

Is this a new install?

----------

## jonathanross

Hi tkhemili78,

Thanks for looking. 

No, it's been running since April last year.

JR   :Sad: 

----------

## jonathanross

It's not been rebooted for two or three months but I've not seen any problems at all and it's used constantly.

JR   :Question: 

----------

## VoidMage

Post output of 'rc-update show' and 'emerge --info'.

----------

## jonathanross

Thanks for any help, VoidMage.

I can't get to that drive via a chroot for a while, because I'm using the hardware to rebuild 148 packages so the server can get back online ASAP (and it's ancient hardware). Here's the output from an almost identical box instead. The dead box is in a Data Centre, and although I can't chroot into it, I can try anything you suggest from a read-only /dev/hda1 prompt, using the init switch on the kernel line.

```
rc-update show
            bootmisc | boot
             checkfs | boot
           checkroot | boot
               clock | boot
         consolefont | boot
       fix_resolvers |      default
            hostname | boot
             keymaps | boot
               local |      default nonetwork
          localmount | boot
             modules | boot
               mysql |      default
            net.eth0 |      default
              net.lo | boot
          portsentry |      default
       reject_routes |      default
           rmnologin | boot
                sshd |      default
              svscan |      default
           syslog-ng |      default
             urandom | boot
          vixie-cron |      default
```

```
emerge --info
Portage 2.1.6.13 (default/linux/sparc/10.0, gcc-4.1.2, glibc-2.9_p20081201-r2, 2.6.30-gentoo-r5 sparc64)
=================================================================
System uname: Linux-2.6.30-gentoo-r5-sparc64-sun4u-with-gentoo-1.12.13
Timestamp of tree: Wed, 06 Jan 2010 05:15:02 +0000
ccache version 2.4 [enabled]
app-shells/bash:     4.0_p35
dev-lang/python:     2.5.4-r3, 2.6.4
dev-python/pycrypto: 2.0.1-r8
dev-util/ccache:     2.4-r7
sys-apps/baselayout: 1.12.13
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.13, 2.63-r1
sys-devel/automake:  1.9.6-r2, 1.10.2
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6b
virtual/os-headers:  2.6.27-r2
ACCEPT_KEYWORDS="sparc"
CBUILD="sparc-unknown-linux-gnu"
CFLAGS="-O2 -mcpu=ultrasparc -pipe"
CHOST="sparc-unknown-linux-gnu"
CONFIG_PROTECT="/etc /var/qmail/alias /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-O2 -mcpu=ultrasparc -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="ccache distlocks fixpackages parallel-fetch protect-owned sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://mirror.bytemark.co.uk/gentoo/ http://mirror.qubenet.net/mirror/gentoo/ http://mirror.ovh.net/gentoo-distfiles/"
LDFLAGS="-Wl,-O1"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="acl alsa apache2 berkdb bgpclassless bzip2 cdr cli cracklib crypt cups cxx dri dvd fam fix-connected-rt fortran gcc64 gdbm gpm highvolume iconv javascript justify kde logrotate mailbox modules mudflap mysql ncurses nls nptl nptlonly openmp pam pcre pppd python qmail-spp qt3 qt4 readline reflection rrdcgi session sparc spl ssl symlink sysfs tcpd tcpmd5 unicode xinetd xml xorg" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias php5" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="fbdev glint mach64 mga r128 radeon sunbw2 suncg14 suncg3    suncg6 sunffb sunleo tdfx voodoo"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LANG, LC_ALL, LINGUAS, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
```

My poor brain hurts. All I can think of is a udev bug but I can't see anything obvious.   :Embarassed: 

JR

----------

## jonathanross

For those interested ... I'm an honest type and promise £50 worth (in your currency) of Amazon vouchers to whoever helps find the correct solution.

JR

----------

## VoidMage

What exactly got updated on the udev/kernel side since the last boot?

Any chance that you've switched to a full libata config, that is, even your IDE discs are now sd*, not hd*?

If so, are the device drivers (SCSI disc/cdrom) built in?

That's CONFIG_ATA, CONFIG_BLK_DEV_SD, CONFIG_BLK_DEV_SR.
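For anyone wanting to check those three options quickly, here is a minimal sketch. It assumes the kernel config is available as a plain file such as /usr/src/linux/.config (the path is an assumption; adjust it to your tree). `kconfig_state` and `check_libata_options` are hypothetical helper names, not part of any Gentoo tool.

```shell
#!/bin/sh
# Report whether a kernel option is built in, a module, or unset.
# Usage: kconfig_state OPTION CONFIG_FILE
kconfig_state() {
    opt=$1
    cfg=$2
    # A set option appears as "CONFIG_FOO=y" or "CONFIG_FOO=m"; an unset one
    # is either absent or a "# CONFIG_FOO is not set" comment.
    val=$(sed -n "s/^${opt}=\([ym]\)\$/\1/p" "$cfg" | head -n 1)
    case $val in
        y) echo "builtin" ;;
        m) echo "module" ;;
        *) echo "unset" ;;
    esac
}

# Print the state of the three options named above.
# The default path is an assumption, not something from this thread.
check_libata_options() {
    cfg=${1:-/usr/src/linux/.config}
    for opt in CONFIG_ATA CONFIG_BLK_DEV_SD CONFIG_BLK_DEV_SR; do
        printf '%s: %s\n' "$opt" "$(kconfig_state "$opt" "$cfg")"
    done
}
```

If the root disk is meant to show up as sd*, a result of "module" or "unset" for any of these would be worth looking at, since a root filesystem driver loaded as a module needs an initramfs to be available at mount time.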

On a related note: do you have a full log of that boot, not just the last few lines?

(perhaps dmesg will be enough)

Last edited by VoidMage on Wed Jan 06, 2010 8:14 pm; edited 1 time in total

----------

## jonathanross

Unfortunately, I've already tried /dev/sda in fstab. Any other ideas? I'll think along those lines too.

JR

----------

## VoidMage

That's fstab; what about the kernel boot line?

----------

## jonathanross

Unfortunately not:

```
kernel-2.6.30-r5 root=/dev/sda1
```

It froze here:

```
Press Stop-A (L1-A) to return to the boot prom
```

JR   :Sad: 

----------

## NeddySeagoon

jonathanross,

I expect you are using the new libata drivers that make PATA appear as SCSI devices, with SCSI names.

Check your /etc/fstab.

The root fs exists and is mounted. I expect silo.conf mentions root=/dev/sd....
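A rough way to compare the two files is sketched below. It only looks at plain `root=` lines in a silo.conf-style file (a `root=` passed through an `append=` string is not handled), and the helper names are hypothetical. The usual locations /etc/fstab and /etc/silo.conf are assumed when calling it.

```shell
#!/bin/sh
# Print the device that fstab mounts on "/" (first non-comment match).
fstab_root_device() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "$1"
}

# Print every root= value found in a silo.conf-style file, one per line.
silo_root_devices() {
    sed -n 's/^[[:space:]]*root[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1"
}

# Succeed only if silo.conf has at least one root= entry and every entry
# matches the fstab root device.
# Usage: roots_agree FSTAB SILO_CONF
roots_agree() {
    want=$(fstab_root_device "$1")
    [ -n "$want" ] || return 1
    found=0
    for dev in $(silo_root_devices "$2"); do
        found=1
        [ "$dev" = "$want" ] || return 1
    done
    [ "$found" -eq 1 ]
}
```

For example, `roots_agree /etc/fstab /etc/silo.conf || echo "root device mismatch"` would flag a box where one file says hd* and the other sd*.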

----------

## jonathanross

Aha!

Any pointers on how to circumvent this error? (This was on the 'similar' box, and they both run that udev version now.)

```
[4231488.313969] udev: starting version 146
[4231488.313986] udev: missing sysfs features; please update the kernel or disable the kernel's CONFIG_SYSFS_DEPRECATED option; udev may fail to work correctly
```

Does it need a new kernel build? I hope not, as it means a trip to the Data Centre.

JR

----------

## jonathanross

Hi Neddy,

I changed the drive in silo.conf but just got this   :Sad: 

```
[   54.636500] VFS: Cannot open root device "sda1" or unknown-block(0,0)
[   54.721183] Please append a correct "root=" boot option; here are the available partitions:
[   54.831100] 0300        19551168 hda driver: ide-gd
[   54.895311]   0301          524160 hda1
[   54.945699]   0302          524160 hda2
[   54.996096]   0303        19551168 hda3
[   55.046570]   0304         5243112 hda4
[   55.096964]   0305         5243112 hda5
[   55.147360]   0306         8016624 hda6
[   55.197756] 1600        39082680 hdc driver: ide-gd
[   55.261891]   1601          499968 hdc1
[   55.312259]   1602          499968 hdc2
[   55.362653]   1603        19550160 hdc3
[   55.413049]   1604         7906248 hdc4
[   55.463444]   1605         4000248 hdc5
[   55.513838]   1606         6643728 hdc6
[   55.564363] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[   55.673066] Call Trace:
[   55.705172]  [0000000000862dac] mount_block_root+0x294/0x2b0
[   55.779564]  [0000000000862fb0] prepare_namespace+0x180/0x1b4
[   55.855107]  [0000000000862304] kernel_init+0x1b4/0x1cc
[   55.923789]  [000000000042b01c] kernel_thread+0x30/0x48
[   55.992475]  [00000000007025c0] rest_init+0x10/0x74
[   56.056575] Press Stop-A (L1-A) to return to the boot prom
```

----------

## Exil

Downgrade udev; it should be painless.

----------

## jonathanross

Thanks, Exil.

Unfortunately, there's no immediate connectivity (the box drops to the 'none' prompt in the Data Centre, reached over a terminal server, and I can't run emerge), so it still means a trip to the Data Centre with another drive to boot from and chroot in.

Any ideas on how to get round that? I can mount the distfiles partition etc.

I'll try it on the box I have in the office (it's identical hardware and I have an exact disk copy of the failing build too), but having started five hours ago, it's still only building package 40 of 148!

JR

----------

## NeddySeagoon

jonathanross,

You are still using the old drivers - the kernel could not find the root partition you named, so it listed everything it can see.

/dev/hda is clearly listed.

You may as well update the kernel to fix your udev issue.

Why do you have to go to the data centre for a kernel update?

Only if you mess it up and the box doesn't come back up.

You could move to the libata drivers too if you wanted.
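The partition list in the panic output can also be checked mechanically. The sketch below reads kernel messages on stdin and pulls out the device names from the "available partitions" list, so a `root=` value can be compared against what the kernel actually registered. The function names are hypothetical, and the parsing assumes the line layout seen in the log posted above.

```shell
#!/bin/sh
# Read kernel boot messages on stdin and print the block device names from
# the "available partitions" list (lines of the form: MMmm  blocks  name).
known_block_devices() {
    # Strip the leading "[  timestamp]" first, then keep lines whose first
    # field is a 4-digit hex major/minor and whose second is a block count.
    sed 's/^\[[^]]*\][[:space:]]*//' |
        awk '$1 ~ /^[0-9a-f][0-9a-f][0-9a-f][0-9a-f]$/ && $2 ~ /^[0-9]+$/ { print $3 }'
}

# Succeed if the given root device (e.g. /dev/hda1) is in that list.
# Usage: root_is_known /dev/hda1 < boot.log
root_is_known() {
    name=${1#/dev/}
    known_block_devices | grep -qxF "$name"
}
```

Fed the panic log above, this would report hda1 as known and sda1 as unknown, which matches the point that the box is still on the old IDE drivers.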

----------

## jonathanross

Hi Neddy,

Thanks again.

Please see my reply to Exil ... how do I get past the 'none' prompt to get a system I can boot into?

There are no CD drives in these machines. There is a second disk that used to boot; I'm trying it again now, but it failed to boot twice earlier.

JR

----------

## VoidMage

Just a little note: most kernel devs discourage CONFIG_IDE now.

It will be officially marked as deprecated in 2.6.33 (I think git already has that note).

Also, if you've got both CONFIG_IDE and CONFIG_ATA drivers in kernel, you often run into strange conflicts.

----------

## jonathanross

Thanks for the pointers again, all. Please see the output after a chroot and udev downgrade. I'm going to put the latest kernel on now.

```
 * Mounting /dev ...                                                      [ ok ]
/etc/init.d/udev: /lib/udev/write_root_link_rule: /bin/sh: bad interpreter: No such file or directory
/etc/init.d/udev: line 40: /proc/sys/kernel/hotplug: Read-only file system
 * Starting udevd .../sbin/udevd already running.
                                                     [ !! ]
start-stop-daemon: warning: failed to kill 428: No such process
1 pids were not killed
No /sbin/udevd found running; none killed.
 * Checking root filesystem ...fsck.ext2: No such file or directory while trying to open /dev/hda1
/dev/hda1:
The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
```

----------

## jonathanross

Thanks for everyone's help, VoidMage and Neddy in particular.

I'm afraid to say that I'm going to ditch those two SPARC servers and have (in less than two hours) rolled out a Postfix box with SMTP-AUTH and TLS, and IMAPS via Courier, on Ubuntu. I've been using Gentoo for five years but this time it was too much work   :Sad: 

There are still a few x86 Gentoo boxes running so we'll see how they go.

JR

----------

## cach0rr0

define "ditch" 

inquisitive minds would like to know. You replacing gentoo with ubuntu on the same hardware, or what's going to happen to those servers? 

I mean this doesn't seem at all insurmountable, just a bit of a slow process going through it all on the forums. 

Have you hopped on any of the IRC chans yet? Realtime communication would cover this many posts in <5 minutes.

----------

## jonathanross

Thanks for asking.

The path of least resistance has ended up being rebuilding (with a bare Ubuntu Server install) and learning Postfix from scratch (Ubuntu doesn't bundle netqmail).

Point taken about IRC. These old SPARC boxes are incredibly slow at compiling and despite using them forever with Gentoo, it's not worth the eyestrain   :Smile: 

x86 Gentoo is still in use, as are two SPARC boxes!

JR

----------

## cach0rr0

ack. I despise defeat

I guess my confusion, and why I've not jumped in myself, is trying to figure out your partition arrangement, and whether hda1 is your proper gentoo root partition, or some special rescue that allows you to boot the system read-only, or if we're diagnosing why / seems to be mounted read only (or if that's intentional?)

Dunno. And i lost my train of thought, some dude just tackled a kangaroo on tv

----------

## jonathanross

He he   :Very Happy: 

I'm the tenacious one but after what was twenty hours of work, enough is enough !

JR

----------

## NeddySeagoon

cach0rr0,

On a SPARC box, hda1 is the root and contains /boot. There are all sorts of warnings about horrible things happening if you do something different.

My U10 has a really dire IDE controller (10Mb/sec on a good day) but it won't boot from anything else. I have an old Gentoo install there, just for booting but root is on a SATA drive attached to a PCI card controller. The SATA drive has a complete gentoo install too.

Updating the kernel is a nightmare, unless I get it right first time.

----------

## cen0bite

I had the same problem today with my server after updating my kernel. It turned out to be a problem with my /sys directory already being populated with files from sysfs. sysfs failed to mount on startup, because it thought it was already mounted. I guess that happened when I moved the system to the current machine and accidentally copied /sys to the hard drive. That wasn't a problem as long as the kernel version did not change.

I booted into a rescue system, moved the current /sys aside and created a new and empty /sys folder. On reboot the error was gone and the server started up just fine.
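The fix described above could be scripted roughly as follows. This is a sketch under assumptions: it is meant to be run from the rescue system against the offline root filesystem mounted somewhere like /mnt/gentoo (a hypothetical mount path), never against a live /sys. `reset_sys_mountpoint` and the `sys.stale` backup name are made up for illustration.

```shell
#!/bin/sh
# Move a populated sys/ directory on an offline root filesystem aside and
# leave an empty mount point in its place.
# Usage: reset_sys_mountpoint /mnt/gentoo
reset_sys_mountpoint() {
    root=$1
    sysdir=$root/sys
    stale=$root/sys.stale
    # No sys/ at all: just create the empty mount point.
    if [ ! -d "$sysdir" ]; then
        mkdir "$sysdir"
        return
    fi
    # An empty directory is already a valid mount point.
    [ -n "$(ls -A "$sysdir")" ] || return 0
    # Refuse to clobber an earlier backup.
    [ ! -e "$stale" ] || return 1
    mv "$sysdir" "$stale" && mkdir "$sysdir"
}
```

Keeping the old contents in `sys.stale` rather than deleting them mirrors the "moved aside" step, so nothing is lost if the directory turns out to hold something unexpected.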

----------

## jonathanross

Thanks for the update, cen0bite.

I'll have a think about that.

JR

----------

