# network unreachable mounting root via nfs

## Tzuriel

I've got a diskless node configuration and I think something is wrong with NFS at the point where the node starts to boot the kernel. I've been following the Gentoo diskless howto.

I believe error 101 below means "network is unreachable", but I don't see how that can be. My node gets a static IP over DHCP during PXE boot, fetches the bzImage via TFTP, and starts to boot the kernel, at which point the errors below appear. I suspect eth0 is somehow going down, which would leave the network unreachable.
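As a quick sanity check on that reading of the error code, here is a minimal Python sketch that maps an errno value to its symbolic name and message. It assumes the kernel's RPC code is reporting a standard Linux errno (101 is ENETUNREACH on x86 Linux; other platforms may number it differently), and the helper name `describe_errno` is made up for illustration.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    # errno.errorcode maps numeric values to names on the current platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 101 is the value from "portmap: RPC call returned error 101".
    # Assumption: this runs on Linux, where 101 is ENETUNREACH.
    print(describe_errno(101))
```

If the name comes back as ENETUNREACH, the kernel had no usable route at that moment, which fits an interface that was never configured.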

Does anyone have an idea what's going on here?

Node boot output ...

```
...
Using IPI Shortcut mode
IP-Config: No network devices available.
Looking up port of RPC 100003/2 on 192.168.2.0
portmap: RPC call returned error 101
Root-NFS: Unable to get mountd port number from server, using default
Looking up port of RPC 100005/1 on 192.168.2.10
portmap: RPC call returned error 101
Root-NFS: Unable to get mountd port number from server, using default
mount: RPC call returned error 101
Root-NFS: Server returned error 101 while mounting /diskless/192.168.2.101
VFS: Unable to mount root fs via NFS, trying floppy.
VFS: cannot open root device "nfs" or unknown-block(2,0)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(2,0)
```

My /diskless/pxelinux.cfg/default file

```
DEFAULT /bzImage
APPEND ip=dhcp root=/dev/nfs nfsroot=192.168.2.10:/diskless/192.168.2.101
```
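For reference, here is a small Python sketch that splits such an APPEND line into the pieces the kernel cares about. It assumes the `nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]` form described in the kernel's nfsroot documentation; the function name and the layout of the returned dictionary are illustrative only.

```python
def parse_nfsroot(cmdline):
    """Extract ip=, root= and the parts of nfsroot= from a kernel command line."""
    # Keep only key=value tokens; split on the first "=" so values may contain "=".
    params = dict(tok.split("=", 1) for tok in cmdline.split() if "=" in tok)
    # Optional NFS mount options follow the first comma.
    target, _, options = params.get("nfsroot", "").partition(",")
    # The server address is optional; without ":" the whole target is the path.
    server, sep, path = target.partition(":")
    if not sep:
        server, path = None, target
    return {
        "ip": params.get("ip"),
        "root": params.get("root"),
        "server": server,
        "path": path,
        "options": options.split(",") if options else [],
    }


if __name__ == "__main__":
    line = "ip=dhcp root=/dev/nfs nfsroot=192.168.2.10:/diskless/192.168.2.101"
    print(parse_nfsroot(line))
```

With `ip=dhcp`, the kernel itself has to bring up the interface before it can reach the server named in `nfsroot=`, so the NIC driver must already be available at that stage.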

My /etc/exports file

```
# /etc/exports: NFS file systems being exported.  See exports(5).

# for each node ...
/diskless/192.168.2.101 192.168.2.101(sync,rw,no_root_squash,no_all_squash)

# This is common to all nodes ...
/opt    192.168.2.0/24(sync,ro,no_root_squash,no_all_squash)
/usr    192.168.2.0/24(sync,ro,no_root_squash,no_all_squash)
/home   192.168.2.0/24(sync,rw,no_root_squash,no_all_squash)

# This is the shared log ...
/var/log        192.168.2.101(sync,rw,no_root_squash,no_all_squash)
```

My slave node's fstab

```
192.168.2.10:/diskless/192.168.2.101    /       nfs     sync,hard,intr,rw,nolock,rsize=8192,wsize=8192  0       0
192.168.2.10:/opt                       /opt    nfs     sync,hard,intr,ro,nolock,rsize=8192,wsize=8192  0       0
192.168.2.10:/usr                       /usr    nfs     sync,hard,intr,ro,nolock,rsize=8192,wsize=8192  0       0
192.168.2.10:/home                      /home   nfs     sync,hard,intr,rw,nolock,rsize=8192,wsize=8192  0       0
none                                    /proc   proc    defaults        0 0
192.168.2.10:/var/log                   /var/log nfs    hard,intr,rw    0 0
```

----------

## anonybosh

Did you get any log output on your server regarding the NFS mounting?

----------

## Tzuriel

Yes, in /var/log/messages on the master, I get the following output. I still haven't found what's going wrong here, other than I think the NIC is shutting down while booting.

```
Nov  1 13:09:07 master dhcpd: DHCPDISCOVER from 00:13:72:fb:0f:fc via eth0
Nov  1 13:09:07 master dhcpd: DHCPOFFER on 192.168.2.101 to 00:13:72:fb:0f:fc via eth0
Nov  1 13:09:11 master dhcpd: DHCPREQUEST for 192.168.2.101 (192.168.2.10) from 00:13:72:fb:0f:fc via eth0
Nov  1 13:09:11 master dhcpd: DHCPACK on 192.168.2.101 to 00:13:72:fb:0f:fc via eth0
Nov  1 13:09:11 master in.tftpd[10457]: RRQ from 192.168.2.101 filename pxelinux.0
Nov  1 13:09:11 master in.tftpd[10457]: tftp: client does not accept options
Nov  1 13:09:11 master in.tftpd[10458]: RRQ from 192.168.2.101 filename pxelinux.0
Nov  1 13:09:11 master in.tftpd[10459]: RRQ from 192.168.2.101 filename pxelinux.cfg/01-00-13-72-fb-0f-fc
Nov  1 13:09:11 master in.tftpd[10460]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A80265
Nov  1 13:09:11 master in.tftpd[10461]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A8026
Nov  1 13:09:11 master in.tftpd[10462]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A802
Nov  1 13:09:11 master in.tftpd[10463]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A80
Nov  1 13:09:11 master in.tftpd[10464]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A8
Nov  1 13:09:11 master in.tftpd[10465]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0A
Nov  1 13:09:11 master in.tftpd[10466]: RRQ from 192.168.2.101 filename pxelinux.cfg/C0
Nov  1 13:09:11 master in.tftpd[10467]: RRQ from 192.168.2.101 filename pxelinux.cfg/C
Nov  1 13:09:11 master in.tftpd[10468]: RRQ from 192.168.2.101 filename pxelinux.cfg/default
Nov  1 13:09:11 master in.tftpd[10469]: RRQ from 192.168.2.101 filename /bzImage
```
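The TFTP requests in this log follow PXELINUX's config file search: first a name built from the client's MAC address, then the client's IPv4 address in upper-case hexadecimal with one digit dropped at a time, and finally `default`. Here is a minimal Python sketch of that order, inferred from the log above. The function name is hypothetical, and newer PXELINUX versions may also try a UUID-based name first, which this sketch ignores.

```python
import ipaddress


def pxelinux_config_candidates(mac, ip):
    """Return the config file names requested in the log, in order."""
    # "01" is the ARP hardware type for Ethernet, followed by the MAC
    # in lower case with dashes instead of colons.
    mac_name = "01-" + mac.lower().replace(":", "-")
    # The IPv4 address as 8 upper-case hex digits, e.g. 192.168.2.101 -> C0A80265.
    hex_ip = f"{int(ipaddress.IPv4Address(ip)):08X}"
    # Progressively shorter prefixes: C0A80265, C0A8026, ..., C.
    by_ip = [hex_ip[:n] for n in range(8, 0, -1)]
    return ["pxelinux.cfg/" + name for name in [mac_name] + by_ip + ["default"]]


if __name__ == "__main__":
    for path in pxelinux_config_candidates("00:13:72:fb:0f:fc", "192.168.2.101"):
        print(path)
```

Everything in this log happens before the Linux kernel runs, so it only shows that the PXE firmware could use the NIC, not that the kernel has a driver for it.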

----------

## anonybosh

Ok, so your diskless system doesn't even seem to be contacting the master...

Your problem appears to lie right here:

```
IP-Config: No network devices available.
```

This seems to tell me that you don't have the necessary network driver compiled into your kernel!

One other thing that looks wrong is the:

```
VFS: cannot open root device "nfs" or unknown-block(2,0)
Please append a correct "root=" boot option
```

...which seems to be caused by your declaration of "root=/dev/nfs". Try removing that string altogether.

Also, have you tried mounting your NFS shares from another system to be certain all is well with them?

----------

## Tzuriel

 *liber8ate wrote:*   

> Ok, so your diskless system doesn't even seem to be contacting the master...
> 
> Your problem appears to lie right here:
> 
> ```
> ...

 

How can that be if I'm already on the network and grabbing the bzImage from the master? If I wasn't on the network talking to the master, then I wouldn't have been able to grab the kernel and load it. Or, maybe I'm not understanding something here.

 *liber8ate wrote:*   

> 
> 
> One other thing that looks wrong is the:
> 
> ```
> ...

 

No, I haven't been able to find another Linux box to try mounting the shares from.

OK, I'll try dropping root=/dev/nfs in the morning. Every tutorial I saw used it, including the Gentoo ones, and I've already tried many combinations of boot parameters.

----------

## wynn

 *liber8ate wrote:*   

> One other thing that looks wrong is the:
> 
> ```
> VFS: cannot open root device "nfs" or unknown-block(2,0) 
> 
> ...

The comment on name_to_dev_t in /usr/src/linux/init/do_mounts.c says:

```
/*
 *      Convert a name into device number.  We accept the following variants:
 *
 *      1) device number in hexadecimal represents itself
 *      2) /dev/nfs represents Root_NFS (0xff)
 *      3) /dev/<disk_name> represents the device number of disk
 *      4) /dev/<disk_name><decimal> represents the device number
 *         of partition - device number of disk plus the partition number
 *      5) /dev/<disk_name>p<decimal> - same as the above, that form is
 *         used when disk name of partitioned disk ends on a digit.
```

so root=/dev/nfs seems to be correct.

There is another section of code, which is compiled in because Tzuriel's kernel config contains CONFIG_ROOT_NFS=y:

```
#ifdef CONFIG_ROOT_NFS
        if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) {
                if (mount_nfs_root())
                        return;
                printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying floppy.\n");
                ROOT_DEV = Root_FD0;
        }
#endif
```

which shows why the error message says "unknown-block(2,0)": once the NFS mount fails, the kernel falls back to /dev/fd0, which is major 2, minor 0.
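To make the device numbers concrete, here is a rough Python sketch of the kernel's internal MKDEV/MAJOR/MINOR arithmetic. It assumes the 2.6-era layout from include/linux/kdev_t.h (20 minor bits) and the definitions Root_NFS = MKDEV(UNNAMED_MAJOR, 255) and Root_FD0 = MKDEV(FLOPPY_MAJOR, 0); the Python names are illustrative and not kernel API.

```python
MINORBITS = 20  # assumption: value used by 2.6 kernels in include/linux/kdev_t.h
MINORMASK = (1 << MINORBITS) - 1

UNNAMED_MAJOR = 0  # major used for devices without a real block device
FLOPPY_MAJOR = 2   # major number of the floppy driver


def mkdev(major, minor):
    """Pack major and minor numbers the way the kernel's MKDEV macro does."""
    return (major << MINORBITS) | minor


def major(dev):
    return dev >> MINORBITS


def minor(dev):
    return dev & MINORMASK


ROOT_NFS = mkdev(UNNAMED_MAJOR, 255)  # 0xff, what root=/dev/nfs resolves to
ROOT_FD0 = mkdev(FLOPPY_MAJOR, 0)     # fallback root device after NFS fails


def describe(dev):
    """Format a device number like the kernel's unknown-block(major,minor)."""
    return "unknown-block(%d,%d)" % (major(dev), minor(dev))


if __name__ == "__main__":
    print(hex(ROOT_NFS), major(ROOT_NFS) == UNNAMED_MAJOR)
    print(describe(ROOT_FD0))
```

Because Root_NFS has major 0, the `MAJOR(ROOT_DEV) == UNNAMED_MAJOR` test in the excerpt is true for root=/dev/nfs, so the NFS branch runs and the floppy device is only the fallback.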

Sorry I can't help with the main problem.

----------

## Tzuriel

 *liber8ate wrote:*   

> Ok, so your diskless system doesn't even seem to be contacting the master...
> 
> Your problem appears to lie right here:
> 
> ```
> ...

 

Great! OK, the driver was indeed the problem. I don't understand why, but I changed drivers and the boot got past that point.

Now the boot hangs on my diskless node at this point while booting the new bzImage I compiled with the new ethernet driver.

```
...
* Mounting devpts at /dev/pts ...              [ok]
* Remounting root filesystem read/write    [ok]
* Updating modules.dep ...                        [ok]
FATAL: could not open '/System.map': No such file or directory    [!!]
...
*Failed to set user font    [!!] (I'm ignoring these errors for now)
*Starting eth0
*   Bringing up eth0
*      192.168.2.10           [ok]
```

And then it just hangs. I'm also confused about why the IP shown is 192.168.2.10, which is my master. Shouldn't that be the IP of the node? Also, should there be a System.map file for a diskless node? Where should it go? The Gentoo diskless howto doesn't mention anything about this.

----------

## Tzuriel

Apologies, and thanks guys. I just had a leftover reference in /diskless/xxx/etc/conf.d/net that was wrong. My diskless node now boots, even though it reports a boatload of failures during startup. My biggest concern is the missing System.map file.

----------

## anonybosh

Thanks wynn for the clarification.

 *Quote:*   

> How can that be if I'm already on the network and grabbing the bzImage from the master? If I wasn't on the network talking to the master, then I wouldn't have been able to grab the kernel and load it.

 As far as I understand it, the process is:

1. Computer is turned on

2. BIOS runs its tests

3. BIOS looks for boot device (ethernet)

4. Following the PXE protocol, the BIOS uses the NIC to acquire the next booting instructions.

5. The BIOS is told to download a kernel image, and load that into memory.

--End direct BIOS control--

6. Kernel image is completely loaded (all drivers, etc).

7. Kernel looks for a root device (nfsroot)

8. The rest of the system is brought up via init scripts.

So basically, you are getting stuck at step #7 because the necessary driver is not loaded during step #6.
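One way to check for this situation is to confirm that everything the kernel needs before the root filesystem exists is built in (`=y`) rather than built as a module (`=m`), since modules live on the very filesystem that cannot be mounted yet. Here is a minimal Python sketch that scans kernel `.config` text. `CONFIG_IP_PNP`, `CONFIG_IP_PNP_DHCP`, `CONFIG_NFS_FS` and `CONFIG_ROOT_NFS` are real kernel options, while `CONFIG_MY_NIC_DRIVER` is a hypothetical placeholder because the thread never names the actual driver.

```python
REQUIRED_BUILTIN = [
    "CONFIG_IP_PNP",         # kernel-level IP autoconfiguration (ip=...)
    "CONFIG_IP_PNP_DHCP",    # DHCP support for ip=dhcp
    "CONFIG_NFS_FS",         # NFS client
    "CONFIG_ROOT_NFS",       # root filesystem on NFS
    "CONFIG_MY_NIC_DRIVER",  # hypothetical: replace with your NIC's option
]


def parse_kconfig(text):
    """Parse kernel .config text into a {symbol: value} dictionary."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and comments, including "# CONFIG_X is not set".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        options[key] = value
    return options


def not_builtin(text, symbols=REQUIRED_BUILTIN):
    """Return the symbols that are missing or not set to 'y'."""
    options = parse_kconfig(text)
    return [s for s in symbols if options.get(s) != "y"]


if __name__ == "__main__":
    sample = """
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_NFS_FS=y
CONFIG_ROOT_NFS=y
CONFIG_MY_NIC_DRIVER=m
"""
    print(not_builtin(sample))
```

In the sample, the NIC driver is reported because a module cannot be loaded before the NFS root is mounted, which would give the "No network devices available" symptom from step 7.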

---

Just saw your post while editing this: *Quote:*   

> *Starting eth0 
> 
> *   Bringing up eth0 
> 
> *      192.168.2.10           [ok]

 You do not want init to start this! Reconfiguring eth0 while the root filesystem is mounted over it screws the boot process up entirely. You will need to remove net.eth0 from the 'boot'/'default' runlevel(s):

```
# rc-update del net.eth0
```

There was one other thing I had to edit. I think it was in the conf.d/rc file, but I cannot remember for sure. I'll do some more looking and get back to you.

Edit:

Ok I found it; it was in the conf.d/rc file:

```
RC_PLUG_SERVICES="!net.eth0"
```

Hope this helps.

----------

## PeterF

I started having a similar problem after upgrading a MythTV frontend diskless system from kernel 2.6.13 to 2.6.17.  Use of udev was part of the change.  liber8ate's suggestion to update the conf.d/rc file did the trick!  System booting.  Thanks!  I hadn't considered events being triggered outside the "rc-update" mechanism.

Later, I also found a suggestion at http://gentoo-wiki.com/HOWTO_Gentoo_Diskless_Install to change the conf.d/net file to something like this:

```
config_eth0=( "noop" "192.168.1.2 netmask 255.255.255.248" )
```

With either mechanism, the DNS update that used to occur is no longer triggered. I'll create static entries if I cannot find another way.

Thanks again,

- Pete

----------

