# NFSv4 mount of root fs of diskless node

## srd

Moving this discussion to networking since I no longer think this is a kernel issue. The problem is that I'm unable to get a diskless node to mount the root fs via NFSv4. I'm pretty certain the problem is just a bad NFSv4 config so I was hoping I could get some more eyes on my config lines. 

Created these virtual directories on the master node:

```
mkdir /exports
mkdir /exports/gentoo-x86_64
mkdir /exports/home
mkdir /exports/opt
mkdir /exports/root
```

Here is /etc/exports on the master:

```
/exports            10.0.0.0/16(ro,fsid=0,sync,no_subtree_check,no_root_squash,no_all_squash)
/exports/gentoo-x86_64  10.0.0.0/16(ro,nohide,no_subtree_check,no_root_squash,no_all_squash)
/exports/opt        10.0.0.0/16(rw,nohide,insecure,no_subtree_check,no_root_squash,no_all_squash)
/exports/home       10.0.0.0/16(rw,nohide,insecure,no_subtree_check,no_root_squash,no_all_squash)
/exports/root       10.0.0.0/16(rw,nohide,insecure,no_subtree_check,no_root_squash,no_all_squash)
```

Added to /etc/fstab on master:

```
/home           /exports/home   none        bind        0 0
/opt            /exports/opt    none        bind        0 0
/diskless/gentoo-x86_64 /exports/gentoo-x86_64 none bind        0 0
/root           /exports/root   none        bind        0 0
```

Edited the master /etc/conf.d/nfs as follows:

```

OPTS_RPC_NFSD="100 -V 4.2 -V 4.1 -V 4 -N 3 -N 2"

```

Here is /etc/fstab on the diskless fs:

```
10.0.0.11:/gentoo-x86_64 /      nfs    ro,_netdev,auto         0 0
10.0.0.11:/opt          /opt    nfs    rw,_netdev,auto         0 0
10.0.0.11:/home         /home   nfs    rw,_netdev,auto         0 0
10.0.0.11:/root         /root   nfs    rw,_netdev,auto         0 0
```

I had nfs4 as the filesystem type in the client's fstab, but I believe this is deprecated. Also, these are virtual /exports paths. 

Here is the /diskless/pxeconfig.cfg/default file:

```
DEFAULT /gentoo-x86_64/boot/kernel-3.14.14-gentoo
APPEND ip=dhcp ro rootfstype=nfs root=/dev/nfs nfsroot=10.0.0.11:/gentoo-x86_64,nfsvers=4 init=/linuxrc
```

I'm assuming nfsroot here is pointed to the virtual /exports dir. Also tried _netdev as nfsroot option.

Running the showmount command:

```
$ sudo showmount -e
Export list for myhost:
/exports/root          10.0.0.0/16
/exports/home          10.0.0.0/16
/exports/opt           10.0.0.0/16
/exports/gentoo-x86_64 10.0.0.0/16
/exports               10.0.0.0/16
```

The result of this config is a kernel panic. When the kernel starts to boot on the diskless node, everything scrolls by too quickly to read, but about 3 seconds in, here is the basic stack trace left on screen.

```
? panic
? panic
mount_block_root
mount_root
prepare_namespace
kernel_init_freeable
? do_early_param
? rest_init
kernel_init
ret_from_fork
? rest_init
```

Last edited by srd on Tue Mar 17, 2015 1:00 am; edited 4 times in total

----------

## szatox

Silly question, doesn't NFS4 require some userland tools? Kerberos? Some other new security feature? 

Would it work with NFS3?

Oh, you might need no_root_squash option in /etc/exports - without it anyone going for root access will be demoted to "nobody" (is it still true for v4?) so your client has very limited access.

Also, have you made friends with a network traffic analyser? Wireshark used to be a great help for me when I was troubleshooting my own PXE setup.

----------

## srd

 *szatox wrote:*   

> Silly question, doesn't NFS4 require some userland tools? Kerberos? Some other new security feature? 
> 
> Would it work with NFS3?
> 
> 

 

I'm sure it does, things like nfs-utils. Kerberos? I don't know, does it? I haven't seen any tutorials that say so. My previous system works fine w/ NFS 3, but if 4 is out, I'd like to start understanding and use it.

 *szatox wrote:*   

> 
> 
> Oh, you might need no_root_squash option in /etc/exports - without it anyone going for root access will be demoted to "nobody" (is it still true for v4?) so your client has very limited access.

 

Thanks, I'll give that a try. I had it in when using NFS v3.

 *szatox wrote:*   

> 
> 
> Also, have you made friends with a network traffic analyser? Wireshark used to be a great help for me when I was troubleshooting my own PXE setup.

 

I'll give Wireshark a try, but w/o knowing what's supposed to be happening, I thought I'd try here first.

----------

## cwr

I use an exports file on the host of:

```
# /etc/exports: NFS file systems being exported.  See exports(5).
# insecure - allow ports > 1024
# no_root_squash - allow root access
# no_subtree_check - don't check further than filesystem
#
/exports         192.168.1.0/24(rw,fsid=0,no_subtree_check)
/exports/rootfs  192.168.1.0/24(rw,insecure,no_root_squash,no_subtree_check)
/exports/portage 192.168.1.0/24(rw,insecure,no_subtree_check)
# eof
```

and an fstab on the client of:

```
# NFS root filesystem.
192.168.1.20:/rootfs   /      nfs      defaults   0 0
#192.168.1.20:/portage   /usr/portage   nfs      ro,noatime,soft   0 0
#
# Common filesystems.
# Swapfile has UUID=aa64abcd-b617-44b0-9ecc-c7daa6d1dfd5
/var/swapfile      none      swap      sw      0 0
debugfs         /sys/kernel/debug debugfs   ro      0 0
```

That seems to work ok.

Will

----------

## srd

I can't see much difference between that and what I've tried. I'd be curious to see your pxelinux.cfg file.

----------

## cwr

I don't use PXE - it's an embedded system which boots from UBoot

Will

----------

## srd

OK. Anyone? Thoughts as to the configuration that I've posted?

----------

## szatox

No obvious mistakes there, have you managed to find out at which point computers stop chatting?

Maybe it would be a good idea to use initramfs to get started (so you know master server is configured well) and then only tune kernel options and nfs (initramfs from genkernel can boot liveCD over NFS -> root must be in sqfs)

Or get started with working NFS3 setup and then migrate it to NFS4?

----------

## srd

Actually, I have NFS3 working on the old node and am sure I can get it working easily enough. I just thought this might be a good time to migrate to v4, but I'm finding out lots of people are having problems w/ NFSv4 and hearing lots of really bad comments. And I'm starting to think it's justified. So, I think I may either move to something like sshfs or just move back to NFSv3 this week. I did put an initramfs together and things were booting fine, then I pulled it because I shouldn't need it for this.

Thanks for the ideas.

----------

## n3bul4

Have you tried to manually mount the nfs share with the nfsvers=X option set?

Maybe that will reveal some useful info.

----------

## krinn

Actually, I have never heard of anyone succeeding with root over NFS with v4, and my last info (which I can't point you to, as I don't remember where I read it) was that NFSv4 couldn't handle it.

What you could do is use NFSv3 (I'm not asking you to change everything to v3; a v4 server can also serve v3, it just depends on how you call the mount), so you can test v3 and see if it works. Once it works, switch back to v4.

For server, v3 or v4 share the same config (as long as it is a valid v4)

And clients are just mounting v3 over v4 by passing nfsvers=3 and calling the mount point with the full path instead of the nfsroot ; for you it's just using /exports/rootfs instead of /rootfs
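To make the path difference concrete, here is a minimal Python sketch of the mapping between the two views, assuming the `fsid=0` export (`/exports` in the configs posted above) acts as the NFSv4 pseudo-root. The function names are hypothetical helpers for illustration only, not part of nfs-utils.

```python
import posixpath


def v3_export_path(pseudo_root, v4_path):
    """Full server-side path an NFSv3 client would have to request."""
    return posixpath.join(pseudo_root, v4_path.lstrip("/"))


def v4_export_path(pseudo_root, v3_path):
    """Path an NFSv4 client requests, relative to the fsid=0 export."""
    rel = posixpath.relpath(v3_path, pseudo_root)
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{v3_path} is not below {pseudo_root}")
    return "/" if rel == "." else "/" + rel


# The diskless root from the first post, seen from both protocol versions.
print(v3_export_path("/exports", "/gentoo-x86_64"))
print(v4_export_path("/exports", "/exports/gentoo-x86_64"))
```

So a v3 test mount of the same tree would ask for the full `/exports/...` path, while the v4 mount keeps the short one.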

----------

## srd

 *n3bul4 wrote:*   

> Have you tried to manually mount the nfs share with the nfsvers=X option set?
> 
> Maybe that will reveal some useful info.

 

Yes, when I put a laptop on the network and mount my root file system from there manually, it mounts just fine using nfsvers=4 as an option. So that makes me think the NFS server is working and the problem is on the client.

Which leaves me with only two client side files to look at, 1) the pxelinux.cfg/default and 2) the /etc/fstab on the client. And I can't see anything wrong w/ either (and I've experimented with lots of options in the pxelinux.cfg/default file).

----------

## szatox

It's not fstab. You wouldn't consider PXE to be the culprit if you got your root mounted, and fstab is not consulted until after that point.

It is still possible that some stuff that should be compiled in kernel is not there.

Have you inspected the network traffic? What protocols do your boxes talk? At which point do they stop?

----------

## srd

 *szatox wrote:*   

> 
> 
> Have you inspected the network traffic? What protocols do your boxes talk? At which point do they stop?

 

I've run tcpdump on the server:

```

$ tcpdump -s0 -i eth0 host 10.0.0.11

```

and only 3 packets are captured. I see no NFS activity whatsoever. 2 packets are for DHCP (used for PXE), and the 3rd is an ARP broadcast. So DHCP is used to give the diskless node an IP which is then used in downloading the kernel via TFTP. Then the kernel boots. And then it panics w/ the panic in the OP.

So somewhere around the point of trying to mount root via NFS, something is not working and an NFS request doesn't even leave the node.

----------

## szatox

NFS request doesn't even leave the node. Doesn't kernel panic say "Attempted to kill init"?

Silly question, do you have the NIC driver compiled into the kernel? What you describe looks like the connection dropping as soon as the kernel takes control of the hardware  :Rolling Eyes: 

If you do, that would mean kernel does not understand what the root device is. So we gotta have a closer look at those params inside pxe config.

----------

## srd

Ok, sorry, I'm not familiar w/ how the NFS protocol behaves. As far as what the kernel panic says, everything goes by too quickly for me to see what it says. I'm looking into some way to change the font size so that I can fit more data on the screen, or some way to catch the boot messages. 

Yes, I do have the NIC driver compiled in, also, the node was able to download the kernel. I just put in an initramfs hoping it would tell me something more and got the message:

```

mount: mounting 10.0.0.11:/gentoo-x86_64 on /mnt/root failed: Network is unreachable.

```

And from that, it looks like I don't have a working network at that point. I believe you're right about the connection dropping as soon as the kernel takes over.

Edit:

Just added some framebuffer support to decrease the font size in order to get more output. Here is what I can now see as the error. This is not using an initramfs.

```
VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
Please append a correct "root=" boot option: here are the available partitions:
usb 1-5.2: new low-speed USB device number 3 using ehci-pci
0b00          1048575 sr0 driver: sr
Kernel panic - Not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.7-gentoo #2
...
```

I have NFSv4 client support built into the kernel (as well as 4.1 and 4.2). As a side note, the master fs is all contained in an LVM container if that might mean anything.

----------

## szatox

This is pxe config I got for my pxe over nfs3 with initramfs from genkernel:

LABEL Gentoo

MENU LABEL Gentoo Live on PXE

LINUX gentoo/kernel-genkernel-x86-3.7.10-gentoo

APPEND ip=dhcp root=/dev/ram0 cdroot=1 real_root=/dev/nfs nfsroot=10.0.1.1:/mnt/linux.images/tftp/gentoo initrd=gentoo/initramfs-genkernel-x86-3.7.10-gentoo loop=gx86.sqfs looptype=squashfs

genkernel's initramfs only works in liveCD mode by default. Changing this requires extracting the initramfs, replacing a few lines in the init script and stuffing it back into a single cpio archive.

It might be worth mentioning that my PXE clients request an IP twice during boot. The first call is made by the NIC's firmware, the second by the kernel itself. 

Anyway, I found https://www.kernel.org/doc/Documentation/kernel-parameters.txt which says your kernel command line should work. 

You might add "debug" to kernel's option to make initramfs drop you to shell early during boot and let you have a look around.

----------

## srd

szatox, thanks for that, but I think the differences (as you stated) are that you're using NFSv3 and I'm not running genkernel, or using an initramfs for that matter. Here's my latest pxelinux.cfg/default.

```

DEFAULT /gentoo-x86_64/boot/kernel-3.18.7-gentoo

APPEND ip=dhcp root=/dev/nfs rootfstype=nfs nfsroot=10.0.0.11:/gentoo-x86_64,nfsvers=4 init=/linuxrc real_root=/dev/mapper/vg0-root vga=795 ro

```

I have an LVM partition on /dev/sda2, so I've been trying real_root=/dev/mapper/vg0-root, thinking that might have something to do w/ not finding the block device (last line of the kernel panic shown below). NFSv4 exports a virtual directory (/exports) which is on my root partition, hence my rationale for using /dev/mapper/vg0-root. But so far, no good results.

```

Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

```

And I do have a custom initramfs that I built; when debugging, I include it so that it drops me into a rescue shell to look around. Through that, I've seen that I have no network connection. I'll go that route again and search around some more. But my intention is to not use an initramfs since it shouldn't be needed to get up and running via NFS.

----------

## srd

So, trying to focus more on the errors here ...

```
VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
Please append a correct "root=" boot option: here are the available partitions:
usb 1-5.2: new low-speed USB device number 3 using ehci-pci
0b00          1048575 sr0 driver: sr
Kernel panic - Not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.7-gentoo #2
...
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
```

Can anyone elaborate on the error code -6?

Also, it seems to be asking for a valid "root=". I have root=/dev/nfs. Any ideas as to why that would not be valid? I know it's not a real device, which makes this more difficult since NFS is just supposed to interpret it.

Does "unknown-block(0,0)" literally mean it doesn't recognize NFS as being the block device for the root fs? Or maybe the kernel doesn't have the correct drivers compiled in? I can't imagine what else would be needed.

Just looking for more clues as to what's going wrong.
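On the error code itself: the kernel reports failures as negative errno values, so "error -6" is errno 6, which on Linux is ENXIO ("No such device or address"). A small Python sketch for decoding such codes follows; `describe_kernel_error` is a hypothetical helper name, and the message text comes from the local C library, so the exact wording may vary by platform.

```python
import errno
import os


def describe_kernel_error(code):
    """Map a kernel-style negative error code to (symbolic name, message)."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# Decode the code from the "Cannot open root device" line above.
name, message = describe_kernel_error(-6)
print(f"error -6: {name} ({message})")
```

This only names the error; it does not say why the kernel found no device behind "root=".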

----------

## szatox

 *Quote:*   

> And I do have a custom initramfs that I built; when debugging, I include it so that it drops me into a rescue shell to look around. Through that, I've seen that I have no network connection. I'll go that route again and search around some more. But my intention is to not use an initramfs since it shouldn't be needed to get up and running via NFS.

  Good. I noticed you don't want an initramfs in your final version; I suggested it for troubleshooting on the way to that point.

 *Quote:*   

>  real_root=/dev/mapper/vg0-root

  is wrong. Your NFS client can't access /dev/mapper/vg0-root as a device because it's not exported. On the other hand, it shouldn't hurt either, as real_root is used by init inside genkernel's initramfs and not by the kernel directly. It seems to be a trick for moving the logic to userspace, where it can be customized for livecd-over-tcp/ip.

To let the contents of vg0-root be exported you must mount it on /exports/gentoo-x86_64. Important: on /exports/gentoo-x86_64 itself and not on any subdirectory, because NFS doesn't export child mountpoints. Once this is done, you can access the files residing on vg0-root, which is exactly what you usually want.

 *Quote:*   

> /exports/root          10.0.0.0/16
> 
> /exports/home          10.0.0.0/16
> 
> /exports/opt           10.0.0.0/16
> ...

 

/exports/gentoo-x86_64 is your client's root from the host's point of view. From the client's point of view it's 10.0.0.11:/gentoo-x86_64. Now, according to the documentation I linked, for NFS root you need the following options:

root=/dev/nfs <-necessary to enable the pseudo-NFS-device

nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]  <- If the `nfsroot' parameter is NOT given on the command line, the default "/tftpboot/%s" will be used.

ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip> <- autoconfiguration attempts to deal with empty fields. "ip" can appear alone.

rdinit=<executable file> <- The default value of this parameter is "/init"

Getting it all together, 

DEFAULT /gentoo-x86_64/boot/kernel-3.18.7-gentoo

APPEND root=/dev/nfs nfsroot=/gentoo-x86_64 ip rdinit=/linuxrc vga=795 ro 

should work as soon as you sort out that issue with network going down, assuming you have ready (extracted) filesystem in that nfsroot directory.
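The option syntax quoted above can be checked mechanically before rebooting a node. This is a rough Python sketch, assuming plain space-separated tokens (no quoting) and an IPv4 or hostname server in `nfsroot=`; `parse_cmdline`, `parse_nfsroot` and `check_nfs_root` are hypothetical helper names, not kernel or syslinux APIs.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def parse_nfsroot(value):
    """Split nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>] into its parts."""
    location, _, options = value.partition(",")
    server, sep, path = location.partition(":")
    if not sep:  # no server given, only a directory
        server, path = None, location
    return {
        "server": server,
        "path": path,
        "options": options.split(",") if options else [],
    }


def check_nfs_root(cmdline):
    """List problems that would keep the kernel from trying an NFS root."""
    params = parse_cmdline(cmdline)
    problems = []
    if params.get("root") != "/dev/nfs":
        problems.append("root= is not /dev/nfs")
    if "nfsroot" not in params:
        problems.append("nfsroot= missing, default /tftpboot/%s applies")
    return problems


append = ("ip=dhcp root=/dev/nfs rootfstype=nfs "
          "nfsroot=10.0.0.11:/gentoo-x86_64,nfsvers=4 init=/linuxrc vga=795 ro")
print(check_nfs_root(append))
print(parse_nfsroot(parse_cmdline(append)["nfsroot"]))
```

A clean result here only says the parameters are well-formed; it cannot tell whether the NIC actually comes up when the kernel takes over.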

Oh, I found one thing more that might help:  *Quote:*   

> nfsvers=2 or nfsvers=3 — Specifies which version of the NFS protocol to use. This is useful for hosts that run multiple NFS servers. If no version is specified, NFS uses the highest supported version by the kernel and mount command. This option is not supported with NFSv4 and should not be used. 

 

Since your paths are relative to /exports rather than to root, they won't work with nfs3. Thus, if it works, you know it's v4  :Smile: 

----------

## srd

Got all that, but not seeing anything different.

 *szatox wrote:*   

> It might be worth mentioning that my PXE clients request an IP twice during boot. The first call is made by the NIC's firmware, the second by the kernel itself.
> 
> 

 

I think requesting the IP a second time is a likely problem. I see the firmware making the first call, and that eventually gets the kernel downloaded to the node, but how does it request an IP the second time? I'm sure that's what the "ip=dhcp" param is for. But it doesn't appear to be working. Anyone have ideas on what to do about this?

When I drop to a shell after mount errors, I can clearly see that eth0 does not have an ip assigned to it. I'm really starting to think I've got NFSv4 configured properly, and because I have no network interface, NFS isn't going to mount.

----------

## srd

Dropping into a shell has shown me a problem when the BNX2 driver tries to load (which happens when I try to configure networking in the /init of my initramfs). I then see a message about a firmware load failing. Apparently (as I've been told), the firmware for this driver was pulled out of the kernel in recent versions (I'm running 3.18.7), so even though I have the BNX2 driver included in my kernel, the driver is incomplete (and therefore there is no networking for root NFS). This now seems to make an initramfs mandatory for NFSv4, because there needs to be a way to load the firmware from the linux-firmware package to get my Broadcom NetXtreme II NIC working.

Haven't tested this yet, but the explanation is a perfect description of what I'm seeing so it may be solved, will know when I get back to it.

----------

## szatox

You can choose to include firmware in kernel during compilation. Is this bnx2 driver an exception?

----------

## srd

When dropping to a shell and trying to bring up the interface manually, this message appeared.

```
bnx2 0000:03:00.0: Direct firmware load for bnx2/bnx2-mips-06-6.2.3.fw failed with error -2
bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.3.fw"
```

This led me to the NIC used by my diskless node. So to answer your question ...

No, it's not the exception. This is a Broadcom driver, quite common and found in a lot of Dells. The BNX2 kernel driver is still required, but the microcode was pulled out of the Broadcom driver and is now in the linux-firmware package. Apparently a lot of drivers are going that route. So aside from the built-in kernel driver, I now have to include my custom initramfs containing some /lib64/firmware/bnx2* top-level files and the following dir.

# tree lib64/firmware/bnx2

lib64/firmware/bnx2

|-- bnx2-mips-06-4.6.16.fw

|-- bnx2-mips-06-5.0.0.j3.fw

|-- bnx2-mips-06-5.0.0.j6.fw

|-- bnx2-mips-06-6.0.15.fw

|-- bnx2-mips-06-6.2.1.fw

|-- bnx2-mips-06-6.2.3.fw

|-- bnx2-mips-09-4.6.17.fw

|-- bnx2-mips-09-5.0.0.j15.fw

|-- bnx2-mips-09-5.0.0.j3.fw

|-- bnx2-mips-09-5.0.0.j9.fw

|-- bnx2-mips-09-6.0.17.fw

|-- bnx2-mips-09-6.2.1.fw

|-- bnx2-mips-09-6.2.1a.fw

|-- bnx2-mips-09-6.2.1b.fw

|-- bnx2-rv2p-06-4.6.16.fw

|-- bnx2-rv2p-06-5.0.0.j3.fw

|-- bnx2-rv2p-06-6.0.15.fw

|-- bnx2-rv2p-09-4.6.15.fw

|-- bnx2-rv2p-09-5.0.0.j10.fw

|-- bnx2-rv2p-09-5.0.0.j3.fw

|-- bnx2-rv2p-09-6.0.17.fw

|-- bnx2-rv2p-09ax-5.0.0.j10.fw

|-- bnx2-rv2p-09ax-5.0.0.j3.fw

`-- bnx2-rv2p-09ax-6.0.17.fw
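A quick way to confirm that an initramfs staging tree really contains whatever the driver asked for is to pull the file names out of the "Direct firmware load ... failed" messages and test for them on disk. The Python sketch below assumes that message format as shown earlier; `missing_firmware` and the `initramfs/lib64/firmware` staging path are hypothetical, so adjust them to your own layout.

```python
import os
import re

# Matches kernel lines such as:
#   bnx2 0000:03:00.0: Direct firmware load for bnx2/<file>.fw failed with error -2
FIRMWARE_RE = re.compile(r"Direct firmware load for (\S+) failed")


def missing_firmware(log_text, firmware_root):
    """Return firmware paths named in the log that are absent under firmware_root."""
    wanted = FIRMWARE_RE.findall(log_text)
    return [
        name for name in wanted
        if not os.path.isfile(os.path.join(firmware_root, name))
    ]


log = ("bnx2 0000:03:00.0: Direct firmware load for "
       "bnx2/bnx2-mips-06-6.2.3.fw failed with error -2")
# Hypothetical staging directory of the initramfs being built.
print(missing_firmware(log, "initramfs/lib64/firmware"))
```

An empty list means every requested blob is present in the tree; error -2 in the log is ENOENT, i.e. the file was simply not found.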

I just tried it out and all is good again. 

My inability to mount root via NFSv4 was due to an incomplete ethernet driver: it is now split into two packages, one of which (the firmware) requires the use of an initrd. 

szatox, and the rest of you, appreciate all the help in getting this solved.

----------

## szatox

Glad you got that working, however one thing is not clear to me.

When I build my kernels I can choose to include firmware in the kernel. You say the firmware for your NIC wasn't included, but the question is why? Because you didn't check the box to include it, or because it's in the linux-firmware package now? Like, really, the latter case looks weird. It's like simply taking this option away from us:

grep -i in_ker /usr/src/linux/.config

CONFIG_FIRMWARE_IN_KERNEL=y

----------

## srd

I just went back to check, and CONFIG_FIRMWARE_IN_KERNEL is enabled by default. So I really don't have a good answer as to why. I was just told that Broadcom recently decided to pull the firmware out of this driver, so even though that option is enabled, the driver doesn't actually include the firmware. There are several dozen drivers from various vendors in that package that have done the same. I guess it just depends on what drivers you use; for this Broadcom NIC, it's in the linux-firmware package now. I just wish I had at least gotten a warning message saying something more was needed.
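For anyone checking their own config: as far as I know, CONFIG_FIRMWARE_IN_KERNEL only covers firmware blobs shipped inside the kernel source tree, while blobs from linux-firmware can be built into the kernel image by listing them in CONFIG_EXTRA_FIRMWARE (with CONFIG_EXTRA_FIRMWARE_DIR pointing at the directory that holds them). The Python sketch below just reads those options out of a .config; `parse_kconfig` and `extra_firmware` are hypothetical helper names, and the sample values are assumptions for illustration, not taken from this thread.

```python
def parse_kconfig(text):
    """Parse kernel .config text into a dict, skipping comments and blanks."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            options[key] = value.strip('"')
    return options


def extra_firmware(text):
    """Return the firmware blobs listed in CONFIG_EXTRA_FIRMWARE, if any."""
    return parse_kconfig(text).get("CONFIG_EXTRA_FIRMWARE", "").split()


# Assumed sample: one bnx2 blob built in from /lib/firmware.
sample = '''
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE="bnx2/bnx2-mips-06-6.2.3.fw"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
'''
print(extra_firmware(sample))
```

If the list comes back empty, nothing from linux-firmware is built in, and the driver has to find its blob on a filesystem that is mounted when it loads, which for root-over-NFS means an initramfs.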

----------

