# NFS sharing not mounting at boot

## jyoung

I recently updated a machine that starts some net-requiring services at boot, but, after the update, the services no longer start properly. Specifically, it needs to start an ntp client and mount an NFS share. After the bootup is complete, I can mount the NFS share manually without issue, and there are no other signs of lack of connectivity.

During the update, emerge printed this message:

http://gentoo.org/support/news-items/2015-02-02-nfs-service-changes.html

but, I've implemented the suggested change:

```
rc-update add nfsclient
```

This machine uses OpenRC, and it *seems* like the solution would be to force the nfsclient service to wait for a connection. However, I've not had any success in doing this. I've added this to /etc/conf.d/nfsclient

```
rc_after="net"
rc_need="net"
```

I also found this old post which describes the opposite problem, the machine waiting too long for a connection before giving up:

forums.gentoo.org/viewtopic-p-2704033.html

I tried the opposite solution, adding the following line to /etc/conf.d/net:

```
dhcpcd_enp0s31f6="-t 60"
```

But, without success. Any help would be great!

----------

## Hu

You mention ntp near the beginning, then focus on NFS.  Is ntp working properly?

For your NFS problem, please post the fstab lines for the affected filesystems and output from the boot so we can see whether your attempt to make nfsclient start later worked.

----------

## krinn

You have missed this one: https://gentoo.org/support/news-items/2015-10-07-openrc-0-18-localmount-and-netmount-changes.html

Which basically means: if any mount fails, the service that tries to do this mount will fail, and all services that depend on that service will "not fail", but will wait for it to succeed (delayed).

The delay issue is not new: a service that depends on another should wait for its parent to succeed. Why try to start if your parent is not ready and you need it?

But what is new is that a failed mount attempt now marks the whole service as failed, while previously a failed mount was just ignored, and if any other mount succeeded, the service was marked as succeeded.

----------

## szatox

 *Quote:*   

> After the bootup is complete, I can mount the NFS share manually without issue, and there are no other signs of lack of connectivity.

 

Adding "_netdev" to fstab (in the mount options column) should fix it. Once done, mount for that device will be called after the network changes status to started.

Regarding ntp, did you tamper with init scripts?

What do "rc-service ntpd ineed" and "rc-service net.enp0s31f6 iprovide" say?

Funny... I wanted to ask about "rc-service -r net", but on my system it doesn't work for any service with an indirect provider (a notable example, though not the only one: net -> net.eth0) or with multiple providers, and yet the system boots just fine.

----------

## Jaglover

Isn't _netdev a systemd option?

----------

## jyoung

ntp isn't working either. I just set the time to an incorrect time and rebooted, and ntp did not reset the time and isn't currently running. The log does show some ntp errors:

```
Jun 14 09:13:01 box24 ntpd[5145]: bind(20) AF_INET6 fe80::eb96:b401:c451:e155%2#123 flags 0x11 failed: Cannot assign requested address
Jun 14 09:13:01 box24 ntpd[5145]: unable to create socket on enp0s31f6 (4) for fe80::eb96:b401:c451:e155%2#123
Jun 14 09:13:01 box24 ntpd[5145]: failed to init interface for address fe80::eb96:b401:c451:e155%2
Jun 14 09:13:01 box24 ntpd[5145]: Listening on routing socket on fd #20 for interface updates
Jun 14 09:13:02 box24 ntpd[5145]: bind(23) AF_INET6 fe80::eb96:b401:c451:e155%2#123 flags 0x11 failed: Cannot assign requested address
Jun 14 09:13:02 box24 ntpd[5145]: unable to create socket on enp0s31f6 (5) for fe80::eb96:b401:c451:e155%2#123
Jun 14 09:13:02 box24 ntpd[5145]: failed to init interface for address fe80::eb96:b401:c451:e155%2
Jun 14 09:13:04 box24 ntpd[5145]: Listen normally on 6 enp0s31f6 [fe80::eb96:b401:c451:e155%2]:123
Jun 14 09:13:11 box24 ntpd[5145]: Listen normally on 7 enp0s31f6 138.110.75.115:123
```

I did tinker with the ntp scripts months ago, but not recently.

The line in fstab for this particular NFS share is:

```
<IP address>:/export/cluster   /cluster nfs bg,timeo=14,_netdev,hard,intr,noatime,rsize=32768,wsize=32768,auto,nofail,_netdev 0 0
```

Note that I've just added the nofail (based on krinn's comment) and _netdev (based on szatox's comment). After reboot, still no luck.

"rc-service ntpd ineed" returns "fsck localmount dhcpcd"

and "rc-service net.enp0s31f6 iprovide" returns "net"

----------

## szatox

 *Jaglover wrote:*   

> Isn't _netdev a systemd option?

 

No. It's actually a mount option, and the start function of the localmount init script does consider it. (A very short snippet here.)

 *Quote:*   

> 	no_netdev="-O no_netdev"
> 
> 	mount -at "$types" $no_netdev
> 
> 

 

 *Quote:*   

> I just set the time to an incorrect time and rebooted, and ntp did not reset the time and isn't currently running

 

Ntpd does not hard reset the time. If your clock is heavily skewed (I think the default panic threshold is 1000 seconds), ntpd will simply exit, possibly returning an error code. In such cases you have to either change the config manually to force setting the time on boot, or run ntp-client before starting ntpd. Ntp-client will attempt to contact ntp servers, set your system time immediately to the correct value and then exit. From this point ntpd can take over to keep your clock in sync by speeding it up or slowing it down as you lag behind or rush ahead of the world.

 *Quote:*   

> 
> 
> Jun 14 09:13:02 box24 ntpd[5145]: bind(23) AF_INET6 fe80::eb96:b401:c451:e155%2#123 flags 0x11 failed: Cannot assign requested address
> 
> Jun 14 09:13:02 box24 ntpd[5145]: unable to create socket on enp0s31f6 (5) for fe80::eb96:b401:c451:e155%2#123 

 This is weird. Do you have another process listening on this port? 

Errr.... I've just noticed this little bit below and I'm thinking about the implications of the kinda weird output you got there. Are you using netifrc AND dhcpcd in daemon mode at the same time (as a separate service)? I mean, are both the net.<unpredictable_interface_name> and dhcpcd services started during boot?

 *Quote:*   

> 
> 
> "rc-service ntpd ineed" returns "fsck localmount dhcpcd"
> 
> and "rc-service net.enp0s31f6 iprovide" returns "net"

 

----------

## jyoung

Okay, it looks like ntp is starting at boot. I just tried the same experiment, but with a 2min offset, and ntp corrected the time. And, the ntp daemon is now running. So it looks like the problem may be localized to nfs ...

Yes, actually, I do have net.enp0s10 and dhcpcd started during boot.

----------

## jyoung

I just tried again with dhcpcd removed from the default runlevel. Now, it works. There's still an error message about the mount failing during the boot sequence, but based on krinn's comments that's to be expected (yes?).

Returning to the Gentoo Handbook, it says to add the net.* devices to the default runlevel, but not dhcpcd. Not sure why I did that back when I set up the system ...

----------

## Hu

 *jyoung wrote:*   

> There's still an error message about the mount failing during the boot sequence, but based on krinn's comments that's to be expected (yes?).

 No, krinn's comment was that the rules for deciding whether network mounts had succeeded are different now than they once were.  If your configuration is correct and the server is up, you should get an automated mount at the right point during boot.

----------

## krinn

 *Hu wrote:*   

> that the rules for deciding whether network mounts had succeeded

 

It's not limited to network mounts. It's especially visible with them because other services depend on them, but it affects all "user" mounts (the ones done from localmount).

And the problem: if a filesystem you try to mount is in error, even nofail does not help.

The key is that nofail only works for a device that is not present; a device that is present but in error will always report an error, and the service ends in error.

You end up with a cascade of blocked services (which will include nfs, network...), or even a non-bootable system! See https://bugs.gentoo.org/579876

This can be disabled by adding ignore_mount_errors="yes" in /etc/conf.d/localmount

If your nfsclient reports an error, that's bad, because anything depending on it will be stuck. You should fix that.

PS: _netdev is a mount option for cases where the system is unable to determine by itself whether the device is a network device or not. It's a special case; using _netdev on an NFS mount does nothing, since the system is fully aware that NFS mounts are network mounts.
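
To illustrate the distinction above, here is a minimal sketch (not OpenRC's own logic) that lists the fstab mount points which would count as network mounts: either the filesystem type is a known network type, or the options carry `_netdev`. The function name `network_mounts` and the short default type list in `NET_FS_TYPES` are assumptions made for the example; the list OpenRC really uses is longer.

```sh
#!/bin/sh
# Hypothetical helper: print the mount points in an fstab-style file that are
# network mounts, either by filesystem type or by an explicit _netdev option.
# NET_FS_TYPES is an assumed, deliberately short list of network filesystems.
network_mounts() {
    awk -v types="${NET_FS_TYPES:-nfs nfs4 cifs}" '
        BEGIN {
            n = split(types, t, " ")
            for (i = 1; i <= n; i++) net[t[i]] = 1
        }
        # Skip comments and lines without at least four fields.
        $1 ~ /^#/ || NF < 4 { next }
        {
            is_net = ($3 in net)
            m = split($4, opts, ",")
            for (i = 1; i <= m; i++)
                if (opts[i] == "_netdev") is_net = 1
            if (is_net) print $2
        }
    ' "$1"
}
```

Running `network_mounts /etc/fstab` on the machine in question should print `/cluster` whether or not `_netdev` is present, since the type is already `nfs`.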

----------

## jyoung

The failed attempt at mounting occurs before the net service is started. Maybe part of the solution would be to change the order?

 *Quote:*   

>  ignore_mount_errors="yes" in /etc/conf.d/localmount 

 

Let me see if I understand this: adding this will ignore (safely) the initial failure to mount the NFS share, but allow nfsclient to keep attempting the mount in the background until it succeeds?

----------

## szatox

I'm glad you're making progress here; one problem solved is always good news.

 *Quote:*   

> There's still an error message about the mount failing during the boot sequence, but based on krinn's comments that's to be expected (yes?).

 No, your system is not supposed to even try mounting NFS until your network is up and running, and once it is up and running, nfsmount should succeed.

 *Quote:*   

> PS: _netdev is a mount option for cases where the system is unable to determine by itself whether the device is a network device or not. It's a special case; using _netdev on an NFS mount does nothing, since the system is fully aware that NFS mounts are network mounts.

  Does the localmount script know about it, and does mount know that the network is down? I checked the script and the manpages for _netdev; it explicitly orders mount to skip those devices. I'm not sure what happens without this parameter.

----------

## jyoung

My /etc/conf.d/localmount only has comments in it; I haven't altered it. Should I alter it to make localmount aware of the network issues?

----------

## szatox

jyoung, localmount is a service that keeps all of its relevant configuration in a completely unrelated file: /etc/fstab. There should be no need to ever alter it. You just make sure it's enabled in "boot" runlevel and let it do its job.

Now, I inspected a few other init scripts and spotted an epic fail on our part:

nfsmount is an outdated dummy script now. It has been replaced by the pair nfsclient + netmount.

Netmount actually "wants" nfsclient, so adding netmount to the "default" runlevel should be sufficient, since it will attempt to start nfsclient if it finds an NFS share in /etc/fstab.

Back to the failure mounting NFS share: which particular service attempts and fails to mount NFS share before network start?

Are you actually on a Linux kernel, or a BSD one?

I'm confused, something doesn't fit. A quick summary of current issue would help us ensure we're on the same page.

I know it may seem redundant. Still, natural language is not a very strict protocol and redundant data helps a lot with error correction.

----------

## jyoung

I have the NFS share setup in /etc/fstab, so that means that the service which is trying to start it would be netmount (yes?)

Prior to the system update, all that was needed for this machine to start properly was the config in /etc/fstab and nfsmount in the default runlevel. This machine is a node in a cluster, and it needs to have the NFS share mounted as part of the boot process.

netmount and nfsclient are both in the default and boot runlevels. Should I remove nfsclient?

----------

## Hu

 *jyoung wrote:*   

> I have the NFS share setup in /etc/fstab, so that means that the service which is trying to start it would be netmount (yes?)

 Maybe.  That's why szatox asked you to tell us what service is mounting it.  The output around when it fails should show what we need.  Please quote it to us.

----------

## krinn

It might help you seeing it:

```
rc-update | grep "nfs\|mount"
           localmount | boot                                          
             mount-ro |                        shutdown               
             netmount |      default                                  
```
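
A listing like the one above can also be checked mechanically. The following is a small sketch, assuming the `service | runlevels` layout shown in that output; `runlevels_of` is a hypothetical helper name, not an OpenRC command.

```sh
#!/bin/sh
# Hypothetical helper: read `rc-update` output on stdin and print the
# runlevels (space-separated) in which the given service is enabled.
runlevels_of() {
    awk -F'|' -v svc="$1" '
        {
            name = $1
            gsub(/[ \t]/, "", name)          # strip the column padding
            if (name == svc) {
                n = split($2, lv, " ")       # runlevel names, padding removed
                out = ""
                for (i = 1; i <= n; i++) out = out (i > 1 ? " " : "") lv[i]
                print out
            }
        }
    '
}
```

For example, `rc-update | runlevels_of netmount` prints nothing when the service is in no runlevel, which makes a stale or missing entry easy to spot.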

The content of nfsmount should have been changed (did you forget to etc-update?) so that it does nothing but emit a deprecation message. Here it is:

```
   ewarn "nfsmount is deprecated, please migrate as described in the news item: 2015-02-02-nfs-service-changes"
   ewarn "This migration script will be removed after 01 Aug 2015."
```

 *szatox wrote:*   

> Does the localmount script know about it, and does mount know that the network is down?

 

No, it never knows whether the network is down; it only knows whether the device is a network device or not. nfs is a known network fs, which is why you don't have to hint about it with _netdev.

But with or without _netdev, it will try to mount them. To balance this, the service depends on a net provider; still, even if net is up, that doesn't mean it will work (net is up if the card is up, but without a cable link the network is not "ready"; another case: net is up and the cable is plugged in, but the server itself is down or nfsd just isn't started).

_netdev is for devices that could be either network or local; with _netdev you make it clear that the device is not local.

I balance this myself with the init script below, because if you are mounting 6 shares from that host and that host is down, you have to wait for each timeout (if the timeout is set to 30s, that means a 30s x 6 delay; boring).

```
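#!/sbin/openrc-run
# Start netmount only when the NFS server answers a ping.

depend() {
   need net
}

start() {
   ebegin "Starting ifserverup"
   # One ping; only the exit status matters, so discard the output.
   ping -c1 192.168.0.6 >/dev/null 2>&1
   rc=$?
   if [ $rc -eq 0 ]; then
      /etc/init.d/netmount start
      # Close the ebegin line and report netmount's own result.
      eend $? "netmount failed to start"
   else
      ewend $rc "Server is down"
      return 1
   fi
}

stop() {
   ebegin "Stopping network share"
   /etc/init.d/netmount stop
   eend $?
}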
```
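
One possible refinement of the script above, offered only as a sketch: probe the server a bounded number of times instead of once, so a server that is slow to come up is not written off after a single lost ping. `wait_for_server` and the `PROBE` override are hypothetical names introduced here; `-W1` assumes the iputils `ping`, where it limits the wait for a reply to one second.

```sh
#!/bin/sh
# Hypothetical helper: succeed as soon as the host answers one probe,
# give up after a fixed number of attempts.
# PROBE is an assumed override hook so the loop can be exercised offline.
wait_for_server() {
    host=$1
    tries=${2:-5}     # maximum number of probes
    delay=${3:-2}     # seconds to sleep after a failed probe
    attempt=0
    while [ "$attempt" -lt "$tries" ]; do
        # Default probe: one ICMP echo, waiting at most 1 second for a reply.
        if ${PROBE:-ping -c1 -W1} "$host" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}
```

Inside `start()`, the single `ping` could then be replaced by something like `wait_for_server 192.168.0.6 5 2`, keeping the rest of the logic unchanged. The worst-case delay is bounded by roughly `tries` times (`delay` + 1) seconds.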

----------

## P.Kosunen

/etc/conf.d/netmount:

rc_need="dhcpcd"

/etc/dhcpcd.conf:

waitip 4

I have the "waitip 4" option in dhcpcd.conf so that dhcpcd waits until IPv4 is ready before boot continues, and netmount is set to wait for dhcpcd before starting. Also, "_netdev" must be in the fstab options for NFS mounts. IIRC OpenRC parallel start must be disabled.

----------

## jyoung

Here's the snippet from /var/log/daemon.log that shows the NFS failure. And, actually, I'm seeing that NTP is failing too at first. I missed that among the messages scrolling past during boot.

```
Jun 12 15:12:45 box24 dhcpcd[4290]: control command: dhcpcd -m 2 enp0s31f6
Jun 12 15:12:45 box24 ntpdate[5439]: name server cannot be used: Temporary failure in name resolution (-3)
Jun 12 15:12:45 box24 /etc/init.d/ntp-client[5418]: ERROR: ntp-client failed to start
Jun 12 15:12:46 box24 /etc/init.d/netmount[5496]: Failed to mount /cluster
Jun 12 15:12:46 box24 /etc/init.d/netmount[5472]: ERROR: netmount failed to start
```

Krinn, yes, I did forget to do etc-update. Once I ran that, the machine no longer mounted the NFS share (recall that removing dhcpcd from the boot runlevel allowed NFS to mount, albeit with an error message). One of the files that was updated by etc-update was rc.conf. The difference between the old and the updated version was that the old (working) version had:

```
rc_depend_strict="NO"
```
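
When comparing an old and a new rc.conf like this, it can help to print the effective value of a single setting from each file. A minimal sketch, relying on rc.conf being plain shell variable assignments; `rc_conf_value` is a hypothetical helper, and the fallback passed as the third argument is whatever you assume the built-in default to be (`YES` for `rc_depend_strict`, as far as I know).

```sh
#!/bin/sh
# Hypothetical helper: print the value a variable has after sourcing an
# rc.conf-style file, or a fallback when the file leaves it unset.
# The file is sourced in a subshell so the caller's environment is untouched.
rc_conf_value() {
    # $1 = path to the file (must contain a slash), $2 = variable, $3 = fallback
    (
        . "$1" >/dev/null 2>&1
        eval "printf '%s\n' \"\${$2:-$3}\""
    )
}
```

For instance, `rc_conf_value /etc/rc.conf rc_depend_strict YES` shows which behaviour is in effect after etc-update has merged the new file. Only use this on files you trust, since sourcing executes whatever they contain.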

----------

## jyoung

During a recent reboot I was able to catch a message not reported in /var/log/daemon.log:

```
mount.nfs Network is unreachable
```

P.Kosunen, I tried your setup, modifying for IPv6:

/etc/conf.d/netmount: 

rc_need="dhcpcd" 

/etc/dhcpcd.conf: 

waitip 6 

That seems to work; there's no longer a message either at boot or in the logs about the NFS share failing to mount. To make this solution a bit more general, would there be a way to make dhcpcd wait for either IPv4 or IPv6? 

krinn, I'm very interested in your solution as it seems a good idea for cases where there's no connection. This script is in /etc/init.d, and you added it to rc, yes?

----------

## krinn

 *jyoung wrote:*   

> krinn, I'm very interested in your solution as it seems a good idea for cases where there's no connection. This script is in /etc/init.d, and you added it to rc, yes?

 

Yes, in /etc/init.d.

Implementation is easy: change the ping test to use your server's IP, then run:

```
rc-update add ifserverup
rc-update del netmount
```

You might have noticed how simple it is, but it does the job. If someone is in the mood for improvements...  :Wink: 

----------

## P.Kosunen

 *jyoung wrote:*   

> To make this solution a bit more general, would there be a way to make dhcpcd wait for either IPv4 or IPv6?

 

If you leave the number out, it should do that(?).

----------

## jyoung

I just tried leaving the number out, and it still works on IPv6. Unfortunately, I don't have an IPv4 network to test with. I also ran the additional test of booting without a connection at all. Of course, it didn't start NFS or NTP, but it also didn't get stuck or anything like that.

Unless anyone objects, I'm going to mark this thread as SOLVED.
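
To confirm after boot which kind of address the interface actually received, one option is to look for a global-scope address of either family. This is only a sketch: `has_global_address` is a hypothetical helper, and it assumes the one-line-per-address format printed by `ip -o addr show`.

```sh
#!/bin/sh
# Hypothetical helper: read `ip -o addr show` output on stdin and succeed if
# any IPv4 or IPv6 address has global scope (link-local fe80:: does not count).
has_global_address() {
    awk '
        ($3 == "inet" || $3 == "inet6") {
            for (i = 4; i < NF; i++)
                if ($i == "scope" && $(i + 1) == "global") found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage would look like `ip -o addr show dev enp0s31f6 | has_global_address && echo "address is up"`. It does not replace the dhcpcd setting; it only verifies the result.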

----------

## szatox

This solution sucks, but it's up to you to decide whether or not you're happy with it.

----------

## jyoung

This solution does feel like a bit of a hack. I'd be happy to continue investigating ...

----------

## szatox

Time to bring up the big guns then. And the bigger picture.

1) Go to /etc/rc.conf and change rc_logger="NO" to "YES"

2) rc-update --update (just in case, shouldn't matter but it won't hurt)

3) Reboot and post /var/log/rc.log and the output from rc-update (without any params). Actually, the full /etc/conf.d/net in its current state could be useful too.

Let's see what's actually going on there.
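
Once a log is available, the failing services can be pulled out of it quickly. A small sketch, assuming the log uses the same "ERROR: <service> failed to start" wording seen in the daemon.log excerpt earlier in the thread; `failed_services` is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper: print the unique names of services reported as
# "ERROR: <service> failed to start" in the given log files (or on stdin).
failed_services() {
    sed -n 's/.*ERROR: \([^ ]*\) failed to start.*/\1/p' "$@" | sort -u
}
```

For example, `failed_services /var/log/rc.log` gives a short list to start from; the lines around each match in the log then show which dependency was missing.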

----------

## jyoung

Okay, here are the results:

When running rc --update, the following errors were generated:

```
Error: modules-load is the name of a real and virtual service.
Error: tmpfiles.dev is the name of a real and virtual service.
Error: tmpfiles.setup is the name of a real and virtual service.
```

And here's /var/log/rc.log

www.pastebin.com/ucP6xvr4

The only thing in /etc/conf.d/net is 

```
dhcpcd_enp0s31f6="-t 60"
```

which is leftover from an earlier experiment to solve this problem.

----------

## szatox

rc-update --update and rc --update ain't the same thing. You're lucky you didn't fry your system.

I skimmed through your boot log and I noticed a few things:

1) You start dhcpcd in boot.

This is not necessarily bad, but certainly unusual. Net is much more common in "default", even though it doesn't make a big difference on a personal machine.

Interestingly enough, the netmount service also starts there. This wouldn't be too bad if not for 3.

2) You use both, dhcpcd AND netifrc to configure your network. This is bad. I don't suppose you explicitly disabled enp0s31f6 in dhcpcd's config, which means dhcpcd configures this interface in boot, and then netifrc calls dhcpcd again in default runlevel.

At least you can expect both of those services to produce the same result, which limits conflicts a bit, but it is a conflict nonetheless. Messy.

Choose one or the other. Either make sure dhcpcd does not provide net (and is not explicitly enabled), or remove the netifrc symlink from /etc/init.d/

In either case you can completely clear /etc/conf.d/net. When empty, it defaults to dhcp, and dhcp tends to "just work", without any extra parameters.

3) What is eno1? Is this device supposed to exist?

If you replaced it with (or renamed to) enp0s31f6, remove the symlink too. This service will always fail and since it provides net, failure at this point should prevent ntp and nfs from starting. Oops.

4) Does your netmount service depend on net? Apparently it does not by default.

Go to /etc/conf.d/netmount and add the line:

rc_need="net !network"

(network is a deprecated service AND it is redundant - netmount's init script seems to be a bit outdated, but we can fix it with this config file)

With conflicts resolved and dependencies fixed you should be good to go.
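
For point 3, leftover net.* entries can be found by comparing /etc/init.d against the interfaces the kernel currently knows about. A minimal sketch, assuming interfaces show up as entries under /sys/class/net; `stale_net_services` is a hypothetical helper, and both directories can be overridden for testing.

```sh
#!/bin/sh
# Hypothetical helper: list net.<iface> entries in an init.d directory whose
# interface has no entry in a sysfs-style directory of network devices.
stale_net_services() {
    initd=${1:-/etc/init.d}
    sysnet=${2:-/sys/class/net}
    for svc in "$initd"/net.*; do
        # Skip the literal pattern when the glob matched nothing.
        [ -e "$svc" ] || [ -L "$svc" ] || continue
        iface=${svc##*/net.}
        if [ ! -e "$sysnet/$iface" ]; then
            printf '%s\n' "net.$iface"
        fi
    done
}
```

On the machine in this thread, `stale_net_services` would presumably report `net.eno1` if that interface no longer exists. An interface that only appears later (hotplugged hardware, for example) would also be listed, so treat the output as a hint rather than a verdict.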

----------

