# nfs4 + portage (distfiles) problem

## zhushazang

I have four desktop machines running Gentoo on my network.

After some initial problems setting up nfs4, two machines (the nfs4 "server" and the first nfs4 "client") are working correctly. But when I tried to set up the other nfs4 clients to use the same /usr/portage/distfiles, an error occurred. I can create, delete, write and read files inside the nfs4 mount, but the error below appears on the other two machines that use the nfs4-mounted directory.

```
>>> Emerging (1 of 4) dev-python/sip-4.10.5
Cannot chown a lockfile: '/usr/portage/distfiles/.sip-4.10.5.tar.gz.portage_lockfile'
 * sip-4.10.5.tar.gz RMD160 SHA1 SHA256 size ;-) ...                     [ ok ]
Traceback (most recent call last):
  File "/usr/lib/portage/bin/ebuild", line 268, in <module>
    debug=debug, tree=mytree)
  File "/usr/lib/portage/pym/portage/proxy/objectproxy.py", line 32, in __call__
    return result(*args, **kwargs)
  File "/usr/lib/portage/pym/portage/package/ebuild/doebuild.py", line 838, in doebuild
    fetchonly=fetchonly):
  File "/usr/lib/portage/pym/portage/proxy/objectproxy.py", line 32, in __call__
    return result(*args, **kwargs)
  File "/usr/lib/portage/pym/portage/package/ebuild/fetch.py", line 612, in fetch
    stat_cached=mystat)
  File "/usr/lib/portage/pym/portage/util/__init__.py", line 910, in apply_secpass_permissions
    stat_cached=stat_cached, follow_links=follow_links)
  File "/usr/lib/portage/pym/portage/util/__init__.py", line 745, in apply_permissions
    os.chown(filename, uid, gid)
  File "/usr/lib/portage/pym/portage/__init__.py", line 228, in __call__
    rval = self._func(*wrapped_args, **wrapped_kwargs)
OSError: [Errno 22] Invalid argument: '/usr/portage/distfiles/sip-4.10.5.tar.gz'
 * Fetch failed for 'dev-python/sip-4.10.5', Log file:
 *  '/var/tmp/portage/dev-python/sip-4.10.5/temp/build.log'
>>> Failed to emerge dev-python/sip-4.10.5, Log file:
>>>  '/var/tmp/portage/dev-python/sip-4.10.5/temp/build.log'
 * Messages for package dev-python/sip-4.10.5:
 * Fetch failed for 'dev-python/sip-4.10.5', Log file:
 *  '/var/tmp/portage/dev-python/sip-4.10.5/temp/build.log'
```

Why does it say "Cannot chown a lockfile"?

Regards

----------

## depontius

By default, nfs is set up with root "squashed", essentially turned into nobody.  So root has less capability on a mounted filesystem than even an ordinary user.  Normally this is OK, even desirable, because it gives your users extra privacy, but putting distfiles on nfs is one situation where it isn't.

Look for the "root_squash"/"no_root_squash" options.  Normally root_squash is the default.  I believe you have to have "no_root_squash" in both /etc/exports on the server, and on the mount definition line in /etc/fstab on the client.  I've never done this, so I don't know exactly.  But I believe this is your problem.
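To see at a glance which exports still squash root, something like the following rough Python sketch could be run over the text of the server's /etc/exports. As far as I know, no_root_squash is an export option described in exports(5), so the sketch only looks at the server side. The function name `root_squash_report` and the sample export line (path, network, options) are made up for illustration, and the parser ignores line continuations and escaped paths.

```python
import re

# Matches one "client(options)" entry of an /etc/exports line.
CLIENT_RE = re.compile(r"^(?P<client>[^()]*)\((?P<opts>[^()]*)\)$")

# Hypothetical export line, for illustration only.
SAMPLE_EXPORTS = (
    "/usr/portage/distfiles "
    "192.168.0.0/24(rw,sync,no_root_squash,no_subtree_check)\n"
)


def root_squash_report(exports_text):
    """Return (path, client, squashed) tuples, one per export entry.

    squashed is True unless no_root_squash is listed, because
    root_squash is the default.
    """
    report = []
    for raw in exports_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        path, *entries = line.split()
        for entry in entries:
            match = CLIENT_RE.match(entry)
            if match is None:
                # Entry without an option list: defaults apply.
                client, opts = entry, []
            else:
                client = match.group("client")
                opts = match.group("opts").split(",")
            report.append((path, client, "no_root_squash" not in opts))
    return report


for row in root_squash_report(SAMPLE_EXPORTS):
    print(row)
```

To check a real server, pass the contents of its /etc/exports instead of the sample string.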

I do still have slight root squash problems on my systems.  My /home is nfs4 mounted, and every now and then some ebuild or another wants to touch something in /home, and fails.  (IIRC, it might be mythtv, because /home/mythtv does exist.)  I have to unmount /home and repeat the emerge.  For me this works, because my local /home is really mounted on /local.  On the native /home are per-user symlinks to /local/(user).  For me, the mythtv user is a bit of a special case, in that it's box-local.  So on my nfs server /home/mythtv is yet another symlink to /local/mythtv.  So for me, unmounting /home gets the right files in /home/mythtv updated, because it's really updating /local/mythtv, and that's what's normally used.  It's just that when /home is nfs4-mounted, root can't traverse the symlink.

----------

## zhushazang

It is working now.

I only needed to configure /etc/idmapd.conf correctly.

The Domain line needs to be the same on every machine that will use the nfs-exported directory.

Thanks to all.
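For anyone who wants to check this across several machines, here is a minimal Python sketch that compares the Domain setting from copies of each machine's /etc/idmapd.conf. It assumes the file is INI-like with the Domain key in a `[General]` section; the helper names `idmapd_domain` and `domains_match` are made up for illustration, and the comparison is a plain exact string match.

```python
import configparser


def idmapd_domain(conf_text):
    """Return the Domain value from the [General] section, or None if unset."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return parser.get("General", "Domain", fallback=None)


def domains_match(conf_texts):
    """True if every config sets Domain and all the values are identical."""
    domains = [idmapd_domain(text) for text in conf_texts]
    if not domains or any(domain is None for domain in domains):
        return False
    return len(set(domains)) == 1
```

Read each machine's idmapd.conf into a string and pass the list to `domains_match`. A result of False means at least one machine has no Domain line (or a commented-out one) or a different value.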

----------

## mattst88

For reference, the relevant bug report is https://bugs.gentoo.org/show_bug.cgi?id=318847

----------

## clytle374

I just developed this problem with a portage share.  

I set the domain line in /etc/idmapd.conf  the same on both machines, and it still doesn't work

Is there any other config for idmap?  I can't find anything on it, I did find mention that it needs to be running, like in a service?  If so, I can't find one.

thanks

Cory

----------

## titanofold

 *clytle374 wrote:*   

> I just developed this problem with a portage share.  
> 
> I set the domain line in /etc/idmapd.conf  the same on both machines, and it still doesn't work
> 
> Is there any other config for idmap?  I can't find anything on it, I did find mention that it needs to be running, like in a service?  If so, I can't find one.
> ...

 

Did you restart rpc.idmapd?

```
/etc/init.d/rpc.idmapd restart
```

On both the server and the client?
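If you are not sure whether the daemon is running at all, a small Python sketch like this one could scan /proc for it. It assumes a Linux /proc layout with a `comm` file per process; the function name `process_running` is made up for illustration, and it only tells you that a process with that name exists, not that it is configured correctly.

```python
import os


def process_running(name, proc_root="/proc"):
    """Return True if any process under proc_root has the given command name."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                if handle.read().strip() == name:
                    return True
        except OSError:
            # The process exited (or is unreadable) while scanning; skip it.
            continue
    return False
```

Calling `process_running("rpc.idmapd")` on the server and on each client shows whether the daemon is present on that machine.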

----------

## clytle374

Thanks, I had solved this and totally forgot about this post.

The problem was caused by something changing, and I had to use mountvers=3 in fstab to get proper permissions.

Unfortunately I don't remember what I did to fix rpc.idmapd.  I think revdep-rebuild fixed it.

Thanks

Cory

----------

## genterminl

I'm facing a similar problem.  I originally posted here.  I even filed a bug, which was closed as a duplicate of the one mentioned above.  I simply gave up for over a year, but now I need a larger tmpdir, which I only have on the machine I'm using as the nfs server.

In my case, on a completely unpredictable basis, an emerge, after saying "Completed installing ....", fails with a Python stack trace ending with

> OSError: [Errno 22] Invalid argument: '/home/portage/tmpdir/portage/media-sound/mpg123-1.12.1/build-info/CHOST.28249'

although that final number is essentially random.  Recently (not sure exactly when it started) even emerges that do succeed end with

> rm: cannot remove `/home/portage/tmpdir/portage/media-sound/lame-3.99.3/temp': Directory not empty

even though that temp directory is empty by the time I look at it.

I have the same "Domain = home" in /etc/idmapd.conf on both boxes, rpc.idmapd is running on both boxes, and the relevant uids and gids are the same on both boxes.  At the moment, /etc/exports on the server includes

```
/exports                client1.home(fsid=0,rw,sync,root_squash,no_subtree_check)

/exports/portage/tmpdir client1.home(fsid=1,rw,async,no_root_squash,no_subtree_check)
```

although I've tried quite a few variants with no change in behavior.

Can anyone suggest anything else I can check or try?

Jack

----------

## genterminl

Short update.  I just paid more attention to the exact error "OSError: [Errno 22] Invalid argument..."  and the invalid argument is the file name ..../CHOST.nnnn and it is trying to do a chmod.  By the time I look, the file is CHOST, without the numeric extension.  At that point, emerge --resume has about 50% chance of completing that emerge, and doing "ebuild path/to/ebuild merge" seems to always complete successfully.

In addition, however, EVERY emerge or ebuild ends with

> rm: cannot remove `/home/portage/tmpdir/portage/kde-base/kapptemplate-4.7.4/temp': Directory not empty

but again, by the time I look, that directory IS empty.

I can manually do touch and chmod on files in the mounted directory, and I'm pretty sure idmapd is configured and running correctly, so I'm now wondering if the underlying issue is one of timing, locking, or caching.  For example, if emerge does a rename or delete, and the next command fails because the rename or delete returned before it actually finished, or there is some caching going on that is not fully caught up, how can I confirm that and, more importantly, how can I fix it?
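One way to test the timing or caching theory outside of emerge might be a small probe that repeats the same kind of sequence many times and records which step fails. The Python sketch below is only loosely modelled on what the error message suggests (a file written under a name with a numeric suffix, then chown/chmod, then a rename); it is not portage's actual code, and the function name `probe_write_rename` and the file name `PROBE` are made up for illustration.

```python
import errno
import os


def probe_write_rename(directory, rounds=100, uid=None, gid=None):
    """Repeat write -> chown -> chmod -> rename -> unlink in directory.

    Returns a list of (round, step, errno_name) for every OSError seen.
    uid and gid default to those of the current process.
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    final = os.path.join(directory, "PROBE")
    temp = "%s.%d" % (final, os.getpid())  # e.g. PROBE.28249
    failures = []
    for i in range(rounds):
        step = "write"
        try:
            with open(temp, "w") as handle:
                handle.write("x\n")
            step = "chown"
            os.chown(temp, uid, gid)
            step = "chmod"
            os.chmod(temp, 0o644)
            step = "rename"
            os.rename(temp, final)
            step = "unlink"
            os.unlink(final)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            failures.append((i, step, name))
            # Remove leftovers so the next round starts clean.
            for leftover in (temp, final):
                try:
                    os.unlink(leftover)
                except OSError:
                    pass
    return failures
```

Running `probe_write_rename("/home/portage/tmpdir")` as root on the client, once with the normal mount options and once with noac, would show whether the failures are intermittent and which step they hit. An `EINVAL` entry would correspond to the "[Errno 22] Invalid argument" in the stack trace.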

I'm really getting desperate here  - as I'm in the midst of a 160 package update (mostly KDE 4.7.4) so it's getting painful doing each one manually.

----------

## Hu

How space constrained is your build system?  It seems weird that it would be more efficient to endure the performance of building in an NFS mount than to build locally.

----------

## genterminl

libreoffice.  Everything else I could do locally.  That's the only one I really don't have room for.  However, at this point, I'm on a mission just to figure out what the problem is.  I'm starting to play with nfs mount options, as I really do suspect caching of some sort.  Setting noac has made performance go to &*)(&), but I have had no chown failures with it.  I still get the failures to delete folders that are apparently empty, though.

----------

## elvis_

I had a similar problem and solved it by commenting out the LDAP lines in /etc/idmapd.conf on the nfs server machine. I restarted the service and it all started working again.

```
# server information (REQUIRED)

#LDAP_server = ldap-server.local.domain.edu   <<<<  this one

# the default search base (REQUIRED)

#LDAP_base = dc=local,dc=domain,dc=edu    <<<<  and this one

```

I guess they must have been added during an update or something

----------

## vostorga

 *elvis_ wrote:*   

> I had a similar problem and solved it by commenting out the ldap lines in 
> 
> /etc/idmapd.conf on the nfs server machine. I restarted the service and it all started working again.
> 
> ```
> ...

 

The above fixed things up for me. I edited /etc/idmapd.conf on both the server and the client.

----------

## genterminl

I never did have any LDAP lines in my idmapd.conf files.  I don't recall if there were any in the template, but if so, I know I would have immediately commented them out, as I have no use for LDAP.

----------

