# LDAP and NSS boot problem

## patkc66

I've installed openldap and nss_ldap.  I set up the LDAP directory so it has the equivalent of /etc/passwd, /etc/shadow, and /etc/group in it.  I then edited /etc/nsswitch.conf as follows to look in the LDAP directory as well as the files for user information.

```
(snippet from /etc/nsswitch.conf)
...
passwd:      files ldap
shadow:      files ldap
group:       files ldap
...
```

I tested things using the following steps, assuming uid 504 exists in LDAP as johndoe but not in /etc/passwd:

```
## LDAP lookup test
# touch /tmp/test
# chown 504 /tmp/test
# ls -l /tmp/test
-rw-r--r--    1 johndoe  root            0 Apr 18 23:07 /tmp/test
#
```

Satisfied that the NSS user lookup is looking in the LDAP server as needed, I rebooted the system.

It got as far as the "Cleaning tmp files" messages and hung.

By adding additional eend 0 and ebegin statements to /etc/init.d/bootmisc, I was able to determine that the line it is hanging on is this one:

```
chown root.root /tmp/.{ICE,X11}-unix &>/dev/null
```

I then have to boot to my Boot CD, mkdir /mnt/gentoo, mount my root filesystem to /mnt/gentoo, and change /mnt/gentoo/etc/nsswitch.conf back to this to get things to boot again:

```
(snippet from /etc/nsswitch.conf)
...
passwd:      compat
shadow:      compat
group:       compat
...
```

Any ideas why things are hanging there?

By the way, I tried running chown root.root /tmp/test immediately after running the LDAP lookup test above, and it runs fine.  It's almost like the chown command is trying to look things up in the LDAP directory even though it already was able to get its answer from the files.  But since this is running pretty early in the boot runlevel, slapd hasn't been started yet.

When I looked at the man page for nsswitch.conf, I noticed that it has options you can set between files and ldap that control whether it goes on to the next one or not.  Do I need to add one of those to prevent it from trying to go to LDAP if it was able to get the answer from the files?

Thanks.
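On the question of whether NSS moves on to LDAP after the files have answered: as far as the nsswitch.conf man page describes it, the default action on a successful lookup is to return immediately, so with "files ldap" the ldap source should only be consulted for names the files cannot resolve. A rough way to see which names fall into that second group is to query the files source on its own. This is only a sketch, assuming glibc's getent and its -s option; the helper name in_files is made up for illustration.

```shell
#!/bin/sh
# in_files NAME: succeed if NAME resolves from the local files alone.
# (Hypothetical helper; relies on the -s option of glibc's getent.)
in_files() {
    getent -s files passwd "$1" >/dev/null 2>&1
}

for name in root root.root; do
    if in_files "$name"; then
        echo "$name: answered by files, ldap is not consulted"
    else
        echo "$name: not in files, the lookup continues to ldap"
    fi
done
```

root.root is included because it is the literal string handed to chown in bootmisc. GNU chown appears to try the whole string as a user name first when no colon is present, and that lookup cannot be answered from the files.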

----------

## patkc66

It turns out it doesn't hang forever, just for a little while.  If I wait long enough, it times out on whatever it was doing and continues on.  Once it gets over that little hump, things start up as normal and everything works fine.  I guess it just takes a little while to notice that the LDAP service isn't running yet, but that it can continue on anyway.

Whatever.

Anyway, it seems to work OK.  Now to figure out how to configure the entries in the directory so that Samba can work with that again.

----------

## magicOnAir

The culprit is the root.root in the /etc/init.d/bootmisc file in the line 

```
chown root.root /tmp/.{ICE,X11}-unix &>/dev/null
```

 If you change that to root:root then you are fine.

It took me a while and numerous reboots to figure that out.
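For anyone applying this by hand, here is a minimal sketch of the same edit done with sed. It works on a scratch copy rather than the real /etc/init.d/bootmisc (the scratch file and the variable names are only for illustration), so check the result before touching the live script.

```shell
#!/bin/sh
# Scratch copy standing in for /etc/init.d/bootmisc (illustrative only).
script=$(mktemp)
printf '%s\n' 'chown root.root /tmp/.{ICE,X11}-unix &>/dev/null' > "$script"

# Rewrite the dotted owner.group form to the colon form.
sed 's/chown root\.root /chown root:root /' "$script" > "$script.new"

fixed_line=$(cat "$script.new")
echo "$fixed_line"

rm -f "$script" "$script.new"
```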

----------

## patkc66

Thanks.  I made the change and things come right up now.

Pat

----------

## mryoung_fr

hi there,

I had the same problem as you recently after an nss_ldap update: it stalled at "cleaning /tmp". First, I removed everything in the bootmisc script to test, and everything went fine except when starting user-based services (for example named or apache, which run under a service account).

I found out that nss_ldap was trying to connect to my ldap server and waiting for the bind to be done. But openldap is started long after named, apache, etc.

So i started to dig around, and found a parameter in ldap.conf

```
bind_policy soft
```

Right after i set this parameter to this value, everything was working as before.

----------

## XelKarin

Hey,

I've been having the same problem...  The bind_policy soft solution was the first one I tried.  This fixed my bootup problems, but I later discovered that I couldn't log in through ssh to any account that was in LDAP.

In the logs it has the following:

```
Jun  8 13:27:25 host sshd(pam_unix)[6541]: session opened for user somebody by (uid=0)

Jun  8 13:27:25 host sshd[6536]: nss_ldap: could not search LDAP server - Server is unavailable

Jun  8 13:27:25 host sshd[6536]: fatal: login_get_lastlog: Cannot find account for uid

Jun  8 13:27:25 host sshd[6536]: syslogin_perform_logout: logout() returned an error

Jun  8 13:27:25 host sshd(pam_unix)[6541]: session closed for user somebody
```

getent and other utilities using NSS LDAP work fine.

I found this thread and checked /etc/init.d/bootmisc and discovered that in my version of baselayout that script has already been fixed.  The line containing

```
chown root.root /tmp/.{ICE,X11}-unix &> /dev/null
```

was changed to

```
chown 0:0 /tmp/.{ICE,X11}-unix &>/dev/null
```

----------

## mryoung_fr

Yep, same for me, I get the following error:

```
nss_ldap: could not search LDAP server - Server is unavailable 
```

But as it doesn't seem to cause any problems (everything's working), I'm not worried about it for the moment.

----------

## mryoung_fr

Finally, after some tests, it seems that this error does cause trouble.

Now and then, some users can't log in because of that error.

So I googled a bit and found a way to correct it.

You should use the following parameters in your ldap.conf:

```

bind_policy hard

nss_reconnect_tries 3

```

The second parameter limits the number of reconnection attempts, which avoids the stalled boots and also avoids the "unable to connect to ldap server" error. It is a new and as yet undocumented parameter.
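As a rough sketch of how the two settings could be applied without duplicating lines that already exist, something like the following might do. It runs against a scratch file standing in for ldap.conf, and set_option is a hypothetical helper written for this example, not part of nss_ldap.

```shell
#!/bin/sh
# set_option FILE KEY VALUE: replace an existing "KEY ..." line or append one.
# (Hypothetical helper for illustration.)
set_option() {
    file=$1 key=$2 value=$3
    if grep -q "^$key[[:space:]]" "$file"; then
        sed "s/^$key[[:space:]].*/$key $value/" "$file" > "$file.new" &&
            mv "$file.new" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Scratch file standing in for ldap.conf (illustrative only).
conf=$(mktemp)
printf '%s\n' 'bind_policy soft' > "$conf"

set_option "$conf" bind_policy hard
set_option "$conf" nss_reconnect_tries 3

result=$(cat "$conf")
echo "$result"
rm -f "$conf"
```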

Hope it'll help

++

----------

## bunder

so what should we set it to?  i don't want it hanging on boot, but i don't want the network ignoring ldap...

----------

## mryoung_fr

i just said it...

bind_policy hard will force the ldap connection, and nss_reconnect_tries 3 will keep the box from hanging at boot.

----------

## bunder

but is 3 too low?  

do you also happen to know the wait time between retries?

i hate to seem like a pain, but i want to get this right the first time ;)

----------

## mryoung_fr

In fact, 3 is certainly not too low. I got this information from a website saying 2 should be a good value: if the first connection fails, the second should work. In a way that is like the bind_policy soft option, but trying 2 times instead of only 1, which was why it caused trouble.

I set this value to 3 because IMHO 3 is a better value, but I haven't done many tests yet and I don't really know exactly how it behaves.

i'll give you my feedback as soon as possible.

++

----------

## bunder

none of these work for me.  the pc still takes hours to boot.  what the heck did they break here... :(

edit: my entire network uses this ldap setup.  i'm afraid to reboot anything else for fear that it won't come back up.

----------

## XelKarin

Just hit Ctrl-C to cancel the boot sequence.  It will ask for your root password, so enter that.  Then edit /etc/nsswitch.conf and comment out ldap for passwd, shadow, and group.  Then reboot the machine.  That should get your machine to boot with no problems.  After it boots you can add ldap back into nsswitch.conf.  I haven't rebooted since I tried mryoung's suggestions, but you may want to look into setting the bind_timelimit option in /etc/ldap.conf to a lower value.  It defaults to 30 seconds I think.
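A minimal sketch of the nsswitch.conf edit described above, for anyone who would rather script it than do it by hand at the recovery prompt. It operates on a scratch file standing in for /etc/nsswitch.conf; the hosts line and the variable names are only there for illustration.

```shell
#!/bin/sh
# Scratch file standing in for /etc/nsswitch.conf (illustrative only).
nss=$(mktemp)
cat > "$nss" <<'EOF'
passwd:      files ldap
shadow:      files ldap
group:       files ldap
hosts:       files dns
EOF

# Drop the ldap source from the passwd, shadow and group lines only.
sed -E '/^(passwd|shadow|group):/ s/[[:space:]]+ldap//' "$nss" > "$nss.new"

recovered=$(cat "$nss.new")
echo "$recovered"
rm -f "$nss" "$nss.new"
```

Keeping a copy of the original file makes it easy to put ldap back once the machine is up.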

----------

## bunder

 *XelKarin wrote:*   

> Just hit Ctrl-C to cancel the boot sequence.  It will ask for your root password, so enter that.  Then edit /etc/nsswitch.conf and comment out ldap for passwd, shadow, and group.  Then reboot the machine.  That should get your machine to boot with no problems.  After it boots you can add ldap back into nsswitch.conf.  I haven't rebooted since I tried mryoung's suggestions, but you may want to look into setting the bind_timelimit option in /etc/ldap.conf to a lower value.  It defaults to 30 seconds I think.

 

i tried taking out ldap from nsswitch.conf.  if i do that, it will go, but i won't be able to log in.  for some reason if i leave ldap enabled partially nothing works.

----------

