# Login as users defined in LDAP needs constant restart of nscd

## rsevero

I have ldap login implemented in my machines for some years now. Things used to work just fine.

Somewhere around the glibc-2.3.6-r4 update I started having problems logging in as a user defined in ldap because nscd was crashing. Without nscd, login stopped working.

After upgrading to glibc 2.4, nscd no longer crashes, but after a few hours login as a user defined in ldap starts to fail nonetheless. To re-enable login for users defined in ldap I have to restart nscd. Doing this I get a few more hours of working login. Should I set up an hourly restart of nscd? I would hate to.

There is an open bug about this but things seem to be stalled.

----------

## dahoste

Ditto.

Not sure what triggered this for me.  I did an 'emerge --update --deep world' sometime last week.  But I didn't notice the nscd problem until just a day or two ago.

I thought it might have something to do with a kernel tweak I made (enabling CONFIG_SECURITY), but even after reverting to my previous kernel, the nscd problem persists.

I notice that your original post is dated Oct 5, so it's unlikely that a recent emerge caused this for me, since I update regularly.  So it's more likely some bizarre configuration combination.

I'm going through my logs now to see if something stands out...

Any updated info you have on this would be appreciated.  The last post on the bug is dated Oct 9.

----------

## rsevero

 *dahoste wrote:*

> I notice that your original post is dated Oct 5, so it's unlikely that a recent emerge caused this for me, since I update regularly.  So it's more likely some bizarre configuration combination.

Interesting observation as my post is dated Oct 5 but my problem is much older.

 *dahoste wrote:*

> I'm going through my logs now to see if something stands out...

Please share if you find anything.

 *dahoste wrote:*

> Any updated info you have on this would be appreciated.  The last post on the bug is dated Oct 9.

Unfortunately I haven't posted anything else because nothing has changed and I have run out of ideas on how to solve this issue or where to look for the cause of the problem.

I really need some help to get out of this one.

----------

## Janne Pikkarainen

nscd has been problematic for ages. For me it was a total pain back in Red Hat 7.x/8.0 days, but in Gentoo it has been very fine. The reasons I remember why nscd was crashing for me:

- group caching was not very reliable. Disable it if you don't need it.

- malformed or otherwise strange user account entry was enough to make nscd die.

Also, have you modified /etc/nscd.conf in any way? At least a couple of years back, touching those cache-size values was not a very wise move. ;) I suggest you enable nscd logging in its configuration file and see if it crashes in a similar way every time.

Here is my currently working nscd.conf:

```
#
# /etc/nscd.conf
#
# An example Name Service Cache config file.  This file is needed by nscd.
#
# Legal entries are:
#
#       logfile                 <file>
#       debug-level             <level>
#       threads                 <#threads to use>
#       server-user             <user to run server as instead of root>
#               server-user is ignored if nscd is started with -S parameters
#
#       enable-cache            <service> <yes|no>
#       positive-time-to-live   <service> <time in seconds>
#       negative-time-to-live   <service> <time in seconds>
#       suggested-size          <service> <prime number>
#       check-files             <service> <yes|no>
#
# Currently supported cache names (services): passwd, group, hosts
#
#       logfile                 /var/log/nscd.log
#       threads                 6
#       server-user             nobody
        debug-level             0

        enable-cache            passwd          yes
        positive-time-to-live   passwd          600
        negative-time-to-live   passwd          20
        suggested-size          passwd          211
        check-files             passwd          yes

        enable-cache            group           yes
        positive-time-to-live   group           3600
        negative-time-to-live   group           60
        suggested-size          group           211
        check-files             group           yes

        enable-cache            hosts           yes
        positive-time-to-live   hosts           3600
        negative-time-to-live   hosts           20
        suggested-size          hosts           211
        check-files             hosts           yes
```
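For anyone comparing their own file against the one above, here is a minimal Python sketch that lists which directives are actually active (not commented out). The function name `parse_nscd_conf` and the inline sample are made up for illustration. It assumes the simple layout shown above: two fields for a global setting, three for a per-service setting, and `#` starting a comment. Note that in the posted file the `logfile` line is commented out, so logging to a file would still need to be switched on.

```python
def parse_nscd_conf(text):
    """Split nscd.conf text into (global settings, per-service settings).

    Hypothetical helper: it only understands the 'directive value' and
    'directive service value' forms used in the file quoted above.
    """
    settings, caches = {}, {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) == 2:      # e.g. "debug-level 0"
            settings[fields[0]] = fields[1]
        elif len(fields) == 3:    # e.g. "enable-cache group yes"
            caches.setdefault(fields[1], {})[fields[0]] = fields[2]
    return settings, caches


# Small inline sample taken from the file above.
SAMPLE = """\
#       logfile                 /var/log/nscd.log
        debug-level             0
        enable-cache            group           yes
"""
settings, caches = parse_nscd_conf(SAMPLE)
print("logging enabled:", "logfile" in settings)
print("group cache:", caches["group"]["enable-cache"])
```

Feeding it the contents of the real `/etc/nscd.conf` instead of `SAMPLE` gives a quick way to diff two machines' effective settings.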

----------

## dahoste

Janne,

My nscd.conf is identical to yours.

> - malformed or otherwise strange user account entry was enough to make nscd die. 

I can't guarantee that's not the case.  I used scripts to migrate my user/group info into LDAP.   Aside from this recent nscd behavior, everything seems to be working fine.

Is nscd really required for LDAP to function?  When I first ran through the gauntlet of configuring all of the LDAP/SMB mess, I remember that things didn't quite work until I enabled nscd, but I don't remember if that was actually specified as a requirement.

I'm still slowly investigating this.  If I come across anything interesting, I'll post back.  And I'll follow the bug history as well, though it seems to have gone cold.

Thanks for the replies!

----------

## Janne Pikkarainen

 *dahoste wrote:*   

> Is nscd really required for LDAP to function? 

 

That pretty much depends on your server's workload and whether performance suffers without nscd. For a high-load server, nscd is probably a must.

One more thing: which versions of pam_ldap / nss_ldap do you have installed? Have you tried upgrading them, if that's possible?

----------

## dahoste

Sorry for the delayed reply.

Here are the currently installed versions:

sys-auth/pam_ldap-180  

sys-auth/nss_ldap-249  

net-nds/openldap-2.3.24-r1

If I unmask ~x86 for all of them, it would upgrade to the following:

sys-auth/pam_ldap-182

sys-auth/nss_ldap-253

net-nds/openldap-2.3.24-r2

I'll look at the package changelogs and see what they say.

Interestingly, I went about 3 or 4 days without problems, but then had to restart nscd several times yesterday.

I hate bugs like this.

----------

## Janne Pikkarainen

 *dahoste wrote:*   

> Sorry for the delayed reply.
> 
> Here are the currently installed versions:
> 
> sys-auth/pam_ldap-180  
> ...

 

Seems to be recent enough. :)

 *dahoste wrote:*   

> Interestingly, I went about 3 or 4 days without problems, but then had to restart nscd several times yesterday.
> 
> I hate bugs like this.

 

In my opinion this sounds like the good old nscd bugs, like a malformed/corrupted /etc/passwd entry wreaking havoc. Have you enabled nscd logging yet and read the logs? It REALLY can help you track this down.

----------

## dahoste

I'm logging a bunch of stuff out of nscd now (logging at debuglevel=2).

I noticed that I'm getting some intrusion attempts.  Looks like ssh-related malicious intrusion attempts.

I added some quick blacklisting to my firewall to help fend it off, but obviously that's reactionary and not a full solution.

Anyway - the point is that I'm wondering, since these show up in the nscd logs, if some of these intrusion attempts are formed in such a way that they (intentionally or otherwise) cause nscd to crash.

Incidentally, what should I be looking for in the nscd logs?  I haven't seen anything that looks Really Bad(tm) yet.  Mostly just tons of cache misses.

Also, I went ahead and put nscd in paranoia mode, restarting every 15 minutes, just to reduce the number of times that I have to physically go over to my server to kick nscd via a local login (since I've got root logins disabled over ssh).

Thanks.

----------

## dev-urandom

I remember having to re-emerge all my nss* packages after upgrading glibc. I know this sucks, but that's the way it is. glibc provides the name service switch, and a major update to glibc usually means rebuilding all dependent packages. Please try re-emerging nss_ldap and see whether it fixes the issue.

----------

## dahoste

Roger that - I'll re-emerge all related packages.

Isn't that exactly what revdep-rebuild is for, though?

Since the errors are so non-deterministic, I'll have to wait awhile to see if re-emerging has any effect.  But I'll report back here one way or the other.

Thanks!

----------

## dev-urandom

 *dahoste wrote:*   

> Roger that - I'll re-emerge all related packages.

 

Cool, it's just nss_ldap that might need re-emerging btw - that and any other nss packages.

 *dahoste wrote:*   

> Isn't that exactly what revdep-rebuild is for, though?

 

revdep-rebuild can only check whether libraries are no longer there. E.g. if a package used to provide libfoo.so.5 and after an upgrade provides libfoo.so.6, then all packages that depended on libfoo.so.5 are broken and won't even start. revdep-rebuild can catch this. But this is a different case - there are certain symbols in the dependent packages that no longer resolve even though the linkage hasn't broken. You may have seen this when upgrading kde packages - one application package or another that wasn't rebuilt always breaks ;)
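One hedged way to look for the "symbols that no longer resolve" case is to ask the dynamic loader to open the module with immediate binding, so that every undefined symbol has to be resolved at load time. The sketch below does that from Python. The helper name `try_load` is made up, and `libnss_ldap.so.2` is an assumed module name that may differ on your system. A clean load does not prove the module behaves correctly against the new glibc; a failed load does point at a missing file or an unresolved symbol.

```python
import ctypes
import os


def try_load(module_path):
    """Try to dlopen a shared object with RTLD_NOW (immediate binding).

    Returns (True, "") if the loader accepts it, otherwise
    (False, <loader error message>), e.g. for a missing file or an
    undefined symbol.
    """
    try:
        ctypes.CDLL(module_path, mode=os.RTLD_NOW)
    except OSError as exc:
        return False, str(exc)
    return True, ""


# Assumed module name; adjust to wherever nss_ldap is installed.
ok, err = try_load("libnss_ldap.so.2")
print("loads cleanly" if ok else "failed to load: " + err)
```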

----------

## DarkGian

Hi,

I'd like to reopen this topic as I have been experiencing this issue for about a month. nscd behaves perfectly, but after a random amount of time it stops working, reporting things such as bad getpwuid for users in the ldap database, which makes it impossible to log in. The only workaround is to restart the service.

On the server machine I'm running:

openldap 2.3.35-r1

glibc 2.5-r4

pam_ldap 183

nss_ldap 253

The clients have:

openldap 2.3.38

glibc 2.6.1

pam_ldap 183

nss_ldap 253

We have analyzed the problem taking into account possible interactions with nfs (as they are all diskless nodes), condor and the recent upgrade to a 2.6.22.1 vanilla kernel, but we don't see how any of these could be the reason nscd isn't working properly.

Additionally, the paranoia mode seems to have problems as well, as nscd crashes upon auto-restarting. I heard that this is a known issue though, connected with incorrect forking of the new nscd process.

Regarding the solutions proposed here, upgrading the related software to the most recent versions didn't work, and neither did re-emerging nss-* or modifying nscd.conf as Janne Pikkarainen suggested. Any more clues?

----------

## dahoste

Yes - I'm still interested in this issue, as it seems to have never fully been resolved for me.  In fact I just recently had to disable paranoia mode as I discovered it was indeed causing at least some of the problems.

I'm still on nss_ldap-249, as things are 'mostly' stable, and I just haven't had the stomach to reopen this can of worms with another round of package updates.

----------

## DarkGian

Even if it is apparently pointless, I'd like to let you know some interesting facts that happened lately.

First of all, the problem has not been solved. I applied the solution of introducing a cronjob that restarts nscd every hour to prevent further problems. I still have to upgrade the server machine to the latest openldap/glibc version, but the nodes are up-to-date. If anyone found any upgrades helpful, please let me know.
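As a gentler variant of the blind hourly restart, a cron job could first probe a lookup and restart nscd only when it fails. Below is a minimal sketch, assuming the Gentoo init script lives at `/etc/init.d/nscd` and that one known LDAP-only account is used as the probe. The function names are made up for illustration. The `run` parameter is only there so the restart action can be swapped out when trying the logic.

```python
import pwd
import subprocess


def lookup_ok(username):
    """True if the account resolves through NSS (and nscd, when it is running)."""
    try:
        pwd.getpwnam(username)
        return True
    except KeyError:
        return False


def restart_if_broken(username,
                      restart_cmd=("/etc/init.d/nscd", "restart"),
                      run=subprocess.call):
    """Restart nscd only when the probe account cannot be resolved.

    Returns True if a restart was triggered, False otherwise.
    restart_cmd is an assumption; adjust it for your init system.
    """
    if lookup_ok(username):
        return False
    run(list(restart_cmd))
    return True
```

Run from root's crontab every few minutes, it would call something like `restart_if_broken("someldapuser")`, where `someldapuser` stands in for an account that exists only in LDAP. It would not help once nscd is wedged badly enough that cron itself cannot do its job.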

The strange thing is that, although nscd no longer has the dramatic behaviour it had before, I have sometimes experienced what I would call a "random escalation of privileges", which basically consists of nscd giving one user the credentials of another. This is annoying, and besides being sometimes funny, it is a real security issue: I saw a user gaining root credentials on some of his/her processes.

Usually when this happens, nscd is no longer able to provide any kind of service, thus cutting the real root out of the whole thing. Obviously it renders any cron-based workaround futile, and makes a brute force "pull-the-plug" the only solution so far.

Again, I am no expert in this matter, and I haven't found any clues on why and when this happens. But the latest developments make me seriously worried.
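One way to watch for the mixed-up credentials described above is a round-trip check: resolve a name to a UID, resolve that UID back, and see whether the same account comes out. This is only a sketch with a made-up helper name, and it is an assumption that the failure would show up in such a lookup at all. Accounts that legitimately share a UID will also fail the check.

```python
import pwd


def roundtrip_ok(username):
    """True if name -> uid -> name returns the same account.

    A mismatch means the two lookups disagree, which is worth logging
    together with the nscd state at that moment.
    """
    try:
        by_name = pwd.getpwnam(username)
        by_uid = pwd.getpwuid(by_name.pw_uid)
    except KeyError:
        return False
    return by_uid.pw_name == username


print("root round-trips:", roundtrip_ok("root"))
```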

----------

