# Strange authentication delay/crash (SOLVED)

## Akaihiryuu

I did a round of updates to my Gentoo server on Friday/Saturday. There were quite a few, and I don't remember what they all were. Since then, I have had HORRIBLE delays with anything involving authentication on that server: logging in over ssh, logging in at the console (I dragged a monitor and keyboard out there last night to see if it occurred at the console too), connecting to my courier-imap server, connecting to Samba from a Windows machine, etc. Logging in over ssh is the easiest to time: once I enter the password, it takes 20 seconds before I get the shell prompt. Email shows the same delay.

I checked /var/log/portage/elog/summary.log and it had this in it:

 *Quote:*   

> >>> Messages generated by process 31609 on 2010-11-08 21:17:45 EST for package dev-libs/openssl-1.0.0a-r3:
> 
> WARN: postinst
> ...
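
For anyone scanning a long summary.log after a big round of updates, here is a minimal Python sketch that lists which packages left messages at each level. It assumes the header and `WARN: postinst` layout shown in the quote above; the set of level names and the function name `packages_by_level` are assumptions made for illustration, not part of Portage.

```python
import re
from collections import defaultdict

# Header layout assumed from the quoted excerpt of summary.log.
HEADER = re.compile(
    r"^>>> Messages generated by process (?P<pid>\d+) on (?P<when>.+?) "
    r"for package (?P<pkg>\S+):\s*$"
)
# Section lines such as "WARN: postinst"; the list of levels is an assumption.
SECTION = re.compile(r"^(?P<level>INFO|LOG|WARN|ERROR|QA): (?P<phase>\w+)\s*$")


def packages_by_level(text):
    """Map each message level to the packages that logged at that level."""
    found = defaultdict(list)
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group("pkg")
            continue
        section = SECTION.match(line)
        if section and current:
            level = section.group("level")
            if current not in found[level]:
                found[level].append(current)
    return dict(found)
```

Reading `/var/log/portage/elog/summary.log` into a string and passing it to `packages_by_level` gives a quick list of packages whose post-install warnings may still need attention.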

 

I followed the instructions in that message. Unfortunately, during the emerge, on package 17 of 18, it seemed to hang completely. The NAT/routing was still working fine, as I still had internet from my computer, but I couldn't log into the machine at all. Telnet would give me a username field but then hang after I entered it, never showing a password prompt, until it timed out and disconnected after 5 minutes. SSH did the same thing (no password prompt, timeout/disconnect after 5 minutes). I had no choice but to completely reset the machine. Once it came back up, it seemed to work again (the authentication delay was still present), so I finished the instructions in there and removed the old libraries. But the horrible delay is still there. I am thinking it has to have something to do with PAM, but I am not even sure where to look for that. I am still using kernel 2.6.30.9; I haven't updated that in quite some time.

Also, I have noticed, since my round of updates on Friday/Saturday, I am not getting any notifications of remote logins in /var/log/messages anymore.  Before, it would put something in there whenever there was a successful su to root, or a successful login.  I am using syslog-ng.  Does anyone have any idea what could've updated that is causing this mess?  I am definitely willing to just put a couple of things in package.mask and roll back a couple of packages if necessary to get my system working properly again.

*Last edited by Akaihiryuu on Wed Nov 10, 2010 6:44 pm; edited 2 times in total*

----------

## Hu

After you updated, did you process configuration file changes?  You can get various strange behaviors using old configuration files with new programs.

Also, may I ask why telnet is enabled?  It is not a secure protocol.

----------

## Akaihiryuu

Yes, I did look for and apply config file changes (with dispatch-conf). As for telnet, it is ONLY enabled on my LAN, and hosts.allow is set to let in only the IP address of my main computer (I generally only use it for testing and for emergencies like SSH not working, so I can log in and restart sshd if necessary).

EDIT: as we speak, authentication has *completely* crashed again.  I will not be able to log into that machine in any way, other than by going out and hitting the reset button.  Fortunately, iptables still seems to work properly, so I at least still have internet.  There was nothing out of the ordinary in /var/log/messages...not much of ANYTHING in /var/log/messages.  It seems like a lot of stuff has just stopped logging.

I REALLY don't want to have to rebuild that machine, it will take me about a week, maybe two, to get it back where it is now, and I will be completely without internet the entire time it is down.

----------

## Akaihiryuu

Ok, this is getting to be an extremely serious problem. The machine has now been *completely* inaccessible for over 12 hours. I know I can restore it if I go out and hit the reset button, but I do not consider rebooting a valid troubleshooting procedure in the Linux world. There is something seriously wrong. Right now, if I try to ssh in, it just hangs for about 5 minutes and then I get a timeout error and the ssh client I am using closes. If I try to telnet, it actually does give me a username/password prompt, but once again after 5 minutes it times out and closes just like SSH does.

Samba has the same problem: if I try to mount a network drive or connect to Samba from a Windows machine, it just hangs for 5 minutes and then says it's inaccessible. Same with email: when I try to connect to my email server, Thunderbird gets a timeout error after about 5 minutes. Oddly enough, Apache and MySQL seem to be working normally. I can even log in to phpMyAdmin and it works (I assume this is because MySQL does not use PAM for authentication). If I try to use Horde (which attempts to log in to my email server), it fails and times out.

As far as I can tell, iptables and everything related to that work properly, evidenced by the fact that I still have internet access on this computer.  I can also connect to the wireless that I have set up.

I have been thinking about this, and it seems that the common denominator for everything that is not working at all is PAM.  But I have no idea what to even check, other than poking around in the pam.d directory when I was able to access the machine.  It almost seems like something that PAM is trying to use segfaulted.  Unfortunately, /var/log/messages showed VERY little the last time I was able to get in.  Logging has almost completely stopped since whatever update I did seemingly trashed the entire machine.
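
One thing that can be checked in the pam.d directory is whether every module the service files reference actually exists on disk. The following is a rough Python sketch of that check, not an official PAM tool. The function names are hypothetical, and the module directories passed in (for example `/lib/security` or `/lib64/security`) are assumptions that depend on the system. It only catches modules that are missing outright; a module that is present but linked against a removed library would not show up here.

```python
import os
import re

# One rule per line: type, control (bare word or "[...]"), then module path.
LINE = re.compile(
    r"^-?(?P<type>\w+)\s+(?P<control>\[[^\]]*\]|\S+)\s+(?P<module>\S+)"
)


def referenced_modules(config_text):
    """Return module names a pam.d file references, skipping include lines."""
    modules = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        match = LINE.match(line)
        if not match or match.group("control") in ("include", "substack"):
            continue
        modules.append(match.group("module"))
    return modules


def module_exists(module, module_dirs):
    """Check an absolute module path directly, otherwise search module_dirs."""
    if os.path.isabs(module):
        return os.path.exists(module)
    return any(os.path.exists(os.path.join(d, module)) for d in module_dirs)


def missing_modules(pam_dir, module_dirs):
    """Map each pam.d file name to referenced modules not found on disk."""
    report = {}
    for name in sorted(os.listdir(pam_dir)):
        path = os.path.join(pam_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, errors="replace") as handle:
            wanted = referenced_modules(handle.read())
        absent = [m for m in wanted if not module_exists(m, module_dirs)]
        if absent:
            report[name] = absent
    return report
```

A call such as `missing_modules("/etc/pam.d", ["/lib/security", "/lib64/security"])` returns an empty dict when every referenced module is found.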

----------

## Akaihiryuu

Ok, I managed to fix everything by first masking the current version of pambase and reverting to the previous stable one. Then I had to redo the revdep-rebuilds above, and after that I had to recompile a bunch of packages (including syslog-ng, which is why logging wasn't working). I pretty much have everything up and working again, and I'm doing one final revdep-rebuild to make sure nothing else is broken. I may try to go to the newer version of PAM again after everything else is fixed. I think I just had too much stuff broken at once.
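
To spot-check a single binary or PAM module for the kind of broken linkage described above, a small Python sketch around `ldd` might look like this. It is a simplified stand-in for the idea behind revdep-rebuild, not how that tool is implemented, and the function names are hypothetical. It assumes `ldd` marks unresolved libraries with `=> not found`, and `ldd` should only be run on files you trust.

```python
import subprocess


def unresolved_libraries(ldd_output):
    """Return library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


def check_binary(path):
    """Run ldd on one file and return its unresolved libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return unresolved_libraries(result.stdout)
```

For example, `check_binary("/usr/sbin/syslog-ng")` (the path is a guess and may differ) would return an empty list once the package has been rebuilt against the libraries now installed.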

----------

