# SSH dropping connections for no apparent reason

## Sivar

For about a month now, SSH has been randomly dropping remote connections, as if there were some sort of time limit specified. It seems to take roughly the same amount of time to drop, but I haven't timed it.

This is REALLY irritating, and I have no idea why it would be doing this.

The SSH config files don't have anything that would seem to indicate a timeout.

Any suggestions on how to fix this would be appreciated.

----------

## Deebster

Are you complaining about ssh connections where you are on the client or server end?

----------

## grimshaw

Perhaps you are indeed suffering from a timeout.  Firewalling components like to do this if there is no activity for x time.

Can you present us with a layout of any firewalls between the two ends and their type (iptables, ipchains, pix, checkpoint, whatever)?

I had this problem with IPFWADM masquerading (changeable by "ipfwadm -M -s 3600 0 0") and similarly with IPCHAINS (changeable by "ipchains -M -S 7200 10 60").

Under IPTABLES, I keep state and load the ip_conntrack module which has a very LONG timeout value.  I believe it is also settable, although I'd have to do some digging to tell you how.
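For what it's worth, one general way to stop an idle TCP session from being expired by a stateful firewall is to have the kernel send keepalive probes. Below is a minimal Python sketch of that idea, not anything taken from a real setup in this thread: the function name `enable_keepalive` and the timing values are made up for illustration, the tuning options are assumed to be available (they are on Linux), and the idle time would need to be shorter than whatever timeout the firewall uses.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for an already-created socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning knobs are platform-specific (present on Linux),
    # so only set the ones this Python build actually exposes.
    for name, value in (("TCP_KEEPIDLE", idle),      # idle seconds before the first probe
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", count)):      # unanswered probes before giving up
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock

if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print("keepalive on:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

This only shows the socket-level mechanism; an SSH client or server would normally switch it on through its own configuration rather than through code like this.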

- John

----------

## Sivar

The SSH connections die even when the server is accessed directly from within the internal network, where no firewalls are used.

Now, in some cases, the connection drops after as little as two minutes, and in other cases, it takes many hours. This now leads me to believe that it is probably not a timeout, since a timeout would probably be after a consistent period of time. Additionally, a friend of mine stated that it has happened when he was in the middle of typing something, so network inactivity is probably not the cause.

Now I am even less sure what in the world the connection losses are caused by.   :Shocked: 

----------

## Sivar

I've been using Telnet, and it works like a charm. Thus, this [probably] isn't a kernel bug; it likely has to do with openssl or ssh.

I recompiled the latest stable version of OpenSSH using conservative CFLAGS ("-O2 -mcpu=pentiumpro"), and the bug still pops up, so I do not believe it is an "overly aggressive CFLAGS" problem.

Any other ideas?

C'mon guys, I'm using TELNET here! Help!

----------

## Deebster

I have no idea...  Could you show us logs from the client and server?  May as well chuck your configs in as well...

Have you tried running ssh in verbose mode?  You could try running sshd in the foreground with the -D flag (don't detach), or with -d for debug output, although I don't know if that gives any more information than what's in the log.

----------

## dantams

I experience the same thing when my Internet connection is very busy, up to its limits, for example when I have several P2P filesharing apps running. This makes me believe that it has to do with too many dropped packets or too large a delay.

----------

## buser

I have had similar problems.  I am running the latest version of OpenSSH on the Gentoo server and I use the latest version of SSH Secure Shell to access it.

Here is my sshd_config:

```
#Port 22
Protocol 2
#ListenAddress 0.0.0.0
#ListenAddress ::

# HostKey for protocol version 1
#HostKey /etc/ssh/ssh_host_key
# HostKeys for protocol version 2
#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_dsa_key

# Lifetime and size of ephemeral version 1 server key
#KeyRegenerationInterval 1h
#ServerKeyBits 768

# Logging
#obsoletes QuietMode and FascistLogging
#SyslogFacility AUTH
#LogLevel INFO

# Authentication:

#LoginGraceTime 2m
#PermitRootLogin yes
#StrictModes yes

#RSAAuthentication yes
#PubkeyAuthentication yes
#AuthorizedKeysFile     .ssh/authorized_keys

# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#RhostsRSAAuthentication no
# similar for protocol version 2
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# RhostsRSAAuthentication and HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes

# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no

# Change to no to disable s/key passwords
ChallengeResponseAuthentication no

# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes

# Set this to 'yes' to enable PAM authentication (via challenge-response)
# and session processing. Depending on your PAM configuration, this may
# bypass the setting of 'PasswordAuthentication' and 'PermitEmptyPasswords'
UsePAM yes

#AllowTcpForwarding yes
#GatewayPorts no
#X11Forwarding no
#X11DisplayOffset 10
#X11UseLocalhost yes
PrintMotd yes
#PrintLastLog yes
TCPKeepAlive yes
#UseLogin no
#UsePrivilegeSeparation yes
#PermitUserEnvironment no
Compression yes
ClientAliveInterval 1
#ClientAliveCountMax 3
#UseDNS yes
#PidFile /var/run/sshd.pid
#MaxStartups 10

# no default banner path
Banner /etc/ssh/welcome

# override default of no subsystems
Subsystem       sftp    /usr/lib/misc/sftp-server
```

I dunno if that will help.  Hope someone finds a solution.

----------

## buser

i also found this http://www.brandonhutchinson.com/OpenSSH_ClientAliveInterval.html

and changed my setting to 300.  I guess we will see if that works   :Wink: 
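A rough sketch of why the value matters, going by my reading of the sshd_config man page: sshd sends a probe after ClientAliveInterval seconds without data from the client and gives up once ClientAliveCountMax probes in a row go unanswered. The helper name `unresponsive_timeout` is made up for illustration, and the arithmetic is only an approximation of the real behaviour.

```python
def unresponsive_timeout(client_alive_interval, client_alive_count_max=3):
    """Approximate seconds before sshd drops a client that has stopped answering.

    An interval of 0 (or less) means the probes are disabled, so this check
    never disconnects the client; None is returned in that case.
    """
    if client_alive_interval <= 0:
        return None
    return client_alive_interval * client_alive_count_max

print(unresponsive_timeout(1))    # 3   (the ClientAliveInterval 1 in the config I posted)
print(unresponsive_timeout(300))  # 900 (the new setting, default ClientAliveCountMax of 3)
```

So with an interval of 1, a client that goes quiet for only a few seconds (a busy link, say) could already be cut off, while 300 leaves about fifteen minutes of slack.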

----------

## buser

and i noticed that i revived this thread from almost a year ago... (obviously i dont post much!)   :Embarassed: 

----------

## jonaswidarsson

Well. I don't know if you are still in trouble, but I have also been there and I found out what caused the problem and fixed it.

Here goes:

I had some embarrassing collisions in IP-address ranges using both static IP and DHCP on the same LAN. I was going nuts! Every time I logged in to the machine, it kicked me out after say ten seconds. Extreeeeeemely annoying.

Well, I changed the IP of the server to something I well knew was outside the DHCP IP range.

Then it worked without interruptions, and has been working for two months now.
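If anyone wants to check for the same kind of overlap, here is a minimal Python sketch. The function name `in_dhcp_pool` and all of the addresses are hypothetical; substitute the pool limits from your own DHCP server's configuration.

```python
import ipaddress

def in_dhcp_pool(static_ip, pool_start, pool_end):
    """Return True if static_ip falls inside the DHCP pool [pool_start, pool_end]."""
    ip = ipaddress.ip_address(static_ip)
    return ipaddress.ip_address(pool_start) <= ip <= ipaddress.ip_address(pool_end)

# Hypothetical LAN: the DHCP server hands out 192.168.1.100 through 192.168.1.200.
print(in_dhcp_pool("192.168.1.150", "192.168.1.100", "192.168.1.200"))  # True: collision risk
print(in_dhcp_pool("192.168.1.10", "192.168.1.100", "192.168.1.200"))   # False: safe static address
```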

If this problem is not yours anymore, I did at least add one solution to a similar problem, if anyone gets a search hit.

 *buser wrote:*   

> and i noticed that i revived this thread from almost a year ago...

 I don't get it. All these posts are from 2004, right?

----------

## buser

 *jonaswidarsson wrote:*   

> *buser wrote:*
>
> > and i noticed that i revived this thread from almost a year ago...
>
> I don't get it. All these posts are from 2004, right?

 

ehhh ya.......i was tired, or drunk, or both   :Embarassed:  sorry i was looking at one of the people's join dates in the forum instead of the post dates.    :Embarassed:   again   :Wink: 

----------

## Sivar

Here is a bit of an update:

I log into my server via SSH exclusively from PuTTY, an excellent Windows terminal program which is IMO better than any xterm I have used.

While at first I very highly doubted that the problem was PuTTY, because it has worked fine for years, I did find it odd that SSH/SSL was still causing the problem even after completely replacing their config files and compiling them with *extremely* stable flags (-O -pipe. Even FreeBSD likes those).

I had a friend log in remotely from his system using standard Linux SSH from console. He recently reported that he was logged in without a hitch for 16 hours, and he is currently running some overnight to see if the connection dies for no apparent reason.

I still have to doubt that it is PuTTY, because PuTTY hasn't been updated in over a year, and I am using the exact same version with the exact same configuration (which is default, except I set the scrollback buffer to about a million lines).

As a general problem solving step, I have looked for other common factors between the problems. One is Windows XP.

I know that it is all the rage, especially in Linux forums, to criticize Windows and other Microsoft products. With this comes a certain lack of credibility when criticizing the product, as you will notice exists if you talk to any nonbiased person reading, say, Slashdot posts.

All that aside, Windows XP may well be the problem. I have several reasons:

Windows XP is consistently flakier when developing software, particularly socket-based (network) software. A friend of mine can confirm this, as he and the company he works for (formerly a company I worked for, sort of) have attempted this and found that XP was simply not an acceptable platform for network software. I won't get into specifics as it would be off-topic.

Another possible checkmark against Windows XP is that this problem did actually begin somewhat around the time that my desktop system was switched to Windows XP, from Windows 2000. (I used the desktop for games, which Linux sucks at, so no conversion suggestions please).

This problem exists logging in both from my laptop running XP, and from my desktop, which runs XP and Linux (and FreeBSD).

A few things I have written off as possibilities:

* My network config/hardware. This happens both from my school and from my home. (Of course, my school may have a flaky network too). My network hardware is all top quality Intel gear, the switch is D-Link but has worked fine for years.

* My server hardware. The server is 100% server-grade hardware, with a Tyan SMP server motherboard, Intel server NIC, all SCSI drives, etc. That, and it works fine for every other connection of any kind.

* My laptop hardware/architecture. My laptop runs 64-bit Linux as it is an Athlon64 (not an eMachines system, though! gag.). There have indeed been some problems with software not liking the AMD64 architecture, but the disconnection problem happens on several non-AMD64 systems as well.

What I am doing now is running 4 xterms in GNOME 2.6 and one in the console, connected to the server. One xterm will run a simple shell script that prints how many times it has looped and then sleeps for 5 seconds every iteration (to test whether activity affects the problem).
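For anyone who wants to repeat the test, the loop amounts to something like the following. This is a Python sketch of the same idea rather than the actual shell script; the name `heartbeat` and its parameters are made up for illustration.

```python
import itertools
import time

def heartbeat(delay=5, limit=None, sleep=time.sleep):
    """Print the iteration count, then pause `delay` seconds, repeatedly.

    limit=None loops forever (stop with Ctrl-C); a number stops after that
    many iterations. Returns how many lines were printed.
    """
    counter = itertools.count(1) if limit is None else range(1, limit + 1)
    printed = 0
    for i in counter:
        print(f"loop {i}", flush=True)  # flush so each line reaches the terminal at once
        printed += 1
        sleep(delay)
    return printed

if __name__ == "__main__":
    heartbeat(delay=1, limit=2)  # short demo; call heartbeat() for an overnight run
```

Run it inside the SSH session being tested, so the session always has a little traffic on it, and leave a second session idle for comparison.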

If I eliminate any Unix variables, it must be a Windows-related problem, so I will try to test for a solution in that regard.

----------

## Spooky Ghost

Just to add a "me too..."

I'm also using PuTTY (0.53b) from an XP machine (also Tyan SMP with onboard 3COM NIC) to my server box.  It's running a 2.4 kernel with the latest OpenSSH.  Occasionally I just lose the connection as described above and there doesn't seem to be any pattern.  Connections from any other clients seem fine.  The XP machine dual boots and when it's running Linux to the server I've not noticed any problems so it wouldn't seem to be hardware related.

BTW PuTTY was updated just recently to 0.54 but I've noticed some problems with it repositioning the cursor when exiting from nano.

----------

## Sivar

I ran 5 SSH sessions on Linux and 4 on Windows.

All Linux sessions stayed connected all night.

Two Windows sessions stayed connected.

For the Windows sessions, I ran two logged in as root and two as a regular user. One of each ran a script that spewed text to the screen every 5 seconds.

Strangely, the ones that had no activity were disconnected and the ones that ran the script were not.

This might not seem so strange, as this is common when connection timeouts and such occur. The reason that it is strange is that, while actively using SSH connections, they drop without regard to activity. They can drop after 2 hours of inactivity or in the middle of typing something.

To recap: The disconnection problem is a Windows or PuTTY problem, not a Linux problem. I will do further research to see what the problem is.

----------

## buser

i uncommented the "ClientAliveInterval 300" in /etc/ssh/sshd_config and i no longer have any problems with idle sessions timing out

----------

## Sivar

Mine was already uncommented.  :Sad: 

----------

## buser

have you tried setting it to 300?  another thing you may want to check is /etc/conf.d/net; mine was configured incorrectly and it was causing some problems.  Also, when I am at the university, it seems to time out a lot for no reason, and yet on my friends' computers, they will leave it open for days on end with no problem.

----------

## pclissold

 *Sivar wrote:*   

> To recap: The disconnection problem is a Windows or PuTTY problem, not a Linux problem. 

 

I believe it is PuTTY in combination with Windows XP that is the problem. PuTTY worked fine for me under W2K but I had the same trouble as you when I switched to XP. Solution was to replace PuTTY with F-Secure SSH client.

----------

## alexf

Have you tried setting the keepalive option in PuTTY to something other than 0 (under 'Connection' config option)? I get this problem with many of the machines I connect to if I don't change this to something like 100.

----------

## Sivar

Keepalive seems to work now, but for some reason it had no effect a few weeks ago. Strange.

----------

## Jassmith

For what it's worth, I've been having these same problems recently.

I have a Gentoo system running at home that I often SSH to from a WinXP machine at work.  I use PuTTY and have been experiencing the random disconnects.  Sometimes I'm able to reconnect right away, but most of the time the Windows box is unable to connect (PuTTY times out trying to open a new connection) for some time; currently it's been almost an hour.  Eventually I am able to reconnect though.

I also have a Gentoo box at work, and I can ssh just fine from there, even during the blackout period on my Windows box.

----------

## alexf

Can you ping the box even when you can't connect from work? it sounds like it could just be connection problems - the fact that you can reconnect later on suggests the daemon is behaving itself (i.e. not crashing).

I am running openssh from x86 stable and don't have any problems, except about once every two weeks when my ISP resets something, my router drops the connections, autoreconnect fails and I have to reboot it  :Sad: 

----------

## Jassmith

The network at work doesn't allow us to ping outside our internal LAN, so that hasn't ever worked.

The reason I suspect something fishy with the WinXP/PuTTY box is that while I'm unable to connect via PuTTY, my Linux box at work (which is connected to the same hub as the WinXP system) can connect via openssh without any problems.  That's why I don't think it's the daemon.  Also during this time the WinXP box can load web pages off my home server without any problems.

----------

## LeTene

...and I'll chime in with an identical problem. PuTTY on Windows 2000 to my home gentoo box. I strongly suspect it's the firewall timing it out, so I'm attacking it from that end. So with all the heads here we'll get a solution  :Very Happy: .

----------

## dewke

 *Jassmith wrote:*   

> For what it's worth, I've been having these same problems recently.
> 
> I have a Gentoo system running at home that I often SSH to from a WinXP machine at work.  I use PuTTY and have been experiencing the random disconnects.  Sometimes I'm able to reconnect right away, but most of the time the Windows box is unable to connect (PuTTY times out trying to open a new connection) for some time; currently it's been almost an hour.  Eventually I am able to reconnect though.
> 
> I also have a Gentoo box at work, and I can ssh just fine from there, even during the blackout period on my Windows box.

 

I've had a similar problem.  I have a rack of Gentoo boxes at work that I can ssh to from home.  There is *one* box out of the 4 I've deployed that gives me trouble.  Sometimes I can't ssh to it at all.  Sometimes it drops me and I can't reconnect to it at all.  It never drops off the segment at work, and we use fixed IP addresses.  What is very odd is that I can ssh to another server at work and then ssh to the troublesome box.

I've experienced the same problem using SecureCRT from WinXP as well as the openssh client on my Gentoo box at home.

Any suggestions?

----------

## weyhan

Let me guess... You have a Gentoo server and your internal network is mostly Windows boxes. You have a DHCP server installed on your Gentoo box to dish out your internal network's IP addresses.  One day you decided to enable the firewall on the Gentoo box. Then things started to fall apart, with random drops of your ssh sessions for no apparent reason. At this point you noticed a trend: it only happens when you are on the Windows box. If so, check whether you have configured your firewall to allow connections from your internal LAN on UDP port 67. If you have not, then there is your problem... Happened to me (TM)...

Apparently, on WinXP, I noticed that if it can't get an IP address from the DHCP server, it will reset the connection. Strange...

If your situation is different, then you should check for any reason why WinXP might reset the connection. Maybe look at the WinXP logs and see if there are any useful error messages...

HTH.

----------

