# pppoe-problem: packet lost? (ERESTARTNOHAND) / SOLVED

## ageforce

My web connection works fine, with a few exceptions: I can't load "www.dilbert.com" (or some other sites like "www.derstandard.at", or the MSN login of Kopete). My web browser sends the request and then times out on "Waiting for response". This worked several weeks ago, and I can't remember what has changed since then (I originally blamed the servers). Maybe it is related to my switch from 2Mbit/s to 8Mbit/s, but I'm not sure.

Using a second PC and a Knoppix-cd (kernel 2.6.9) I got the following information:

* It is not a DNS problem.

* Routing through my PC doesn't help either: the second computer fails on the same sites.

* Using telnet I don't receive any page:

```
$ telnet www.dilbert.com 80
Trying 65.114.4.69...
Connected to www.dilbert.com.
Escape character is '^]'.
GET / HTTP/1.1
Host: www.dilbert.com

```

and no response...

* I ran strace on the telnet session. The important lines seem to be the following:

```
send(3, "Host: www.dilbert.com\r\n", 23, 0) = 23
select(4, [0 3], [3], [3], {0, 0})      = 1 (out [3], left {0, 0})
send(3, "\r\n", 2, 0)                   = 2
select(4, [0 3], [], [3], {0, 0})       = 0 (Timeout)
select(4, [0 3], [], [3], NULL)         = ? ERESTARTNOHAND (To be restarted)
```

Unfortunately I have no idea what this ERESTARTNOHAND means or what I can do to avoid it.

* When booting with Knoppix everything works fine.

* Going over a proxy (tried some anonymizer) works.

* Using the same ppp-configuration the second PC has the same problem.

I tried downgrading to ppp-2.4.2-r10 and re-emerging 2.4.3-r1. No luck... :(

I also tried reducing the MRU and MTU of my PPP connection, without success.

I also tried to upgrade my kernel (although the old one already worked once).

I'm now running gentoo-2.6.11-r3

My internet connection recently got boosted from 2Mbit/s to ~8Mbit/s. I'm not sure whether the problems started at the same time.

My internet-provider is n9uf (http://www.neuf.fr) (just in case).

a pppd-dump (I replaced the username):

```
$ sudo /etc/init.d/net.ppp0 restart

 * Bringing ppp0 down ...                                                                      [ ok ]

 * Bringing ppp0 up ...

SIOCDELRT: No such process

Plugin rp-pppoe.so loaded.

RP-PPPoE plugin version 3.3 compiled against pppd 2.4.3

pppd options in effect:

holdoff 2               # (from command line)

persist         # (from command line)

linkname ppp0           # (from command line)

dump            # (from command line)

plugin rp-pppoe.so              # (from /etc/ppp/options)

noauth          # (from command line)

user XXXXXXXXXX@neuf.fr          # (from command line)

remotename neuf         # (from command line)

eth0            # (from command line)

eth0            # (from command line)

asyncmap 0              # (from command line)

mru 1476                # (from command line)

mtu 1476                # (from command line)

lcp-echo-interval 3             # (from /etc/ppp/options)

hide-password           # (from command line)

ipparam ppp0            # (from command line)

defaultroute            # (from command line)                                                  [ ok ]

```

finally the ppp-config-files:

/etc/conf.d/net.ppp0:

```
# /etc/conf.d/net.ppp0:

# $Header: /home/cvsroot/gentoo-x86/net-dialup/ppp/files/2.4.2b3/confd.ppp0,v 1.2 2003/12/22 15:16:18 lanius Exp $

# Config file for /etc/init.d/net.ppp0

PEER="neuf"                   # Define peer (aka ISP)

DEBUG="no"                      # Turn on debugging

PERSIST="yes"                    # Redial after being dropped

ONDEMAND="no"                   # Only bring the interface up on demand?

MODEMPORT="eth0"          # TTY device modem is connected to

LINESPEED=""              # Speed pppd should try to connect at

INITSTRING=""                   # Extra init string for the modem

DEFROUTE="yes"                  # Must pppd set the default route?

HARDFLOWCTL="yes"               # Use hardware flow control?

ESCAPECHARS="yes"               # Use escape caracters ?

PPPOPTIONS="dump"                   # Extra options for pppd

USERNAME="XXXXXXXXXX@neuf.fr"       # The PAP/CHAP username

PASSWORD="YYYYYYYYYYY"               # Your password/secret.  Ugly I know, but i

                                # will work on something more secure later

                                # on.  700 permission on /etc/init.d/net.ppp0

                                # should be enouth for now.

NUMBER=""                # The telephone number of your ISP

                                # leave blank for leased-line operation.

REMIP=""                        # The ip of the remote box if it should be set

NETMASK=""                      # Netmask

IPADDR=""                       # Our IP if we have a static one

MRU="1476"                       # Sets the MRU

MTU="1476"                       # Sets the MTU

#MRU="1460"

#MTU="1460"

#MRU="768"                       # Sets the MRU

#MTU="768"                       # Sets the MTU

RETRYTIMEOUT="2"               # Retry timeout for when ONDEMAND="yes" or

                                # PERSIST="yes"

IDLETIMEOUT="600"               # Idle timeout for when ONDEMAND="yes"

PEERDNS="no"                    # Should pppd set the peer dns?

AUTOCFGFILES="yes"              # By default this scripts will generate

                                # /etc/ppp/chat-isp, /etc/ppp/chap-secrets,

                                # /etc/ppp/pap-secrets and /etc/ppp/peers/isp

                                # automatically.  Set to "no" if you experience

                                # problems, or need specialized scripts.  You

                                # will have to create these files by hand then.

AUTOCHATSCRIPT="yes"            # By default this script iwll generate

                                # /etc/ppp/chat-${PEER} automatically. Set to "no"

                                # if you experience problems, or need specialized

                                # scripts. You will have to create these files by

                                # hand then.

# Directory where the templates is stored

TEMPLATEDIR=/etc/ppp

```

/etc/ppp/options:

```
lock
plugin rp-pppoe.so
lcp-echo-interval 3
```

/etc/ppp/options-pppoe

```
noipdefault
hide-password
defaultroute
persist
lock
```

/etc/ppp/peers/neuf is empty.

I've run out of ideas now.

Last edited by ageforce on Sat Mar 19, 2005 3:34 pm; edited 2 times in total

----------

## tuxmin

It would be helpful to have a tcpdump of such a telnet session (up to the timeout).

```
tcpdump -i ppp0 port 80
# or, with tethereal (capture filter passed via -f):
tethereal -i ppp0 -f "port 80"
```

----------

## ageforce

didn't even think about that...

```
  0.000000 80.119.136.176 -> 65.114.4.69  TCP 42817 > http [SYN] Seq=0 Ack=0 Win=5808 Len=0 MSS=1452 TSV=29949298 TSER=0 WS=2

  0.147047  65.114.4.69 -> 80.119.136.176 TCP http > 42817 [SYN, ACK] Seq=0 Ack=1 Win=33120 Len=0 TSV=767604515 TSER=29949298 WS=0 MSS=1460

  0.147096 80.119.136.176 -> 65.114.4.69  TCP 42817 > http [ACK] Seq=1 Ack=1 Win=5808 Len=0 TSV=29949446 TSER=767604515

  0.826689 80.119.136.176 -> 65.114.4.69  HTTP GET / HTTP/1.1

  0.974616  65.114.4.69 -> 80.119.136.176 TCP http > 42817 [ACK] Seq=1 Ack=17 Win=33120 Len=0 TSV=767604597 TSER=29950126

  0.974662 80.119.136.176 -> 65.114.4.69  HTTP Continuation or non-HTTP traffic

  1.981406 80.119.136.176 -> 65.114.4.69  HTTP Continuation or non-HTTP traffic

  2.135412  65.114.4.69 -> 80.119.136.176 TCP [TCP Previous segment lost] http > 42817 [ACK] Seq=4321 Ack=42 Win=33120 Len=0 TSV=767604713 TSER=29950274
```

a more verbose version is here

apparently a packet is lost, but I don't understand where, why this only happens to my gentoo, and why this affects only specific sites...

and thx
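
As a rough reading aid for the dump above, here is a small Python sketch of the sequence-number arithmetic. It only uses numbers visible in the trace (MSS=1452 from my SYN, Seq=4321 on the "Previous segment lost" line). The 12 bytes for the TCP timestamp option are an assumption based on the TSV/TSER fields being present, and the function name is made up for illustration.

```python
# Values read off the tethereal output above; the 12-byte figure assumes
# the TCP timestamp option (TSV/TSER) is carried on every data segment.
ADVERTISED_MSS = 1452      # MSS announced in the client's SYN
TIMESTAMP_OPTION = 12      # 10 bytes of option plus 2 bytes of padding


def missing_segments(first_seen_seq, next_expected_seq=1):
    """Return (missing_bytes, full_segments, remainder) for a sequence gap."""
    payload = ADVERTISED_MSS - TIMESTAMP_OPTION   # data bytes per full segment
    missing = first_seen_seq - next_expected_seq  # bytes that never showed up
    return missing, missing // payload, missing % payload


# "Previous segment lost" was reported at Seq=4321 while Seq=1 was expected.
print(missing_segments(4321))
```

If the assumptions hold, the gap is exactly three full-size segments, which would be consistent with only the largest packets from the server going missing. It does not say where they get dropped.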

----------

## tuxmin

Guess what...

the telnet thing doesn't work for me either (GET / HTTP/1.1)!

```

1089.453107 192.168.23.153 -> 65.114.4.69  TCP 49513 > www [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=48750954 TSER=0 WS=0

1089.606631  65.114.4.69 -> 192.168.23.153 TCP www > 49513 [SYN, ACK] Seq=0 Ack=1 Win=33120 Len=0 TSV=694038903 TSER=48750954 WS=0 MSS=1452

1089.606715 192.168.23.153 -> 65.114.4.69  TCP 49513 > www [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=48750969 TSER=694038903

1096.255103 192.168.23.153 -> 65.114.4.69  HTTP GET / HTTP/1.1

1096.405307  65.114.4.69 -> 192.168.23.153 TCP www > 49513 [ACK] Seq=1 Ack=18 Win=33120 Len=0 TSV=694039583 TSER=48751634

###  here it hangs for some seconds ###

1143.121694  65.114.4.69 -> 192.168.23.153 TCP www > 49513 [FIN, ACK] Seq=1 Ack=18 Win=33120 Len=0 TSV=694044255 TSER=48751634

1143.122219 192.168.23.153 -> 65.114.4.69  TCP 49513 > www [FIN, ACK] Seq=18 Ack=2 Win=5840 Len=0 TSV=48756321 TSER=694044255

1143.274218  65.114.4.69 -> 192.168.23.153 TCP www > 49513 [ACK] Seq=2 Ack=19 Win=33120 Len=0 TSV=694044270 TSER=48756321

```

Although I have no problem using konqueror or doing this:

```
telnet www.dilbert.com 80
Trying 65.114.4.69...
Connected to www.dilbert.com.
Escape character is '^]'.
GET
```

Hm... can you drop a tcpdump of a real browser request? Maybe this server is mad about the browser identification.

Anyway, I'm starting to believe that it's not a problem with your pppoe setup.

----------

## ageforce

HTTP/1.1 requires the "Host:" line, and the request is only sent off once you enter an empty line.

So after you have connected to www.dilbert.com on port 80,

you need to type the following 3 lines:

GET / HTTP/1.1

Host: www.dilbert.com

(don't forget the empty line either)
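
The same request can also be sent from a script, which avoids typing mistakes in telnet. This is only a minimal Python sketch: `build_request` and `fetch_start` are names made up for illustration, and the timeout and byte limit are arbitrary choices.

```python
import socket


def build_request(host, path="/"):
    """Build the three lines described above: request line, Host line, empty line."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    # Each line ends with CRLF; the extra CRLF is the empty line that
    # tells the server the request headers are complete.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_start(host, port=80, timeout=10.0, limit=512):
    """Send the request and return the first bytes of the reply.

    Raises a timeout error if nothing arrives within `timeout` seconds,
    which is what the telnet session above amounts to.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host))
        return sock.recv(limit)


print(build_request("www.dilbert.com"))
```

Calling `fetch_start("www.dilbert.com")` on an affected connection would be expected to time out instead of returning the start of the page.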

I'm currently at work, but I can post the browser-dump in the evening (CET).

----------

## tuxmin

I see,

this finally works. 

There is one thing I notice, though: If I interpret it right, your host requests an MSS of 1452 but the remote answers with 1460!

So it appears the remote tries to send a packet that is too large, retries several times and finally answers with TCP Previous segment lost... does this make sense?
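
To make the numbers easier to check, here is a small Python sketch of the usual MSS arithmetic. It assumes plain IPv4 with a 20-byte IP header and a 20-byte TCP header (no IP options); the function names are made up for illustration. Note that each side's MSS option only states what that side is willing to receive, so a remote announcing 1460 should still limit what it sends to the 1452 announced by the other end.

```python
IP_HEADER = 20   # assumed: IPv4 header without options
TCP_HEADER = 20  # assumed: TCP header without options


def mss_for_mtu(mtu):
    """MSS a host would normally announce for a given interface MTU."""
    return mtu - IP_HEADER - TCP_HEADER


def largest_segment_to_peer(peer_announced_mss, own_mtu):
    """Largest segment a sender should emit towards the peer."""
    return min(peer_announced_mss, mss_for_mtu(own_mtu))


print(mss_for_mtu(1500))                    # plain Ethernet
print(mss_for_mtu(1492))                    # Ethernet minus 8 bytes of PPPoE overhead
print(mss_for_mtu(1476))                    # the MTU set in net.ppp0
print(largest_segment_to_peer(1452, 1500))  # remote on Ethernet, peer announced 1452
```

By this arithmetic an MTU of 1476 would give an MSS of 1436, while the SYN in the dump announces 1452 (which matches an MTU of 1492). That might mean the configured MTU was not in effect for that capture, but this is only a guess from the numbers.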

How did you set CLAMPMSS in /etc/ppp/pppoe.conf?

Alex!!!

----------

## ageforce

didn't even know, I had a pppoe.conf... :)

when does it get used? (why are there so many config files for pppoe????)

here's my pppoe.conf (i removed all comments):

```
ETH=eth1
USER=bxxxnxnx@sympatico.ca
DEMAND=no
DNSTYPE=SERVER
PEERDNS=yes
DNS1=
DNS2=
DEFAULTROUTE=yes
CONNECT_TIMEOUT=30
CONNECT_POLL=2
ACNAME=
SERVICENAME=
PING="."
PIDFILE="/var/run/adsl.pid"
SYNCHRONOUS=no
CLAMPMSS=1412
LCP_INTERVAL=20
LCP_FAILURE=3
PPPOE_TIMEOUT=80
FIREWALL=NONE
LINUX_PLUGIN=
PPPOE_EXTRA=""
PPPD_EXTRA=""
```

so my CLAMPMSS is set to 1412.

----------

## tuxmin

pppoe.conf is used by the adsl-start script...

You are using ADSL, aren't you!?

----------

## ageforce

i'm starting my adsl using net.ppp0

(I have pppoe compiled into the kernel, and I've emerged ppp)

I suppose adsl-start comes from the rp-pppoe package?

```
*  net-dialup/ppp
      Latest version available: 2.4.3-r1
      Latest version installed: 2.4.3-r1

*  net-dialup/rp-pppoe
      Latest version available: 3.5-r8
      Latest version installed: [ Not Installed ]
```

----------

## tuxmin

OK, 

now I see more clearly -- forget about pppoe.conf  :Razz: 

Can you confirm that e.g. www.derstandard.at shows the same MSS 1452/1460 issue as dilbert.com, while working sites don't?

Just to make sure we are on the right track...

Alex!!

----------

## ageforce

yep. derstandard.at has the same issues, but so does for instance www.slashdot.org:

derstandard.at:

```
  0.000000 80.119.136.135 -> 193.154.164.57 TCP 51673 > http [SYN] Seq=0 Ack=0 Win=5808 Len=0 MSS=1452 TSV=12993054 TSER=0 WS=2

  0.069163 193.154.164.57 -> 80.119.136.135 TCP http > 51673 [SYN, ACK] Seq=0 Ack=1 Win=17424 Len=0 MSS=1460 WS=0 TSV=0 TSER=0

  0.069218 80.119.136.135 -> 193.154.164.57 TCP 51673 > http [ACK] Seq=1 Ack=1 Win=5808 Len=0 TSV=12993124 TSER=0

  6.580740 80.119.136.135 -> 193.154.164.57 HTTP GET / HTTP/1.1

  6.774793 193.154.164.57 -> 80.119.136.135 TCP http > 51673 [ACK] Seq=1 Ack=17 Win=17408 Len=0 TSV=27770540 TSER=12999636

 10.032939 80.119.136.135 -> 193.154.164.57 HTTP Continuation or non-HTTP traffic

 10.273145 193.154.164.57 -> 80.119.136.135 TCP http > 51673 [ACK] Seq=1 Ack=39 Win=17386 Len=0 TSV=27770575 TSER=13003089

 10.273189 80.119.136.135 -> 193.154.164.57 HTTP Continuation or non-HTTP traffic

 10.703281 80.119.136.135 -> 193.154.164.57 HTTP Continuation or non-HTTP traffic

 10.762935 193.154.164.57 -> 80.119.136.135 TCP [TCP Previous segment lost] http > 51673 [ACK] Seq=2881 Ack=41 Win=17384 Len=0 TSV=27770579 TSER=13003760
```

and www.slashdot.org:

```
  0.000000 80.119.136.135 -> 66.35.250.151 TCP 49643 > http [SYN] Seq=0 Ack=0 Win=5808 Len=0 MSS=1452 TSV=13079592 TSER=0 WS=2

  0.187437 66.35.250.151 -> 80.119.136.135 TCP http > 49643 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=309048132 TSER=13079592 WS=0

  0.187481 80.119.136.135 -> 66.35.250.151 TCP 49643 > http [ACK] Seq=1 Ack=1 Win=5808 Len=0 TSV=13079780 TSER=309048132

  5.139225 80.119.136.135 -> 66.35.250.151 HTTP GET / HTTP/1.1

  5.329572 66.35.250.151 -> 80.119.136.135 TCP http > 49643 [ACK] Seq=1 Ack=17 Win=5792 Len=0 TSV=309048646 TSER=13084732

 10.335885 80.119.136.135 -> 66.35.250.151 HTTP Continuation or non-HTTP traffic

 10.524222 66.35.250.151 -> 80.119.136.135 TCP http > 49643 [ACK] Seq=1 Ack=41 Win=5792 Len=0 TSV=309049166 TSER=13089930

 10.527180 80.119.136.135 -> 66.35.250.151 HTTP Continuation or non-HTTP traffic

 10.717162 66.35.250.151 -> 80.119.136.135 TCP http > 49643 [ACK] Seq=1 Ack=43 Win=5792 Len=0 TSV=309049185 TSER=13090121

 10.719394 66.35.250.151 -> 80.119.136.135 HTTP HTTP/1.1 301 Moved Permanently

 10.719418 80.119.136.135 -> 66.35.250.151 TCP 49643 > http [ACK] Seq=43 Ack=285 Win=6880 Len=0 TSV=13090313 TSER=309049185

 10.719875 66.35.250.151 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 10.720092 80.119.136.135 -> 66.35.250.151 TCP 49643 > http [FIN, ACK] Seq=43 Ack=597 Win=7952 Len=0 TSV=13090314 TSER=309049185

 10.922397 66.35.250.151 -> 80.119.136.135 TCP http > 49643 [ACK] Seq=597 Ack=44 Win=5792 Len=0 TSV=309049205 TSER=13090314
```

----------

## tuxmin

Hm, 

the second one does not count. The "301 Moved Permanently" answer is too small to need a full-size packet.

Try to get this image (142k) and see what happens:

http://www.funpic.hu/files/pics/00022/00022413.gif

Alex!!!

----------

## ageforce

Here are the first lines (if you want, I can put the whole dump somewhere):

```
  0.000000 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [SYN] Seq=0 Ack=0 Win=5808 Len=0 MSS=1452 TSV=14438645 TSER=0 WS=2

  0.060337 195.56.65.29 -> 80.119.136.135 TCP http > 48733 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=46867910 TSER=14438645 WS=0

  0.060385 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=1 Ack=1 Win=5808 Len=0 TSV=14438705 TSER=46867910

 15.289367 80.119.136.135 -> 195.56.65.29 HTTP GET /files/pics/00022/00022413.gif HTTP/1.1

 15.349083 195.56.65.29 -> 80.119.136.135 TCP http > 48733 [ACK] Seq=1 Ack=46 Win=5792 Len=0 TSV=46869439 TSER=14453937

 21.336147 80.119.136.135 -> 195.56.65.29 HTTP Continuation or non-HTTP traffic

 21.416879 195.56.65.29 -> 80.119.136.135 TCP http > 48733 [ACK] Seq=1 Ack=67 Win=5792 Len=0 TSV=46870046 TSER=14459985

 21.791208 80.119.136.135 -> 195.56.65.29 HTTP Continuation or non-HTTP traffic

 21.864612 195.56.65.29 -> 80.119.136.135 TCP http > 48733 [ACK] Seq=1 Ack=69 Win=5792 Len=0 TSV=46870091 TSER=14460440

 21.911035 195.56.65.29 -> 80.119.136.135 HTTP HTTP/1.1 200 OK (GIF89a)

 21.911083 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=69 Ack=1425 Win=8688 Len=0 TSV=14460560 TSER=46870095

 21.911116 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 21.911128 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=69 Ack=1441 Win=8688 Len=0 TSV=14460560 TSER=46870095

 21.974852 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 21.974889 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=69 Ack=2865 Win=11568 Len=0 TSV=14460623 TSER=46870101

 21.976863 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 21.976914 80.119.136.135 -> 195.56.65.29 TCP [TCP Dup ACK 15#1] 48733 > http [ACK] Seq=69 Ack=2865 Win=11568 Len=0 TSV=14460626 TSER=46870101 SLE=2881 SRE=4305

 21.976867 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 21.976946 80.119.136.135 -> 195.56.65.29 TCP [TCP Dup ACK 15#2] 48733 > http [ACK] Seq=69 Ack=2865 Win=11568 Len=0 TSV=14460626 TSER=46870101 SLE=2881 SRE=4321

 22.037798 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 22.037849 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=69 Ack=4321 Win=11568 Len=0 TSV=14460686 TSER=46870107

 22.039812 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 22.039833 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=69 Ack=5745 Win=14448 Len=0 TSV=14460688 TSER=46870107

 22.040064 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 22.040078 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=69 Ack=5761 Win=14448 Len=0 TSV=14460689 TSER=46870108

 22.042368 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 22.042409 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=69 Ack=7185 Win=17328 Len=0 TSV=14460691 TSER=46870108

 22.106975 195.56.65.29 -> 80.119.136.135 HTTP Continuation or non-HTTP traffic

 22.107294 80.119.136.135 -> 195.56.65.29 TCP 48733 > http [ACK] Seq=69 Ack=7201 Win=17328 Len=0 TSV=14460756 TSER=46870114
```

----------

## tuxmin

wrong track, eh  :Sad: 

----------

## ageforce

I'll try to dump a knoppix session, and compare it. (Not sure, if i have the time today though...)

I suppose I'll have to study the PPP-specs too. (time, time, time...)

unless you have another idea of course. ;)

----------

## tuxmin

Sorry,

without hands-on access to your machine or, even better, the remote ones, it's rather difficult at the moment.

But post new insights -- perhaps it triggers something.

Good luck,

Alex!!!

----------

## tuxmin

Hi, here is something that might help. Section 4.5 describes exactly your problem:

http://www.roaringpenguin.com/images/resources_files/PPPoEforLinux.pdf

I'm not familiar with the setup you run, but it should be worth a try to install rp-pppoe and use the CLAMPMSS option.

Hth, Alex!!

----------

## ageforce

i'll have a look at it (unfortunately probably not in the next few days. much work to do...:(  but i'll keep you posted).

maybe i'm able to tell my pppd to do the same (as it uses rp-pppoe.so)...

big thx

----------

## ageforce

Finally got it working.

short version:

I just needed to disable ESCAPECHARS in /etc/conf.d/net.ppp0

long version:

I got some ethereal dumps under Knoppix and under Gentoo (not just on ppp0 but also on eth0) and noticed no difference (the ACK was still missing), so it had to be something during initialisation.

I installed rp-pppoe on Gentoo, and adsl-start worked (after fiddling with the config files).

I then changed adsl-start so that it would dump the complete list of options pppd receives (this is actually easier than it sounds: one just needs to add "dump" to the options and remove the "> /dev/null" in the adsl-start file).

I then applied all these options to net.ppp0 and got the working connection. From there on it was binary elimination.
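
For anyone repeating this, the binary elimination can be sketched in a few lines of Python. This is not the exact procedure I followed, just the idea. It assumes that exactly one differing setting is responsible, and `works_with_fixed` is a hypothetical callback standing for "apply the known-good values for these settings only, restart ppp0, test the site". The setting names in the example are illustrative; only ESCAPECHARS is the real culprit from this thread.

```python
def find_culprit(differences, works_with_fixed):
    """Narrow a list of differing settings down to the one that matters.

    differences:      settings that differ between the working and broken config.
    works_with_fixed: callable taking a list of settings; it should return True
                      if the connection works when only those settings get
                      their known-good values (hypothetical manual test).
    Assumes exactly one setting is responsible.
    """
    suspects = list(differences)
    if not suspects:
        raise ValueError("no differing settings to test")
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        if works_with_fixed(half):
            suspects = half                  # culprit is among the fixed half
        else:
            suspects = suspects[len(half):]  # culprit is in the other half
    return suspects[0]


# Simulated run: the predicate stands in for the manual restart-and-test step.
settings = ["MTU", "ESCAPECHARS", "PEERDNS", "LCP_INTERVAL", "HARDFLOWCTL"]
print(find_culprit(settings, lambda fixed: "ESCAPECHARS" in fixed))
```

With n differing settings this needs roughly log2(n) restarts instead of n, which matters when every test means bringing the link down and up again.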

----------

## tuxmin

Surprising, indeed!

Glad you made it...

Alex!!

----------

