# [UNSOLVABLE] dhcpcd works for ping but not ssh

## nokilli

This is for the case where the client and server are on the same subnet.  I'm running a current, stable amd64.

To expand, this works:

```
$ ping test
PING test.mshome.net (192.168.137.1) 56(84) bytes of data.
...
```

But this does not:

```
$ ssh test
ssh: Could not resolve hostname test: Name or service not known
```

Nor does this:

```
$ ssh test.mshome.net
ssh: Could not resolve hostname test.mshome.net: Name or service not known
```

But nslookup works:

```
$ nslookup test
Server:            192.168.137.1
Address:          192.168.137.1#53

Non-authoritative answer:
Name:   test.mshome.net
Address: 192.168.137.1
```

Yes, I'm using Windows 7's Internet Connection sharing on the machine I'm connecting to... this is what dhcpcd does to /etc/resolv.conf:

```
# Generated by dhcpcd from eth0
# /etc/resolv.conf.head can replace this line
domain mshome.net
nameserver 192.168.137.1
# /etc/resolv.conf.tail can replace this line
```

I don't see how it being a Windows box should make a difference though.  If ping/nslookup can get it to cough up an address, why can't ssh?

The ultimate intent is to be able to use my Mac (presently running Windows/Cygwin) to play X server, and to connect to it from my Gentoo netbook using emacs --display=test:0 so I can get the latter to take advantage of the bigger screen of the former.  emacs makes the connection using ssh.

If I put 192.168.137.1 test in my /etc/hosts file, ssh works.  But I can't/don't want to do that as Windows appears to use a RNG when doling out IP addresses.  And I don't want to create a simple subnet to connect the two and thus be able to rely on a consistent address, as I'd like to be able to use the Mac running Windows as my connection to the Internet.

There's something fundamental I'm missing here... what is it?

[edit: first changed to INSOLUBLE, then realized we're not talking about liquids or gases so now it's UNSOLVABLE]

----------

## pascuol

this is very strange. I've the same kind of configuration and it's working fine.

Anything special in your /etc/ssh/ssh_config file ?

----------

## nokilli

 *pascuol wrote:*   

> this is very strange. I've the same kind of configuration and it's working fine.
> 
> Anything special in your /etc/ssh/ssh_config file ?

 

It's as installed by Gentoo, haven't touched a thing.  My only real config change has been to /etc/conf.d/net, and that's just to set eth0 to use dhcp.

----------

## RazielFMX

Can you run the ssh with a bunch of v's?

ssh -vvvvvvv test.mshome.net

That might give us some clue what is going on...

----------

## nokilli

 *RazielFMX wrote:*   

> Can you run the ssh with a bunch of v's?
> 
> ssh -vvvvvvv test.mshome.net
> 
> That might give us some clue what is going on...

 

Gives me:

```
debug1: Reading configuration data /home/tester/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: ssh_connect: needpriv 0
ssh: Could not resolve hostname test: Name or service not known
```

I originally did the ssh -v, didn't know about the -vvvvvv option, and so the needpriv 0 data is new, but I'm googling it and nothing's leaping out at me.

It's late here though, I'll take a look again in the morning.

----------

## RazielFMX

That doesn't look like the issue.  What does your /etc/resolv.conf look like?

----------

## Hu

My guess is that the permissions on /etc/resolv.conf are incorrect.  Since ping is setuid root, it can read resolv.conf anyway, but the unprivileged ssh process cannot.
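A quick way to test this hypothesis, sketched under the assumption of GNU `stat` (as on a stock Gentoo install). The helper name `is_world_readable` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed if "other" users may read the file.
# Assumes GNU stat, whose -c '%a' prints the octal mode (e.g. 644).
is_world_readable() {
    mode=$(stat -c '%a' "$1") || return 2
    # The last octal digit holds the "other" bits; 4 is the read bit.
    [ $(( 0$mode & 4 )) -ne 0 ]
}

# Example: is_world_readable /etc/resolv.conf && echo "readable by everyone"
```

If the check fails for /etc/resolv.conf, an unprivileged process would not be able to read the nameserver entry, which would fit the symptom described.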

----------

## nokilli

 *RazielFMX wrote:*   

> That doesn't look like the issue.  What does your /etc/resolv.conf look like?

 

```
# Generated by dhcpcd from eth0
# /etc/resolv.conf.head can replace this line
domain mshome.net
nameserver 192.168.137.1
# /etc/resolv.conf.tail can replace this line
```

----------

## nokilli

 *Hu wrote:*   

> My guess is that the permissions on /etc/resolv.conf are incorrect.  Since ping is setuid root, it can read resolv.conf anyway, but the unprivileged ssh process cannot.

 

Ah, I didn't think of that!

But that's not it.  /etc/resolv.conf is 0644.  Just for grins I made it world-writable as well and tried and still no joy.

----------

## bigbangnet

Might be an idiot thing here, but what about firewalls? And did you try ssh with the IP address?

----------

## RazielFMX

Has this ever worked?

----------

## nokilli

 *bigbangnet wrote:*   

> Might be an idiot thing here, but what about firewalls? And did you try ssh with the IP address?

 

It shouldn't be a firewall as when I put the name in /etc/hosts I can get through.

I also can connect when specifying the IP address directly.

----------

## nokilli

 *RazielFMX wrote:*   

> Has this ever worked?

 

With this specific combination of machines, it's the first time I've tried.  I've got a half dozen machines on this subnet however and I routinely connect via ssh using hostname only.

It is the first time I've tried connecting to a Windows 7 box, however, and its acting as the DHCP server (as it is a connection via Internet Connection Sharing on Win7) certainly qualifies as an unusual/undesirable feature here.  But then why does it work with ping and nslookup?

It's almost as if ssh is saying, no thank you, this address is provided by Windows, I won't use that, go get yourself a real DHCP server.  If the address is obtained via /etc/hosts, it works.  And if the address is used directly, it works.  Truly baffling.

----------

## RazielFMX

Can you do a telnet test?

```
telnet <hostname> 22
```

And see what happens?

If you don't have telnet installed, do this:

```
emerge net-misc/netkit-telnetd
```

And then you will have telnet.

----------

## Hu

It would also be interesting to run strace -o /tmp/ssh.strace -t /usr/bin/ssh host and then examine /tmp/ssh.strace for clues.
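When reading such a trace for a name-resolution failure, the interesting lines are usually the ones that touch the resolver's configuration files and the DNS socket. A rough filter, assuming the trace was written to a file as above; the function name and the pattern are illustrative and may need adjusting for a given glibc version:

```shell
#!/bin/sh
# Hypothetical helper: show only the trace lines that are likely to
# matter for name resolution (resolver config files and socket traffic).
resolver_lines() {
    grep -E 'resolv\.conf|nsswitch\.conf|/etc/hosts|connect\(|send(to|mmsg)\(|recvfrom\(' "$1"
}

# Example: resolver_lines /tmp/ssh.strace
```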

----------

## nokilli

 *RazielFMX wrote:*   

> Can you do a telnet test?
> 
> ```
> telnet <hostname> 22
> ```
> ...

 

Doesn't work under telnet either.

```
telnet: could not resolve test/telnet: Name or service not known
```

----------

## nokilli

 *Hu wrote:*   

> It would also be interesting to run strace -o /tmp/ssh.strace -t /usr/bin/ssh host and then examine /tmp/ssh.strace for clues.

 

strace is very cool, thanks for that.

The results: http://pastebin.com/sgKRbauv

I'm not sure what to make of this.  You can see that ssh puts together the fqdn, but that's likely from the info dhcpcd puts in /etc/resolv.conf.  You can also see traffic to and from the host I'm trying to connect to, but the complicating factor is that it is also acting as the DHCP server, so it's not clear really from this output what is actually going on.

I'm not familiar with the inner details of how DHCP works, but my guess is that the Microsoft implementation is half-assed.  It is after all probably designed to just relay these lookups to whatever nameserver it's configured to use.  That ping and nslookup work is just a fluke?

I'll leave this up for another day in case somebody has another perspective, then I'll mark as solved.

Thanks for the responses.

----------

## Hu

That host is both the DNS and DHCP server.  The traffic shown by the strace is that ssh sent a DNS request asking for the name test.mshome.net.  It received a response, which apparently was NXDOMAIN.  This looks like the DNS server does not know that it should resolve the name test.mshome.net at all.  You can confirm this using dig from net-dns/bind-tools: dig @192.168.137.1 test.mshome.net.  I expect this will show the same thing that nslookup showed, although in a more verbose manner that I consider to be more informative.  ;)  If so, then that indicates that Windows is not using DNS to learn that name.  If 192.168.137.1 is a typical consumer home router, it probably has WINS support and Windows is learning the name that way.  If the DNS server is smart enough, you could probably get on it and add an explicit mapping.
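For the dig check, the part of the output that matters most is the `status:` field in the header line. A minimal sketch that pulls it out, assuming dig's usual header format; the helper name `dig_status` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read dig output on stdin and print the response
# status (NOERROR, NXDOMAIN, SERVFAIL, ...) taken from the header line,
# which looks like: ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 1234"
dig_status() {
    sed -n 's/^;; ->>HEADER<<-.*status: \([A-Z]*\),.*/\1/p'
}

# Example (needs dig from net-dns/bind-tools and a reachable server):
#   dig @192.168.137.1 test.mshome.net | dig_status
```

A result of NXDOMAIN would match what the strace suggests; NOERROR with an address in the answer section would point somewhere else.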

As an aside, using mshome.net for your internal network is a bad idea unless you are also the authoritative owner for that domain.  You should pick something that is not a top level domain, such as home.local[1].

[1] Some tools have laid claim to .local and will wreak havoc on your name resolution if you use them in a network that uses a .local DNS domain.  I believe Avahi / mDNSResponder are guilty of this.

----------

## nokilli

 *Hu wrote:*   

> That host is both the DNS and DHCP server.  The traffic shown by the strace is that ssh sent a DNS request asking for the name test.mshome.net.  It received a response, which apparently was NXDOMAIN.  This looks like the DNS server does not know that it should resolve the name test.mshome.net at all.  You can confirm this using dig from net-dns/bind-tools: dig @192.168.137.1 test.mshome.net.  I expect this will show the same thing that nslookup showed, although in a more verbose manner that I consider to be more informative.   If so, then that indicates that Windows is not using DNS to learn that name.  If 192.168.137.1 is a typical consumer home router, it probably has WINS support and Windows is learning the name that way.  If the DNS server is smart enough, you could probably get on it and add an explicit mapping.

 

A little awkward, but I'm no longer using the Internet Connection sharing feature of Windows 7 so I can't perform your test (easily).  But if I understand you correctly, ssh goes through the bother of interrogating the DNS server to check that the results given by the DHCP server are correct before considering establishing a connection, whereas ping and nslookup do not?

If so, what would your opinion be as to my submitting this as a bug report on ssh... the behavior is correct, but the error it gives is clearly wrong, and then too shouldn't there be an option to override this sensibility and allow me to connect to a server if I know the address being returned is correct?

 *Hu wrote:*   

> As an aside, using mshome.net for your internal network is a bad idea unless you are also the authoritative owner for that domain.  You should pick something that is not a top level domain, such as home.local[1].

 

I totally agree, hence my discontinuing this configuration.  Recent experiences with Mac OS X (with Lion esp.) and Windows 7 have instilled within me an urgent sense that I am no longer in control over my home network.  But as much as I would love to, I'm not in a position where I can just jettison these OS's... I need to keep them around.  I have spent the past two or three weeks iterating across all of the myriad ways in which I might be able to configure my home network so that access to the Internet is preserved with some semblance of security for my data and protection of the time I'm investing in all things digital but I'm realizing that this is impossible with closed operating systems: I simply no longer trust Apple, and I never really trusted Microsoft, and it is absolutely infuriating to me that I should have to spend this kind of time to secure my network when I've paid good money for what I thought were professional products.

 *Hu wrote:*   

> [1] Some tools have laid claim to .local and will wreak havoc on your name resolution if you use them in a network that uses a .local DNS domain.  I believe Avahi / mDNSResponder are guilty of this.

 

You're speaking of ZeroConf, and I am sympathetic to why they did this.  FWIW, I did experience considerable pain with .local some years back, but my recent Linux installs suggest that they've worked these issues out.  A lot depends on whether you have novice users running about in your network... if you do, the time savings can be considerable.

Thanks for the good reply.

----------

## Hu

 *nokilli wrote:*   

> A little awkward, but I'm no longer using the Internet Connection sharing feature of Windows 7 so I can't perform your test (easily).  But if I understand you correctly, ssh goes through the bother of interrogating the DNS server to check that the results given by the DHCP server are correct before considering establishing a connection, whereas ping and nslookup do not?

 No, you misunderstood me.  Your DHCP lease included information on where to find a DNS server.  That DNS server happens to be running on the same machine as the DHCP server.  Therefore, when ssh needs to know a name, it contacts the DNS server specified in your /etc/resolv.conf.  That DNS server then responds with NXDOMAIN.
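To see which server a client will end up asking, it can help to list the nameserver entries that dhcpcd wrote. A small sketch; the function name is hypothetical, and it ignores everything else that influences lookups (such as the hosts line in /etc/nsswitch.conf):

```shell
#!/bin/sh
# Hypothetical helper: print the DNS servers listed in a resolv.conf-style
# file. With no argument it reads /etc/resolv.conf.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example: list_nameservers
```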

One possibility that I did not previously consider is that your DNS server may choke on requests for AAAA records.  Since ping is designed to work only on IPv4 and nslookup is ancient, they likely submit queries only for A records.  Modern ssh with modern glibc, both of which you have, will probably try both A and AAAA.  Some broken DNS servers react very badly to this and refuse to provide a valid A record if they see an AAAA query at all.  You could confirm this by using dig to interrogate the server.
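One way to run that comparison is to ask the same server for each record type and look at the response status. This is only a sketch: it assumes dig from net-dns/bind-tools is installed, the function names are made up, and the server and name are passed in as arguments.

```shell
#!/bin/sh
# Hypothetical helpers: ask one DNS server for the A and the AAAA record
# of a name and print the response status of each query.
query_status() {   # usage: query_status SERVER NAME TYPE
    dig "@$1" "$2" "$3" +time=2 +tries=1 2>/dev/null |
        sed -n 's/^;; ->>HEADER<<-.*status: \([A-Z]*\),.*/\1/p'
}

compare_a_aaaa() { # usage: compare_a_aaaa SERVER NAME
    for rrtype in A AAAA; do
        status=$(query_status "$1" "$2" "$rrtype") || status=
        # An empty status means dig is missing or the server did not answer.
        echo "$rrtype: ${status:-no response}"
    done
}

# Example: compare_a_aaaa 192.168.137.1 test.mshome.net
```

A well-behaved server would normally report NOERROR for both queries, with an empty answer section for AAAA if the host has no IPv6 address. NOERROR for A but NXDOMAIN or no response for AAAA would be consistent with the theory above.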

 *nokilli wrote:*   

> If so, what would your opinion be as to my submitting this as a bug report on ssh... the behavior is correct, but the error it gives is clearly wrong, and then too shouldn't there be an option to override this sensibility and allow me to connect to a server if I know the address being returned is correct?

 This is not a bug in ssh.  This is a configuration problem (or possibly a bug) in your DNS server.  The address being returned is not correct because there is no address returned to ssh.

 *nokilli wrote:*   

> I totally agree, hence my discontinuing this configuration. ...

 I do not see the security problem here, beyond the standard caveats of trusting any closed system to do exactly what it is advertised to do, no more and no less.  There is certainly a correctness problem.

----------

## wcg

On kind of a lateral note, do any of these "shoot me now" internal broadband modems have working Linux drivers?

----------

