# 2.6.10 breaks ntpd?

## energy

It seems that using kernel 2.6.10 breaks the ntpd. I've tried stock 2.6.10, 2.6.10-cko3 and 2.6.10-morph12. NTPD just won't keep syncing the clock, there are no error messages in any log files... This is very strange since I've used the same kernel config with 2.6.9-XXX and it works perfectly!

Any ideas? I've tried almost everything... anyone having the same problem?  :Rolling Eyes: 

----------

## pilla

I am not using ntpd, but my clock went to 7 o'clock sometimes. Just too weird. 

I had to give up 2.6.10/11 because of problems with suspend and usb mouse too.

----------

## energy

The ntpd is not the only problem, the clock starts to run "overspeed" e.g. after a day it is at least +20 minutes compared to real time. This problem came bundled with the ntpd problem...

----------

## toralf

I did not have any problems using the following kernels: linux-2.6.10-gentoo-r6  linux-2.6.10-hardened-r1  linux-2.6.10-r1 

But be sure you did not have a line in your ntp.conf like:

restrict default ignore
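For context: with `restrict default ignore` in place, ntpd drops packets from any host that has no `restrict` line of its own, and that includes the replies from its configured time servers. A minimal Python sketch of that check follows; the function name `unrestricted_servers` is made up for illustration, and it only understands plain `server HOST` and `restrict HOST ...` lines.

```python
def unrestricted_servers(conf_text):
    """List servers that 'restrict default ignore' would leave unheard.

    Hypothetical helper: handles only simple one-directive-per-line syntax.
    """
    servers, restricted, default_ignore = [], set(), False
    for raw in conf_text.splitlines():
        tokens = raw.split("#", 1)[0].split()  # strip comments, tokenize
        if len(tokens) < 2:
            continue
        if tokens[0] in ("server", "peer"):
            servers.append(tokens[1])
        elif tokens[0] == "restrict":
            if tokens[1] == "default":
                default_ignore = default_ignore or "ignore" in tokens[2:]
            else:
                restricted.add(tokens[1])
    if not default_ignore:
        return []
    return [s for s in servers if s not in restricted]


sample = """
restrict default ignore
restrict 127.0.0.1
server tock.keso.fi
server tick.keso.fi
restrict tock.keso.fi nomodify notrap noquery
"""
print(unrestricted_servers(sample))
```

Any server the function reports would need its own `restrict` line, or the default would have to be relaxed.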

----------

## drescherjm

 *Quote:*   

> The ntpd is not the only problem, the clock starts to run "overspeed" e.g. after a day it is at least +20 minutes compared to real time

 

I have the same problem. It drives me absolutely nuts. However I believe it existed in previous kernels < 2.6.10. In a period of a week (not connected to the internet) my clock can be off by as much as + or - 3 hours. At one point I thought it was rebooting and timezones but then this does not explain how the clock was ahead by 87 minutes... My rig is a dual processor athlon MX 2200 with 2 GB of PC2100 ECC DDR. Now running SMP gentoo-dev-sources 2.6.10r2.

----------

## markandrew

two things i noticed about ntp recently:

1) i have 2 NICs, eth0 and ath0; ntp requires both of them to be active to start. apparently this will be fixed in the next baselayout

2) i also could not sync today whatever i tried. i noticed that my /etc/ntp.conf file was being overwritten every time i ran ntp, removing my server list. so i copied it from the one in /usr/share, added more servers, and removed all write perms on it. still didn't work. in desperation i disabled my ath0 NIC by removing the symlink net.ath0 -> net.lo in /etc/init.d/ ... now it works. strange huh?

----------

## JjcampNR

I've also been having problems with a jumping clock, it's so damn annoying.  I'm running 2.6.10-gentoo-r3.  I'm going to try moving to r6 and see if that fixes anything.

--Josh

----------

## energy

Anyone had luck finding the cause of the problem?

I wonder if it's a bug with certain chipsets or just an option with kernel that doesn't work in combination with other option(s)... I have a nForce chipset.

----------

## pilla

has anybody tried 2.6.11-rc2?

----------

## drescherjm

I have the original AMD MP chipset on my dual processor TYAN Thunder AMD K7 2200 MP rig.

----------

## AlterEgo

Try openntpd.

It's easier, and it behaves   :Cool:   on any kernel I've tried.

----------

## energy

Tried openntpd, tried about everything... for a moment I thought I found a solution but no.

If I sync the time manually, e.g. "ntpdate tick.keso.fi", it works and the clock is in sync. ntpd itself finds the servers and keeps track of the offset but just won't keep syncing. I enabled debugging, and the log files reveal these errors for every server:

```
 9 Feb 15:48:01 ntpd[9195]: peer 130.233.224.2 event 'event_reach' (0x84) status 'unreach, conf, 1 event, event_reach' (0xa014)

```

As I said it worked with earlier kernel version so there is nothing wrong with my configs. I've tried different configurations with no success.. here's my current ntp.conf if someone wants to take a look:

```
# Add some useful log messages to our ntpd.log file
#
logconfig       =syncevents +peerevents +sysevents +allclock
logfile         /var/log/ntpd.log
driftfile       /var/lib/ntp/ntp.drift

# Allow local clients to query this server, but not update it
#
restrict default ignore
restrict 127.0.0.1
restrict 192.168.1.0 mask 255.255.255.0 nomodify nopeer notrap

# Stratum-2 servers
#
server tock.keso.fi
restrict tock.keso.fi nomodify notrap noquery
```

This one is taken from some thread in this forum.. my original config was a bit different but it didn't matter. I've tried googling about this problem but no-one seems to know what's causing it.

EDIT: Removed all but one server from the list

----------

## didl

I also had trouble with ntpd using 2.6.10. However, it turned out that the problem was ntpd dropping privileges from user root to ntp. I was able to fix this by emerging ntpd with

```

USE=+nodroproot

```

HTH

----------

## sl70

I've also been having the same problem since about a month or so ago. I have no idea why it should have started all of a sudden.

Last week I uninstalled ntp and installed openntpd. This seemed to work for a while but yesterday I started getting these messages in my log:

```
Feb 11 12:19:32 musume ntpd[31832]: peer 199.184.165.135 now valid
Feb 11 12:20:02 musume ntpd[31832]: peer 199.184.165.135 now invalid
Feb 11 12:20:17 musume ntpd[31832]: peer 199.184.165.135 now valid
Feb 11 12:21:21 musume ntpd[31831]: adjusting local clock by -2144.407760s
Feb 11 12:24:53 musume ntpd[31831]: adjusting local clock by -2145.419356s
Feb 11 12:28:44 musume ntpd[31831]: adjusting local clock by -2146.442124s
Feb 11 12:28:44 musume ntpd[31831]: adjtime failed: Invalid argument
Feb 11 12:32:21 musume ntpd[31831]: adjusting local clock by -2147.515935s
Feb 11 12:32:21 musume ntpd[31831]: adjtime failed: Invalid argument
```

Since that time whenever ntpd tries to adjust the time it puts out an invalid argument message. And now after about a day my clock is 45 minutes fast. Here's my ntpd.conf:

```
# Addresses to listen on (ntpd does not listen by default)
listen on 192.168.1.0

# sync to a single server
#server ntp.example.org

# use a random selection of 8 public stratum 2 servers
# see http://twiki.ntp.org/bin/view/Servers/NTPPoolServers
servers pool.ntp.org
```

What is going on???  :Confused: 
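One possible reading of the log above: the `adjtime failed` lines start as soon as the requested offset passes about 2145 seconds, which is close to what fits in a signed 32-bit count of microseconds (2^31 µs is roughly 2147 s). The sketch below models such a limit in Python. The constant, the 2 s safety margin, and the name `adjtime_accepts` are assumptions for illustration, not something confirmed in this thread.

```python
INT_MAX = 2**31 - 1

# Assumed limit: whole seconds that fit in a 32-bit microsecond counter,
# minus a 2 s safety margin. Treat this as a model, not a documented value.
MAX_ADJ_SEC = INT_MAX // 1_000_000 - 2


def adjtime_accepts(offset_seconds):
    """True if a single adjustment of this size stays inside the assumed limit."""
    whole = int(offset_seconds)  # truncates toward zero, like a C cast
    return -MAX_ADJ_SEC <= whole <= MAX_ADJ_SEC


# Offsets taken from the log above
for offset in (-2144.407760, -2145.419356, -2146.442124, -2147.515935):
    print(offset, adjtime_accepts(offset))
```

If that reading is right, a clock that far off would have to be stepped first (for example with ntpdate, as mentioned earlier in the thread) before gradual adjustment can work.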

----------

## sl70

I upgraded the kernel to 2.6.10-r6 and now the clock is running right. I had a little adjustment of 0.7 sec when I booted up and ntpd started, but nothing since then. 

Whew! Glad that's fixed.

----------

## energy

I tried 2.6.10-r6 and it didn't help: ntpd didn't work, and neither did the clock.

However, I think I found out why. There seems to be a problem with the nForce chipset and the timer. With a "working" kernel there are these messages at boot:

```
Feb 11 23:19:47 chevron ENABLING IO-APIC IRQs
Feb 11 23:19:47 chevron ..TIMER: vector=0x31 pin1=2 pin2=-1
Feb 11 23:19:47 chevron ..MP-BIOS bug: 8254 timer not connected to IO-APIC
Feb 11 23:19:47 chevron ...trying to set up timer (IRQ0) through the 8259A ...  failed.
Feb 11 23:19:47 chevron ...trying to set up timer as Virtual Wire IRQ... failed.
Feb 11 23:19:47 chevron ...trying to set up timer as ExtINT IRQ... works.
```

This part is missing completely with .10 kernels... also I took a look at .9 -> .10 changelog at kernel.org. There has been a lot of forking with timer code.

----------

## sl70

Ack. Posted too soon. I'm not getting those ntp messages anymore, but my clock is still running too fast. I guess all we can do is wait until a new kernel comes out.

----------

## pompafrittes

 *sl70 wrote:*   

> Ack. Posted too soon. I'm not getting those ntp messages anymore, but my clock is still running too fast. I guess all we can do is wait until a new kernel comes out.

 

Got those messages too before but they appeared because the time was one hour wrong. when i set the time a little closer to correct time i got these instead

```
Mar  2 19:55:55 src@sture ntpd[29373]: adjusting local clock by 260.723623s
Mar  2 19:58:26 src@sture ntpd[29373]: adjusting local clock by 260.604434s
Mar  2 20:01:24 src@sture ntpd[29373]: adjusting local clock by 260.319579s
Mar  2 20:05:25 src@sture ntpd[29373]: adjusting local clock by 260.169114s
```

but they will disappear when the time is correct.

----------

## gtbX

I've been having this problem with all 3 of my gentoo boxes.  The NForce1 box will run an hour fast after a week, while the PII fileserver will be an hour slow.  My laptop is even stranger - after I bring it out of sleep mode, the clock is ~30 hours faster than it should be.  All are running 2.6.10-r6 (except for the laptop, only going back to 2.6.9-r13 would fix it).

I was going to install ntpd as a stopgap, but this thread is making me think twice.

----------

## Legoguy

Try enabling the Power Management Timer Support in ACPI for the laptop.

Everyone else + the laptop: Try enabling Enhanced Real Time Clock in Device Drivers -> Character Devices. If it's already checked... then, well.. I can't really think what might be the cause of this (I don't experience it..)

----------

## pilla

has anybody tested 2.6.11 final?

----------

## timmy

 *pilla wrote:*   

> has anybody tested 2.6.11 final?

 

Yes, and I'm still seeing this problem with both ntp and openntpd on my nforce2 / Athlon 2500+ machine.

I'm just waiting for 2005.0 before installing on my new Dell Dimension 5000 Pentium4. It'll be interesting to see if it also displays the same problem.

Tim

----------

## timmy

I've managed to get ntp working on my nforce2 machine: It seems my clock was running just too fast for ntp or openntpd to work.

The answer was to use /usr/bin/tickadj to reduce the tick (to 9935 in my case) as described here.

There's a post on gentoo forums here that has a patch for /etc/init.d/ntpd that will automatically apply and save the adjusted tick.
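For readers wondering where a value like 9935 comes from: the tick is the number of microseconds the kernel adds to the clock per timer tick, nominally 10000. If the clock gains a known number of seconds per day, scaling the tick down by the same ratio should cancel the gain. A rough Python sketch, where the function names are made up for illustration and the nominal value of 10000 is an assumption about the default:

```python
NOMINAL_TICK_US = 10000   # assumed default tick (100 ticks per second)
SECONDS_PER_DAY = 86400


def suggested_tick(gain_s_per_day, nominal=NOMINAL_TICK_US):
    """Tick that would cancel a clock gaining `gain_s_per_day` seconds a day.

    A negative gain means the clock runs slow.
    """
    return round(nominal * SECONDS_PER_DAY / (SECONDS_PER_DAY + gain_s_per_day))


def implied_gain(tick, nominal=NOMINAL_TICK_US):
    """Seconds per day the uncorrected clock was gaining if `tick` fixed it."""
    return SECONDS_PER_DAY * (nominal / tick - 1)


print(suggested_tick(20 * 60))   # clock gaining 20 minutes a day
print(implied_gain(9935))        # drift implied by a tick of 9935
```

By this estimate a tick of 9935 corresponds to a clock that was gaining a bit over nine minutes a day. The result is only a starting point; ntpd still has to trim the remainder.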

Tim

----------

## energy

Finally! Thanks to timmy I got it working using tickadj.

Right value for me seems to be 9960.

But I really would like to know why this problem appears only when moving to a higher kernel version than 2.6.9? What did the developers break this time?  :Smile: 

----------

## kallamej

One of the problems with newer kernels is explained here. Tried to switch to openntpd, but it seems to just pick a random time stamp to send to the servers and doesn't sync either as described here, for instance. Back to 2.6.7...

Edit: The random time stamps are by design.

----------

## pilla

Now that I am running 2.6.11-r2 the problem with the weird clocks is not showing anymore. Give it a try.

----------

## kallamej

 *pilla wrote:*   

> Now that I am running 2.6.11-r2 the problem with the weird clocks is not showing anymore. Give it a try.

 

Which sources? I have a speeding clock with hardened-dev-sources-2.6.10/11-rx and non-syncing (open)ntpd. Everything is perfect with h-d-s-2.6.7-rx.

----------

## pilla

 *kallamej wrote:*   

>  *pilla wrote:*   Now that I am running 2.6.11-r2 the problem with the weird clocks is not showing anymore. Give it a try. 
> 
> Which sources? I have a speeding clock with hardened-dev-sources-2.6.10/11-rx and non-syncing (open)ntpd. Everything is perfect with h-d-s-2.6.7-rx.

 

Oh sorry, forgot to say it's gentoo-sources.

vanilla-sources didn't do, but I haven't bothered to look at the patches to see if any of them was geared towards this specific problem.

----------

## kallamej

Same behaviour with 2.6.11-gentoo-r3.  :Sad:  Openntpd doesn't sync under 2.6.7 either, unless it's supposed to take over an hour before it does anything. Will have a look at tickadj later.

Edit (much later): Well, added iburst to my server lines, and now at least it has synced. Let's see in a couple of hours if it can discipline the clock to stop the time drift as well.

Edit2: Too much drift for it to be able to discipline the clock. A tickadj of 9958 seems about right for me. This is on an Nforce2 board as well.

----------

## timmy

Just for info: I installed 2005.0 on my Dell D5000 Pentium4 over the weekend, and ntp is quite happy to sync without using tickadj.

My nforce2 PC needed a tickadj of 9929.

----------

## spider-man

I was having the same problem then I read this.

This part was particularly relevant:  *Quote:*   

> If you're using a 2.6 series kernel, make sure you have it compiled with CONFIG_SECURITY_CAPABILITIES, otherwise the root dropping will fail. If you used menuconfig or equivalent to generate your .config, CONFIG_SECURITY_CAPABILITIES will not appear unless you enabled security models, which adds CONFIG_SECURITY=yes and half a dozen sub-options, one of which is CONFIG_SECURITY_CAPABILITIES.

 

I hope this works for you

----------

## kallamej

I think the nforce2 problem may be related to this, but I haven't rebooted into 2.6.7 since my last post in this thread so I can't verify it right now. I do know, though, that some kernel versions ago the timer was in XT-PIC mode and not IO-APIC as it is with 2.6.11.

----------

## drescherjm

I just wanted to say my problem was finally fixed by deleting /etc/adjtime and setting the clock to local.

----------

## energy

 *kallamej wrote:*   

> I think the nforce2 problem may be related to this, but I haven't rebooted into 2.6.7 since my last post in this thread so I can't verify it right now. I do know, though, that some kernel versions ago the timer was in XT-PIC mode and not IO-APIC as it is with 2.6.11.

 

That link cleared things a bit, my problem seems to be IO-APIC. Is there any way to disable IO-APIC taking over the timer or do I have to disable IO-APIC from kernel and recompile without it  :Question: 

----------

## kallamej

 *kallamej wrote:*   

> I think the nforce2 problem may be related to this

 

Bumping just to confirm that disabling FSB spread spectrum has stabilised my clock. I finally remembered to check the BIOS with the last kernel upgrade.

----------

## whkinney

 *spider-man wrote:*   

> I was having the same problem then I read this.
> 
> This part was particularly relevant:  *Quote:*   If you're using a 2.6 series kernel, make sure you have it compiled with CONFIG_SECURITY_CAPABILITIES, otherwise the root dropping will fail. If you used menuconfig or equivalent to generate your .config, CONFIG_SECURITY_CAPABILITIES will not appear unless you enabled security models, which adds CONFIG_SECURITY=yes and half a dozen sub-options, one of which is CONFIG_SECURITY_CAPABILITIES. 
> 
> I hope this works for you

 

I am having problems with ntpd (kernel-2.6.11-gentoo-r9) similar to those reported here: the ntp daemon does not sync correctly with the server, and the clock drifts off the correct time rapidly. The fix described above did not solve the problem, nor did deleting /etc/adjtime. The problem occurs both on a system configured to sync to a server on my local network:

```
restrict 127.0.0.1 nomodify
server 192.168.1.1
peer 192.168.1.2
driftfile /var/lib/ntp/ntp.drift
```

as well as a system configured to sync to pool.ntp.org:

```
restrict 127.0.0.1 nomodify
server 0.us.pool.ntp.org
server 1.us.pool.ntp.org
server 2.us.pool.ntp.org
driftfile /var/lib/ntp/ntp.drift
```

In the example above where I am syncing to the local network, the machine at 192.168.1.2 is running vanilla-sources 2.4.27, and works perfectly with the following ntp.conf file:

```
restrict 127.0.0.1 nomodify
server 192.168.1.1
driftfile /var/lib/ntp/ntp.drift
```

-- Will

----------

## kallamej

@whkinney: Do the other suggestions in this thread help? Is it an nforce2 board?

----------

## whkinney

 *kallamej wrote:*   

> @whkinney: Do the other suggestions in this thread help? Is it an nforce2 board?

 

If the other suggestions had helped, I wouldn't have posted  :Wink: . I have not messed with tickadj, the primary reason being that ntpd previously worked flawlessly on the machines in question. Besides, isn't the whole point of ntpd to compensate for local clock drift??

One of the machines with the problem (a dual-Xeon workstation) has a Supermicro board, and the other (a Dell Inspiron 5100 laptop), I'm not sure. 

-- Will

----------

## kallamej

Well, ntpd can only do so much. If the drift is more than, iirc, 512 ppm you have to do something more or accept that the clock will have to be reset often. How large is your drift? Do you get it to sync at all? What does ntpq -p say?
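To put numbers on that: drift in ppm is just the gained time divided by the elapsed time, times a million. A small Python sketch, assuming the 512 ppm figure recalled above as the limit (the names `drift_ppm` and `within_limit` are made up for illustration):

```python
LIMIT_PPM = 512  # figure recalled ("iirc") in the post above; treat as approximate


def drift_ppm(gained_seconds, elapsed_seconds):
    """Clock drift in parts per million."""
    return gained_seconds / elapsed_seconds * 1e6


def within_limit(ppm, limit=LIMIT_PPM):
    """True if ntpd could plausibly correct this drift by itself."""
    return abs(ppm) <= limit


# The "+20 minutes after a day" reported earlier in the thread
ppm = drift_ppm(20 * 60, 24 * 3600)
print(round(ppm), within_limit(ppm))
```

A gain of 20 minutes a day works out to well over ten thousand ppm, far outside anything a limit of a few hundred ppm can absorb, which is why a coarse fix such as tickadj comes first.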

----------

## whkinney

 *kallamej wrote:*   

> Well, ntpd can only do so much. If the drift is more than, iirc, 512 ppm you have to do something more or accept that the clock will have to be reset often. How large is your drift? Do you get it to sync at all? What does ntpq -p say?

 

ntpd worked fine until very recently -- unless I had a simultaneous hardware failure on two completely different systems, I doubt it is due to excessive drift on the local clock. My ntp.drift file on the workstation is:

```
>more /var/lib/ntp/ntp.drift
322.767
```

and ntpq -p gives:

```
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 mx1.gs.washingt 140.142.1.8      3 u   16   64   77   62.945  1282.40 1216.17
 ftp.cerias.purd 128.10.252.6     2 u   16   64   77   18.886  1249.96 1241.99
 ns2.dns.pciwest 207.126.98.204   2 u   16   64   77   97.666  2713.72 1011.73
```

A comparison to the time signal on www.time.gov indicates the system clock is currently running about three seconds slow. Until recently, I was getting accuracy on the level of milliseconds to hundredths of a second. At various (hard to predict or duplicate) times, I have found my clock off by a number of minutes. 

-- Will

----------

## kallamej

Since there are no +'es or a * next to any of the servers you're trying to sync against, your ntpd has not been able to sync. Have you tried to add iburst (server 0.us.pool.ntp.org iburst)? Set the clock with ntpdate first (you can use the ntp-client init script) and monitor /var/log/ntpd.log.
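The first character of each peer row in `ntpq -p` output is the tally code, and the `reach` column is an octal bit register of the last eight polls (377 means all eight were answered). A minimal Python sketch of reading both, using the table posted above as input; `parse_peers` and `is_synced` are made-up helper names, and only the most common tally codes are listed:

```python
TALLY = {
    "*": "system peer (the one ntpd is synced to)",
    "+": "candidate",
    "-": "outlier",
    "x": "falseticker",
    " ": "rejected",
}


def parse_peers(ntpq_output):
    """Parse the peer rows of `ntpq -p` output into dictionaries."""
    peers, seen_separator = [], False
    for line in ntpq_output.splitlines():
        if line.startswith("="):
            seen_separator = True
            continue
        if not seen_separator or not line.strip():
            continue
        fields = line[1:].split()
        # Skip repeated headers, prompts and anything else that is not a peer row
        if len(fields) < 10 or not fields[6].isdigit():
            continue
        reach = int(fields[6], 8)  # reach is printed in octal
        peers.append({
            "tally": line[0],
            "remote": fields[0],
            "answered_of_last_8": bin(reach).count("1"),
            "offset_ms": float(fields[8]),
        })
    return peers


def is_synced(ntpq_output):
    """True if some peer carries the '*' tally code."""
    return any(p["tally"] == "*" for p in parse_peers(ntpq_output))


sample = "\n".join([
    "     remote           refid      st t when poll reach   delay   offset  jitter",
    "==============================================================================",
    " mx1.gs.washingt 140.142.1.8      3 u   16   64   77   62.945  1282.40 1216.17",
    " ftp.cerias.purd 128.10.252.6     2 u   16   64   77   18.886  1249.96 1241.99",
    " ns2.dns.pciwest 207.126.98.204   2 u   16   64   77   97.666  2713.72 1011.73",
])
for peer in parse_peers(sample):
    print(peer["remote"], TALLY.get(peer["tally"], "other"), peer["offset_ms"])
print("synced:", is_synced(sample))
```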

----------

## whkinney

 *kallamej wrote:*   

> Since there are no +'es or a * next to any of the servers you're trying to sync against, your ntpd has not been able to sync. Have you tried to add iburst (server 0.us.pool.ntp.org iburst)? Set the clock with ntpdate first (you can use the ntp-client init script) and monitor /var/log/ntpd.log.

 

The +'es and *'s come and go, so it appears that ntpd syncs and then loses sync. For example, here are two peer queries a few minutes apart:

```
ntpq
ntpq> pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+blade.avnf.com  140.221.8.88     2 u   20   64  377   39.063  2242.92 514.338
+gabe.kjsl.com   207.200.81.113   2 u   17   64  377   68.100  1828.05 373.904
*ftp.cerias.purd 128.10.252.7     2 u   24   64  377   18.897  1434.74 619.933
ntpq> pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 blade.avnf.com  140.221.8.88     2 u    4   64    1  102.176  671.631   0.008
 gabe.kjsl.com   207.200.81.113   2 u    3   64    1   68.474  688.672   0.008
 ftp.cerias.purd 128.10.252.7     2 u    6   64    1   18.971  689.128   0.008
```

The basic behavior is that my clock wanders around like a drunken sailor, and I lose and regain synchronization to the servers in a way that I can't figure out. 

Here is my ntpd log file since the last time I started the daemon a couple of hours ago:

```
Jun 20 13:52:52 trogdor ntpd[9112]: ntpd 4.2.0a@1.1190-r Sat Jun 11 09:29:13 EDT 2005 (1)
Jun 20 13:52:52 trogdor ntpd[9112]: precision = 8.000 usec
Jun 20 13:52:52 trogdor ntpd[9112]: Listening on interface wildcard, 0.0.0.0#123
Jun 20 13:52:52 trogdor ntpd[9112]: Listening on interface lo, 127.0.0.1#123
Jun 20 13:52:52 trogdor ntpd[9112]: Listening on interface eth0, 128.205.65.87#123
Jun 20 13:52:52 trogdor ntpd[9112]: kernel time sync status 0040
Jun 20 13:52:53 trogdor ntpd[9112]: frequency initialized 322.767 PPM from /var/lib/ntp/ntp.drift
Jun 20 13:57:13 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 13:57:15 trogdor ntpd[9112]: time reset +2.385548 s
Jun 20 13:57:15 trogdor ntpd[9112]: kernel time sync disabled 0041
Jun 20 14:02:32 trogdor ntpd[9112]: synchronized to 128.95.231.7, stratum 3
Jun 20 14:02:33 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 14:09:06 trogdor ntpd[9112]: synchronized to 64.5.0.129, stratum 2
Jun 20 14:10:11 trogdor ntpd[9112]: synchronized to 128.95.231.7, stratum 3
Jun 20 14:12:16 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 14:12:18 trogdor ntpd[9112]: time reset +2.284929 s
Jun 20 14:17:40 trogdor ntpd[9112]: synchronized to 64.5.0.129, stratum 2
Jun 20 14:23:03 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 14:25:12 trogdor ntpd[9112]: synchronized to 128.95.231.7, stratum 3
Jun 20 14:26:18 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 14:27:21 trogdor ntpd[9112]: synchronized to 128.95.231.7, stratum 3
Jun 20 14:30:38 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 14:34:49 trogdor ntpd[9112]: synchronized to 64.5.0.129, stratum 2
Jun 20 14:34:54 trogdor ntpd[9112]: time reset +5.246261 s
Jun 20 14:44:39 trogdor ntpd[9112]: synchronized to 64.5.0.129, stratum 2
Jun 20 14:51:05 trogdor ntpd[9112]: time reset +3.260086 s
Jun 20 14:56:25 trogdor ntpd[9112]: synchronized to 64.5.0.129, stratum 2
Jun 20 15:02:48 trogdor ntpd[9112]: synchronized to 128.95.231.7, stratum 3
Jun 20 15:03:02 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 15:06:14 trogdor ntpd[9112]: synchronized to 64.5.0.129, stratum 2
Jun 20 15:06:21 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 15:11:40 trogdor ntpd[9112]: time reset +1.227823 s
Jun 20 15:17:02 trogdor ntpd[9112]: synchronized to 128.10.252.10, stratum 2
Jun 20 15:27:46 trogdor ntpd[9112]: synchronized to 64.5.0.129, stratum 2
Jun 20 15:27:50 trogdor ntpd[9112]: time reset +0.548744 s
Jun 20 15:33:11 trogdor ntpd[9112]: synchronized to 64.5.0.129, stratum 2
```

Note that this appears to be converging on a synchronized time quite nicely, but it ultimately never settles down to an accurate clock sync. 

A similar behavior occurs on my Dell laptop connected to a local subnet, syncing to a local server 192.168.1.1. In this case, there ought not to be any network issues, but ntpd will sync to the local server for a while, then lose sync and drift off again.

----------

## kallamej

Time resets every twenty minutes are far too frequent for ntpd to be able to discipline the clock. It typically needs hours without resets to do it properly. Have you tried increasing the tickadj slightly? Are you losing interrupts? Tried disabling apic and acpi? I don't have many more ideas I'm afraid.
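The spacing of the resets can be read straight off the ntpd log. A short Python sketch that pulls the `time reset` lines out of a log and reports the minutes between them; the function name `reset_intervals` is made up, and the year has to be supplied because syslog-style timestamps do not carry one:

```python
from datetime import datetime


def reset_intervals(log_text, year=2005):
    """Minutes between consecutive 'time reset' entries in an ntpd log."""
    times = []
    for line in log_text.splitlines():
        if "time reset" in line:
            stamp = " ".join(line.split()[:3])  # e.g. "Jun 20 13:57:15"
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


# A few lines copied from the log earlier in the thread
log = """\
Jun 20 13:57:15 trogdor ntpd[9112]: time reset +2.385548 s
Jun 20 14:02:32 trogdor ntpd[9112]: synchronized to 128.95.231.7, stratum 3
Jun 20 14:12:18 trogdor ntpd[9112]: time reset +2.284929 s
Jun 20 14:34:54 trogdor ntpd[9112]: time reset +5.246261 s
"""
for minutes in reset_intervals(log):
    print(f"{minutes:.1f} min between resets")
```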

----------

## whkinney

 *kallamej wrote:*   

> Time resets every twenty minutes are far too frequent for ntpd to be able to discipline the clock. It typically needs hours without resets to do it properly. Have you tried increasing the tickadj slightly? Are you losing interrupts? Tried disabling apic and acpi? I don't have many more ideas I'm afraid.

 

As I said, I haven't (yet) messed with tickadj. ACPI is already disabled. I don't know how to dis/enable apic, nor how to tell if I'm losing interrupts, I'm afraid. Let me stress that similar behavior is occurring on completely different hardware in completely different network environments. 

I took your earlier suggestion and added iburst to the server lines and restarted the ntp system. Here is the config file:

```
restrict 127.0.0.1
server 0.us.pool.ntp.org iburst
server 1.us.pool.ntp.org iburst
server 2.us.pool.ntp.org iburst
driftfile /var/lib/ntp/ntp.drift
```

Here is a transcript of the command line:

```
[root@trogdor:ntpd] /etc/init.d/ntpd stop
 * Stopping ntpd ...                                                     [ ok ]
[root@trogdor:ntpd] ntpdate time.nist.gov
20 Jun 15:44:10 ntpdate[11518]: step time server 192.43.244.18 offset 1.355353sec
[root@trogdor:ntpd] /etc/init.d/ntpd start
 * Starting ntpd ...                                                     [ ok ]
[root@trogdor:ntpd] ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ns3.dns.pciwest 132.163.4.103    2 u    6   64    1   83.482  105.809  38.095
+ftp.cerias.purd 128.10.252.7     2 u    5   64    1   18.688  129.580  41.351
+familjen.svenss 128.252.19.1     2 u    4   64    1   43.552  112.591  31.295
[root@trogdor:ntpd] ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ns3.dns.pciwest 132.163.4.103    2 u   25   64  377   83.714  210.753 327.655
+ftp.cerias.purd 128.10.252.7     2 u   24   64  377   18.934  233.092 321.549
xfamiljen.svenss 128.252.19.1     2 u   22   64  377   42.849  1077.43 818.306
[root@trogdor:ntpd] date; ntpq -p
Mon Jun 20 15:53:12 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ns3.dns.pciwest 132.163.4.103    2 u   14   64  377   83.714  210.753 468.686
+ftp.cerias.purd 128.10.252.7     2 u   13   64  377   18.934  233.092 451.465
xfamiljen.svenss 128.252.19.1     2 u   10   64  377   42.849  1077.43 736.406
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 15:58:33 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ns3.dns.pciwest 132.163.4.103    2 u   10   64  377   83.935  266.688 914.679
xftp.cerias.purd 128.10.252.7     2 u    8   64  377   19.646  1075.08 332.784
*familjen.svenss 128.252.19.1     2 u    8   64  377   42.849  1077.43 159.719
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:05:18 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ns3.dns.pciwest 132.163.4.103    2 u   29   64  377   89.496  1508.85 155.289
+ftp.cerias.purd 128.10.252.6     2 u   32   64  377   20.052  1514.71 150.063
*familjen.svenss 192.43.244.18    2 u   26   64  377   43.534  1426.47  87.719
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:08:16 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ns3.dns.pciwest .STEP.          16 u  301   64    0    0.000    0.000 4000.00
 ftp.cerias.purd .STEP.          16 u  242   64    0    0.000    0.000 4000.00
 familjen.svenss .STEP.          16 u  160   64    0    0.000    0.000 4000.00
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:09:06 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ns3.dns.pciwest .STEP.          16 u  351   64    0    0.000    0.000 4000.00
 ftp.cerias.purd .STEP.          16 u  292   64    0    0.000    0.000 4000.00
 familjen.svenss .STEP.          16 u  210   64    0    0.000    0.000 4000.00
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:09:43 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ns3.dns.pciwest 132.163.4.103    2 u   14   64    1   87.303  384.012  21.766
*ftp.cerias.purd 128.10.252.7     2 u   16   64    1   19.385  381.690  14.949
+familjen.svenss 128.252.19.1     2 u   15   64    1   43.474  380.606  14.946
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:14:39 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ns3.dns.pciwest 132.163.4.103    2 u   65   64   37   83.911  577.851 159.423
*ftp.cerias.purd 128.10.252.7     2 u    5   64   77   19.385  381.690 135.139
+familjen.svenss 128.252.19.1     2 u    6   64   77   42.841  620.616 188.187
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:18:20 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ns3.dns.pciwest 132.163.4.103    2 u   24   64  377   83.911  577.851  98.682
*ftp.cerias.purd 128.10.252.7     2 u   31   64  377   19.324  664.101 124.088
+familjen.svenss 128.252.19.1     2 u   31   64  377   42.841  620.616 123.169
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:20:25 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ns3.dns.pciwest 132.163.4.103    2 u   19   64  377   83.911  577.851 604.041
*ftp.cerias.purd 128.10.252.7     2 u   29   64  377   19.171  767.938 381.796
+familjen.svenss 128.252.19.1     2 u   28   64  377   42.841  620.616 429.104
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:22:27 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ns3.dns.pciwest 132.163.4.103    2 u   12   64  377   83.674  2271.18 1318.98
 ftp.cerias.purd 128.10.252.6     2 u   24   64  377   19.056  2317.71 1389.98
 familjen.svenss 192.43.244.18    2 u   20   64  377   42.841  620.616 993.216
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:23:31 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ns3.dns.pciwest 132.163.4.103    2 u   11   64  377   83.674  2271.18 1161.04
 ftp.cerias.purd 128.10.252.6     2 u   23   64  377   19.056  2317.71 1229.47
 familjen.svenss 192.43.244.18    2 u   20   64  377   42.267  2363.13 1257.50
[root@trogdor:ntpd]
[root@trogdor:ntpd] !!
date; ntpq -p
Mon Jun 20 16:27:11 EDT 2005
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ns3.dns.pciwest 132.163.4.103    2 u   35   64    1   83.609  178.711   1.456
*ftp.cerias.purd 128.10.252.7     2 u   32   64    1   18.719  180.737   1.088
+familjen.svenss 128.252.19.1     2 u   32   64    1   42.146  183.201   4.733
[root@trogdor:ntpd]
```

And here is the ntpd log for the same time period:

```
Jun 20 15:43:52 trogdor ntpd[9112]: ntpd exiting on signal 15
Jun 20 15:44:18 trogdor ntpd[11578]: ntpd 4.2.0a@1.1190-r Sat Jun 11 09:29:13 EDT 2005 (1)
Jun 20 15:44:18 trogdor ntpd[11578]: precision = 7.000 usec
Jun 20 15:44:18 trogdor ntpd[11578]: Listening on interface wildcard, 0.0.0.0#123
Jun 20 15:44:18 trogdor ntpd[11578]: Listening on interface lo, 127.0.0.1#123
Jun 20 15:44:18 trogdor ntpd[11578]: Listening on interface eth0, 128.205.65.87#123
Jun 20 15:44:18 trogdor ntpd[11578]: kernel time sync status 0040
Jun 20 15:44:20 trogdor ntpd[11578]: frequency initialized 322.767 PPM from /var/lib/ntp/ntp.drift
Jun 20 15:44:29 trogdor ntpd[11578]: synchronized to 64.5.1.130, stratum 2
Jun 20 15:44:29 trogdor ntpd[11578]: kernel time sync disabled 0041
Jun 20 15:45:26 trogdor ntpd[11578]: kernel time sync enabled 0001
Jun 20 15:58:23 trogdor ntpd[11578]: synchronized to 64.32.179.58, stratum 2
Jun 20 16:08:07 trogdor ntpd[11578]: synchronized to 128.10.252.10, stratum 2
Jun 20 16:08:09 trogdor ntpd[11578]: time reset +1.512477 s
Jun 20 16:09:29 trogdor ntpd[11578]: synchronized to 128.10.252.10, stratum 2
Jun 20 16:22:03 trogdor ntpd[11578]: synchronized to 64.32.179.58, stratum 2
Jun 20 16:22:07 trogdor ntpd[11578]: no servers reachable
Jun 20 16:25:16 trogdor ntpd[11578]: synchronized to 128.10.252.10, stratum 2
Jun 20 16:25:19 trogdor ntpd[11578]: time reset +2.435262 s
Jun 20 16:26:38 trogdor ntpd[11578]: synchronized to 128.10.252.10, stratum 2
```

Very strange, no?

-- Will

----------

## kallamej

The only thing iburst does is to hammer the time servers a bit initially when ntpd starts and after each time reset to speed up the syncing. You can check the interrupts with cat /proc/interrupts; you don't want lots of ERRs. If you have apic enabled in your kernel, you can boot with the noapic boot parameter to turn it off.
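For anyone who would rather script the check than eyeball it, here is a rough Python sketch that extracts the ERR count (and MIS, where present) from the text of /proc/interrupts. The helper name `interrupt_error_counts` and the sample text are made up for illustration; real output has more rows and columns.

```python
def interrupt_error_counts(proc_interrupts_text):
    """Return the ERR and MIS counters found in /proc/interrupts text."""
    counts = {}
    for line in proc_interrupts_text.splitlines():
        label, _, rest = line.partition(":")
        label = label.strip()
        if label in ("ERR", "MIS"):
            fields = rest.split()
            if fields and fields[0].isdigit():
                counts[label] = int(fields[0])
    return counts


# Illustrative sample, not taken from any machine in this thread
sample = """\
           CPU0
  0:   12345678    IO-APIC-edge  timer
NMI:          0
ERR:          0
MIS:          0
"""
print(interrupt_error_counts(sample))

# On a live system the text would come from the file itself:
#   with open("/proc/interrupts") as f:
#       print(interrupt_error_counts(f.read()))
```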

----------

## whkinney

 *kallamej wrote:*   

> The only thing iburst does is to hammer the time servers a bit initially when ntpd starts and after each time reset to speed up the syncing. You can check the interrupts with cat /proc/interrupts; you don't want lots of ERRs. If you have apic enabled in your kernel, you can boot with the noapic boot parameter to turn it off.

 

Zero ERRs in /proc/interrupts. 

AHA! Restarting with the "boot=noapic" option appears to fix the problem on both affected machines! The ntpd log file is quiet, and the offsets are not spinning off into outer space within minutes. I'll keep an eye on the server for a day or two to make sure it stays sync'ed, and report back.

I am assuming that this is not the way apic is supposed to work...

-- Will

----------

## whkinney

 *whkinney wrote:*   

> 
> 
> AHA! Restarting with the "boot=noapic" option appears to fix the problem on both affected machines! The ntpd log file is quiet, and the offsets are not spinning off into outer space within minutes. I'll keep an eye on the server for a day or two to make sure it stays sync'ed, and report back.
> 
> 

 

Turning off apic did NOT solve the problem. Things seemed stable at first, but then the huge clock offsets returned. Likewise, playing with tickadj did nothing. 

However, I have recompiled the kernels on both machines with "Power Management Timer Support" enabled under ACPI, and both boxes are now running ntpd in a very stable manner, with clock offsets on the order of 10 ms. Looks like this fixed the problem. 

-- Will

----------

## czhang

Yes, I can confirm that enabling "Power Management Timer Support" and "Security options - Default Linux Capabilities" makes openntpd sync with pool.ntp.org again.

Also, on an AMD64 machine, the "Power Management Timer Support" is not available in the kernel's menuconfig.
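A quick way to see whether a kernel .config has the two options enabled is to scan it for the matching symbols. The Python sketch below does that. `CONFIG_SECURITY_CAPABILITIES` is the name quoted earlier in the thread; `CONFIG_X86_PM_TIMER` is my assumption for the symbol behind "Power Management Timer Support", so verify it against your own kernel tree. The helper names are made up.

```python
WANTED = {
    "CONFIG_X86_PM_TIMER": "Power Management Timer Support (assumed symbol name)",
    "CONFIG_SECURITY_CAPABILITIES": "Default Linux Capabilities",
}


def enabled_options(config_text):
    """Names of CONFIG_* options set to y or m in a kernel .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            if value in ("y", "m"):
                enabled.add(name)
    return enabled


def missing_options(config_text, wanted=WANTED):
    """Wanted options that are not enabled, in the order listed in `wanted`."""
    on = enabled_options(config_text)
    return [name for name in wanted if name not in on]


# Illustrative .config fragment, not from any machine in this thread
sample = """\
CONFIG_ACPI=y
# CONFIG_X86_PM_TIMER is not set
CONFIG_SECURITY=y
CONFIG_SECURITY_CAPABILITIES=y
"""
for name in missing_options(sample):
    print("not enabled:", name, "-", WANTED[name])
```

On a machine where the timer option is not offered at all, as reported for AMD64 above, the symbol will simply show up as missing.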

 *whkinney wrote:*   

> 
> 
> However, I have recompiled the kernels on both machines with "Power Management Timer Support" enabled under ACPI, and both boxes are now running ntpd in a very stable manner, with clock offsets on the order of 10 ms. Looks like this fixed the problem. 
> 
> -- Will

 

----------

