# kernel crash - Clocksource tsc unstable

## nanos

Hello everyone!

A test server keeps causing problems from time to time.

For example, while I am working on it over ssh, everything suddenly freezes, and I can only access it again after about an hour.

Even then everything runs as if in slow motion, many things no longer work, and the date is wrong as well.

I extracted the following from the log file /var/log/messages:

```
Mar 16 12:59:38 srv60 Clocksource tsc unstable (delta = 2011914924989 ns)

Mar 16 12:59:38 srv60 ------------[ cut here ]------------

Mar 16 12:59:38 srv60 WARNING: at kernel/hrtimer.c:445 hrtimer_reprogram+0x48/0x9b()

Mar 16 12:59:38 srv60 Modules linked in:

Mar 16 12:59:38 srv60 ------------[ cut here ]------------

Mar 16 12:59:38 srv60 Pid: 4345, comm: lighttpd Not tainted 2.6.25-gentoo-r7 #1

Mar 16 12:59:38 srv60 WARNING: at kernel/hrtimer.c:445 hrtimer_reprogram+0x48/0x9b()

Mar 16 12:59:38 srv60 Modules linked in:

Mar 16 12:59:38 srv60 [<c011d6cf>] Pid: 1, comm: init Not tainted 2.6.25-gentoo-r7 #1

Mar 16 12:59:38 srv60 warn_on_slowpath+0x40/0x4f

Mar 16 12:59:38 srv60 [<c0143748>] get_page_from_freelist+0x69/0x339

Mar 16 12:59:38 srv60 [<c01eacbc>] __next_cpu+0x12/0x21

Mar 16 12:59:38 srv60 [<c01170ea>] find_busiest_group+0x210/0x60a

Mar 16 12:59:38 srv60 [<c012ee3a>] hrtimer_reprogram+0x48/0x9b

Mar 16 12:59:38 srv60 [<c012eeee>] enqueue_hrtimer+0x61/0xd4

Mar 16 12:59:38 srv60 [<c012f4a6>] hrtimer_start+0xe5/0x10f

Mar 16 12:59:38 srv60 [<c0119898>] hrtick_set+0x8f/0xd8

Mar 16 12:59:38 srv60 [<c03a6874>] schedule+0x56f/0x5a1

Mar 16 12:59:38 srv60 [<c03a6b7a>] schedule_timeout+0x6b/0x86

Mar 16 12:59:38 srv60 [<c0124ac5>] process_timeout+0x0/0x5

Mar 16 12:59:38 srv60 [<c03a6b75>] schedule_timeout+0x66/0x86

Mar 16 12:59:38 srv60 [<c01665c5>] do_sys_poll+0x21b/0x2df

Mar 16 12:59:38 srv60 [<c0166f6d>] __pollwait+0x0/0xac

Mar 16 12:59:38 srv60 [<c011d6cf>]  [<c0118138>] default_wake_function+0x0/0x8

Mar 16 12:59:38 srv60 [<c0118138>] default_wake_function+0x0/0x8

Mar 16 12:59:38 srv60 [<c0118138>] default_wake_function+0x0/0x8

Mar 16 12:59:38 srv60 [<c0191ebb>] __ext3_get_inode_loc+0x104/0x2bb

Mar 16 12:59:38 srv60 [<c019c113>] __ext3_journal_dirty_metadata+0x13/0x32

Mar 16 12:59:38 srv60 [<c0191d51>] ext3_mark_iloc_dirty+0x27b/0x2e1

Mar 16 12:59:38 srv60 [<c0117e1a>] __wake_up+0x29/0x39

Mar 16 12:59:38 srv60 warn_on_slowpath+0x40/0x4f

Mar 16 12:59:38 srv60 [<c019e959>] journal_stop+0x144/0x14d

Mar 16 12:59:38 srv60 [<c019884e>] __ext3_journal_stop+0x19/0x34

Mar 16 12:59:38 srv60 [<c0193cd5>] ext3_ordered_write_end+0xcd/0xfd

Mar 16 12:59:38 srv60 [<c02564b6>] e1000_xmit_frame+0x99c/0x9e8

Mar 16 12:59:38 srv60 [<c03a7b7c>] _spin_lock_bh+0x8/0x1e

Mar 16 12:59:38 srv60 [<c03a7b7c>] _spin_lock_bh+0x8/0x1e

Mar 16 12:59:38 srv60 [<c031cb18>] dev_hard_start_xmit+0x1f1/0x256

Mar 16 12:59:38 srv60 [<c032881e>] __qdisc_run+0xca/0x17a

Mar 16 12:59:38 srv60 [<c031ecb2>]  [<c01eacbc>] dev_queue_xmit+0x254/0x27a

Mar 16 12:59:38 srv60 [<c03330ee>] __next_cpu+0x12/0x21

Mar 16 12:59:38 srv60 [<c01170ea>] ip_finish_output+0x1c3/0x1f9

Mar 16 12:59:38 srv60 [<c0331d04>] ip_local_out+0x15/0x17

Mar 16 12:59:38 srv60 find_busiest_group+0x210/0x60a

Mar 16 12:59:38 srv60 [<c0116497>]  [<c0333714>] ip_queue_xmit+0x259/0x29c

Mar 16 12:59:38 srv60 can_migrate_task+0x3b/0x59

Mar 16 12:59:38 srv60 [<c012ee3a>]  [<c0124e9a>] hrtimer_reprogram+0x48/0x9b

Mar 16 12:59:38 srv60 __mod_timer+0x98/0xa2

Mar 16 12:59:38 srv60 [<c03152ae>]  [<c012eeee>] enqueue_hrtimer+0x61/0xd4

Mar 16 12:59:38 srv60 sk_reset_timer+0xc/0x16

Mar 16 12:59:38 srv60 [<c03a7b7c>]  [<c012f4a6>] _spin_lock_bh+0x8/0x1e

Mar 16 12:59:38 srv60 [<c0315281>] hrtimer_start+0xe5/0x10f

Mar 16 12:59:38 srv60 lock_sock_nested+0x88/0x90

Mar 16 12:59:38 srv60 [<c0119898>] hrtick_set+0x8f/0xd8

Mar 16 12:59:38 srv60 [<c016b7d1>]  [<c03a6874>] destroy_inode+0x24/0x33

Mar 16 12:59:38 srv60 [<c0159f5c>] schedule+0x56f/0x5a1

Mar 16 12:59:38 srv60 kmem_cache_free+0x60/0x69

Mar 16 12:59:38 srv60 [<c016b7d1>] destroy_inode+0x24/0x33

Mar 16 12:59:38 srv60 [<c03a6b7a>] schedule_timeout+0x6b/0x86

Mar 16 12:59:38 srv60 [<c0169b4e>]  [<c0124ac5>] d_kill+0x37/0x46

Mar 16 12:59:38 srv60 [<c016a7fa>] dput+0x21/0xc5

Mar 16 12:59:38 srv60 [<c015d3a1>] process_timeout+0x0/0x5

Mar 16 12:59:38 srv60 __fput+0x11e/0x147

Mar 16 12:59:38 srv60 [<c03a6b75>] schedule_timeout+0x66/0x86

Mar 16 12:59:38 srv60 [<c0167434>] sys_poll+0x3b/0x6f

Mar 16 12:59:38 srv60 [<c0166a48>]  [<c01044da>] sysenter_past_esp+0x5f/0x85

Mar 16 12:59:38 srv60 do_select+0x371/0x3c8

Mar 16 12:59:38 srv60 [<c0166f6d>] __pollwait+0x0/0xac

Mar 16 12:59:38 srv60 [<c0118138>] default_wake_function+0x0/0x8

Mar 16 12:59:38 srv60 [<c012f615>] ktime_get_ts+0x11/0x3a

Mar 16 12:59:38 srv60 =======================

Mar 16 12:59:38 srv60 [<c012f64b>] ktime_get+0xd/0x21

Mar 16 12:59:38 srv60 [<c0116ca8>] hrtick_start_fair+0xe1/0x122

Mar 16 12:59:38 srv60 [<c0115bd1>] enqueue_task+0xa/0x14

Mar 16 12:59:38 srv60 [<c011812f>] try_to_wake_up+0xc7/0xd0

Mar 16 12:59:38 srv60 [<c01eacbc>] __next_cpu+0x12/0x21

Mar 16 12:59:38 srv60 [<c01170ea>] find_busiest_group+0x210/0x60a

Mar 16 12:59:38 srv60 [<c01eacbc>] __next_cpu+0x12/0x21

Mar 16 12:59:38 srv60 [<c01eacbc>] __next_cpu+0x12/0x21

Mar 16 12:59:38 srv60 [<c01eacbc>] __next_cpu+0x12/0x21

Mar 16 12:59:38 srv60 [<c01170ea>] find_busiest_group+0x210/0x60a

Mar 16 12:59:38 srv60 [<c012ecf0>] hrtimer_forward+0xe4/0x100

Mar 16 12:59:38 srv60 [<c0118608>] rebalance_domains+0x119/0x32e

Mar 16 12:59:38 srv60 [<c01eacbc>] __next_cpu+0x12/0x21

Mar 16 12:59:38 srv60 [<c01170ea>] find_busiest_group+0x210/0x60a

Mar 16 12:59:38 srv60 ---[ end trace 15c408695b3eea0d ]---

Mar 16 12:59:38 srv60 [<c016a411>] __d_lookup+0x91/0xdf

Mar 16 12:59:38 srv60 [<c0166d44>] core_sys_select+0x2a5/0x2c6

Mar 16 12:59:38 srv60 [<c012ecf0>] hrtimer_forward+0xe4/0x100

Mar 16 12:59:38 srv60 [<c0118608>] rebalance_domains+0x119/0x32e

Mar 16 12:59:38 srv60 [<c016d835>] mntput_no_expire+0x11/0x54

Mar 16 12:59:38 srv60 [<c016406b>] path_walk+0x90/0x98

Mar 16 12:59:38 srv60 [<c01642c0>] do_path_lookup+0x11e/0x139

Mar 16 12:59:38 srv60 [<c01efbd4>] copy_to_user+0x25/0x39

Mar 16 12:59:38 srv60 [<c015ed0c>] cp_new_stat64+0xfc/0x10e

Mar 16 12:59:38 srv60 [<c01670b8>] sys_select+0x9f/0x180

Mar 16 12:59:38 srv60 [<c01044da>] sysenter_past_esp+0x5f/0x85

Mar 16 12:59:38 srv60 =======================

Mar 16 12:59:38 srv60 ---[ end trace 15c408695b3eea0d ]---
```
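
The first log line reports the clock jump as `delta = 2011914924989 ns`, which is about 2012 seconds, or roughly 33.5 minutes. Here is a minimal sketch for pulling that value out of a log and converting it to seconds. The function name `tsc_delta_seconds` is made up for illustration, and the pattern assumes the message format shown in the excerpt above.

```shell
# Print the delta of every "Clocksource tsc unstable" message in seconds.
# Reads the files given as arguments, or stdin if there are none.
tsc_delta_seconds() {
    # Keep only the nanosecond value from matching lines.
    sed -n 's/.*Clocksource tsc unstable (delta = \(-\{0,1\}[0-9][0-9]*\) ns).*/\1/p' "$@" |
    # Convert nanoseconds to seconds with one decimal place.
    LC_ALL=C awk '{ printf "%.1f\n", $1 / 1000000000 }'
}
```

It could be run as `tsc_delta_seconds /var/log/messages` to see how large each reported jump was.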

That was today at 15:00.

Could this crash have something to do with the timer frequency in the kernel?

The message also points to the high-resolution timer.

The whole thing runs as a virtual host on VMware ESXi.
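
To look into the timer question, it may help to check which clocksource the guest is using and how the timer options were set in the kernel config. The sketch below assumes the standard sysfs location for clocksources and the usual Gentoo config path `/usr/src/linux/.config`. Both function names are illustrative and do not come from the thread.

```shell
# Show the active and the available clocksources.
# The directory argument is optional; the default is the standard sysfs path.
show_clocksource() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    if [ -r "$dir/current_clocksource" ]; then
        echo "current: $(cat "$dir/current_clocksource")"
        echo "available: $(cat "$dir/available_clocksource")"
    else
        echo "clocksource sysfs entries not found in $dir" >&2
        return 1
    fi
}

# List the timer-related options (HZ, tickless, high-resolution timers)
# from a kernel config file. The default path is an assumption.
timer_config() {
    config="${1:-/usr/src/linux/.config}"
    grep -E '^(# )?CONFIG_(HZ(_[0-9]+)?|NO_HZ|HIGH_RES_TIMERS)[= ]' "$config"
}
```

Called without arguments, `show_clocksource` and `timer_config` read the default paths. Writing one of the names from `available_clocksource` into `current_clocksource` as root switches the clocksource at runtime; whether that avoids the problem on this guest is not established here.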

----------

## nanos

In most forums where this error is described, each suggested fix is immediately followed by a note that it only works to a limited extent.

Now I have found the following page: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007020

It looks like the only option is to wait for version 2.6.26 :(

----------

## Max Steel

It is already out, see my signature.

It is arch-masked, though.

But that is not a problem:

just run `echo "=sys-kernel/gentoo-sources-2.6.26-r2" >> /etc/portage/package.keywords/kernel` and off you go.
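
Running the `echo` line more than once would add the same entry several times. Here is a small sketch of a variant that only appends the entry if it is missing. The helper name `add_keyword` is hypothetical; the atom and the file path are the ones from the post above.

```shell
# Append a keyword entry to a file unless the exact line is already there.
add_keyword() {
    atom="$1"
    file="$2"
    # package.keywords is used as a directory here, so make sure it exists.
    mkdir -p "$(dirname "$file")"
    # -x: match the whole line, -F: treat the atom as a fixed string.
    grep -qxF -- "$atom" "$file" 2>/dev/null || echo "$atom" >> "$file"
}
```

As root, this would be `add_keyword "=sys-kernel/gentoo-sources-2.6.26-r2" /etc/portage/package.keywords/kernel`, followed by `emerge --ask "=sys-kernel/gentoo-sources-2.6.26-r2"`. On newer Portage versions the directory is called `package.accept_keywords`.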

----------

