# 2.6.35 is not ready for prime time

## curmudgeon

TWO different machines, identical behavior.

KDM comes up, and everything appears normal. The instant a single key is pressed in the password field, the machine goes dead (screen black, network connections down, does not respond to pings). After rebooting, there is no indication in the logs of what happened.

2.6.34 did not have this problem.

----------

## PaulBredbury

Yeah, but which patchset? 2.6.35.4? Be specific.

----------

## toralf

2.6.35.1-4 runs fine here with KDE 4.4.5 on a 32-bit x86 ThinkPad with an integrated Intel graphics card. Did you test the kernel with a straight xdm login (or even startx)?

----------

## curmudgeon

 *PaulBredbury wrote:*   

> Yeah, but which patchset? 2.6.35.4? Be specific.

 

I meant the stable one, "of course" (implied, but I do admit, not obvious).

----------

## PaulBredbury

 *curmudgeon wrote:*   

> the stable one

 

You're still being ridiculously vague. Gentoo patchset, or vanilla, or what? We're not mind-readers. I use pf, for example.

Also, the stable one today might be different to the stable one tomorrow. So, state a proper version number.

----------

## Naib

werks4me

----------

## curmudgeon

 *PaulBredbury wrote:*   

> Gentoo patchset, or vanilla, or what?

 

gentoo-sources-2.6.35-r4

None of the vanilla-sources on the 2.6.35 series are stable on any architecture.

----------

## Veldrin

neither are gentoo-sources.

----------

## curmudgeon

 *toralf wrote:*   

> Did you test the kernel with a straight xdm login (or even startx)?

 

Hadn't tried that before (and didn't get very far with it). Much to my surprise, the same thing happens at the console - the first key pressed to type in the username borks the system, although this time I do get a very long stack trace (of which I can only see the tail end).

I don't know any way of capturing that, but "cpu_idle" stood out several times.

----------

## curmudgeon

 *Veldrin wrote:*   

> neither are gentoo-sources.

 

What do you call x86 here?

```

$ grep KEYWORDS /usr/portage/sys-kernel/gentoo-sources/gentoo-sources-2.6.35-r4.ebuild 

KEYWORDS="~alpha ~amd64 ~arm ~hppa ~ia64 ~ppc ~ppc64 ~sh ~sparc x86"

```
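For anyone unsure how to read that line: a `~` prefix marks an arch as testing, and no prefix means stable. Here is a minimal sketch that splits a KEYWORDS value accordingly. The function name `classify_keywords` is made up for illustration and is not part of Portage.

```shell
# Split a KEYWORDS value into stable and testing arches.
# classify_keywords is a hypothetical helper, not a Portage tool.
classify_keywords() {
    # $1: the contents of KEYWORDS, e.g. "~amd64 x86"
    stable=""
    testing=""
    for kw in $1; do
        case "$kw" in
            "~"*)
                arch=${kw#"~"}            # drop the leading "~"
                testing="$testing $arch"
                ;;
            -*)
                ;;                        # explicitly disabled arch, ignored here
            *)
                stable="$stable $kw"
                ;;
        esac
    done
    echo "stable:$stable"
    echo "testing:$testing"
}

classify_keywords "~alpha ~amd64 ~arm ~hppa ~ia64 ~ppc ~ppc64 ~sh ~sparc x86"
```

Run on the line above, only x86 should end up in the stable list.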

----------

## Veldrin

yep - missed that one - *shame on me*

----------

## toralf

 *curmudgeon wrote:*   

> the first key pressed to type in the username borks the system, although this time I do get a very long stack trace (of which I can only see the tail end).
> 
> I don't know any way of capturing that, ....

 Take a photo and attach it to a Gentoo Bugzilla report.
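Besides a photo, netconsole may be worth a try: it sends kernel messages over UDP to a second machine, so the full trace might survive even when the screen scrolls. A minimal sketch that only builds the parameter string follows. The helper name `netconsole_param` and every address in it are placeholders, and whether the oops still gets out before the network dies is an open question.

```shell
# Build a netconsole= parameter string.
# Format, as given in the kernel's netconsole documentation:
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# netconsole_param is a hypothetical helper for illustration.
netconsole_param() {
    # $1 source IP, $2 network device, $3 target IP, $4 target MAC, $5 port (optional)
    port=${5:-6666}
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$port" "$1" "$2" "$port" "$3" "$4"
}

# All addresses below are placeholders, not values from this thread.
netconsole_param 192.168.0.10 eth0 192.168.0.2 00:11:22:33:44:55
```

With a non-modular kernel the resulting string would go on the kernel command line (netconsole support has to be built in); on the receiving machine something like a UDP netcat listener on the same port collects the output, though the exact flags differ between netcat variants.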

----------

## Anon-E-moose

I run the zen sources 2.6.35 with X 1.9.0 and it has been rock solid.

What version of X, which video card and driver version?

----------

## curmudgeon

 *Anon-E-moose wrote:*   

> What version of X, which video card and driver version?

 

All stable versions (I only care about it working).

x11-base/xorg-server-1.7.7-r1

Intel Corporation 82Q963/Q965 Integrated Graphics Controller

x11-drivers/xf86-video-intel-2.9.1

----------

## Anon-E-moose

post /var/log/Xorg.0.log please

----------

## curmudgeon

 *toralf wrote:*   

> Take a photo and attach it to a Gentoo Bugzilla report.

 

OK, I tried that (but I had a lot of trouble getting a decent picture). Is this good enough to be useful?

https://bugs.gentoo.org/attachment.cgi?id=245638

----------

## toralf

 *curmudgeon wrote:*   

> Is this good enough to be useful?
> 
> https://bugs.gentoo.org/attachment.cgi?id=245638

 No, I'm afraid.

To give the devs a good starting point, you should probably increase the font size in /etc/conf.d/consolefont and then try again.

----------

## drescherjm

Neither 2.6.35 (gentoo-sources or vanilla-sources) nor 2.6.36-rc (git) will work on my i7 machine. Well, to be exact, it does work; however, it makes my 3.0GHz i7 so slow that typing is like using a 300 baud modem.

https://forums.gentoo.org/viewtopic-t-838842-highlight-.html

And the problem persists with all power management disabled in the bios.

2.6.34 gentoo-sources on the same machine works fine.

----------

## kernelOfTruth

 *toralf wrote:*   

>  *curmudgeon wrote:*   
> 
> > Is this good enough to be useful?
> > 
> > https://bugs.gentoo.org/attachment.cgi?id=245638
> 
> No, I'm afraid.
> 
> To give the devs a good starting point, you should probably increase the font size in /etc/conf.d/consolefont and then try again.

 

++

at least it's somewhat readable:

are both machines laptops?

try disabling wireless LAN

just use google: [click](http://lmgtfy.com/?q=bad%3A+scheduling+from+the+idle+thread!)

unfortunately there's no instant fix for this - try googling some more (gotta go to work now)

----------

## Naib

 *drescherjm wrote:*   

> Neither 2.6.35 (gentoo-sources or vanilla-sources) nor 2.6.36-rc (git) will work on my i7 machine. Well, to be exact, it does work; however, it makes my 3.0GHz i7 so slow that typing is like using a 300 baud modem.
> 
> https://forums.gentoo.org/viewtopic-t-838842-highlight-.html
> 
> And the problem persists with all power management disabled in the bios.
> ...

 

Good to know. My i7 parts are due today/tomorrow, so I need to rebuild the kernel. I will downgrade to 24 for now.

----------

## toralf

 *drescherjm wrote:*   

> Well, to be exact, it does work; however, it makes my 3.0GHz i7 so slow that typing is like using a 300 baud modem.

 Interesting - I observed this on my ThinkPad T400 (Core2 Duo) with an early 2.6.36-rc1 version, but it disappeared with a later release candidate, so I ignored it.

----------

## curmudgeon

 *kernelOfTruth wrote:*   

> are both machines laptops?
> 
> try disabling wireless LAN

 

No, both machines are desktops. One is a P4 and the other is a Core 2.

----------

## Naib

 *drescherjm wrote:*   

> Neither 2.6.35 (gentoo-sources or vanilla-sources) nor 2.6.36-rc (git) will work on my i7 machine. Well, to be exact, it does work; however, it makes my 3.0GHz i7 so slow that typing is like using a 300 baud modem.
> 
> https://forums.gentoo.org/viewtopic-t-838842-highlight-.html
> 
> And the problem persists with all power management disabled in the bios.
> ...

 

The i7 re-introduces hyperthreading, which results in 8 virtual cores. Do you have hyperthreading enabled or disabled in your kernel?

----------

## drescherjm

Enabled, I see all 8 cores. This works on 2.6.34. I can try disabling that this evening (3 or so hours from now) and see if that fixes the issue.
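For reference, a quick way to confirm this from userspace is to compare the `siblings` and `cpu cores` fields in /proc/cpuinfo. This is only a sketch: `ht_status` is a made-up helper name, and it assumes an x86 machine where both fields are present.

```shell
# Report whether hyperthreading appears active, based on the
# "siblings" and "cpu cores" fields of an x86 /proc/cpuinfo.
# ht_status is a hypothetical helper name.
ht_status() {
    # $1: path to a cpuinfo-style file (defaults to /proc/cpuinfo)
    awk -F': *' '
        /^siblings/  { sib = $2 }
        /^cpu cores/ { cores = $2 }
        END {
            if (sib == "" || cores == "") { print "unknown"; exit }
            ht = (sib + 0 > cores + 0) ? "yes" : "no"
            printf "siblings=%d cores=%d ht=%s\n", sib, cores, ht
        }' "${1:-/proc/cpuinfo}"
}
```

Calling `ht_status` with no argument reads the live /proc/cpuinfo; more siblings than cores per package means the extra logical CPUs come from hyperthreading.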

----------

## alexdu

 *curmudgeon wrote:*   

> The instant a single key is pressed in the password field, the machine goes dead (screen black, network connections down, does not respond to pings).

 

I'd like to note two things.

1. (I guess you did it) It's time to migrate from "< > ATA/ATAPI/MFM/RLL support (DEPRECATED)" to "<*> Serial ATA and Parallel ATA drivers". It's easy - just reconfigure your kernel, then update your bootloader config (e.g. grub.conf) and /etc/fstab, changing ALL /dev/hd. devices to /dev/sd. devices.

If something goes wrong, you will need to boot from a LiveCD. Also, your old kernels that use /dev/hd. devices will become useless.

2. (I believe this is your case) You should try to "adjust" the version of udev you use. There are 3 options here:

- use udev-149, mask udev-151-r4 (if you have an ATA (PATA - just another name) disk and are using the deprecated ATA drivers)

- use udev-151-r4, add USE flag "devfs-compat" (if you have an ATA (PATA) disk and are using the new PATA drivers)

- use udev-160 (if you have an ATA (PATA) disk and are using the new PATA drivers)
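Roughly, the three options translate into Portage configuration like this. Treat it as a sketch under assumptions: `PORTAGE_DIR` and `apply_udev_option` are hypothetical names so it can be tried in a scratch directory, the real location would be /etc/portage, the package.* entries are assumed to be plain files, and the `~x86` keyword assumes an x86 box.

```shell
# Sketch of the three udev options as Portage configuration.
# PORTAGE_DIR and apply_udev_option are hypothetical; on a real
# system the directory would be /etc/portage.
PORTAGE_DIR=${PORTAGE_DIR:-./portage-sketch}

apply_udev_option() {
    mkdir -p "$PORTAGE_DIR"
    case "$1" in
        149)     # stay on udev-149 by masking 151-r4
            echo "=sys-fs/udev-151-r4" >> "$PORTAGE_DIR/package.mask"
            ;;
        compat)  # keep udev-151-r4, enable the devfs-compat USE flag
            echo "sys-fs/udev devfs-compat" >> "$PORTAGE_DIR/package.use"
            ;;
        160)     # accept the testing udev-160 (x86 assumed)
            echo "=sys-fs/udev-160 ~x86" >> "$PORTAGE_DIR/package.keywords"
            ;;
        *)
            echo "usage: apply_udev_option 149|compat|160" >&2
            return 1
            ;;
    esac
}

# Afterwards udev has to be re-emerged, e.g.: emerge --oneshot sys-fs/udev
```

Only one of the three would be applied at a time, e.g. `apply_udev_option compat`.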

Also, it's a good idea to use kernel 2.6.35+, since, for example, 2.6.34 does not work with the old ATA /dev/hda drivers and udev-151-r4, but works with the new drivers and udev-151-r4 :)

It was tested on: vanilla-sources, x86 arch, Intel Celeron and PIIMMX. I believe neither the amd64 arch nor SATA disks are affected.

I do hope this will help.
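As for the /dev/hd. to /dev/sd. change in point 1, a cautious way to start is to preview the rewrite before editing anything. A minimal sketch, with `preview_hd_to_sd` being a made-up helper name; it assumes the drive letter carries over unchanged, which is not guaranteed (an old hdc may well show up as sdb), so the result needs a manual check.

```shell
# Preview /dev/hdXN -> /dev/sdXN rewrites without touching the file.
# preview_hd_to_sd is a hypothetical helper; it assumes drive letters
# stay the same, which has to be verified by hand.
preview_hd_to_sd() {
    # $1: file to preview, e.g. /etc/fstab or grub.conf
    sed -E 's#/dev/hd([a-z][0-9]*)#/dev/sd\1#g' "$1"
}
```

For example, `preview_hd_to_sd /etc/fstab | diff -u /etc/fstab -` shows exactly which lines would change.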

----------

## curmudgeon

 *Anon-E-moose wrote:*   

> post /var/log/Xorg.0.log please

 

I don't know how relevant that is (since I, unexpectedly, have the same problem at the console), but I can do that:

http://pastebin.com/PBzpkt0Q

----------

## curmudgeon

 *alexdu wrote:*   

> 1. (I guess you did it) It's time to migrate from "< > ATA/ATAPI/MFM/RLL support (DEPRECATED)" to "<*> Serial ATA and Parallel ATA drivers".

 

Did that years ago. :)

Seriously, I don't save all of my old .configs, but I found one from gentoo-sources-2.6.25-r9, and I had already moved to the new ATA drivers then.

 *alexdu wrote:*   

> 2. (I believe this is your case) You should try to "adjust" the version of udev you use. There are 3 options here:
> 
> - use udev-149, mask udev-151-r4 (if you have an ATA (PATA - just another name) disk and are using the deprecated ATA drivers)
> 
> - use udev-151-r4, add USE flag "devfs-compat" (if you have an ATA (PATA) disk and are using the new PATA drivers)
> ...

 

I am using (again :) ) the stable version - udev-151-r4. Are you saying I should either re-emerge with the devfs-compat USE flag enabled or upgrade to unstable udev-160?

Since the machine works fine remotely (as long as no one touches the keyboard) with gentoo-sources-2.6.35-r4 (I use ssh to do all updates and such), I would like a bit of an explanation of why you expect this to fix things. :)

 *alexdu wrote:*   

> Also, it's a good idea to use kernel 2.6.35+, since, for example, 2.6.34 does not work with the old ATA /dev/hda drivers and udev-151-r4, but works with the new drivers and udev-151-r4 :)

 

My biggest concern is http://news.softpedia.com/news/Critical-Vulnerability-Silently-Patched-in-Linux-Kernel-152678.shtml (surprised to see so little publicity about it). gentoo-sources 2.6.35-r4 was the first (stable) kernel offered that addressed this. I have since upgraded all of my machines to gentoo-sources 2.6.34-r6.

----------

## Naib

 *drescherjm wrote:*   

> Enabled, I see all 8 cores. This works on 2.6.34. I can try disabling that this evening (3 or so hours from now) and see if that fixes the issue.

 

http://www.phoronix.com/scan.php?page=article&item=intel_p55&num=4

 *Quote:*   

> 
> 
> During this early testing, we also found that the CPU core frequencies never increased to their Intel Turbo Boost frequencies when they were encountering a load. Intel's Turbo Boost Technology was not working under Linux. Once we disabled Turbo Boost from the BIOS, our sporadic performance problems were eliminated too. The performance numbers stopped fluctuating and dropping so much between runs and there were finally stable performance figures. Turbo Boost never boosted the performance under Linux or even the frequencies for that matter, but just seemed to cause some problems in our early testing. It would be nice though to see proper Intel Turbo Boost Technology for Linux, but again, as long as its disabled, the CPU should function and performance as expected for its base frequency.

 

----------

## Naib

actually mixed reports

https://bugzilla.kernel.org/show_bug.cgi?id=15064  says it is fixed

also http://computing-intensive.blogspot.com/2009/09/how-to-make-turbo-boost-work-under.html

----------

## drescherjm

I have moved my discussion to a separate thread because I feel I was hijacking the OP's thread.

https://forums.gentoo.org/viewtopic-t-842775-highlight-.html

----------

## alexdu

 *curmudgeon wrote:*   

> I am using (again :) ) the stable version - udev-151-r4. Are you saying I should either re-emerge with the devfs-compat USE flag enabled or upgrade to unstable udev-160?

 Yes, exactly.

 *curmudgeon wrote:*   

> Since the machine works fine remotely (as long as no one touches the keyboard) with gentoo-sources-2.6.35-r4 (I use ssh to do all updates and such), I would like a bit of an explanation of why you expect this to fix things. :)

 You said you have two broken machines. I'm pretty sure this is not a hardware problem: the hardware is different. It also doesn't look like a driver problem: about half of the drivers are different too. It looks like a software problem, a kernel problem, as you said before. But udev is part of the Linux kernel.

http://kernel.org/ :

stable:  	2.6.35.4  	2010-08-26

stable:  	2.6.34.6  	2010-08-26

http://www.kernel.org/pub/linux/utils/kernel/hotplug/ :

udev-149.tar.gz                           03-Dec-2009 13:53  643K  

udev-151.tar.gz                           27-Jan-2010 09:25  620K  

udev-160.tar.gz                           11-Jul-2010 22:01  655K  

udev-161.tar.gz                           11-Aug-2010 13:54  657K  

I can't give you a 100% guarantee that it will help, or even that it won't hang your server, but it helped me in just the same situation (except for kdm). Could you try it on the 'desktop'?

 *curmudgeon wrote:*   

> My biggest concern is http://news.softpedia.com/news/Critical-Vulnerability-Silently-Patched-in-Linux-Kernel-152678.shtml (surprised to see so little publicity about it). gentoo-sources 2.6.35-r4 was the first (stable) kernel offered that addressed this. I have since upgraded all of my machines to gentoo-sources 2.6.34-r6.

 You said 'stable'...   :Rolling Eyes: 

P.S.

just as is:

```
> grep -P -e ">>> emerge.+(vanilla|udev)" /var/log/emerge.log
...
1255944515:  >>> emerge (3 of 3) sys-kernel/vanilla-sources-2.6.30.7 to /
1256766214:  >>> emerge (3 of 4) sys-kernel/vanilla-sources-2.6.30.9 to /
1257781473:  >>> emerge (3 of 4) sys-fs/udev-146-r1 to /
1259825168:  >>> emerge (8 of 11) sys-kernel/vanilla-sources-2.6.31.6 to /
1263473730:  >>> emerge (136 of 169) sys-fs/udev-146-r1 to /
1263474803:  >>> emerge (142 of 169) sys-kernel/vanilla-sources-2.6.31.6 to /
1267445213:  >>> emerge (18 of 24) sys-fs/udev-149 to /
1267445358:  >>> emerge (19 of 24) sys-kernel/vanilla-sources-2.6.31.12 to /
1271789289:  >>> emerge (28 of 49) sys-kernel/vanilla-sources-2.6.32.9 to /
1273266273:  >>> emerge (11 of 12) sys-kernel/vanilla-sources-2.6.32.9 to /
1279016133:  >>> emerge (11 of 16) sys-kernel/vanilla-sources-2.6.34 to /
1279821693:  >>> emerge (2 of 2) sys-fs/udev-151-r4 to /
1283020731:  >>> emerge (1 of 1) sys-kernel/vanilla-sources-2.6.35.3 to /
1283109923:  >>> emerge (1 of 1) sys-kernel/vanilla-sources-2.6.35.4 to /
1283254545:  >>> emerge (3 of 3) sys-fs/udev-151-r4 to /
1283342806:  >>> emerge (1 of 1) sys-kernel/vanilla-sources-2.6.35.4 to /
1283369669:  >>> emerge (1 of 1) sys-fs/udev-149 to /
1283371233:  >>> emerge (1 of 1) sys-fs/udev-160 to /
1283372523:  >>> emerge (1 of 1) sys-fs/udev-151-r4 to /
1283373600:  >>> emerge (1 of 1) sys-fs/udev-160 to /
1283374227:  >>> emerge (1 of 1) sys-fs/udev-149 to /
```
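The leading numbers in emerge.log are Unix epoch seconds, which makes the sequence hard to read at a glance. A small sketch that prefixes each line with a UTC date instead; `emerge_log_dates` is a made-up name, and the `-d @seconds` form assumes GNU date.

```shell
# Replace the epoch prefix of emerge.log lines with a readable UTC date.
# emerge_log_dates is a hypothetical helper; "date -d @N" needs GNU date.
emerge_log_dates() {
    while IFS= read -r line; do
        ts=${line%%:*}      # text before the first colon
        case "$ts" in
            ''|*[!0-9]*)
                printf '%s\n' "$line"    # no epoch prefix, pass through
                ;;
            *)
                stamp=$(date -u -d "@$ts" '+%Y-%m-%d %H:%M')
                printf '%s %s\n' "$stamp" "${line#*:}"
                ;;
        esac
    done
}
```

Piping the grep output through `emerge_log_dates` makes it easier to see that the last five udev emerges all happened on the same day, a little over an hour apart in total.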

----------

## curmudgeon

 *alexdu wrote:*   

>  *curmudgeon wrote:*   
> 
> > I am using (again :) ) the stable version - udev-151-r4. Are you saying I should either re-emerge with the devfs-compat USE flag enabled or upgrade to unstable udev-160?
> 
> Yes, exactly.

 

Upgraded to udev-160. No change. I was skeptical that that would work, but I was really hoping that it would. :(

----------

## Shining Arcanine

 *curmudgeon wrote:*   

>  *alexdu wrote:*   
> 
> >  *curmudgeon wrote:*   
> > 
> > > I am using (again :) ) the stable version - udev-151-r4. Are you saying I should either re-emerge with the devfs-compat USE flag enabled or upgrade to unstable udev-160?
> > 
> > Yes, exactly.
> 
> Upgraded to udev-160. No change. I was skeptical that that would work, but I was really hoping that it would. :(

 

I cannot reproduce your issues on either amd64 or x86. I am using sys-kernel/vanilla-sources-2.6.35.4. I suspect that your issue has to do with how your kernel is compiled. I never thought I would say this, but my suggestion is to try genkernel. It should be a good way to rule out the possibility of the kernel configuration being incorrect.

Also, would you please post your .config file?

----------

## curmudgeon

 *Shining Arcanine wrote:*   

> I suspect that your issue has to do with how your kernel is compiled. I never thought I would say this, but my suggestion is to try genkernel. It should be a good way to rule out the possibility of the kernel configuration being incorrect.
> 
> Also, would you please post your .config file?

 

I have been compiling my own kernels for fifteen years (and have never seen anything like this). I basically used the same .config files on the two machines that I have used "forever" (since I obtained the machines - that goes back to at least 2.6.16).

No offense to the developers, but genkernel looks like a bit of a mess. Would I use the same .config file? I don't really think much of modules, and prefer to build non-modular kernels when possible (I have never used an initrd, except of course, when booting from live CDs or DVDs), so using genkernel has never really appealed to me. I don't want to rule it out for testing, but I probably will try vanilla-sources first to see what happens.

The .config on one machine (the easier of the two to do debugging on):

http://pastebin.com/usdi4yL5

----------

## Shining Arcanine

 *curmudgeon wrote:*   

>  *Shining Arcanine wrote:*   I suspect that your issue has to do with how your kernel is compiled. I never thought I would say this, but my suggestion is to try genkernel. It should be a good way to rule out the possibility of the kernel configuration being incorrect.
> 
> Also, would you please post your .config file? 
> 
> I have been compiling my own kernels for fifteen years (and have never seen anything like this). I basically used the same .config files on the two machines that I have used "forever" (since I obtained the machines - that goes back to at least 2.6.16).
> ...

 

It is a troubleshooting measure. It is not intended as a fix. Unless someone can catch your issue by eyeballing your .config file, figuring out what to do next depends on whether genkernel works or not. If it works, we have a working .config and a broken .config and figuring out what is breaking things could be accomplished by progressively morphing one into the other until the results of kernel compilations change. If it does not work, then you are likely encountering a kernel regression, which will require doing git-bisects on the kernel sources between the kernel version that worked and the kernel version that does not work.
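To make the "morphing" step concrete, here is a minimal sketch that lists the options that differ between a working and a broken .config. The helper names `normalize_config` and `config_diff` are made up for illustration; "is not set" comments are folded into `=n` so each option lines up by name.

```shell
# List CONFIG_ options whose values differ between two .config files.
# normalize_config and config_diff are hypothetical helper names.
normalize_config() {
    # Turn "# CONFIG_X is not set" into "CONFIG_X=n", keep only
    # CONFIG_ lines, and sort so both files line up by option name.
    sed -n -E -e 's/^# (CONFIG_[A-Za-z0-9_]+) is not set$/\1=n/' \
              -e '/^CONFIG_/p' "$1" | sort
}

config_diff() {
    # $1: working .config, $2: broken .config
    a=$(mktemp)
    b=$(mktemp)
    normalize_config "$1" > "$a"
    normalize_config "$2" > "$b"
    # "<" lines come from $1, ">" lines from $2
    diff "$a" "$b" || true
    rm -f "$a" "$b"
}
```

Each differing option can then be flipped in small batches, rebuilding and retesting in between. Recent kernel trees also ship a scripts/diffconfig helper that does a similar comparison. If both configs fail, the bisect route starts in a kernel git checkout with `git bisect start`, then `git bisect bad v2.6.35` and `git bisect good v2.6.34`.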

----------

## transpetaflops

Just wanted to add some (hopefully) useful input to this discussion. I've run 2.6.35 since it was available in the tree, and I have gentoo-sources-2.6.35-r5 running on two systems and -r4 on one right now without any issues. If you'd like to compare my configs with your own, I've put them on pastebin. I do however use startx with Gnome - no KDM/GDM/XDM.

Intel Core i7 920 (AMD64)

http://pastebin.com/sA6qKB9f

AMD Athlon 64 X2 4600+ (AMD64)

http://pastebin.com/nC8ccrXy

Intel Atom N270 (x86)

http://pastebin.com/qMjwaKtx

This kernel config is slightly different from the others because of the eee PC's unique nature. KVM support and no-op disk scheduler for example.

----------

## curmudgeon

I have tracked this down (sort of):

This appears to be another instance of kernel bug 15758 (although completely different hardware and manifestation).

https://bugzilla.kernel.org/show_bug.cgi?id=15758

So now I have two questions:

First, can someone running 2.6.35.3 (or any 2.6.35 vanilla sources - the problem exists with gentoo sources, as well, but the kernel developers don't care about that) do the following?

Uncomment line 130 in /usr/src/linux-2.6.35.3/drivers/char/Makefile (no need to make any other changes, at all), compile, boot, and see if anything strange happens.
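For anyone willing to try this, it is worth looking at the line before changing it, since line numbers shift between point releases. A minimal sketch with two made-up helpers, `show_line` and `uncomment_line`; nothing is assumed here about what line 130 actually contains, and the second helper only prints the result instead of editing the file.

```shell
# Inspect a numbered line, then preview the file with that line uncommented.
# show_line and uncomment_line are hypothetical helpers; the file is
# never modified, the result goes to stdout.
show_line() {
    # $1: file, $2: line number
    sed -n "${2}p" "$1"
}

uncomment_line() {
    # $1: file, $2: line number; strips a leading "#" and following whitespace
    sed "${2}s/^#[[:space:]]*//" "$1"
}
```

For example, `show_line /usr/src/linux-2.6.35.3/drivers/char/Makefile 130` prints the line in question, and `uncomment_line` with the same arguments shows what the edited Makefile would look like.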

Second, are there any gentoo developers who work on the kernel (or is there any way to poke the upstream developers into getting this fixed)?

----------

## curmudgeon

Bumping due to a lack of (recent) replies.

----------

## curmudgeon

Still looking for someone who has an idea regarding how to get a kernel developer to take a look at this bug.

----------

## drescherjm

Have you tried newer kernels?

gentoo-sources-2.6.36 is now out.

----------

## curmudgeon

 *drescherjm wrote:*   

> Have you tried newer kernels?
> 
> gentoo-sources-2.6.36 is now out.

 

Just tried it. Same problem (machine blows up the minute any key is pressed). None of the kernel developers have (apparently) even looked at this problem, though, from my limited knowledge, it doesn't seem that difficult to fix. Once again I will ask if anyone can poke one of the kernel developers into taking a look at this.

----------

## augury

I'm still emerging sys-apps/module-init-tools.

I'd say make a patch and send it to bugzilla but there is so much more to this.

----------

