# Kernel related graphics freeze with Intel 4500MHD

## Quincy

First some general information: I have an Intel 4500MHD integrated graphics controller with amd64 stable x11-drivers/xf86-video-intel-2.20.13 installed (dri sna udev -glamor -uxa -xvmc). The kernel module is DRM_I915 with working KMS. At work, and sometimes also at home, I'm using a dual monitor setup, which was always running without major and/or persistent problems. During a regular system update, gentoo-sources-3.6.11 got installed right before Christmas and things started to change:

When running with the new kernel, KDE randomly freezes: the mouse is moving, SSH login is still possible, shutdown via ACPI events works, but the whole display is frozen. This happens after some time (mostly in the range of hours), but only when using more than one monitor. As I had a hard time tracking the problem down, because it does not seem to be triggered by any particular action, I tried many different things (reinstalling stuff, avoiding some programs, etc.) and searched for a solution before switching back to the old kernel (3.5.7) as a kind of last resort. With this version I never experienced these problems again. Whenever I switched to the newer kernel once in a while after other packages got upgraded, it repeatedly ended with a frozen KDE.

I hoped upgrading to kernel 3.7.9 last weekend would solve the problem, but the system just crashed again at work with two monitors after running for two days at home with only a single one.

So currently I'm pretty sure that the problem is related to some change introduced in the i915 kernel driver between the 3.5.7 and 3.6.11 releases, but this is far beyond my knowledge. The kernel has been configured by me, but I did not change any settings that are obviously related to the graphics system.

I would also appreciate any further hints on where to look for a solution on my own, because I can currently work around the problem, but I don't want to be stuck with kernel version 3.5.7.

----------

## audiodef

I can think of two suggestions:

1. Does this happen when using something other than KDE? Fluxbox, LXDE, XFCE, etc.?

2. Try Pappy's Kernel Seeds. It straightens out a lot of defconfig cruft that could get in the way.

----------

## CkoTuHa

Did you know that the gen4 Intel chipset is a pain in the ass for Intel?

https://bugs.freedesktop.org/show_bug.cgi?id=53385

https://bugs.freedesktop.org/show_bug.cgi?id=55984

I vaguely recall that in some bug the Intel gfx devs (Wilson/Vetter) apparently said that the issue is only possible if the hw is not behaving as specified. If your gen4 chipset is "rev 07" then you are in that ditch of people who will be "blessed" with gfx issues. If that comes straight from the horse's mouth, what can you hope for?

----------

## Quincy

 *Quote:*   

> 1. Does this happen when using something other than KDE? Fluxbox, LXDE, XFCE, etc.?

 

I could not try for a very long time, but it did not happen within 1,5h days. After that I had the feeling that turning off desktop effects/compositing in KDE could also help, and indeed it looks as if this has an effect. The system has not crashed for the last 2.5 days; now I'm "just" experiencing many rendering bugs: black boxes around tooltip windows, scrambled system tray icons, flickering of the whole screen...

 *Quote:*   

> 2. Try Pappy's Kernel Seeds. It straightens out a lot of defconfig cruft that could get in the way.

 

Isn't that just a kernel config file? As I said it was happily working with gentoo-sources-3.5.7, just started occurring with later versions without changing settings on my side.

@CkoTuHa

I know that this chipset (I have a rev 07 chip) is horrible; it's not the first time I'm facing problems with a driver/kernel/desktop combination. They have always been less severe, only producing glitches but not locking up the full (graphics) system, and they somehow (because of other people tracking them down and solving them) disappeared for me with newer driver/kernel/desktop versions. This time I'm more curious because I at least know some conditions linked to the issue, and the situation is worse with a newer kernel version, which means an error was introduced, so I'm hoping that it will somehow be reverted. The bug reports you linked seem to point to hard-to-track timing issues, as far as I understand. So I will have to figure out how to really debug the problem and enter the discussion over there?!

----------

## CkoTuHa

 *Quincy wrote:*   

> ...I'm hoping that it will be somehow reverted....

 

There is a very slim chance it is going to automagically get better.

 *Quincy wrote:*   

> 
> 
> So I will have to figure out how to really debug the problem and enter the discussion over there?!

 

Just buy new hardware already. It is better for you in the long term.

So long

----------

## CkoTuHa

You might want to take a look at this:

http://www.spinics.net/lists/stable-commits/msg23061.html

----------

## CkoTuHa

Here is where it is said that it is the HW:

https://bugzilla.redhat.com/show_bug.cgi?id=538163

----------

## Quincy

I don't think that buying new hardware solves everything. Now it is the graphics card, and next time it is the network card or something else. Who can check before buying any hardware that no errors will arise some time in the future when drivers are (further) developed?

I'm wondering why the original bug report you found was closed in 2010 while the corresponding patch was submitted just some weeks ago and included in stable starting with 3.7.6. Unfortunately (?) I never activated this option at all, so this bug should not be relevant to me. This is consistent with kernel 3.7.9 failing and also 3.6.11 doing so. So I should go back in time and look for patches between 3.5.7 and 3.6.11, where the problem first appears.
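
Searching between a known-good and a known-bad kernel can be sketched as a bisection. This is only a minimal sketch under assumptions: it presumes that once the regression appears it stays in every later release, and `first_bad`, `is_bad`, the version list and the "pretend culprit" are all hypothetical names and values for illustration. In practice `is_bad` would mean booting that kernel and running the dual monitor setup with OpenGL compositing long enough to trust the result.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest version for which is_bad() is true.

    Assumes versions[0] is known good, versions[-1] is known bad, and that
    the regression, once introduced, is present in all later versions.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid   # regression is at mid or earlier
        else:
            lo = mid   # regression came in after mid
    return versions[hi]


# Illustrative list only; not necessarily the versions available to install.
versions = ["3.5.7"] + [f"3.6.{n}" for n in range(1, 12)]

# Stand-in for the real manual test. The culprit here is made up purely
# so that the example can run; it says nothing about the actual bug.
pretend_culprit = "3.6.4"


def is_bad(version: str) -> bool:
    return versions.index(version) >= versions.index(pretend_culprit)


print(first_bad(versions, is_bad))  # prints the made-up culprit, 3.6.4
```

With 12 candidates this needs about four test boots instead of eleven. `git bisect` runs the same kind of search over individual commits in a kernel git tree.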

----------

## CkoTuHa

First of all: we don't live in an ideal world where things just work. You can apply this as follows. Say you want to hunt down the bug: there is only one way to do it - trial and error. Since sw and hw development is a complicated process where there are no guarantees whatsoever, no amount of logic and sanity will get things straight.

Secondly, you will be better off if you assimilate the paradigm above. I state it here again: software development and practical computing are not deterministic realizations of the theoretical computer models. That is, any machine you build in the flesh is susceptible to design flaws, unlike a theoretical Turing machine for instance. Even if you build one ideal machine, there will still be things it will never be able to compute reliably, definitely and deterministically. But that is a whole other issue.

That is why you can't really be sure that X -> Y means not Y -> not X when it comes to conjecturing why something is not behaving as it should on your machine. If you think that the above is BS and completely wrong, roll up your sleeves and good luck to you, more power to you.

ps: did you ever wonder why almost every software license has clauses about guarantees? The ones that say no liability is accepted, and you hereby agree to use it as is etc...

pps: have you heard that NASA's JPL is now trying to fix an issue/bug on Curiosity? That is freaking NASA, with their spec of hw/sw and an unlimited US budget backing them up, still having computer bugs. Heuristic for you: if they have bugs, you have 'em for sure. Deal with it.

----------

## Quincy

Because the system was working with desktop effects turned off, I wanted to look further into this.

I quickly switched them back on with the new kernel 3.7.10, but the system crashed. This time I had the feeling that changing a window caused the whole thing.

After that I kept desktop effects on, but changed the compositing type to "XRender" instead of "OpenGL", and I have not experienced any crash within the last 3 days of constantly working in a dual monitor configuration. So it is kind of narrowing down a bit...

@CkoTuHa

I agree with your "non perfect world", but this is starting to get a bit too philosophical/general. I'm pretty sure that there have been bugs, there are bugs and there always will be bugs, but there will always be people like us trying to spot and solve them; that's what bug trackers are about. Whether this bug is "implemented" in software or hardware or both, and how to get around it (if possible), is the second step.

----------

