# 2.5.74 released.

## idl

Changelog

Release announcement

----------

## Lovechild

Nothing of interest - the most interesting thing on lkml right now is Con Kolivas' latest insane hacks, O1int and time gran (the latter is in pfeifer sources now, I think).

Also, Mike has an idea for a general framework to boost priorities during special situations like audio and video access. Sounds very promising for us poor desktop users.

----------

## RedBeard0531

Actually, it looks like it fixes the xfs problems I was having. Each reboot, it would fail log replay; now I know why.

----------

## abracadaver

I'm using the vanilla and there are really no noticeable differences from what I was using before, which was .73-bk6. Rhythmbox skips a little less with each release though, but direct ALSA with XMMS has been unshakeable as of the later revisions of .73.

I do, however, have a general kernel question about the cfq and as schedulers: are they available in all kernels or just mm? And is there a way to tell for sure that you have one enabled, or are you just supposed to assume that it worked because you put the parameter in your kernel line?
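One partial check is to look at what the kernel was actually booted with. The sketch below only pulls the scheduler name out of a command-line string; it assumes the parameter is spelled `elevator=`, and the sample line and root device are made up. On a running system the string can be read from /proc/cmdline. Finding the parameter shows what was requested, not that the kernel honoured it, so the boot log is still the better confirmation.

```python
def elevator_from_cmdline(cmdline):
    """Return the value of elevator= on a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("elevator="):
            return token[len("elevator="):]
    return None


# Hypothetical command line, for illustration only.
sample_cmdline = "root=/dev/hda3 elevator=cfq"
print(elevator_from_cmdline(sample_cmdline))
```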

----------

## RedBeard0531

 *abracadaver wrote:*   

> i do, however, have kind of a general kernel question,  about the cfq and as schedulers,  are they available in all kernels or just mm?  and is there a way to tell for sure if you have it enabled, or are you just to assume because you put the parameter in your kernel line that it worked?

 

I can't answer your question, but I HIGHLY suggest you stick to AS, the default elevator. It offers much higher performance on HD-intensive apps. On some programs, I've noticed a 5-10x speedup! The only problem is that mp3s skip every now and again. This has to do with the way the schedulers work.

*the technical stuff I've pieced together, so it may be wrong*

 AS (anticipatory scheduler) gives processes a larger period of time to stream data, then anticipates their needs and caches them. IIRC it gives more weight to priority. CFQ (complete fair queueing) is more fair: each process is strictly limited in the amount of IO it can get at a time. This benefits programs like XMMS which don't use good buffers. It allows them to get IO more often, reducing skips. This is inefficient, as the HD is constantly seeking to different areas, the slowest operation for most HDs. It will also slow down things like compiling and encoding video, as they are always having to give up access to the HD, causing them to idle.
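A toy model of the seek argument above, not how either scheduler is actually implemented: two readers each stream sequential blocks from their own region of the disk, and the only thing that changes is how many requests one reader gets before the other is served. The function name, block numbers and region gap are all made up for illustration, and the model ignores caching, anticipation and rotational delay.

```python
def total_head_travel(batch, requests_per_proc=64, region_gap=100_000):
    """Total distance the disk head moves when two sequential readers
    are served in alternating batches of `batch` requests each."""
    # Each reader walks sequentially through its own region of the disk.
    next_block = [0, region_gap]
    remaining = [requests_per_proc, requests_per_proc]
    head = 0
    travel = 0
    turn = 0
    while remaining[0] or remaining[1]:
        served = min(batch, remaining[turn])
        for _ in range(served):
            travel += abs(next_block[turn] - head)
            head = next_block[turn]
            next_block[turn] += 1
            remaining[turn] -= 1
        turn = 1 - turn  # hand the disk to the other reader
    return travel


# Strict alternation versus long uninterrupted runs.
print("batch of 1: ", total_head_travel(1))
print("batch of 64:", total_head_travel(64))
```

Small batches mean each reader is served sooner, but nearly every request pays for a long seek; large batches keep the head in one place at the cost of making the other reader wait.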

btw, are there any good docs explaining this? I checked /usr/src/linux/Documentation, but didn't find anything.

----------

## broschi

Nothing interesting, what about:

 *linus wrote:*   

> 
> 
> htree: set the dir index bit in the right place
> 
> ext3: fix page lock vs journal_start ranking bug
> ...

 

Is the reign of terror over? The htree bug is responsible for millions of wasted GB, including a few of my own. I wish somebody more knowledgeable than me could enlighten us.

----------

## pens

<greg@kroah.com>

	I2C: add i2c-ali1535 bus driver

	Ported from the i2c cvs tree.

Hopefully this will allow my laptop to finally read sensors data...

/me rushes off to compile...

----------

## idl

mm1 is out

 *Quote:*   

> . Included Con's CPU scheduler changes.  Feedback on the effectiveness of
> 
>   this and the usual benchmarks would be interesting.

   :Very Happy: 

----------

## Safrax

.74-mm1 seems to have gotten rid of that annoying bug that was randomly "losing" processes..  It feels sooooo smooth too..

----------

## bssteph

 *port001 wrote:*   

> mm1 is out
> 
>  *Quote:*   . Included Con's CPU scheduler changes.  Feedback on the effectiveness of
> 
>   this and the usual benchmarks would be interesting.  

 

I second that  :Very Happy: 

I don't know if it was specifically the O1int hacks or something else beyond my understanding, but the previously mentioned ALSA skip I was experiencing is gone. 100%. The only time XMMS misses a beat is when whipping an xterm around and a new file opens, and even then it's not but a pop. Crux + Nautilus seems to be a lot more responsive, too. XMMS, xchat, galeon, and doing a lot of large directory reads in nautilus is one of my "this feels good" benchmarks, and I think this is the first time I've been pleased with the results.

On the downside, however, since .73-mm2, the kernel has been oopsing (hooray trying to kill init) on boot if I have Use DMA by default enabled. But, I think my chipset is one of those warned against in the kernel config, so I guess I'll live.

----------

## Lovechild

.74-mm1 is much better but far from perfect - Con has said he might have some tricks up his sleeve yet to ease the crappiness of the scheduler.

----------

## idl

Moving and resizing windows performance has greatly decreased for me, windows can't keep up with the mouse... looks funny, windows moving on their own.

hmmm, think it could be related to min_timeslice 1 ?

----------

## Lovechild

 *port001 wrote:*   

> Moving and resizing windows performance has greatly decreased for me, windows can't keep up with the mouse... looks funny, windows moving on their own.
> 
> hmmm, think it could be related to min_timeslice 1 ?

 

Time slices set to 1 are known to cause thrashing; max = min = 10 is probably the safest if you must decrease them.
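To see what setting max = min does, here is a rough sketch. It assumes a task's timeslice is interpolated linearly between min_timeslice and max_timeslice according to its nice level, which is modelled from memory on the 2.5 O(1) scheduler and is not the actual kernel code. The constants, the function name and the 200 used as an upper bound are assumptions for illustration.

```python
MAX_PRIO = 140       # assumed number of priority levels
MAX_USER_PRIO = 40   # assumed size of the nice range (-20..19)


def base_timeslice(nice, min_timeslice, max_timeslice):
    """Timeslice for a task at the given nice level, interpolated linearly
    between min_timeslice (nice 19) and max_timeslice (nice -20)."""
    static_prio = 120 + nice
    span = max_timeslice - min_timeslice
    return min_timeslice + span * (MAX_PRIO - 1 - static_prio) // (MAX_USER_PRIO - 1)


# With a wide range, low-priority tasks get very short slices.
for nice in (-20, 0, 19):
    print(nice, base_timeslice(nice, 1, 200), base_timeslice(nice, 10, 10))
```

Under that assumption, max = min = 10 gives every task the same slice regardless of nice level, while a minimum of 1 leaves the lowest-priority tasks with almost no time before they are switched out again.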

----------

## handsomepete

2.5.74 nuked ACPI for me.  Had to boot with acpi=off.  Other than that, seems nice.  I haven't locked up yet, still getting the badness errors.  *shrug*  If anything I feel like I should always reply to these threads just to bookmark them.

----------

## idl

@Lovechild, changing min_timeslice solved it.

 *handsomepete wrote:*   

> 2.5.74 nuked ACPI for me.  Had to boot with ACPI=off.  Other than that, seems nice.  I haven't locked up yet, still getting the badness errors.  *shrug*  If anything I feel like I should always reply to these threads just to bookmark them.

 

Look just below the reply button at the bottom of the page  :Wink: 

----------

## handsomepete

I know this is veering off topic, sorry.  Doesn't "watching" the topic just send you e-mails?  I'm not even sure I used a real e-mail address to sign up for these forums.  I usually just login here, view my posts then view posts since my last visit and that's how I track.  *shrug*  It's probably the stupid way to do it, but since I mostly browse at work it's virtually the only way.  I don't leave my email open all day (or check it that often, really).

So, does anyone know what the "Badness in pci_find_subsys at drivers/pci/search.c:128" errors are all about?  Someone mentioned that they're not volatile, but they do line up with processes stalling on my system (which seems less frequent with .74).

----------

## togge

I get a compile error when I enable APM in the kernel.

Anyone else have the same problem, or is it just me?

----------

## Freak_NL

Well.. finally figured out what was going PANIC every time something was compiling.. XFS' page buffer apparently did. Finally figured that out after setting my console font a little smaller (more lines)  :Confused: 

..so now I'm back with a ReiserFS root.  :Smile:  I'll see about cloning this partition to a XFS setup for a bugreport. (I love tar; Linux: the cloneable OS)

Vanilla 2.5.74 seems to like my desktop computer though.  :Smile: 

Now on to the laptop.. 2.5 series means losing my touchpad (nothing special, it's just like a PS/2 mouse) and losing network.. My onboard Realtek gives a garbage HWaddr in ifconfig and happily transmits stuff who knows where, but doesn't receive a single bit.  :Sad: 

----------

## -leliel-

got it working after a few hours ...

but it seems to be very slow with my wm (kahakai/waimea) and X ...  :Sad: 

so long

----------

## thubble

ck's O2int patch has been released! It's apparently an improvement on O1int, and it patches against 2.5.74-mm1. Also, his granularity patch is available, which apparently makes desktop performance smoother but decreases throughput. Both are available at http://members.optusnet.com.au/ckolivas/kernel/2.5/

Going to go patch and recompile 2.5.74-mm1. I'll let you know how it goes.

----------

## bart

Like Freak_NL I've some troubles with my laptop since 2.5.73.

My mouse (normal ps/2 synaptics) doesn't do anything.

In 2.5.73 there was a "psmouse compile fix". I'm afraid that this fix broke more than it fixed  :Smile: 

----------

## Lovechild

 *thubble wrote:*   

> ck's O2int patch has been released! It's apparently an improvement on O1int, and it patches against 2.5.74-mm1. Also, his granularity patch is available, which apparently makes desktop performance smoother but decreases throughput. Both are available at http://members.optusnet.com.au/ckolivas/kernel/2.5/
> 
> Going to go patch and recompile 2.5.74-mm1. I'll let you know how it goes.

 

Please report back on the effects of this patch - Con told me on IRC that he was working on fixing a set of issues that I and others reported with the old patch. As I'm unable to get in touch with my Linux PC atm I cannot test this patch.

----------

## idl

I'm not really noticing any improvements with the O2 sched patch   :Confused: 

As for window movement, why don't I just turn opaque off?   :Laughing: 

EDIT: Wireframe is what I mean   :Wink: 

----------

## robmoss

This is the second 2.5 kernel I've tried - but the first one I tried doesn't really count, it was very early (2.5.4 or something like that, I think) and didn't boot. So, right now, I'm able to post on here using a 2.5 kernel. Largely, all is going well!

However, I have a fairly major problem, or two. The first (and most serious) is that gnome-terminal is now completely and utterly shafted. It doesn't segfault - I could probably fix it if it was doing that. No, this is far worse. I start gnome-terminal, and am presented with nothing but a box and a flashing cursor. I don't get a prompt - just a cursor. It won't accept any input. Has anyone else had this problem? More to the point, if so, has anyone managed to fix it yet?!

The second problem is that on all consoles apart from tty1, the cursor disappears entirely. This is also really annoying, as it means that I have to guess where the cursor is when I'm trying to edit previous lines, unless I keep it at the end of the line by deleting everything (not exactly preferable).

So apart from a fairly drastic lack of a usable command-line environment, I am, on the whole, very happy indeed with this new kernel.

Someone please help me out - I don't want to go back to a 2.4 kernel simply because I've lost my command line!!

Cheers,

Rob

[EDIT: I found the alsactl store thingy, so never mind about the muting thing...]

----------

## floam

robmoss2k: Go into kernel config and turn on Unix98 PTS support to get your terminals working.

----------

## robmoss

Actually, I've got another problem, too! Using certain DVDs, I'm unable to mount /dev/cdroms/cdrom1 (my DVD drive, /dev/dvd -> /dev/cdroms/cdrom1 for mplayer's benefit) on /mnt/dvd. I get loads of errors about ide-scsi and some data being discarded. I can't get rid of SCSI emulation as I need it for my CD writer. The only way I've found of fixing this is to turn off DMA by default; however, I'm not actually sure this fixes the problem, and I quite like DMA, keeps things going relatively quickly.  :Sad: 

Any ideas on this one? Is it a known problem?

Cheers,

Rob

----------

## robmoss

 *floam wrote:*   

> robmoss2k: Go into kernel config and turn on Unix98 PTS support to get your terminals working.

 

Okay, I'll try that - I'll let you know how I get on tomorrow. The terminals are there, and they're functional - it's just that I don't get a cursor  :Sad: 

I can see how this would be fixed by the above, but - will this fix my gnome-terminal problem?

Thanks,

Rob

----------

## Vanquirius

Does anybody know when nforce2's onboard NIC (emerge nforce-net) is going to work with the 2.5.x series (if ever)???...

Just wondering. I feel technologically excluded.

----------

## bssteph

 *Lovechild wrote:*   

> 
> 
> Please report back on the effects of this patch - Con told me on IRC that he was working on fixing a set of issues I and other reported with the old patch. As I'm unable to get in touch with my Linux PC atm I cannot test this patch.

 

I tried patching 2.5.74-mm1 with both the O2 and granularity patches (granularity I did by hand, didn't want to test the diff)... Not really noticing any improvement, maybe even a degradation.

XMMS is again skippy in the first couple seconds of a song when I do things in Nautilus (perhaps even worse than before the O1 patch showed up).

I think I noticed this with O1 as well, but don't remember: moving a window in GNOME will commonly choke the desktop refresh. Example: XMMS, Galeon, Nautilus, Xchat, and a kernel recompile going. I move Nautilus around, and my two desktop icons are commonly not redrawn fast enough, so my desktop looks blank in that area. Other windows generally show lots of old draws of the top window when I move it around; those go away when I slow down.

I'm assuming O2 is an attempt to fix this (Con said he was working on it), but Gecko engine rendering still manages to chop everything up pretty well from time to time.

Currently recompiling w/o granularity. Will post again.

----------

## bssteph

Right. That was unimpressive.  :Sad: 

The kernel with O2 applied works fine, but is pretty much just as reported for O2 + granularity.

Right now I'm sitting in "vanilla" 2.5.74-mm1, just O1 (no granularity), and XMMS performance is a lot better than above. When browsing around in Nautilus (2 instances at once even!) XMMS doesn't lose a beat in the first 5 seconds as it does with O2. I zippily start a song, open Nautilus, browse a large dir, then go back to ~, and XMMS is fine. Certainly not the case with O2, which will lag a second or so.

However, X responsiveness is terrible. Even with just a Galeon window on a workspace (others in a different workspace), when I first start whipping the window around, my icons go missing immediately, and for a long time. After a while and/or I let go, everything seems to adjust and further movement draws the icons fine. Moving a window over another quickly, however, makes all hell break loose. The moving window is generally choppy, the underlying window gets a lot of fragments, and there's more fragments in the moving if it goes offscreen and comes back. Really cruddy. XMMS keeps up fine during all of this though.

Well, I hope this is somewhat helpful for Lovechild/Con/anyone. I'm in -mm1 right now, but I kept the kernel with O2 around in case someone wants more testing.

EDIT: Still a slight hiccup in XMMS performance when Galeon does a lot of rendering (say, going back to a large, graphical page) in the first couple seconds of XMMS play. Don't know if this is just the XMMS issue not being entirely gone, or because of the Gecko engine rendering being a hog, or some vile combination, or something else altogether.

Last edited by bssteph on Sat Jul 05, 2003 5:55 pm; edited 1 time in total

----------

## floam

robmoss2k: it's not just gnome-terminal, it's all terminals.

----------

## RedBeard0531

 *floam wrote:*   

> robmoss2k: Go into kernel config and turn on Unix98 PTS support to get your terminals working.

 

Don't forget to enable support for the pts device file system under pseudo-filesystems. Also, either upgrade baselayout or add this to fstab:

```
devpts         /dev/pts   devpts      defaults      0 0
```
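A quick way to check whether an fstab already has such a line, as a minimal sketch: it only looks at the third field of each non-comment line, and the function name and sample text are made up for illustration.

```python
def has_devpts_entry(fstab_text):
    """True if some non-comment fstab line mounts a devpts filesystem."""
    for line in fstab_text.splitlines():
        fields = line.split("#", 1)[0].split()
        # fstab fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "devpts":
            return True
    return False


# Hypothetical fstab contents, for illustration only.
sample_fstab = """\
/dev/hda3      /          reiserfs    noatime       0 1
devpts         /dev/pts   devpts      defaults      0 0
"""
print(has_devpts_entry(sample_fstab))
```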

----------

## robmoss

Okay, gnome-terminal now works. Thanks guys!  :Smile: 

However, I still have a problem with any terminal other than vc/1. There's still no cursor. Is this a configuration problem, or a kernel problem? It just seems kinda weird that I get a cursor on the default terminal but nowhere else.

Also, I'm getting lots of really strange errors relating to my DVD drive. What's this "DMA problem" I keep hearing about? I can't find any reference to what it actually is, despite much trawling through the forums!!!

Cheers,

Rob

----------

## robmoss

Okay, this is really weird.

When I have "Use PCI DMA by default when available" enabled, my DVD drive becomes utterly useless. Mount complains about a bad superblock when trying to mount either a DVD or a CD; mplayer complains with a DVD like this:

```
Playing DVD title 1
libdvdread: Could not open device with libdvdcss.
libdvdread: Can't open /dev/dvd for reading
Couldn't open DVD device: /dev/dvd
```

or like this:

```
Playing DVD title 1
Reading disc structure, please wait...
libdvdread: Can't open file VIDEO_TS.IFO.
Can't open VMG info!
```

However, I can mount CDs using my CD writer with no problems at all. I haven't tried writing a CD yet, but I don't have any blank CD-Rs to hand, or any CD-RWs at all, so I can't try this.  :Sad: 

cat /proc/scsi/scsi gives me this:

```
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: PLEXTOR  Model: CD-R   PX-W1210A Rev: 1.10
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: SAMSUNG  Model: DVD-ROM SD-612   Rev: 0.1
  Type:   CD-ROM                           ANSI SCSI revision: 02
```
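For anyone comparing listings like this across kernels, a small parser makes the device details easier to diff. This is a sketch that assumes the layout shown above (a `Host:` line followed by a `Vendor:/Model:/Rev:` line per device); the function and variable names are made up.

```python
import re


def parse_scsi_listing(text):
    """Return one dict per device found in /proc/scsi/scsi-style text."""
    devices = []
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)", line)
        if m:
            # A Host line starts a new device record.
            devices.append({"host": m.group(1), "id": int(m.group(3))})
            continue
        m = re.match(r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(.*)", line)
        if m and devices:
            devices[-1].update(vendor=m.group(1), model=m.group(2), rev=m.group(3))
    return devices


# The listing from the post above.
sample_listing = """\
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: PLEXTOR  Model: CD-R   PX-W1210A Rev: 1.10
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: SAMSUNG  Model: DVD-ROM SD-612   Rev: 0.1
  Type:   CD-ROM                           ANSI SCSI revision: 02
"""
for dev in parse_scsi_listing(sample_listing):
    print(dev["host"], dev["vendor"], dev["model"], dev["rev"])
```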

Both drives are using SCSI emulation (as you've probably worked out...). I don't know if I actually need to use this? Any ideas?

Anyway, any help would be much appreciated, as there's only this (a relatively major problem now) and the lack of a cursor on vc/2 - vc/6 (a relatively minor problem, I can handle using gnome-terminal, slow as it is) troubling me with this otherwise excellent kernel.

Cheers,

Rob

----------

## thubble

I'm getting the same results as everyone else with granularity, O2 patches. I've also set min_timeslice = max_timeslice = 10 (anyone know a good value for this? min_timeslice = 1 causes problems, like leaving zombie processes all over the place).

xmms skipping is about the same (no problems after a couple seconds). This could probably be solved by using a sound server like arts, but I'm not going to bother setting that up - a couple of blips is tolerable. Mouse jerkiness is basically gone, and windows move normally. All in all, the patches aren't perfect, but they're improving.

----------

## bssteph

thubble: I found a while ago that a good way to beat the XMMS blues is to output using the crossfade plugin. That loads the next song while the current song nears completion (so it can start applying effects), so you don't get the weird chop on start of a new song. It essentially makes your playlist one big file. You can configure it so it sounds more or less like a normal output plugin.

I get weird window moving lag with -mm1+O2 and moving XMMS. It will keep up, but after a while the window will just get "behind" the mouse, and when i let go, it will finish up the moving it should have done. It's weird. Whipped the window around while it was playing and nothing happened. Let go and started moving it normally, and it immediately lagged behind. Didn't try anything with the timeslices, will do now.

----------

## idl

 *bssteph wrote:*   

> I get weird window moving lag with -mm1+O2 and moving XMMS. It will keep up, but after a while the window will just get "behind" the mouse, and when i let go, it will finish up the moving it should have done. It's weird. Whipped the window around while it was playing and nothing happened. Let go and started moving it normally, and it immediately lagged behind. Didn't try anything with the timeslices, will do now.

 

Yep, same problem here, I had it before applying O2 as well. I'm starting to think it's a problem with one of the patches in mm1, so I'm just waiting for mm2 to see if it's fixed.

----------

## bssteph

 *port001 wrote:*   

> 
> 
> Yep same problem here, I had it before applying O2 aswell. I'm starting to think its a problem with one of the patches in mm1, so i'm just waiting for mm2 to see if its fixed.

 

I'd thought that too while I was rebooting. Hopefully by then Con will have another trick up his (impressively long) sleeve. :)

Changed the timeslices to both = 10, nothing really seems different except I noticed the mouse doesn't chop up when starting X (woo?). Moving windows may be a bit less choppy but that's about it.

----------

## RedBeard0531

Does anyone know why Con's stuff isn't in www.kernel.org's people section? He is an "official" kernel hacker, isn't he?

----------

## robmoss

I've almost fixed my problem. Unfortunately, I'm left being unable to play DVDs. And still I have no cursor on vc/2 - vc/6, but I'm not that bothered about that as gnome-terminal works now.

It would appear that my DVD drive, which identifies itself as a SAMSUNG DVD-ROM SD-612 (Rev 0.1), won't work under the 2.5 kernel with DMA enabled, but will work under the 2.5 kernel without DMA enabled and will also work under the 2.4 kernel with or without DMA enabled.

Should I file this as a bug to lkml or wherever it is you send such things? Something has definitely broken between 2.4 and 2.5, but I don't know what I'm doing quite well enough yet to work out whether or not this one's already been discovered.

----------

## krazo

Hey guys...

Now, do you all use ALSA or OSS for XMMS? It seems ALSA has trouble with scheduling (or whatever) because in XMMS an error msg about I/O comes up and a skip happens, yet in OSS emulation I don't get a skip or anything. Both tested when Galeon refreshes or does anything, basically.

Another thing, is it possible to use ATAPI cdwriting with a cdrw/dvd combo?

Thanks.

----------

## Freak_NL

Robmoss2k: Kernel.org's Bugzilla is where the 2.5 kernel bugs go. Perhaps you can find your problem already listed there?

Hmm, since 2.5.73 ALSA seems smooth in xmms now here.

----------

## bssteph

 *krazo wrote:*   

> Hey guys...
> 
> Now do you all use ALSA for XMMS or OSS. It seems ALSA has trouble with scheduling (or whatever) because in XMMS an error msg about I/O comes up and a skip happens, yet in OSS emulation I don't get a skip or anything. Both tested when Galeon refreshes or does anything basically.
> 
> Another thing, is it possible to use ATAPI cdwriting with a cdrw/dvd combo?
> ...

 

I've been using the ALSA output plugin. I tried using the OSS plugin to be handled by the compatibility layer, and it seemed to not skip at first, but now it is acting exactly the same as with the ALSA plugin (which is to be expected, I would think?).

It is possible, yes. Just did a test burn on 2.5.73 with my laptop's CDRW/DVDROM combo and it worked fine.

----------

## krazo

 *bssteph wrote:*   

>  *krazo wrote:*   Hey guys...
> 
> Now do you all use ALSA for XMMS or OSS. It seems ALSA has trouble with scheduling (or whatever) because in XMMS an error msg about I/O comes up and a skip happens, yet in OSS emulation I don't get a skip or anything. Both tested when Galeon refreshes or does anything basically.
> 
> Another thing, is it possible to use ATAPI cdwriting with a cdrw/dvd combo?
> ...

 

What is it doing exactly? Skipping at the same spots?

----------

## dylix

after i apply the mm1 patch i get this error... any ideas?

```
root@reaction:/usr/src/linux> make
  CC      arch/i386/kernel/asm-offsets.s
In file included from include/asm/page.h:15,
                 from include/asm/processor.h:13,
                 from include/linux/prefetch.h:13,
                 from include/linux/list.h:7,
                 from include/linux/signal.h:4,
                 from arch/i386/kernel/asm-offsets.c:7:
include/linux/config.h:6:22: asm/kgdb.h: No such file or directory
make[1]: *** [arch/i386/kernel/asm-offsets.s] Error 1
make: *** [arch/i386/kernel/asm-offsets.s] Error 2
```

----------

## idl

mm2 is out.

Announcement

Take a look at:

o2init.patch

and

o1-interactivity.patch

----------

## bssteph

 *krazo wrote:*   

>  *bssteph wrote:*    *krazo wrote:*   Hey guys...
> 
> Now do you all use ALSA for XMMS or OSS. It seems ALSA has trouble with scheduling (or whatever) because in XMMS an error msg about I/O comes up and a skip happens, yet in OSS emulation I don't get a skip or anything. Both tested when Galeon refreshes or does anything basically.
> 
> Another thing, is it possible to use ATAPI cdwriting with a cdrw/dvd combo?
> ...

 

In the first 5-10 seconds of playback, it will skip if I do something heavy (going back in Galeon to a graphical page, browsing directories in Nautilus my test cases). After that period of time, everything is fine.

----------

## handsomepete

 *Quote:*   

> time-goes-backwards.patch
> 
>   demonstrate do_gettimeofday() going backwards

 

Hmmmm....

----------

## bssteph

 *port001 wrote:*   

> mm2 is out.
> 
> Announcement
> 
> 

 

Builds fine. No big change in what we've been doing for tests, and judging from Con's message in the O2 patch, he's still working on it.

Another interesting side effect is that the nvidia drivers build but fail to be loaded, claiming

'nvidia: Unknown symbol pmd_offset'

Dunno what's causing this, but I've gotten two reports so far.

I built with min_timeslice = 5, max_timeslice = 10 out of curiosity, nothing special and nothing's dying yet. Will try again with both = 10 + granularity as I have nothing better to do. :)

----------

## robmoss

That nvidia thing is quite worrying. I'll have to have a go and tell you what I get - I need to use the nvidia kernel module as well, and it works fine with 2.5.74-mm1. Fingers crossed...

----------

## idl

I get the pmd_offset error as well   :Confused: 

----------

## idl

ok, I got rid of the pmd_offset error and the module loads fine  :Very Happy: . But X doesn't start. It seems the code I commented out was critical  :Laughing: 

----------

## krazo

Is anyone encountering processes that just freeze up when there is heavy IO? It seems I can reproduce it when I emerge sync, untar a large file (ie the kernel) or even compile.. I won't be able to kill it either..

Thanks.

----------

## Safrax

 *krazo wrote:*   

> Is anyone encountering processes that just freeze up when there is heavy IO? It seems I can reproduce it when I emerge sync, untar a large file (ie the kernel) or even compile.. I won't be able to kill it either..
> 
> Thanks.

 

Yeah.  I call this the disappearing program bug (even though the program doesn't really disappear...)

The problem is not present in 2.5.72-mm3.

When I went poking around in /proc looking at the (pid)/status file, I found that those processes were listed as asleep even though they shouldn't have been.
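The field in question is the `State:` line of /proc/(pid)/status. A minimal sketch for pulling it out of the file's text follows; the sample contents are hypothetical. For reference, `S` is an ordinary interruptible sleep and `D` is an uninterruptible one. A process stuck in `D` does not respond to signals until its I/O completes, which would fit the "can't kill it" reports, though that is only a guess about this bug.

```python
def process_state(status_text):
    """Return the one-letter state code from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()
            return fields[1] if len(fields) > 1 else None
    return None


# Hypothetical status contents, for illustration only.
sample_status = "Name:\ttar\nState:\tD (disk sleep)\nPid:\t1234\n"
print(process_state(sample_status))
```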

Anyone got a fix?

----------

## Freak_NL

Yup, here too..

I used to have a XFS root filesystem, but that caused kernel panics in the XFS page buffer when some heavy IO was going on. With ReiserFS I only noticed problems with compiling and unpacking OpenOffice, but hey, at least I now have a Ximian OpenOffice 1.0.3  :Razz: 

Out of curiosity, what filesystem do you all use on your root partition? I've had XFS panics on two computers now, so I'm starting to doubt my kernel configuration skills a bit.  :Sad: 

----------

## krazo

 *Freak_NL wrote:*   

> Yup, here too..
> 
> I used to have a XFS root filesystem, but that caused kernel panics in the XFS page buffer when some heavy IO was going on. With ReiserFS I only noticed problems with compiling and unpacking OpenOffice, but hey, at least I now have a Ximian OpenOffice 1.0.3 
> 
> Out of curiousity, what filesystem do you all use on your root partition? I've had XFS panics on two computers now, so I'm starting to doubt my kernel configuration skills a bit. 

 

I use ReiserFS. Does this problem show up when you emerge sync too? In fact, it happens to Galeon as well now that I think about it... :\ It began in 2.5.73-mm2 I believe.

----------

## handsomepete

 *krazo wrote:*   

>  *Freak_NL wrote:*   Yup, here too..
> 
> I used to have a XFS root filesystem, but that caused kernel panics in the XFS page buffer when some heavy IO was going on. With ReiserFS I only noticed problems with compiling and unpacking OpenOffice, but hey, at least I now have a Ximian OpenOffice 1.0.3 
> 
> Out of curiousity, what filesystem do you all use on your root partition? I've had XFS panics on two computers now, so I'm starting to doubt my kernel configuration skills a bit.  
> ...

 

Ditto.  I've got the same problem, especially if I emerge sync and do anything else (usually web browsing) at the same time - very reproducible.  Glad to hear I'm not alone.  Has anyone passed this along yet?

port001: Ever find a fix for the nvidia problem?

----------

## Peaceable Frood

My Broadcom 4401 isn't working. I heard that the driver in the kernel is broken and has been in the 2.5 series for quite some time. How do I fix this problem? It's my only one.

----------

## Freak_NL

I haven't passed it along yet, since I was having some problems with a dying HD that incidentally included my root partition, so I wasn't sure about what to blame. Now that my system is transferred to a fresh HD I was thinking of using a spare partition to recreate the XFS setup, going for a full debug kernel and passing the results upstream.

I'm just a bit daunted by the task of having to manually type the resulting kernel panic into a textfile..  :Sad: 

With ReiserFS the only problem I encountered was with a loooooong OpenOffice emerge. Have you ever tried to chop up an ebuild so you could resume with the already created objects? It's scary...

It's just that ReiserFS doesn't go for the kernel panic, the problem seems to occur higher up and just hangs the cp or cc or tar or ld or whatever is doing heavy IO at the time. They then proceed to act like undead and refuse to die or live, they don't show up as zombies though..  :Confused: 

----------

## Safrax

 *Freak_NL wrote:*   

> Out of curiousity, what filesystem do you all use on your root partition? I've had XFS panics on two computers now, so I'm starting to doubt my kernel configuration skills a bit. 

 

ReiserFS.

----------

## robmoss

Hmm, this is strange - I quite often have very heavy I/O tasks in progress, using ReiserFS, and am yet to have any sort of crash. Weird...

I'm currently compiling 2.5.74-mm2, and will follow that with a reboot and an attempt to compile the nvidia-kernel module. Wish me luck! If I get it working, I'll report back.

----------

## handsomepete

 *robmoss2k wrote:*   

> Hmm, this is strange - I quite often have very heavy I/O tasks in progress, using ReiserFS, and am yet to have any sort of crash. Weird...

 

If you're feeling adventurous, here's my steps to reproduce.  Note this worked every time in 2.5.73-mm1, but less frequently in 2.5.74-mm1 (actually stalls but can sometimes recover).

1.) Open Mozilla.  Open multiple tabs within mozilla

2.) emerge sync 

3.) Towards the middle/end of the sync process, start loading new urls in multiple mozilla tabs (adding new tabs if you feel like it)

Result: Emerge will stall on updating the cache and the mozilla window will stop refreshing/responding.  It doesn't exactly cause a crash, but the system stays unstable from that point forward.  I think untarring a large file also does the trick.

----------

## thubble

I believe I've found the cause of the pmd_offset error with nvidia drivers. It's caused by the new highpmd.patch introduced in 2.5.74-mm2. To remove it, reverse-apply this patch (for example with `patch -R`):

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.74/2.5.74-mm2/broken-out/highpmd.patch

and recompile your kernel. This patch probably won't do anything for you anyway unless you use highmem (>4GB physical memory).

I'm temporarily using the nv driver while compiling Ximian Desktop 2, but I'll try to test this as soon as possible. Please let me know if it works. Thanks.

----------

## robmoss

 *handsomepete wrote:*   

> If you're feeling adventurous

 

I was. I tried it; it still didn't crash! Maybe it's a kernel problem relating to specific hardware? Or maybe I'm just incredibly lucky?!

----------

## bssteph

 *thubble wrote:*   

> I believe I've found the cause of the pmd_offset error with nvidia drivers. It's caused by the new highpmd.patch introduced in 2.5.74-mm2. To remove it, apply this patch:
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.74/2.5.74-mm2/broken-out/highpmd.patch
> 
> and recompile your kernel. This patch probably won't do anything for you anyway unless you use highmem (>4GB physical memory).
> ...

 

Perfect.

Unapplied the patch and nvidia modprobes without a hitch. Wonder if the patch will find its way into the kernel permanently...

Anyway, great, and thanks for the fix! :)

----------

## Freak_NL

Ya know.. Come to think of it, both systems are quite different (my younger brother's Athlon comp. and my desktop Celly system) but the common factor is the Via Apollo chipset.

handsomepete: you wouldn't happen to have a Via Apollo chipset?  :Confused: 

I'd try this on my Intel chipset laptop, but 2.5 makes my network functionality go away (8139too, weird..), which is quite inconvenient.

----------

## handsomepete

 *robmoss2k wrote:*   

>  *handsomepete wrote:*   If you're feeling adventurous 
> 
> I was. I tried it; it still didn't crash! Maybe it's a kernel problem relating to specific hardware? Or maybe I'm just incredibly lucky?!

 

I have a feeling the people with problems are in the minority.  I'm on a KT400 chipset.  There appears to be a pair of possibly related reports on bugzilla, but with substantially different hardware.  *shrug*

Thanks for the nvidia fix, thubble.

----------

## idl

 *handsomepete wrote:*   

>  *robmoss2k wrote:*    *handsomepete wrote:*   If you're feeling adventurous 
> 
> I was. I tried it; it still didn't crash! Maybe it's a kernel problem relating to specific hardware? Or maybe I'm just incredibly lucky?! 
> 
> I have a feeling the people with problems are in the minority.  I'm on a KT400 chipset.  There appears to be a pair of possibly related reports on bugzilla, but with substantially different hardware.  *shrug*
> ...

 

I've never had a problem with ReiserFS   :Confused:   I just tried those steps to reproduce the problem but it didn't happen. I had xmms playing and the kernel compiling as well, using 2.5.74-mm1.

----------

## idl

 *bssteph wrote:*   

>  *thubble wrote:*   I believe I've found the cause of the pmd_offset error with nvidia drivers. It's caused by the new highpmd.patch introduced in 2.5.74-mm2. To remove it, apply this patch:
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.74/2.5.74-mm2/broken-out/highpmd.patch
> 
> and recompile your kernel. This patch probably won't do anything for you anyway unless you use highmem (>4GB physical memory).
> ...

 

Hmm, kernel won't compile for me with the patch removed   :Confused:   *Quote:*   

> arch/i386/kernel/built-in.o(.text+0x4c07): In function `mark_screen_rdonly':
> 
> : undefined reference to `pmd_offset'
> 
> arch/i386/mm/built-in.o(.text+0x2d7): In function `set_pte_pfn':
> ...

 

----------

## handsomepete

 *port001 wrote:*   

> I've never had a problem with ReiserFS    I just tried those steps to reproduce the problem but it didn't happen. I had xmms playing and the kernel compiling as well, using 2.5.74-mm1.

 

I doubt it's a Reiser problem.  I think it might be a scheduler thing, but like I said, it seems much improved with 2.5.74-mm1, so it may not be much of an issue anymore (has only cropped up a couple times since I booted .74 as opposed to 6+ times per day with .73).  I'm going to give .74-mm2 a shot and see if I can recreate it later.

 *Quote:*   

> Hmm, kernel won't compile for me with the patch removed  

 

Aw, nuts.

----------

## Kummer

I have that IO hangup a lot in 2.5.73, mostly with mozilla and emerge in parallel.

It also gives me a garbled X-display when changing between virtual terminals.

It's a P4 with a SiS645 chipset, nvidia driver (with nvidia-gart and ACPI-IO on), and reiserfs.

Couldn't try 74 yet, b/c of that module-problem.

----------

## bssteph

 *port001 wrote:*   

> 
> 
> Hmm, kernel wont compile for me with the patch removed  :?  *Quote:*   arch/i386/kernel/built-in.o(.text+0x4c07): In function `mark_screen_rdonly':
> 
> : undefined reference to `pmd_offset'
> ...

 

:(

I dunno what would cause that, and unfortunately can't check right now. Maybe some other patch is interfering? I patched a mm2 kernel that I believe was previously patched with granularity, but I don't see how that'd affect anything.

Maybe the patching went haywire? I did 'patch -p1 < highpmd.patch', which detected that the patch was previously applied, and forced me to manually answer "y" to all the questions it asked about undoing or something (I have very little experience with patching by hand :\ ). I could check more later when I get home, but that won't be until 12 hr from now.
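For anyone else backing the patch out by hand: rather than answering the "reversed patch detected" prompts, it may be cleaner to pass -R to patch explicitly. What follows is only a rough sketch on a throwaway file; demo.c and demo.patch are made-up stand-ins for the kernel tree and highpmd.patch, and --dry-run is a GNU patch option.

```shell
# Sketch only: demo.c and demo.patch are made-up stand-ins for a kernel tree
# and highpmd.patch. --dry-run is a GNU patch option.
demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b" "$demo/tree"
printf 'one\ntwo\nthree\n' > "$demo/a/demo.c"     # file before the patch
printf 'one\nTWO\nthree\n' > "$demo/b/demo.c"     # file after the patch

# Build a small unified diff (diff exits non-zero when the files differ).
(cd "$demo" && diff -u a/demo.c b/demo.c > demo.patch)

# Start from a tree that already has the patch applied, as -mm2 does.
cp "$demo/b/demo.c" "$demo/tree/demo.c"
cd "$demo/tree"

# First check that the reversal would apply cleanly, without changing anything.
patch -p1 -R --dry-run < ../demo.patch

# Then actually back the patch out.
patch -p1 -R < ../demo.patch
reverted=$(cat demo.c)
echo "$reverted"
```

If the dry run reports failed hunks, the tree is probably not in the state the patch expects, and it is safer to stop there than to force it.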

----------

## krazo

 *Freak_NL wrote:*   

> Ya know.. Come to think of it, both systems are quite different (my younger brother's Athlon comp. and my desktop Celly system) but the common factor is the Via Apollo chipset.
> 
> handsomepete: you wouldn't happen to have a Via Apollo chipset? 
> 
> I'd try this on my Intel chipset laptop, but 2.5 makes my network functionality go away (8139too, weird..), which is quite inconvenient.

 

I'm on an Inspiron 8200, P4 with Intel chipset and it happens to me.. so it can't only be Via...

EDIT: I have ACPI enabled as well...

----------

## mark

I was having my KDE desktop lock up, usually when compiling and browsing the web under 2.5.73.  Another forum member suggested I recompile the kernel without ACPI. This seemed to resolve the problem, so it may work for others.

Having said that, I'm now running 2.5.74-mm1 with ACPI and so far no lockup. However, it's early days  :Smile: .

I have an Abit VP6 SMP board, which I believe is Apollo Pro based.

Mark

----------

## Safrax

I removed the highpmd patch from 2.5.74-mm2 and I have yet to experience any process screw ups even when under extremely high-io conditions while compiling.

Edit:  After an hour or so of testing the heavy-io bug appeared.  So I got curious.. I tried removing the o1 and o2 patches, recompiled, rebooted, system felt faster overall, but that still didn't solve the problem.  Arrrgghh this is so annoying.  Processes seem to freeze whenever the system runs out of physical RAM and begins to swap.

----------

## handsomepete

Yikes.  It's weird posting from lynx.  Everything was screwed when I got home, including the i/o bug (2.5.74-mm1 w/ ACPI=OFF (yep) - tar was blocked, IO-Wait in top was ~94%) and, for no apparent reason, mozilla stopped resolving forums (and bugs).gentoo.org properly.  I'm in the midst of compiling .74-mm2 with the patch unapplied and debugging turned on (not that I really know what to do with it - hints welcome).
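One possible hint, as a rough sketch: tasks stuck waiting on I/O show up in state D (uninterruptible sleep) in ps, so listing those during a stall shows which processes are blocked and, through the wchan column, roughly where they are waiting. The blocked_tasks helper name is made up for this example, and the sample lines are canned input standing in for real ps output.

```shell
# Sketch only: blocked_tasks is a made-up helper name. It expects
# "PID STAT WCHAN COMMAND" lines, such as `ps -eo pid,stat,wchan,comm` prints.
blocked_tasks() {
    # Skip the header, keep tasks whose state starts with D (uninterruptible sleep).
    awk 'NR > 1 && $2 ~ /^D/ { print }'
}

# Canned sample standing in for real ps output during a stall.
sample='PID STAT WCHAN COMMAND
1 Ss - init
42 D io_schedule tar
77 R - top'
printf '%s\n' "$sample" | blocked_tasks
```

On a live system the same filter would be fed with `ps -eo pid,stat,wchan,comm | blocked_tasks`, ideally from a console that is still responding while the stall is happening.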

----------

## sabre66

Well, I've been working with 2.5.74-mm2 with the highpmd patch applied just to get nvidia-kernel to work. Any website I first go to upon a reboot and restart of mozilla takes forever (resolving blah blah blah), which is not a problem with a kernel < 2.5.74. Mozilla seems to have lost the ability to retain bookmarks and homepage info (don't know if it's kernel related or not). Compiling kernel 2.5.74-xx takes about 30 sec longer on my PC as opposed to anything lesser (bzImages all seem to be the same size), glxgears is 10% slower, hdparm results are the same. Again, I'm really not sure if it's all kernel related, but I'll drop down to 2.5.73-mm3 for now; it really seems to run so much nicer on my PC with less hassle.

(BTW XP3000, nf2, msi g4 ti 4200, 1G Mushkin ram 2x512 matched) 

Oh well awaiting 2.6 , should be real  nice.....

----------

## bssteph

Not that this thread isn't already fraught with odd bugs and patch discussions, but Con has released an O3int, which patches against mm2.

Get 'er here.

http://members.optusnet.com.au/ckolivas/kernel/2.5/

I think it's a sizable improvement over O2int. I'm using O3int + granularity + min_timeslice = max_timeslice = 10, and things are smooth for the most part.

XMMS _usually_ keeps up when I'm doing Nautilus browses. Sometimes when I open XMMS, start a song, and immediately go to browsing, it will stop for couple fractions of a second. Other times not. Kind of peculiar.

Moving windows around is nice, where it was just so-so before. Dragging Nautilus over this very Galeon window leaves a lot of artifacts in the Galeon window though.

XMMS window dragging getting behind the mouse is still possible, but seemingly not nearly as likely to happen. And it doesn't immediately start happening again after letting go and starting a slow drag.

Haven't seen any terrible chop due to web rendering, but I haven't browsed around much yet.

Woo woo. Things are certainly getting better.

----------

## Exner

I want to add my curious experience to the interesting tests other people have been posting. Warning - offtopic of 2.5.74.

I have used 2.5.68-mm2, 2.5.69-mm3. Then I used 2.5.73-mm2 for 7 days. All along my desktop usability seemed to be getting worse, with windows slow to redraw and multimedia skips.

2 days ago I went back to 2.5.73-mm2 and recompiled after removing the HZ-100 patch, and changing -march=athlon-tbird to the default -march=athlon. This has solved my desktop slowness.

I solved my multimedia skips by turning off icon previews in nautilus. I think nautilus should launch its gst-* preview apps with a suitable nice value.

----------

## bssteph

I got the hanging process bug last night. At least for me, it has nothing to do with heavy I/O.

I locked up in the ./configure stage of a libbonobo install. I believe the exact line was checking for a C preprocessor or some such... It just hung. Killable, but totally hung. I say I don't think this has to do with I/O because this was from console, logged in as root, and aside from some daemons, nothing else was running. After killing it, I tried to log in from another vt, which hung after entering my password.

_I've had this bug before._

I was getting it on both of my machines running the mm patchset about two weeks ago. I don't remember what the mm versions were, but I'd thought they'd recently gone away, but I guess not. My box had an uptime of probably somewhere near 6 hours when it had the dying process problem. This machine's been up longer but I haven't reproduced here. I'm going to try to compile something, and then my plan is to patch a .74 kernel with just the Oint patches and granularity, and see what happens.

Oh, machines:

Box #1 (desktop): 1.4 GHz Athlon, 384 MB RAM, one SCSI, one IDE hard drive. VIA82CXXX chipset. GeForce 3, nvidia driver.

Box #2 (laptop): 1.6 GHz Pentium 4, 256 MB RAM, one IDE hard drive. Intel PIIXn chipset. Radeon Mobility, kernel's radeon driver.

Both have been booting using ACPI for IRQ routing. ACPI compiled in, as is PnP support and the PnP device name database. Both using preempt, MTRRs, and a bunch of standard fare kernel opts. I fear this is something beyond a simple driver issue.

----------

## bssteph

Hmmph. I can't reproduce here. And I used to be able to. Before it didn't take much, either. Start a compile, open X-Chat, maybe do some browsing in Galeon, and that'd take care of it. Nothing now. X-Chat open, doing lots of browsing in Nautilus, doing things in Galeon, opening random games, playing stuff in XMMS and then quickly swapping to a video (card can't do mixing), and all this while compiling. Everything kept right up.

This got me to thinking. Maybe there's other forces at work here. I diff'ed my two configs and didn't come up with much different. The different drivers of course, and a couple other options either related to those or probably not relevant given the circumstances. A couple of things caught my eye, though.

Here's my first salvo at randomly trying to destroy this bug.

Specifically for people that are getting the problems: Anyone NOT using "Enhanced real time clock support"?

----------

## Safrax

 *bssteph wrote:*   

> Specifically for people that are getting the problems: Anyone NOT using "Enhanced real time clock support"?

 

I use it on all computers I have been able to duplicate this bug on.  Problems seem to occur most often when using tar/gzip/bzip2 for compression or decompression.  The process bug also seems to manifest when the system needs to swap things out, but of course I'm not a programmer, nor do I understand the workings of the kernel beyond simplistic concepts, so I am probably wrong.

----------

## bssteph

 *Safrax wrote:*   

> I use it on all computers I have been able to duplicate this bug on.  Problems seem to occur most often when using tar/gzip/bzip2 for compression or decompression.  The process bug also seems to manifest when the system needs to swap things out, but of course I'm not a programmer, nor do I understand the workings of the kernel beyond simplistic concepts, so I am probably wrong.

 

Neither do I, really. I'm just trying stuff at random.  :Smile: 

I asked about enhanced RTC because right now I can't get any lockage on my laptop. It's using RTC emulation instead of enhanced RTC (which the desktop uses). That was a recent change, meaning that two weeks ago I was using enhanced. There was a portion where my desktop was using emulation as well, when I was trying random stuff for my sound card. And that could very well have coincided with the time when I didn't get the lockup bug. So yeah. I'm recompiling the desktop's kernel right now, so I'll report back later (although it seems to take a lot of time before these bugs show up).

ramdisk also caught my eye, because this machine uses framebuffer. So I enabled that as well, in vain attempts to get something worked out. I'm guessing some of you are using framebuffer though.

----------

## Safrax

Hmm.  Okay, I'll see what happens when the desktop runs on rtc emulation instead...

----------

## R-Type

After receiving "Unknown symbol pmd_offset" errors, I spent some time digging around the mm sections of the kernel.  I THINK pmd_offset_kernel() is the function nvidia.o should be calling.  Try editing nv.c and changing pmd_offset to pmd_offset_kernel.  Then build the module.  It's been working fine for me so far.  I've stress-tested it with quake3, q2 and some opengl screensavers.

I'd post a patch, but I have nowhere to upload it atm.
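In lieu of a patch, the edit described above could be scripted roughly like this. It is a sketch only: the two-line file is a stand-in for nv.c rather than the real driver source, it assumes the call looks like `pmd_offset(pg_dir, address)`, and it does nothing beyond the quick rename.

```shell
# Sketch only: a two-line stand-in for nv.c, not the real driver source.
nvdemo=$(mktemp -d)
cat > "$nvdemo/nv.c.orig" <<'EOF'
    pg_mid_dir = pmd_offset(pg_dir, address);
    if (pmd_none(*pg_mid_dir))
EOF

# Rewrite calls to pmd_offset( only; the trailing "(" keeps the pattern from
# matching names that merely start with pmd_offset.
sed 's/pmd_offset(/pmd_offset_kernel(/' "$nvdemo/nv.c.orig" > "$nvdemo/nv.c"

# Show what changed (diff exits non-zero because the files differ).
diff -u "$nvdemo/nv.c.orig" "$nvdemo/nv.c"
```

Writing to a new file and keeping nv.c.orig around makes it easy to diff the result or to go back if the module misbehaves.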

----------

## Safrax

 *R-Type wrote:*   

> After receiving "Unknown symbol pmd_offset" errors, I spent some time digging around the mm sections of the kernel.  I THINK pmd_offset_kernel() is the function nvidia.o should be calling.  Try editing nv.c and changing pmd_offset to pmd_offset_kernel.  Then build the module.  It's been working fine for me so far.  I've stress-tested it with quake3, q2 and some opengl screensavers.
> 
> I'd post a patch, but I have nowhere to upload it atm.

 

I just remove the highpmd patch  :Razz:  but your solution seems to be the right one.

----------

## bssteph

 *R-Type wrote:*   

> After receiving "Unknown symbol pmd_offset" errors, I spent some time digging around the mm sections of the kernel.  I THINK pmd_offset_kernel() is the function nvidia.o should be calling.  Try editing nv.c and changing pmd_offset to pmd_offset_kernel.  Then build the module.  It's been working fine for me so far.  I've stress-tested it with quake3, q2 and some opengl screensavers.
> 
> I'd post a patch, but I have nowhere to upload it atm.

 

Oh wow. I'd never even thought to check if the kernel call was in the part of the drivers we can actually see the source for...  :Smile:  Will try in a bit, want to finish my gnome upgrade first.

----------

## handsomepete

So far I've had great luck with .74-mm2 (ACPI works again, no locks yet, got the nvidia driver working), but as soon as I tried to untar mozilla nightly sources, tar got blocked by OO.o (I/O wait shot up to 96%, not swapping).  I'm losing stability, so I better submit this.  :Smile: 

Here's the debugging output of the stall:

```
Debug: sleeping function called from illegal context at mm/page_alloc.c:545
Call Trace:
 [<c011b7df>] __might_sleep+0x5f/0x80
 [<c013c794>] buffered_rmqueue+0xe4/0x1b0
 [<c013cae0>] __alloc_pages+0x280/0x300
 [<e4b57c99>] __nvsym00258+0x15/0x1c [nvidia]
 [<c0118368>] pte_alloc_one+0x18/0x50
 [<c0144f9c>] pte_alloc_map+0x3c/0xc0
 [<c0145fc4>] remap_page_range+0xb4/0x1f0
 [<e4b43372>] nv_kern_mmap+0x2d2/0x314 [nvidia]
 [<c014877b>] do_mmap_pgoff+0x31b/0x6f0
 [<c010fb3a>] sys_mmap2+0x9a/0xd0
 [<c01093cb>] syscall_call+0x7/0xb
```

If anyone's trying to do comparisons:

lspci

.config

R-Type: If that patch works, upload it to bugs.gentoo.org with a request to modify the ebuild.  If the change in the kernel sticks, we'll probably want the patch included with the ebuild.

----------

## bssteph

Looks like the prior nvidia driver hack isn't quite all that's necessary. This is from the Linux-Kernel Archive:

http://www.ussg.iu.edu/hypermail/linux/kernel/0307.0/att-1349/01-NVIDIA_kernel-1.0-4363-highpmd.diff

This patches nv-linux.h and nv.c to work with the highpmd patch in mm2 (and likely beyond)

----------

## idl

 *bssteph wrote:*   

> Looks like the prior nvidia driver hack isn't quite all that's necessary. This is from the Linux-Kernel Archive:
> 
> http://www.ussg.iu.edu/hypermail/linux/kernel/0307.0/att-1349/01-NVIDIA_kernel-1.0-4363-highpmd.diff
> 
> This patches nv-linux.h and nv.c to work with the highpmd patch in mm2 (and likely beyond)

 

Thanks, works a treat.

----------

## thubble

 *bssteph wrote:*   

> Specifically for people that are getting the problems: Anyone NOT using "Enhanced real time clock support"?

 

I'm not using enhanced RTC (or generic RTC emulation) and I still get the lockup/dead process bug.  :Sad: 

----------

## bssteph

Desktop just locked up again. IO-wait was in the 80-90% range, happened while listening to a mp3, emerge syncing, and doing casual browsing. First noticed it when I couldn't switch tabs in Galeon. Quit XMMS and everything was back to normal (after foolishly ctrl-c'ing the emerge sync). Quickly quit gnome and am currently compiling a 2.5.74 kernel with Con's patches applied by hand.

Kernel compile just locked up, too. There goes my random guess.

 :Evil or Very Mad: 

Laptop just had a lockup too. I need to blow something up.

 :Evil or Very Mad:   :Evil or Very Mad: 

Last edited by bssteph on Mon Jul 07, 2003 9:58 pm; edited 2 times in total

----------

## R-Type

Oh well, I tried heheh.  I'll have to try that patch out..

----------

## Makaveli[FIN]

Can somebody help me out a little bit? I've downloaded NVIDIA_kernel-1.0-4363.tar.gz (already unpacked) and NVIDIA_kernel-1.0-4363-2.5.diff. Now, how exactly do I patch NVIDIA_kernel with that diff file?   :Confused: 

----------

## Safrax

 *Makaveli[FIN] wrote:*   

> Can somebody help me out a little bit? I've downloaded NVIDIA_kernel-1.0-4363.tar.gz (already unpacked) and NVIDIA_kernel-1.0-4363-2.5.diff. Now, how exactly do I patch NVIDIA_kernel with that diff file?  

 

Move the patch into the directory with the driver sources in it and run this, with the file name of your patch after the `<`:

```
patch -p1 < NVIDIA_kernel-1.0-4363-2.5.diff
```

----------

## krazo

Has anyone gotten supermount-ng to compile on 2.5.74-mm2?

EDIT: whoops had 2.6.74 :\

----------

## Safrax

 *krazo wrote:*   

> Has anyone gotten supermount-ng to compile on 2.6.74-mm2?

 

Nope  :Sad: 

----------

## Ansorg

 *Safrax wrote:*   

> move the patch to the directory with the drivers in it and do this:
> ...

 

great, but unfortunately patch generates an error

```
Hunk #1 succeeded at 234 with fuzz 1 (offset 9 lines).
patching file nv.c
Hunk #2 FAILED at 2105.
1 out of 2 hunks FAILED -- saving rejects to file nv.c.rej
```

```
cat nv.c.rej
***************
*** 2105,2115 ****
      if (pgd_none(*pg_dir))
          goto failed;
  
-     pg_mid_dir = pmd_offset(pg_dir, address);
-     if (pmd_none(*pg_mid_dir))
          goto failed;
  
-     NV_PTE_OFFSET(address, pg_mid_dir, pte);
  
      if (!pte_present(pte))
          goto failed;
--- 2105,2116 ----
      if (pgd_none(*pg_dir))
          goto failed;
  
+     NV_PMD_OFFSET(address, pg_dir, pg_mid_dir);
+ 
+     if (pmd_none(pg_mid_dir))
          goto failed;
  
+     NV_PTE_OFFSET(address, &pg_mid_dir, pte);
  
      if (!pte_present(pte))
          goto failed;
```

looked at the nv.c file but don't quite understand the diff stuff and what is supposed to get replaced by the patch ...

any hints?

----------

## Ansorg

ok, figured the patch thingy out (manual editing of nv.c)

and now XFree works again with nvidia drivers

thanks

----------

## maor

Finally the jumpy mouse is fixed. Now all that's left is to compile the nvidia driver and all will be great.

----------

## bssteph

Argh. I just gave up. Whittled my config down to a minimal system, used it for a 2.5.74 virgin kernel, and STILL had IO-wait madness and process freeze. Threw in the towel right then and there.

Good news is that Con's patches aren't to blame, I don't think. 2.5.73+O3int+granularity = nice. :)

----------

## Safrax

 *bssteph wrote:*   

> Argh. I just gave up. Whittled my config down to a minimal system, used it for a 2.5.74 virgin kernel, and STILL had IO-wait madness and process freeze. Threw in the towel right then and there.
> 
> Good news is that Con's patches aren't to blame, I don't think. 2.5.73+O3int+granularity = nice. 

 

I still have that bug in 2.5.73..  I've gotta go to 2.5.72 before I can successfully get away from it.

----------

## floam

Got a new system bootstrapped with gcc 3.3 and kernel 2.5.74-mm2 (using its headers as well). The only real problem experienced is xfree. XFree 4.3.0-r3 fails halfway through compiling; 4.3.99.7 from cyfred compiles fine, though the kernel oopses and causes a segfault in ebuild.sh (or vice-versa?) when stripping libGLU (always at the same spot). Here's the oops, maybe someone can figure something out. 

```
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<00000000>]    Not tainted VLI
EFLAGS: 00010286
EIP is at 0x0
eax: c03befc0   ebx: fffffff4   ecx: d7780bdc   edx: d7780bdc
esi: cde351dc   edi: ce5c3c00   ebp: da29df70   esp: da29df08
ds: 007b   es: 007b   ss: 0068
Process strip (pid: 30971, threadinfo=da29c000 task=d90e2140)
Stack: c0158762 cde351dc ce5c3c00 da29df70 ffffffd8 00000242 da29df70 00000002
       c0158ebc da29df78 d7780bc0 da29df70 dde800c0 da29df78 00000001 d7780bc0
       c5bbe5c0 00000241 00000002 cfc5c000 da29c000 c014a49e cfc5c000 00000242
Call Trace:
 [<c0158762>] __lookup_hash+0xa2/0xd0
 [<c0158ebc>] open_namei+0x2ac/0x3e0
 [<c014a49e>] filp_open+0x3e/0x70
 [<c014a89b>] sys_open+0x5b/0x90
 [<c0108ecb>] syscall_call+0x7/0xb
Code:  Bad EIP value.
 /usr/sbin/ebuild.sh: line 1219: 30971 Segmentation fault      strip --strip-debug ${x}
```

Sucks being stuck without xfree  :Sad: 

----------

## iamarug

I was thinking about doing a fresh install on a new hard drive and system with the dev kernel. I think the pre release version of the kernel should be out any day now and hopefully all will work well with a gcc 3.3 bootstrap.

Otherwise, I will cry   :Crying or Very sad: 

scratch that, I am a grown man, I now replace crying with drinking   :Twisted Evil: 

----------

## Nicom

This is my first time trying out a 2.5.x kernel, and it won't boot up. It freezes most of the way through init, while starting adsl. When I accidentally chose the wrong NIC module it booted all the way and just failed on adsl connect, but when I chose the correct module (ne2k-pci) it freezes where I said. Anyone know what's up?

----------

## MetalGod

Ok, some oopses and kernel panics and nvidia DMA problems.

But 2.5.74-mm1 works fine

----------

## thubble

2.5.74-mm3 is out! It includes the O3 patch, but most interesting is this:

reiserfs-dirty-memory-fix.patch - The ClearPageDirty() in there is wrong - it doesn't adjust the VM's dirty memory accounting.  The system thinks it's full of dirty memory and stops.

Maybe this was the problem with the IO lockups? (Was anyone getting these without using ReiserFS?) Anyway, I'm gonna go compile this now.

----------

## Safrax

 *thubble wrote:*   

> 2.5.74-mm3 is out! It includes the O3 patch, but most interesting is this:
> 
> reiserfs-dirty-memory-fix.patch - The ClearPageDirty() in there is wrong - it doesn't adjust the VM's dirty memory accounting.  The system thinks it's full of dirty memory and stops.
> 
> Maybe this was the problem with the IO lockups? (Was anyone getting these without using ReiserFS?) Anyway, I'm gonna go compile this now.

 

Hopefully.

This is not directed to anyone in particular..

What good are the con kolivas interactivity patches?  For me, they slow the entire machine down, and make the skipping problems worse!  So bad I usually wind up reversing the patches from the kernel..  Perhaps I'm doing something wrong?

----------

## Kummer

Well, now the nvidia-kernel ebuild seems to apply a highmem patch, but it doesn't work for me   :Crying or Very sad: 

The module loads fine, but on starting X the nvidia driver bails with a

'cannot allocate DMA context' (IIRC)

Even worse, the patch is applied even when using 73 again, making nvidia unusable there, too

(before changing back the ebuild)

----------

## bssteph

Both of my machines and a friend's machine are reiserfs only systems (with the exception of an unmounted /boot). Someone having the problems please report back having used -mm3. I would try but don't have the time to stress the system before I leave.

As for the Con patches, I don't really know what you could do "wrong", per se. Their goal, obviously, is to increase interactivity, with some quibbles as to the effect it has on throughput. IANAKC (I Am Not A Kernel Coder, anyone ever used that?), but from what I understand the patches attempt to do a better job at guessing what needs to be scheduled by giving scheduler bonuses to tasks that aren't sleeping often. Processes that are just waiting for their hunk of the CPU get dealt with sooner, while those in sleep are treated normally.

At least that's how I see it from comments in the patches and things. Anyone correct me if I'm wrong.

For me the patches have been wonderful, although they seem a bit like hitting the problem with an oversized hammer... things were better before, I feel it may be better to find what happened to that and fix, but this is working well. And maybe if the changes were somewhere else, we get the best of both worlds.  :Wink: 

I don't know what you should do though, Safrax.. O3+granularity+timeslice hack is doing well for me, I've preempt enabled in the kernel, perhaps that is affecting something? And I don't know what effect would be had otherwise, but be sure you're using the anticipatory scheduler (if you don't use elevator=, you're fine)

----------

## maor

I have a problem compiling mm3: it complains about a mistake in apm.c. Anyone else experience this?

----------

## krazo

 *maor wrote:*   

> I have a problem compiling mm3: it complains about a mistake in apm.c. Anyone else experience this?

 

Yup, this is being discussed on the kernel mailing list.. a patch is floating around on it too..

----------

## floam

2.5.74-mm3 is out, not sure if it's in portage yet, it wasn't last night. Fixed a bunch of oopses I was getting, so I'm happy.

----------

## Safrax

 *floam wrote:*   

> 2.5.74-mm3 is out, not sure if it's in portage yet, it wasn't last night. Fixed a bunch of oopses I was getting, so I'm happy.

 

It's in portage.

----------

## Safrax

I am happy to report that the process bug appears to be gone with 2.5.74-mm3.  After an hour of looping bonnie++, looping a kernel compile, and surfing I have yet to experience the bug!  Yay!

/me knocks on wood.

----------

## krazo

The IO wait madness seems to be fixed in 2.5.74-mm3... =D

----------

## alsh

 *Kummer wrote:*   

> Well, now the nvidia-kernel ebuild seems to apply a highmem patch, but it doesn't work for me  
> 
> The module loads fine, but on starting X the nvidia driver bails with a
> 
> 'cannot allocate DMA context' (IIRC)
> ...

 

I get the same nvidia error while trying to start x. Anyone know how to fix it?

----------

## Safrax

 *alsh wrote:*   

>  *Kummer wrote:*   Well, now the nvidia-kernel ebuild seems to apply a highmem patch, but it doesn't work for me  
> 
> The module loads fine, but on starting X the nvidia driver bails with a
> 
> 'cannot allocate DMA context' (IIRC)
> ...

 

I use the older 3123 driver.  It doesn't have the 2D problems of the 4XXX series (that were supposedly fixed but still are not to an acceptable level IMO) and seems to work better with the development kernels.

I simply changed the pmd_offset instance to pmd_offset_kernel in nv.c as was recommended in this thread, and the driver works fine.

----------

## thubble

Well, whaddya know... con has released O4int!

http://members.optusnet.com.au/ckolivas/kernel/2.5/patch-O4int-0307101041

Apparently it will prevent fully interactive tasks from becoming lower priority from small bursts of CPU usage. Maybe this'll solve the X problems (cursor jerkiness, etc.)?

Anyway, it's too late tonight, I'll test it tomorrow.

----------

## Nicom

Nobody else has adsl lockup problems then? I wonder what is wrong with my configuration.

----------

## R-Type

 *Safrax wrote:*   

> 
> 
> I use the older 3123 driver.  It doesn't have the 2D problems of the 4XXX series (that were supposedly fixed but still are not to an acceptable level IMO) and seems to work better with the development kernels.
> 
> I simply changed the pmd_offset instance to pmd_offset_kernel in nv.c as was recommended in this thread, and the driver works fine.

 

Just keep in mind that I did this as a quick hack, and at the very least it's grossly incomplete if not outright wrong.  Someone posted that the ebuild for 4349's been updated with a better fix.  I recommend looking into what's done to that ebuild and seeing if you can retrofit it to 3123...

...of course, if it ain't broke....;)

----------

## floam

The 3xxx drivers really don't have much reason to be used. They are not better for the 2.5 series, no new patches are being released for them. The 2D problems have been nearly completely destroyed (as of 4363) as long as you have renderaccel on. Plus they're faster 3D-wise. What's the deal?

----------

## kuba

Using the patch NVIDIA_kernel-1.0-4363-highpmd.diff supplied by portage, I get this error when starting X: 'cannot allocate DMA context' (IIRC). After replacing NVIDIA_kernel-1.0-4363-highpmd.diff with the patch linked in an earlier post in this thread, everything is fine and X starts normally.

----------

## Lovechild

Davide has pumped out a patch, could someone please test - I broke my box (again) so I'm unable to test right now, emerge -e world takes forever and a day.

http://www.xmailserver.org/linux-patches/softrr.html

----------

## zatalian

2.5.74-mm2 and 2.5.74-mm3 both refuse to work for me. They just stop booting. I tried with ACPI=OFF, but that does not change a thing.

All other development kernels (since 2.5.66) have been working great for me.

These are the last lines i get when booting 2.5.74-mm3

```
...
Console: colour VGA+ 80x25
Calibrating delay loop... 1437.69 BogoMIPS
Memory: 772208k/786424k available (3019k kernel code, 13440k reserved, 1089k data, 208k init, 0k highmem)
```

I don't see any errormessages in these lines. Can somebody tell me what the next line is supposed to be?

Thanks.

----------

## bssteph

I return to good news, I see. Downloading mm3 right now, but likely won't test until this afternoon (it's 5 am now... ).

Lovechild: Will gladly test; should the SCHED_SOFTRR patch be used in conjunction with Con's OXint patches, or should I patch a virgin 2.5.74 kernel?

Gotta zip: sleep deprivation is making my eyes jitter back and forth randomly and I'm getting dizzy

----------

## Lovechild

 *bssteph wrote:*   

> I return to good news, I see. Downloading mm3 right now, but likely won't test until this afternoon (it's 5 am now... ).
> 
> Lovechild: Will gladly test; should the SCHED_SOFTRR patch be used in conjunction with Con's OXint patches, or should I patch a virgin 2.5.74 kernel?
> 
> Gotta zip: sleep deprivation is making my eyes jitter back and forth randomly and I'm getting dizzy

 

If I understand the idea correctly it shouldn't matter, but try the -mm + O4int + sched-softrr first, that would be a rocking combo I think  :Smile: 

If that fails, give vanilla a spin.

----------

## Tuna

zatalian: do you have SMP enabled? try without. my kernel hung at the same spot, and disabling SMP did the trick for me.

other issues i experienced with the kernel:

USB Mouse only wants to work if ACPI disabled. (acpi=off as kernel parameter)

nvidia module is not working right out of the box (tested on mm3).

normally it would exit with that well known DMA error. but at first i was getting this:

```
(**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32

(==) NVIDIA(0): RGB weight 888

(==) NVIDIA(0): Default visual is TrueColor

(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)

(--) NVIDIA(0): Linear framebuffer at 0xEC000000

(--) NVIDIA(0): MMIO registers at 0xEA000000
```

where it chickened out.

----------

## zatalian

i have a dual p3, so disabling smp is not an option for me.

----------

## Safrax

How can I make a patch?  

```
diff --normal -r (kerneldir1) (kerneldir2) > (patchname)
```

 doesn't seem to work as expected.

I figure some people here might not be able to get the 2.5.74-mm3 + Con's + SCHEDRR to patch properly and I'd like to save em some time.

Edit:  Forgot to say it had the o1,o2,o3,o4 and granularity patches.

It feels really smooth.

----------

## daen1543

 *Tuna wrote:*   

> nvidia module is not working right out of the box (tested on mm3).
> 
> normally it would exit with that well known DMA error. but at first i was getting this:
> 
> ```
> ...

 

I'm getting the same behavior. What's worse is that 2.5.74-mm1, which I had running well with the nvidia drivers earlier, now refuses to work, with the same symptoms. Weird 

 :Shocked: 

----------

## thubble

 *Safrax wrote:*   

> How can I make a patch?  
> 
> ```
> diff --normal -r (kerneldir1) (kerneldir2) > (patchname)
> ```
> ...

 

```
diff -ur (kerneldir1) (kerneldir2) > (patchname)
```

Note that I've never done this before, but this seems like the logical thing to do. Good luck.
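For anyone following along, here is a minimal sketch of the whole round trip on throwaway directories, so you can see the patch being made and applied before trying it on real kernel trees. The directory and file names are made up for the demo, and the `-N` flag is an addition of mine (it makes diff include files that exist in only one tree), not part of the command above.

```shell
# Toy stand-ins for two kernel trees; every name here is made up for the demo.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p linux-orig/kernel linux-new/kernel
printf 'timeslice = 200\n' > linux-orig/kernel/sched.c
printf 'timeslice = 100\n' > linux-new/kernel/sched.c

# -u: unified format, -r: recurse into subdirectories,
# -N: treat files missing from one tree as empty.
# diff exits with status 1 when the trees differ, which is expected here.
diff -urN linux-orig linux-new > my.patch || [ $? -eq 1 ]

# Apply to a fresh copy of the original tree; -p1 strips the leading
# "linux-orig/" or "linux-new/" component from the paths in the patch.
cp -r linux-orig linux-test
(cd linux-test && patch -p1 < ../my.patch)

# A silent diff means the patched copy now matches the new tree.
diff -r linux-new linux-test && echo "patch round trip OK"
```

The order of the two directories matters: original tree first, modified tree second, otherwise the patch comes out reversed.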

----------

## Safrax

Thanks Thubble.  That did the trick.  I'll have a patch against 2.5.74-mm3 ready soon.

I have the patch ready but no place to host it.  It's currently 2.4KB in size.

----------

## bssteph

-mm3 + O4int + granularity + SOFTRR works, but feels like a bit of a regression. Got some mouse chop when Mozilla was rendering, didn't have that in just O3int + granularity. Will likely revert to -mm3 + granularity + SOFTRR.

I think I'm understanding what's going on with SOFTRR now, and it seems nice. Had XMMS use realtime priority (not running as root), and it seemed to not lag up as badly as before (I don't know what it is, but right now my Nautilus + XMMS blues are back, and with a vengeance). I'm assuming the patch makes realtime look the same to userspace, so XMMS was using SOFTRR? If I'm wrong, please correct me. Starting that recompile now.

----------

## bssteph

 *Safrax wrote:*   

> Thanks Thubble.  That did the trick.  I'll have a patch against 2.5.74-mm3 ready soon.
> 
> I have the patch ready but no place to host it.  It's currently 2.4KB in size.

 

I can host, at least for a while. PM me with it in a code block or something.

http://www.bssteph.net/kernel/patch-O4int_and_softrr0.3

There. This should patch cleanly.

To apply the patch, do this:

place the patch in /usr/src/linux-2.5.74-mm3/

cd /usr/src/linux-2.5.74-mm3

patch -p1 < patch-O4int_and_softrr0.3

You MUST have a clean 2.5.74-mm3 merge of course.
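If you're not sure your tree is clean, GNU patch can do a trial run first with `--dry-run`, which reports what would happen without changing any files. A small sketch of a wrapper that only applies the patch when the trial run succeeds; the function name `safe_patch` is made up, and the patch file has to be given as an absolute path because the function changes directory.

```shell
# Apply a patch only if a dry run succeeds.
# Usage: safe_patch <tree> <absolute-path-to-patch>
# Hypothetical helper, assumes GNU patch (for --dry-run).
safe_patch() {
    tree=$1
    patchfile=$2
    if (cd "$tree" && patch -p1 --dry-run < "$patchfile" > /dev/null 2>&1); then
        # The trial run was clean, so apply for real.
        (cd "$tree" && patch -p1 < "$patchfile")
    else
        echo "patch does not apply cleanly to $tree, leaving it untouched" >&2
        return 1
    fi
}
```

Something like `safe_patch /usr/src/linux-2.5.74-mm3 /usr/src/linux-2.5.74-mm3/patch-O4int_and_softrr0.3` would then either patch the tree or leave it alone.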

Note: sorry Safrax, this is my own patch. I had problems with yours. :\

----------

## Lovechild

Davide's idea is to follow textbook scheduler design: less time, more frequently, for interactive tasks, since we don't want them to saturate the CPU and thus starve other tasks, and longer time, less frequently, for background jobs - at least that's what he described in his first mail.

I would set

CHILD_PENALTY to 95

and MAX_TIMESLICE around 100-120, MIN_TIMESLICE stays at 10,

with this approach, since we want to delegate exponential rates from lowest to highest nice levels.
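To get a feel for what those numbers do, here is a rough sketch in shell arithmetic. It assumes, as far as I recall from the 2.5 sched.c, that a task's base timeslice is scaled linearly between MIN_TIMESLICE and MAX_TIMESLICE according to its nice level, and that the stock MAX_TIMESLICE is 200 ms; both are assumptions, so check them against your own source tree. The function name is made up and this is not kernel code.

```shell
# Base timeslice in ms for a nice level (-20..19), assuming linear scaling
# between min and max. Hypothetical helper, not taken from the kernel.
# Usage: base_timeslice <nice> <min_ms> <max_ms>
base_timeslice() {
    nice=$1
    min=$2
    max=$3
    # nice -20 gets the full max, nice 19 gets min, with 39 steps in between.
    echo $(( min + (max - min) * (19 - nice) / 39 ))
}

# Compare the assumed stock maximum (200) with the suggested one (120).
for n in -20 0 19; do
    echo "nice $n: max=200 -> $(base_timeslice "$n" 10 200) ms, max=120 -> $(base_timeslice "$n" 10 120) ms"
done
```

Under these assumptions a nice 0 task drops from 102 ms to 63 ms per slice when the maximum goes from 200 to 120. Note that this scaling is linear, not exponential.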

The problem of course is that the current scheduler scheme does not properly handle the two keys to scheduling separately: urgency and importance.

See, sound, video, etc. may not be important but they're urgent - and the current nice scheme is very one-dimensional.

Thus to provide handling of urgency, we have what's called interactivity - but this approach isn't really complete in its current form.

I would like a setup that puts more weight on urgency, since this would be the best approach for all setups (servers don't have many urgent tasks - thus importance scaling is the only concern). It seems however that the current setup prefers importance as the determining factor.

Maybe the way to go is to allow each driver to set an interactive flag or level and pass this to the scheduler when handling requests. This could be botched into the current framework with little trouble. It would however be ugly like hell. 

Maybe we need to go look at different scheduling techniques like adaptive scheduling and such to provide good scalability on desktop PCs - the old scheduler is dumb enough not to care really, whereas the new one seems to try very hard to scale (it's O(1) after all, meaning equal time to handle a request regardless of the number of processes; the old one scaled linearly with the number of processes, O(n)).

The hard part about scheduling is that no one scheme fits perfectly on all setups - so we have to pick the very best fit for our target, and with Linux taking on the desktop, embedded and the highend - one single scheduler hardly seems to do the trick.

----------

## Safrax

 *bssteph wrote:*   

> Note: sorry Safrax, this is my own patch. I had problems with yours. :\

 

Ahh well.  It was the first time I made a patch.  Although I'd like to know more about why it didn't work.

Lovechild:  Nice post.  I agree with you on the state of the linux scheduler.  I've always wondered how hard it would be to allow the user to pick a scheduler (assuming there were multiple schedulers to choose from) in menuconfig.  Perhaps someday someone will code all that.

----------

## Lovechild

 *Safrax wrote:*   

>  *bssteph wrote:*   Note: sorry Safrax, this is my own patch. I had problems with yours. :\ 
> 
> Ahh well.  It was the first time I made a patch.  Although I'd like to know more about why it didn't work.
> 
> Lovechild:  Nice post.  I agree with you on the state of the linux scheduler.  I've always wondered how hard it would be to allow the user to pick a scheduler (assuming there were multiple schedulers to choose from) in menuconfig.  Perhaps someday someone will code all that.

 

I doubt that will ever happen. I think the more likely thing to happen will be exporting a bunch of scheduler knobs to sysfs and having userspace adjust them to fit the machine - O(1) is quite flexible if you disregard the general rules (like it used to and the 2.4 backport still does).

I fear however that the current trend will continue - every time someone suggests an interactivity improvement to mainline, the webserver and database bigshots yell and scream that this is the end of high-end scalability and that only jerkoffs and whiners would consider this. In the end servers pay for dinner so they get to keep their high throughput. And if that happens I see the only solution being a desktop-oriented fork of the kernel backed by the movers and shakers in that field - Mandrake, Lindows, SuSE, etc. - since so many things really need to be changed to make Linux really good on the desktop.

----------

## bssteph

Safrax: Not why it didn't work, but there was a 0.3 for softrr and it seems the patch you gave me did 0.2. Also it applied granularity, which seemed bad for me, but then again, I remember my performance of .73+O1,2,3int being better than .74-mm3+O4.. Granularity will patch fine on its own, I imagine.

Anyway, the patch's actual error was that it complained every hunk was malformed. This may have been because of the board software replacing tabs with spaces, which I forgot about. But either way, the tabbing was enough to just have me do it myself. I'd only noticed it when I tried applying the patch.
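A quick sanity check for this kind of damage, as a sketch: kernel source is indented with tabs, so a patch against it that contains no tab character at all has probably been through a tab-to-space conversion somewhere. The helper names below are made up.

```shell
# Succeeds (exit 0) if the file contains at least one tab character.
# Hypothetical helper for a pre-flight check on a pasted patch.
has_tabs() {
    grep -q "$(printf '\t')" "$1"
}

# Print a one-line verdict for a patch file.
check_patch() {
    if has_tabs "$1"; then
        echo "$1: tabs present"
    else
        echo "$1: no tabs found, possibly mangled by copy and paste"
    fi
}
```

If a patch does fail this check, GNU patch's `-l` (`--ignore-whitespace`) option is sometimes enough to get it applied anyway, though I wouldn't count on it; passing the file around as an attachment or a download avoids the problem altogether.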

Lovechild: Having drivers say how they wanted to be treated is probably the easiest, but as you said, certainly messy. And in the end it'd be nice if the scheduler were just good enough to do it on its own.

How much of these issues are handled by the elevators?

And the way interactivity is working now, does it say "I'm something that's certainly interacting with the user"? Because it seems a nice indication, but as far as I can tell from what you said, the better statement for these processes is "This needs to be done NOW"...the urgency.

While I was out I thought of a couple things that would be interesting to try with the scheduler, but they never felt like actual fixes. Just workarounds. And admittedly, I'm not entirely knowledgeable at this, but I gave it a think.

Anyway. O4+softrr is nice, there's a regression since my .73+O1,2,3 kernel, but that seems to have come from something else, as just .74-mm3 seemed slow. Once the reiserfs fix gets merged I'll probably leave mm again and just do sched patches.

(this post would have been more timely but I was literally dragged to go see T3 about halfway through it)

----------

## AlterEgo

2.5.75 is out  :Smile:  

----------

