# MTRR allocation not optimal

## kernelOfTruth

Hi guys,

I'm getting the following error message in dmesg at boot:

mtrr_cleanup: can not find optimal value

please specify mtrr_gran_size/mtrr_chunk_size

http://pastebin.com/sJqqhaRg

even when I specify those settings it still fails: it keeps searching for gran_size/chunk_size values but never finds a working combination

 *Quote:*   

> cat /proc/mtrr 
> 
> reg00: base=0x000000000 (    0MB), size= 8192MB, count=1: write-back
> 
> reg01: base=0x200000000 ( 8192MB), size= 1024MB, count=1: write-back
> ...

 

any ideas on how to "fix" this ?

Many thanks in advance :)

edit:

does the MTRR layout even play a role when PAT is enabled ?

----------

## aCOSwt

Are you running a 3.10-rc?

If yes then... all I can tell right now is that... you are not alone ! :D

BTW... lose cover RAM: -0G... that looks quite odd.

----------

## kernelOfTruth

 *aCOSwt wrote:*   

> Are you running a 3.10-rc?
> 
> If yes then... all I can tell right now is that... you are not alone !  
> 
> BTW... lose cover RAM: -0G... that looks quite odd.

 

haha - yeah

yes, that looks weird :?

some miscalculation, kernel bug, issues on my box - no idea :(

just looked through the kernel log and I'm not 100% sure, but the message doesn't seem to have shown up with rc3

I'm running the latest upstream sources - so it must be a regression

----------

## wcg

2011: http://lkml.indiana.edu/hypermail/linux/kernel/1106.0/01263.html

Some discussion: http://my-fuzzy-logic.de/blog/index.php?/archives/41-Solving-linux-MTRR-problems.html

MTRRs and PATs (page attribute tables), a little clarity:

https://patchwork.kernel.org/patch/1424661/

----------

## kernelOfTruth

thanks for those links !

the (supposedly) optimal settings for my system are:

```
cat /usr/src/mtrr_proc-mtrr_3.8.13.txt 
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
reg02: base=0x100000000 ( 4096MB), size= 4096MB, count=1: write-back
reg03: base=0x200000000 ( 8192MB), size= 1024MB, count=1: write-back
reg04: base=0x0d0000000 ( 3328MB), size=  256MB, count=1: write-combining
```
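The listing shows sizes in MB, while writes to /proc/mtrr take hex byte values. A minimal sketch of that conversion, assuming the saved file contains lines in the /proc/mtrr format shown above; `mtrr_lines_to_cmds` is a made-up helper name and it only prints the strings, it does not write anything:

```sh
#!/bin/sh
# Turn /proc/mtrr style lines (stdin) into the "base=... size=... type=..."
# strings that /proc/mtrr accepts on write. Prints only, writes nothing.
# mtrr_lines_to_cmds is a hypothetical helper name.
mtrr_lines_to_cmds() {
    # Example input line:
    # reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
    sed -n 's/^reg[0-9]*: base=\(0x[0-9a-fA-F]*\) (.*), size= *\([0-9]*\)\([KM]\)B, count=[0-9]*: \(.*\)$/\1 \2 \3 \4/p' |
    while read -r base size unit type; do
        # Sizes are listed in KB or MB; convert to bytes.
        case $unit in
            K) mult=1024 ;;
            *) mult=1048576 ;;
        esac
        # 64-bit shell arithmetic is assumed (bash or dash on x86-64).
        printf 'base=0x%x size=0x%x type=%s\n' "$base" "$((size * mult))" "$type"
    done
}
```

Feeding it the reg00 line above should print `base=0x0 size=0x80000000 type=write-back`, which makes it easy to compare against hand-written values.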

I wrote a simple bash-script which I'll launch via an init script on boot

but beforehand I'd like to know your opinion whether it's correct:

```
#!/bin/sh
# clear the variable-range registers first
echo "disable=5" > /proc/mtrr
echo "disable=4" > /proc/mtrr
echo "disable=3" > /proc/mtrr
echo "disable=2" > /proc/mtrr
echo "disable=1" > /proc/mtrr
echo "disable=0" > /proc/mtrr
# then re-create the layout from the 3.8.13 listing above
echo "base=0x0 size=0x80000000 type=write-back" > /proc/mtrr
echo "base=0x80000000 size=0x40000000 type=write-back" > /proc/mtrr
echo "base=0x100000000 size=0x100000000 type=write-back" > /proc/mtrr
echo "base=0x200000000 size=0x40000000 type=write-back" > /proc/mtrr
echo "base=0x0d0000000 size=0x010000000 type=write-combining" > /proc/mtrr
```

is 

```
echo "disable=5" > /proc/mtrr
echo "disable=4" > /proc/mtrr
echo "disable=3" > /proc/mtrr
echo "disable=2" > /proc/mtrr
echo "disable=1" > /proc/mtrr
echo "disable=0" > /proc/mtrr
```

the correct order ?

or would be

```
echo "disable=0" > /proc/mtrr
echo "disable=1" > /proc/mtrr
echo "disable=2" > /proc/mtrr
echo "disable=3" > /proc/mtrr
echo "disable=4" > /proc/mtrr
echo "disable=5" > /proc/mtrr
```

better ? - if yes - what is "better" in this context ?

edit:

I've read on the net that the order "matters" - hm, ok

but in what connection ?

from higher ranges to lower ones ?

edit:

I'm getting several of the following messages when running the script:

 *Quote:*   

> bash: echo: write error: Invalid argument 

 

the result however "seems" OK:

 *Quote:*   

> cat /proc/mtrr 
> 
> reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
> 
> reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
> ...

 

but output of dmesg e.g. says:

 *Quote:*   

> [  109.603518] mtrr: MTRR 5 not used
> 
> [  109.604112] mtrr: MTRR 4 not used
> 
> [  109.604403] mtrr: MTRR 3 not used
> ...

 

when running the (modified) script several times:

 *Quote:*   

> [  287.193885] mtrr: MTRR 5 not used
> 
> [  287.620787] mtrr: MTRR 0 not used
> 
> [  290.169564] mtrr: MTRR 1 not used
> ...

 

what does that mean ?
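As far as I can tell, the kernel logs "MTRR N not used" and rejects the write with "Invalid argument" when a `disable=N` targets a register that currently holds no range, which would match the errors above. A rough sketch, assuming unused registers simply do not appear in /proc/mtrr: it prints a `disable=N` command only for registers that are listed. `mtrr_disable_cmds` is a made-up helper name, and the highest-first order just mirrors the first variant of the script; it is not a statement about which order is right.

```sh
#!/bin/sh
# Print a "disable=N" command for every register listed in a /proc/mtrr
# style file (default: /proc/mtrr), highest index first.
# mtrr_disable_cmds is a hypothetical helper; it only prints commands.
mtrr_disable_cmds() {
    # "reg04: base=..." -> "4"
    sed -n 's/^reg0*\([0-9][0-9]*\):.*/\1/p' "${1:-/proc/mtrr}" |
    sort -rn |
    while read -r n; do
        printf 'disable=%s\n' "$n"
    done
}
```

Each printed line could then be echoed into /proc/mtrr as root, the same way the script above does it.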

----------

## kernelOfTruth

continuing my previous post:

changing the order does indeed make a difference:

the changes are now applied without error:

```
cat /proc/mtrr 
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
reg02: base=0x100000000 ( 4096MB), size= 4096MB, count=1: write-back
reg03: base=0x200000000 ( 8192MB), size= 1024MB, count=1: write-back
reg04: base=0x0d0000000 ( 3328MB), size=  256MB, count=1: write-combining
```

MTRRs don't play a large role anymore (probably very little on desktops)

but the next time I run into this on my laptop, this might be helpful

current lkml activity:

[PATCH buf-fix] kernel, range: fix broken mtrr_cleanup: http://marc.info/?l=linux-kernel&m=137077095817521&w=2

nice :)

----------

## aCOSwt

 *kernelOfTruth wrote:*   

> current lkml activity:
> 
> [PATCH buf-fix] kernel, range: fix broken mtrr_cleanup: http://marc.info/?l=linux-kernel&m=137077095817521&w=2
> 
> nice  

 

Cool! Thanks for the link kOT!

----------

## wcg

I looked in /proc/cpuinfo on a couple of amd systems. They both have "pat" in flags (athlon X2 and phenom II). I take it that means "page attribute table" support, which will override mtrr settings in operation on a page-by-page basis in the page tables (setting accesses to the page uncached, writeback, writethrough, write-protected, or write-combining). So I can disable the mtrr_cleanup and forget it on those systems?
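The /proc/cpuinfo check can be scripted. A small sketch, where `has_cpu_flag` is a made-up helper name; it only tells whether the CPU advertises the flag, not whether the running kernel actually uses PAT:

```sh
#!/bin/sh
# Succeed if flag $1 appears on a "flags" line of a /proc/cpuinfo style
# file ($2, default: /proc/cpuinfo).
# has_cpu_flag is a hypothetical helper name.
has_cpu_flag() {
    # -w matches whole words only, so "pat" does not match e.g. "pse36"
    grep -E '^flags[[:space:]]*:' "${2:-/proc/cpuinfo}" | grep -qw -- "$1"
}
```

Usage would be something like `has_cpu_flag pat && echo "CPU advertises PAT"`.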

Apparently PAT support was officially added to the kernel in 2.6.26: http://kernelnewbies.org/Linux_2_6_26#head-619b02eadc63322536a5ac956d72ca32035216c3

(I do not really get how PAT is "complementary" to mtrrs, though. One of the kernel messages says the PAT setting on the page takes precedence over the mtrr setting for the range that page is in, so the kernel/cpu must check each page anyway for kernels that have PAT enabled on cpus that support it.)

edit:

I guess in general it is reasonable to leave the mtrr sanitizer enabled in case your cpu turns up on a list of "PAT support broken on these cpu models" when the kernel is setting up page attributes in the page tables. The worst that can happen is log noise.

----------

## kernelOfTruth

 *wcg wrote:*   

> I looked in /proc/cpuinfo on a couple of amd systems. They both
> 
> have "pat" in flags (athlon X2 and phenom II). I take it that means
> 
> "page attribute table" support, which will override mtrr settings in
> ...

 

thanks !

ok, so it's not obsolete at all - hm :?

https://bugzilla.kernel.org/show_bug.cgi?id=59491

"proper" fix for this issue:

[PATCH 1/2] x86,mtrr: Fix original mtrr range get for mtrr_cleanup: http://marc.info/?l=linux-kernel&m=137080807327118&w=2

[PATCH 2/2] x86, range: make add_range use blank slot: http://marc.info/?l=linux-kernel&m=137080805927115&w=2

----------

## wcg

We wish the mtrr sanitizer was obsolete, and for a lot of cpus with working PAT support it is, but it is not obsolete on all of the systems that linux might run on, and that is the kicker.

You get somebody with Shorewall running on a pII or a K6 or something, they do not need the performance of a faster cpu, they probably do not need the performance of write-combining support for their gpu, but they do not have working PAT support, so they need usable mtrr ranges, and the kernel may be better at setting those up than their BIOS.

This closed thread from stackoverflow has a detailed explanation of what happens for loads and stores in the different caching modes set up by these "memory type" ranges and page attributes:

http://stackoverflow.com/questions/13297178/how-mtrr-registers-implemented

----------

## kernelOfTruth

thanks for that link, wcg !

ran into that problem again but with my new computer:

 *Quote:*   

> [    0.000000] MTRR variable ranges enabled:
> 
> [    0.000000]   0 base 0000000000 mask 7800000000 write-back
> 
> [    0.000000]   1 base 0800000000 mask 7FE0000000 write-back
> ...

 

stupid :(
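For reading base/mask pairs like the ones in that log: the bits that are clear in the mask span the size of the range. A small sketch of the arithmetic, assuming a contiguous mask and a 39-bit physical address width (a guess based on the masks starting with 7; the "address sizes" line in /proc/cpuinfo should give the real value). `mtrr_mask_to_mb` is a made-up helper name:

```sh
#!/bin/sh
# Size in MB covered by a variable-range MTRR, computed from its mask.
# $1: mask as printed in the boot log (hex, no 0x prefix)
# $2: physical address width in bits (default 39, an assumption)
# mtrr_mask_to_mb is a hypothetical helper; needs 64-bit shell arithmetic.
mtrr_mask_to_mb() {
    mask=$((0x$1))
    width=${2:-39}
    full=$(( (1 << width) - 1 ))
    # Bits cleared in the mask are "don't care" bits, so they span the range.
    size=$(( (~mask & full) + 1 ))
    echo $(( size / 1024 / 1024 ))
}
```

With these assumptions `mtrr_mask_to_mb 7800000000` gives 32768 (a 32 GB range starting at 0) and `mtrr_mask_to_mb 7FE0000000` gives 512 (a 512 MB range starting at the 32 GB mark).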

edit:

hm, it seems to have been fixed when I switched from the iGPU to the nvidia card (I also had graphical lockups during use)

optimal settings now are (at least for this boot):

 *Quote:*   

> [    0.000000] MTRR default type: uncachable
> 
> [    0.000000] MTRR fixed ranges enabled:
> 
> [    0.000000]   00000-9FFFF write-back
> ...

 

----------

## Yamakuzure

Just fix it by looking at which combination gives the best result. Ideally "lose cover RAM" is zero. It must not be negative. If no combination gives zero, take the one with the smallest positive loss.

On my system I have in /etc/default/grub:

```
enable_mtrr_cleanup mtrr_spare_reg_nr=1 mtrr_gran_size=32M mtrr_chunk_size=1G
```

Resulting in a lose cover RAM of 8MB or so. (Must check, after some days uptime dmesg doesn't reach back that far any more...)
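Picking the candidates with zero loss out of a long boot log can be scripted. A rough sketch, assuming the mtrr_cleanup search lines look like `gran_size: 32M  chunk_size: 1G  num_reg: 6  lose cover RAM: 0G` (that format is an assumption, it is not copied from a log in this thread); `mtrr_zero_loss` is a made-up helper name:

```sh
#!/bin/sh
# Read mtrr_cleanup search output on stdin and print the candidates whose
# "lose cover RAM" is exactly zero, written as kernel parameters.
# Assumed line format (not taken from this thread's log):
#   gran_size: 32M  chunk_size: 1G  num_reg: 6  lose cover RAM: 0G
# Negative results such as "-0G" do not match and are skipped.
# mtrr_zero_loss is a hypothetical helper name.
mtrr_zero_loss() {
    sed -n 's/.*gran_size: *\([0-9]*[KMG]\)[[:space:]]*chunk_size: *\([0-9]*[KMG]\)[[:space:]]*num_reg: *\([0-9]*\)[[:space:]]*lose cover RAM: *0[KMG][[:space:]]*$/mtrr_gran_size=\1 mtrr_chunk_size=\2 (num_reg \3)/p'
}
```

Something like `dmesg | mtrr_zero_loss` right after boot would then list the combinations worth trying on the kernel command line.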

----------

