# Kernel settings with Dual Xeon board?

## nellson

I am hoping to learn a bit from someone that knows how to optimize Gentoo for specific MB/CPU combos.

I have an Intel Server Board SE7525GP2 with two 3.0 GHz Xeon CPUs and 2 GB of memory.

Gentoo does see them as 4 CPUs. I got curious when a buddy of mine ran a compile race on several packages on his single-processor AMD box, which has less RAM and a lower clock speed, and came in a very close second every time. That just has me asking whether maybe I am not tuned right.

Here was his little test, and he noticed a few errors in my logs I hadn't.

Any advice on my system?

gubbie ~ # cat /proc/cpuinfo

processor       : 0

vendor_id       : GenuineIntel

cpu family      : 15

model           : 4

model name      : Intel(R) Xeon(TM) CPU 3.00GHz

stepping        : 3

cpu MHz         : 3000.000

cache size      : 2048 KB

(times two or four, depending on how you want to count em)

Blue ~ # cat /proc/cpuinfo

processor       : 0

vendor_id       : AuthenticAMD

cpu family      : 15

model           : 44

model name      : AMD Sempron(tm) Processor 3000+

stepping        : 0

cpu MHz         : 1800.000

cache size      : 128 KB

(only one here, no hyperthread or anything, and note the huge difference in MHz and cache)

gubbie ~ # cat /proc/meminfo

MemTotal:      2074640 kB

Blue ~ # cat /proc/meminfo

MemTotal:       970444 kB

(not sure what speed your memory is, but mine is a gig of mere DDR266)

gubbie ~ # splat gcc

 * sys-devel/gcc-3.4.6

        Emerged at: Tue Apr 11 20:13:10 2006

        Build time: 21 minutes, and 54 seconds

 * sys-devel/gcc-3.4.6-r1

        Emerged at: Sun Apr 30 14:06:26 2006

        Build time: 23 minutes

Blue ~ # splat gcc

 * sys-devel/gcc-3.4.6-r2

        Emerged at: Thu Sep  7 21:37:30 2006

        Build time: 28 minutes, and 26 seconds

        Emerged at: Fri Sep  8 03:13:02 2006

        Build time: 23 minutes, and 43 seconds

gubbie ~ # splat glibc

 * sys-libs/glibc-2.4-r1

        Emerged at: Fri Mar 24 03:53:33 2006

        Build time: 30 minutes, and 50 seconds

 * sys-libs/glibc-2.4-r3

        Emerged at: Sat May 13 16:04:21 2006

        Build time: 28 minutes, and 4 seconds

Blue ~ # splat glibc

 * sys-libs/glibc-2.4-r3

        Emerged at: Thu Sep  7 22:47:36 2006

        Build time: 41 minutes, and 9 seconds

        Emerged at: Fri Sep  8 03:36:45 2006

        Build time: 38 minutes, and 12 seconds

gubbie ~ # splat bind

 * net-dns/bind-9.3.2-r2

        Emerged at: Mon Jul 24 19:44:41 2006

        Build time: 5 minutes, and 52 seconds

 * net-dns/bind-9.3.2-r3

        Emerged at: Wed Aug  2 10:00:04 2006

        Build time: 6 minutes, and 12 seconds

Blue ~ # splat bind

 * net-dns/bind-9.3.2-r3

        Emerged at: Fri Sep  8 01:43:05 2006

        Build time: 8 minutes, and 5 seconds

----

So, although gubbie did win across the board, it was nowhere near the complete ass-whooping I expected. When compiling gcc they probably strip the make flags, but even so it should've been a race between a single 1800 MHz processor and a single 3000 MHz processor, and therefore should have shown more than a 43-second difference.
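For what it's worth, the gap can be put into numbers. The sketch below is not part of the original email; it only converts the build times quoted above to seconds and prints how many times longer Blue took. The helper names `to_seconds` and `speedup` are made up for illustration. For comparison, the raw clock ratio is 3000/1800, or about 1.67. The gcc revisions and build dates differ between the two boxes, so treat this as a rough comparison only.

```shell
# Convert a build time written as MM:SS into seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# Print slower/faster as a ratio with two decimals.
# $1 = slower time (MM:SS), $2 = faster time (MM:SS)
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

speedup 23:43 23:00   # gcc:   Blue's second run vs gubbie's second run -> 1.03
speedup 38:12 28:04   # glibc: same pairing                             -> 1.36
speedup 8:05 6:12     # bind:  Blue vs gubbie's -r3 build               -> 1.30
```

So gcc, which may be building on one CPU anyway, shows almost no gap, while glibc and bind show roughly a 1.3x gap, still well short of the clock ratio.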

Also, in the input devices section of your kernel config, I would suggest completely removing the "event debugging" one. Even if it's just a module, udev automatically starts it up every time, causing this spew of garbage in dmesg:

[17204488.508000] evbug.c: Event. Dev: isa0060/serio1/input0, Type: 0, Code: 0, Value: 0

[17204488.644000] evbug.c: Event. Dev: isa0060/serio1/input0, Type: 1, Code: 272, Value: 0

[17204488.644000] evbug.c: Event. Dev: isa0060/serio1/input0, Type: 0, Code: 0, Value: 0

And have you deciphered these MCE events yet:

[20714790.020000] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.

[20714790.020000] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 1.

[20714790.020000] Bank 0: cc00000320040189

[20714790.020000] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 1.

[20714790.020000] Bank 0: cc00000320040189

[20714790.020000] MCE: The hardware reports a non fatal, correctable incident occurred on CPU 0.

[20714790.020000] Bank 2: 9100000000000153

[20714790.020000] Bank 2: 9100000000000153

Portage 2.1.1_pre4-r4 (default-linux/x86/2006.0, gcc-3.4.6/vanilla, glibc-2.4-r3, 2.6.17-gentoo i686)

=================================================================

System uname: 2.6.17-gentoo i686 Intel(R) Xeon(TM) CPU 3.00GHz

Gentoo Base System version 1.12.4

Last Sync: Fri, 08 Sep 2006 09:50:01 +0000

distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]

app-admin/eselect-compiler: 2.0.0_rc2-r1

dev-lang/python:     2.4.3-r1

dev-python/pycrypto: 2.0.1-r5

dev-util/ccache:     [Not Present]

dev-util/confcache:  [Not Present]

sys-apps/sandbox:    1.2.18.1

sys-devel/autoconf:  2.13, 2.60

sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2

sys-devel/binutils:  2.17

sys-devel/gcc-config: 2.0.0_rc1

sys-devel/libtool:   1.5.22

virtual/os-headers:  2.6.17

ACCEPT_KEYWORDS="x86 ~x86"

AUTOCLEAN="yes"

CBUILD="i686-pc-linux-gnu"

CFLAGS="-march=pentium4 -O2 -fomit-frame-pointer -pipe"

CHOST="i686-pc-linux-gnu"

CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/lib/fax /usr/share/X11/xkb /usr/share/config /var/bind /var/spool/fax/etc"

CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/eselect/compiler /etc/gconf /etc/java-config/vms/ /etc/revdep-rebuild /etc/splash /etc/terminfo /etc/texmf/web2c"

CXXFLAGS="-march=pentium4 -O2 -fomit-frame-pointer -pipe"

DISTDIR="/usr/portage/distfiles"

FEATURES="autoconfig candy distcc distlocks metadata-transfer sandbox sfperms strict"

GENTOO_MIRRORS="http://gentoo.osuosl.org/ http://ftp.ucsb.edu/pub/mirrors/linux/gentoo/ http://mirror.espri.arizona.edu/gentoo/ "

LINGUAS=""

MAKEOPTS="-j5"

PKGDIR="/usr/portage/packages"

PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'"

PORTAGE_TMPDIR="/var/tmp"

PORTDIR="/usr/portage"

PORTDIR_OVERLAY="/usr/local/portage /usr/portage/local/layman/vmware"

SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"

USE="x86 7zip X a52 aac acpi alsa apache2 ati avi berkdb bitmap-fonts bzip2 cdr cli crypt cups curl dbus dga djvu dlloader dmx dpms dri dts dv dvb dvd dvdr dvdread dvi eds elf elibc_glibc emboss encode esd ethereal exif fame ffmpeg fftw fits flac flash font-server foomaticdb fortran fpx fuse gdbm gecko-sdk geoip gif gimp glitz glx gnome gnutls gpgme gphoto2 gpm gs gstreamer gtk gtk2 gtkhtml gtkspell gzip h323 hal hddtemp id3 idn imagemagick imlib imlib2 input_devices_evdev input_devices_keyboard input_devices_mouse input_devices_synaptics isdnlog ithreads java jbig jfs jp2 jpeg jpeg2k kernel_linux lame lcms libg++ libsamplerate libwww lm_sensors logrotate lzo lzw mad math mbox md5sum mikmod milter ming mjpeg mmap mmx mmxext mng motif mozilla mozsvg mp3 mp4 mp4live mpeg mpeg2 mplayer mysql mysqli mythtv nautilus ncurses nls nptl nptlonly nsplugin ntfs ogg opengl oss pam pcre pdf pdflib perl pic pie player png ppds pppd pwdb python pyzor qt3 qt4 quicktime rar razor readline reflection reiserfs rle rpm samba sdk sdl sensord server session sftp slang smime smp snmp sox spamassassin speex spell spf spl srs sse sse-filters sse2 ssl stream svg svgz sysfs szip t1lib tcpd theora threads tiff tools tos truetype truetype-fonts type1 type1-fonts udev unzip usb userland_GNU v4l v4l2 vdr vhosts video_cards_ati video_cards_v4l video_cards_vesa vlm vorbis win32codecs wma wma123 wmf x264 xfs xinetd xml xml2 xmms xorg xosd xpm xprint xscreensaver xv xvid yv12 zip zlib zvbi"

Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, PORTAGE_RSYNC_EXTRA_OPTS

----------

## phorn

You might want to run "top" while compiling, in a different terminal or screen window.

Then hit the number "1" to see info for all 4 CPUs.

You should see all four at about 95-100% or so while compiling, and you should see about three "cc1" processes at the top, two of which are using about 100% CPU. If you see high values for anything else (interrupts, idling, or waiting) then that may explain the problem. If everything's near 100% CPU, then your system's operating as efficiently as it can, so it would be really weird for it to be so much slower, unless it's a network or hard drive issue.

Maybe your hard drive's the bottleneck... I would recommend mounting /var/tmp as a tmpfs (in your fstab), since you have enough RAM to hold it. (I do it even on servers with 700MB of RAM, and they can take it except when compiling GCC, but 2GB of RAM will be no problem for GCC.)

```
tmpfs           /var/tmp        tmpfs   defaults                0       0
```

```
mount -t tmpfs tmpfs /var/tmp
```
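To confirm the mount actually took effect before kicking off a build, something like the following might help. It is only a sketch: `fstype_of` is a made-up helper name, and it assumes the usual /proc/mounts layout of device, mount point, then filesystem type.

```shell
# Print the filesystem type of a mount point.
# $1 = mount point, $2 = mounts file (defaults to /proc/mounts)
fstype_of() {
    # A later mount on the same path hides an earlier one, so keep the last match.
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }' \
        "${2:-/proc/mounts}"
}

# Typical use on the box itself:
#   fstype_of /var/tmp
# This should print "tmpfs" once the mount above is active, and prints
# nothing if /var/tmp is not a separate mount at all.
```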

----------

## nellson

Ok, I started an "emerge -uDav --newuse world" and it had 88 packages to hit. When it actually hit the compiler, top only showed 1 instance of cc1, and it took the first CPU to 70% so far. Then I remembered that, as I have two identical systems, I had tried setting up DISTCC, and sure enough there were 4 instances of cc1 and distcc running, taking maybe 30-40% each. Kinda odd that the initiating machine has just one. Perhaps I'll shut off the DISTCC stuff and try the RAM-based tmpfs as you mentioned.

I disabled DISTCC and restarted to compile. No instances on the second server, but only 1 cc1 even while compiling gcc itself.

I am using -j5 in my /etc/make.conf.

----------

## drescherjm

A couple of things. A single-CPU Sempron 3000 should be [EDIT]around the same speed[/EDIT] at compiling as a single Xeon at 3.0 GHz. I can go into the reason for this if you want to know. The second is that some ebuilds ignore the MAKEOPTS from /etc/make.conf and use -j1, which forces a single-CPU build.
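A quick way to see whether a particular build is honoring MAKEOPTS is to count the compiler processes while it runs. This is a rough sketch rather than anything from the thread: `count_named` is a made-up helper, and it assumes a `ps` that accepts `-eo comm=`.

```shell
# Count lines on stdin that exactly match a command name.
# $1 = command name, e.g. cc1 (C++ sources show up as cc1plus instead)
count_named() {
    awk -v n="$1" '$0 == n { c++ } END { print c + 0 }'
}

# Typical use in a second terminal while emerge is compiling:
#   ps -eo comm= | count_named cc1
# A count that stays at 1 suggests the build is effectively running with -j1.
```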

----------

## DawgG

just a stupid question: have u enabled and configured all the necessary SMP-opts in the kernel?

----------

## drescherjm

One idea I have thought of is that -j5 is too high. Think of it this way: HT (hyperthreading) will give you somewhere between -5% and +25% performance versus a single processor, but in this situation you are counting each hyperthread as a full 100%. In my opinion the best number for MAKEOPTS is 3 or 4, not 5, and I will tell you that setting this number too high will reduce performance.
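To see the two numbers being argued about here (logical CPUs versus physical packages), /proc/cpuinfo can be summarized as below. This is a sketch under assumptions: `count_cpus` is a made-up helper, and it relies on the kernel exposing a "physical id" line per processor, which not every kernel or architecture does.

```shell
# Summarize logical CPUs and distinct physical packages from a cpuinfo file.
# $1 = cpuinfo file (defaults to /proc/cpuinfo)
count_cpus() {
    awk -F: '
        /^processor[ \t]*:/   { logical++ }
        /^physical id[ \t]*:/ { gsub(/[ \t]/, "", $2); seen[$2] = 1 }
        END {
            packages = 0
            for (id in seen) packages++
            # packages stays 0 when the kernel reports no "physical id" lines
            printf "logical=%d packages=%d\n", logical, packages
        }' "${1:-/proc/cpuinfo}"
}

# Typical use:
#   count_cpus
# On a dual Xeon with hyperthreading this would be expected to report
# logical=4 packages=2, which is where candidate values like -j3
# (packages + 1) or -j4 (one job per logical CPU) come from.
```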

----------

## nellson

Ok, I had heard that using -j5 as the HOWTO suggests may not actually help, and that some compiling actually works better if you set it to 3 (the number of physical CPUs + 1). I will try that on the same compile.

Other than that, if the slowness only shows up during a compile, maybe I ought to check other CPU benchmarks that don't just deal with my compiler?

As for the SMP options in the kernel, no... I am not sure I have them right. Here is the section on CPU prefs in make menuconfig. I bolded one questionable setting, the preemption stuff: should I be running it as a server?

  x x                    [*] Symmetric multi-processing support                                                     x x

  x x                        Subarchitecture Type (PC-compatible)  --->                                             x x

  x x                        Processor family (Pentium-4/Celeron(P4-based)/Pentium-4 M/Xeon)  --->                  x x

  x x                    [ ] Generic x86 support                                                                    x x

  x x                    [*] HPET Timer Support                                                                     x x

  x x                    ( :Cool:  Maximum number of CPUs (2-255)                                                         x x

  x x                    [*] SMT (Hyperthreading) scheduler support                                                 x x

  x x                    [*] Multi-core scheduler support                                                           x x

  x x                        Preemption Model (Preemptible Kernel (Low-Latency Desktop))  --->                      x x

  x x                    [*] Preempt The Big Kernel Lock                                                            x x

  x x                    [*] Machine Check Exception                                                                x x

  x x                    <*>   Check for non-fatal errors on AMD Athlon/Duron / Intel Pentium 4                     x x

  x x                    [*]   check for P4 thermal throttling interrupt.                                           x x

  x x                    < > Toshiba Laptop support                                                                 x x

  x x                    < > Dell laptop support                                                                    x x

  x x                    [ ] Enable X86 board specific fixups for reboot                                            x x

  x x                    <M> /dev/cpu/microcode - Intel IA32 CPU microcode support                                  x x

  x x                    <*> /dev/cpu/*/msr - Model-specific register support                                       x x

  x x                    <*> /dev/cpu/*/cpuid - CPU information support                                             x x

  x x                        Firmware Drivers  --->                                                                 x x

  x x                        High Memory Support (4GB)  --->                                                        x x

  x x                        Memory model (Flat Memory)  --->                                                       x x

  x x                    [*] Allocate 3rd-level pagetables from highmem                                             x x

  x x                    [ ] Math emulation                                                                         x x

  x x                    [*] MTRR (Memory Type Range Register) support                                              x x

  x x                    [*] Boot from EFI support (EXPERIMENTAL)                                                   x x

  x x                    [*] Enable kernel irq balancing                                                            x x

  x x                    [*] Use register arguments                                                                 x x

  x x                    [*] Enable seccomp to safely compute untrusted bytecode                                    x x

  x x                        Timer frequency (250 HZ)  --->

----------

## drescherjm

Your kernel params look good to me assuming the smiley face is at least a 4. Ok I just looked it is the default 8...

[EDIT]I do see one thing. Generic x86 is not set. 

 *Quote:*   

>  x x [ ] Generic x86 support x x

 

I have this set in my x86 configs.[/EDIT]

----------

## drescherjm

 *Quote:*   

>  I bolded one questionable setting, the preemptive stuff, should I be running it as a server? 

 

I do not think so. I do have quite a few SMP boxes, but I have always set it the same as you because otherwise the system will be unresponsive at times under high CPU load.

----------

## drescherjm

Are you using speed step in the CPUFreq processor drivers section of the kernel .config file?

----------

## nellson

I believe I am using CPU frequency scaling. So my system stays cool when it's not busy.

That a wise move for a server?

          [*] CPU Frequency scaling                                                                  x x

  x x                    [ ]   Enable CPUfreq debugging                                                             x x

  x x                    <*>   CPU frequency translation statistics                                                 x x

  x x                    [*]     CPU frequency translation statistics details                                       x x

  x x                          Default CPUFreq governor (performance)  --->                                         x x

  x x                    ---   'performance' governor                                                               x x

  x x                    <M>   'powersave' governor                                                                 x x

  x x                    <M>   'userspace' governor for userspace frequency scaling                                 x x

  x x                    <M>   'ondemand' cpufreq policy governor                                                   x x

  x x                    <M>   'conservative' cpufreq governor                                                      x x

  x x                    ---   CPUFreq processor drivers                                                            x x

  x x                    < >   ACPI Processor P-States driver                                                       x x

  x x                    < >   AMD Mobile K6-2/K6-3 PowerNow!                                                       x x

  x x                    < >   AMD Mobile Athlon/Duron PowerNow!                                                    x x

  x x                    < >   AMD Opteron/Athlon64 PowerNow!                                                       x x

  x x                    < >   Cyrix MediaGX/NatSemi Geode Suspend Modulation                                       x x

  x x                    < >   Intel Enhanced SpeedStep                                                             x x

  x x                    < >   Intel Speedstep on ICH-M chipsets (ioport interface)                                 x x

  x x                    < >   Intel SpeedStep on 440BX/ZX/MX chipsets (SMI interface)                              x x

  x x                    <*>   Intel Pentium 4 clock modulation                                                     x x

  x x                    < >   nVidia nForce2 FSB changing                                                          x x

  x x                    < >   Transmeta LongRun                                                                    x x

  x x                    ---   shared options

----------

## nellson

I am not sure why, but my buddy didn't think that was necessary. Where would I find out more about it? 

 *drescherjm wrote:*   

> Your kernel params look good to me assuming the smiley face is at least a 4. Ok I just looked it is the default 8...
> 
> [EDIT]I do see one thing. Generic x86 is not set. 
> 
>  *Quote:*    x x [ ] Generic x86 support x x 
> ...

 

----------

## drescherjm

 *Quote:*   

> That a wise move for a server?

 

Yes, you should have power saving turned on, but we are trying to track down the cause of the reduced performance, and this is an area that could cause it.
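One way to rule frequency scaling in or out is to look at which governor each CPU is actually using at runtime. The following is a sketch only: `show_governors` is a made-up helper, and it assumes the cpufreq sysfs files under /sys/devices/system/cpu are present, which depends on a cpufreq driver being loaded.

```shell
# Print "cpuN governor" for every CPU that exposes a cpufreq governor.
# $1 = sysfs CPU root (defaults to /sys/devices/system/cpu)
show_governors() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue          # skip when the glob matched nothing
        cpu=${f#"$root"/}                # cpuN/cpufreq/scaling_governor
        printf '%s %s\n' "${cpu%%/*}" "$(cat "$f")"
    done
}

# Typical use:
#   show_governors
# If every line says "performance", scaling is unlikely to be slowing
# compiles down; "powersave" would be a more plausible suspect.
```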

----------

## nellson

Ok, I have -j3 and /var/tmp in RAM set, I disabled DISTCC, and I started "emerge gcc".

I only ever get 1 instance of cc1 in top (opened to show all CPUs), and occasionally 1 of the CPUs will hit 100%, with never more than two at a time over 80% (watching updates every second).

Could the DISTCC I set up be not quite out of the picture? Or is it normal for gcc to ignore the MAKEOPTS and run with just -j1 or something?

Nick

----------

## nellson

So, I am guessing not a huge difference, eh? 

 * sys-devel/gcc-4.1.1-r1

        Emerged at: Sat Sep  9 11:25:32 2006      <- previous setup

        Build time: 49 minutes

        Emerged at: Sun Sep 10 15:11:48 2006     <- -j3, tmpfs for /var/tmp, no DISTCC

        Build time: 48 minutes, and 4 seconds

----------

## DawgG

i think gcc is a very peculiar build. why don't u try sth else that's pretty big, like firefox or kdelibs? (as this box seems to be no "production" server  :wink: )

( @nellson: 

CONFIG_X86_GENERIC:

 Instead of just including optimizations for the selected

 x86 variant (e.g. PII, Crusoe or Athlon), include some more

 generic optimizations as well. This will make the kernel

 perform better on x86 CPUs other than that selected.

 This is really intended for distributors who need more

 generic optimizations. 

i don't think this will do much harm, but it won't do any good either.

)

----------

## drescherjm

I have verified on a couple of SMP systems that gcc-4.1.1-r1 will only use one CPU during the build process, and times are in the 40 to 50 minute range. To check this I was running X with a gDesklets CPU usage gauge, and the CPU usage never went above 55% on a 2-processor system; most of the time it was stuck at 50%.

----------

## nellson

Here is mozilla-firefox:

 * www-client/mozilla-firefox-1.5.0.6

        Emerged at: Mon Aug  7 21:16:48 2006

        Build time: 24 minutes, and 15 seconds

        Emerged at: Mon Sep 11 20:16:27 2006

        Build time: 23 minutes, and 56 seconds

Barely any difference.  So either I am doing pretty good all around, or I am missing something.

What other benchmarks would you folks use to validate system optimization? 

I figure what they are used for can play a factor, so "Gubbie" is an Apache2/Bind/Sendmail system, with Xorg logins for various user tasks. Goonie is an Asterisk PBX + VMware Server + Kismet + Prelude IDS.

I am not sure which system needs more of any specific resource, but there ya have it. 

I'll see if I can find some resource benchmarking tools and compare them with similar systems.

Nick

----------

## drescherjm

Did you compare that number (mozilla-firefox) to your friend's number? During any comparison make sure ccache is off. I accidentally recompiled www-client/mozilla-firefox-1.5.0.6 with ccache on and my time was ~8 minutes.
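Before timing anything, it may be worth checking what Portage actually has enabled. The sketch below reads `emerge --info` style text and tests for one entry in FEATURES; `has_feature` is a made-up helper name, and it assumes FEATURES appears on a single line in the form shown in the first post.

```shell
# Succeed if the named feature appears in a FEATURES="..." line on stdin.
# $1 = feature name, e.g. ccache or distcc
has_feature() {
    grep '^FEATURES=' | tr -d '"' | sed 's/^FEATURES=//' | tr ' ' '\n' \
        | grep -qxF -- "$1"
}

# Typical use:
#   emerge --info | has_feature ccache && echo "ccache is on"
#   emerge --info | has_feature distcc && echo "distcc is on"
```

Run against the `emerge --info` output in the first post, this would report distcc as enabled and ccache as not enabled.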

----------

