# [UNSOLVED] segfault in ld-2.11.3.so during revdep-rebuild

## JohnBlbec

i see the following in the kernel log on 2.6.38.4-zen:

kernel: ld-linux-x86-64[14251]: segfault at 200040 ip 00007f985f26b77c sp 00007fffa0e430a0 error 4 in ld-2.11.3.so[7f985f269000+1e000]

kernel: ld-linux-x86-64[15942]: segfault at 200040 ip 00007f476e38c77c sp 00007fffeeac4020 error 4 in ld-2.11.3.so[7f476e38a000+1e000]

kernel: ld-linux-x86-64[18139]: segfault at 200040 ip 00007f68e841e77c sp 00007fffce021dc0 error 4 in ld-2.11.3.so[7f68e841c000+1e000]

kernel: ld-linux-x86-64[18146]: segfault at 200040 ip 00007fce12de277c sp 00007fff31ccef50 error 4 in ld-2.11.3.so[7fce12de0000+1e000]

kernel: ld-linux-x86-64[18153]: segfault at 200040 ip 00007fa0918b077c sp 00007fff8a176140 error 4 in ld-2.11.3.so[7fa0918ae000+1e000]

kernel: ld-linux-x86-64[18199]: segfault at 200040 ip 00007f6184f4377c sp 00007fff56d6ac90 error 4 in ld-2.11.3.so[7f6184f41000+1e000]

kernel: ld-linux-x86-64[18894]: segfault at 200040 ip 00007fd95500577c sp 00007fffd8c37440 error 4 in ld-2.11.3.so[7fd955003000+1e000]

```
$ emerge --info
Portage 2.1.9.42 (default/linux/amd64/10.0, gcc-4.4.5, libc-0-r0, 2.6.38.4-zen x86_64)
=================================================================
System uname: Linux-2.6.38.4-zen-x86_64-Intel-R-_Core-TM-2_Quad_CPU_Q9450_@_2.66GHz-with-gentoo-2.0.2
Timestamp of tree: Sun, 08 May 2011 16:30:01 +0000
ccache version 2.4 [enabled]
app-shells/bash:     4.1_p9
dev-java/java-config: 2.1.11-r3
dev-lang/python:     2.7.1-r1, 3.1.3-r1
dev-util/ccache:     2.4-r9
dev-util/cmake:      2.8.4-r1
sys-apps/baselayout: 2.0.2
sys-apps/openrc:     0.8.2-r1
sys-apps/sandbox:    2.4
sys-devel/autoconf:  2.13, 2.65-r1
sys-devel/automake:  1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.5
sys-devel/gcc-config: 1.4.1-r1
sys-devel/libtool:   2.2.10
sys-devel/make:      3.81-r2
sys-kernel/linux-headers: 2.6.36.1
sys-libs/glibc:      2.11.3
virtual/os-headers:  0
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/openvpn/easy-rsa"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="assume-digests binpkg-logs ccache distlocks fixlafiles fixpackages news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="http://gentoo.mirror.dkm.cz/pub/gentoo/ http://ftp.fi.muni.cz/pub/linux/gentoo/ "
LANG="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en cs"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_EXTRA_OPTS="--exclude-from=/etc/portage/rsync-excludes"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"

USE="X acl alsa amd64 avahi bash-completion berkdb bzip2 cli cracklib crypt cups cxx dbus dri gdbm gif gpm iconv jpeg mmx modules mp3 mudflap multilib ncurses nls nptl nptlonly nvidia opengl openmp pam pcre perl png pppd pulseaudio python qt3support readline session slang sse sse2 ssl symlink sysfs tcpd threads tiff truetype unicode xorg zlib" ALSA_CARDS="hda-intel" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en cs" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 

Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS
```

Last edited by JohnBlbec on Tue May 17, 2011 7:55 pm; edited 4 times in total

----------

## krinn

a broken toolchain builds broken programs, hence virtual/os-headers outputting a 0 version?

rebuild your toolchain with valid headers.

----------

## JohnBlbec

 *krinn wrote:*   

> a broken toolchain builds broken programs, hence virtual/os-headers outputting a 0 version?
> 
> rebuild your toolchain with valid headers.

 

thanks for your answer. i will try to rebuild the toolchain and the system...

----------

## krinn

temporarily disable ccache while doing it. it might take longer, but it can save you time by not reusing bad cached objects during the rebuild
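
A minimal sketch of what such a rebuild could look like, assuming the usual Gentoo toolchain package set (the thread does not list the packages) and relying on Portage accepting a `FEATURES` override from the environment. The function name `rebuild_toolchain` and the `DRY_RUN` switch are made up for illustration; by default the command is only printed.

```shell
#!/bin/sh
# Sketch: rebuild the toolchain once with ccache switched off for this run only.
# The package list is an assumption (kernel headers, glibc, binutils, gcc).
TOOLCHAIN_PKGS="sys-kernel/linux-headers sys-libs/glibc sys-devel/binutils sys-devel/gcc"

rebuild_toolchain() {
    # FEATURES="-ccache" removes ccache from FEATURES for this invocation;
    # make.conf is left untouched.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (the default): only show what would be executed.
        echo "FEATURES=\"-ccache\" emerge --oneshot $TOOLCHAIN_PKGS"
    else
        # Word splitting of the package list is intended here.
        FEATURES="-ccache" emerge --oneshot $TOOLCHAIN_PKGS
    fi
}
```

Calling `rebuild_toolchain` prints the command; `DRY_RUN=0 rebuild_toolchain` (as root) would actually run it. Overriding `FEATURES` on the command line keeps the change temporary, which matches the "temp disable" advice.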

----------

## aCOSwt

 *krinn wrote:*   

> a broken toolchain builds broken programs, hence virtual/os-headers outputting a 0 version?

 

I get the same emerge --info report.

This does not seem wrong to me, though it is surprising: os-headers actually is at version 0.   :Confused: 

http://znurt.org/search.php?search=&q=os-headers&x=0&y=0

----------

## krinn

after checking, i have it too  :Smile: 

another portage output change? a bit weird. anyway, my suggestion is still valid: segfault = hardware, toolchain, or a buggy program. i just prefer to bet on the toolchain first, as many users just let portage update the toolchain whenever it wishes, leaving it in an unknown state. also, ld-2.11.3.so is part of glibc.

This won't kill anyone, and with a (now surely) clean toolchain, the user can move on to tracking hardware issues (the classic mem -> heat -> overclock...)

----------

## JohnBlbec

 *krinn wrote:*   

> temporarily disable ccache while doing it. it might take longer, but it can save you time by not reusing bad cached objects during the rebuild

 

good point, thanks

----------

## JohnBlbec

 *JohnBlbec wrote:*   

>  *krinn wrote:*   a broken toolchain builds broken programs, hence virtual/os-headers outputting a 0 version?
> 
> rebuild your toolchain with valid headers. 
> 
> thanks for your answer. i will try to rebuild the toolchain and the system...

 

it seems it helped :-)

----------

## JohnBlbec

unfortunately, it appeared again today, even though i have replaced the ram (to be sure) :-(

----------

## krinn

sad to read that...

option1: toolchain, option2: hardware, option3: software

option3: do you get the message randomly? is anything tied to it? maybe a buggy program is causing the segfault... any clues besides those?

also, have you tried another zen version, or a non-zen kernel? (not all patches are good to have)

----------

## JohnBlbec

 *krinn wrote:*   

> sad to read that...
> 
> option1: toolchain, option2: hardware, option3: software
> 
> option3: do you get the message randomly? is anything tied to it? maybe a buggy program is causing the segfault... any clues besides those?
> ...

 

yes, i am getting the message randomly. the toolchain has been recompiled, which is why i do not think it is the problem, but you are right, i will try another kernel. thanks for the advice. i hope it will not be a hw problem :-(

----------

## aCOSwt

My 2 cents, just for fun, as it takes less time than testing another kernel:

```
# setarch x86_64 --addr-compat-layout ldd
```

----------

## Hu

Random failures usually implicate hardware.  What is your typical internal temperature when you get a failure?

----------

## JohnBlbec

 *Hu wrote:*   

> Random failures usually implicate hardware.  What is your typical internal temperature when you get a failure?

 

cpu ... 28C

gpu ... 36C

in case ... 32C

----------

## krinn

like Hu said, random failures are generally hardware. it could be anything: bad ram modules, ram modules incompatible with each other or with the motherboard, random spikes in the power supply, overclocking, temperature...

you should try a livedvd/cd and let the machine work under it for a while to see if another environment gets the same results. if yes, you know some hardware part is failing (not dead, but not working as it should)

----------

## JohnBlbec

 *krinn wrote:*   

> like Hu said, random failures are generally hardware. it could be anything: bad ram modules, ram modules incompatible with each other or with the motherboard, random spikes in the power supply, overclocking, temperature...
> 
> you should try a livedvd/cd and let the machine work under it for a while to see if another environment gets the same results. if yes, you know some hardware part is failing (not dead, but not working as it should)

 

i agree and i am all for it - i will try a livecd...

just for info:

- my pc is not overclocked

- ram modules are ok, i am sure

- power supply is ok too (btw. a ups is used)

- temperature is ok

my suspicion is the motherboard :-(

----------

## krinn

by power supply i mean the psu, so even using a ups won't help. if the psu is dying, it might give unstable power on one or more lines, and whatever device is attached to them will suffer from that erratic supply. also, temperatures are climbing: this means the psu needs to deliver more to all components, because fan load has to be higher to keep temps low. add to that that a psu's own efficiency suffers from heat and humidity, so its average output is already lowered. a poor psu with already poor efficiency will get worse, and a high-end psu with high efficiency but little power headroom will suffer from too many devices drawing power.

Just hinting at that, because it's always easy to find hardware that is dead, but a real pain when you're looking for a part that isn't dead yet but has one foot in the grave.

----------

## JohnBlbec

 *krinn wrote:*   

> by power supply i mean the psu, ...

 

i understood, it was just additional info

 *krinn wrote:*   

> so even using a ups won't help. if the psu is dying, it might give unstable power on one or more lines, and whatever device is attached to them will suffer from that erratic supply. also, temperatures are climbing: this means the psu needs to deliver more to all components, because fan load has to be higher to keep temps low. add to that that a psu's own efficiency suffers from heat and humidity, so its average output is already lowered. a poor psu with already poor efficiency will get worse, and a high-end psu with high efficiency but little power headroom will suffer from too many devices drawing power.
> 
> Just hinting at that, because it's always easy to find hardware that is dead, but a real pain when you're looking for a part that isn't dead yet but has one foot in the grave.

 

exactly, you are right again. it is really quite difficult to find which part of the hw is the problem :-(

----------

## JohnBlbec

i have localized the problem and now i am able to reproduce the bug. i see the following segfaults every time i run revdep-rebuild :-(

```
May 15 13:35:52 rpc-linux kernel: ld-linux-x86-64[27821]: segfault at 200040 ip 00007fe9ef89877c sp 00007fffd3066bb0 error 4 in ld-2.11.3.so[7fe9ef896000+1e000]
May 15 13:35:53 rpc-linux kernel: ld-linux-x86-64[29577]: segfault at 200040 ip 00007f356308577c sp 00007fffc8a4ef10 error 4 in ld-2.11.3.so[7f3563083000+1e000]
May 15 13:35:54 rpc-linux kernel: ld-linux-x86-64[31774]: segfault at 200040 ip 00007f84a09e277c sp 00007fff2e925a20 error 4 in ld-2.11.3.so[7f84a09e0000+1e000]
May 15 13:35:54 rpc-linux kernel: ld-linux-x86-64[31781]: segfault at 200040 ip 00007fbbd021877c sp 00007fff65e260e0 error 4 in ld-2.11.3.so[7fbbd0216000+1e000]
May 15 13:35:54 rpc-linux kernel: ld-linux-x86-64[31788]: segfault at 200040 ip 00007fd2bea2477c sp 00007fff8fcc0d60 error 4 in ld-2.11.3.so[7fd2bea22000+1e000]
May 15 13:35:54 rpc-linux kernel: ld-linux-x86-64[31834]: segfault at 200040 ip 00007f4f3537877c sp 00007fffd92dec90 error 4 in ld-2.11.3.so[7f4f35376000+1e000]
May 15 13:35:55 rpc-linux kernel: ld-linux-x86-64[32529]: segfault at 200040 ip 00007f1bc2f3577c sp 00007fffc062a300 error 4 in ld-2.11.3.so[7f1bc2f33000+1e000]
```
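
The log lines carry a bit more information than the process name. On x86, page-fault error code 4 means a user-mode read of an address that is not mapped (here always 0x200040), and subtracting the mapping base in the square brackets from `ip` gives the offset of the faulting instruction inside `ld-2.11.3.so`. A small sketch of that arithmetic, assuming the kernel line format shown above; `segv_offset` is an illustrative name, not an existing tool.

```shell
#!/bin/sh
# Sketch: compute the offset of the faulting instruction inside the mapped
# object from a kernel "segfault at ..." line. segv_offset is a made-up name.
segv_offset() {
    line=$1
    # "ip 00007fe9ef89877c" -> instruction pointer at the time of the fault
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    # "ld-2.11.3.so[7fe9ef896000+1e000]" -> base address of the mapping
    base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\].*/\1/p')
    # Give up if the line does not look like a segfault report.
    [ -n "$ip" ] && [ -n "$base" ] || return 1
    # Hex subtraction; the result is printed as a hex offset.
    printf '0x%x\n' $(( 0x$ip - 0x$base ))
}
```

Fed the first line of the log above, this prints `0x277c`, and the other lines give the same offset. A fault that hits the same instruction with the same address every time looks more like a reproducible software condition than random memory corruption, though that is an inference and not something the log proves.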

----------

## krinn

great, no more random, you're halfway to solving it. just check at which point revdep-rebuild fails/triggers the error (i mean which phase: collecting data, finding programs, or asking for the rebuild)

----------

## JohnBlbec

 *krinn wrote:*   

> great, no more random, you're halfway to solving it. just check at which point revdep-rebuild fails/triggers the error (i mean which phase: collecting data, finding programs, or asking for the rebuild)

 

```
# revdep-rebuild 
 * Configuring search environment for revdep-rebuild
 * Environment mismatch from previous run, deleting temporary files...
 * Checking reverse dependencies
 * Packages containing binaries and libraries broken by a package update
 * will be emerged.
 * Collecting system binaries and libraries
 * Generated new 1_files.rr
 * Collecting complete LD_LIBRARY_PATH
 * Generated new 2_ldpath.rr
 * Checking dynamic linking consistency
[ 9% ]
```

and right at this point the first segfault appears

----------

## JohnBlbec

well, i have contacted Paul Varner (he is a maintainer of the gentoolkit package) and this is his answer:

```
You are probably experiencing Bug 187644. Try running:

revdep-rebuild --no-ld-path

and see if that helps. Unfortunately, we don't have a real solution to
this problem yet.

Regards,
Paul
```

unfortunately, it did not help :-(

----------

## krinn

did you check the revdep-rebuild.3_ldd_errors file?

maybe it cannot be fixed, but maybe it can be avoided (e.g. if your ldd segfaults on a forgotten dead lib lying around in your system that could be removed)
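
One way to hunt for such a file by hand is sketched below. It assumes the file list generated by revdep-rebuild (`1_files.rr` in the output above) holds one path per line, and that glibc's `ldd` wrapper reports a crashed loader with an "exited with unknown exit code" message. The function name and the `LDD` override are illustrative only.

```shell
#!/bin/sh
# Sketch: run ldd over every path listed in a file and print the paths for
# which the loader died abnormally. Names here are illustrative.
# LDD can be overridden (e.g. for testing); it defaults to the system ldd.
LDD=${LDD:-ldd}

find_ldd_crashers() {
    list=$1
    while IFS= read -r path; do
        # Skip empty lines in the list.
        [ -n "$path" ] || continue
        # Assumption: the ldd wrapper reports a crashed loader as
        # "exited with unknown exit code (N)"; 139 would be 128 + SIGSEGV.
        out=$("$LDD" "$path" 2>&1)
        case $out in
            *"unknown exit code"*) printf '%s\n' "$path" ;;
        esac
    done < "$list"
}
```

Something like `find_ldd_crashers path/to/1_files.rr` would then list the candidates, which can be checked one by one while watching the kernel log. The location of the `.rr` files depends on the gentoolkit version, so the path is left as a placeholder.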

----------

## JohnBlbec

 *krinn wrote:*   

> did you check the revdep-rebuild.3_ldd_errors file?
> 
> maybe it cannot be fixed, but maybe it can be avoided (e.g. if your ldd segfaults on a forgotten dead lib lying around in your system that could be removed)

 

i will check it...

well, here is the content of the file 3_errors.rr

----------

## krinn

just saw your answer...

well, i don't see any segfault, some unknown exit codes, but no segfault.

why do you have a /usr/lib64/debug directory at all? are you using the debug use flag? i mean intentionally?

----------

## JohnBlbec

 *krinn wrote:*   

> just saw your answer...
> 
> well, i don't see any segfault, some unknown exit codes, but no segfault.
> 
> why do you have a /usr/lib64/debug directory at all? are you using the debug use flag? i mean intentionally?

 

hi krinn. i am using the debug use flag for glibc and FEATURES splitdebug for the same package because i need to use valgrind, and i do not know of another solution :-(

----------

