# Desktop hijacks the network and refuses to respond

## Hyper_Eye

A couple times a month my network becomes completely unresponsive to any client requests. Machines can't talk to each other, talk to the router, or talk to the internet. When I look at my switches I see all of the LED status lights blinking rapidly. I also find that my main Gentoo desktop is unresponsive. Hitting keys on the keyboard won't wake up the monitors. I can't ssh to it. If I disconnect the ethernet cable from the Gentoo machine or hit the reset button on it the network will immediately start working properly. After the Gentoo machine reboots everything appears normal. There are no entries in /var/log/messages for the period of time that it was unresponsive.

This issue is completely unpredictable. It only happens a couple of times a month, always after a period of inactivity (typically overnight), and I've never had any monitoring running in anticipation because I have no idea how long I would need to keep such tools running.

This morning I found the issue occurring again. I disconnected the Gentoo desktop from the network but did not reset the machine. I installed Wireshark on a laptop running Linux and connected the ethernet cable from the Gentoo machine directly to the laptop. I could see that the link indicators began blinking the same way the switch indicators did, but Wireshark showed no activity.

I then connected the Gentoo desktop back to the switch. The status LED for that specific connection began blinking continuously, but the network did not immediately become inaccessible to other machines. Wireshark showed only expected traffic and nothing coming from the Gentoo desktop's MAC address. After a few minutes the network became unresponsive again and all of the status LEDs started blinking quickly in tandem with the Gentoo desktop's status LED. Wireshark then showed no activity.

The final thing I tried was connecting the Gentoo desktop directly to the router. It exhibited the same behavior. The router's web interface showed a 1000 Mbit ethernet connection on the port but no client there ("none" for the MAC address).
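
Because the hang gives no warning, one option might be a logger cheap enough to leave running permanently. What follows is only a rough sketch, assuming Python is available on the desktop: it snapshots the interface counters from `/proc/net/dev` every few seconds and forces each line to disk, so the last entries written before a freeze should survive the reset. The function names, the log path and the 10-second interval are all made up for illustration, and it cannot record anything once the machine has actually locked up.

```python
import os
import time


def parse_net_dev(text):
    """Return {iface: (rx_bytes, rx_packets, tx_bytes, tx_packets)}."""
    counters = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Fields 0-7 are receive counters, 8-15 are transmit counters.
        counters[name.strip()] = (int(fields[0]), int(fields[1]),
                                  int(fields[8]), int(fields[9]))
    return counters


def log_forever(path="/var/log/netdev-heartbeat.log", interval=10):
    """Append one timestamped counter snapshot every `interval` seconds.

    The path and interval are hypothetical defaults, not recommendations.
    """
    with open(path, "a") as log:
        while True:
            with open("/proc/net/dev") as f:
                counters = parse_net_dev(f.read())
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write("%s %r\n" % (stamp, counters))
            log.flush()
            os.fsync(log.fileno())  # push to disk so a hard hang keeps it
            time.sleep(interval)
```

Calling `log_forever()` as root from a startup script would leave a trail that ends within one interval of the hang. Comparing the last few snapshots might show whether the card was still sending or receiving packets just before the freeze.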

At this point I'm stuck. I don't know how to proceed to diagnose this problem. I originally associated the issue with leaving a Windows XP guest running in VirtualBox and I started a thread here based on that assumption (https://forums.gentoo.org/viewtopic-t-986016.html). I no longer believe that VirtualBox is related: I have not left VirtualBox running, but the issue is still occurring. This is a serious issue and I really need to diagnose it. Any assistance would be greatly appreciated.

My Kernel Config

```
# emerge --info
Portage 2.2.10 (default/linux/amd64/13.0/desktop/kde, gcc-4.8.2, glibc-2.19, 3.13.5-gentoo x86_64)
=================================================================
System uname: Linux-3.13.5-gentoo-x86_64-Intel-R-_Core-TM-_i7-3770_CPU_@_3.40GHz-with-gentoo-2.2
KiB Mem:    16394632 total,  14946760 free
KiB Swap:    4194296 total,   4194296 free
Timestamp of tree: Tue, 08 Apr 2014 23:15:01 +0000
ld GNU ld (GNU Binutils) 2.24
app-shells/bash:          4.2_p46-r1
dev-java/java-config:     2.2.0
dev-lang/python:          2.7.6, 3.2.5-r3, 3.3.5, 3.4.0
dev-util/cmake:           2.8.12.2-r1::kde
dev-util/pkgconfig:       0.28-r1
sys-apps/baselayout:      2.2
sys-apps/openrc:          0.12.4
sys-apps/sandbox:         2.6-r1
sys-devel/autoconf:       2.13, 2.69
sys-devel/automake:       1.4_p6-r1, 1.10.3, 1.11.6, 1.12.6, 1.13.4, 1.14.1
sys-devel/binutils:       2.24-r2
sys-devel/gcc:            4.7.3-r1, 4.8.2
sys-devel/gcc-config:     1.8
sys-devel/libtool:        2.4.2
sys-devel/make:           4.0-r1
sys-kernel/linux-headers: 3.14 (virtual/os-headers)
sys-libs/glibc:           2.19
Repositories: gentoo dmwoodlx2_local kde vincent gamerlay sunrise roslin steam-overlay anders-larsson luman science
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="${CONFIG_PROTECT} /etc /etc/idea/conf /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.1/conf /usr/share/polkit-1/actions /usr/share/themes/oxygen-gtk/gtk-3.0 /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j8"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage /var/lib/layman/kde /var/lib/layman/vincent /var/lib/layman/gamerlay /var/lib/layman/sunrise /var/lib/layman/roslin /var/lib/layman/steam /var/lib/layman/anders-larsson /var/lib/layman/luman /var/lib/layman/science"
SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"
USE="X a52 aac aacs acl acpi alsa alstream amd64 berkdb bindist bluetooth bluray branding bzip2 cairo cdda cddb cdr cli cmake consolekit cracklib crypt cuda cups cxx dbus declarative dri dts dvd dvdr emboss encode exif fakevim fam fbcondecor ffmpeg firefox flac fortran ftp gdbm gif git gpm gtk hddtemp iconv imagemagick ipv6 java javascript joystick jpeg kde kipi lame lastfm lcms ldap libnotify lirc lm_sensors mad md5sum mercurial midi minizip mmx mng modules mp3 mp4 mpeg multilib ncurses network nls nptl nsplugin ogg openal opengl openmp pam pango pcre pdf perl phonon plasma png policykit portmidi ppds python qt3support qt4 readline s3 samba sdl semantic-desktop session spell sqlite sse sse2 sse3 ssl startup-notification subversion svg tcpd threads tiff timidity truetype udev udisks unicode upower usb valgrind vdpau vim-syntax vorbis wxwidgets x264 xcb xcomposite xinerama xml xpm xscreensaver xv xvid zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64" INPUT_DEVICES="keyboard mouse joystick" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en" LIRC_DEVICES="sb0540" NETBEANS_MODULES="*" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_3" RUBY_TARGETS="ruby19 ruby20" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
```

```
# lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4)
00:1c.5 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c4)
00:1c.6 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 7 (rev c4)
00:1c.7 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 8 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Z77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 670] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
03:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
04:00.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 41)
05:00.0 Multimedia audio controller: Creative Labs SB0400 Audigy2 Value
05:01.0 FireWire (IEEE 1394): VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller (rev c0)
06:00.0 Ethernet controller: Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
07:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller (rev 11)
```

----------

## krinn

When the bug occurs, did you check your card's IP address?

If for some reason a tool changes your IP to one that is already in use by another host, you can get that kind of mess.

And if your IP is fine, check whether the network card itself is buggy or has been put into a buggy state. Try unloading and reloading the card's kernel module; that should re-initialize the device and get it working again.

----------

## Hyper_Eye

When the bug occurs I can't interact with the system at all. It is completely unresponsive.

----------

## Hyper_Eye

This happened again this morning. I plugged the laptop in and started Wireshark. I could see that there was no regular activity, just tons of ARP requests as the machines tried to establish who was who. As soon as I rebooted the Gentoo machine the router sent a local master announcement and a domain/workgroup announcement, and the network started working correctly again. There was still no indication of why the Gentoo desktop was frozen or why the network was broken.

----------

## Hyper_Eye

```
Apr 21 06:21:40 dmwoodlx kernel: [233564.845395] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec36000
Apr 21 06:21:40 dmwoodlx kernel: [233564.845528] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec71000
Apr 21 06:21:40 dmwoodlx kernel: [233564.845571] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cec91000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380752] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cecf1000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380796] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced11000
Apr 21 06:51:42 dmwoodlx kernel: [235367.380853] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced31000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826064] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ced7d000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826201] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cedb1000
Apr 21 07:21:44 dmwoodlx kernel: [237169.826223] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cedd1000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365676] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee31000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365729] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee51000
Apr 21 07:51:45 dmwoodlx kernel: [238972.365754] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cee71000
Apr 21 08:21:47 dmwoodlx kernel: [240774.876977] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceebe000
Apr 21 08:21:47 dmwoodlx kernel: [240774.877030] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceef1000
Apr 21 08:21:47 dmwoodlx kernel: [240774.877048] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef11000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360005] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef71000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360022] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cef91000
Apr 21 08:51:49 dmwoodlx kernel: [242577.360389] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cefb1000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835306] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000ceffe000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835333] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cf031000
Apr 21 09:21:51 dmwoodlx kernel: [244379.835368] hwdev DMA mask = 0x000000007fffffff, dev_addr = 0x00000000cf051000
Apr 21 10:30:50 dmwoodlx syslog-ng[2705]: syslog-ng starting up; version='3.4.7'
```

One thing that is consistent with this issue is the presence of these messages in my kernel log. They are always the last thing in the log before the problem occurs. You can see that once this starts happening, it happens every 30 minutes.
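
The 30-minute spacing can be checked mechanically. This is a minimal sketch, assuming the syslog line format shown above; `burst_gaps` is a made-up helper name, and the year has to be supplied because syslog timestamps omit it (2014 is assumed here from the `emerge --info` tree timestamp).

```python
import re
from datetime import datetime

# Matches only the kernel "hwdev DMA mask" lines and captures the syslog timestamp.
LOG_LINE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: \[\s*[\d.]+\] hwdev DMA mask"
)


def burst_gaps(lines, year=2014):
    """Return the gaps, in seconds, between distinct timestamps of DMA-mask messages.

    `year` is an assumption; the log itself does not record it.
    """
    stamps = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue
        ts = datetime.strptime("%d %s" % (year, m.group(1)), "%Y %b %d %H:%M:%S")
        # Messages logged in the same second belong to the same burst.
        if not stamps or ts != stamps[-1]:
            stamps.append(ts)
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
```

Feeding it the lines of the kernel log should give gaps of roughly 1800 seconds for the excerpt above. A regular interval like that may point to something periodic (a timer or scheduled job) triggering the allocation, though the log alone does not say what.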

Here is the code in the kernel that triggers this message:

lib/swiotlb.c

```
void *
swiotlb_alloc_coherent(struct device *hwdev, size_t size,
                       dma_addr_t *dma_handle, gfp_t flags)
{
        dma_addr_t dev_addr;
        void *ret;
        int order = get_order(size);
        u64 dma_mask = DMA_BIT_MASK(32);

        if (hwdev && hwdev->coherent_dma_mask)
                dma_mask = hwdev->coherent_dma_mask;

        ret = (void *)__get_free_pages(flags, order);
        if (ret) {
                dev_addr = swiotlb_virt_to_bus(hwdev, ret);
                if (dev_addr + size - 1 > dma_mask) {
                        /*
                         * The allocated memory isn't reachable by the device.
                         */
                        free_pages((unsigned long) ret, order);
                        ret = NULL;
                }
        }
        if (!ret) {
                /*
                 * We are either out of memory or the device can't DMA to
                 * GFP_DMA memory; fall back on map_single(), which
                 * will grab memory from the lowest available address range.
                 */
                phys_addr_t paddr = map_single(hwdev, 0, size, DMA_FROM_DEVICE);
                if (paddr == SWIOTLB_MAP_ERROR)
                        return NULL;

                ret = phys_to_virt(paddr);
                dev_addr = phys_to_dma(hwdev, paddr);

                /* Confirm address can be DMA'd by device */
                if (dev_addr + size - 1 > dma_mask) {
                        printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n",
                               (unsigned long long)dma_mask,
                               (unsigned long long)dev_addr);

                        /* DMA_TO_DEVICE to avoid memcpy in unmap_single */
                        swiotlb_tbl_unmap_single(hwdev, paddr,
                                                 size, DMA_TO_DEVICE);
                        return NULL;
                }
        }

        *dma_handle = dev_addr;
        memset(ret, 0, size);

        return ret;
}
EXPORT_SYMBOL(swiotlb_alloc_coherent);
```

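To get there, the initial __get_free_pages() allocation must either have failed (no free pages) or returned memory the device can't reach, and the fallback buffer from map_single() must then also lie above the device's DMA mask.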
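
The logged values can be run through the same comparison by hand. This is a rough sketch only: the allocation size is not logged, so a single 4 KiB page is assumed, and `reachable` is a made-up helper that mirrors the `dev_addr + size - 1 > dma_mask` test from the function above.

```python
PAGE_SIZE = 4096  # assumption: one 4 KiB page; the log does not record the size


def reachable(dev_addr, size, dma_mask):
    """True if the last byte of the buffer is still within the device's DMA mask."""
    return dev_addr + size - 1 <= dma_mask


# Values copied from the first logged message.
dma_mask = 0x000000007fffffff
dev_addr = 0x00000000cec36000

print("mask width in bits:", dma_mask.bit_length())
print("highest reachable address (GiB): %.2f" % ((dma_mask + 1) / 2**30))
print("logged dev_addr (GiB): %.2f" % (dev_addr / 2**30))
print("reachable:", reachable(dev_addr, PAGE_SIZE, dma_mask))
```

The mask `0x7fffffff` is 31 bits wide, so whichever device is requesting the buffer can only address the first 2 GiB, while every `dev_addr` in the log is above 3 GiB. The messages do not say which device that is.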

If anyone knows what could cause this or how I might elicit more information from my system I would appreciate it.

----------

## Hyper_Eye

I'm still hoping for a solution to this problem. Should I post this issue somewhere else? Is there a place where someone may have a better idea of what is causing the DMA allocation errors? Thanks.

----------

## toneus

I am experiencing many if not all of the same symptoms. In my search for an answer, I tested the memory on my server with memtest86. My RAM is failing Test #2 gloriously! Test #2 is the parallel address test.

 *Quote:*   

> Test 1 [Address test, own address, Sequential]
> 
>     Each address is written with its own address and then is checked for consistency. In theory previous tests should have caught any memory addressing problems. This test should catch any addressing errors that somehow were not previously detected. This test is done sequentially with each available CPU.
> 
> Test 2 [Address test, own address, Parallel]
> ...

 

I have run memtest86 on each individual RAM module, in different memory slots. Each continues to fail Test #2. I don't think it's a module or motherboard issue.

This has now happened for my last two kernel upgrades, and I'm currently using Linux 3.12.13-gentoo.

Did we miss an important setting or a RAM-related change in the latest kernels?

This is one of the most difficult problems I've ever encountered on Linux. There is literally no smoking gun, no log message, and no way to access the system once it is in this state.

Any help would be greatly appreciated!

Toneus

----------

## Hyper_Eye

Are you running memtest in Gentoo? Booting to a memtest CD I don't get any memory errors.

----------

## toneus

I am running memtest at boot, basically the same way as from a CD.

----------

## chithanh

Maybe it is a problem with your network card. It looks like the ethernet port is left in a state that causes problems for your router. You could try one or more of the following:

- Install a different network card in the computer
- Force the ethernet connection to 100 Mbps with ethtool
- Connect another ethernet switch between the router and the computer

----------

