# Another 'What happened to my WIFI' thread...

## dpshak

Well, I've not had to ask for help here for a while, between this forum and google, I've been able to find the answers...  THIS one has me stumped though, and I can't seem to find a solution!   :Confused: 

Recently I did a world update, which installed OpenRC 0.10.2, and I updated the kernel from 3.3.4 to 3.4.0.  When I update a kernel, I always copy the .config from the old kernel into the new source tree and run 'make oldconfig', addressing whatever pops up during the process.  On occasion, this backfires on me, but I always manage to figure it out...

Now, when I started 3.4.0 for the first time, I noticed my wireless wasn't working.  So I tried to start it by hand:

```
Truck ~ # /etc/init.d/net.wlan0 restart
 * Bringing up interface wlan0
 *   Configuring wireless network for wlan0
 *   Scanning for access points
 *     Found "DPSHomeNet" at 00:12:17:30:F9:3E, managed, encrypted
Error for wireless request "Set AP Address" (8B14) :
    SET failed on device wlan0 ; Device or resource busy.
 *   Connecting to "DPSHomeNet" in managed mode (WEP enabled) ...    [ !! ]
 *   Couldn't associate with any access points on wlan0
 *   Failed to configure wireless for wlan0
 * ERROR: net.wlan0 failed to start
```

Um, WHAT?!?

Googling around, I didn't find much that applied to MY problem, but I did find something about having net.wlan0 in the boot runlevel (as opposed to the default runlevel).
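For anyone checking the same thing, the runlevel a service sits in can be read from `rc-update show`. The following is only a minimal sketch: the helper name `service_runlevels` is made up for illustration, and it assumes each output line has the form `net.wlan0 | default` (service name, a pipe, then the runlevels), with the exact padding being a guess.

```shell
# Hypothetical helper: read `rc-update show` output on stdin and print the
# runlevels listed for one service. Assumes lines of the form
# "   net.wlan0 |      default" (service name, a pipe, then runlevels).
service_runlevels() {
    awk -F'|' -v svc="$1" '
        {
            name = $1
            gsub(/[ \t]/, "", name)                  # strip padding around the name
            if (name == svc) {
                levels = $2
                gsub(/^[ \t]+|[ \t]+$/, "", levels)  # trim the runlevel list
                gsub(/[ \t]+/, " ", levels)          # collapse inner padding
                print levels
            }
        }'
}

# Usage (on the affected machine):
#   rc-update show | service_runlevels net.wlan0
```

If this prints `boot` instead of `default` for net.wlan0, the service is in the runlevel that the search results were talking about.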

I also took a look at my 'net' config file:

```
# This blank configuration will automatically use DHCP for any net.*
# scripts in /etc/init.d.  To create a more complete configuration,
# please review /usr/share/doc/openrc/net.example and save your configuration
# in /etc/conf.d/net (this file :]!).

# Hardwired Network
config_eth0="dhcp"

# Wireless Network
#config_wlan0="dhcp"
modules="iwconfig"
config_wlan0="dhcp"
mode_wlan0="managed"
preferred_aps="DPSHomeNet schneider"
key_DPSHomeNet="####-####-####-####-####-####-## enc open"
```

I also moved the 'modules="iwconfig"' above the 'config_wlan0="dhcp"' just in case.... despite the fact that this has been working for years!

Rebooting and watching the init process, it tried TWICE to connect to the access point, instead of the once it usually does, failing both times.  Once the system was up and running, sans wireless, I tried to start it manually:

```
Truck ~ # /etc/init.d/net.wlan0 restart
 * Bringing up interface wlan0
 *   Configuring wireless network for wlan0
 *   Scanning for access points
 *     Found "DPSHomeNet" at 00:12:17:30:F9:3E, managed, encrypted
Error for wireless request "Set AP Address" (8B14) :
    SET failed on device wlan0 ; Device or resource busy.
 *   Connecting to "DPSHomeNet" in managed mode (WEP enabled) ...    [ ok ]
 *     wlan0 connected to SSID "DPSHomeNet" at 00:12:17:30:F9:3E
 *     in managed mode (WEP enabled)
 *   dhcp ...
 *     Running dhcpcd ...
dhcpcd[2968]: version 5.5.6 starting
dhcpcd[2968]: all: not configured to accept IPv6 RAs
dhcpcd[2968]: wlan0: rebinding lease of 192.168.1.105
dhcpcd[2968]: wlan0: carrier lost
dhcpcd[2968]: wlan0: broadcasting for a lease
dhcpcd[2968]: timed out                                              [ !! ]
 * ERROR: net.wlan0 failed to start
```

I'm lost and don't really have any idea where to look for the solution!  There were no substantial changes, at least that looked to me like they would cause this problem, in the new kernel...  Where should I start?!?

BTW: I'm scribbling this, from the same machine, using 3.3.4.  EVERYTHING else seems to be the same...i.e. OpenRC 0.10.2, same configuration files, etc.

Here's what a restart looks like in 3.3.4:

```
Truck ~ # /etc/init.d/net.wlan0 restart
 * Unmounting network filesystems ...                                [ ok ]
 * Bringing down interface wlan0
 *   Stopping dhcpcd on wlan0 ...                                    [ ok ]
 *   Removing addresses
 * Bringing up interface wlan0
 *   Configuring wireless network for wlan0
 *   Scanning for access points
 *     Found "DPSHomeNet" at 00:12:17:30:F9:3E, managed, encrypted
 *   Connecting to "DPSHomeNet" in managed mode (WEP enabled) ...    [ ok ]
 *     wlan0 connected to SSID "DPSHomeNet" at 00:12:17:30:F9:3E
 *     in managed mode (WEP enabled)
 *   dhcp ...
 *     Running dhcpcd ...
dhcpcd[2912]: version 5.5.6 starting
dhcpcd[2912]: all: not configured to accept IPv6 RAs
dhcpcd[2912]: wlan0: rebinding lease of 192.168.1.105
dhcpcd[2912]: wlan0: acknowledged 192.168.1.105 from 192.168.1.1
dhcpcd[2912]: wlan0: checking for 192.168.1.105
dhcpcd[2912]: wlan0: leased 192.168.1.105 for 86400 seconds
dhcpcd[2912]: forked to background, child pid 2941                   [ ok ]
 *     received address 192.168.1.105/24
```

And here are the running modules, under BOTH kernels:

```
Truck ~ # lsmod
Module                  Size  Used by
vboxnetadp             17478  0
vboxnetflt             14861  0
vboxdrv              1779580  2 vboxnetflt,vboxnetadp
iwlwifi               189536  0
mac80211              197884  1 iwlwifi
cfg80211              165762  2 mac80211,iwlwifi
```

I'm STUMPED!  Anybody have any ideas how I can troubleshoot this?

----------

## olek

You could first try to compare the old kernel's config with the new one. It really sounds like a configuration issue to me.

----------

## dpshak

 *olek wrote:*   

> You could first try to compare the old kernel's config with the new one. It really sounds like a configuration issue to me.

 

That's basically what happens, at least as I understand it, when you do a 'make oldconfig'.  It changes the configuration of the new kernel to match the old kernel and asks you about anything it finds in the new kernel that wasn't in the old kernel.

Having said that, that was my first thought too.  I DID look, but I didn't see any changes in the new kernel that looked, to me, like they would affect the wifi!  Maybe I missed something?!?
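Since 'make oldconfig' can still leave the two configurations different (new options get answers, renamed options disappear), it may help to look only at the wireless-related changes. A minimal sketch, assuming both config files are at hand; the helper name `wireless_config_changes` and the list of symbol prefixes are illustrative guesses, not a complete list of what matters:

```shell
# Hypothetical helper: show only the added/removed lines of a kernel config
# diff that mention wireless-related symbols. The symbol prefixes below are
# a guess at what is relevant, not a complete list.
wireless_config_changes() {
    diff -u "$1" "$2" |
        grep -E '^[+-](# )?CONFIG_(IWL|MAC80211|CFG80211|WLAN|WIRELESS|WEXT|RFKILL)'
}

# Usage, old config first (file names are placeholders):
#   wireless_config_changes old.config new.config
```

Lines starting with `-` come from the old config and lines starting with `+` from the new one, so a pair such as `-# CONFIG_X is not set` / `+CONFIG_X=y` marks an option that got switched on.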

----------

## Rexilion

Could you post the output of:

 *Quote:*   

> diff -u oldkernelconfig newkernelconfig
> 
> dmesg # under the new kernel

 

Please?

----------

## dpshak

 *Rexilion wrote:*   

> Could you post the output of:
> 
>  *Quote:*   diff -u oldkernelconfig newkernelconfig
> 
> dmesg # under the new kernel 
> ...

 

Here is the diff: http://pastebin.com/FfCJGKzV

And here is the dmesg from 3.4.0: http://pastebin.com/XAMXi3Un

Hopefully that works, I just created a pastebin account and have never used it before...  Thanks for looking!

ETA:  I DID make some changes to the 3.4.0 kernel while I was checking the configuration.  As you can see, it is build #5.  Nonetheless, I don't THINK I made any changes that should affect the WIFI!

----------

## Rexilion

 *dpshak wrote:*   

> ETA:  I DID make some changes to the 3.4.0 kernel while I was checking the configuration.  As you can see, it is build #5.  Nonetheless, I don't THINK I made any changes that should affect the WIFI!

 

But you did:

 *Quote:*   

> +CONFIG_IWLWIFI_DEBUG=y
> 
> +CONFIG_IWLWIFI_DEBUG_EXPERIMENTAL_UCODE=y

 

I see some more red flags in there, but experimental_ucode stands out quite a bit   :Exclamation: 

----------

## dpshak

 *Rexilion wrote:*   

>  *dpshak wrote:*   ETA:  I DID make some changes to the 3.4.0 kernel while I was checking the configuration.  As you can see, it is build #5.  Nonetheless, I don't THINK I made any changes that should affect the WIFI! 
> 
> But you did:
> 
>  *Quote:*   +CONFIG_IWLWIFI_DEBUG=y
> ...

 

I HAVE experimented with a number of different configuration settings to see what, if any, effect they would have on the problem; OR, if they could help troubleshoot the problem.  That was one of the items I tested - both set and unset.  It failed to solve/help solve the problem.  As a matter of fact, as I was scribbling this, I went ahead and unset it and recompiled the kernel.  Still the same problem.   :Confused: 

If you would be so kind as to point out the other red flags that you see, I will make the changes and see what happens!  

Thanks for the help so far!   :Smile: 

----------

## Rexilion

First of all, where do you get your ucode from?

I suggest you try the package linux-firmware:

 *Quote:*   

> emerge -q1 linux-firmware

 

and later in your config, I see this:

 *Quote:*   

>  CONFIG_PCIEASPM=y

 

You have not changed it, but try disabling it.

Try the above suggestions one by one. Good luck!

----------

## dpshak

 *Rexilion wrote:*   

> First of all, where do you get your ucode from?
> 
> I suggest you try the package linux-firmware:
> 
>  *Quote:*   emerge -q1 linux-firmware 
> ...

 

Hmmm PFM!!!    :Shocked: 

I originally downloaded the ucode from intellinuxwireless.org.  So, I started by installing the linux-firmware package. This overwrote the ucode that I had originally installed.  BUT after restarting the laptop (BTW, the wireless failed to connect to my AP) I checked dmesg and it reported the same version and build that I originally had.  <shrugs>  It was worth a try!

Next, I tried unsetting the CONFIG_PCIEASPM.  I CAN'T unset it - or at least, I don't know how!  I tried manually changing it in the .config file, rebuilt the kernel, checked the new config and it was still set.  Then I tried 'make menuconfig' and discovered that that option is one of those with the '-*-' and can't be changed.  The only way I seem to be able to make it go away is to uncheck PCIE and I don't think that will work so good!    :Very Happy: 
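For context, '-*-' in menuconfig means some other enabled option selects the symbol, which is why editing .config by hand gets undone on the next build. A quick way to confirm what actually ended up in the config is sketched below; the helper name `config_symbol_state` is made up for illustration and the path in the usage line is only an example. Separately, the kernel documents a `pcie_aspm=off` boot parameter, which might be a way to test with ASPM disabled without touching the config at all.

```shell
# Hypothetical helper: report how a symbol is set in a kernel .config file.
# Prints the value (y, m, ...) or "not set" if the symbol is disabled or absent.
config_symbol_state() {
    file=$1
    sym=$2
    # Match only "SYMBOL=value" lines, so CONFIG_PCIEASPM does not also
    # pick up CONFIG_PCIEASPM_DEBUG.
    value=$(sed -n "s/^${sym}=//p" "$file")
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'not set\n'
    fi
}

# Usage (path is an example):
#   config_symbol_state /usr/src/linux/.config CONFIG_PCIEASPM
```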

lspci reports: 

```
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
```

Now, to the PFM!  I've been swapping back and forth between the kernels so I could get on the 'net when needed.  I noticed that the last couple of times I booted into the earlier kernel (both before and after installing the linux-firmware package), the wireless wouldn't associate with my access point during the init process.  BUT, I could manually start it and it worked fine.  The last TWO TIMES I booted into the new kernel, I was able to get the wireless to associate with my AP by starting it manually!    :Shocked:   :Confused: 

I did, between our last responses, do another world update.  The updates included: dbus, nspr, syslog-ng, nss, perl-core/Digest-MD5, virtual/Digest-MD5, udev, libsdl, pango, java-config, psmisc, thunderbird, and openssh.  

Is it possible that udev or, maybe, dbus is causing the problem?  Based on the files in my first post, showing the manual startup of the wireless, it looked to me like 2 processes(?) were trying to control it.  One process that saw the wireless AP, connected to it and fired off dhcp and another that didn't see the AP and shut down the card - hence the 'carrier lost' message while dhcp was running?!?  
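One rough way to test the "two processes" guess is to list which daemons that commonly configure interfaces are running at the moment the connection drops. This is only a sketch: the helper name `interface_managers` is made up, and the list of daemon names is an assumption that is certainly not exhaustive.

```shell
# Hypothetical helper: filter a process listing on stdin down to daemons that
# commonly configure network interfaces. The name list is an assumption and
# is not exhaustive.
interface_managers() {
    # "|| true" keeps the exit status clean when nothing matches.
    grep -E '(wpa_supplicant|NetworkManager|wicd|connmand|dhcpcd|dhclient)' || true
}

# Usage:
#   ps -e -o pid,args | interface_managers
```

If anything besides the dhcpcd started by net.wlan0 shows up, that would at least be consistent with something else bringing the card down while dhcp is running.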

Again, thanks for your help so far, we seem to be heading in the right direction!  Any more thoughts and ideas will be appreciated!  Unfortunately, I'm back to work tomorrow (the laptop goes with me in the truck) so I may not be able to check back too frequently.

----------

## BillWho

dpshak,

You mentioned doing a world update - did you check which ebuilds were emerged:   :Question: 

```
genlop -l  --date 06/06/2012  --date 06/08/2012
```

This will list the emerged ebuilds for 6/6 and 6/7

You also mentioned that it looked like 2 processes were trying to take control. Did you check:

```
cat /etc/udev/rules.d/70-persistent-net.rules
```

Good luck   :Wink: 

----------

## dpshak

 *BillWho wrote:*   

> dpshak,
> 
> You mentioned doing a world update - did you check which ebuilds were emerged:  
> 
> ```
> ...

 

Here is my package list: http://pastebin.com/LCwVy0Dx

I went ahead and included everything from 27May12, when I did the initial update that brought in the 3.4.0 kernel, right through 09Jun12 - in case you might notice something that looks suspect....

Here is my /etc/udev/rules.d/70-persistent-net.rules:

```
Truck rules.d # cat *-net.rules
# This file was automatically generated by the /lib64/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x10ec:0x8168 (r8169)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="54:04:a6:08:6c:60", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x0885 (iwlagn)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="40:25:c2:6f:63:d0", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="wlan*", NAME="wlan0"
```

  That looks good to me, but I'm going to rename the file, reboot the laptop and see what happens - then I'm off to work...  :Sad: 

Thanks for your ideas BillWho!

----------

## ashtophet

Same behaviour here (old ~x86 box, RTL3070 usb):

kernel 3.3.8(-hardened): it works flawlessly.

kernel 3.4.3(-hardened): looong time to associate to WPA, dhclient not working, statically assigned IP works.

Seems like a kernel regression to me... I have not tried 3.4.4 or 3.5-rc* to see whether it has already been fixed.

Searching around with ixquick turned up some neighbours: http://aptosid.com/index.php?name=PNphpBB2&file=viewtopic&p=15138 https://bugs.archlinux.org/task/30319

----------

## ashtophet

Confirmed that the same thing happens with 3.4.0, 3.4.4 and 3.5-rc3. Reading the 3.4 changelog didn't give me any hints...

Definitely we are not alone: https://bugzilla.redhat.com/show_bug.cgi?id=831488 https://bugzilla.redhat.com/show_bug.cgi?id=828731

----------

## dpshak

Well after a little over a week in the truck, it's been hit and miss.  Sometimes it will connect during the init process and sometimes I have to manually start it after KDE starts.  Doesn't matter whether it's an encrypted or open access point.  

When I'm in the truck I generally use a cell modem, so I haven't played around with it too much.  I'm hesitant to mess with a working machine when I have to rely on it!

Given the number of different wifi 'cards' this is happening to, one would have to think it is a kernel problem.  Well, maybe by the time I do a world update, when I get home, at the end of the month, it will be fixed again!

----------

## grumblebear

Well, it seems different things are discussed in this thread. For me, the problem was DHCP not working with ath9k_htc, and the patch given here, http://permalink.gmane.org/gmane.linux.kernel.wireless.general/93099, fixes the issue.

As far as I could see, it has not made it into 3.4.4-rc, so if you want an official release, you might have to wait for 3.4.5.

----------

