# Improving battery life, disabling DGPU - bbswitch, nvidia

## Shoaloak

Hello gentoo(wo)men,

I want to disable my dedicated Nvidia GPU (Geforce 965m) when I'm not using it to improve the battery life of my laptop.

I am currently running bbswitch together with the proprietary nvidia blob which seems to be working.

```
$ glxspheres64 

Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)

Visual ID of window: 0xb2

Context is Direct

OpenGL Renderer: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2) 

61.603280 frames/sec - 68.749260 Mpixels/sec

$ optirun glxspheres64

Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)

Visual ID of window: 0x21

Context is Direct

OpenGL Renderer: GeForce GTX 965M/PCIe/SSE2

73.597201 frames/sec - 82.134477 Mpixels/sec
```

However, at boot I notice this message:

```
$ dmesg | grep bbswitch

[   39.704016] bbswitch: version 0.8

[   39.704020] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.GFX0

[   39.704025] bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.PEG0.PEGP

[   39.704132] bbswitch: detected an Optimus _DSM function

[   39.704137] bbswitch: device 0000:01:00.0 is in use by driver 'nvidia', refusing OFF

[   39.704160] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
```

The line that caught my attention is:

> device 0000:01:00.0 is in use by driver 'nvidia', refusing OFF

 

I think that the nvidia module is loaded before bbswitch, which is odd since I have this /etc/modprobe.d/bbswitch.conf file:

```
blacklist nvidia

blacklist nouveau

options bbswitch load_state=0
```

I also rebuilt my initramfs using buildkernel, which in turn uses genkernel.
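One thing worth ruling out here: a `blacklist` line only stops modprobe from loading a module automatically through its aliases. An explicit `modprobe nvidia` (for example from a service or an X server) still loads it. A minimal sketch for listing what is actually blacklisted follows; the function name `blacklisted_modules` is hypothetical, and it assumes the configuration lives in `*.conf` files inside a single directory.

```sh
# Hypothetical helper: print every module named on a "blacklist" line
# in the *.conf files of a modprobe.d-style directory.
blacklisted_modules() {
    dir=${1:-/etc/modprobe.d}
    cat "$dir"/*.conf 2>/dev/null |
        awk '$1 == "blacklist" { print $2 }' |
        sort -u
}
```

Called without an argument it reads /etc/modprobe.d. If nvidia shows up in the list and is still loaded at boot, something is probably requesting it by name rather than through autoloading.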

Removing the nvidia module myself is also not possible:

```
# rmmod nvidia

rmmod: ERROR: Module nvidia is in use by: nvidia_modeset

# rmmod nvidia_modeset

rmmod: ERROR: Module nvidia_modeset is in use by: nvidia_drm

# rmmod nvidia_drm

rmmod: ERROR: Module nvidia_drm is in use

# modprobe -r -i nvidia_drm

modprobe: FATAL: Module nvidia_drm is in use.
```
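The errors above follow the module dependency chain: a module can only be removed after every module that uses it is gone. A minimal sketch that derives the removal order from lsmod output is shown here; the function name `unload_order` is hypothetical, and it assumes the usual lsmod layout where the fourth column is a comma-separated list of users.

```sh
# Hypothetical helper: read lsmod output on stdin and print the given
# module together with everything that uses it, dependents first.
unload_order() {
    awk -v root="$1" '
        NR > 1 { users[$1] = $4 }          # skip the header line
        function visit(m,    n, i, list) {
            if (seen[m]++) return          # visit each module once
            n = split(users[m], list, ",")
            for (i = 1; i <= n; i++)
                if (list[i] != "") visit(list[i])
            print m                        # printed after its users
        }
        END { visit(root) }
    '
}
```

Usage would be `lsmod | unload_order nvidia`, feeding each printed name to `modprobe -r` in turn. If the first module in the list still reports "in use" although no other module is named as its user, a running process (such as an X server) is the likely holder; that part is an assumption, not something the output proves.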

Does anybody have an idea how I could fix this?

Somebody using Arch mentioned something about modules.

----------

## R0b0t1

Did you follow up on what the bug report says, where the card turns off after being used?

The fix is claimed to be in https://github.com/Bumblebee-Project/Bumblebee/pull/762, however I wasn't able to find anything explaining the startup behavior. When I used Ubuntu a number of years ago I believe I was having the same issue. This would have been right around the time Optimus came out.

----------

## Shoaloak

 *R0b0t1 wrote:*   

> Did you follow up on what the bug report says, where the card turns off after being used?

 

That is the point: my card is always on. I can't shut it down.

```
# cat /proc/acpi/bbswitch            

0000:01:00.0 ON

# tee /proc/acpi/bbswitch <<<OFF

OFF

# cat /proc/acpi/bbswitch            

0000:01:00.0 ON
```
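The check above (read the state, request OFF, read it again) can be wrapped so that a failed switch is reported through the exit status. This is only a sketch: the function names are hypothetical, and it assumes the status line keeps the "address STATE" layout shown above.

```sh
# Hypothetical helpers around the bbswitch status file.
bbswitch_state() {
    # The status line looks like "0000:01:00.0 ON"; print the last field.
    awk '{ print $NF }' "${1:-/proc/acpi/bbswitch}"
}

bbswitch_off() {
    # Request OFF, then re-read the state; succeed only if it changed.
    f=${1:-/proc/acpi/bbswitch}
    echo OFF > "$f" || return 1
    [ "$(bbswitch_state "$f")" = "OFF" ]
}
```

Run as root, `bbswitch_off || dmesg | tail` would show the most recent kernel messages whenever the card refuses to power down.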

 *R0b0t1 wrote:*   

> The fix is claimed to be in https://github.com/Bumblebee-Project/Bumblebee/pull/762, however I wasn't able to find anything explaining the startup behavior. When I used Ubuntu a number of years ago I believe I was having the same issue. This would have been right around the time Optimus came out.

 

I've read the bug reports but I can't find anything that helps me solve this problem   :Sad: 

----------

## Holysword

 *Shoaloak wrote:*   

>  *R0b0t1 wrote:*   Did you follow up on what the bug report says, where the card turns off after being used? 
> 
> This is the point, my card is always on. I can't shut it down.
> 
> ```
> ...

 

Are you using bumblebee/primus also? If so, try running primusrun glxgears, and afterwards run  dmesg | grep -C 10 bbswitch.

If you see a message like 

```
pci 0000:01:00.0: Refused to change power state, currently in D0
```

then welcome to the club. It is a bug with no known workaround, apparently related to a change in the KMS and power-management code in kernel 4.4 and later:

https://github.com/Bumblebee-Project/bbswitch/issues/140
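To check for that message without reading the whole log, the kernel output can be filtered for the two strings quoted in this thread. A small sketch, with a hypothetical function name:

```sh
# Hypothetical helper: keep only bbswitch lines and power-state refusals
# from kernel log text supplied on stdin.
power_messages() {
    grep -E 'bbswitch|Refused to change power state'
}
```

Typical use would be `dmesg | power_messages` right after a primusrun session has exited; the exit status is non-zero when nothing matches.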

----------

## firasuke

I've written an article on how to configure bumblebee on gentoo linux on my website and I'm constantly updating it (hopefully will add it to the gentoo wiki once I confirm it's 100% working), you may want to check it out:

https://www.dotslashlinux.com/2017/06/04/setting-up-bumblebee-on-gentoo-linux/

A couple of users have found it helpful, give it a shot and tell me what went wrong  :Razz: 

Good Luck

----------

## Holysword

 *firasuke wrote:*   

> I've written an article on how to configure bumblebee on gentoo linux on my website and I'm constantly updating it (hopefully will add it to the gentoo wiki once I confirm it's 100% working), you may want to check it out:
> 
> https://www.dotslashlinux.com/2017/06/04/setting-up-bumblebee-on-gentoo-linux/
> 
> A couple of users have found it helpful, give it a shot and tell me what went wrong 
> ...

 

Does your guide tackle the issue with bbswitch failing to change the ACPI state of the card?

----------

## firasuke

 *Holysword wrote:*   

>  *firasuke wrote:*   I've written an article on how to configure bumblebee on gentoo linux on my website and I'm constantly updating it (hopefully will add it to the gentoo wiki once I confirm it's 100% working), you may want to check it out:
> 
> https://www.dotslashlinux.com/2017/06/04/setting-up-bumblebee-on-gentoo-linux/
> 
> A couple of users have found it helpful, give it a shot and tell me what went wrong 
> ...

 

Yes, it does. Check the (optional) USE flags section, I've mentioned similar cases and included 2 workarounds to fix this problem.

Several users have reported that they work. Give them a try and tell me how things go.

----------

## Holysword

 *firasuke wrote:*   

>  *Holysword wrote:*    *firasuke wrote:*   I've written an article on how to configure bumblebee on gentoo linux on my website and I'm constantly updating it (hopefully will add it to the gentoo wiki once I confirm it's 100% working), you may want to check it out:
> 
> https://www.dotslashlinux.com/2017/06/04/setting-up-bumblebee-on-gentoo-linux/
> 
> A couple of users have found it helpful, give it a shot and tell me what went wrong 
> ...

 

That doesn't help here, sadly. The modules nvidia_modesetting and nvidia_uvm do not even exist, but I still fail to turn off the card.

----------

## firasuke

 *Holysword wrote:*   

>  *firasuke wrote:*    *Holysword wrote:*    *firasuke wrote:*   I've written an article on how to configure bumblebee on gentoo linux on my website and I'm constantly updating it (hopefully will add it to the gentoo wiki once I confirm it's 100% working), you may want to check it out:
> 
> https://www.dotslashlinux.com/2017/06/04/setting-up-bumblebee-on-gentoo-linux/
> 
> A couple of users have found it helpful, give it a shot and tell me what went wrong 
> ...

 

Sorry to hear that. I'd suggest that you go through the guide again, this time in greater detail. Check the comments section as well, since some users have run into a situation similar to yours.

I'd recommend that you double-check the USE flags of your nvidia-drivers and re-emerge them. If possible, also remove the /lib/modules/YOURKERNELVERSION directory, as it might still contain the nvidia_kms and uvm modules: even if you removed the USE flags, the old modules may not have been removed, so you have to make sure manually that every nvidia module other than nvidia.ko is gone.
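A quick way to see which nvidia modules are still installed is to search the module tree. The sketch below uses a hypothetical function name and assumes the files are named `nvidia*.ko`, possibly with a compression suffix:

```sh
# Hypothetical helper: list nvidia kernel module files under a module
# directory (defaults to the tree of the running kernel).
nvidia_modules() {
    find "${1:-/lib/modules/$(uname -r)}" -type f -name 'nvidia*.ko*' | sort
}
```

If this prints anything besides nvidia.ko, the extra modules survived the rebuild and can still be loaded.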

I'd also recommend that you switch to the live versions (-9999) of bbswitch, bumblebee and primus.

Hopefully, it'll work for you this time!

If the error persists, leave a comment here and on the website (with logs if possible), as I'm much more active there.

Best of luck

----------

## Holysword

 *firasuke wrote:*   

>  *Holysword wrote:*    *firasuke wrote:*    *Holysword wrote:*    *firasuke wrote:*   I've written an article on how to configure bumblebee on gentoo linux on my website and I'm constantly updating it (hopefully will add it to the gentoo wiki once I confirm it's 100% working), you may want to check it out:
> 
> https://www.dotslashlinux.com/2017/06/04/setting-up-bumblebee-on-gentoo-linux/
> 
> A couple of users have found it helpful, give it a shot and tell me what went wrong 
> ...

 

I've done all those suggestions even before your comment.

It really doesn't work.

There is a thread on GitHub about that; apparently the people with Lenovo laptops are the lucky ones, because some workarounds work for them.

----------

## firasuke

 *Holysword wrote:*   

>  *firasuke wrote:*    *Holysword wrote:*   
> 
> That doesn't help here, sadly. The modules nvidia_modesetting and nvidia_uvm do not even exist, but I still fail to turn off the card. 
> 
> Sorry to hear that. I'd suggest that you go through the guide again, this time in greater detail. Check the comments' section as well as some users have undergone a situation similar to yours.
> ...

 

I'm using a Toshiba laptop, with a super buggy bios and it works fine for me (and it worked fine for several users, some were using Thinkpads others were using ASUS, the list goes on and on).

Your dmesg should only include "nvidia". What does lsmod show? Are you sure only nvidia is being loaded? How about nvidia_drm, this shouldn't be loading if you've followed the guide properly.

Well I really hoped for some logs but if you're sure it's an upstream problem then I do hope that it gets fixed soon.

----------

## Holysword

 *firasuke wrote:*   

> I'm using a Toshiba laptop, with a super buggy bios and it works fine for me (and it worked fine for several users, some were using Thinkpads others were using ASUS, the list goes on and on).
> 
> Your dmesg should only include "nvidia". What does lsmod show? Are you sure only nvidia is being loaded? How about nvidia_drm, this shouldn't be loading if you've followed the guide properly.
> 
> Well I really hoped for some logs but if you're sure it's an upstream problem then I do hope that it gets fixed soon.

 

I'm not *sure* that it is due to upstream, but so far I am convinced that it is   :Razz: 

The module nvidia-drm does not exist here:

```
sleipnir ~ # uname -a

Linux sleipnir 4.9.16-gentoo #9 SMP Fri Sep 8 20:27:48 CEST 2017 x86_64 Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz GenuineIntel GNU/Linux

sleipnir ~ # find /lib/modules/4.9.16-gentoo/ -name *nvidia*.ko

/lib/modules/4.9.16-gentoo/video/nvidia.ko

sleipnir ~ #
```

With some combinations of kernel+nvidia-drivers, the fan starts as soon as bumblebee starts and never turns off again. With other combinations, the fan starts only after calling primusrun. Either way, it's not possible to turn off the GPU, and the kernel logs this complaint:

```
[  306.471528] pci 0000:01:00.0: Refused to change power state, currently in D0
```

There are a bunch of other errors and warnings though. I'm not quite sure what they mean; you can find my full dmesg here: https://pastebin.com/5BFpGw57

----------

## firasuke

 *Holysword wrote:*   

>  *firasuke wrote:*   I'm using a Toshiba laptop, with a super buggy bios and it works fine for me (and it worked fine for several users, some were using Thinkpads others were using ASUS, the list goes on and on).
> 
> Your dmesg should only include "nvidia". What does lsmod show? Are you sure only nvidia is being loaded? How about nvidia_drm, this shouldn't be loading if you've followed the guide properly.
> 
> Well I really hoped for some logs but if you're sure it's an upstream problem then I do hope that it gets fixed soon. 
> ...

 

Looking through your dmesg I found something interesting:

```
[  301.189363] vgaarb: this pci device is not a vga device

[  301.199609] vgaarb: this pci device is not a vga device
```

It should look similar to this:

```
[    1.926048] pci 0000:00:02.0: vgaarb: setting as boot VGA device

[    1.926100] pci 0000:00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none

[    1.926185] pci 0000:00:02.0: vgaarb: bridge control possible

[    1.926231] vgaarb: loaded

[    1.986304] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
```

Can you check your kernel config and see whether these options are set?

```
CONFIG_VGA_ARB=y

CONFIG_VGA_ARB_MAX_GPUS=2
```
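Looking up a single option in a kernel config file can be done with one sed call. This is a sketch with a hypothetical function name; the default path /usr/src/linux/.config is an assumption about where the config lives.

```sh
# Hypothetical helper: print the value of one kernel config symbol.
# $1 = symbol without the CONFIG_ prefix, $2 = config file (optional).
kconfig_value() {
    sed -n "s/^CONFIG_$1=//p" "${2:-/usr/src/linux/.config}"
}
```

For example, `kconfig_value VGA_ARB_MAX_GPUS` prints the configured GPU limit. An option that is unset or commented out as "is not set" prints nothing.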

----------

## Holysword

 *firasuke wrote:*   

> Looking through your dmesg I found something interesting:
> 
> ```
> [  301.189363] vgaarb: this pci device is not a vga device
> 
> ...

 

It looks like this:

```
Symbol: VGA_ARB [=y]                                                          

Type  : boolean                                                               

Prompt: VGA Arbitration                                                       

  Location:                                                                   

    -> Device Drivers                                                         

(1)   -> Graphics support                                                     

  Defined at drivers/gpu/vga/Kconfig:1                                        

  Depends on: HAS_IOMEM [=y] && PCI [=y] && !S390                             

  Selected by: VGA_SWITCHEROO [=n] && HAS_IOMEM [=y] && X86 [=y] && ACPI [=y] 

                                                                              

                                                                              

Symbol: VGA_ARB_MAX_GPUS [=16]                                                

Type  : integer                                                               

Prompt: Maximum number of GPUs                                                

  Location:                                                                   

    -> Device Drivers                                                         

      -> Graphics support                                                     

(2)     -> VGA Arbitration (VGA_ARB [=y])                                     

  Defined at drivers/gpu/vga/Kconfig:12                                       

  Depends on: HAS_IOMEM [=y] && VGA_ARB [=y]
```

I dunno where the 16 came from, but if you think it could be the source of the problem I have no objections to testing it.

----------

## firasuke

 *Holysword wrote:*   

>  *firasuke wrote:*   Looking through your dmesg I found something interesting:
> 
> ```
> [  301.189363] vgaarb: this pci device is not a vga device
> 
> ...

 

Setting CONFIG_VGA_ARB_MAX_GPUS to 16 is overkill; if you're using an Optimus laptop this should be set to 2 (it won't solve your current problem, though).

Looking through the web, this looks like a previous bug that was somehow fixed in 3.10 (Bugzilla Kernel 63641) (Bumblebee Github Issue #159)

I found a couple of workarounds. One of them is changing the BusID of your nvidia card in /etc/bumblebee/xorg.conf.nvidia (personally I don't think it has anything to do with the kernel bug above, but give it a try).

The other workaround that should work is patching vgaarb to allow it to detect the 3d controller (which is confusing as it was fixed according to the bugzilla link above). You can find the patch file here (vgaarb patch).

Then you apply the patch to your kernel:

```
patch -Np1 -i patch_file.patch
```

Hopefully, it should work if you applied the patch correctly. Keep me updated!

----------

## Holysword

 *firasuke wrote:*   

> Setting CONFIG_VGA_ARB_MAX_GPUS to 16 is an overkill, if you're using an optimus laptop this should be set to 2 (it won't solve your current problem though).
> 
> Looking through the web, this looks like a previous bug that was somehow fixed in 3.10 (Bugzilla Kernel 63641) (Bumblebee Github Issue #159)
> 
> I found a couple of workarounds, one of them is changing the BusID of your nvidia card in /etc/bumblebee/xorg.conf.nvidia (personally I don't think it has to do anything with the kernel bug above but give it a try).
> ...

 

16 is most likely the default.

This patch is from 2012, kernel-4.9 is from 2016. Is there any reason why it was not incorporated into the master version?

----------

## firasuke

 *Holysword wrote:*   

>  *firasuke wrote:*   Setting CONFIG_VGA_ARB_MAX_GPUS to 16 is an overkill, if you're using an optimus laptop this should be set to 2 (it won't solve your current problem though).
> 
> Looking through the web, this looks like a previous bug that was somehow fixed in 3.10 (Bugzilla Kernel 63641) (Bumblebee Github Issue #159)
> 
> I found a couple of workarounds, one of them is changing the BusID of your nvidia card in /etc/bumblebee/xorg.conf.nvidia (personally I don't think it has to do anything with the kernel bug above but give it a try).
> ...

 

Yes 16 is the default value.

Ikr, that's why I was confused... According to the bugzilla link, it was fixed in 3.10 which is really weird...

It won't hurt if you give it a try. Lemme know what happens after applying it, I'm curious   :Very Happy:   :Very Happy: 

----------

## Holysword

 *firasuke wrote:*   

> Yes 16 is the default value.
> 
> Ikr, that's why I was confused... According to the bugzilla link, it was fixed in 3.10 which is really weird...
> 
> It won't hurt if you give it a try. Lemme know what happens after applying it, I'm curious   

 

It does get rid of that error. However, it still cannot change the ACPI state of the card. This is the new dmesg: https://pastebin.com/6nKSHWMT

I don't know what this new line means, though:

```
[  106.141685] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none

```

----------

## firasuke

OK, I brought you some workarounds; let's see how many workarounds are needed to fix this thing   :Laughing: 

Try adding 

```
"acpi_osi=!Windows\x202013" acpi_osi=Linux nogpumanager
```

to your kernel command line, and update your bootloader configuration so that the change takes effect.

If you're using tlp or powertop or laptop-mode-tools, try disabling them for now or even uninstall them as they may interfere with bbswitch.

Another workaround is to play with your IOMMU kernel settings (although I personally see no benefit whatsoever from doing this but you can give it a try).

It looks like kernel commit 1cc0c998fdf2cb665d625fb565a0d6db5c81c639 was the root of all these problems.

Good Luck   :Very Happy: 

----------

## Holysword

 *firasuke wrote:*   

> Ok brought you some workarounds, let's see how many workarounds is needed to fix this thing  
> 
> Try adding 
> 
> ```
> ...

 

Is the second part of your command line correct? Would it recognise "Linux nogpumanager" with the space and all?

None of the mentioned tools are installed here.

----------

## firasuke

 *Holysword wrote:*   

>  *firasuke wrote:*   Ok brought you some workarounds, let's see how many workarounds is needed to fix this thing  
> 
> Try adding 
> 
> ```
> ...

 

I removed the outside quotes since they were ambiguous (I presumed that you would put the parameters in GRUB_CMDLINE_LINUX="HERE"). Add this to your kernel command line (boot parameters) with the quotes exactly as listed here. You can use ' (single quotation marks) instead of " (double) if your boot parameters line is itself wrapped in " (double), like GRUB's:

```
"acpi_osi=!Windows\x202013" acpi_osi=Linux nogpumanager
```

I noticed another thing earlier: you're using systemd. The guide on my website includes steps that modify bumblebee's service script and remove a couple of lines (the ones that check whether xorg is installed).

Did you make sure that you did the equivalent thing for systemd?

----------

## Holysword

 *firasuke wrote:*   

> Removed the outside quotes since they were ambiguous (I presumed that you should put them in GRUB_CMDLINE_LINUX="HERE"), add this to your kernel command line (boot parameters) with the quotes listed here and everything you can use ' (single quotation marks) instead of " (double) if your boot params line uses " (double) like grub's :
> 
> ```
> "acpi_osi=!Windows\x202013" acpi_osi=Linux nogpumanager
> ```
> ...

 

Nothing.

```
† sleipnir † ~ $  cat /proc/cmdline

root=/dev/sda2 rootfstype=ext4 init=/usr/lib/systemd/systemd acpi_osi=!Windows\x202013 acpi_osi=Linux nogpumanager
```
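Whether each parameter actually reached the kernel can be checked token by token instead of by eye. A minimal sketch, with a hypothetical function name, that compares literally so the backslash and the exclamation mark need no special handling:

```sh
# Hypothetical helper: succeed if the exact parameter appears as one
# space-separated token of the kernel command line.
# $1 = parameter text, $2 = cmdline file (defaults to /proc/cmdline).
has_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For instance, `has_param nogpumanager && echo present`. Single quotes around the argument keep the shell from touching `!` and `\`, as in `has_param 'acpi_osi=!Windows\x202013'`.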

----------

## firasuke

This is really getting interesting. If you don't mind, can you share some more logs?

Let's start with:

1- lspci -nnkkvvv

2- lsmod

3- your kernel's .config file (if possible)

4- rc-update (if possible)

5- /var/log/Xorg.0.log

6- /var/log/rc.log (if any/ if possible)
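The items in the list above can be gathered into one directory for posting. This is only a sketch: the function names and the output directory are hypothetical, missing tools and files are skipped silently, and reading the kernel config from /proc/config.gz is an assumption (it only exists when the kernel was built with that feature).

```sh
# Hypothetical helper: run a command and save its output, but only if
# the command exists on this system. $1 = target file, rest = command.
_save() {
    target=$1; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$target" 2>&1 || true
    fi
}

# Hypothetical helper: collect the requested logs into one directory.
collect_logs() {
    out=${1:-./gpu-logs}
    mkdir -p "$out" || return 1
    _save "$out/lspci.txt" lspci -nnkkvvv
    _save "$out/lsmod.txt" lsmod
    _save "$out/rc-update.txt" rc-update
    # Copy log and config files only when they are present and readable.
    for f in /proc/config.gz /var/log/Xorg.0.log /var/log/rc.log; do
        if [ -r "$f" ]; then
            cp "$f" "$out/" || true
        fi
    done
    return 0
}
```

Running `collect_logs /tmp/gpu-logs` as root leaves everything in one place, ready to be pasted or attached.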

I also updated my previous reply: my guide includes steps for modifying bumblebee's OpenRC service script to get it working. Did you do the equivalent for systemd?

Can you manually turn the card off? Or unload the nvidia module?

```
modprobe -r nvidia && echo "OFF" >> /proc/acpi/bbswitch
```

Did you try using different kernel versions? Different patchsets?

Can you try adding this to your kernel command line:

```
i915.enable_hd_vgaarb=1 enable_hd_vgaarb=1
```

Hopefully, we'll get this solved...

----------

## firasuke

OK, I've been trying to recreate the problem that you have, and I came upon one important file: /etc/modprobe.d/nvidia-rmmod.conf. Even with uvm and kms already disabled, this file looks like this:

```
# Nvidia UVM support

remove nvidia modprobe -r --ignore-remove nvidia-drm nvidia-modeset nvidia-uvm nvidia
```

Just make sure that you remove every other module except for nvidia, so the end file should look like this:

```
remove nvidia modprobe -r --ignore-remove nvidia
```
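To see which modules such a rule would try to unload, the `remove nvidia` line can be split into its module arguments. A sketch with a hypothetical function name; it assumes the rule has the "remove nvidia modprobe [options] modules..." shape shown above.

```sh
# Hypothetical helper: print the module names passed to modprobe in
# "remove nvidia ..." lines of a modprobe.d file ($1).
remove_rule_modules() {
    awk '$1 == "remove" && $2 == "nvidia" {
        for (i = 3; i <= NF; i++)
            if ($i != "modprobe" && $i !~ /^-/) print $i
    }' "$1"
}
```

After the edit described above, `remove_rule_modules /etc/modprobe.d/nvidia-rmmod.conf` should print only nvidia.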

Let me know if it worked for you!

----------

## Holysword

 *firasuke wrote:*   

> Ok, I've been trying to recreate the problem that you have, and I came upon one important file /etc/modprobe.d/nvidia-rmmod.conf, having already disabled uvm and ksm, this file looks like this:
> 
> ```
> # Nvidia UVM support
> 
> ...

 

Those modules do not exist on my machine.

I am starting a fresh new Gentoo Installation to see if it fixes the problem. It will take a while.

EDIT#1: It does not.

I'll stick to some other Distro until Gentoo gets usable again.

----------

## Holysword

This gets weirder over time.

After installing Gentoo again from scratch (two times) I got the "same behaviour", except that, if I start bumblebee and quickly start an application via primusrun (anything, e.g. primusrun xterm) and *never* close this window, the GPU fan doesn't start.

I can then use primus normally, but the fan never turns on no matter what. I don't know if that's good or bad, or if it means that my GPU is permanently at its lowest setting. Moreover, "primusrun nvidia-settings" complains that my nvidia card is not in use, while primusrun glxinfo returns the correct information. Something else that confuses me is the following:

```
† sleipnir † ~ $  LIBGL_DEBUG=verbose primusrun glxgears

libGL: OpenDriver: trying /usr/lib64/dri/tls/i965_dri.so

libGL: OpenDriver: trying /usr/lib64/dri/i965_dri.so

libGL: Can't open configuration file /home/holysword/.drirc: No such file or directory.

libGL: Using DRI2 for screen 0

libGL: OpenDriver: trying /usr/lib64/dri/tls/i965_dri.so

libGL: OpenDriver: trying /usr/lib64/dri/i965_dri.so

libGL: Can't open configuration file /home/holysword/.drirc: No such file or directory.

libGL: Using DRI2 for screen 0

libGL: Can't open configuration file /home/holysword/.drirc: No such file or directory.

^C

† sleipnir † ~ $
```

Why is it opening the i965 driver even under primus?

----------

