# CUDA doesn't recognize gpu? [SOLVED]

## scorch_dev

I'm running into issues after emerging the CUDA SDK and the CUDA toolkit. While both seem to emerge fine, CUDA doesn't seem to recognize my GPU. If I run the demo program query_devices from /opt/cuda/demo_suite, it returns "Error 30 ('Unknown Error')" and then exits. I'm running a GTX 1080 with cuda toolkit version 8.0.61 and nvidia sdk version 8.0.61, with nvidia-drivers version 375.26. lspci and dmesg both seem to indicate that nothing is out of the ordinary, though.

Last edited by scorch_dev on Wed Apr 05, 2017 2:11 pm; edited 2 times in total

----------

## Roman_Gruber

I'm not sure if this is related:

https://wiki.gentoo.org/wiki/NVidia/nvidia-drivers

 *Quote:*   

> Driver fails to initialize when MSI interrupts are enabled
> 
> The Linux NVIDIA driver uses Message Signaled Interrupts (MSI) by default. This provides compatibility and scalability benefits, mainly due to the avoidance of IRQ sharing. Some systems have been seen to have problems supporting MSI, while working fine with virtual wire interrupts. These problems manifest as an inability to start X with the NVIDIA driver, or CUDA initialization failures.
> 
> MSI interrupts can be disabled via the NVIDIA kernel module parameter NVreg_EnableMSI=0. This can be set on the command line when loading the module, or more appropriately via the distribution's kernel module configuration files (such as those under /etc/modprobe.d/). 

 

----------

## scorch_dev

 *Roman_Gruber wrote:*   

> i'm not sure if this is related:
> 
> https://wiki.gentoo.org/wiki/NVidia/nvidia-drivers
> 
>  *Quote:*   Driver fails to initialize when MSI interrupts are enabled
> ...

 

Thanks for the advice, I can try this. I had run across this before and tried disabling MSI interrupts in-kernel, which didn't fix the problem, but I hadn't checked the /etc/modprobe.d/ directory for the nvidia configuration file. I'll try that and get back.
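For anyone following along, the modprobe.d route from the wiki quote might look roughly like the sketch below. The option name `NVreg_EnableMSI=0` comes from the quoted wiki text; the file name `nvidia-msi.conf` is hypothetical, and reading `/proc/driver/nvidia/params` to confirm the setting is an assumption about where the loaded driver reports its parameters. By default the snippet writes to a scratch directory so it is harmless to run.

```shell
# Sketch: write the module option to a modprobe.d-style file.
# CONF_DIR defaults to a scratch directory so nothing on the system changes;
# set CONF_DIR=/etc/modprobe.d (as root) to apply it for real.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
CONF_FILE="$CONF_DIR/nvidia-msi.conf"   # hypothetical file name

# Option name taken from the wiki text quoted earlier in the thread.
printf 'options nvidia NVreg_EnableMSI=0\n' > "$CONF_FILE"
cat "$CONF_FILE"

# After the module has been reloaded (or after a reboot), the driver's active
# parameters can usually be inspected here. Skipped if the driver is not loaded.
if [ -r /proc/driver/nvidia/params ]; then
    grep -i 'EnableMSI' /proc/driver/nvidia/params || true
fi
```

The option only takes effect the next time the nvidia module is loaded, so the check at the end is meaningful only after a module reload or reboot.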

Since my last post I've also tried a few things, to no avail. I re-emerged each of the CUDA packages (the toolkit as well as the SDK), nvidia-drivers, and llvm. I also upgraded nvidia-drivers to 378.13 on the off chance that this would fix it, but I don't know the effect yet, because it caused an "NVRM: API mismatch" error when starting the X server. I'll trawl the forums to resolve that error and then see whether the original error is gone as well.
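One way to narrow down a version mismatch like this is to compare the version of the module that is currently loaded with the one installed on disk for the running kernel. This is only a sketch: `extract_version` is a hypothetical helper, the sample line is illustrative (its build date is elided) rather than captured from this machine, and it assumes the driver exposes `/proc/driver/nvidia/version` and that the module is named `nvidia`.

```shell
# Pull the first dotted version number (e.g. 375.26) out of a line of text.
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Illustrative line in the style of /proc/driver/nvidia/version.
sample='NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.26  <build date>'
extract_version "$sample"

# On a live system: compare the loaded module with the one installed on disk.
if [ -r /proc/driver/nvidia/version ] && command -v modinfo >/dev/null 2>&1; then
    loaded="$(extract_version "$(head -n 1 /proc/driver/nvidia/version)")"
    on_disk="$(modinfo -F version nvidia 2>/dev/null || true)"
    echo "loaded module: $loaded, module on disk: $on_disk"
fi
```

If the two numbers differ, the freshly emerged driver is not the one the running kernel has loaded.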

----------

## Roman_Gruber

 *scorch_dev wrote:*   

> 
> 
>  Though I don't know the effect of it yet, because it caused an "NVRM: API mismatch" error on starting the x-server. 

 

emerge new kernel source

change /usr/src/linux symlink to new kernel source

build new kernel (+ initramfs if wanted), always with a new name for the kernel. I use a date code which I append to my kernels

```
make --jobs 8 && make --jobs 8 modules_install
```

```
uname -a

Linux ASUS-G75VW 4.9.18-gentoo-28-03-2017 .....
```

This indicates that I built that kernel on the 28th of March. Name the file accordingly, and also set the kernel config option that names the kernel!

boot new kernel (adapt the bootloader by swapping the kernel name, and copy the files over)

verify new kernel is in use: uname -a

emerge nvidia-drivers

(usually reboot, but not needed anymore these days) lazy approach

use the x server

clean up old kernels / kernel sources / old kernel modules in /lib/modules/kernel-name-x-y-...

--

It is not strictly needed anymore these days, but the approach above is less fuss and gives you a clean state!

I'm quite sure people will post after me saying this is not needed, bla bla ...

If you want to be sure to always have a clean state, without fuss, use the approach above!

Do not use the grub scripts of destruction. Adapt the bootloader by hand. It is just a plain text file, very easy to edit with e.g. nano.
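The date-code naming mentioned above can be sketched in a few lines of shell. The helper names `datecode` and `kernel_release` are hypothetical, and the day-month-year format is inferred from the `uname -a` output shown earlier. The kernel option that appends a suffix to the release string is presumably `CONFIG_LOCALVERSION`, which is an assumption since the post does not name it.

```shell
# Build a date code such as 28-03-2017 (day-month-year).
datecode() {
    date +%d-%m-%Y
}

# Compose a release string from a base version and a date code.
# Both arguments are plain strings; nothing here touches the kernel tree.
kernel_release() {
    printf '%s-%s\n' "$1" "$2"
}

# The example from the uname output above, then one with today's date.
kernel_release "4.9.18-gentoo" "28-03-2017"
kernel_release "4.9.18-gentoo" "$(datecode)"
```

Using the same string for the kernel file name and the boot entry title makes it easy to see from `uname -a` which build is actually running.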

----------

## scorch_dev

 *Quote:*   

> 
> 
> emerge new kernel source 
> 
> change /usr/src/linux symlink to new kernel source 
> ...

 

This piece of advice led me to an important realization that ultimately led me to the solution, after several days. While trying to rebuild my kernel to fix the NVRM error, I realized that the running kernel didn't match the kernel my /usr/src/linux symlink pointed to: a quick uname -a showed that the kernel being loaded was 4.9.6-r1, even though I had upgraded to 4.9.16 about two weeks ago. So I was installing all of the drivers, kernel modules, etc. against a symlink that pointed to 4.9.16, while grub was loading 4.9.6-r1.

Apparently I had forgotten to update grub after my kernel upgrade. A quick grub-mkconfig call fixed the boot issue, and I then re-emerged nvidia-drivers, the cuda-toolkit, and the cuda-sdk. After all of this, I was able to successfully query the device and get everything working again. Thanks for the help.
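In case it helps someone else, the mismatch I ran into can be caught with a quick check along these lines. It is a minimal sketch: the helper names `tree_version` and `kernel_matches_tree` are made up for illustration, and it assumes the source tree directory is named `linux-<release>`, so a custom local-version suffix on the kernel would show up as a mismatch.

```shell
# Strip the "linux-" prefix from a source-tree directory name,
# e.g. /usr/src/linux-4.9.16-gentoo -> 4.9.16-gentoo.
tree_version() {
    basename "$1" | sed 's/^linux-//'
}

# Succeed only when the running release equals the source tree version.
kernel_matches_tree() {
    running="$1"
    tree="$(tree_version "$2")"
    [ "$running" = "$tree" ]
}

# Real check, only if the symlink exists on this machine.
if [ -L /usr/src/linux ]; then
    target="$(readlink -f /usr/src/linux)"
    if kernel_matches_tree "$(uname -r)" "$target"; then
        echo "OK: running kernel matches /usr/src/linux"
    else
        echo "MISMATCH: running $(uname -r), sources at $target"
    fi
fi
```

A mismatch means that anything built against /usr/src/linux, such as nvidia-drivers, ends up installed for a kernel other than the one that is booted.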

----------

## Roman_Gruber

I recommend updating grub by hand, it's just a plain text file. I usually just change the kernel name and the title of the boot section.

----------

