# [solved] NVIDIA: could not open the device /dev/nvidia0

## Gibbo_07

I've been running Linux + NVIDIA + Gentoo for many years now, but this time even I'm stumped. This card is known to have worked fine under Linux previously, and it still works fine in Windows or with nouveau. I recently upgraded my CPU and motherboard, so I did a fresh install, which otherwise works fine with nouveau.

nvidia-drivers builds fine and the module loads successfully, but startx produces the error in the title, along with "NVRM: rm_init_adapter(0) failed" in the kernel messages. I've been trying to solve this for a couple of months now with no luck. I'm hoping someone else has ideas, since everyone I've found with the same error seems to be running a laptop with Optimus, whereas this is a desktop.

```
lspci -k
01:00.0 VGA compatible controller: NVIDIA Corporation GF114 [GeForce GTX 560 Ti] (rev a1)
   Subsystem: Gigabyte Technology Co., Ltd Device 3515
   Kernel driver in use: nvidia
   Kernel modules: nvidia
```

```
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 319.23  (buildmeister@swio-display-x86-rhel47-11)  Thu May 16 20:17:21 PDT 2013

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from data in "/etc/conf.d/gpm"
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol"
    Option         "Device" "/dev/input/mice"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
```


/var/log/messages:

```
Jun  2 13:24:36 bulldozer kdm[4463]: X server died during startup
Jun  2 13:24:36 bulldozer kdm[4463]: X server for display :0 cannot be started, session disabled
Jun  2 13:24:36 bulldozer cron[4528]: (CRON) STARTUP (V5.0)
Jun  2 13:24:42 bulldozer login[4543]: pam_unix(login:session): session opened for user root by LOGIN(uid=0)
Jun  2 13:24:42 bulldozer login[4557]: ROOT LOGIN  on '/dev/tty1'
Jun  2 13:25:26 bulldozer kernel: [  100.485211] NVRM: RmInitAdapter failed! (0x26:0x38:1170)
Jun  2 13:25:26 bulldozer kernel: [  100.485241] NVRM: rm_init_adapter(0) failed
Jun  2 13:25:26 bulldozer kdm[5362]: X server died during startup
Jun  2 13:25:26 bulldozer kdm[5362]: X server for display :0 cannot be started, session disabled
```

Xorg.0.log:

http://temp-share.com/show/3YgF8v49x

dmesg:

```
[   55.224867] nvidia: module license 'NVIDIA' taints kernel.
[   55.224870] Disabling lock debugging due to kernel taint
[   55.231314] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
[   55.231454] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 0
[   55.231462] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  319.23  Thu May 16 19:36:02 PDT 2013
[   55.629289] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0019 address=0x00000000c6800000 flags=0x0030]
[   55.629293] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0019 address=0x000000041bcbf000 flags=0x0010]
[   63.693115] NVRM: RmInitAdapter failed! (0x26:0x38:1170)
[   63.693122] NVRM: rm_init_adapter(0) failed
[  100.358936] [drm] Module unloaded
[  102.713661] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=io+mem
[  102.713810] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 0
[  102.713819] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  319.23  Thu May 16 19:36:02 PDT 2013
[  107.165927] NVRM: Xid (0000:01:00): 62, !3852(7ffc)
[  115.166148] NVRM: RmInitAdapter failed! (0x26:0x38:1170)
[  115.166176] NVRM: rm_init_adapter(0) failed
[  147.506384] NVRM: RmInitAdapter failed! (0x26:0x38:1170)
[  147.506399] NVRM: rm_init_adapter(0) failed
[  534.754427] emerge (10252) used greatest stack depth: 4184 bytes left
[  535.930555] emerge (5570) used greatest stack depth: 3816 bytes left
[  544.952080] [drm] Module unloaded
[  547.294707] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=io+mem
[  547.294851] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 0
[  547.294859] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  319.23  Thu May 16 19:36:02 PDT 2013
[  559.747823] NVRM: RmInitAdapter failed! (0x26:0x38:1170)
[  559.747856] NVRM: rm_init_adapter(0) failed
[  571.855287] NVRM: RmInitAdapter failed! (0x26:0x38:1170)
[  571.855302] NVRM: rm_init_adapter(0) failed
[  668.036260] NVRM: RmInitAdapter failed! (0x26:0x38:1170)
[  668.036289] NVRM: rm_init_adapter(0) failed
[  771.022771] NVRM: RmInitAdapter failed! (0x26:0x38:1170)
[  771.022781] NVRM: rm_init_adapter(0) failed
```

What I've tried:

- Many kernel tweaks, so I'm now thinking the kernel config is not the problem, but I can post it if need be.

- /dev/nvidia0 exists and the permissions look fine. The device nodes seem to be created dynamically on each boot, so I'm not sure why the error occurs.

- Increasing vmalloc.

- The users starting X are definitely in the video group (and the other usual groups), so that shouldn't be the problem.

- Rolling back and forth between xorg-server and nvidia-drivers versions; otherwise the system is ~amd64 and current.
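
For anyone repeating the device node and group checks from the list above, here is a minimal Python sketch. It assumes a Linux system with the node at /dev/nvidia0 and a group named video; both function names are made up for illustration and are not part of any NVIDIA tooling.

```python
import grp
import os
import stat


def describe_device(path="/dev/nvidia0"):
    """Summarise a device node: existence, type, mode bits and r/w access."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return {"exists": False}
    return {
        "exists": True,
        # A working nvidia node should be a character device.
        "is_char_device": stat.S_ISCHR(st.st_mode),
        "mode": oct(stat.S_IMODE(st.st_mode)),
        "gid": st.st_gid,
        # Whether the user running this script can open it read/write.
        "rw_for_current_user": os.access(path, os.R_OK | os.W_OK),
    }


def user_in_group(user, group="video"):
    """True if `user` is listed as a supplementary member of `group`.

    Note: this does not catch users whose *primary* group is `group`.
    """
    try:
        return user in grp.getgrnam(group).gr_mem
    except KeyError:
        # The group does not exist on this system.
        return False


if __name__ == "__main__":
    print(describe_device())
```

If both checks look sane and X still cannot open the device, the cause is probably not permissions.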

Anyone got ideas? I'm sick of falling back to nouveau!

Last edited by Gibbo_07 on Wed Jun 05, 2013 10:03 am; edited 2 times in total

----------

## roarinelk

Unload nouveau or, better, disable it in the kernel? It and the nvidia blob don't coexist.

----------

## Gibbo_07

 *roarinelk wrote:*   

> Unload nouveau or, better, disable it in the kernel? It and the nvidia blob don't coexist.

 

Hi, thanks for the reply.

I forgot to clarify that I have built literally dozens of kernels, from 3.9.0 to the current 3.9.4. Some followed the Gentoo NVIDIA guide, with nouveau and even DRI completely deselected; others had DRI and nouveau as modules (so I could get to the desktop), with all sorts of variations in between. MTRR/PAT are enabled, etc. I always check that lspci -k shows the nvidia driver has the card, too.

This is a fresh kernel config for the new hardware and install, so I thought I must have missed something small, but now I'm just stumped :S Nothing seems to get me past this, and it's really frustrating  :Sad: 

Edit: I'll post my kernel config when I get home from work.

----------

## s4e8

Add "amd_iommu=off" to the kernel command line.
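
To confirm after a reboot that the parameter really reached the kernel, something like the following Python sketch could be used. It assumes the running kernel's command line is readable from /proc/cmdline; the helper names and the sample command line are hypothetical and only there for illustration.

```python
def has_kernel_param(cmdline, name, value=None):
    """True if `name` (optionally as `name=value`) is on a kernel command line."""
    for token in cmdline.split():
        key, _, val = token.partition("=")
        if key == name and (value is None or val == value):
            return True
    return False


def running_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


# Hypothetical command line, for illustration only.
sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda3 ro amd_iommu=off"
print(has_kernel_param(sample, "amd_iommu", "off"))
```

On the real system the call would be `has_kernel_param(running_cmdline(), "amd_iommu", "off")`.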

----------

## Gibbo_07

 *s4e8 wrote:*   

> Add "amd_iommu=off" to the kernel command line.

 

Curious how the IOMMU comes into play here; I hadn't considered changing it.

I can turn it off in the BIOS (it's currently on and set to 32 or 64 MB, IIRC).

So would disabling it in the BIOS or on the kernel command line be best? Would the kernel parameter disable only the kernel's IOMMU handling and not the one provided by the BIOS?

I thought the IOMMU was good for Linux with AMD (I've just gone Intel -> AMD, don't ask why  :Razz: ) 

Thanks for the tip, I shall investigate!

----------

## s4e8

 *Gibbo_07 wrote:*   

> Curious how the IOMMU comes into play here; I hadn't considered changing it.
>
> I can turn it off in the BIOS (it's currently on and set to 32 or 64 MB, IIRC).
> ...

 

Your dmesg reports IOMMU access violations from the video adapter at 01:00.0:

```
[   55.629289] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0019 address=0x00000000c6800000 flags=0x0030]
[   55.629293] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0019 address=0x000000041bcbf000 flags=0x0010]
```
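
These lines are easy to miss in a long dmesg. A rough Python sketch for pulling them out follows; the regular expression is an assumption based only on the two lines quoted above, and the function name is made up for illustration.

```python
import re

# Pattern modelled on the AMD-Vi lines quoted in this thread; other kernel
# versions may format these events differently.
FAULT_RE = re.compile(
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT device=(?P<device>[0-9a-f:.]+) "
    r"domain=(?P<domain>0x[0-9a-f]+) address=(?P<address>0x[0-9a-f]+) "
    r"flags=(?P<flags>0x[0-9a-f]+)\]"
)


def find_iommu_faults(dmesg_text):
    """Return one dict per IO_PAGE_FAULT line found in `dmesg_text`."""
    return [m.groupdict() for m in FAULT_RE.finditer(dmesg_text)]


# Two lines copied from the dmesg output earlier in the thread.
sample = (
    "[   55.629289] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 "
    "domain=0x0019 address=0x00000000c6800000 flags=0x0030]\n"
    "[   63.693115] NVRM: RmInitAdapter failed! (0x26:0x38:1170)\n"
)

for fault in find_iommu_faults(sample):
    print(fault["device"], fault["address"])
```

The reported device can then be compared with the slot shown by lspci to see which card triggered the fault.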

----------

## Gibbo_07

 *s4e8 wrote:*   

> 
> 
> 

 

This guy....

Spot on with the IOMMU being the culprit. However, I get the feeling it's a kernel bug, seeing as a workaround is needed for what I believe is a standard(?) IOMMU implementation on my ASUS Sabertooth 990FX 2.0. Not sure I want to know why anymore; I'm just glad to be on nvidia-drivers.

So cheers, 'cause I sure as heck am having a drink now that this battle is over!

----------

## Ant P.

If you believe it to be a kernel bug, report it to bugzilla.kernel.org.

----------

## Gibbo_07

 *Ant P. wrote:*   

> If you believe it to be a kernel bug, report it to bugzilla.kernel.org.

 

Shall do, after I figure out exactly why this happened.

My hardware is all very common parts that have been around for quite a while, so I wouldn't expect such a kernel parameter to be necessary for someone with a single GPU:

- ASUS Sabertooth 990FX

- AMD FX-8150

- GeForce GTX 560 Ti

Anyway, thanks again.

----------

## roarinelk

In the log excerpt you posted initially, both nouveau and the blob are loaded. The IOMMU rightly complains that the second driver cannot do transfers to it, since the device has already been claimed.

It's not a kernel bug.

----------

## Gibbo_07

 *roarinelk wrote:*   

> In the log excerpt you posted initially, both nouveau and the blob are loaded. The IOMMU rightly complains that the second driver cannot do transfers to it, since the device has already been claimed.
>
> It's not a kernel bug.

 

If so, that log might have been taken using one of the nvidia + nouveau (module) kernels. Trust me when I say there was no change when nouveau was completely stripped from both kernel and userland, with a blacklist entry to boot. lspci -k was always used to confirm which driver had the card during testing.
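
The lspci -k check can be scripted. This is only a sketch in Python: it assumes the output layout shown in the first post (an unindented device line followed by indented detail lines), and the function name is hypothetical.

```python
def driver_in_use(lspci_text, slot):
    """Return the 'Kernel driver in use' value for a PCI slot, or None."""
    in_device = False
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new device entry, e.g. "01:00.0 VGA ..."
            in_device = line.startswith(slot)
        elif in_device and "Kernel driver in use:" in line:
            return line.split(":", 1)[1].strip()
    return None


# Output copied from the first post in this thread.
sample = """01:00.0 VGA compatible controller: NVIDIA Corporation GF114 [GeForce GTX 560 Ti] (rev a1)
   Subsystem: Gigabyte Technology Co., Ltd Device 3515
   Kernel driver in use: nvidia
   Kernel modules: nvidia
"""

print(driver_in_use(sample, "01:00.0"))
```

A result of None for the slot would mean no kernel driver is currently bound to the card.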

Besides, it should be perfectly OK to have a kernel set up for both drivers if nouveau is a module. In fact, my current kernel is set up this way right now (since applying the IOMMU workaround), although userland support for nouveau has since been removed. I'll do the kernel next time I change something major that needs a rebuild.

Any time a user needs to apply a very obscure and poorly publicized workaround for very common hardware, there is a bug by nature. Whether it's in the kernel or nvidia-drivers, I'll have to find out.

----------

