# Change SAS driver from module to builtin = unbootable kernel

## Zolcos

I have an Intel C606 SAS controller, the use of which requires a firmware file. Currently I load the driver as a kernel module, and leave the firmware file in the right place on the filesystem and everything works fine.

Pursuant to the advice in the Gentoo Security Handbook I want to try and build this into the kernel so I can turn off module loading.

I switched the driver to be builtin and I enabled the option to include the firmware file in the kernel as well. But trying to boot the resulting kernel fails and gives me this screen:

http://img854.imageshack.us/img854/8799/panicg.jpg

I can enable the option to include the firmware file in the kernel and still get a bootable kernel, but once I switch the driver over I get this issue.

Kernel config: http://pastebin.com/23hPJiXi

Kernel make output: https://gist.github.com/aeheathc/4712265

make reported section mismatches, so the output above is with the option enabled to get more information about those.

----------

## Zolcos

If I do a 'make clean' before deploying the kernel and modules, it fails similarly but I get the more descriptive error "Unable to mount root fs on unknown-block 9,1". But I can see above that it correctly recognizes all the drives up to /dev/sdj.

----------

## Odward

Not sure I'll be much help, but it's a suggestion you can test!

I recently built my Intel C600 SAS controller firmware into the kernel because it was causing a 60 second delay on boot. I don't use an initrd, and the firmware was failing to load (the controller is also my boot/root drive). I was never using isci as a module, though.

I don't know your previous kernel config, but I don't believe you need the option:

```
CONFIG_FIRMWARE_IN_KERNEL=y
```

Especially if you only need firmware built-in for this 1 controller.

I was comparing your relevant kernel options to mine and, seeing that difference, I tried to read up a bit. The part that makes me wonder if it is causing your trouble is the description from the kernel help:

 *Quote:*   

> This single option controls the inclusion of firmware for every driver that uses request_firmware() and ships its firmware in the kernel source tree

 

I think that literally means every driver, so perhaps your errors stem from a different firmware blob in your kernel source.

Also worth mentioning: I actually have an isci folder in my kernel source and wonder if that config would cause it to try to load 2 versions of the firmware?

Bottom line: somebody else should explain the why if this works, but you might try unsetting CONFIG_FIRMWARE_IN_KERNEL=y.

My (hopefully relevant) kernel config:

```
CONFIG_PREVENT_FIRMWARE_BUILD=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE="isci/isci_firmware.bin"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware/"
CONFIG_FIRMWARE_MEMMAP=y
```
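
A minimal sketch for double-checking that kind of setup before rebuilding, assuming the options are available as a text fragment like the one above. It lists any blob named in CONFIG_EXTRA_FIRMWARE that does not exist under CONFIG_EXTRA_FIRMWARE_DIR. The helper names `parse_kconfig` and `missing_firmware` and the `/lib/firmware` fallback are made up for this illustration and are not part of any kernel tooling.

```python
import os

# Sample fragment copied from the config above.
SAMPLE = '''
CONFIG_PREVENT_FIRMWARE_BUILD=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE="isci/isci_firmware.bin"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware/"
CONFIG_FIRMWARE_MEMMAP=y
'''

def parse_kconfig(text):
    """Collect KEY=value lines from a .config fragment, skipping comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        options[key] = value.strip('"')
    return options

def missing_firmware(options, exists=os.path.exists):
    """Return the firmware paths the config names but that cannot be found."""
    # Fallback directory is an assumption for this sketch only.
    fw_dir = options.get("CONFIG_EXTRA_FIRMWARE_DIR", "/lib/firmware")
    # CONFIG_EXTRA_FIRMWARE is a space-separated list of blob names.
    blobs = options.get("CONFIG_EXTRA_FIRMWARE", "").split()
    paths = [os.path.join(fw_dir, blob) for blob in blobs]
    return [path for path in paths if not exists(path)]

if __name__ == "__main__":
    for path in missing_firmware(parse_kconfig(SAMPLE)):
        print("missing firmware blob:", path)
```

An empty result only says the files are present at build time; it does not prove the built-in driver finds them at boot.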

Probably wouldn't hurt to make clean again prior to make if you try this.

My apologies if this is useless  :D

----------

## DaggyStyle

if you are booting from that controller then sure, it won't boot; modules are loaded either in initrd, initramfs or normal boot.

you need to set up one of the former two.

side note, I really fail to see the reason for setting the active controller's driver as a module in gentoo.

----------

## Zolcos

 *Odward wrote:*   

> Not sure I'll be much help, but it's a suggestion you can test!

 

Thanks for the explanation. I disabled CONFIG_FIRMWARE_IN_KERNEL and it didn't change much:

http://img266.imageshack.us/img266/7112/img20130211205057.jpg

Above it, /dev/sdj tells me that it is recognizing all of the drives, and the partition sizes tell me that the detection order of the controllers did not change, so the device names are what the system expects. So it surprises me that the most common explanation for the error I'm seeing is a wrong device name given to grub.

 *DaggyStyle wrote:*   

> if you are booting from that controller then sure, it won't boot; modules are loaded either in initrd, initramfs or normal boot.
> 
> you need to set up one of the former two.

 

I'm not booting from any drives on the C606. These drives are just an array containing my /var. Either way the problem is it is working fine as a module, but not working when builtin :)

I have no initrd/initramfs.

----------

## DaggyStyle

 *Zolcos wrote:*   

>  *Odward wrote:*   Not sure I'll be much help, but it's a suggestion you can test! 
> 
> Thanks for the explanation. I disabled CONFIG_FIRMWARE_IN_KERNEL and it didn't change much:
> 
> http://img266.imageshack.us/img266/7112/img20130211205057.jpg
> ...

 

ahh, my bad, can you post the dmesg of the boot? (use wgetpaste for that and post the link)

----------

## Zolcos

 *DaggyStyle wrote:*   

> ahh, my bad, can you post the dmesg of the boot? (use wgetpaste for that and post the link)

 

I'll need a little more advice for that: the reason I posted the error as a picture is that the system locks up at that point, so I never get to see RC, let alone log in to run dmesg. And since /var is on the controller that is having the problem, nothing will be written to /var/log/dmesg either...

----------

## DaggyStyle

 *Zolcos wrote:*   

>  *DaggyStyle wrote:*   ahh, my bad, can you post the dmesg of the boot? (use wgetpaste for that and post the link) 
> 
> I'll need a little more advice for that -- the reason I posted the error as a picture is because the system locks up at that point, so I never get to see RC let alone login to run dmesg. And since /var is on the controller that is having the problem, nothing will be written to /var/log/dmesg either...

 

this can be solved easily: disable the mount line, then the system should not freeze on boot and you should be able to extract dmesg and see what the issue is.

----------

## Zolcos

 *DaggyStyle wrote:*   

> this can be solved easily, disable the mount line, thus the system should not freeze on boot and you should be able to extract dmesg and see what is the issue.

 

Even after disabling all lines in fstab that mount anything on the c606 (both /var and swap) I get the same problem when trying to boot the new kernel with the builtin driver.

----------

## krinn

Of course it may not be totally true, but without your /var on the SAS drives, your kernel should still boot.

Since your SAS controller was used as a module and, as you explicitly said, it is not your boot drive, think of it like an ethernet card: if you have a problem with your ethernet module, you may end up with a non-working network, but your kernel should still boot.

To explain: since your kernel cannot boot and the SAS controller isn't used for booting, check what you changed in your kernel that affects the controller used to boot, and stop investigating the SAS controller. As I see it, your problem is elsewhere and involves the boot controller.

There could be plenty of reasons why your kernel fails to boot without the SAS drives. The obvious ones would be something needed in /var, or drive names changing so that the kernel gets lost. Another easy one: you're not loading the kernel you think you are loading...

And really do recheck the kernel options you changed. You cannot imagine how many times I've heard "but I didn't change anything except that", only for the user to have a eureka moment later when he realizes he did change something else and forgot about it.

From your screenshot, I would say the sd driver is OK and the controller driver is OK too, since it sees plenty of drives, but the kernel still complains that it is unable to mount root. I would first dig into grub.conf to see which drive is supposed to start the kernel. A really common error is mixing up drive letters: the first controller loaded gets the first drive letters. When your SAS driver is a module, it is always loaded after your boot controller's driver, simply because the boot controller's driver must be built in for you to boot at all. But once the SAS driver is built in, if the kernel initializes it before the boot controller, the SAS driver steals the first drive letters, and root= in grub ends up invalid because it now targets a drive handled by the SAS controller and not by the boot controller.
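
The drive-letter shuffle described above can be sketched as a toy model. It assumes names are handed out strictly in controller probe order, which simplifies what the kernel really does, and the controller labels and disk counts (6 and 4) are illustrative assumptions, as is the helper name `assign_sd_names`.

```python
from string import ascii_lowercase

def assign_sd_names(probe_order, disks_per_controller):
    """Hand out sdX names to each controller's disks in probe order."""
    letters = iter(ascii_lowercase)
    names = {}
    for controller in probe_order:
        count = disks_per_controller[controller]
        names[controller] = [f"sd{next(letters)}" for _ in range(count)]
    return names

# Illustrative disk counts, not taken from real probe output.
disks = {"sata": 6, "sas": 4}

# SAS driver as a module: the built-in SATA driver claims its disks first.
print(assign_sd_names(["sata", "sas"], disks))

# SAS driver built in and probed first: it takes sda..sdd instead.
print(assign_sd_names(["sas", "sata"], disks))
```

In the second case a root= or md= entry that names /dev/sda2 would point at a SAS disk, which is the failure mode to rule out.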

I know you already said "the partition sizes tell me that the detection order of the controllers did not change", but rechecking it really won't kill you, especially because you have a lot of drives.

To help you boot and fix your issue, try https://forums.gentoo.org/viewtopic-p-6863584.html#6863584

It will let you boot in single mode, loading the minimum needed to get a command line you should be able to use to fix everything.

----------

## Zolcos

 *krinn wrote:*   

> And really do recheck the kernel options you changed. You cannot imagine how many times I've heard "but I didn't change anything except that", only for the user to have a eureka moment later when he realizes he did change something else and forgot about it.

 

To test this I made and installed a kernel with the SAS driver completely gone, and the system actually came up, just without the /var.

 *krinn wrote:*   

> To help you boot and fix your issue, try https://forums.gentoo.org/viewtopic-p-6863584.html#6863584
> 
> It will let you boot in single mode, loading the minimum needed to get a command line you should be able to use to fix everything.

 

I added the S to the end of the line in grub, then pressed enter and b to boot it, but the system still froze up with the same error.

Side note: I noticed in RC that the module isci loads after another module called scsi_wait_scan. Maybe the SAS driver has to load after that? But I can't find a way to build scsi_wait_scan into the kernel.

----------

## DaggyStyle

can you lay out the partitions and drives on your system? e.g. partition x is on drive y

I have the feeling that your root is on the sas controller

----------

## Zolcos

 *DaggyStyle wrote:*   

> can you lay out the partitions and drives on your system? e.g. partition x is on drive y
> 
> I have the feeling that your root is on the sas controller

 

SATA controller: sda through sdf

SAS controller: sdg through sdj

Here's my mdadm.conf

```
ARRAY /dev/md0 metadata=0.90 devices=/dev/sd[ab]1
#ARRAY /dev/md1 metadata=0.90 devices=/dev/sd[ab]2
#md1 is mounted by the kernel via definition in grub.conf
ARRAY /dev/md2 metadata=1.2 devices=/dev/sd[cdefghij]1
ARRAY /dev/md3 metadata=1.2 devices=/dev/sd[cdefghij]2
```

and my grub.conf

```
default 0
timeout 10
password --md5 $1$ovOgu0$YMVVoycx4uvqVCblB2Ht/1

title Gentoo Linux Hardened 3.5.4-r1
root (hd0,0)
kernel /boot/kernel-3.5.4-r1-hardened root=/dev/md1 md=1,/dev/sda2,/dev/sdb2
```
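
To make the kernel line easier to read, here is a hedged sketch that pulls the root device and the array definition out of it. It only handles the `md=<number>,dev0,dev1,...` form used above (the form for arrays with persistent superblocks), and `parse_boot_params` is a hypothetical helper name.

```python
def parse_boot_params(cmdline):
    """Extract root= and md=<n>,dev0,dev1,... entries from a kernel command line."""
    root = None
    arrays = {}
    for token in cmdline.split():
        if token.startswith("root="):
            root = token[len("root="):]
        elif token.startswith("md="):
            # Assumes the persistent-superblock form: md=<number>,dev0,dev1,...
            number, *members = token[len("md="):].split(",")
            arrays[f"/dev/md{int(number)}"] = members
    return root, arrays

cmdline = "root=/dev/md1 md=1,/dev/sda2,/dev/sdb2"
root, arrays = parse_boot_params(cmdline)
print("root device:", root)
print("members of root array:", arrays.get(root))
```

The member list is the part that depends on stable drive names: if sda and sdb end up on a different controller, the array named by root= cannot be assembled from those entries.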

and my fstab:

```
/dev/md0        /boot           ext2    noauto                                          0 2
/dev/md1        /               ext4    noatime,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0        0 1
/dev/md2        none            swap    sw                                              0 0
/dev/md3        /mnt/bigraid    ext4    noatime,nodev                                   0 2
```

md3 gets mounted in that place because it contains folders for both var and tmp which are symlinked from /var and /tmp

----------

## DaggyStyle

I think it is a config issue; in the pic it is visible that the kernel tries to mount root from hd(9,1).

hd(9,1) = /dev/sdj2 which is on the missing raid controller.

mdadm.conf configures the sw raids, it doesn't mount them. not sure if it will do, but you must have it defined in mdadm.conf.

are you using initrd or initramfs?

----------

## Zolcos

 *DaggyStyle wrote:*   

> are you using initrd or initramfs?

 

Nope.

btw, I'm using mdev instead of udev to avoid needing init* with my separate /var

 *DaggyStyle wrote:*   

> I think it is a config issue; in the pic it is visible that the kernel tries to mount root from hd(9,1).
> 
> hd(9,1) = /dev/sdj2 which is on the missing raid controller.

 

Ah, I didn't realize that "VFS: unknown-block (9,1)" was referring to a drive in the same way that grub does. But where could those numbers come from? When I install an alternate kernel I give it the same filename and don't change anything in grub.

 *DaggyStyle wrote:*   

> mdadm.conf configures the sw raids, it doesn't mount them. not sure if it will do, but you must have it defined in mdadm.conf

 

I'm not sure I understand. Is something missing in my mdadm.conf? What is the thing that must be defined?

----------

## DaggyStyle

 *Zolcos wrote:*   

> Nope.
> 
> btw, I'm using mdev instead of udev to avoid needing init* with my separate /var

 

I'm using latest udev with /var and /usr on separate partitions without any init* without a problem.

 *Zolcos wrote:*   

> Ah, I didn't realize that "VFS: unknown-block (9,1)" was referring to a drive in the same way that grub does. But where could those numbers come from? When I install an alternate kernel I give it the same filename and don't change anything in grub.

 

not sure, you'll need to investigate more.
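
One possible starting point for that, offered as an assumption rather than a diagnosis: the pair in `unknown-block(9,1)` is normally a Linux (major, minor) device number, not a grub-style drive index. In the kernel's device list, block major 8 is the sd driver (16 minors per disk) and block major 9 is md, so a tiny decoder can show what such a pair would name. `describe_block_device` is a made-up helper, and only these two majors are covered.

```python
# Block majors from the Linux device list; only the two relevant here.
BLOCK_MAJORS = {8: "sd", 9: "md"}

def describe_block_device(major, minor):
    """Return the conventional /dev name for a (major, minor) pair."""
    driver = BLOCK_MAJORS.get(major)
    if driver == "md":
        # md arrays use one minor per array.
        return f"/dev/md{minor}"
    if driver == "sd":
        # Major 8 holds 16 disks with 16 minors each; minor % 16 is the partition.
        disk, partition = divmod(minor, 16)
        name = f"/dev/sd{chr(ord('a') + disk)}"
        return name + (str(partition) if partition else "")
    return f"unknown-block({major},{minor})"

print(describe_block_device(9, 1))
print(describe_block_device(8, 146))
```

Read that way, (9,1) would be /dev/md1, the array named by root= in grub.conf, while /dev/sdj2 would show up as (8,146).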

 *Zolcos wrote:*   

> I'm not sure I understand. Is something missing in my mdadm.conf? What is the thing that must be defined?

 

I'm running a similar config to yours and I don't have issues; all my raids are configured in mdadm.conf, including /

----------

