# [Howto] bcache

## GamezR2EZ

Updated: 2013/11/04

bcache is now included in the mainline kernel starting with 3.10! I figured it was time to write up a quick guide on how to use it, since information about it is still sparse and confusing on the web.

Assumptions:

You already know what bcache does (if not, see here)

You already know how to setup and use an initramfs (if not, see here)

You already know how to configure a kernel (if not, see here)

You know there are alternatives such as dm-cache and flashcache, but are choosing bcache (currently, those are better documented and more widely used).

You will read this whole guide before you follow it.

Answers to basic questions:

Can I convert an existing partition or disk with data on it for use with bcache?

Yes. But you have to convert the existing layout. This can be done in place with a conversion tool here: https://github.com/g2p/blocks

This is not explicitly covered in this guide. But essentially you will just need to convert the disk and then attach a cache to it.

The reason a conversion is needed is that bcache applies its own metadata (a superblock) at the start of the volume, and bcache needs to see all data being written to the device. If you had a bcache partition on /dev/sda2, for example, and you mounted and wrote to /dev/sda2 directly, without going through bcache (from a livecd, for example), bcache would not know the device had been written to, so it would not know to invalidate the cache; everything in the cache would have to be invalidated or you would risk corruption. Because the superblock sits at the start of the device, existing data has to be shifted to make room for it, and that shifting is what the conversion does.
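As a rough sketch only (this is not covered in detail here; check the blocks README for the exact, current invocation and package name), the conversion might look like:

```
# install the conversion tool (distributed as a Python package)
pip install --user blocks
# shift the existing filesystem to make room for the bcache superblock
blocks to-bcache /dev/sdb2
# then register the converted device and attach a cache as described in this guide
```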

Is it stable?

According to the author's site, yes. It has also been accepted into the mainline Linux kernel. I've been using it on my desktop for about a year now with no stability or data integrity issues.

Getting started

First you must get a kernel that supports bcache. You can either download a >=3.10 kernel or pull the bcache development kernel from git. I do not know how long the bcache git tree will stay up now that 3.10 is out, so I suggest getting >=3.10 from kernel.org. You must also get bcache-tools. Guess what? The future is now: you can simply `emerge bcache-tools`. The git repos below carry the latest tools should you need them.

kernel.org || http://evilpiepirate.org/git/linux-bcache.git

http://evilpiepirate.org/git/bcache-tools.git

In the kernel, enable bcache:

```
-> Device Drivers

   -> Multiple devices driver support (RAID and LVM)

       <*>   Block device as cache
```

Install bcache-tools

```
# emerge bcache-tools
```

Formatting the cache and backing devices

For the purpose of this howto, our setup is an SSD at /dev/sda and an HDD at /dev/sdb. /dev/sda has no partitions. /dev/sdb1 is a 100 MB partition for /boot. /dev/sdb2 is the rest of the disk.

Format your cache and backing devices. WARNING: this is destructive to the data on the volume.

```
make-bcache --cache /dev/sda

make-bcache --bdev /dev/sdb2
```

At this point, you should have a new device called /dev/bcache0. You can use this for whatever you need; this is your new /dev/sdb2. Format it, use it for LVM, dm-crypt, whatever you need it for. In this case, we are going to use /dev/bcache0 as our new root partition. One small thing remains: we have to attach our backing device to our cache device. If you have multiple backing devices, you can attach them all to a single cache.

First we need to find our caching device's UUID (if you have more than one caching device, you will see multiple UUIDs here):

```
# ls -1 /sys/fs/bcache

xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx

register

register_quiet
```

Now we take that UUID and attach our backing device to it

```
# echo xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx > /sys/block/bcache0/bcache/attach
```

Done. Now the volume is cached. This concludes setting up the volume.
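To sanity-check the attach, the device's sysfs nodes can be inspected (a sketch using the device names from this guide):

```
# "clean" or "dirty" means the device is attached; "no cache" means it is not
cat /sys/block/bcache0/bcache/state
# the attached cache set appears as a symlink
ls -l /sys/block/bcache0/bcache/cache
```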

Basic init script for your initramfs

If you have your root partition cached, you will have to use an initramfs. I can't speak for autogenerated initramfs tools; I have never used them. bcache-tools mentions shipping a few hooks for them, but again I have no experience with them.

I have included a very basic init script here, with no error handling, to show the needed code.

```
#!/bin/busybox sh

mount -t proc none /proc

mount -t sysfs none /sys

mount -t devtmpfs none /dev

sleep 3

#We need to register both our caching and backing devices that have our root partition

echo /dev/sda > /sys/fs/bcache/register_quiet

echo /dev/sdb2 > /sys/fs/bcache/register_quiet

#You could also use a line like the one below if you have many devices, changing device names, or just like one-liners. bcache will only register devices on which it finds its metadata

#for i in $(ls /dev/sd*); do echo $i > /sys/fs/bcache/register_quiet; done

#Now that we have registered our bcache backing device AND cache, it will have created /dev/bcache0 for us. Mount our root filesystem now

mount -o ro /dev/bcache0 /mnt/root

#unmount pseudo FS

umount /dev

umount /sys

umount /proc

#root switch

exec /bin/busybox switch_root /mnt/root /sbin/init
```

That is it! You should be booting. You are done!

Tips, Tricks, and Extra Info

1.

You can use bcache for individual LVM volumes. You can drop bcache anywhere you want, in between anything! My setup is `raw device(s) > mdadm > bcache > dm-crypt > lvm`. You can move bcache around anywhere in that stack. Be careful if you put it after dm-crypt, as it will then cache unencrypted content. The way bcache works means it may not cache anything useful anyway, but why take the chance?
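As a sketch of one such stack (hypothetical device names; each layer simply consumes the block device the previous one produced):

```
# raw devices -> mdadm
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
# mdadm -> bcache: the array becomes the backing device
make-bcache --bdev /dev/md0
# bcache -> dm-crypt
cryptsetup luksFormat /dev/bcache0
cryptsetup luksOpen /dev/bcache0 cryptroot
# dm-crypt -> lvm
pvcreate /dev/mapper/cryptroot
```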

2.

If trying to create an LVM physical volume on a /dev/bcache* device, you must update your /etc/lvm/lvm.conf to include the following:

```
devices{

....[snip]....

types = [ "bcache", 16 ]

....[snip]....

}
```

If you don't, LVM reports back with "Failed to read physical volume". You also need to include that modified lvm.conf in your initramfs if you have root on an lvm partition that sits directly on a /dev/bcache* device.

3.

WARNING: be careful with this step if you have write caching enabled. It is potentially corrupting, because the SSD may hold data that has not yet been flushed to the disk. Personally, I have successfully replaced a cache device with write caching enabled, but all of the data had already been flushed back to the disk and fsck took care of the rest. If you remove the device after a clean shutdown, you should be safe. I recommend disabling write caching before you remove the device.
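A cautious removal sequence, sketched with the sysfs paths used elsewhere in this guide, would be to switch to writethrough, let the dirty data drain, and only then detach:

```
# stop new dirty data from accumulating
echo writethrough > /sys/block/bcache0/bcache/cache_mode
# watch this drain to zero before proceeding (the value is human-readable)
cat /sys/block/bcache0/bcache/dirty_data
# detach the backing device from its cache set
echo 1 > /sys/block/bcache0/bcache/detach
```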

If your caching device no longer exists for some reason (damaged, removed, etc.), you will not see your /dev/bcache* device. You will have to issue the following command after you register the backing device with bcache:

```
# echo 1 > /sys/block/sdb/sdb2/bcache/running
```

This is needed if your SSD crashed or was removed and your init drops to a rescue shell; this way you can still boot up. You can reattach the backing device to another cache later.

4.

bcache defaults to 4k sectors: any bcache device, and anything stacked on top of it, will present a 4k sector size. If you need a different sector size (and your backing device isn't natively 4k), you have to format your devices with a different block size up front (this cannot be changed after the fact).

```
# make-bcache --block 512 --bdev /dev/sdb2

# make-bcache --block 512 --cache /dev/sda
```

I am not sure whether the backing device and cache device must share the same block size. Intuitively I would say they do, but I have not verified it.

A very specific example of why you would need this: if you plan on using a bcached device (with or without LVM on top) to back a Windows VM under Xen with James Harper's GPLPV drivers, you will need the 512-byte format. The GPLPV drivers crash with a 4k sector size. Alternatively, you can skip installing the xenvbd portion of the drivers.
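You can check what sector sizes your hardware actually reports before choosing (standard util-linux commands, nothing bcache-specific):

```
# logical and physical sector sizes as seen by the kernel
blockdev --getss /dev/sdb
blockdev --getpbsz /dev/sdb
```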

5.

If you format your caching and backing devices at the same time, you will not have to go through the separate step of attaching the backing device to the cache.

```
make-bcache --bdev /dev/sdb2 --cache /dev/sda
```

6.

Everything you need to control bcache is under sysfs: /sys/fs/bcache/ for the cache set and /sys/block/bcache*/bcache/ for each device. Simple cat and echo commands get information and change settings, and all of your cache statistics live there as well.
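For example (paths from this guide's device names; stat file names as found under stats_total):

```
# hit/miss counters and ratio since creation
cat /sys/block/bcache0/bcache/stats_total/cache_hits
cat /sys/block/bcache0/bcache/stats_total/cache_misses
cat /sys/block/bcache0/bcache/stats_total/cache_hit_ratio
# available cache modes; the active one is shown in brackets
cat /sys/block/bcache0/bcache/cache_mode
```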

7.

Writeback mode. By default, bcache runs in writethrough mode, which is safest: writes complete only once they hit the backing device. For added write performance at added risk, you can run:

```
# echo writeback > /sys/block/bcache0/bcache/cache_mode
```

I have been running in writeback mode the entire time with no corruption issues. Still, make good backups; expect everything to fail miserably.
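If you do run writeback, two related knobs sit next to cache_mode (the values here are illustrative, not recommendations):

```
# percentage of the cache allowed to hold dirty data
echo 10 > /sys/block/bcache0/bcache/writeback_percent
# bcache bypasses the cache for large sequential IO; 0 disables the cutoff
echo 4M > /sys/block/bcache0/bcache/sequential_cutoff
```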

8.

If you plan to set up your root on a bcache device, please, please, please make a livecd or secondary recovery boot with a bcache-enabled kernel. You will not be able to access these partitions without a bcache kernel, for reasons already discussed. It makes recovery so much simpler, which is great, because if you are trying to recover things you are probably already stressed out and more inclined to make errors.

UPDATE: The standard Gentoo ISO you would use to install a fresh copy of Gentoo now includes a bcache-enabled kernel, so you can do maintenance with that. I imagine that as time rolls on it will be enabled in most liveCD environments.

Last edited by GamezR2EZ on Tue Nov 05, 2013 2:55 am; edited 8 times in total

----------

## user

Welcome GamezR2EZ,

and thanks for your effort.

----------

## cal22cal

I am confused. 

What should I do if the SSD damaged & replaced a new one, since can't convert a 

partition/disk to use bcache ?

----------

## GamezR2EZ

 *cal22cal wrote:*   

> I am confused. 
> 
> What should I do if the SSD damaged & replaced a new one, since can't convert a 
> 
> partition/disk to use bcache ?

 

Changed the wording. You cannot convert an existing partition to use bcache without data loss.

----------

## cal22cal

GamezR2EZ,

Thanks for your explanation.  :Wink: 

----------

## fonic

This is an excellent how-to, thank you for your effort.

I ran into a major problem when trying to implement bcache. I set up bcache according to the how-to without changing any options. /dev/bcache0 comes up and everything seems to work. However, during boot, udev will hang at "waiting for uevents to be processed" for quite a while and after that, X fails to start. I was able to reproduce that problem every single time.

When /lib/udev/rules.d/61-bcache.rules is deleted, the hang won't occur. I tracked down the problem to /dev/bcache0 - whenever this device is present during boot, no matter if created by 61-bcache.rules or via initramfs, udev will hang.

Any suggestions how to fix this?

EDIT: neither dmesg nor /var/log/messages nor /run/udevmonitor.log etc. show any signs of an udev related error, everything appears to be normal

----------

## VoVaN

 *fonic wrote:*   

> This is an excellent how-to, thank you for your effort.
> 
> I ran into a major problem when trying to implement bcache. I set up bcache according to the how-to without changing any options. /dev/bcache0 comes up and everything seems to work. However, during boot, udev will hang at "waiting for uevents to be processed" for quite a while and after that, X fails to start. I was able to reproduce that problem every single time.
> 
> When /lib/udev/rules.d/61-bcache.rules is deleted, the hang won't occur. I tracked down the problem to /dev/bcache0 - whenever this device is present during boot, no matter if created by 61-bcache.rules or via initramfs, udev will hang.
> ...

 

I'm having the same issue and don't have any clue why...?

----------

## VoVaN

 *fonic wrote:*   

> This is an excellent how-to, thank you for your effort.
> 
> I ran into a major problem when trying to implement bcache. I set up bcache according to the how-to without changing any options. /dev/bcache0 comes up and everything seems to work. However, during boot, udev will hang at "waiting for uevents to be processed" for quite a while and after that, X fails to start. I was able to reproduce that problem every single time.
> 
> When /lib/udev/rules.d/61-bcache.rules is deleted, the hang won't occur. I tracked down the problem to /dev/bcache0 - whenever this device is present during boot, no matter if created by 61-bcache.rules or via initramfs, udev will hang.
> ...

 

I've done some more testing with bcache. I'm not using openrc but systemd. The system actually works perfectly fine... if I disable the display manager and blacklist the nvidia (proprietary) module. BTW, do you use nvidia as well? With the DM disabled and nvidia blacklisted, after the system has booted I can load the nvidia module, start the DM (lightdm) and log in successfully. So far so good... but I can't log out of the graphical session; the system just hangs. Well, not entirely: I can still access the system over ssh, but rebooting the normal way hangs forever. This is all VERY strange. Some more details below.

If nvidia module isn't blacklisted I see the following message in the log:

```

Jul 11 21:38:57 *** kernel: NVRM: Your system is not currently configured to drive a VGA console

Jul 11 21:38:57 *** kernel: NVRM: on the primary VGA device. The NVIDIA Linux graphics driver

Jul 11 21:38:57 *** kernel: NVRM: requires the use of a text-mode VGA console. Use of other console

Jul 11 21:38:57 *** kernel: NVRM: drivers including, but not limited to, vesafb, may result in

Jul 11 21:38:57 *** kernel: NVRM: corruption and stability problems, and is not supported.

```

What I see after logout (when the system hangs) is that the X process is waiting for disk (status Ds+).

I'm not an expert, hopefully somebody have a kind of explanation of this strange chain: nvidia, Xorg and bcache.

----------

## GamezR2EZ

 *Quote:*   

> EDIT: neither dmesg nor /var/log/messages nor /run/udevmonitor.log etc. show any signs of an udev related error, everything appears to be normal

 

You can get more info on the problem by enabling logging in the udev.conf and rc.conf

I don't think it is on by default.
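For reference, the knobs in question are along these lines (assuming the usual file locations; exact option names may vary by version):

```
# /etc/udev/udev.conf
udev_log="debug"

# /etc/rc.conf (OpenRC)
rc_logger="YES"
```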

 *Quote:*   

> I'm not an expert, hopefully somebody have a kind of explanation of this strange chain: nvidia, Xorg and bcache.

 

I don't use X with my current setup. I did use X while testing the 3.10 kernel and bcache, but I was using nouveau. I did not run into any issue with X or rebooting.

Do you get the same results when using bcache, X, and the nouveau driver instead?

----------

## VoVaN

 *GamezR2EZ wrote:*   

>  *Quote:*   EDIT: neither dmesg nor /var/log/messages nor /run/udevmonitor.log etc. show any signs of an udev related error, everything appears to be normal 
> 
> You can get more info on the problem by enabling logging in the udev.conf and rc.conf
> 
> I don't think it is on by default.
> ...

 

Without nvidia proprietary drivers I don't have any problems with bcache...

----------

## Zacariaz

Hey there and thanks for the effort.

I'm running Arch, and while I don't expect support, I do have an issue/question that you might be able to help with.

Basically I do exactly what you describe, except for one small detail.

Instead of using the entire SSD as cache, I use only part of it, as that's really my only option.

As a result, I would assume, /dev/bcache0 does not show up.

So the question is of course: is it absolutely necessary to employ the entire disk? If so, bcache is not the solution for me, but I haven't been able to find a satisfying answer.

Hope you can help, and best regards.

edit:

This question is no longer relevant for me, as I've found a way to have the EFI System Partition on sdb rather than sda, simply by using gummiboot instead of efibootmgr. (Why on earth that makes a difference is a mystery to me, though.)

I still can't get bcache figured out, but that question I'll better take somewhere else.

----------

## GamezR2EZ

 *Zacariaz wrote:*   

> Is it absolutely necessary to employ the entire disk?

 

No. You can use a partition for either the backing or cache.

I currently have my cache volume on /dev/sda2 with /dev/sda1 as scratch space for another project.

Obviously, though, the cache will be slower whenever the other partition on the same device is being accessed.

----------

## Zacariaz

 *GamezR2EZ wrote:*   

>  *Zacariaz wrote:*   Is it absolutely necessary to employ the entire disk? 
> 
> No. You can use a partition for either the backing or cache.
> 
> I currently have my cache volume on /dev/sda2 with /dev/sda1 as scratch space for another project.
> ...

 

Thank you for the answer. I still hadn't found an answer to that.

In any case, I've given up on the project. I did everything in my power to get it to work, but without luck. In the end I gave up and paid an arm and a leg for a sizable mSATA SSD to replace the one I already had, so no need for bcache any longer.

I would have enjoyed getting it to work though...

Best regards.

----------

## RBee

Hi,

I'm new here (not much of an excuse).

Anyway, THANKS for this thread, I found it interesting and probably SHOULD understand it all.

My question is whether there is (or will be) a Debian package for bcache?

I run Ubuntu Studio, currently 13.10 beta 2, which has the 3.11.0-2-lowlatency #1-Ubuntu kernel.

I would have thought that bcache should be in it by now, but /sys/fs shows nothing about bcache.

Has anyone got bcache working on a Debian system yet?

Does this Howto apply to Debian systems?

----------

## cal22cal

hi, guys

Sorry for my poor english, though

Reporting back so late, since my PC died.

I borrowed a notebook and simulated the bcache installation for my new PC.

1. Using qemu-kvm install from install-amd64-minimal-20130816.iso

2. Mount iso as ide disk.

3. Using virtIO disk for the base system as /dev/vda1 and /dev/vda2.

    Without changing make.conf, to save compile time

4. Build kernel with Intel sata driver, lvm, raid1, raid5, bcache in kernel

5. Compiled the all the followings as static

    sys-fs/mdadm static

    sys-apps/busybox static

    =virtual/udev-200 static-libs

    =sys-fs/udev-204 static-libs

6. Reboot and make sure the installation under /dev/vda is ok.

7. Emulate a SSD by a ramdisk as raw image in kvm as /dev/sda

8. Emulate raid5 by 3 raw images /dev/sdb /dev/sdc /dev/sdd

9. Do the sata -> raid5 -> bcache -> lvm installation as usual. 

    As Gentoo Linux x86 with Software Raid and LVM2 Quick Install Guide

    md1 as boot with 0.9 metadata, md4 as default

10. Copy over the /dev/vda1 to md1 as boot

11. Copy over the /dev/vda2 file system to lvm /dev/mapper/vg-lvol1 (in my case)

So the VM has vda as a rescue system; the raid5 one is used for testing.

Both of them can access the bcache raid5.

Reboot the system to vda and got the "waiting for uevents to be processed" forever problem  :Sad: 

Using this for reference: http://gentoo-en.vfose.ru/wiki/Initramfs

```
#!/bin/busybox sh

rescue_shell() {
    echo "Something went wrong. Dropping you to a shell."
    busybox --install -s
    exec /bin/sh
}

echo "Starting init!"

mount -t proc none /proc
mount -t sysfs none /sys
mount -t devtmpfs none /dev

# command substitution, not single quotes, to capture the kernel command line
CMDLINE=$(cat /proc/cmdline)

#Reassemble the arrays
ls -l /dev/sd*
/bin/mdadm --assemble /dev/md1 /dev/sdb1 /dev/sdc1 /dev/sdd1
/bin/mdadm --assemble --scan /dev/md4 --config=/etc/mdadm.conf
/bin/mdadm -D /dev/md1
/bin/mdadm -D /dev/md4
ls -l /dev/md*
ls -l /sys/fs/bcache/

#We need to register both our caching and backing devices that have our root partition
echo /dev/sda > /sys/fs/bcache/register_quiet
echo /dev/md4 > /sys/fs/bcache/register_quiet

/bin/vgscan --mknodes
/bin/vgchange -aly
ls -l /dev/vg
ls -l /dev/mapper/vg-lvol1

mount -t ext4 -o ro /dev/vda2 /newroot || rescue_shell

#unmount pseudo FS
umount /dev
umount /sys
umount /proc

#root switch
exec switch_root /newroot /sbin/init ${CMDLINE}

# build the initramfs with:
#   find . -print0 | cpio --null -ov --format=newc | gzip -9 > /boot/initramfs
```

Build the initramfs; for the raid5 installation, change /dev/vda2 to /dev/mapper/vg-lvol1.

Boot up succeeded for both the vda and raid5 installations  :Wink: 

----------

## G2P

Hello,

I'd like to add a few things to this tutorial since I've done some work on the bcache userspace.

The info about the bcache superblock is accurate, except for the fact that it doesn't prevent in-place conversion. Conversion tool here: https://github.com/g2p/blocks

make-bcache should detect the hardware's minimum io size correctly, although the gentoo ebuild currently predates that.

The advice about force-starting bcache should probably come with a warning that this is dangerous if the cache was writeback-enabled.

Finally, we ship udev rules and udev-based initramfs hooks, ideally Gentoo would reuse that. There are three examples here: https://github.com/g2p/bcache-tools

----------

## G2P

 *VoVaN wrote:*   

>  *GamezR2EZ wrote:*    *Quote:*   I'm not an expert, hopefully somebody have a kind of explanation of this strange chain: nvidia, Xorg and bcache. 
> 
> I don't use X with my current setup. I did use X while testing the 3.10 kernel and bcache, but I was using nouveau. I did not run into any issue with X or rebooting.
> 
> Do you get the same results when using bcache, X, and the nouveau driver instead? 
> ...

 

I've just read that the nvidia blob tries to flush the system workqueue. However, if you can reproduce it with 3.x.y -stable kernels, it still points to a bug in bcache. Workqueue tasks should be bounded time and relatively quick.

----------

## GamezR2EZ

 *RBee wrote:*   

> Anyone got bcache working on a Debian system yet ?
> 
> Does this Howto apply to Debian systems ?

 

bcache needs to be enabled in the kernel. It isn't just a package to install.
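A quick way to check whether a given kernel was built with bcache support (the config path shown is the usual Debian/Ubuntu location, so adjust as needed):

```shell
# returns success if the given kernel config enables bcache (built-in or module)
has_bcache() {
    grep -Eq '^CONFIG_BCACHE=(y|m)' "$1"
}

# typical use on a Debian/Ubuntu system:
#   has_bcache /boot/config-$(uname -r) && echo "bcache available"
# a populated /sys/fs/bcache is another sign the driver is loaded
```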

 *G2P wrote:*   

> Hello,
> 
> I'd like to add a few things to this tutorial since I've done some work on the bcache userspace.
> 
> The info about the bcache superblock is accurate, except for the fact that it doesn't prevent in-place conversion. Conversion tool here: https://github.com/g2p/blocks
> ...

 

Updated the guide. Thanks for the info on superblock conversion; that is useful to me, as I was just about to set up bcache for an existing system.

----------

## Massimo B.

 *GamezR2EZ wrote:*   

> My setup is `raw device(s) > mdadm > bcache > dm-crypt > lvm`. You can move bcache around anywhere in that stack. Be careful if you put it after dm-crypt as it will cache unencrypted content. The way bcache works means it may not cache anything useful, but why take the chance?

 Can bcache do useful caching when used behind encrypted containers? Like raw > bcache > LUKS > btrfs?

I thought about adding bcache to my btrfs-on-LUKS, but I don't think genkernel can create a bcache-supporting initramfs.

Do you have some comparison to lvm-cache? Since genkernel already supports --lvm, its initramfs should already support lvm-cache? And to mention some more solutions, there are also dm-cache (isn't that the same as lvm-cache, using the device mapper?), EnhanceIO and Flashcache.

----------

## nokilli

 *Massimo B. wrote:*   

>  *GamezR2EZ wrote:*   My setup is `raw device(s) > mdadm > bcache > dm-crypt > lvm`. You can move bcache around anywhere in that stack. Be careful if you put it after dm-crypt as it will cache unencrypted content. The way bcache works means it may not cache anything useful, but why take the chance? Can bcache do useful caching when using behind encrypted containers? Like raw > bcache > LUKS > btrfs?
> 
> I thought about adding bcache to my btrfs-on-LUKS. But I don't think genkernel can create bcache supporting initramfs.
> 
> Do you have some comparison to lvm-cache? As genkernel already supports --lvm, the initramfs actually should already support lvm-cache? And to mention some more solutions there is also Dm-cache (isn't that the same like lvm-cache, using device mapper?), EnhanceIO and Flashcache.

 

I've been running raw > bcache > lvm > dm-crypt > ext4 in writethrough mode now for over a month with nothing but a smile on my face.

I think the way to think about it is to treat bcache as a simple block device. What happens on top doesn't matter, as long as it just ends up asking for blocks.

----------

## szatox

 *Quote:*   

> And to mention some more solutions there is also Dm-cache (isn't that the same like lvm-cache, using device mapper?)

 It certainly is, since LVM is actually an alternative (higher level) interface to device mapper. E.g. you can inspect its in-memory mapping table with dmsetup.

AFAIR bcache and dmcache support slightly different modes (though they both allow write-back and write-through) and they can be modified in different ways (AFAIR you have to destroy and recreate dmcache to change its mode, but you can do it on an active device, so it doesn't hurt much). There are no dealbreakers in their capabilities, so I'd go for LVM version if I were you, for sake of convenience alone. (One can reconsider it later, if it turns out not good enough)
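For reference, a minimal lvm-cache setup sketch (the VG/LV names "vg0"/"slowlv" and the SSD at /dev/sda are hypothetical; requires a reasonably recent lvm2):

```
# add the SSD to the existing volume group
pvcreate /dev/sda
vgextend vg0 /dev/sda
# create a cache pool on the SSD and attach it to the slow LV
lvcreate --type cache-pool -L 100G -n cpool vg0 /dev/sda
lvconvert --type cache --cachepool vg0/cpool vg0/slowlv
```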

----------

## Massimo B.

AFAIK bcache has only a small developer base, if not a one-man show, but it is optimized for SSD-based cache devices, so I tend to use it, as I don't have any further plans with LVM. I need to wipe both block devices (the 750GB HDD and the 128GB SSD) anyway. But I hope I can just dd-clone my current /dev/mapper/root with all the LUKS and btrfs inside, and later write that back to the new /dev/bcache device. Will the geometry of the 750GB HDD be equal to that of the bcache device with an additional 128GB caching device? I remember that bcache has some header, so the resulting device would be slightly smaller?

----------

