# btrfs and zfs experiences

## e3k

i went from ext3 -> ext4 -> zfs on the root partition. on /boot i still have ext2. i like the zfs system a lot but lately i do not like those regular txg_syncs. should i try btrfs?

E

----------

## kernelOfTruth

how about that:

```
echo 15 > /sys/module/zfs/parameters/zfs_txg_timeout
```

?

also don't forget to change the vdev_scheduler setting when you're on a desktop:

```
echo cfq > /sys/module/zfs/parameters/zfs_vdev_scheduler
```

or 

```
echo bfq > /sys/module/zfs/parameters/zfs_vdev_scheduler
```

when you encounter latency spikes during heavy i/o, try disabling prefetch temporarily to see whether that makes things better:

```
echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable
```

I'm currently running Btrfs on my system partition and /usr/portage

but wouldn't trust my (valuable) personal data to it (yet), there are simply too many unresolved issues

bugs and problems are fixed constantly and quickly, but it's not really that stable yet ...

edit:

for a set of modifiable options take a look at:

```
for i in /sys/module/zfs/parameters/*; do echo "${i}: $(cat "${i}")"; done
```

----------

## Anon-E-moose

I'm running btrfs on my root partition. 

I got an ssd and had been using reiser(3), but it didn't support trim, so I swapped over.

I think that btrfs is stable, with the exception of new features that have been added lately.

I don't do anything fancy with my setup and it's been as stable as reiser (which I used for years).

haven't seen any slowdowns, undue cpu/memory usage, etc. YMMV.

----------

## e3k

 *kernelOfTruth wrote:*   

> how about that:
> 
> ```
> echo 15 > /sys/module/zfs/parameters/zfs_txg_timeout
> ```
> ...

 

thank you. txg_sync now scratches only every 15 seconds. but that budget fair queueing (bfq) seems to be a better option than noop (fifo).
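to double-check which scheduler is actually in effect, both knobs can be read back (sda is just an example device):

```
cat /sys/module/zfs/parameters/zfs_vdev_scheduler
cat /sys/block/sda/queue/scheduler   # the active scheduler is shown in [brackets]
```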

----------

## Pearlseattle

 *Quote:*   

> I think that btrfs is stable, with the exception of new features that have been added lately. 

 

Yeah, it might be extremely dependent on which "features" are being used - e.g. the base fs might by now have reached high quality, but advanced options like raid5 are most probably still a black hole (no clue about the snapshots).

Snapshotting, the auto-re-balancing raid5(/6) and the balanced performance for small and big files are in my case the most attractive features of btrfs.

On my side I have been using for a long time:

1)

nilfs2 on SSDs...

...and I still love it.

Those continuous kind-of-time-based-interval-snapshots are just mind-blowing: apart from the fact that nothing ever got corrupted when my notebooks suddenly shut down (a complicated psychological constellation of I-am-too-stupid-to-remember-to-plug-in-the-power-cord + I-dont-want-to-ever-see-any-pop-up-message-nor-have-any-automatic-shutdown-procedure), being able to go back in time for any file on the fs is just plain fantastic.

2)

ext4 on HDDs...

...and it's not exciting at all, but after following the path ext3->jfs->xfs->btrfs->xfs->ext4->xfs->ext4 I am now definitely back and stable on ext4 for my RAID5s (mdadm) and normal partitions. Nothing was faster than ext4 with small files (xfs was initially always good but deteriorated extremely with rewrites/deletions/additions - maybe it's better now) and with big files the ~450MB/s I get from the RAID5 is more than enough.

If btrfs were working 100% I would definitely use it for #2 (thereby getting rid of the mdadm layer + getting a kind-of-lvm resizing functionality in the package), but for #1 I wouldn't see any reason to turn away from nilfs2.

Btw., does btrfs now have a fsck that works for real?
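For reference, the offline checker is nowadays invoked as btrfs check (formerly btrfsck); a minimal sketch, assuming the filesystem on /dev/sdb1 is unmounted (device name is made up):

```
# read-only check first; --repair can make things worse and is a last resort
btrfs check /dev/sdb1
# only if the read-only pass reports repairable errors:
# btrfs check --repair /dev/sdb1
```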

----------

## vaxbrat

I've been using it now for 2-3 years and had even been playing a bit with raid 5 arrays.  However I've since broken them up when setting up my ceph cluster.  All of my btrfs arrays are now individual drives running ceph OSD stores on top of btrfs.  As for snapshots.... well let me tell you about how hard ceph uses snapshots.

```
$ ceph -s

    cluster 1798897a-f0c9-422d-86b3-d4933a12c7ac

     health HEALTH_OK

     monmap e6: 5 mons at {0=192.168.2.1:6789/0,1=192.168.2.2:6789/0,3=192.168.2.4:6789/0,4=192.168.2.5:6789/0,5=192.168.2.6:6789/0}, election epoch 3462, quorum 0,1,2,3,4 0,1,3,4,5

     mdsmap e470: 1/1/1 up {0=3=up:active}, 1 up:standby

     osdmap e5820: 12 osds: 12 up, 12 in

      pgmap v1110852: 384 pgs, 3 pools, 4989 GB data, 5805 kobjects

            9887 GB used, 34776 GB / 44712 GB avail

                 384 active+clean

```

Those 12 OSDs used to be four btrfs arrays on four hosts.  See that version number 1110852 for the placement group map?  That's the number of btrfs snapshots that have been taken since I built the cluster.  Based on this "ceph -w" status monitoring:

```
2014-09-05 22:08:34.759840 mon.0 [INF] pgmap v1110862: 384 pgs: 383 active+clean, 1 active+clean+scrubbing; 4989 GB data, 9887 GB used, 34776 GB / 44712 GB avail; 154 kB/s wr, 39 op/s

2014-09-05 22:08:58.594157 mon.0 [INF] pgmap v1110863: 384 pgs: 383 active+clean, 1 active+clean+scrubbing; 4989 GB data, 9887 GB used, 34776 GB / 44712 GB avail

2014-09-05 22:09:00.704030 mon.0 [INF] pgmap v1110864: 384 pgs: 383 active+clean, 1 active+clean+scrubbing; 4989 GB data, 9887 GB used, 34776 GB / 44712 GB avail

2014-09-05 22:09:13.830688 mon.0 [INF] pgmap v1110865: 384 pgs: 383 active+clean, 1 active+clean+scrubbing; 4989 GB data, 9887 GB used, 34776 GB / 44712 GB avail

2014-09-05 22:09:34.212140 mon.0 [INF] pgmap v1110866: 384 pgs: 384 active+clean; 4989 GB data, 9887 GB used, 34776 GB / 44712 GB avail

2014-09-05 22:09:33.620806 osd.6 [INF] 0.3c scrub ok
```

Each of my osds does a btrfs snapshot create and a delete (it keeps a current and two previous) every few seconds as I/O transactions are done and committed.  I also have the osd journals as regular files on the ssd drives that I use for my root filesystems, and a majority of those are now btrfs.

I've only had problems with btrfs when the hardware has been bad (flaky memory, mobo or hard drive).  I've had good luck with btrfs fsck when it was the hard drives giving me grief.
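For anyone unfamiliar, the create/delete cycle described above boils down to two subvolume commands; a rough sketch with made-up paths:

```
# take a read-only snapshot of the current state, then drop the oldest one
btrfs subvolume snapshot -r /data/current /data/snap.new
btrfs subvolume delete /data/snap.oldest
```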

----------

## HeissFuss

btrfs is pretty stable now, as long as you're using a single drive.  Spanning multiple drives is still flaky.

----------

## e3k

 *kernelOfTruth wrote:*   

> how about that:
> 
> ```
> echo 15 > /sys/module/zfs/parameters/zfs_txg_timeout
> ```
> ...

 

i might try the cfq scheduler now, as i get a slower response during emerge updates. but anyway, bfq helped me get better desktop responsiveness.

----

i was trying to switch to cfq and i figured out that my zfs is back on noop. any ideas why? i am using an initramfs, that's my only idea of what could have happened.

----------

## kernelOfTruth

there are i/o scheduler settings for the block devices from linux and from ZFS/spl:

```
for i in /sys/block/sd*; do

         /bin/echo "bfq" >  $i/queue/scheduler

done
```

```
echo bfq > /sys/module/zfs/parameters/zfs_vdev_scheduler
```

every time after I've imported a new zpool or added new devices (e.g. an external USB enclosure), I run these commands via a script to make sure that BFQ is running instead of e.g. deadline, cfq or noop

replace bfq with cfq in your case ...

no idea why it would reset itself for you ...
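one way to run such a script on Gentoo is a local.d hook, which fires late in boot after root (even a ZFS root) is already mounted; path and filename below are assumptions, not my actual setup:

```
#!/bin/sh
# /etc/local.d/ioscheduler.start -- example only
for i in /sys/block/sd*; do
         /bin/echo "bfq" > "${i}/queue/scheduler"
done
echo bfq > /sys/module/zfs/parameters/zfs_vdev_scheduler
```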

----------

## e3k

 *kernelOfTruth wrote:*   

> there are i/o scheduler settings for the block devices from linux and from ZFS/spl:
> 
> ```
> for i in /sys/block/sd*; do
> 
> ...

 

no, i am not sure if bfq was even set at that time. /sys/module/zfs/parameters/zfs_vdev_scheduler is set to noop and /sys/block/sda/queue/scheduler is set to cfq. i do not know which one takes precedence. by the way, i can set zfs_vdev_scheduler with echo, but setting it via vi fails with an fsync error. and setting /sys/block/.../scheduler fails with both echo and vi with an fsync error.

where do you put that script to set the values? i am running zfs on / and i am not quite sure where to put it..

----------

