# ext4 safety options

## svoop

Hi

I have to admit that I'm not a filesystem guru at all and following the recent safety discussions for ext4 is quickly going beyond my scope. Last night a power outage while my rsnapshot backup script was running caused severe damage on the ext4 target; I'm on the fourth fsck round now and most likely will have to wipe the thing.

It would be great if someone with real ext insight could add a "Safety" section to http://en.gentoo-wiki.com/wiki/Ext4 which mentions the safest ext4 options for every stable kernel starting with 2.6.30-r4. By safest I mean a set of options that tries to be as crash resistant as possible even if it has some performance drawback, similar to good old data=ordered on ext3. Maybe this thread is a good place to discuss what these options could be for the current 2.6.30-r4 kernel.

Cheers, -sven

----------

## shazeal

The best way to stop corruption from power loss with any file system (because they are all vulnerable) is to invest in a UPS. Changing options won't really make anything more secure if you are really concerned about data integrity.

----------

## Cyker

My fstab options for ext4 are noatime,nosuid,commit=5,data=journal,journal_checksum,barrier=1

I think this is pretty safe, certainly more so than the defaults.

----------

## svoop

 *Cyker wrote:*   

> noatime,nosuid,commit=5,data=journal,journal_checksum,barrier=1

 

Does data=journal really make things less error prone on crashes - and is there a big performance hit? (Note: commit=5 and barrier=1 are defaults.)

As for the UPS: You're right, but I'm phasing out the server anyway, so I won't invest. Furthermore, I've had more problems in the past with crashes caused by a DVB-T card and poor signal quality from the building antenna, which caused the box to freeze and the drives to require a check.

----------

## Cyker

data=journal journals everything, not just metadata, and disables delayed allocation, so it should be the safest.

I personally haven't noticed any performance drop, but I am using a RAID array  :Smile: 

And supposedly, data=journal is *faster* than the other two when you're reading and writing at the same time!!

----------

## svoop

 *Cyker wrote:*   

> data=journal

 

I've replaced data=ordered with data=journal for my backup drive and I can mount it without any problem. Next I changed it in fstab for the root partition as well and - oops - upon reboot the root partition could no longer be mounted read/write. The full line from fstab:

```
# OLD
/dev/sda3 / ext4 defaults,nodelalloc,journal_checksum,data=ordered,user_xattr,noatime 0 1

# NEW
/dev/sda3 / ext4 defaults,nodelalloc,journal_checksum,data=journal,user_xattr,noatime 0 1
```

Any idea what I'm doing wrong?

----------

## Cyker

Ack!  :Shocked: 

Hmm, lemme think... 

<Excessively Verbose>

IIRC, / gets mounted by the kernel at boot, before the fstab is even read, using the filesystem's default mount options, and unless you've changed those it comes up as data=ordered.

Now, normally, / is mounted ro first and then gets remounted rw with the options from fstab, BUT you can't do a remount operation AND change the journal mode at the same time (in this case, going from ordered to full journal).

I suspect that is what's causing the error; you may be able to check the logs in /var/log to see if they give a hint.

Now,  assuming that *IS* the problem, the fix is:

</Excessively Verbose>

<Short Answer>

Run 

```
# Replace /dev/sdXN with your root partition
tune2fs -o journal_data /dev/sdXN
```

NB: The -o must be lowercase: -o sets the default mount options, while uppercase -O toggles filesystem features.

</Short Answer>
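To see which default mount options the superblock stores, before or after the change, something like the following might help. This is only a sketch: it assumes the `Default mount options:` line that `tune2fs -l` prints, and both the helper name `default_mount_opts` and the device `/dev/sdXN` are placeholders rather than anything from this thread.

```shell
# Hypothetical helper: pull the stored default mount options out of
# `tune2fs -l` output supplied on stdin.
default_mount_opts() {
    sed -n 's/^Default mount options:[[:space:]]*//p'
}

# Typical use (as root; /dev/sdXN is a placeholder):
#   tune2fs -l /dev/sdXN | default_mount_opts
```

If journal_data shows up in the output, the boot-time mount of / should already be in data=journal mode, so the later remount no longer has to switch modes.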

HTH  :Mr. Green: 

----------

## svoop

 *Quote:*   

> Now,  assuming that *IS* the problem, the fix is:

 

That was the problem, big thanks for the hint!

----------

## Cyker

Glad you could fix it; it always scares the heck out of me when I change something and / stops working!!  :Shocked: 

----------

## mv

Does anybody know what happened to the "data=alloc_on_commit" which was announced for kernel 2.6.30?

Was this just "merged" into the default behavior of "data=ordered"?

I run kernel 2.6.30, but cannot remount the filesystem when I edit my fstab to data=alloc_on_commit. Of course, it might be a problem similar to the one svoop had (i.e. data=ordered simply cannot be remounted as data=alloc_on_commit?), but perhaps somebody knows more details...?
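One way to check which data mode a mounted filesystem is actually using is to look at its options in /proc/mounts. A rough sketch, assuming the running kernel lists a data= entry there (some kernel versions may omit it when the default is in effect); the helper name `data_mode_of` is made up for illustration:

```shell
# Hypothetical helper: print the data= journalling mode listed for a
# mount point. Reads lines in /proc/mounts format on stdin and prints
# nothing if the options field carries no data= entry.
data_mode_of() {
    awk -v mp="$1" '$2 == mp { print $4 }' | tr ',' '\n' | sed -n 's/^data=//p'
}

# Typical use on a live system:
#   data_mode_of / < /proc/mounts
```

Comparing that output with the data= value in fstab shows whether the remount would have to change the mode.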

----------

## Cyker

Wow, I had to google that; Never even knew such a thing existed!

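I think it became redundant as they are supposedly adding exception code to data=ordered to detect and deal with the empty file swappy thingy (an app replaces a file by writing a new one and renaming it over the old one, and after a crash you're left with a zero-length file), which is apparently the main source of the argument.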

A rather inelegant hack, but things are going to get worse before they get better with ext4 it seems  :Sad: 
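For reference, the replace-by-rename pattern at the heart of that argument looks roughly like this when an application flushes the data itself instead of relying on the filesystem. It is only a sketch: `replace_atomically` is a hypothetical name, and the plain `sync` is a coarse stand-in for calling fsync on the one file.

```shell
# Hypothetical helper: replace FILE with the data read from stdin. The
# intent is that a crash leaves the old or the new contents, not an
# empty file.
replace_atomically() {
    target=$1
    dir=$(dirname "$target")
    tmp=$(mktemp "$dir/.tmp.XXXXXX") || return 1
    # 1. Write the new contents to a temporary file in the same directory.
    cat > "$tmp" || { rm -f "$tmp"; return 1; }
    # 2. Flush dirty data before the rename (coarse: flushes everything).
    sync
    # 3. Rename over the old file; this is atomic within one filesystem.
    mv -f "$tmp" "$target"
}

# Typical use:
#   printf 'new contents\n' | replace_atomically /path/to/file
```

Note that mktemp creates the temporary file with restrictive permissions, so the replaced file ends up with those rather than the original mode.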

I stick with data=journal because I definitely want robustness over speed! Linux software RAID is very vulnerable to data corruption so anything I can do to mitigate that is welcome (If the FS had ECC for the data blocks that'd rock, and thank smeg they finally got checksums for the journal! I am envious of the copy-on-write stuff in newer FS' tho'. That is a fantastic feature for data robustness!)

I'd like to try btrfs or NILFS2 (Which sounds nifty - Crash-proof but probably very space hungry!),  but they're rather too experimental for me at the moment (And I haven't got a good way of backing up 3TB!!)

----------

## mv

 *Cyker wrote:*   

> I think it became redundant as they are supposedly adding exception code to data=ordered to detect and deal with the empty file swappy thingy, which is apparently the main source of the argument.

 

Which would mean that it has been "merged" with data=ordered, as I had conjectured. However, I never found an "official" statement, neither in the positive nor in the negative. The only "semi-official" statement was that it was scheduled for 2.6.30 when the alloc_on_commit patch came out during 2.6.28; if an important fix like this takes that long to go into the kernel, I did not expect that a completely changed implementation would go into the kernel.

 *Quote:*   

> I stick with data=journal because I definitely want robustness over speed!

 

That's why I wanted to use data=alloc_on_commit instead of data=ordered, if they really decided to make two different options of it. However, I do not see a point in data=journal, since files with freshly modified data will not contain the complete data anyway after a crash, so journaling the data only reduces the speed (and, more importantly, stresses your hard disk extremely) without giving real additional security: It does not really make a difference whether a file contains incomplete data in addition to previous data or, alternatively, incomplete data in addition to previous data! (This is not a typo: For randomly accessed files you will typically always have a mixture of both; files to which data is being appended during the crash should be "too short" in both cases [provided data=ordered is implemented correctly]).

 *Quote:*   

> Linux software RAID is very vulnerable to data corruption so anything I can do to mitigate that is welcome

 

The most annoying point is that barriers are not supported on it, so you have to make sure to switch off the drives' hardware write caches.
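On ATA/SATA drives the write cache can usually be switched off with hdparm. A minimal sketch, assuming hdparm is installed; the helper name `disable_write_cache`, the `DRY_RUN` switch and the device names are placeholders:

```shell
# Hypothetical helper: turn off the on-drive write cache for each device.
# With DRY_RUN=1 it only prints the commands instead of running them.
disable_write_cache() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "hdparm -W0 $dev"
        else
            hdparm -W0 "$dev"
        fi
    done
}

# Typical use (as root; device names are placeholders):
#   disable_write_cache /dev/sda /dev/sdb
```

The setting may not survive a power cycle on every drive, so it may need to be reapplied at each boot.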

 *Quote:*   

> I'd like to try btrfs or NILFS2 (Which sounds nifty - Crash-proof but probably very space hungry!)

 

I am not so convinced about the crash-proofness of either. For instance, it is not clear that things like the previous bad behavior of ext4 cannot happen (I mean conceptually; I am not speaking about implementation bugs, which probably exist in addition). (OK, in NILFS2 it might perhaps be worked around "manually" after a crash, but this is not what I mean). Also, I never read about how e.g. btrfs is supposed to do journaling: Does it have a fixed journal location or is it more like the dancing trees of Reiser4? Is it (conceptually) correct? By the latter I mean that it does not repeat the mistake of reiser3, where only the data which needs changing is written to the journal instead of the full block: Otherwise things can go wrong if the computer crashes while the journal is written to the system (where the remaining data in the block can be damaged). Of course, I can understand that the developers currently have better things to do than to document the concepts, but anyway, I would like to know what is planned (and I do not want to study the source just to find out).

----------

## Cyker

Well, strictly speaking they haven't been merged because the .31 ordered doesn't behave like the ext3 version all the time; It just has a kludge which tries to detect the empty file scenario and then does ext3-esque behaviour toward it.

I don't like hacks like that; They have a nasty habit of biting you in the ass during some scenario you never thought of  :Sad: 

NILFS2 is very straightforward conceptually, but I have no idea how btrfs works; all I know is that it has copy-on-write and apparently uses a clever b-tree implementation to boost efficiency without affecting robustness. But I don't understand that part at all  :Laughing: 

I should clarify the crash-proof bit: those FS will (well, *should*) NOT corrupt an existing file in the event of a crash, unlike most current fs. This is because in both cases the original file data is not touched in place; the change is written to a fresh copy instead. (Unless I understand this whole copy-on-write thing wrong)

But this is all conjecture really; could be a while before I feel either is up to snuff. Heck, technically I'm still using ext3, just accessing it with the ext4 driver for journal checksumming!!  :Laughing: 

----------

## mv

 *Cyker wrote:*   

> It just has a kludge which tries to detect the empty file scenario and then does ext3-esque behaviour toward it.

 

Yes, such a patch has been in gentoo-sources for quite a while. However, I had hoped that 2.6.30 would have the "proper" solution, which would be the (unconditional) alloc_on_commit.

 *Quote:*   

> I don't like hacks like that; They have a nasty habit of biting you in the ass during some scenario you never thought of 

 

I completely agree: alloc_on_commit is what one actually wants. So let us wait and see whether it finally gets implemented. Unfortunately, I am afraid that the maintainer has simply lost interest in fixing the thing properly, now that a "workaround" has been found and loud protests are not to be expected. I really would like to switch to a safer filesystem - unfortunately, it seems that ext4 is currently still the safest one, at least of those which claim to have reached a stable state.

 *Quote:*   

> NILFS2 is very straightforward conceptually, but I have no idea how btrfs works

 

To be honest, I have no desire for a versioning filesystem - space is too valuable for me; that's why I never considered things like NILFS2 seriously.

 *Quote:*   

> All I know is that it has copy-on-write and apparently uses a clever b-tree implementation to boost efficiency without affecting robustness. But I don't understand that part at all 

 

It's similar for me: The rough description made it sound as if it had features similar to Reiser4's dancing trees, but it is not clear what they are really trying to do. In particular, I am a bit worried that I could not even see any remarks on some sort of "journal" in the description (contrary to Reiser4, where it was explained roughly how the journal is supposed to "move" within the tree).

 *Quote:*   

> Those FS will (well, *should*) NOT corrupt an existing file in event of a crash unlike most current fs. This is because in both cases the original file data is not touched but is instead copied and then changed.

 

This alone does not save you from corruption. The difficulty is not the file data but the metadata: What happens if a power loss occurs during the modification of a metadata block, leaving that block containing only rubbish data?

 *Quote:*   

> Heck, technically I'm still using ext3, just accessing it with ext4 driver for journal checksumming!! 

 

That's not a bad idea: By disabling the extensions you can avoid the mess with the (apparently still nonexistent) alloc_on_commit feature.

----------

