# Bad blocks in journal area of ext3 fs due to bad strategy.

## deva

Hi

I have a media center box running gentoo that is often turned off by just pulling the power rather than doing a nice 'init 0'.

Since audio/video recordings are often made on the box I have added data=journal,commit=1 to the mount options on the partition storing the recordings.

The full fstab entry is:

```
/dev/sda6  /mnt/recordings  ext3     defaults,data=journal,commit=1    0 0
```

This has been quite effective in preventing data loss if the recordings were not actually flushed to disk when the power is pulled and has been working great for quite some time.

However, a new problem has now arisen: bad blocks in the journal inode of the partition!

Fixing the bad block is not a problem, but it had me thinking: "perhaps writing to the same sector every time any other sector on the disk is written is not such a great strategy after all..."

So my real issue is this: Is it possible to either have the journal "spread out" on the disk and thereby not stress the single sector so much, or perhaps move the journal manually just before mounting the partition?

Alternatively an "always flush directly to hardware" strategy might be better, but I'm not really sure how to implement that.
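For illustration only, here is a minimal sketch of what "flush directly to hardware" could look like at the application level, assuming the recording program's writes can be changed at all (which may not be the case for off-the-shelf recording software). The function name `write_durably` is hypothetical. It appends data and then calls `os.fsync()`, which asks the kernel to push the file's data to the storage device before returning. Note that on a partition mounted with `data=journal` the data still passes through the journal, so this changes when data is flushed, not where the journal lives.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Append data to path and ask the kernel to flush it to the device.

    Hypothetical helper for illustration; not taken from any recording tool.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Block until the kernel reports the data has reached the device.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Calling `fsync` after every small write is slow, so a real recorder would probably batch writes and flush at intervals it can afford to lose.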

----------

## Hu

Although you may be able to mitigate the damage done by a hard power loss, you should try to avoid this as best you can.  Only cut power if the machine is hung.  Otherwise, turn it off properly.  You can turn off journaling entirely if you want to write changes through without the journal, but this is bad for performance.

----------

## deva

Thank you for your reply.

 *Hu wrote:*   

> Although you may be able to mitigate the damage done by a hard power loss, you should try to avoid this as best you can.  Only cut power if the machine is hung.

 

I am aware that turning off power is a bad idea, but I have not been able to control the powerdown process using the IR remote control yet.

Currently I am turning off the external amplifier, and that in turn powers off some power sockets on its back, one of which the media center PC is connected to.

 *Hu wrote:*   

> You can turn off journaling entirely if you want to write changes through without the journal, but this is bad for performance.

 

Can this be done using mount parameters, or will I have to change the file system options with tune2fs?

----------

## The Doctor

What is wrong with using ssh? You could make an alias on your normal computer, say haltserver or something, which would log in and send the power off signal with a single command.

EDIT: Even if you can get it down to a 1/1000 chance of damaging your system by pulling the plug (and I don't think you can get odds that good) you will have about a 30% chance of damage within a year and roughly 50% within two. That seems like an awfully big risk.

----------

## Goverp

 *The Doctor wrote:*   

> ...
> 
> EDIT: Even if you can get it down to a 1/1000 chance of damaging your system by pulling the plug (and I don't think you can get odds that good) you will have about a 30% chance of damage within a year and roughly 50% within two. That seems like an awfully big risk.

 

Can you justify these statistics?  As the joke goes "95.376% of statistics are made up on the spot".

IMHO there's actually no point in a "proper" shutdown if your system's journalling and recovery processes work correctly.  (OK, in many cases this isn't true, but equally in many it is true.)  Since you always go through a startup operation to replay the journal to recover "lost" updates, all a proper shutdown does is move the work from startup to closedown.

Sorry, this isn't actually relevant to deva's question about moving the journal, and I can't help with that.

----------

## Hu

Does the media PC not have a power button that you can press?  The standard way to turn off such machines is to bind an ACPI event to the power button, and let that event trigger a software-initiated halt when the user taps the power button.

You may be able to disable the journal with tune2fs.  I doubt you can do it with mount options for ext3.

I believe The Doctor was working from basic probability.  If we assume a .001 chance of damaging the system through an unclean shutdown, then you have a .999 chance of not damaging the system with the shutdown once.  You then have a .999^2=~.998 chance of not damaging the system with two shutdowns.  Extend this out to a year of halting the machine every day and you get .999^365=~.694 chance of not damaging it after 365 shutdowns and a .999^(365*2)=~.481 chance of not damaging it after 365*2 shutdowns.
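The arithmetic above can be checked with a few lines of Python. This is only a sketch under the same assumptions: a fixed .001 chance of damage per unclean shutdown, each shutdown independent of the others, and one shutdown per day. The function name `survival_probability` is made up for the example.

```python
def survival_probability(p_damage: float, shutdowns: int) -> float:
    """Chance that none of `shutdowns` independent unclean shutdowns does damage."""
    return (1.0 - p_damage) ** shutdowns


# Assumed figures from the discussion: 0.001 per shutdown, one shutdown a day.
for count in (1, 2, 365, 365 * 2):
    undamaged = survival_probability(0.001, count)
    print(f"{count:4d} shutdowns: {undamaged:.3f} undamaged, "
          f"{1.0 - undamaged:.3f} damaged at least once")
```

The chance of at least one damaging shutdown is simply one minus the survival figure, which is where the roughly 30% after one year comes from.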

----------

## Goverp

 *Hu wrote:*   

> ...
> 
> I believe The Doctor was working from basic probability.  If we assume a .001 chance of damaging the system through an unclean shutdown, then you have a .999 chance of not damaging the system with the shutdown once.  You then have a .999^2=~.998 chance of not damaging the system with two shutdowns.  Extend this out to a year of halting the machine every day and you get .999^365=~.694 chance of not damaging it after 365 shutdowns and a .999^(365*2)=~.481 chance of not damaging it after 365*2 shutdowns.

 

Ah, of course.

Mind you, exactly the same argument applies to controlled shutdowns, since there's probably a similar chance of a bug in the shutdown-cleanup code as in the startup-recovery code.

----------

## Hu

I disagree, for two reasons.  First, clean shutdown is exercised by far more people than unclean startup.  Second, the clean shutdown case proceeds in an orderly fashion and by definition is never interrupted in the middle.  If it were interrupted in the middle, it would be an unclean shutdown, whether that interruption is because the user removed power or because the kernel crashed.  In the unclean startup case, the damage is presumed to occur not because of an unclean startup, but because of the unclean shutdown which preceded it.  By definition, the unclean shutdown is halted at some point that has not properly synchronized its state.  There exist a huge number of such points in the normal operation of a system, many of which are probably hit rarely if ever by users.  Though I hope no one sells consumer hardware like this, there is a chance that the machine has components which can be damaged merely by the electrical aspects of an unexpected power loss, but which would be fine if properly halted before electricity was cut.

----------

## wjb

A variation on The Doctor's suggestion - a tiny custom webserver running on the mediacentre to do a shutdown when a given page is requested, and a bookmark to that page on your phone.
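A rough sketch of such a webserver, using only Python's standard library. Everything specific here is an assumption for illustration: the `/shutdown` path, port 8080, the helper names, and that the process has enough privileges to run `shutdown -h now`. There is no authentication at all, so this would only be sensible on a trusted home network.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

SHUTDOWN_PATH = "/shutdown"  # hypothetical URL path; pick your own


def halt_machine():
    # Assumption: the server runs with enough privileges to halt the box.
    subprocess.run(["shutdown", "-h", "now"], check=False)


def make_handler(on_shutdown=halt_machine):
    """Build a request handler that calls on_shutdown when the page is hit."""

    class ShutdownHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != SHUTDOWN_PATH:
                self.send_error(404)
                return
            body = b"Shutting down\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            # Reply first, then trigger the clean shutdown.
            on_shutdown()

        def log_message(self, fmt, *args):
            pass  # stay quiet; nothing useful to log on a media box

    return ShutdownHandler


def serve(port=8080, on_shutdown=halt_machine):
    """Listen on all interfaces and serve until the machine goes down."""
    HTTPServer(("", port), make_handler(on_shutdown)).serve_forever()
```

Calling `serve()` from an init script would start it at boot; the phone bookmark would then point at the box's address, port 8080, path `/shutdown`.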

----------

## Ant P.

 *Hu wrote:*   

> If we assume a .001 chance of damaging the system through an unclean shutdown, then you have a .999 chance of not damaging the system with the shutdown once.  You then have a .999^2=~.998 chance of not damaging the system with two shutdowns.  Extend this out to a year of halting the machine every day and you get .999^365=~.694 chance of not damaging it after 365 shutdowns and a .999^(365*2)=~.481 chance of not damaging it after 365*2 shutdowns.

 

It's a moot point anyway, because that drive has already failed:

On mechanical hard disks the only thing keeping the write head from colliding with the disk surface is an air cushion effect generated by the disk's rotation. The write head gets left in the journalling area after every flush (by default on ext3/4 that's every 5 seconds, but this user overrode that to every 1 second). That's not a 0.1% chance of failure, that's a 100% chance of a head crash in the same place on disk every time.

Every OS since DOS gained hard disk support 30 years ago has had some method to park the heads on shutdown, it's there for exactly this reason.

----------

