# ext4 filesystem remounting itself readonly

## forkboy

With increasing regularity my /home partition is remounting itself read-only. This is the last part of dmesg:

```
[ 1076.704027] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 1076.704036] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[ 1076.704037]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1076.704041] ata1.00: status: { DRDY }
[ 1076.704048] ata1: hard resetting link
[ 1077.162013] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1077.167876] ata1.00: configured for UDMA/133
[ 1077.167888] end_request: I/O error, dev sda, sector 175669291
[ 1077.167906] ata1: EH complete
[ 1077.167938] Aborting journal on device sda3:8.
[ 1077.170322] journal commit I/O error
[ 1077.313191] ext4_abort called.
[ 1077.313194] EXT4-fs error (device sda3): ext4_journal_start_sb: Detected aborted journal
[ 1077.313198] Remounting filesystem read-only
```

I'm running 2.6.30-gentoo-r4 on amd64. I don't know what other info to provide, and sorry if this is the wrong section. Any help is much appreciated.

----------

## platojones

When this happens, it's usually because the filesystem is encountering a lot of errors, so it remounts itself read-only (ro) to protect the data. You need to run fsck on that filesystem. Based on the few error messages I see there, it seems to be a problem with the journal on that filesystem.

----------

## forkboy

It forces an fsck on that partition every time it reboots after the problem, and it always passes with no errors.

----------

## eccerr0r

 *forkboy wrote:*   

> ```
> [ 1077.167888] end_request: I/O error, dev sda, sector 175669291
> ...
> ```

 

It looks like it was having trouble writing a new sector for the journal. First, I'd back up the disk ASAP, as it may be on its last legs. Then, if a forced fsck on the disk reports no errors, check the disk with smartmontools.

Was the journal clean or does it have uncommitted blocks in the journal?  Probably the latter?
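To pull the relevant events out of a longer dmesg dump, something like the following Python sketch may help. It only knows the message wording quoted in this thread (2.6.30-era kernels); other kernel versions word these lines differently, and the function name `summarize` is purely illustrative.

```python
import re

# Patterns match the kernel messages quoted in this thread; other kernel
# versions may phrase them differently (assumption).
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
JOURNAL_ABORT = re.compile(r"Aborting journal on device (\S+)")
REMOUNT_RO = "Remounting filesystem read-only"


def summarize(dmesg_text):
    """Collect I/O errors, journal aborts and read-only remounts from dmesg text."""
    io_errors = [(m.group(1), int(m.group(2)))
                 for m in IO_ERROR.finditer(dmesg_text)]
    # The kernel ends the journal message with a full stop; strip it.
    aborted = [m.group(1).rstrip(".")
               for m in JOURNAL_ABORT.finditer(dmesg_text)]
    return {
        "io_errors": io_errors,          # list of (device, sector)
        "aborted_journals": aborted,     # e.g. "sda3:8"
        "remounted_ro": REMOUNT_RO in dmesg_text,
    }
```

Feeding it the output of `dmesg` shows whether the same sector fails every time or whether the failing sector moves around, which hints at whether a single bad spot on the disk is involved.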

----------

## forkboy

 *Quote:*   

> Was the journal clean or does it have uncommitted blocks in the journal?  Probably the latter?

I don't know... how do I find out? It's all backed up, just in case the drive is dying.

----------

## eccerr0r

I'm just curious about the state of the disk and how much of it is corrupt. What happens if you explicitly try mounting the filesystem read-only? It shouldn't muck with the journal... will it still emit this error? I'm not too familiar with ext4, but if fsck is clean then it should have replayed the outstanding journal entries (or discarded them, which is also OK). Just hoping you have a consistent backup and not a half-baked one; that's what journaling is supposed to help prevent.

This is somewhat weird, as usually writing to a bad sector causes the hard drive to reallocate it to a spare, unless it has run out of spares. It wasn't clear what it was trying to do; perhaps it was trying to read the journal to commit additional writes to the filesystem?

Probably should just go check how bad it is according to SMART and bank on buying a new disk...
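For the SMART check, the attributes that speak to the reallocation question are the reallocated and pending sector counts. Here is a rough Python sketch that reads them from `smartctl -A`. It assumes smartmontools is installed, root access, the usual ten-column attribute table, and that the drive reports these attribute names (vendors vary); the helper names are made up for illustration.

```python
import subprocess

# Attribute names as commonly reported by smartctl; a given drive may use
# different names or omit some of them (assumption).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable")


def parse_smart_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for the watched attributes."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; the raw value is column 10.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            try:
                found[fields[1]] = int(fields[9])
            except ValueError:
                found[fields[1]] = fields[9]  # keep non-numeric raw values as text
    return found


def read_smart_attributes(device="/dev/sda"):
    """Run `smartctl -A` on the device (needs root) and parse its output."""
    # smartctl uses its exit status as a bitmask, so a nonzero code is not
    # treated as a failure here.
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)
```

Nonzero raw values that keep growing between runs usually point at the disk itself rather than at the filesystem.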

----------

## forkboy

Deleted and recreated the partition... It still happens. Ran the manufacturer's diagnostic tool; it passed without errors, and yet it still happens. I don't know what else to do, so I'm reformatting back to ext3 to see if it still happens.

----------

## forkboy

Found this on the Red Hat Bugzilla. Guess this is my problem, especially  *Quote:*   

> Some people had this problem with Samsung disks with kernels after 2.6.27.5  

  Guess I'm stuck with ext3 then.

----------

## mv

 *forkboy wrote:*   

> Guess I'm stuck with ext3 then.

 

If it is really the barrier=1 default which causes this, then I would use the "nobarrier" option (together with data=ordered) with ext4 as suggested: whether you use ext3 or ext4 with nobarrier is equally unsafe. No matter how you decide, switch off your hard disk's write cache (using hdparm) - then both solutions are equally safe, but ext4 is still preferable for several other reasons.
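After changing the options in fstab and remounting, it is worth confirming what the kernel is actually using. A minimal Python sketch, assuming the standard `/proc/mounts` layout (device, mountpoint, fstype, options, dump, pass); how the barrier setting shows up in the option list depends on the kernel version, and the function names here are hypothetical.

```python
def mount_options(mountpoint, mounts_text):
    """Return the option list for mountpoint from /proc/mounts-style text, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mountpoint, fstype, options, dump, pass.
        if len(fields) >= 4 and fields[1] == mountpoint:
            # A later entry for the same mountpoint hides an earlier one.
            options = fields[3].split(",")
    return options


def is_read_only(options):
    """True if the filesystem is mounted (options found) and 'ro' is among them."""
    return options is not None and "ro" in options


def check(mountpoint="/home"):
    """Read the live mount table and report options for one mountpoint."""
    with open("/proc/mounts") as f:
        options = mount_options(mountpoint, f.read())
    return options, is_read_only(options)
```

Calling `check("/home")` after the next I/O error also shows whether the kernel has flipped the filesystem to `ro` again.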

----------

## kernelOfTruth

 *mv wrote:*   

>  *forkboy wrote:*   Guess I'm stuck with ext3 then. 
> 
> If it is really the barrier=1 default which causes this, then I would use the "nobarrier" option (together with data=ordered) with ext4 as suggested: whether you use ext3 or ext4 with nobarrier is equally unsafe. No matter how you decide, switch off your hard disk's write cache (using hdparm) - then both solutions are equally safe, but ext4 is still preferable for several other reasons.

 

I saw similar behavior starting to show with reiserfs and barrier=flush (barrier enabled) on a new Samsung drive so I guess disabling barrier will "fix" it

Thanks!

What is the cause of this strange behavior? Faulty firmware? Can the drive's firmware not take the pressure?

----------

## forkboy

 *Quote:*   

> 
> 
> I saw similar behavior starting to show with reiserfs and barrier=flush (barrier enabled) on a new Samsung drive so I guess disabling barrier will "fix" it
> 
> thanks !
> ...

 

God knows. I can only reproduce this on one of my disks; the other has been fine with barriers for a long time.

----------

