# Help me with a raid

## Kurogane

Hi,

I have a problem: one of my disks failed, so I replaced it.

Here is what I had before replacing the drive.

I'll call the old drive oldsda; it was /dev/sda.

```
md2 : active raid1 sda3[0]
      1927689152 blocks super 1.2 [2/1] [U_]

md0 : active (auto-read-only) raid1 sda1[0]
      25149312 blocks super 1.2 [2/1] [U_]

md1 : active raid1 sda2[0]
      523968 blocks super 1.2 [2/1] [U_]

unused devices: <none>
```
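Those `[2/1] [U_]` fields already say each array has only one of its two members. A quick way to spot degraded arrays is to look for an underscore in the status field; this is a small sketch run against a saved copy of the output above (on a live system you would read `/proc/mdstat` directly):

```shell
# An underscore inside the [..] status field means that slot's member
# is missing. This scans a saved copy of the mdstat output above;
# on a live system use /proc/mdstat instead of the here-doc.
degraded=$(grep -c '\[U_\]\|\[_U\]' <<'EOF'
md2 : active raid1 sda3[0]
      1927689152 blocks super 1.2 [2/1] [U_]
md0 : active (auto-read-only) raid1 sda1[0]
      25149312 blocks super 1.2 [2/1] [U_]
md1 : active raid1 sda2[0]
      523968 blocks super 1.2 [2/1] [U_]
EOF
)
echo "$degraded arrays are missing a member"
```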

- Now I shut down the server and replaced the drive with a new one.

So now /dev/sda is the clean (new) drive.

When I check mdstat again, it shows me this:

```
md2 : active raid1 sdb3[2]
      1927689152 blocks super 1.2 [2/1] [_U]
```

Where have md0 and md1 gone? I don't know.

Ignoring for now why my arrays don't show up, I'm going to clone the partition table of /dev/sdb.
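If md0 and md1 were simply never assembled (their superblocks may well still be intact on the surviving disk), the usual first step is to ask mdadm to assemble whatever it can find. This is only a sketch; the device name assumes sdb1 is the surviving member:

```shell
# Ask mdadm to scan all devices for RAID superblocks and assemble
# every array it recognizes (arrays that were never assembled at
# boot, like md0/md1 here, often reappear after this).
mdadm --assemble --scan

# Or assemble one array explicitly from its surviving member;
# --run starts it even though it is degraded.
mdadm --assemble /dev/md0 /dev/sdb1 --run
```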

- Here is the result:

```
fdisk -l /dev/sda

Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x55555555

Device     Boot    Start        End    Sectors  Size Id Type
/dev/sda1           2048   50333696   50331649   24G fd Linux raid autodetect
/dev/sda2       50335744   51384320    1048577  512M fd Linux raid autodetect
/dev/sda3       51386368 3907027120 3855640753  1.8T fd Linux raid autodetect

fdisk -l /dev/sdb

Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x55555555

Device     Boot    Start        End    Sectors  Size Id Type
/dev/sdb1           2048   50333696   50331649   24G fd Linux raid autodetect
/dev/sdb2       50335744   51384320    1048577  512M fd Linux raid autodetect
/dev/sdb3       51386368 3907027120 3855640753  1.8T fd Linux raid autodetect
```

Looks cool right?
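For reference, the usual way to clone an MBR partition table from the surviving disk onto a blank replacement (presumably how the matching tables above were produced) is with sfdisk. The direction matters, since dumping the wrong disk would overwrite the good table:

```shell
# Dump the partition table of the surviving disk (sdb here) and
# write it to the new, empty disk (sda). Double-check the device
# names first: the dump source must be the good disk.
sfdisk --dump /dev/sdb > sdb-table.dump
sfdisk /dev/sda < sdb-table.dump
```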

- Now I try to add the partitions to the arrays:

```
mdadm --manage /dev/md0 --add /dev/sdb1

mdadm: error opening /dev/md0: No such file or directory
```

```
mdadm --manage /dev/md1 --add /dev/sdb2

mdadm: error opening /dev/md1: No such file or directory

```
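Those `No such file or directory` errors just mean /dev/md0 and /dev/md1 do not exist: `--add` only works on an array that is already running, and these two were never assembled. A sketch of the order that should work, assuming (as the later replies suggest) that sdb holds the surviving members and sda is the blank replacement:

```shell
# 1. Start each missing array degraded from its surviving member.
mdadm --assemble /dev/md0 /dev/sdb1 --run
mdadm --assemble /dev/md1 /dev/sdb2 --run

# 2. Only then add the partitions on the replacement disk; mdadm
#    starts rebuilding onto them automatically.
mdadm --manage /dev/md0 --add /dev/sda1
mdadm --manage /dev/md1 --add /dev/sda2
```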

- Here are the md2 details:

```
mdadm --detail /dev/md2

/dev/md2:
        Version : 1.2
...
```

Now what? I don't know how to fix this, so I came here. Is there a solution to this problem?  :Crying or Very sad:

----------

## eccerr0r

Stupid question, did you replace the right disk?  Or is it obvious which one failed as it no longer detects?

Run `mdadm --misc --examine /dev/sdbX` to see whether the volumes are clean or dirty. Hopefully you got the right disk; it's not always clear for me when I have to replace disks, since I have multiple (RAID5).
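Concretely, the fields worth comparing in the `--examine` output are the update time and the event counter: a member that was kicked out shows an older Update Time and a lower Events count than the members that kept running. A sketch for the partitions in this thread (v1.2 superblocks, per the mdstat output):

```shell
# Compare superblock metadata across the RAID1 members; the stale,
# kicked-out member has the older Update Time and lower Events count.
# (A freshly partitioned blank disk will simply report no superblock.)
mdadm --examine /dev/sda3 /dev/sdb3 | grep -E 'Update Time|Events|Array State'
```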

I forget whether Linux will auto-assemble disks that are dirty and degraded. That sounds like a recipe for failure, and it probably does not, at least for RAID5/6. You may need `mdadm --assemble --force` to start the dirty RAID again (be very sure it's what you really want to do).
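For the record, that forced assembly would look something like the sketch below. It is dangerous precisely because `--force` tells mdadm to accept dirty or stale metadata, so be certain the member you name holds the data you want:

```shell
# Stop the array first if a half-assembled instance is running.
mdadm --stop /dev/md2

# Force-start the degraded, dirty array from the chosen member;
# --run starts it even though a member is missing.
mdadm --assemble --force /dev/md2 /dev/sdb3 --run
```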

----------

## Kurogane

Yes, I replaced the right disk; sda had damaged sectors, so I replaced the unit.

Another stupid question: from what I read, when you replace a disk you are supposed to remove it from the RAID array first, is that true? Because I just pulled and replaced the disk without doing that. Also, I read that removing a disk from the array should be done from the [UU] state, not the [_U] state, which is my case (it only shows [_U]). Correct me if that's right or not.

----------

## frostschutz

I don't fully understand your situation; your description is very unclear to me...

...in your first mdstat, the RAIDs were already degraded and only sda was left. If you replace sda at that point, there is nothing left. If sda, as the only remaining drive, was somehow broken, your only option would be to ddrescue it.

When a drive is kicked out of an array, its metadata will still look "fine" because it won't be updated anymore after being kicked out. It's the other drives that have this failure recorded in their metadata.

So if you have an sda drive that says sdb is bad, and an sdb drive that says sdb is fine, mdadm believes sda and you get [U_].

If you remove sda, suddenly sdb is back and you get [_U] with an old, bad drive that was kicked out ages ago.

----

Once you're in the situation where your RAID has "split" into two independent RAIDs (one running as [U_] and the other as [_U]), unless you know exactly which one to disregard, I would treat both as single-drive RAIDs and back up the files from both of them... then figure out which backup to use (based on file timestamps, by verifying file contents) and build a new RAID from scratch.
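Backing up one half before deciding could look like the sketch below; the mount point and backup paths are made up for illustration, and `--readonly` keeps mdadm from writing to the member while you copy:

```shell
# Assemble one half of the split RAID read-only, mount it read-only,
# and copy the files off before deciding which half to keep.
# (/mnt/half-b and /backup/half-b are illustrative paths.)
mdadm --assemble --readonly /dev/md2 /dev/sdb3 --run
mkdir -p /mnt/half-b
mount -o ro /dev/md2 /mnt/half-b
rsync -a /mnt/half-b/ /backup/half-b/
```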

----------

## Kurogane

I think the problem was that I didn't notice sdb had been out of sync for a long time, so the recent files were only on sda. Now sda is gone, and I'm left with sdb, but with very old data.

Can anyone answer me this?

Another stupid question: from what I read, when you replace a disk you are supposed to remove it from the RAID array first, is that true? Because I just pulled and replaced the disk without doing that.   :Crying or Very sad:

----------

## eccerr0r

When a disk is kicked from an array, it will have the wrong timestamp and be marked "dirty", since it never got a chance to update on shutdown. The timestamp on it will be "old" compared to the remaining disks, again because it was no longer being updated. Those are clues as to which drive failed, if all disks are at least somewhat serviceable after the failure (I've had transient failures where disks came up fine after the fail event). If one disk failed and you then had an unclean shutdown, both disks will show up as dirty, making it very tricky: you have to carefully check the timestamps to see which disk is the one you want to keep.

For mdraid level 5, yes: when a disk fails, you should `mdadm --remove` that disk before physically removing it. I believe it's the same for the other mdraid levels.
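The removal just described is a two-step dance (fail, then remove); a sketch using the md2/sda3 pair from this thread:

```shell
# Mark the failing member faulty, then remove it from the array,
# before physically pulling the disk.
mdadm --manage /dev/md2 --fail /dev/sda3
mdadm --manage /dev/md2 --remove /dev/sda3

# After the replacement disk is partitioned, add it back and
# let the array resync.
mdadm --manage /dev/md2 --add /dev/sda3
```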

Even with the failed disk removed, the array should still autodetect and start, but it will come up in degraded mode. Once it's started you can just `--add` the new disk.

You may have to force-start your RAID degraded to get it going again. This is somewhat dangerous, hence "RAID is not backup".

----------

