# RAID array broken, can't boot

## ExecutorElassus

I have three drives in a raid5 array. One is dying. I pulled the plug on the wrong one and rebooted. I turned off again, plugged in the good drive, unplugged the bad. The good one was dropped, and with the bad one unplugged I couldn't boot. Turned off again, plugged the bad drive back in, booted, and it started to recover the array. After two hours, it stopped. So I tried rebooting.

Now, the initramfs won't boot, saying that "/dev/md127 is assembled from 2 drives and 1 rebuilding - not enough to start the array while not clean - consider --force." It drops to a shell and says "can't access tty; job control turned off."

The keyboard won't respond.

How can I recover? I have a very old gentoo recovery CD and a USB cd drive. Are there any tools there I could use to fix this?

The initramfs is from Neddy Seagoon. Neddy, can you save me once more?

Thanks in advance,

EE

----------

## ExecutorElassus

update: the old SystemRescueCD I had wouldn't boot; after a few seconds of startup the machine switched off.

So now I'm burning a gentoo install CD, and hoping I can use that to recover. Or I also have a newer SystemRescueCD.

So, to be clear, I have:

sda - good

sdb - dead

sdc-old - good, but out of sync with the array

sdc-new - a new, blank HDD which I haven't plugged in yet.

How can I recover the array with the tools I have available?

Thanks for the help,

EE

----------

## ExecutorElassus

UPDATE: I booted into a gentoo Live/Install CD, and now I can get to a prompt. 'cat /proc/mdstat' shows all three arrays, all inactive (and mis-numbered), and each with only two members (sdaN and sdcN).

At this point I can't start the arrays. If I type 'mdadm -A /dev/md126' I get the error that the array is not in the config file. If I give the command 'mdadm -E --scan > /etc/mdadm.conf' I then see three lines (for md1, md126, and /dev/md/carrier, the last of which is wrong). 

Using 'mdadm -A /dev/md1 --force' does not work. 

How do I recover from this?

Again: sdaN and sdcN are all assumed good. I have set the partition table for sdb and can add it back into the array, but the liveCD won't do it, and I still can't boot.

thanks for the help,

EE

----------

## ExecutorElassus

UPDATE2: OK, I just rewrote the array entries in /etc/mdadm.conf, and then it assembled two of them. I was able to add the appropriate sdbN partitions to those arrays, and it synced.
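For anyone following along: the entries mdadm wants are just one ARRAY line per md device, keyed by UUID. Schematically (device names and the UUID placeholder here are illustrative, not copied from my config):

```
# /etc/mdadm.conf - one ARRAY line per md device; mdadm -A assembles by UUID
ARRAY /dev/md1   UUID=<uuid-from-mdadm-E>
ARRAY /dev/md125 UUID=<uuid-from-mdadm-E>
ARRAY /dev/md127 UUID=<uuid-from-mdadm-E>
```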

My last problem is the final array. This is a large array with a volume group inside with multiple partitions, holding all the rest of the system. I thought I might try a last-ditch command, and use

```
mdadm --create /dev/md127 --level=5 --assume-clean --metadata=1.2 --raid-devices=3 /dev/sda4 missing /dev/sdc4
```

but I got two errors:

that sda and sdc appear to be part of a raid array (the old one)

and, more worrisome:

that a partition table exists on /dev/sda4 but will be lost or meaningless after creating the array

How should I proceed? Can I safely ignore the warning about the partition table? How else might I be able to rescue the array?

Thanks,

EE

----------

## NeddySeagoon

ExecutorElassus,

Do not use --create. Defaults have changed. Your data will still be there but you won't find it.

Been there, done that, don't you do it too.

If you really really want to do that (you don't yet) you must specify all the parameters to --create.

What does 

```
mdadm -E /dev/sd[abc]4
```

say?

Post it all, you will be glad you did later.

Post the content of /proc/mdstat too.

Read RAID Recovery and its references.

Play about with some USB drives and overlay filesystems before you trash your real data.

----------

## ExecutorElassus

Hi Neddy,

one thing: if I try to boot with the bad drive still in, it won't assemble because it says it has two drives and one rebuilding, not enough to start, and suggests using --force. But the keyboard in the initramfs doesn't work. Should I just avoid that route altogether?

I can post the mdadm -E information once I boot up again from the liveCD (though I have to copy it by hand).

Stay tuned,

EE

----------

## NeddySeagoon

ExecutorElassus,

Don't copy it by hand.  That post will be your get-out-of-jail-free card if you run a --create and all your data vanishes.

The raid ver 1.2 metadata is written to the start of all the volumes in the raid set.

Trashing it is free, you can do it as many times as you like without harming your data. However, once you run a --create and mess up, you have lots of parameters to get right all at the same time to find your data again.

With software raid, you can move the drives to another system and the raid set will work as well as it ever did.

It need not boot or use the raid you want to poke about on.

Don't use force yet.  We need to see the mdadm -E output before you do anything you can't undo.

----------

## ExecutorElassus

mdadm -E:

```
/dev/sda4:
    Magic: a92b4efc
    Version: 1.2
    Feature Map: 0x0
    Array UUID: d4se5336:b75b0114:a502f2a0:178afc11
    Name: domo-kun:carrier
    Creation Time: Wed Apr 11 00:10:50 2012
    Raid Level: raid5
    Raid Devices: 3
    Avail Dev Size: 1931841384 (921.17 GiB 989.10 GB)
    Array Size: 1931840512 (1842.35 GiB 1978.20 GB)
    Used Dev Size: 1931840512 (921.17 GiB 989.10 GB)
    Data Offset: 2048 sectors
    Super Offset: 8 sectors
    Unused Space: before=1968 sectors, after=872 sectors
    State: active
    Device UUID: 4a8d21e3:15026b07:bfacaedc:b5160599
    Update Time: Fri Feb 8 12:13:10 2019
    Checksum: 8b9a3dbd - correct
    Events: 1340931
    Layout: left-symmetric
    Chunk Size: 512K
    Device Role: Active device 2
    Array State: AAA ('A' == active, '.' == missing, 'R' == replacing)

/dev/sdc4:
    Magic: a92b4efc
    Version: 1.2
    Feature Map: 0x2
    Array UUID: d4se5336:b75b0114:a502f2a0:178afc11
    Name: domo-kun:carrier
    Creation Time: Wed Apr 11 00:10:50 2012
    Raid Level: raid5
    Raid Devices: 3
    Avail Dev Size: 1931841384 (921.17 GiB 989.10 GB)
    Array Size: 1931840512 (1842.35 GiB 1978.20 GB)
    Used Dev Size: 1931840512 (921.17 GiB 989.10 GB)
    Data Offset: 2048 sectors
    Super Offset: 8 sectors
    Recovery Offset: 1389843520 sectors
    Unused Space: before=1768 sectors, after=872 sectors
    State: clean
    Device UUID: 9a99d7ad:9b5a9b75:42cb3258:cfb40e04
    Update Time: Fri Feb 8 12:13:10 2019
    Checksum: fdedb47b - correct
    Events: 1340931
    Layout: left-symmetric
    Chunk Size: 512K
    Device Role: Active device 2
    Array State: AAA ('A' == active, '.' == missing, 'R' == replacing)
```

'cat /proc/mdstat' outputs:

```
Personalities: [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear] [multipath]
md125: active raid1 sdb3[1] sdc3[0] sda3[2]
      9765504 blocks [3/3] [UUU]
md126: active (auto-read-only) raid1 sdb1[1] sda1[2] sdc1[0]
      97536 blocks [3/3] [UUU]
```

So those two are fine, and it's just the problem with sdc4 being set as "rebuilding" and my not knowing how to reassemble it.

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

We need the mdadm -E from the other volume too.

Both those volumes say 

```
    Update Time: Fri Feb 8 12:13:10 2019
    Checksum: 8b9a3dbd - correct
    Events: 1340931
```

so they appear to be self consistent.

That suggests that /dev/sdb4 is being rebuilt from these two. 

However, only /dev/sdc4 says 

```
Recovery Offset: 1389843520 sectors
```

so maybe not.

If it's /dev/sdc4 that is being rebuilt, you don't have enough data to usefully do anything.

If you have a spare drive, recover /dev/sdb4 with ddrescue to the spare drive.

Depending on how that goes, it might be good enough to bring up the array.

ddrescue is surprisingly good at what it does.

----------

## ExecutorElassus

OK, some confusion.

I had three drives in the array: sda, sdb, and sdc. sdb was dying, so I'll call it sdb-old. I have its replacement, sdb-new. Here's the timeline:

1) I unplug sdc in error, and boot

2) I see I've unplugged the wrong HDD. Shut down, plug it back in, unplug sdb-old, reboot

3) boot fails, because sdc was kicked from the array, and sdb-old is unplugged. shut down, plug in sdb-old, boot

4) recovery starts on sdc, using sda and sdb-old

5) I nod off, the screensaver kicks in, recovery stops. I shut down, try to reboot. Reboot fails because recovery is not complete

6) pull sdb-old, plug in sdb-new, boot with a liveCD

7) by renaming the arrays that mdadm wrote to /etc/mdadm.conf, I can re-start md125 and md127. I add the appropriate partitions from sdb-new, and they rebuild and are fine

8) I cannot, however, assemble md126. sda4 is fine, but sdb4 is empty so I don't want to add it yet, and sdc4 still shows as "rebuilding". So two drives are out, and the array can't start.

So how can I proceed here? I'm not sure why the initramfs won't let me use the keyboard (it's USB) but maybe I could rebuild the array with sdb-old from the liveCD?

Cheers,

EE

----------

## ExecutorElassus

UPDATE3: I tried putting sdb-old back in the machine and completing recovery from the liveCD, but it failed. So I'm back where I was. Should I try again to recover? It slowed way down at about 80%, then sdb(old)4 got set (F), sdc4 got set (S), and the recovery stopped and the array went inactive.

What now?

EE

----------

## NeddySeagoon

ExecutorElassus,

 *Quote:*   

> 1) I unplug sdc in error, and boot 

 

This brings up the array in degraded mode on sd[ab]4

 *Quote:*   

> 2) I see I've unplugged the wrong HDD. Shut down, plug it back in, unplug sdb-old, reboot
> 
> 3) boot fails, because sdc was kicked from the array, and sdb-old is unplugged. shut down, plug in sdb-old, boot 
> 
> 

 

This won't start as sd[ac]4 now have different event counts. You said that.

At the outset sd[abc]4 were self consistent.

Other than the rebuild (which should do nothing on a self-consistent set), what writes were there when sdc4 was unplugged?

If you are 100% sure there were none, then force is safe. If you get that wrong, horrible things will happen.

Here's the safe approach.

Partition the replacement sdb. Do not try to do anything with md126.

Use ddrescue to image the dying sdb4 onto its replacement. Post the ddrescue log.

If ddrescue is successful, sda4 and the new sdb4 can continue to rebuild sdc4.

This will be a bit of a mess as you will have three raid sets spread over four drives.

While ddrescue does its thing, read up on the mdadm --replace command.

What you should have done is add the new drive to the system.

Partition the new drive, then with all the raids working on three drives, --replace the sdb elements.

When replace completes, it drops the sdb element out of the array. 

Replace has the advantage of building the new element from any n-1 of the n elements in the array, which can vary as the replace progresses.

----------

## NeddySeagoon

ExecutorElassus,

No, don't pull the bad one until it's been replaced.

----------

## ExecutorElassus

Hi Neddy,

I'm certain there was nothing going on. I booted, started the array recovery, started the wm, and then recovery stopped (see above). 

How do I find ddrescue? It's not on the liveCD. And I'm going to have to find an extra set of cables, as I think I'm full.

But I'm done for tonight. Have to sleep.

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

dmesg will tell why the recovery stopped.

Maybe a read error?

The WM will write files.

----------

## ExecutorElassus

I tried to re-start recovery by booting a liveCD and re-starting the array with sdb-old. It resumed from 74%, but then at about 80% the throughput started slowing down considerably, then sdb-old was set as (F) and sdc set (S) and both dropped from the array and the array set inactive.

So what now? try to re-start recovery? Try to use ddrescue with both sdb-old and sdb-new plugged in?

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

The 80%, where sdb-old was marked as (F) is where ordinary retries failed and the kernel gave up reading the drive.

You need to try harder to read sdb-old. That's what ddrescue does. Only sdb4-old though.

Use ddrescue to image sdb4-old onto sdb4-new.

If you take both sdb-old and sdb-new to another machine, you can image all of sdb and the other raid sets will adopt it as their own when you put sdb-new back in place of sdb-old.

Once sdb4-old is imaged onto sdb4-new the raid rebuild will continue. 

We will need to check the ddrescue log because any data not recovered will be rubbish, but the kernel will happily read that and build it into your raid rebuild.

It's essential that you squeeze ddrescue until the pips squeak.

ddrescue tries much harder than the kernel to get your data off the drive, so it may well work where the kernel fails. It will get most of the data as fast as the kernel, then it will slow down.

With only a few blocks to go, I've seen it as slow as one physical block per hour.

----------

## ExecutorElassus

Hi Neddy,

ddrescue has been running now for two hours, reporting 1h15m left to go. "Current rate" usually stays around 110MB/s, but then stops for a few seconds every 10s or so. The "average rate" has been steadily dropping from 120MB/s down to 70MB/s now. I'm at 53% complete.

the command I used was:

```
ddrescue -f -n /dev/sdb4 /dev/sdd4 /root/rescue.log
```

Once that completes, the guide tells me I should use

```
ddrescue -d -f -r3 /dev/sdb4 /dev/sdd4 /root/rescue.log
```

to attempt to recover any failures.

Assuming this completes (hopefully just the first, but if not, both), what is the next step?

If it keeps slowing down, can/should I really just let it keep running?

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

The slowdown is a feature of HDD, not ddrescue.

Rotating rust HDDs are 'zoned'. The number of sectors per physical track is not constant, so the read rate is higher near the edge of the platter than near the spindle.

A factor of 2x or more is common.  Each zone closer to the spindle has fewer sectors per track, so there is less data per revolution of the platter.

The file /root/rescue.log is a text file. You can read it any time.

A + at the end of the line is recovered.

A - at the end of the line indicates a problem.

From memory, (I don't have a sample to hand), the columns are start, end and size in hex bytes.

Size is always an integer multiple of the physical block size. That's 0x200 for 512B blocks, or 0x1000 for 4kB blocks.

ddrescue is just dd with some extra tricks. The extra tricks are only used as required.

When ddrescue stops, post the log. We will tell it to try harder and use a few more tricks that ddrescue can't do unaided.

When you rerun ddrescue with an existing log file, it knows what has already been recovered and won't try that data again.
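If you want to watch the number come down, you can total the not-yet-recovered bytes straight from the map. A sketch against a toy mapfile (these offsets are made up, not yours; point the loop at /root/rescue.log for the real thing):

```shell
#!/bin/sh
# Total the bytes in a ddrescue mapfile that are not yet fully recovered.
# Toy mapfile for illustration only.
cat > /tmp/sample.log <<'EOF'
# Mapfile. Created by GNU ddrescue version 1.21
# current_pos  current_status
0xE22BDDC000  +
#      pos          size      status
0x00000000    0xAAA4A18000  +
0xAAA4A18000  0x00000200    -
0xAAA4A18200  0x00000C00    /
0xAAA4A18E00  0x00000200    -
EOF

bad=0
while read -r pos size status; do
    case "$pos" in '#'*|'') continue ;; esac   # skip comments and blank lines
    [ -z "$status" ] && continue               # skip the 2-field current_pos line
    [ "$status" = '+' ] && continue            # '+' regions are fully recovered
    bad=$(( bad + size ))                      # shell arithmetic accepts 0x literals
done < /tmp/sample.log
echo "$bad bytes still unrecovered"            # 0x200 + 0xC00 + 0x200 = 4096
```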

----------

## ExecutorElassus

Hi Neddy,

ddrescue finished. Here's what its status says:

```
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos:  971398 MB,  non-trimmed:       0 B,   current rate:   5461 B/s
     opos:  971398 MB,  non-scraped:   1000 kB,   average rate:  57412 kB/s
non-tried:       0 B,       errsize:   36352 B,       run time:  4h 47m 8s
  rescued:  989102 MB,       errors:       71,  remaining time:         3m
percent rescued:  99.99%      time since last successful read:          0s
Finished
zsh: bus error   ddrescue -f -n /dev/sdb4 /dev/sdd4 /root/rescue.log
```

/root/rescue.log reads like this:

```
# Mapfile. Created by GNU ddrescue version 1.21
# Command line: ddrescue -f -n /dev/sdb4 /dev/sdd4 /root/rescue.log
# Start time:   2019-02-09 10:56:52
# Current time: 2019-02-09 15:44:00
# Finished
# current_pos  current_status
0xE22BDDC000    +
#      pos          size      status
0x00000000    0xAAA4A18000  +
0xAAA4A18000  0x00000200    -
0xAAA4A18200  0x00000C00    /
0xAAA4A18E00  0x00000200    -
0xAAA4A19000  0x000BC000    +
```

That goes on for quite a while. Do you want me to copy that all out?

what should I do now?

Thanks,

EE

----------

## NeddySeagoon

ExecutorElassus,

```
#      pos          size      status
0x00000000    0xAAA4A18000  +
0xAAA4A18000  0x00000200    -
0xAAA4A18200  0x00000C00    /
...
```

The first line tells of recovered data from the beginning of the drive. That's good.

The second line tells of a single disk block not yet recovered.

The third line tells of 6 consecutive blocks that ddrescue didn't try very hard with. 0xC00/0x200.

Make sure you can handle the failing drive while ddrescue runs. The idea now is to tell ddrescue to try harder while you operate the drive the right way up, upside down and on each edge in turn.

It's OK to move the drive slowly while it operates.

Lets look at the options ...

```
       -d, --idirect

              use direct disc access for input file
```

That's good.

```
      -r, --retry-passes=<n>
```

You used retry=3 ... 256 is better, especially as you will move the drive between six orientations.

```
       -A, --try-again

              mark non-trimmed, non-scraped as non-tried
```

That's like a reset between all 256 attempts.  Add that. 

You have 

```
errsize:          36352  B
```

 more bytes to get. That's decimal, not hex.

If its still not got all the data after the above, add in 

```
       -R, --reverse

              reverse the direction of all passes
```

so it starts from the end of the drive.  This is really slow, as it adds a lot of latency.

```
# Command line: ddrescue -f -n /dev/sdb4 /dev/sdd4 /root/rescue.log 
```

You told it not to try too hard with -n.

```
ddrescue -f -r 256 -d -A /dev/sdb4 /dev/sdd4 /root/rescue.log 
```

looks good; try all four edges and both faces while that runs.

If it's still not got all your data, add -R to the above command and do it again.

----------

## ExecutorElassus

All right, now I'm running the command

```
ddrescue -d -f -r20 -A /dev/sdb4 /dev/sdd4 /root/rescue.log
```

it's presently scraping failed blocks. The errsize is 163 kB, errors is 118, and it says it has 10 hours to go. Last successful read goes up to about 30s, sometimes 3m, before going back to 0.

Obviously, I won't be able to stand here and rock the HDD back and forth for 10 hours (and since the cable salad in the machine is kinda dense, it's not easy anyway).

Am I doing OK so far here?

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

```
non-tried:             0   B   errsize:          36352  B      run time: 4h 47m   8s
```

That's only 36kB of error.  It should never increase.

Are you copy typing inaccurately?

-r20 is OK, turn the drive every time ddrescue stops and rerun the command.

It's only trying the hard/impossible-to-read sectors now, so it's going to be slow.

There is no need to rock the drive continuously.  I use a large -r and turn the drive from time to time.

Every pass in the -r20 tries all the unread areas only.  All that matters is that the errsize reduces.  

The error count can increase if it gets a block out of the middle of a larger block of errors.

When you go away for the night, switch the system off and let the failed drive cool.

In the morning, you may find the cool drive reads better.

Do not put the drive in the fridge/freezer to aid cooling.  

All you are trying to do is to coax one last read of the 'failed' sectors.

----------

## ExecutorElassus

Hi Neddy,

the center column now reads:

```
non-trimmed: 0       non-scraped:     357376 [this number is decreasing]     errsize:   366  kB  [this number is increasing]   errors:    195
```

is that bad?

To shut down for the night, can I just do 'init 0' from another TTY? Will it pick up again in the morning where it left off if the Mapfile disappears? (this is a liveCD)

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

The log file must persist from run to run, or ddrescue will start over again.

That's not what you want/need with a dying drive.

You can save the log, presently, /root/rescue.log to a USB stick and keep it there for future runs.

----------

## ExecutorElassus

Hi Neddy,

it's now on the first retry. The error count was at 270 when it started, and now it's down to 268. Errsize is 574kB. It says it has 14h 30m remaining, so I'm going to let it run another few hours, then Ctrl-C, copy the file to a USB stick, and shut it down.  Is that more or less the correct procedure?

Thanks,

EE

----------

## NeddySeagoon

ExecutorElassus,

That's it.  When you start it up again, leave the log on the USB stick and point ddrescue to it there.

Then it will be maintained on the USB stick.

```
Errsize is 574kB
```

That's a lot.

The damage depends on what is stored there, so that needs to be minimised.

Don't be in a hurry :)

----------

## ExecutorElassus

Hi Neddy,

well, the first parts of the partitions that live inside the logical volume from that array are system partitions (/usr, /var, /etc) and /home, but that altogether is maybe only a couple hundred GB out of the array. The vast majority of the later partitions are all just mass storage of media, and I can live with them suffering some data corruption.

errsize is now 560kB, errors is 255.

the errors in the RAID recovery only started at around 75%; how likely is it that those errors affect those later partitions, and not anything necessary to assemble the array or boot?

My / is on a separate array that is properly synced with the new drive, so it's really only a matter of getting it to assemble at boot time.

But I'll shut it down in a couple hours, let it cool off, and start again in the morning.

Is there anything else I can do?

Oh, one other question: how do I prevent this array from attempting to re-assemble on boot?

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

You need to know the mdadm and LVM data structures on the drive.

I think mdadm ver 1.2 puts its metadata at the start of the raid volume. That's what mdadm -E shows you.

I think LVM puts its metadata at the start of the volume group too.

That would make the metadata safe if it's correct.

I know that extents are allocated to logical volumes from the start of the volume group but it all gets messy when you extend a logical volume. 

Previously we had

```
#       pos               size     status

0x00000000   0xAAA4A18000   + 
```

That's 732,906,487,808 bytes good from the start of the partition.

We also know 

```
Recovery Offset: 1389843520 sectors
```

 which is 711,599,882,240 bytes from the start of the raid set.

The start of the partition and start of the raid set are not quite identical.

It looks like the first 1.5TB approx of user data space is intact.

The worrying things are filesystem metadata. Some is at the start of the filesystem, some is dynamically created/destroyed, e.g. directories.

And the writes that the raid sustained while sdc4 was disconnected.

We know that the first 1.5TB approx of user data space was resynced ...

If you use the trick with overlay filesystems, you can assemble that raid and have a look round without writing to the HDD at all.

The raid metadata changes due to using --force will go to the overlay.

If you don't like what you see, the changes are all in the overlay, so will drop out. I understand the theory but I've never done it.

It may come down to rock ... hard place.

----------

## ExecutorElassus

What is the overlay trick? I found this page, but it's confusing to me.

Right now, this array is identified as /dev/md127. It stopped recovery when the old sdb4 stopped reading, so sdc4 is still not recovered. But I couldn't stop the array, so I removed each sdX4 from it. Now I still can't stop the array. So, later on, when I get to the point where ddrescue either completes or gives up, how do I put the array back together again and test it with the overlay trick?

ddrescue is now on Retry 2. errsize is 531kB, errors is 226

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

Read all of that page but pay particular attention to the section Making the harddisks read-only using an overlay file 

It's a shame that the page uses parallel everywhere. That makes it hard to read and understand.

All parallel does is to run the command inside the single quotes several times at the same time.

Its so rarely useful to me that I never use it.

So  

```
parallel 'test -e /dev/loop{#} || mknod -m 660 /dev/loop{#} b 7 {#}' ::: $DEVICES
```

sets up lots of /dev/loopX devices in one line.

You need three, one each for sd[abc]4.

```
parallel truncate -s4000G overlay-{/} ::: $DEVICES
```

creates overlay files.  I recalled this being done with USB devices.

That was incorrect. The advice was to practice on something expendable, like USB devices.

Like you, I shuddered reading this for a broken 5 spindle raid, that was mostly my DVD collection.
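If parallel isn't to hand, a plain for loop does the same job. A sketch of the overlay-file step only, using small files in /tmp so it's harmless to try (the wiki's losetup/dmsetup snapshot steps still follow, and need root, so they're not shown here):

```shell
#!/bin/sh
# Unrolled version of the 'parallel truncate' one-liner: one sparse
# overlay file per raid member. truncate makes sparse files, so even
# -s4000G would cost almost no real space until writes land in it.
for dev in sda4 sdb4 sdc4; do
    truncate -s 100M "/tmp/overlay-$dev"
done
ls -l /tmp/overlay-*
```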

----------

## ExecutorElassus

Hi Neddy,

well, I don't have parallel on this LiveCD, so how would I create all these overlays without it? Also, I don't have a USB stick nearly big enough for three 1TB disks.

But I'll have to look into that in the morning. Before I shut down: How do I make sure the LiveCD doesn't try to re-start this array when I boot back up, altering the data further and using the old drive?

errsize is now 528kB, errors is 228. I'm on Retry 3.

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

It's OK to let it assemble the raid and fail.  If it can't be assembled, it can't be used.

You could also unplug sda and sdc.

You don't need much space for the overlay filesystems.  They will only contain the writes that would have gone to the HDD.

As long as you don't sync into the overlay :)

I have no idea how much is enough.

----------

## ExecutorElassus

OK, so I'll interrupt ddrescue, copy rescue.log to a usb stick, and re-start it in the morning.

Can I stop /dev/md127 while it's still attempting to rebuild when I boot? For some reason, even with all drives removed, it still says it can't get exclusive access to /dev/md127.

errsize: 525, errors: 229. Retry 4

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

Yes. Stopping the rebuild is safe. However, a rebuild allows the raid to be used normally. You don't want that in case there are writes.

----------

## ExecutorElassus

Hi Neddy,

good morning. On reboot, mdraid refused to build the array (which, given that I don't want it assembled and writing, is good). Here's what /proc/mdstat now shows:

```
cat /proc/mdstat
Personalities [SNIP]
md1: active raid1 sdb1[1]
      97536 blocks [3/1] [_U_]
md124: active raid1 sdc3[0] sda3[2] sdd3[1]
      9765505 blocks [3/3] [UUU]
md126: inactive sdb4[3](S)
      965920692 blocks super 1.2
md127: inactive sda4[2](S) sdc4[4](S) sdd4[3](S)
      2897762076 blocks super 1.2
md125: active raid1 sdc1[0] sda1[2] sdd1[1]
      97536 blocks [3/3] [UUU]
```

I'm a bit worried that md127 seems to think that it has four or more members, but it seems like at least the system recognized the metadata of sdd4 and could conceivably restart the array if I forced it. But now I'm re-running ddrescue; errsize is 524kB, errors 227.

Anything else I should do in the meantime?

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

It sees sd[abcd]4 all with the same raid UUID. That's expected.

ddrescue has copied the raid metadata already.

If you run mdadm -E /dev/sd[bd]4 you will see that they are both in the same slot, as one is almost a copy of the other.

With software raid, what goes where in terms of hardware is not important.

mdadm will find all the bits if you assemble by uuid.  You don't want any clones lying around when the time comes though.

-- edit --

Keep rerunning ddrescue with gravity assist in various directions.

The idea is to use gravity to help compensate for bearing wear, by running on a lesser worn part of the bearings.

It seems to help, even with modern air bearings, where there is no contact once the platter is close to the nominal speed.

Just one more read ...

----------

## ExecutorElassus

Hi Neddy,

unfortunately, the power connector for sdd is too tight to turn the drive much around. I could maybe shut down again, unplug sda and sdc, and keep going, trying the different orientations that way. But right now it looks like it's recovered about 10kB in the last 3h40m, which would mean it would take about 4 days for everything. I … maybe could live with that, if it meant I got all my data back.

If I added sdd into the array and tried to force assembly with four members, would that make it impossible to recover if I went back down to three (with sdc still marked as partially rebuilt)?

Errsize 515kb, errors 225

Cheers,

EE

----------

## NeddySeagoon

ExecutorElassus,

It will do better if you can move the drive. Can you rearrange power cables?

Typically, you try a new face/edge and you get lots of good reads very quickly.

The data recovery rate is not linear.  It's doing retries over those 515kb spread across 225 regions. If it gets lucky and some data is read, it won't try that sector again.

You can't assemble the raid with four members. Two are in the same slot, you have two sdb4s.

It should be safe to bring up the raid read only in degraded mode with sdb-new and sda.

It's mdadm that gets the read-only option, not the filesystem.  You only want to look.

```
       -o, --readonly
              Start the array read only rather than read-write as normal. No
              writes will be allowed to the array, and no resync, recovery, or
              reshape will be started. It works with Create, Assemble, Manage
              and Misc mode.
```

You need to examine every file to see what's damaged.

If you bring up the array in degraded mode with sdb-old and sda, you can try to cp -a the filesystem to /dev/null.

If it finishes, it didn't read any damaged blocks, so you don't need to recover them. 

The first read error will tell the first encountered damaged file and the cp will stop.
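tar streamed to a discarded archive does the same whole-tree read test. A sketch on a scratch directory (substitute the raid's mountpoint for the real check; note the archive goes to stdout rather than -f /dev/null, since GNU tar is said to special-case the latter and skip reading file data):

```shell
#!/bin/sh
# Read every file under a tree and throw the bytes away; the first hard
# read error aborts the sweep and names the offending file.
mkdir -p /tmp/sweepdemo
echo "payload" > /tmp/sweepdemo/file1
tar cf - /tmp/sweepdemo > /dev/null && echo "sweep clean: every file read OK"
```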

You can try the copy --readonly with all three original drives too.  That will be interesting.

I'm not sure what happens with the rebuild in progress. It may not use any data from sdc4 after the rebuild progress limit.

All this ddrescuing may have caused bad blocks to be reallocated on sdb too, in which case, it may appear to have partly healed. 

You really do need to try all 6 faces/edges.

----------

## ExecutorElassus

Hi Neddy,

alright, I found a spare cable, shut down, plugged it in, and restarted. Now I'm rotating the drive around on retries. I'll let you know when it completes. 

Errsize 512 kB, errors 225

Cheers,

EE

----------

## ExecutorElassus

Hi Neddy,

I'm looking at the metadata for the four drives, and I have a couple questions. 

sdb4 and sdd4 both show as Active Device 1 of an array with three active members and an event count of 1343686. This is expected, as they should be copies of one another.

But sda4 and sdc4 both show as clean members of an array with two missing and one active, with sda4 set as Active device 2 and sdc4 set as spare. sda4 has an event count of 1347561, sdc4 has 1347560.

Is all of this something I can get into a workable state once ddrescue finishes? How do I start the array in degraded state?

ddrescue reported an I/O error with the USB stick, so I emergency-saved to a local file, swapped out the stick, and resumed. Errsize is now 503, errors 226. 

By that count, I'm managing to recover about 3kB/hr, which means it would take another six days to recover everything (if that's even possible). I'm rotating the drive around now every retry.

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

Where did the data in this post come from?

How did you post it: copy/paste or copy typing?

----------

## ExecutorElassus

Hi Neddy,

Copy typing. The metadata for sdc4 no longer has a recovery offset. The update time is now yesterday, and there is a Bad Block log entry with 512 entries available at offset 264 sectors.

This is why I was asking about assembling the array. It's possible that mdadm assembled the array at some point and reset the disk to clean.

But if it's a spare in the new array, it would just get overwritten in rebuild anyway, yes?

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

It looks like we will need to do the --create I wanted to avoid after all.

Both /dev/sda4 and /dev/sdc4 show 

```
    Device Role: Active device 2
```

which can't be correct.

With three devices, there are six combinations for Role.

The Feature Map differs too: Feature Map : 0x0 vs. Feature Map : 0x2.

If we end up doing a --create we need a complete set of reliable metadata.

What does mdadm -E sdb-new say?

Please don't copy-type. Unless we know the metadata exactly, it can't be fed back to --create.

I believe 

```
    Raid Level: raid5

    raid Devices: 3 

    Layout: left-symmetric

    Chunk Size: 512k 
```

but what about 

```
    feature Map:

    Data Offset: 2048 sectors

    Super Offset: 8 sectors

    Unused Space: before=1968 sectors, after=872 sectors 
```

it all has to be spot on because the defaults keep changing.
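If it comes to that, diffing the dumps mechanically beats eyeballing them. A throwaway sketch (the function names and field list are made up; the input is just the plain text that mdadm -E prints, saved to files and read in however you like):

```python
# Diff the critical fields of two `mdadm -E` dumps instead of eyeballing them.

def parse_examine(text):
    """Parse 'Key : value' lines of an mdadm -E dump into a dict."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

# Fields that must agree before feeding anything back to --create.
CRITICAL = ("Raid Level", "Raid Devices", "Layout", "Chunk Size",
            "Data Offset", "Super Offset", "Feature Map")

def diff_critical(a, b):
    """Return {field: (value_a, value_b)} for critical fields that differ."""
    return {k: (a.get(k), b.get(k)) for k in CRITICAL if a.get(k) != b.get(k)}
```

An empty result from diff_critical means the dumps agree on everything --create needs; anything it reports is a field to resolve first.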

Quoting from the post I linked 

```
    Update Time: fri Feb 8 12:13:10 2019

    Checksum: fdedb47b - correct

    Events: 1340931 
```

Now it's  *Quote:*   

> sda4 has an event count of 1347561, sdc4 has 1347560

That's moved on by about 7000 writes. That's worrying.

Maybe it's the recovery?

----------

## ExecutorElassus

Hi Neddy,

one correction: 

sdc4's metadata shows it as "spare", so only sda4 is Active device 2. Does this change anything?

Cheers,

EE

Addendum: as it's getting late again, and it's only on retry 7 of 20, should I let it run overnight, or shut down and let it cool off again? Errsize 500 kB, errors 224.

----------

## NeddySeagoon

ExecutorElassus,

Let it run all night.

We still need accurate metadata for the raid set, even if the drive slots have been lost.

----------

## ExecutorElassus

Hi Neddy,

having realized I can ssh into the machine from my laptop, here is what mdadm -E /dev/sdX4 reports:

```
/dev/sda4:

          Magic : a92b4efc

        Version : 1.2

    Feature Map : 0x0

     Array UUID : d42e5336:b75b0144:a502f2a0:178afc11

           Name : domo-kun:carrier

  Creation Time : Wed Apr 11 00:10:50 2012

     Raid Level : raid5

   Raid Devices : 3

 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)

     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)

  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)

    Data Offset : 2048 sectors

   Super Offset : 8 sectors

   Unused Space : before=1968 sectors, after=872 sectors

          State : clean

    Device UUID : 4a8d21e3:15026b07:bfacaedc:b5160599

    Update Time : Sat Feb  9 11:01:48 2019

       Checksum : 8b9a9869 - correct

         Events : 1347561

         Layout : left-symmetric

     Chunk Size : 512K

   Device Role : Active device 2

   Array State : ..A ('A' == active, '.' == missing, 'R' == replacing)

```

```
 % mdadm -E /dev/sdb4

/dev/sdb4:

          Magic : a92b4efc

        Version : 1.2

    Feature Map : 0x0

     Array UUID : d42e5336:b75b0144:a502f2a0:178afc11

           Name : domo-kun:carrier

  Creation Time : Wed Apr 11 00:10:50 2012

     Raid Level : raid5

   Raid Devices : 3

 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)

     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)

  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)

    Data Offset : 2048 sectors

   Super Offset : 8 sectors

   Unused Space : before=1968 sectors, after=872 sectors

          State : clean

    Device UUID : 6484cb2a:b50e63db:eead2787:af47cecc

    Update Time : Sat Feb  9 10:49:30 2019

       Checksum : 857fc458 - correct

         Events : 1343686

         Layout : left-symmetric

     Chunk Size : 512K

   Device Role : Active device 1

   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)

```

```
 % mdadm -E /dev/sdc4

/dev/sdc4:

          Magic : a92b4efc

        Version : 1.2

    Feature Map : 0x8

     Array UUID : d42e5336:b75b0144:a502f2a0:178afc11

           Name : domo-kun:carrier

  Creation Time : Wed Apr 11 00:10:50 2012

     Raid Level : raid5

   Raid Devices : 3

 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)

     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)

  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)

    Data Offset : 2048 sectors

   Super Offset : 8 sectors

   Unused Space : before=1768 sectors, after=872 sectors

          State : clean

    Device UUID : 9a99d7ad:9b5a9b75:42cb3258:cfb40e04

    Update Time : Sat Feb  9 10:56:23 2019

  Bad Block Log : 512 entries available at offset 264 sectors - bad blocks present.

       Checksum : ab16b9a7 - correct

         Events : 1347560

         Layout : left-symmetric

     Chunk Size : 512K

   Device Role : spare

   Array State : ..A ('A' == active, '.' == missing, 'R' == replacing)

```

```
% mdadm -E /dev/sdd4

/dev/sdd4:

          Magic : a92b4efc

        Version : 1.2

    Feature Map : 0x0

     Array UUID : d42e5336:b75b0144:a502f2a0:178afc11

           Name : domo-kun:carrier

  Creation Time : Wed Apr 11 00:10:50 2012

     Raid Level : raid5

   Raid Devices : 3

 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)

     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)

  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)

    Data Offset : 2048 sectors

   Super Offset : 8 sectors

   Unused Space : before=1968 sectors, after=1951911088 sectors

          State : clean

    Device UUID : 6484cb2a:b50e63db:eead2787:af47cecc

    Update Time : Sat Feb  9 10:49:30 2019

       Checksum : 857fc458 - correct

         Events : 1343686

         Layout : left-symmetric

     Chunk Size : 512K

   Device Role : Active device 1

   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)

```

I got your last message too late, so I'd already shut down for the night. It's up and running again now, this time via ssh from my laptop, so I can copy stuff as needed.

ddrescue has now been running for five hours today. Errsize is 497 kB, errors is 227

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

Lovely ...

```
/dev/sda4: 

...

    Update Time : Sat Feb  9 11:01:48 2019

       Checksum : 8b9a9869 - correct

         Events : 1347561

... 

   Device Role : Active device 2

/dev/sdb4: 

...

    Update Time : Sat Feb  9 10:49:30 2019

       Checksum : 857fc458 - correct

         Events : 1343686 

...

   Device Role : Active device 1

/dev/sdc4:

...

     Update Time : Sat Feb  9 10:56:23 2019

  Bad Block Log : 512 entries available at offset 264 sectors - bad blocks present.

       Checksum : ab16b9a7 - correct

         Events : 1347560 

...

   Device Role : spare 

 
```

and /dev/sdd4 is, or should be, a copy of /dev/sdb4, but 

```
   Unused Space : before=1968 sectors, after=1951911088 sectors 
```

That should match the metadata on sdb4, but it doesn't.

All the entries 

```
 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)

     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)

  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)

    Data Offset : 2048 sectors

   Super Offset : 8 sectors

   Unused Space : before=1968 sectors, after=872 sectors 
```

should be identical too.

Again, they differ.

We may need to try assembling all the combinations of two drives to see what happens.

At face value sdc4 does not look too healthy.

I've not seen 

```
Bad Block Log : 512 entries available at offset 264 sectors - bad blocks present.
```

before but it wasn't there in your previous metadata post.

----------

## ExecutorElassus

Hi Neddy,

Well, as for the space after on sdd4: the new drive is 2TB instead of 1TB, so having much more space afterwards is expected. That also explains the discrepancy in the Avail/Used Dev Size entries.

So the only remaining issue should be the bad block data that's recently appeared on sdc4, yes? Is it possible that this data got put in during one of the attempts at rebuilding?

In any case, ddrescue has now been running since morning. After just under 8h, Errsize is 496, errors 227. How should we proceed?

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

You are in the middle of doing a dd from sdb4 to sdd4.

That means the data being copied is *identical*: dd is a low-level, block-by-block copy.

dd is uninformed of the meaning of whatever is being copied.

When you are fed up with ddrescue, try to assemble the raid from sda4 and sdb4 using mdadm --assemble --readonly

That's two old drives. If it assembles, mount it and look around. Even if it assembles, it may not mount; that requires the filesystem metadata to be intact too. 

```
/dev/sda4:

...

    Update Time : Sat Feb  9 11:01:48 2019

       Checksum : 8b9a9869 - correct

         Events : 1347561

...

   Device Role : Active device 2

/dev/sdb4:

...

    Update Time : Sat Feb  9 10:49:30 2019

       Checksum : 857fc458 - correct

         Events : 1343686

...

   Device Role : Active device 1 
```

That's about 4000 events missing.

To add sdc4, we need to know where it goes. It's either Active device 0 or Active device 3.

One of us needs to read up on how mdadm numbers Active devices.

If that works, try sda4 and sdd4 always with mdadm --assemble --readonly.

Since we have the metadata in this thread, you can try --force too.

If --force won't work, all that's left is --create but that is very much a last ditch thing.

----------

## ExecutorElassus

Hi Neddy,

from this

```
 cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 

md1 : active raid1 sdb1[1]

      97536 blocks [3/1] [_U_]

      

md124 : active raid1 sdc3[0] sda3[2] sdd3[1]

      9765504 blocks [3/3] [UUU]

      

md126 : inactive sdb4[3](S)

      965920692 blocks super 1.2

       

md127 : inactive sda4[2](S) sdc4[4](S) sdd4[3](S)

      2897762076 blocks super 1.2

       

md125 : active raid1 sdc1[0] sda1[2] sdd1[1]

      97536 blocks [3/3] [UUU]

      

unused devices: <none>

```

It looks like sdcN has always been set as Active device 0 on the arrays, so I think it's probably the same for what is here given as md127 (where it's listed as Active device 4).

ddrescue has been running now for almost nine hours and has rescued 3 more kB. I think I'll probably call it quits with that in a bit.

What's the command to assemble the array? Do I need to name it? Given what /proc/mdstat says, what command do I use to assemble what's listed here as md127? Also, it's important to note that what's on the array is a bunch of logical partitions which will themselves have to be mounted. How can I do that from the liveCD?

thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

Here's one of my raid sets ... just one drive.

```
# mdadm -E /dev/sd[abcd]5

/dev/sda5:

          Magic : a92b4efc

        Version : 0.90.00

           UUID : 5e3cadd4:cfd2665d:96901ac7:6d8f5a5d

  Creation Time : Sat Apr 11 20:30:16 2009

     Raid Level : raid5

  Used Dev Size : 5253120 (5.01 GiB 5.38 GB)

     Array Size : 15759360 (15.03 GiB 16.14 GB)

   Raid Devices : 4

  Total Devices : 4

Preferred Minor : 126

    Update Time : Sat Jun 16 17:20:52 2018

          State : clean

Internal Bitmap : present

 Active Devices : 4

Working Devices : 4

 Failed Devices : 0

  Spare Devices : 0

       Checksum : 80b12c93 - correct

         Events : 108

         Layout : left-symmetric

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State

this     0       8        5        0      active sync   /dev/sda5

   0     0       8        5        0      active sync   /dev/sda5

   1     1       8       21        1      active sync   /dev/sdb5

   2     2       8       37        2      active sync   /dev/sdc5

   3     3       8       53        3      active sync   /dev/sdd5
```

So it appears that the role starts at 0. [This page](https://raid.wiki.kernel.org/index.php/Mdadm-faq) supports that.

From the man page.

```
ASSEMBLE MODE

       Usage: mdadm --assemble md-device options-and-component-devices...
```

Slot 0 is missing so 

```
mdadm --assemble /dev/md1 --readonly missing /dev/sdb4 /dev/sda4
```

should bring up /dev/sdb4 /dev/sda4 as a degraded raid set on /dev/md1

You may need to tell it to --run if it assembles but does not start.

----------

## ExecutorElassus

Hi Neddy, 

here's what I tried:

```
 % mdadm --assemble /dev/md1 --readonly missing /dev/sdb4 /dev/sda4

mdadm: cannot open device missing: No such file or directory

mdadm: missing has no superblock - assembly aborted

root@sysresccd /root % mdadm --assemble /dev/md1 --readonly /dev/sdb4 /dev/sda4 

mdadm: /dev/sdb4 is busy - skipping

mdadm: /dev/sda4 is busy - skipping

```

then I stopped the arrays that were running but inactive. Then:

```
 % mdadm --assemble /dev/md127 --readonly /dev/sdb4 /dev/sda4 

mdadm: /dev/md127 assembled from 1 drive - not enough to start the array.

```

Now 'cat /proc/mdstat' shows:

```
cat /proc/mdstat                                      

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 

md127 : inactive sdb4[3](S) sda4[2](S)

      1931841384 blocks super 1.2

       

md124 : active raid1 sdc3[0] sda3[2] sdd3[1]

      9765504 blocks [3/3] [UUU]

      

md125 : active raid1 sdc1[0] sda1[2] sdd1[1]

      97536 blocks [3/3] [UUU]

      

unused devices: <none>

```

What now?

thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

Don't use /dev/md127. It may be in config files, and it's certainly in the raid metadata as the preferred minor number, so choose a new md number.

I hadn't thought of stopping the already running single-drive arrays. That was the right thing to do.

----------

## ExecutorElassus

Hi Neddy,

So now:

```
 % mdadm --stop /dev/md127

mdadm: stopped /dev/md127

root@sysresccd /root % mdadm --assemble /dev/md2 --readonly /dev/sdb4 /dev/sda4 

mdadm: /dev/md2 assembled from 1 drive - not enough to start the array.

root@sysresccd /root % cat /proc/mdstat                                        

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 

md127 : inactive sdb4[3](S) sda4[2](S)

      1931841384 blocks super 1.2

       

md124 : active raid1 sdc3[0] sda3[2] sdd3[1]

      9765504 blocks [3/3] [UUU]

      

md125 : active raid1 sdc1[0] sda1[2] sdd1[1]

      97536 blocks [3/3] [UUU]

      

unused devices: <none>

```

It still won't assemble, and mdadm is apparently renaming the array back to its preferred name automatically.

What should I try next?

thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

```
md127 : inactive sdb4[3](S) sda4[2](S)

      1931841384 blocks super 1.2 
```

Try adding in --force

----------

## ExecutorElassus

Hi Neddy,

now we're here:

```
% mdadm --assemble /dev/md2 --readonly --force /dev/sdb4 /dev/sda4

mdadm: forcing event count in /dev/sdb4(1) from 1343686 upto 1347561

mdadm: /dev/md2 has been started with 2 drives (out of 3).

root@sysresccd /root % cat /proc/mdstat                                                

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 

md2 : active (read-only) raid5 sdb4[3] sda4[2]

      1931840512 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]

      

md124 : active raid1 sdc3[0] sda3[2] sdd3[1]

      9765504 blocks [3/3] [UUU]

      

md125 : active raid1 sdc1[0] sda1[2] sdd1[1]

      97536 blocks [3/3] [UUU]

      

unused devices: <none>

```

So, should I mount it and look around?

EDIT: right, as I said, this raid array is an LVM physical volume. So here's what I get:

```
 % mkdir raidtest

root@sysresccd /root % mount /dev/md2 raidtest/

mount: /root/raidtest: unknown filesystem type 'LVM2_member'.
```

Is there a way to start the LVM on a read-only array?

Thanks for the help,

EE

PS, here's the info on the array:

```
% mdadm -D /dev/md2

/dev/md2:

           Version : 1.2

     Creation Time : Wed Apr 11 00:10:50 2012

        Raid Level : raid5

        Array Size : 1931840512 (1842.35 GiB 1978.20 GB)

     Used Dev Size : 965920256 (921.17 GiB 989.10 GB)

      Raid Devices : 3

     Total Devices : 2

       Persistence : Superblock is persistent

       Update Time : Sat Feb  9 10:49:30 2019

             State : clean, degraded 

    Active Devices : 2

   Working Devices : 2

    Failed Devices : 0

     Spare Devices : 0

            Layout : left-symmetric

        Chunk Size : 512K

Consistency Policy : resync

              Name : domo-kun:carrier

              UUID : d42e5336:b75b0144:a502f2a0:178afc11

            Events : 1347561

    Number   Major   Minor   RaidDevice State

       -       0        0        0      removed

       3       8       20        1      active sync   /dev/sdb4

       2       8        4        2      active sync   /dev/sda4

```

and /proc/mdstat:

```
 % cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 

md2 : active (read-only) raid5 sdb4[3] sda4[2]

      1931840512 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]

      

md124 : active raid1 sdc3[0] sda3[2] sdd3[1]

      9765504 blocks [3/3] [UUU]

      

md125 : active raid1 sdc1[0] sda1[2] sdd1[1]

      97536 blocks [3/3] [UUU]

      

unused devices: <none>

```

----------

## NeddySeagoon

ExecutorElassus,

Looks promising, but it's early days. Those 4000 writes bother me.

```
vgchange -ay
```

should try to start all logical volumes. That can take a long time if lvmetad is not running, as it will scan every block device in /dev for volume groups to start.

You are breaking new ground here. I've not tried to start a volume group on a read only raid.

----------

## ExecutorElassus

Hi Neddy,

```
 % vgchange -ay

  17 logical volume(s) in volume group "vg" now active
```

it completed immediately.

now?

EDIT: Here's what I see now:

```
% vgchange -ay

  17 logical volume(s) in volume group "vg" now active

root@sysresccd /root % ls /mnt

backup  custom  floppy  gentoo  windows

root@sysresccd /root % ls /mnt/gentoo 

root@sysresccd /root % cd /dev/vg 

root@sysresccd /dev/vg % ls

carrier1  carrier2  carrier3  carrier4  carrier5  carrier6  carrier7  carrier8  carrier9  distfiles  home  opt  portage  tmp  usr  var  vartmp

```

Those are all links to dm-N files, which I assume to be the logical volumes.

UPDATE:

I've tried mounting a few of these. Here's what I get:

```

 % mkdir vgroup

root@sysresccd /root % mount /dev/vg/carrier1 vgroup

 % umount vgroup

root@sysresccd /root % mount /dev/vg/usr vgroup     

mount: /root/vgroup: can't read superblock on /dev/mapper/vg-usr.

root@sysresccd /root % mount /dev/vg/var vgroup

mount: /root/vgroup: can't read superblock on /dev/mapper/vg-var.

root@sysresccd /root % mount /dev/vg/opt vgroup

mount: /root/vgroup: can't read superblock on /dev/mapper/vg-opt.

root@sysresccd /root % mount /dev/vg/portage vgroup

 % umount vgroup

root@sysresccd /root % mount /dev/vg/home vgroup   

mount: /root/vgroup: can't read superblock on /dev/mapper/vg-home.

root@sysresccd /root % mount /dev/vg/distfiles vgroup 

root@sysresccd /root % umount vgroup                 

root@sysresccd /root % mount /dev/vg/portage vgroup  

root@sysresccd /root % umount vgroup               

root@sysresccd /root % mount /dev/vg/vartmp vgroup 

root@sysresccd /root % umount vgroup              

```

So some of them it can't mount. Is there a way to fix that? Google says I could use dumpe2fs to find backup superblocks, then run fsck on the partition, but that would require write access, yes?

What next?

thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

```
/dev/vg % ls

carrier1  carrier2  carrier3  carrier4  carrier5  carrier6  carrier7  carrier8  carrier9  distfiles  home  opt  portage  tmp  usr  var  vartmp 
```

Those are all filesystems. You don't care about tmp and vartmp. You may be emotionally attached to distfiles. I am; I have all my distfiles since April 2009, when this box was new. Even so, distfiles is expendable.

opt should only be binaries, that can be recreated too.

System Rescue CD comes with /mnt/gentoo.

```
cd /mnt/gentoo

mkdir carrier1  carrier2  carrier3  carrier4  carrier5  carrier6  carrier7  carrier8  carrier9  distfiles  home  opt  portage  tmp  usr  var  vartmp

mount -o ro /dev/vg/carrier1  ./carrier1 

...
```

and look at files.  Ignore expendable filesystems if you want.

We know that the LVM metadata is OK.

If all the mounts work, some of the filesystem metadata is OK too.

We can map logical volumes to the array and to the underlying HDDs too and see which filesystems are damaged.

Put the output of 

```
/sbin/lvdisplay -am
```

onto a pastebin site.

Put your ddrescue log onto a pastebin site too.

It's not difficult to map the holes in the log to the allocated segments in your logical volumes.

From one of mine ...

```
 LV Path                /dev/vg/usr

...

  --- Segments ---

  Logical extents 0 to 5119:

    Type      linear

    Physical volume   /dev/md127

    Physical extents   0 to 5119

   

  Logical extents 5120 to 10239:

    Type      linear

    Physical volume   /dev/md127

    Physical extents   315136 to 320255
```

A physical extent is a 4 MiB block by default.
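That mapping can be scripted against the map file. A rough sketch, under stated assumptions: the standard ddrescue map layout of `pos size status` data lines, 4 MiB extents, and the approximation that each member byte backs two array bytes in a 3-drive RAID5 (striping detail ignored; it only screens a segment against the first read error):

```python
# Screen an LV segment (a physical-extent range on the array) against the
# first read error in a ddrescue map of ONE raid member.  Approximate: a
# 3-drive RAID5 stores ~2 array bytes per member byte, and striping detail
# is ignored, as in the by-hand arithmetic.

EXTENT = 4 * 1024 * 1024      # LVM default physical extent: 4 MiB
DATA_MEMBERS = 2              # data-bearing drives in a 3-drive RAID5

def first_error_pos(map_text):
    """Byte offset (on the member) of the first data line not marked '+'."""
    for line in map_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and not line.lstrip().startswith("#"):
            if parts[2] != "+":
                return int(parts[0], 0)
    return None   # no errors recorded

def segment_recovered(last_pe, map_text):
    """True if extents 0..last_pe end before the first member read error."""
    err = first_error_pos(map_text)
    if err is None:
        return True
    return (last_pe + 1) * EXTENT <= err * DATA_MEMBERS
```

With the map in this thread, a segment ending at extent 24831 passes the screen, while extent 349477 (where the first error lands on the array) does not.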

----------

## ExecutorElassus

Here's the pastebin for attempted mounts and the lvdisplay: https://pastebin.com/MfzUpiKj

and here's the ddrescue.map file: https://pastebin.com/izG41dN5

Would running fsck on the volumes using a backup superblock allow them to be fixed and then mounted? So far, I don't see any glaring errors (but the stuff I care about is thousands of files in hundreds of subdirectories, so I doubt I'd ever find them all).

How does it look? Are we making progress? Should I, at some point, switch to using /dev/sdd4, as it is the non-broken drive?

thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

```
root@sysresccd /mnt/gentoo % mount -o ro /dev/vg/home  ./home    

mount: /mnt/gentoo/home: can't read superblock on /dev/mapper/vg-home
```

Where we go from there depends on the filesystem. fsck is a last resort. In the face of missing metadata, it guesses.

It tries to make the metadata self consistent without regard to user data and often makes a bad situation worse.

What filesystem is home?  extX keeps backup superblock copies which mount can use if you tell it to.

Also, how big is home? That will be in your pastebin.

```
 LV Path                /dev/vg/home

  Segments               1

  --- Segments ---

  Logical extents 0 to 15359:

    Type        linear

    Physical volume /dev/md2

    Physical extents    9472 to 24831
```

That's near the front of the raid set.

From the map file

```
#      pos        size  status

0x00000000  0xAAA4A18800  +
```

 has been recovered.

That's 732,906,489,856 B or 732 salesman GB. That means the first 1.4TB of the raid should be good.

home ends at about 104,152,956,928 B or 104G. That's 24832 * 4 MiB (extents 0 to 24831).

So home is in the recovered region.

There are two potential issues.

1) those 4000 writes mean sd[ab]4 are out of sync.

2) the old sdb is damaged in that region, but ddrescue has recovered it on sdb-new4.

Look around all the carrierX and see what you can see. Are the files good?

It's progress; we should be able to repeat this with sdd4 in place of sdb4 and maybe get more.

There are two approaches now.  

Take the raid down, swap sdb4 for sdd4, and see if it looks better.

Try to mount home with an alternate superblock.

The first backup superblock is at 131072 on home.

Try 

```
mount -o ro,sb=131072 /dev/vg/home  ./home
```

and read  

```
man ext4
```

That's harmless if I've got the number wrong, so you can try adding in sb=131072 to the other failed mounts too.

There are more backup superblocks too, but that one is in the man page, so it's easy to find.
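For reference, those numbers can be computed: mount's sb= option is given in 1 KiB units, and ext2/3/4 puts the first backup superblock at the start of block group 1. A sketch (assuming the default layout of 8 × block_size blocks per group; the function name is mine):

```python
# Candidate sb= values for mount (in 1 KiB units, which is what mount
# expects) for the FIRST backup superblock of an ext2/3/4 filesystem.
# Assumes the default layout: 8 * block_size blocks per block group,
# first backup in group 1; with 1 KiB blocks the filesystem starts at
# block 1, hence the off-by-one.

def first_backup_sb(block_size):
    blocks_per_group = 8 * block_size            # one block bitmap, 8 bits/byte
    first_data_block = 1 if block_size == 1024 else 0
    backup_block = first_data_block + blocks_per_group
    return backup_block * block_size // 1024     # express in 1 KiB units

# 1 KiB blocks -> sb=8193, 2 KiB -> sb=32768, 4 KiB -> sb=131072
```

For 4 KiB blocks this gives the sb=131072 above; further backups sit at block groups 3, 5, 7 and so on (sparse_super), which dumpe2fs will list.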

----------

## ExecutorElassus

I think at this point I'd like to start working with sdd4, in case I end up making writes, or in case more data has been recovered there.

How do I shut down the active volume groups?

UPDATE: never mind, I figured out how to use 'vgchange -an' to shut it down. Now I've stopped /dev/md2 and restarted it with /dev/sdd4, and activated the volume groups. I'll update in a sec when I try your last suggestions.

Thanks for the help,

EE

----------

## ExecutorElassus

Hi Neddy,

Using backup superblocks managed to mount all the remaining partitions except /dev/vg/portage. For that one I found a different backup superblock using dumpe2fs | grep superblock, tried it, and it mounted as well.

So all the partitions mount, and a cursory look inside shows them all having the contents they should. (It's worth noting, btw, that my first signal a couple of weeks ago that a drive was failing was that portage kept failing to emerge --sync due to permissions and other problems, so I think this is the partition where a lot of the bad blocks accumulated.)

```
 % mount                                                            

udev on /dev type devtmpfs (rw,nosuid,relatime,size=10240k,nr_inodes=4098759,mode=755)

/dev/sr0 on /livemnt/boot type iso9660 (ro,relatime,nojoliet,check=s,map=n,blocksize=2048,fmode=644)

/dev/loop0 on /livemnt/squashfs type squashfs (ro,relatime)

tmpfs on /livemnt/memory type tmpfs (rw,relatime)

none on / type aufs (rw,noatime,si=1cd62804c614d2a2)

tmpfs on /livemnt/tftpmem type tmpfs (rw,relatime,size=524288k)

none on /tftpboot type aufs (rw,relatime,si=1cd62804c6818aa2)

proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)

tmpfs on /run type tmpfs (rw,nodev,relatime,size=3283428k,mode=755)

sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)

securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)

debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)

configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)

fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)

pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)

mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)

devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)

shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime)

binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)

tmpfs on /tmp type tmpfs (rw,relatime)

/dev/sdg1 on /root/usb type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096)

/dev/mapper/vg-carrier1 on /mnt/gentoo/carrier1 type ext3 (ro,relatime,stripe=256,data=ordered)

/dev/mapper/vg-var on /mnt/gentoo/var type ext3 (ro,relatime,sb=131072,stripe=256,data=ordered)

/dev/mapper/vg-usr on /mnt/gentoo/usr type ext3 (ro,relatime,sb=131072,stripe=256,data=ordered)

/dev/mapper/vg-home on /mnt/gentoo/home type ext3 (ro,relatime,sb=131072,stripe=256,data=ordered)

/dev/mapper/vg-opt on /mnt/gentoo/opt type ext3 (ro,relatime,sb=131072,stripe=256,data=ordered)

/dev/mapper/vg-portage on /mnt/gentoo/portage type ext2 (ro,relatime,sb=24577,errors=continue,user_xattr,acl)

```

So portage is ext2 (not sure why, but this may be part of the problem; might be worth reformatting to ext3), and it looks like all the rest are ext3. 

What's the next step?

thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

We *know* that sdd4 has holes in it. It's a question of where, and what is affected.

Just because things mount does not mean that the file contents are correct.

When you read a raid5, two out of the three (in your case) drives are used to decode the data.

You need both bits. With sdb4, you would get read errors when you hit a failed block.

With sdd4, it will silently return rubbish.

That rubbish may be file content, directory content, or now unlikely, key filesystem metadata.

While you are using mdadm --readonly you won't do any damage to your data.

If you put the raid together with sd[ac] you may get a different subset of logical volumes that work.

You said that sdb-new is 2GB ?

If that's correct, it will hold all the data from the raid set.  Hold that thought.

It may be that different pairings of drives in md2 give you correct access to different LVM filesystems. If that's true, then in place data recovery may not be possible, but you can copy all the files to sdb-new. 

That's a bit simplistic. If sdd4 is made to fill the remaining space on sdd (it may be like that anyway), then it can be used to hold all the data from md2 while its other partitions are members of the other raid sets. There is a big difference between reading the data and reading correct data. The only way you can verify the data is correct is by examining it.  

Like I've already said, some filesystems are expendable. Don't bother recovering them.

Looking through your logical volume to HDD map

```
 LV Path                /dev/vg/usr

  LV Size                20.00 GiB

  --- Segments ---

  Logical extents 0 to 5119:

    Type        linear

    Physical volume /dev/md2

    Physical extents    0 to 5119
```

That's correctly recovered to sdb-new, as it's all before the first read error at about 349477 physical extents into the volume group / raid.

```
  LV Path                /dev/vg/portage

  LV Size                3.00 GiB

  --- Segments ---

  Logical extents 0 to 511:

    Type        linear

    Physical volume /dev/md2

    Physical extents    5120 to 5631
```

That follows immediately after /dev/vg/usr in the physical space, so its copy is good too.

Don't spend any time on recovery. It's just a portage snapshot, or even an emerge --sync away.

```
  LV Path                /dev/vg/distfiles

  LV Size                15.00 GiB

  --- Segments ---

  Logical extents 0 to 3839:

    Type        linear

    Physical volume /dev/md2

    Physical extents    5632 to 9471
```

The same rationale as for /dev/vg/portage applies. We are only 38G down the raid, so the ddrescue copy is good.

```
  LV Path                /dev/vg/home

  LV Size                60.00 GiB

  --- Segments ---

  Logical extents 0 to 15359:

    Type        linear

    Physical volume /dev/md2

    Physical extents    9472 to 24831
```

That's 98G down the raid, but it wouldn't mount. There are no holes in the copy, so it must be the different event counts causing the mount issue. 

```
  LV Path                /dev/vg/opt

  LV Size                4.00 GiB

  --- Segments ---

  Logical extents 0 to 1023:

    Type        linear

    Physical volume /dev/md2

    Physical extents    24832 to 25855
```

102G from the start. opt should only be binary installs. Don't bother recovering it. Look and see what's there and re-emerge those packages.

```
  LV Path                /dev/vg/var

  LV Size                4.00 GiB

  --- Segments ---

  Logical extents 0 to 1023:

    Type        linear

    Physical volume /dev/md2

    Physical extents    25856 to 26879
```

Only 106G of raid used so far.

You need some files here. It's how portage knows what's installed. /var/lib/portage/world is essential. /var/db/pkg/* is what tells portage exactly what is installed.

```
  LV Path                /dev/vg/tmp

  LV Size                4.00 GiB

  --- Segments ---

  Logical extents 0 to 1023:

    Type        linear

    Physical volume /dev/md2

    Physical extents    26880 to 27903
```

110G used. Throw this filesystem away. It's cleared every boot anyway.

```
  LV Path                /dev/vg/vartmp

  LV Size                15.00 GiB  --- Segments ---

  Logical extents 0 to 2559:

    Type        linear

    Physical volume /dev/md2

    Physical extents    27904 to 30463

   

  Logical extents 2560 to 3839:

    Type        linear

    Physical volume /dev/md2

    Physical extents    465664 to 466943
```

This is the first logical volume that's been extended. That means we need to do the arithmetic instead of just adding up the sizes, and take account of the locations.

The first part is OK as before, 349477 physical extents into the volume group. The second part is harder.
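As a sketch of that arithmetic (assuming the default 4 MiB LVM physical extent size, which `vgdisplay` would confirm), a segment's offset into /dev/md2 is just its starting physical extent times the extent size:

```shell
# Offset arithmetic for an LV segment, assuming 4 MiB physical extents.
# A segment whose first physical extent is N starts N * 4 MiB into the PV.
EXTENT_MIB=4

offset_gib() {
    # integer GiB offset into /dev/md2 for a starting physical extent
    echo $(( $1 * EXTENT_MIB / 1024 ))
}

offset_gib 27904    # first vartmp segment: 109 GiB down the PV
offset_gib 465664   # the extended second segment: 1819 GiB down the PV
```

The same calculation locates any of the segments listed above relative to the ddrescue error positions.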

If that's portage's build space, throw it away and make a new filesystem.

```
  LV Path                /dev/vg/carrier1

  LV Size                100.00 GiB

  --- Segments ---

  Logical extents 0 to 25599:

    Type        linear

    Physical volume /dev/md2

    Physical extents    30464 to 56063
```

The first sdb4 read error falls in here. Then there is a big rash of errors close together.

-- edit --  

The data for carrier1 and on is damaged on sdb4 and therefore on sdd4 too. You can keep going with ddrescue to try to get more data to recover or try to bring up the raid with sd[ac]4 and look around.

There are also the recovery and event count issues there.

If you are really really lucky, the  sdb4 damage is in unused areas of the drive, so while ddrescue can't copy it, a filesystem level read might work.

That means no recovery in place though.

----------

## ExecutorElassus

Hi Neddy,

yes, the new HDD is 2TB (not GB, as you typed in your last message). that means, theoretically, that I could copy all of one of the other drives onto the extra space if needed.

So there is a big block of bad data on vg-carrier1, and most everything that comes before it is expendable (I'd like to keep /usr, just because I have a lot of fonts and stuff under /usr/share, and a bunch of custom ebuilds in /usr/local, but those aren't priorities). vg-carrier1 is more concerning: that's where all of my work files are (documents, papers, articles, invoices, etc), so I need to do a more thorough investigation of what's there.

Using the --force flag seems to reset the event count to the highest number, so I'm not sure if that is going to cause problems.

But what should I do next? I'm not sure how to look more deeply into carrier1 without being able to load the files in some sort of GUI, but is there some other way I can try to recover the data?

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

Put the raid together with sd[ab]4 still --readonly

mount carrier1 somewhere and try to copy the files out.

cp -a will do.

It will fail at the first read error. If you are lucky, there won't be a read error.

You can try with sd[ac]4 too but sdc4 was in the middle of a rebuild.

As long as you always use --readonly everywhere, the data on the raid will not change and it is what it is.

The metadata may change, but we have the info in this thread to run a --create; that won't sync your raids, though.

Warning: Even if the copy works, some files can be corrupt because of the difference in event counts.

The out of sync is not detectable.

Try not to use sdb-new as a destination, unless you make a new partition off the end of sdd4, so the raid image is preserved.

You need 100G, or less, depending on how full carrier1 is.

Someone with some script-fu could write a script to recursively list all the files in carrier1, then copy them one at a time, listing the ones that failed.

That's beyond my bash skills though.
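For what it's worth, a minimal sketch of such a script (hedged: `src` and `dst` are placeholder paths for the read-only carrier1 mount and the recovery area, and `dst` must be absolute because the function changes directory):

```shell
# Copy every regular file under src to dst one at a time, recreating the
# directory hierarchy, and log any paths that fail to failed-files.txt.
copy_salvage() (
    src=$1 dst=$2              # dst must be an absolute path
    failed=$PWD/failed-files.txt
    : > "$failed"
    cd "$src" || return 1
    # one cp per file: a read error loses only that file, and its path
    # lands in the failure log for a retry from the other drive pairing
    find . -type f -exec sh -c '
        cp --parents -a -- "$1" "$2" 2>/dev/null ||
            printf "%s\n" "$1" >> "$3"
    ' _ {} "$dst" "$failed" \;
)
```

The paths left in failed-files.txt are the ones worth retrying with the raid assembled from the other drive pair.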

-- edit --

It's lots of small blocks not recovered, rather than a big block.

```
      pos        size  status

0x00000000  0xAAA4A18800  +  recovered

0xAAA4A18800  0x00000400  -  2 sectors

0xAAA4A18C00  0x000BD200  +  recovered

0xAAA4AD5E00  0x00000200  -  1 sector

0xAAA4AD6000  0x00003400  +  recovered

0xAAA4AD9400  0x00000200  -  1 sector

...
```

----------

## Hu

The quick way to copy files out would be rsync.  I have had some success syncing files off dying drives before.

If that doesn't work for you, you could try cd "$carrier_mount_point" && find . '(' -type f -o -type d ')' other-restrictions -print0 > "$TMPDIR/files-to-save.txt" to save a list of files.  You will likely need to preserve directory structure when copying them, which makes the copy side a bit harder.  You could try tar --no-recursion -C "$carrier_mount_point" --null -T "$TMPDIR/files-to-save.txt" -c -f - | tar -C "$recovery_directory" -x -f - -k to copy them out with tar.  If that also fails (and it might, if read errors come back and tar aborts on error), fall back to cd "$carrier_mount_point" && while IFS= read -r -d '' filename; do cp --parents -a "$filename" "$recovery_directory"; done < "$TMPDIR/files-to-save.txt".

Note that this last method will preserve directory hierarchy, but not directory permissions / ownership.  If you need that, you could try to pre-create the hierarchy: cd "$carrier_mount_point" && find . -type f other-restrictions -printf '%h\0' | sort -z | uniq -z > "$TMPDIR/dirs-to-save.txt" && tar -C "$carrier_mount_point" --null -T "$TMPDIR/dirs-to-save.txt" --no-recursion -c -f - | tar -C "$recovery_directory" -x -f -.  If this step fails, you are unable to read back some of your directory entries.  That would be very unfortunate, as it means some files may be unreachable, even if their contents are intact.

Where I wrote other-restrictions, you could plug in any find predicates to restrict saving files you don't want (old enough that you can restore them from backup; derived files you can recreate from other salvaged files; etc.).  As much as practical, you want to minimize recovering files that you can get elsewhere.
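Spelled out as a self-contained sketch (using GNU tar's `--null` so the NUL-separated list from `-print0` is read safely; `src` and `dst` are placeholders for the carrier1 mount point and the recovery directory):

```shell
# Build a NUL-separated list of files and directories, then stream the
# listed entries through tar so the directory hierarchy survives the copy.
tar_salvage() {
    src=$1 dst=$2
    list=$(mktemp)
    ( cd "$src" && find . \( -type f -o -type d \) -print0 ) > "$list"
    # GNU tar: --null tells -T the list is NUL-separated; --no-recursion
    # copies exactly the listed entries and nothing more
    tar -C "$src" --null -T "$list" --no-recursion -c -f - |
        tar -C "$dst" -x -f -
    rm -f "$list"
}
```

Extra `find` predicates (the "other-restrictions" above) slot in before `-print0` to filter what gets listed.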

Regarding /usr, I would say copy /usr/local, but plan on reinstalling the relevant packages to rebuild /usr/share.  If you can save it after you've saved all your irreplaceable contents, go ahead and try.  Just prioritize the things that are most difficult to replace.

----------

## ExecutorElassus

Hi Neddy,

conveniently, I have two SSD drives, 250GB each, that I was planning to use as a RAID1 and migrate all my system partitions (everything up to /home, but not the carrier partitions). I never got around to it, so I have some extra space. 

But I have a meeting to attend today, so I'll have to get to this when I get back in the evening.

The only thing I did when sdc4 was rebuilding was start the WM. None of the carrier partitions was mounted. So hopefully being out of sync won't affect too much. 

What's a good filesystem format for an SSD? My third one is formatted with f2fs, but this liveCD apparently doesn't have that. Hu, how do I use rsync to copy everything?

Also, conveniently, carrier1 is only about 30% full, so it's quite possible the bad sectors don't even have data on them.

It would be nice if there were a way to use the rescue mapfile and have /dev/sd[ac]4 used to rebuild *only those sectors*, on the off chance that those specific sectors might be intact on /dev/sd[ac]4.

Anyway, when I get back home I'll format the SSD and see about copying carrier1 from sd[ab]4.

thanks for the help,

EE

PS it turns out that carrier1 mounts without issue when mounted from sd[ab]4. I'm not sure why that is.

----------

## NeddySeagoon

ExecutorElassus,

1. You don't know the sync status of sd[ac]4 so attempting to recover individual blocks would be risky.

mdadm reads/writes chunks, that's 512k, so a chunk is all or nothing. One missing/unreadable sector costs you a whole chunk.

On top of that, your filesystems use 4kB blocks, so 4kB is the least you can lose at the filesystem level, even if the drive is 512B sectors.

You have a filesystem with 4k blocks on top of a raid with 512k chunks on top of a HDD (which you are trying to rescue) with 512B sectors.

The holes in your recovered data are bigger than you think but on the bright side, one raid chunk spans several unreadable sectors.
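Putting numbers on that layering (the sizes are the ones above: 512 B sectors, 4 KiB filesystem blocks, 512 KiB raid chunks):

```shell
# One unreadable sector dirties a whole mdadm chunk, so the minimum loss
# at each layer works out as:
sector=512                  # HDD sector, bytes
fs_block=$(( 4 * 1024 ))    # filesystem block
chunk=$(( 512 * 1024 ))     # mdadm chunk

echo "sectors per chunk:   $(( chunk / sector ))"    # 1024 sectors
echo "fs blocks per chunk: $(( chunk / fs_block ))"  # 128 blocks
```

So one bad sector in the ddrescue map can translate to 128 filesystem blocks of suspect data.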

sdc4 has lost its active slot. I suspect we will need to run a --create to rewrite that before you can assemble it.

If you really think the raid is clean ... and we know it's not, it's possible to bring it up with all three drives --readonly then try file copies.

Reads to sdb4 will fail from time to time but the data will be fetched from the other drives, so should succeed.

Again, that does not mean the recovered data is what it's supposed to be.

With all three drives in the raid set, we don't have any control, or knowledge, of which drives are being read.

That will be important: if sd[ab]4 gives you rubbish or fails, we can try the same files from sd[ac]4, which may give a different answer.

That trick of reading the missing sectors from the 'good' drives is what mdadm --replace would have done.

It would have duplicated sdb4 using data from all three drives.

That's what you really wanted to do at the outset but hindsight gives everyone 20/20 vision.

-- edit --

See your PM too.

----------

## ExecutorElassus

Hi Neddy,

all right. I've reassembled sd[ab]4 and activated the VGs inside. Once I have the SSD formatted (I guess with ext3?) and I copy over all of carrier1, what should I do with it? Assuming there are no copy errors, what next? I can't check file integrity without the programs to open the files, but is there something else I should do? 

What other steps should I take?

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

That's it.  Copy over other things too.

Use rsync as Hu suggested rather than cp -a.

You could do a second copy based on  sd[ac]4 and compare the copies.  The differences are just that.

You still have to look at the files that differ to see which is correct, if any.

Even where the files compare correctly, you only know that they are the same, not correct.
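Comparing the two copies could look like this (a sketch; `copy_a` and `copy_b` stand in for the sd[ab]4- and sd[ac]4-based recovery trees):

```shell
# List the files that differ between the two recovery copies; identical
# trees produce no output. diff (and grep, on no match) exit non-zero,
# so the trailing || true keeps the function usable under set -e.
compare_copies() {
    diff -rq "$1" "$2" | grep '^Files ' || true
}
```

Every path it prints is a file to inspect by hand, since neither copy is known to be the correct one.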

You can defer the checking. The data is what it is, you can go back to ddrescue and attempt to fill more holes and get back more data.

The drawback with using sdd4 is that the missing data will still read. The drive will return whatever happens to be there.

If you thought that was useful, you can do it.

e.g. if sdb4 fails to read a directory, it will all be missing.

However, if sdd4 has part recovered that directory, you may be able to use the part recovered directory to salvage the files that are there.

It's OK to run ddrescue overnight too, if you want.  Set up lots of retries and let it run.

Next night, do it again with the drive in a different position.

I wouldn't rule out a --create on all three drives last thing and see what happens then, but I've given up trying to salvage the data in place.

Hold that for a moment ...

You could copy off 

/dev/mapper/vg-var 

/dev/mapper/vg-usr 

/dev/mapper/vg-home 

/dev/mapper/vg-opt 

onto the SSD and make new filesystems there for tmp, vartmp, distfiles and portage

It fits with over 100G to spare. Fix /etc/fstab to point to the SSD and try to bring the box up normally (with the SSD in place of the raid).

That 100G will give you space for /dev/mapper/vg-carrier1 too.

You  would then have a working Gentoo to use to recover the data from the raid.

----------

## ExecutorElassus

Hi Neddy,

this last suggestion is actually the process I think I asked about over a year ago, when I first got the SSDs, but chickened out of doing.

So, here's what I think I would do. Please correct me if I'm wrong: 

Format the SSD as one single partition. I can't use f2fs, apparently. What's a good format?

rsync all of sd[adc]3 (which holds / and all the rest of the system directories like /etc and is clean). How do I do this?

then, rsync each of the other system partitions that live on sd[ab]4 (that is, /usr, /portage, /distfiles, /opt, and /home, all of which I have reason to believe are clean) to their respective directories on the SSD. How do I do this?

edit /etc/fstab to point to this SSD as /. But I boot from an initramfs (yours, incidentally). How do I edit this so it boots from the SSD and not the RAID array? Is that something I just edit in grub.cfg?

As you say, this should get me a bootable system using only the SSD. Is there a way to turn this into a RAID1 array later while there's still data on it?

Can you please walk me through how to do this? This all looks risky and above my level of skill.

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

Use ext4 on the SSD. You may want to feed it options to control the number of i-nodes and/or turn off the journal on expendable partitions.

For the portage tree, one inode per filesystem block is good or you will run out of i-nodes.

Due to having a senior moment, my portage tree is on a filesystem with 1k blocks on an SSD with 4k physical blocks.  Don't do that.

Make portage one i-node per block. 4G should be enough.
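With mke2fs, that density is the `-i` (bytes-per-inode) option; a sketch on a throwaway file-backed image rather than a real disk (assumes e2fsprogs is installed):

```shell
# One inode per 4 KiB block suits the portage tree's huge number of tiny
# files: mke2fs's -i takes bytes-per-inode, so -b 4096 -i 4096 gives one
# inode per block. Sketched on a scratch image file, not a real partition.
PATH=$PATH:/sbin:/usr/sbin   # mkfs tools live in sbin on most distros

img=$(mktemp)
truncate -s 64M "$img"
mkfs.ext4 -q -F -b 4096 -i 4096 "$img"
```

On expendable filesystems like vartmp, adding `-O ^has_journal` drops the journal as mentioned above.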

You are going to mount all these filesystems separately, like you do now.

Either use LVM, so you can move space around, growing is trivial, shrinking is hard, or make separate partitions.

As you have used LVM, that will offer the best use of space.

Make two, possibly three partitions on the SSD: /boot, root, and everything else.

Combine /boot and root if that's what you do now.

Make everything else, LVM, as that's what you have now.

Boot with System Rescue CD and attach your old Gentoo to /mnt/gentoo, but use the read-only option to mount all the old filesystems.

Bring up the sd[ab]4 raid as you have been doing and attach its filesystems in the right places, under /mnt/gentoo. Don't forget the read only option.

Make a new mountpoint /mnt/SSD

Attach all the empty SSD filesystems here. After you mount the SSD_root, you will need to mkdir all the lower level mount points.

Don't forget that /tmp will need its permissions adjusted.
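A sketch of those two steps together (the directory list mirrors the filesystems named in this thread; `/mnt/SSD` as above):

```shell
# Recreate the lower-level mount points under the freshly mounted SSD root
# and give /tmp and /var/tmp their sticky, world-writable permissions.
make_mountpoints() {
    root=$1
    mkdir -p "$root/usr" "$root/opt" "$root/var" "$root/home" \
             "$root/tmp" "$root/var/tmp"
    chmod 1777 "$root/tmp" "$root/var/tmp"
}

# e.g. make_mountpoints /mnt/SSD
```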

Time for a review before you do something you can't undo.

Your old Gentoo is attached at /mnt/gentoo and it's all read-only. Do check.

Your empty SSD has its filesystem tree attached at /mnt/SSD and it's read/write.

This read-only/read-write split stops you destroying your existing install by messing up the rsync.

I'm not a habitual rsync user, so I can't give you the command. I'll refer you to Hu's post.

-- edit --

Once the copy completes, chroot into /mnt/SSD 

You will need /proc, /dev and /sys

Fix /etc/fstab

Fix /boot/grub/grub.cfg

Reinstall grub, as it uses space outside any filesystem, so that's not been copied.

Reboot but go into the BIOS and choose to boot from the SSD.

Boot.  It should come up on the SSD only; your good raid sets will be running but not mounted.

-- edit --

You want to recursively rsync from /mnt/gentoo to /mnt/SSD

I use 

```
rsync -avHtr /source/ /destination/
```

The trailing slashes are important.  My command sets up an ssh tunnel to copy over; I think I removed that.

I don't recall what the options do but they were arrived at with trial and error and reading the man page.

Don't just use that until you check what it will do.

----------

## ExecutorElassus

Hi Neddy,

right now my drives have four partitions: /boot (raid1 on sd[adc]1), <swap> on sd[adc]2, / as raid1 on sd[adc]3, and the LVM holding /usr, /opt, /var, /tmp, /var/tmp, /var/portage, /var/portage/distfiles, and /home, along with the nine /carrierN partitions as degraded raid5 on sd[ab]4. 

So, what I would do is create three partitions on the SSD (which is /dev/sde right now), for holding /boot, /, and the LVM for /usr, /var, /tmp, /var/tmp, /var/portage, /var/portage/distfiles, and /home. I would leave the nine carrier partitions on the raid5 array and try to recover them once the rest of the system boots.

Right now, I have /dev/md124 (active with sd[adc]3) mounted at /mnt/gentoo. At this point, once I partition and format the SSD, I should be able to just rsync everything over, yes? 

EDIT: I could, theoretically, plug the other SSD back in (I unplugged it to use the cables for sdb), and put RAID1 arrays on all three partitions from the start. Would that be a smart thing to do, since that was my original intent anyway?

Hu, can you walk me through how to do that?

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

You can migrate to raid1 later.  

If there is a risk of not fitting everything into 250G of one SSD, the second one can be used for another 250G of space.

If you want to make it easy to migrate to raid1 later, set up the raid1 sets on the SSD as degraded now.

You can add the other drive later, if it hasn't got carrier partitions on it.

If you use both SSDs in the raid1 now, what space will you recover the nine carrier partitions to?

The 2TB drive is the wrong answer. sdb may fail totally at any time, then the 2TB drive, or at least sdd4, becomes essential to your data recovery.

Go for the SSD raid1 now, if you will still have space to recover the carrier partitions.

----------

## ExecutorElassus

Hi Neddy,

OK, I'll just set it up as a degraded array for now. I wish I could find the old thread where somebody told me how to go through using rsync to copy over each of the relevant subdirectories to the new locations, but I guess it won't matter as much if I'm partitioning the SSD with LVMs to match.

One other wrinkle, when we get to it: Neddy, do you remember that guide you wrote years ago about setting up an initrd to pre-mount /var and /usr to boot? I still use that, which means I'm going to need to re-do the initrd to use the new mountpoints on the SSD before I can boot the system itself. When I get that far, that is.

I could use the other SSD as a midway storage medium for the carrier partitions, but I have no hope of doing so with the last (it's over 1TB in size). That one I'll have to check some other way.

So once I have the SSD partitioned, I'll let you know (it'll have to be tomorrow) and ask for help setting up the filesystems and copying them over from sd[ab]4.

Thanks for the help,

EE

----------

## NeddySeagoon

ExecutorElassus,

If that's the initrd guide I posted on the wiki somewhere, after it mounts root, it reads the real /etc/fstab to find out where /var and /usr are.

That means it will not need to be changed because it will read the /etc/fstab from the SSD, which you will have updated. 

I won't be around until about 7:00 PM Wednesday.

----------

## Hu

I cannot help with RAID questions.  I defer to Neddy for that.

As for rsync, -a is short for -rlptgoD, so the -tr shown is unnecessary, but harmless.  For this purpose, I would add in -A -X.  See man rsync for what all these flags do.  Depending on the size of the files, I might use --inplace if I expected to need to interrupt the rsync (or have it interrupted for me) and restart it later.  You can use -n to make rsync print what it would do, without doing any work.  Once you are satisfied that your invocation will do what you want, remove -n.

If you have specific questions for me, please ask and I will help as best I can.  From what I can tell, Neddy's advice is thorough and correct, so I am standing back until called.  I think your last remark to me was asking about guidance for rsync.  What more do you want to know there?

----------

## ExecutorElassus

Hi Neddy, Hu,

do I need a special module to work on an SSD? Because when I tried to run fdisk on /dev/sde, I got an I/O error.

Thanks for the help,

EE

UPDATE: Help! I rebooted, and it immediately assembled md127 out of sd[dca]4 and started rebuilding onto sdd4! How do I stop this? How do I preserve the data I spent days trying to recover from sdb4?

----------

## ExecutorElassus

UPDATE: Well, "§%&/!!. I was too scared to stop the raid rebuild midway, so I let it complete. Since all the VGs mounted OK on the liveCD, I shut down, unplugged the bad drive and removed the CD, and booted. 

Everything booted OK, fsck ran on a couple partitions and fixed a couple missing inodes, and I got to a prompt. I started X. Now I'm forcing fsck to check each of the carrier partitions.

There's a very good chance that the first rebuild, a week ago when I accidentally pulled sdc instead of sdb, synced everything correctly. As you noted, the bad blocks only start once I'm onto the carrier partitions, and those weren't mounted when I booted without sdc. So they wouldn't have any data written to them that would corrupt a rebuild later.

But as I said: I'm running fsck on all the carrier partitions just in case.

Good heavens, that was a stressful two hours. I was afraid I'd børked all my data, and knew that if the rebuild went wrong I'd lose the last week of painstaking ddrescue work and have to start over.

But I'm back on my desktop, and everything works so far.

I'll report back with updates.

Stay tuned,

EE

----------

## ExecutorElassus

Update: well, it booted, and all the carrier partitions checked out. I ran 'emerge --sync' to re-populate the portage tree (it still had the permissions problems it was exhibiting that precipitated this whole mess), and so far I haven't been able to find any bad files or corrupted data (but, like I said, there are thousands of files in hundreds of directories, so it's going to take a while to check everything).

In any case, the main OS is working. I'm going to update the kernel and let 'emerge -uD world' run later, but for now I think I'm going to leave it in place and hope that whatever bad blocks there were on sdb never made it onto sdc or sda, and that the rebuild thus went through OK.

I'll post again in a couple days, but I'm cautiously optimistic.

Stay tuned,

EE

----------

## NeddySeagoon

ExecutorElassus,

Not mounted means that there were no writes via the filesystem.

The raid can do housekeeping writes anywhere, anytime, e.g. resyncing.

fsck is a very bad idea if you let it change anything. It's harmless to use to see if the filesystem really is self-consistent.

Like I said, in the face of missing metadata, it guesses and it doesn't always guess correctly.

Letting fsck 'fix' a filesystem can make a bad situation worse.

Did fsck do anything?

Look in /lost+found at the top level of each filesystem.  
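A quick way to take stock of a lost+found directory before touching anything (a sketch using GNU find):

```shell
# Count the entries fsck dropped in, then show the five newest files.
# Old timestamps suggest the contents predate the current incident and
# are probably safe to ignore.
summarise_lostfound() {
    dir=$1
    printf '%s entries in %s\n' "$(find "$dir" -mindepth 1 | wc -l)" "$dir"
    find "$dir" -type f -printf '%T@ %p\n' | sort -rn | head -n 5
}
```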

We know that sd[dca]4 has holes in it, and sdd4 is not correct, but it rebuilt sdd4 from the other drives.

You now have a self-consistent raid set based on whatever was on sd[ac]4, which we never tested.

You still have sdb4, but I suspect that it's no longer useful.

You just have to sift through your data now.

----------

## ExecutorElassus

Hi Neddy,

fsck didn't change anything on any of the partitions it checked, except (I think) one inode on /usr with zero dtime. /usr/lost+found has *thousands* of files, but nothing newer than 2012. /var/lost+found/, /opt/lost+found/, and /lost+found/ are all empty. Is it safe to delete the contents of /usr/lost+found/?

I know this isn't the best result, and I'm frustrated that I lost the ddrescue'd sdd4. I'm checking through files now, but so far everything looks all right. Of course, as always, I have no way to know until I happen across a file or directory that's actually broken (this happened the last time I went through this on an emerge, when it turned out one whole directory of a particular package's .so files had its contents all turned to directories, and emerge renamed all its .so files to .so.backup, which I then had to clean up). 

But I'm afraid at this point that if I try to ddrescue sdb4 again I'm going to get even less data than I managed the last time.

Sigh. I guess I'm just going to have to take whatever losses I have, hope that they're few and can be rebuilt, and remember next time this happens to use 'mdadm --replace' before I start pulling drives out. 

I'll report back if I come across any damage, but for now I think the rest of this is on me to fix.

Thank you, as always, for helping me get my machine back up and running. You've always been the one to walk me through my various crises, and I'm really thankful the community has someone like you to help. If I'm ever in your part of the world, I owe you a few drinks.

Cheers,

EE

----------

