# rsync copy errors with large files 10-80GB each

## dbishop

I have been using rsync to copy files across disks, largely for archival purposes.  The files are often 20-40GB each, and there may be as much as 4TB in total.  The files are streamed data from a 16bit analog to digital converter, so the contents are for all intents and purposes random.

Up until now I have been copying these with rsync, generating md5sums on the source, copying them to the destination, then doing an md5sum -c <file> there to confirm fidelity.

When the files are relatively small, say <1GB there is never any problem. But when files are much larger (10-80GB each), the md5sums do not match as much as 35% of the time.

My questions are these:

1) Why is rsync failing without reporting any errors? Is this common?

2) What kind of error checking does rsync do (besides doing before-and-after checksums when requested)?

3) Is there a better way?

4) How can I troubleshoot the problem?

This is across many different disks and several fs types (reiserfs 3.6, ext4, ext2, ntfs). The interfaces range from (e)SATA3 and SAS to USB3. The disk technologies are rotating/magnetic and SSD.  This seems to be a problem without regard for any of these details.

I have tried using the --checksum option with rsync, but it is really not a viable solution as implemented.  For my application, it is better to loop this and do an md5sum one file at a time using a shell script.  At least I can get feedback on a per-file basis rather than waiting for rsync to compute 3TB worth of md5sums before getting any status.

Thanks always for all help

----------

## Akkara

I had seen a similar problem several years back.  In that case, it turned out to be a marginal motherboard (can't remember which one at the moment).  It would experience an occasional bitflip when both ethernet and hard disk traffic was going on at the same time.  Writing just to disk (with a local test program) worked OK.  Reading just from the net (with netcat | md5sum) also worked OK.  I just couldn't both read from the net and write to the disk.  Trying different kernels, stuff like that didn't change things.  Playing around with memory voltage and timing helped but didn't completely eliminate the problem.

Ended up solving it by getting server-class hardware, with ECC RAM.  Never seen that problem since.

In your case, what happens when you md5sum the entire disk, unmount it, remount, then do it again and compare the files?

Then try it again while other (unrelated) writes are going on on the other disk, see if it is a problem with both disks contending for the bus.

----------

## dbishop

Hmmm, this is a server-class machine: Xeon V3, 32GB ECC, checks out okay. The SAS HBAs are LSI.  There are two machines involved. If I copy 300 files to an intermediate drive (using, say, eSATA3), about 200 of the files will copy with no errors. Typical file size is 24GB.

Copying from the intermediary disk to another host will corrupt another 60 or 70 files.  I checked this, as I mentioned, by generating the md5sums on the original file system, then copying the md5sums with the files and doing a local md5sum -c.

What I haven't done is generate a set of md5sums for the intermediary files (such as they are) to see if some files are getting doubly corrupted with the second copy.  I suppose that might have some diagnostic value. I am running rsync with the -c option now, just to see what will happen.

Funny thing is rsync never reports any errors during the copy process. I have no idea what mechanisms it uses to confirm writes (if any).  I don't know if cp or scp uses any type of read/write/verify methods either.

And I am not wedded to rsync. If you (or anyone) has any suggestions for alternatives, I am happy to give those a try. But at 10min/file to copy, 10min to md5sum, doing recopies on 30% of the files can end up doubling or more the amount of time it takes.

But more than anything I would like to understand what is happening here.  And md5summing is a poor way to tell how much of a file is corrupted. But I don't know anything that will tell me to what extent the files are different (I can't imagine looking at two 24GB files side by side in dhex is practical...)

----------

## Akkara

 *dbishop wrote:*   

> Hmmm, this is a server-class machine Xeon V3, 32GB ECC, checks out okay.

 

Hmm, ok, probably not the hardware then.

There was a recent kernel (in the 4.x series, can't remember which) that had an ext4 filesystem corruption problem that only showed up on RAIDed filesystems.  I think it was fixed pretty quickly, but it popped to mind as something to check for.

 *Quote:*   

> But more than anything I would like to understand what is happening here.  And md5summing is a poor way to tell how much of a file is corrupted. But I don't know anything that will tell me to what extent the files are different (I can't imagine looking at two 24GB files side by side in dhex is practical...)

 

Write a quick program that takes two files as arguments, reads both and outputs the byte-for-byte XOR of the contents.  Pipe that thru something like "od", which will skip long strings of 0's.  Or, simply print out the differences right from the program as you encounter them.
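
A minimal sketch of that comparison, assuming the standard `cmp` tool is an acceptable stand-in for a hand-written XOR program: `cmp -l` already lists the offset and both byte values for every byte that differs. The function name `diff_summary` and its arguments are made up for illustration.

```shell
# diff_summary FILE_A FILE_B [MAX_LINES]
# Prints up to MAX_LINES differences as "offset octal_a octal_b"
# (offsets are 1-based, as reported by cmp -l), then the total count.
# Caveat: if one file is shorter, the missing tail is not counted.
diff_summary() (
    a=$1
    b=$2
    max=${3:-20}
    cmp -l "$a" "$b" 2>/dev/null | awk -v max="$max" '
        NR <= max { print $1, $2, $3 }
        END       { print "differing bytes: " NR }'
)
```

Something like `diff_summary original.dat copy.dat 50` would then show whether the damage is a single flipped bit, a scattering of bytes, or whole runs of bad data, which says a lot more than a failed md5sum does.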

----------

## cal22cal

"post-copy checksum"  won't be performed by rsync. 

e.g.

http://unix.stackexchange.com/questions/30970/does-rsync-verify-files-copied-between-two-local-drives

----------

## dbishop

Just thinking out loud here...

I have had problems with USB flash drives because of delayed writes, so by habit (maybe a bad one) I always perform a sync after a cp, mv, etc. whenever USB storage is involved. With such large files, I am wondering if buffers are getting corrupted or are not getting properly flushed... After all, this problem does not happen on relatively small files.  The disks themselves also have size-varying RAM buffers...

Is there a kernel setting I should be looking at?

Anyway, instead of rsync, I am wondering if a shell script isn't better. One problem with rsync's checksum is that it runs an md5sum on all files each time, and the md5sum itself is not preserved. In my specific case I am using intermediary disks to transfer files (the source file system is in a physically different location, making direct source -> ultimate_destination impossible). Thus having an md5 file created at the source once, then kept with the data file, would be helpful in ensuring integrity to the original. And I only have to run a confirming md5sum, which given the task at hand can save hours.

Then a simple one-liner like this could be used to create a list of bad files:

```
for i in *.md5 ; do md5sum -c "$i" | grep "FAILED" | grep -v "open" | cut -f1 -d':' >> ~/bad_files.tmp ; tail -n 1 ~/bad_files.tmp ; done
```

Then something like this to do the re-copying:

```
while read -r i ; do cp -v "/source/path/$i" /destination/path/ ; sync ; done < ~/bad_files.tmp
```

This is not fully baked, it's just presented for any comments/improvements, etc. For example, if you folks think this is a decent (workable) way to do this, I am sure the whole thing could be put in a single script.

I am still searching for a supported read-copy-buffer_flush-verify utility that will do this in blocks rather than in the all-or-nothing way described above. Even on a fast machine with high-speed disk I/O each file can take minutes end-to-end. When there are hundreds of files, and potential recopying, this could easily swell to the better part of a day.  At least if I knew that on copy-block 1000 we had a problem, we could restart (or better, just retry the block) as the process went along.
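
To make the idea concrete, here is a rough sketch of a block-at-a-time copy with a verify and retry per block. It is not a supported utility, just `dd` and `md5sum` in a loop. The name `copy_blocks` and its arguments are invented for this sketch, and it assumes a `dd` that understands `conv=notrunc,fsync` (GNU dd does).

```shell
# copy_blocks SRC DST [BLOCK_SIZE_BYTES] [MAX_TRIES]
# Copies SRC to DST one block at a time; each block is flushed, read back,
# and compared against a fresh checksum of the same block in SRC.
copy_blocks() (
    src=$1
    dst=$2
    bs=${3:-67108864}          # 64 MiB per block by default
    max_tries=${4:-3}
    size=$(wc -c < "$src") || exit 1
    blocks=$(( (size + bs - 1) / bs ))
    : > "$dst" || exit 1       # start from an empty destination file
    i=0
    while [ "$i" -lt "$blocks" ]; do
        # Checksum of block i as read from the source
        want=$(dd if="$src" bs="$bs" skip="$i" count=1 2>/dev/null | md5sum)
        try=1
        while :; do
            # Copy block i into place and ask dd to fsync the output
            dd if="$src" of="$dst" bs="$bs" skip="$i" seek="$i" count=1 \
               conv=notrunc,fsync 2>/dev/null
            # Read the block back from the destination and compare
            got=$(dd if="$dst" bs="$bs" skip="$i" count=1 2>/dev/null | md5sum)
            [ "$got" = "$want" ] && break
            if [ "$try" -ge "$max_tries" ]; then
                echo "block $i still bad after $try tries" >&2
                exit 1
            fi
            try=$((try + 1))
        done
        i=$((i + 1))
    done
)
```

One caveat: the read-back will usually be served from the page cache rather than from the disk itself, so this mostly catches errors on the way in. Adding `iflag=direct` to the read-back `dd` bypasses the cache on filesystems that support O_DIRECT, at a cost in speed.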

Thoughts?

----------

## cal22cal

Back in the old days, when only DSL modems were in use, people sent files through FTP and sometimes got corrupt files.

There was a utility (sorry, I forgot its name) that would compute the diff and resend only the corrupted part(s) for remote correction.

Let's assume there is no hardware-related problem, i.e. bad memory, bad hard drives, silent data corruption ...

You can chop such a large file into acceptably small pieces, check the checksum of each piece, reconstruct the file,

then check the whole file again ...

Anyway, keeping a checksum for each source file should be a must.

Actually, bittorrent is the kind of utility that would be worth a look.  ;)

Say, first send by ftp, then let bittorrent verify and resend the corrupted parts...

English is not my native language, though.

----------

## dbishop

Well, Bash to the rescue. I wrote a shell script that does what I want.  Works perfectly.

Interesting thing is I just used it to copy 650 files -- all under 8GB -- and not a single error.  Same hardware.  This has something to do with file sizes.

Essentially, the script does an md5sum on the original, copies both the md5sum file and the original to the destination disk, then does an md5 check on the copy. If the checksum matches, the file name is echoed out to a "good" file. If it fails, the corrupted file is deleted from the destination disk and the copy is tried again. Any second-time failures are echoed out to a "failed" file.  These good and failed lists should help me narrow in on what is going wrong.
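
The actual script was not posted, so the following is only a sketch of the steps just described, not the script itself. The function name `copy_verify` and the `GOOD_LOG` / `FAILED_LOG` variables are invented for illustration, and `md5sum -c --quiet` assumes GNU coreutils.

```shell
# copy_verify SRC_FILE DEST_DIR
# Checksum at the source, copy file and .md5, verify at the destination,
# retry once on mismatch, and log the outcome.
copy_verify() (
    src=$1
    dest_dir=$2
    good_log=${GOOD_LOG:-good_files.txt}
    failed_log=${FAILED_LOG:-failed_files.txt}
    name=$(basename "$src")
    src_dir=$(dirname "$src")
    # Store the checksum with a bare file name so that
    # "md5sum -c" works from inside the destination directory.
    ( cd "$src_dir" && md5sum "$name" > "$name.md5" ) || exit 1
    cp "$src.md5" "$dest_dir/" || exit 1
    for attempt in 1 2; do
        cp "$src" "$dest_dir/" && sync
        if ( cd "$dest_dir" && md5sum -c --quiet "$name.md5" ) >/dev/null 2>&1; then
            echo "$name" >> "$good_log"
            exit 0
        fi
        # Mismatch: remove the bad copy before the next attempt
        rm -f "$dest_dir/$name"
    done
    echo "$name" >> "$failed_log"
    exit 1
)
```

A loop such as `for f in /source/path/*.dat; do copy_verify "$f" /destination/path; done` would then drive it (the `.dat` extension is just a placeholder). Note that the verifying md5sum may read the freshly copied file from the page cache rather than from the disk.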

One thing that I added was a sync command following each cp command. One working theory I have is that with super-large files the buffer structures may not have time to fully flush (some kind of an overflow condition). This is just a working theory at the moment, but fwiw, the sync seems to be helping.  I will run it on files >8GB shortly to see if the problem comes back.

----------

## dbishop

Another issue I found was that one of the machines I was using actually did have bad memory. It passed all the BIOS memchecks (I always turn those on at the expense of boot time), but they failed to catch the bad memory block problems. memtest86 found bad RAM. The affected block was very limited, and I am guessing that only super-large file copies used enough memory to get corrupted.

While the proximate cause was bad RAM, none of the disk file utilities could cope with the problem from within its own processes as far as I can determine. rsync has a checksum-after-copy option, but waiting for an 80GB file to copy, then wait while running an all-or-nothing checksum is of limited value (to me).

What is alarming to me is how undetectable these errors are. There was absolutely nothing that indicated any problem existed, or that the files were not moved with 100% fidelity. I had to specifically look.  Machines with big memory that is not fully utilized all the time may easily mask these problems (like here with me).

Does anyone know if there are any read-write-verify settings at the block level? md5summing is okay, but it's all after-the-fact and all-or-nothing...

----------

## Akkara

 *dbishop wrote:*   

> Another issue I found was that one of the machines I was using actually did have bad memory. It passed all the BIOS memchecks (I always turn those on at the expense of boot time), but it failed to catch the bad memory block problems. memtest86 found bad RAM.

 

Those are exactly the same symptoms I had seen several years ago, which led me to get server-class hardware with ECC.  Best investment ever, rock solid, not a single hiccup since then.

 *Quote:*   

> Does anyone know if there are any read-write-verify settings at the block level? md5summing is okay, but it's all after-the-fact and all-or-nothing...

 

Several posts up, cal22cal's idea of using bittorrent sounds like a good one.  It does checksumming on selectable blocksizes, usually around 1MB or so and will re-transfer a block until good.

But even that won't help if the initial read of the block was bad.

The dd program, used with iflag=direct, bypasses the file cache and would let you read the same block twice, each time getting it from disk, and hopefully not hitting the same error both times.  Add bs=X to specify a suitable blocksize (such as 1M, or even 64M) and count=1, and loop through values of skip to checksum block by block, essentially compiling a block checksum list similar to bittorrent's.  Then copy the file over however you want.  Then run the same dd commands on the remote host to compile a block checksum list there, diff the two lists, and send over any differences using similar dd invocations on the blocks that differ.  But first, re-run the checksums on the differing blocks locally, to be sure the block wasn't read incorrectly the 1st time.

Probably wouldn't be too hard to set up a shell script that does this.
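
A rough sketch of how the block checksum list could be built and compared, under a few assumptions: the names `block_sums` and `differing_blocks` are made up here, and `DD_IFLAG` is a hypothetical knob that passes a value such as `direct` through to dd's `iflag=` option (O_DIRECT is not available on every filesystem, so it is off by default).

```shell
# block_sums FILE [BLOCK_SIZE_BYTES]
# Prints one line per block: "<block index> <md5 of that block>".
# Set DD_IFLAG=direct to read around the page cache where supported.
block_sums() (
    file=$1
    bs=${2:-1048576}           # 1 MiB blocks by default
    size=$(wc -c < "$file") || exit 1
    blocks=$(( (size + bs - 1) / bs ))
    i=0
    while [ "$i" -lt "$blocks" ]; do
        sum=$(dd if="$file" bs="$bs" skip="$i" count=1 \
                 ${DD_IFLAG:+iflag="$DD_IFLAG"} 2>/dev/null |
              md5sum | cut -d' ' -f1)
        echo "$i $sum"
        i=$((i + 1))
    done
)

# differing_blocks LIST_A LIST_B
# Prints the indexes of blocks whose checksums differ between two lists.
differing_blocks() {
    diff "$1" "$2" | awk '/^[<>]/ { print $2 }' | sort -nu
}
```

Each host would run `block_sums` on its own copy with the same block size and save the output; `differing_blocks` on the two lists then names the blocks to re-check and resend. A single block N can be patched in place with something like `dd if=good of=bad bs=1048576 skip=N seek=N count=1 conv=notrunc`, where `good` and `bad` are placeholders for the two copies.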

There are filesystems such as zfs and btrfs that maintain block checksums of the files.  That should be more robust, assuming the initial creation goes without error.  Once bad memory is in play it's difficult to guarantee anything.

 *Quote:*   

> One thing that I added was a sync command following each cp ...

 

It also could be that the sync alters the buffer allocations enough that they fortuitously avoid the bad memory area.

----------

## dbishop

I liked the bittorrent idea myself. Pretty clever.

I have been running tons of testing with machines that now have memtest86-verified memory.  Out of two hundred hours copying files of 5GB ~ 25GB, I have had exactly one get copied with errors. A subsequent copy fixed it. The testing has used SAS, SATA2, SATA3, and USB3 interfaces.  The disks are all "enterprise class" with a lot of cache, so thermal recomp cycles and whatnot should not be a factor.  Some of the disks have been SSD, but mostly magnetic.

For now what I have settled on is a bash script that generates a local md5 file, copies both to the destination, then verifies the copied md5 against the copied data file.  The application is a bit unique: I have large data files with very small xml descriptor files that are paired with the data files. The data never has any human-readable content; it is for all intents and purposes random data. So the script actually handles both the data and descriptor files: it md5's both, copies both, verifies both.  The results get logged (extensively) so that I can know about, and deal with, any problems.

I suppose I ought to see about writing my own cp+verify application, but for now the script gets it done.

Thanks for the help and the good ideas  :)

----------

