
A look at rsync performance

A look at rsync performance

Posted Aug 19, 2010 0:53 UTC (Thu) by cooperstein (subscriber, #1139)
Parent article: A look at rsync performance

rsync uses encryption. Depending on which algorithm you use, it can
make a big difference. I generally see as much as a 50 percent drop in
throughput compared to cp. You don't seem to be able to turn off
encryption altogether, but you can use an algorithm with lower CPU
usage, like blowfish.



A look at rsync performance

Posted Aug 19, 2010 1:28 UTC (Thu) by jdub (subscriber, #27) [Link]

You're confusing 100% local rsync for rsync over SSH. :-)

A look at rsync performance

Posted Aug 19, 2010 1:46 UTC (Thu) by joey (subscriber, #328) [Link]

No encryption of course, but it *does* calculate rolling checksums. Quoth the man page:

Note that rsync always verifies that each transferred file was
correctly reconstructed on the receiving side by checking a
whole-file checksum that is generated as the file is transferred

I think that means both the client and server sides checksum the file,
even if rsync is running locally. Hence the CPU usage, etc.

Most of the reason to use rsync locally is its nice interface. It should be possible to have an rsync variant that omits using the checksums, and simply overwrites the destination file always, like cp -- but with the rest of the rsync interface left intact. That should be much faster on some hardware.

For example, I have an ARM fileserver that I used to use to rsync data to an external USB disk. It turns out to be faster to run rsync on a faster (Intel) client, even though it has to get the data over NFS.

Since md4 tends to be 50% or so faster than md5, running rsync with --protocol=29 may also be a nice way to speed it up.

A look at rsync performance

Posted Aug 19, 2010 2:30 UTC (Thu) by Trelane (subscriber, #56877) [Link]

"Most of the reason to use rsync locally is its nice interface."

I disagree rather strenuously. IMHO, the main reason to use rsync locally is to copy over an update: e.g. you have a camcorder with a bunch of videos that you've previously copied plus some new ones that you haven't, or you're backing up a large dataset in which a number of files have been updated or are new since the previous backup.

"It should be possible to have an rsync variant that omits using the checksums, and simply overwrites the destination file always, like cp -- but with the rest of the rsync interface left intact"

Or you could use the right tool for the job, e.g. tar or cp. If the files are entirely new, there's no point in using rsync; there's no need to calculate any checksums (unless you're verifying the integrity of the copy perhaps).

A look at rsync performance

Posted Aug 19, 2010 2:33 UTC (Thu) by joey (subscriber, #328) [Link]

I've written a simple rsync accelerator script, local-rsync:

http://git.kitenet.net/?p=joey/home.git;a=blob_plain;f=bi...

It takes the same options as rsync, except that the src and dest directories
must be specified as the first two parameters, and neither directory can be remote.

It operates by simply using rsync --dry-run to determine which files need to be updated, and then copying them to the dest directory using cp. rsync is run at the end to handle everything else.

Testing on my laptop, rsync takes 19 seconds to sync a directory containing a 260 MB file. local-rsync takes 8 seconds. That is roughly in line with the benchmarks in this article.
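
The approach described above can be sketched roughly as follows. This is not the local-rsync script itself, only a minimal Python illustration of the same three steps. It assumes rsync is on the PATH, that `--out-format=%n` prints one relative path per line during a dry run, and that file names contain no newlines or characters rsync would escape. All function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def files_to_update(src, dst, rsync_opts=("-a",)):
    """Return the relative paths rsync reports it would transfer (dry run only)."""
    cmd = ["rsync", "--dry-run", "--out-format=%n", *rsync_opts,
           f"{src}/", f"{dst}/"]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    # Names ending in "/" are directories; the final rsync pass handles those.
    return [name for name in out.splitlines() if name and not name.endswith("/")]


def copy_files(src, dst, names):
    """Copy the listed regular files with a plain copy; return the names copied."""
    copied = []
    for name in names:
        source, target = Path(src) / name, Path(dst) / name
        if source.is_symlink() or not source.is_file():
            continue  # symlinks, special files and unknown lines are left to rsync
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also copies the modification time
        copied.append(name)
    return copied


def local_sync(src, dst, rsync_opts=("-a",)):
    """Plain-copy what rsync would transfer, then let rsync handle the rest."""
    copy_files(src, dst, files_to_update(src, dst, rsync_opts))
    subprocess.run(["rsync", *rsync_opts, f"{src}/", f"{dst}/"], check=True)
```

Since the plain copy keeps size and modification time, the closing rsync run should find those files already up to date and only deal with directories, permissions, and anything else the copy skipped.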

A look at rsync performance

Posted Aug 19, 2010 15:16 UTC (Thu) by jcvw (subscriber, #50475) [Link]

Note that checksumming does not explain the excess system time, since the checksumming is done in userspace.
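
One way to check this point is to hash a buffer that is already in memory and compare the user and system CPU time the process accumulates while doing so. The sketch below is only an illustration with a hypothetical helper name; MD5 via Python's hashlib stands in for whatever checksum rsync is using, and the numbers will vary by machine.

```python
import hashlib
import os


def checksum_cpu_times(data, algorithm="md5"):
    """Hash data in memory; return (hex digest, user seconds, system seconds)."""
    before = os.times()
    digest = hashlib.new(algorithm, data).hexdigest()
    after = os.times()
    return digest, after.user - before.user, after.system - before.system


if __name__ == "__main__":
    payload = bytes(64 * 1024 * 1024)  # 64 MiB of zeros, allocated before timing
    digest, user_s, system_s = checksum_cpu_times(payload)
    print(f"md5={digest} user={user_s:.3f}s system={system_s:.3f}s")
```

If the checksum work is pure userspace computation, nearly all of the measured time should land in the user column, so a large system-time figure has to come from somewhere else, such as the read and write calls.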

A look at rsync performance

Posted Sep 4, 2010 8:53 UTC (Sat) by llloic (subscriber, #5331) [Link]

"It should be possible to have an rsync variant that omits using the checksums, and simply overwrites the destination file always, like cp -- but with the rest of the rsync interface left intact. That should be much faster on some hardware."

Note that rsync already does the "overwrites the destination file always, like cp" part, as you put it, with the --whole-file option, which is "the default when both the source and destination are specified as local paths", to quote the man page.

As far as I understand, when rsync acts on local files, the only thing it does beyond a "normal" cp is compute the whole-file checksum.
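
To make the extra work concrete, here is a simplified single-process model of "cp plus a whole-file checksum", with the data hashed once on the sending side and once on the receiving side as discussed earlier in the thread. It is a sketch under assumptions, not rsync's implementation: MD5 from hashlib is a stand-in for rsync's checksum, and the function name is hypothetical.

```python
import hashlib


def copy_with_whole_file_checksum(src_path, dst_path, chunk_size=1 << 20):
    """Copy src to dst, hashing every chunk on both 'sides'; return the hex digest."""
    sender_sum = hashlib.md5()
    receiver_sum = hashlib.md5()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while chunk := src.read(chunk_size):
            sender_sum.update(chunk)    # work done by the "sending" side
            dst.write(chunk)            # the part a plain cp would also do
            receiver_sum.update(chunk)  # work done by the "receiving" side
    if sender_sum.digest() != receiver_sum.digest():
        raise OSError(f"whole-file checksum mismatch for {dst_path}")
    return receiver_sum.hexdigest()
```

Every byte goes through the hash function twice in this model, which is the CPU cost a plain cp avoids.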

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds