
The Linux "copy problem"

Posted May 29, 2019 21:05 UTC (Wed) by smfrench (subscriber, #124116)
In reply to: The Linux "copy problem" by roc
Parent article: The Linux "copy problem"

In the presentation I listed seven options that could be added (e.g. to cp and rsync). Other copy tools (like robocopy for Windows) have these (as well as others that may be less important for us on Linux) and may be useful examples.

For example some options which other tools like robocopy let the user select:
- parallel i/o (especially for the uncached copy case)
- allow setting file size first (to reduce the number of metadata updates during the copy operation)
- allow calling the copy system call (copy_file_range API) for file systems which support it
- allow copying additional metadata (e.g. xattrs and ACLs)
- allow choosing larger i/o (overriding the block size). For some filesystems i/o > 1MB can be much faster than small I/O (some tools will default to 4K or smaller which can be more than 10 times slower)
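
As a rough illustration of three of the options above (setting the file size first, using copy_file_range where available, and choosing a large I/O size), here is a minimal Python sketch. The function name `copy_file` and the 8 MiB default are assumptions made for the example, not anything cp, rsync, or robocopy actually does; the best I/O size depends on the filesystem.

```python
import os

CHUNK = 8 * 1024 * 1024  # assumed large I/O size; tune per filesystem


def copy_file(src, dst, chunk=CHUNK):
    """Copy src to dst using large I/O; return the number of bytes copied."""
    fin = os.open(src, os.O_RDONLY)
    try:
        fout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(fin).st_size
            # Set the final size once, up front, instead of extending the
            # file a little on every write.
            os.ftruncate(fout, size)
            use_cfr = hasattr(os, "copy_file_range")
            copied = 0
            while copied < size:
                want = min(chunk, size - copied)
                if use_cfr:
                    try:
                        n = os.copy_file_range(fin, fout, want, copied, copied)
                    except OSError:
                        # Not supported here (e.g. EXDEV, ENOSYS, EINVAL):
                        # fall back to plain read/write for the rest.
                        use_cfr = False
                        continue
                else:
                    data = os.pread(fin, want, copied)
                    n = len(data)
                    written = 0
                    while written < n:
                        written += os.pwrite(fout, data[written:], copied + written)
                if n == 0:
                    break  # source shrank while we were copying
                copied += n
            if copied < size:
                os.ftruncate(fout, copied)
            return copied
        finally:
            os.close(fout)
    finally:
        os.close(fin)
```

Calling `copy_file("a.bin", "b.bin", chunk=1 << 20)` would copy in 1 MiB steps; xattrs, ACLs, and parallel I/O are left out of this sketch.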

And then following up on other discussions at the summit:
- allow options like encryption or compression (which could be supported over SMB3 for example and probably other filesystems).



The Linux "copy problem"

Posted May 29, 2019 23:05 UTC (Wed) by roc (subscriber, #30627) [Link]

That makes sense, but you also want the default to be as good as it can be.

The Linux "copy problem"

Posted May 30, 2019 16:20 UTC (Thu) by boutcher (subscriber, #7730) [Link]

I had to laugh that you brought up OS/2

The Linux "copy problem"

Posted Jun 1, 2019 1:40 UTC (Sat) by tarkasteve (subscriber, #94934) [Link] (4 responses)

I'd also humbly suggest `xcp`:

https://crates.io/crates/xcp

* Uses copy_file_range() where possible, falls back to userspace if not.
* Supports sparse files (with lseek; I wasn't aware of fiemap, is there any advantage to one over the other?)
* Partially parallel (the recursive read is separate from the copy operations; I have a to-do for parallel copy, as it seems to have advantages on NVMe drives).
* Optional progress bar.
* Written in Rust
* Cross platform (well, Linux + other unix-like OSs; Windows may work, I've never managed to get Rust to work on it).
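
For readers unfamiliar with the lseek approach to sparse files mentioned in the list above, a minimal Python sketch follows. It is not xcp's code (xcp is written in Rust); the name `copy_sparse` is made up for the example. It walks the source with SEEK_DATA/SEEK_HOLE and writes only the data extents, so holes stay holes in the destination on filesystems that support them.

```python
import errno
import os


def copy_sparse(src, dst, chunk=1024 * 1024):
    """Copy only the data extents of src; return the number of bytes written."""
    fin = os.open(src, os.O_RDONLY)
    try:
        fout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(fin).st_size
            # Give dst its full length now; unwritten ranges remain holes.
            os.ftruncate(fout, size)
            written_total = 0
            pos = 0
            while pos < size:
                try:
                    start = os.lseek(fin, pos, os.SEEK_DATA)
                except OSError as e:
                    if e.errno == errno.ENXIO:
                        break  # only a hole remains after pos
                    raise
                # End of this data extent (EOF counts as an implicit hole).
                end = os.lseek(fin, start, os.SEEK_HOLE)
                off = start
                while off < end:
                    data = os.pread(fin, min(chunk, end - off), off)
                    if not data:
                        break  # source shrank underneath us
                    done = 0
                    while done < len(data):
                        done += os.pwrite(fout, data[done:], off + done)
                    off += len(data)
                    written_total += len(data)
                pos = end
            return written_total
        finally:
            os.close(fout)
    finally:
        os.close(fin)
```

On a filesystem that does not report holes, the whole file shows up as a single data extent and this degrades to an ordinary copy.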

It doesn't support much in the way of permissions/ACLs ATM, it's still an intermittent WIP.

I did look at using O_DIRECT, but I get EINVAL. The open manpage lists a whole series of caveats and warnings about using it, including a disparaging quote from Linus.
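
One common cause of EINVAL with O_DIRECT is a buffer address, file offset, or transfer length that is not aligned to the device's block size; whether that is what happened in this case is only a guess. The Python sketch below shows the aligned pattern. The helper name `read_file_direct` and the 1 MiB buffer are assumptions for the example, and it falls back to buffered I/O when the filesystem rejects O_DIRECT outright.

```python
import errno
import mmap
import os


def _read_direct(path, bufsize):
    # O_DIRECT is Linux-specific; where it is missing this is a normal open.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_DIRECT", 0))
    # Anonymous mmap memory is page-aligned, which satisfies the buffer
    # alignment requirement; bufsize should be a multiple of the block size.
    buf = mmap.mmap(-1, bufsize)
    out = bytearray()
    try:
        off = 0
        while True:
            # Offsets stay multiples of bufsize, so they remain aligned.
            n = os.preadv(fd, [buf], off)
            out += buf[:n]
            off += n
            if n < bufsize:
                break  # a short read means end of file
    finally:
        buf.close()
        os.close(fd)
    return bytes(out)


def read_file_direct(path, bufsize=1 << 20):
    """Read a whole file with O_DIRECT, falling back to buffered I/O on EINVAL."""
    try:
        return _read_direct(path, bufsize)
    except OSError as e:
        if e.errno != errno.EINVAL:
            raise
        with open(path, "rb") as f:
            return f.read()
```

The caveats in the open manpage still apply; this only addresses the alignment part.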

Thanks for the discussion/article, it's given me some things to look into.

The Linux "copy problem"

Posted Jun 1, 2019 13:04 UTC (Sat) by desbma (guest, #118820) [Link] (1 responses)

Thanks for the link.

It joins the list of great little tools that have taken inspiration from classic Unix command line tools, but rewritten them in Rust with many improvements along the way: grep -> ripgrep, find -> fd, hexdump -> hexyl, cat -> bat, du -> diskus, cloc -> tokei...

I'll be sure to look into xcp, and probably open a few issues along the way :)

The Linux "copy problem"

Posted Jun 2, 2019 3:02 UTC (Sun) by scientes (guest, #83068) [Link]

I myself was using inotail until I reported the problem (tail -f didn't support inotify) to coreutils and it was actually fixed.

The Linux "copy problem"

Posted Jun 2, 2019 5:12 UTC (Sun) by tarkasteve (subscriber, #94934) [Link] (1 responses)

So inspired by all this, I've updated xcp with the ability to do parallel copies (at the per-file level). The results are fairly good; I'm seeing 30%-60% speed-ups depending on caching.
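
For illustration, per-file parallelism can be sketched in a few lines of Python. This is not how xcp implements it; `copy_tree_parallel` and the default of four workers are assumptions for the example, and unlike xcp it finishes walking the tree before any copying starts.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def copy_tree_parallel(src_root, dst_root, workers=4):
    """Copy a directory tree, copying individual files concurrently.

    Returns the number of files copied.
    """
    jobs = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        # Create directories serially so workers never race on mkdir.
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            jobs.append((os.path.join(dirpath, name),
                         os.path.join(target_dir, name)))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every copy and re-raises any worker exception.
        list(pool.map(lambda job: shutil.copyfile(job[0], job[1]), jobs))
    return len(jobs)
```

Threads are enough here because the work is I/O-bound; how much it helps will depend on the storage and on how much of the data is already cached.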

The Linux "copy problem"

Posted Jun 10, 2019 21:58 UTC (Mon) by smfrench (subscriber, #124116) [Link]

This is great news - looking forward to trying it. I am also very excited about the work Andreas at Red Hat did, enabling GCM crypto for SMB3.1.1 mounts, which can more than double performance when copying files to the server on encrypted mounts (in conjunction with two cifs.ko client patches that I recently merged into for-next that enable GCM on the client).


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds