copy_range()

By Jonathan Corbet
May 15, 2013
Copying a file is a common operation on any system. Some filesystems have the ability to accelerate copy operations considerably; for example, Btrfs can just add another set of copy-on-write references to the file data, and the NFS protocol allows a client to request that a copy be done on the server, avoiding moving the data over the net twice. But, for the most part, copying is still done the old-fashioned way, with the most sophisticated applications possibly using splice().
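For reference, the "old-fashioned way" amounts to bouncing every byte through a user-space buffer. The following is a minimal sketch of such a loop, not code from any particular tool; the function name copy_fd_loop() and the 64KB buffer size are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Copy everything readable from in_fd to out_fd through a user-space
 * buffer. Returns the number of bytes copied, or -1 on error (errno set).
 * copy_fd_loop() is a hypothetical helper, not an existing API.
 */
ssize_t copy_fd_loop(int in_fd, int out_fd)
{
    char buf[64 * 1024];   /* arbitrary buffer size chosen for the sketch */
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof(buf));
        if (n == 0)
            return total;              /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted, retry the read */
            return -1;
        }

        /* write() may accept fewer bytes than requested; loop until done */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

Every byte crosses the kernel/user boundary twice here, which is the overhead that a filesystem-level copy operation could avoid.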

There have been various proposals over the years for ways to speed up copy operations (reflink(), for example), but nothing has ever made it into the mainline. The latest attempt is Zach Brown's copy_range() patch. It adds a new system call:

    int copy_range(int in_fd, loff_t *in_offset,
                   int out_fd, loff_t *out_offset, size_t count);

The intent of the system call is fairly clear: copy count bytes from the input file to the output. It is not said anywhere, but it's implicit in the patch that the two files should be on the same filesystem.

Inside the kernel, a new copy_range() member is added to the file_operations structure; each filesystem is meant to implement that operation to provide a fast copy operation. There is no fallback at the VFS layer if copy_range() is unavailable, but that looks like the sort of omission that would be fixed before mainline merging. Whether merging will ever happen remains to be seen; this is an area that is littered with abandoned code from previous failed attempts.
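The missing fallback can be pictured with a small user-space model. This is only a sketch of the dispatch idea, not the patch's code: struct copy_ops, do_copy_range(), and generic_copy_range() are hypothetical names, and the generic path simply moves the data with pread() and pwrite().

```c
#define _XOPEN_SOURCE 700   /* for pread()/pwrite() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-filesystem hook in file_operations. */
struct copy_ops {
    ssize_t (*copy_range)(int in_fd, off_t in_off,
                          int out_fd, off_t out_off, size_t count);
};

/* Generic path: bounce the data through a buffer. */
static ssize_t generic_copy_range(int in_fd, off_t in_off,
                                  int out_fd, off_t out_off, size_t count)
{
    char buf[64 * 1024];
    size_t done = 0;

    while (done < count) {
        size_t want = count - done;
        if (want > sizeof(buf))
            want = sizeof(buf);

        ssize_t n = pread(in_fd, buf, want, in_off + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                     /* input ended before count bytes */

        for (ssize_t off = 0; off < n; ) {
            ssize_t w = pwrite(out_fd, buf + off, (size_t)(n - off),
                               out_off + (off_t)done + off);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Use the fast hook when one is provided, otherwise fall back. */
ssize_t do_copy_range(const struct copy_ops *ops, int in_fd, off_t in_off,
                      int out_fd, off_t out_off, size_t count)
{
    assert(ops != NULL);
    if (ops->copy_range)
        return ops->copy_range(in_fd, in_off, out_fd, out_off, count);
    return generic_copy_range(in_fd, in_off, out_fd, out_off, count);
}
```

With a fallback like this in place, callers would not need to know whether the underlying filesystem provides an accelerated copy.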


No reflink()?

Posted May 16, 2013 10:06 UTC (Thu) by jezuch (subscriber, #52988)

Wait, reflink() is not in the mainline? I've been using cp --reflink repeatedly and successfully on my btrfs partition, so I am puzzled. Is it using some other mechanism (like a per-fs ioctl())?

No reflink()?

Posted May 16, 2013 12:41 UTC (Thu) by Yorick (subscriber, #19241)

Yes, it uses ioctl(BTRFS_IOC_CLONE).

No reflink()?

Posted May 16, 2013 12:41 UTC (Thu) by Tobu (subscriber, #24111)

cp --reflink just tries BTRFS_IOC_CLONE at the moment. For completeness, without --reflink=always it also falls back to normal copying silently.
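The try-the-clone-then-fall-back behavior described here can be sketched roughly as follows. This is not cp's actual code: clone_or_copy() and the result enum are hypothetical, the ioctl number is defined locally (with the same value as in <linux/btrfs.h>) only so that the sketch builds without kernel headers, and the fallback assumes both descriptors are seekable and that the destination starts out empty.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Same value as in <linux/btrfs.h>; repeated here for self-containment. */
#ifndef BTRFS_IOC_CLONE
#define BTRFS_IOC_CLONE _IOW(0x94, 9, int)
#endif

/* Hypothetical result codes for this sketch. */
enum copy_result { COPY_FAILED = -1, COPY_CLONED = 0, COPY_FALLBACK = 1 };

/*
 * Ask the filesystem to clone src_fd into dst_fd. If that fails and
 * allow_fallback is nonzero, copy the whole file with read()/write().
 */
int clone_or_copy(int src_fd, int dst_fd, int allow_fallback)
{
    char buf[64 * 1024];

    assert(src_fd >= 0 && dst_fd >= 0);

    /* The clone ioctl is issued on the destination, with the source fd
     * passed as the argument. */
    if (ioctl(dst_fd, BTRFS_IOC_CLONE, src_fd) == 0)
        return COPY_CLONED;
    if (!allow_fallback)
        return COPY_FAILED;

    /* A clone covers the whole file, so the fallback starts at offset 0. */
    if (lseek(src_fd, 0, SEEK_SET) < 0 || lseek(dst_fd, 0, SEEK_SET) < 0)
        return COPY_FAILED;

    for (;;) {
        ssize_t n = read(src_fd, buf, sizeof(buf));
        if (n == 0)
            return COPY_FALLBACK;      /* whole file copied */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return COPY_FAILED;
        }
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(dst_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return COPY_FAILED;
            }
            off += w;
        }
    }
}
```

Passing allow_fallback as zero makes a failed clone an error rather than a silent full copy, while a nonzero value gives the quiet fallback.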

copy_range()

Posted May 17, 2013 2:27 UTC (Fri) by felixfix (subscriber, #242)

Is there any prohibition on both fds being the same?

copy_range()

Posted May 17, 2013 17:59 UTC (Fri) by zab (subscriber, #7281)

> Is there any prohibition on both fds being the same?

At the moment, yes.

I expect that we'll relax this restriction as the patch series develops and the ->copy_range() methods correctly implement this case.

copy_range()

Posted May 26, 2013 17:46 UTC (Sun) by heijo (guest, #88363)

This would be useful to "insert" or "delete" data from files, changing the offset of existing data.

But how about adding the ability to "move" to copy_range, to improve the "delete" case and other possibilities?

It would be this set of flags:
1. Leave non-overlapping source untouched (copy)
2. Zero out source
3. Replace source with zeroes or with random data not related to source
4. Truncate source file to beginning of source range or max(beginning_of_source, end_of_dest) if files are the same

If an application specifies more than one, the kernel chooses the most space-efficient method and, if there is a tie, the most time-efficient one.

copy_range()

Posted May 17, 2013 16:20 UTC (Fri) by joern (subscriber, #22392)

> It is not said anywhere, but it's implicit in the patch that the two files should be on the same filesystem.

Not necessarily. Provided the necessary plumbing exists, the two files could also be two different block devices on the same SCSI array, with the copying done via XCOPY.

copy_range()

Posted May 18, 2013 4:06 UTC (Sat) by foom (subscriber, #14868)

I'm really confused by the argument against extending sendfile/splice to do this. It looks like it's exactly the same thing, only with support for a different subset of types of files...

Why does that need a new name?

copy_range()

Posted Jun 18, 2013 16:39 UTC (Tue) by vedantk (subscriber, #88435)

I'm trying to understand why the in_offset and out_offset parameters need to be pointers. At first I assumed it was to correct for non-block-aligned offsets. However, the prototype for vfs_copy_range() takes the offsets by value instead [1]. Perhaps I missed something at first glance.

[1]
    vfs_copy_range(struct file *file_in, loff_t pos_in,
                   struct file *file_out, loff_t pos_out,
                   size_t count)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds