
The two sides of reflink()

Posted May 9, 2009 23:31 UTC (Sat) by butlerm (subscriber, #13312)
In reply to: The two sides of reflink() by giraffedata
Parent article: The two sides of reflink()

Assuming they have the necessary privileges to do so, obviously.



The two sides of reflink()

Posted May 10, 2009 1:45 UTC (Sun) by giraffedata (guest, #1954) [Link] (14 responses)

Assuming they have the necessary privileges to do so, obviously.

Well not obviously, since that assumption leaves the situation equally weird.

But now that you've said that's what you have in mind, maybe you can elaborate. Would an unprivileged person be able to use reflink? What would happen if he did it on a file he doesn't own? Would it be possible for someone to create a file he can't access? One whose space is charged to someone else?

The two sides of reflink()

Posted May 10, 2009 10:30 UTC (Sun) by nix (subscriber, #2304) [Link] (12 responses)

It already is possible. Create a directory readable/executable only by
yourself; hardlink someone else's file into it; wait for that other person
to delete it. Now you've stolen that person's quota.
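
As an illustration of the sequence described above, here is a minimal Python sketch. It assumes a single user plays both roles inside a scratch directory, so it shows only the mechanics (the data and its ownership outlive the original name), not an actual quota theft. The function name `hardlink_survives_unlink` and the file names are made up for the example. Note also that newer Linux kernels may refuse to hardlink files you do not own when the `protected_hardlinks` sysctl is enabled.

```python
import os
import tempfile


def hardlink_survives_unlink(base_dir):
    """Link a file into a private directory, remove the original name, read it back."""
    original = os.path.join(base_dir, "victim.txt")  # hypothetical file name
    with open(original, "w") as f:
        f.write("space still charged to the original owner\n")
    owner_before = os.stat(original).st_uid

    private_dir = os.path.join(base_dir, "private")
    os.mkdir(private_dir, 0o700)      # readable/executable only by its owner
    hidden = os.path.join(private_dir, "kept")
    os.link(original, hidden)         # second directory entry, same inode

    os.unlink(original)               # the "delete" removes one name only
    st = os.stat(hidden)
    with open(hidden) as f:
        data = f.read()
    # The inode keeps its original owner, so the space stays on that account.
    return data, st.st_nlink, st.st_uid == owner_before


with tempfile.TemporaryDirectory() as scratch:
    data, nlink, same_owner = hardlink_survives_unlink(scratch)
    print(data.strip(), nlink, same_owner)
```

The file remains reachable through the private directory with its ownership unchanged, which is the whole of the accounting problem.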

The two sides of reflink()

Posted May 10, 2009 18:03 UTC (Sun) by giraffedata (guest, #1954) [Link] (11 responses)

Yes, and you misspoke. The other person didn't delete the file because no one can delete a file. The system deletes one automatically when it's no longer accessible. The space charging problem is one of the many reasons this innovative Unix concept should actually be scrapped, along with the related concepts that directories are kernel-level things and that you can't give a file a name.

The two sides of reflink()

Posted May 10, 2009 18:36 UTC (Sun) by nix (subscriber, #2304) [Link] (2 responses)

I suspect that if you actually tried to scrap link() et al, a million MTA
authors would try to kill you.

(I'd be rather annoyed, too: I use hardlinks all the time.)

The two sides of reflink()

Posted May 11, 2009 5:50 UTC (Mon) by giraffedata (guest, #1954) [Link] (1 response)

I suspect that if you actually tried to scrap link() et al, a million MTA authors would try to kill you.

Well, I wouldn't scrap link() et al -- I'd just move them out of the kernel and add the ability to explicitly create and delete files independent of directory links.

The two sides of reflink()

Posted May 11, 2009 6:10 UTC (Mon) by nix (subscriber, #2304) [Link]

We already have the ability to create and delete files independently of
directory links: mkstemp(). What you can't do is easily create them
outside of /tmp, or link them to names at a later date.
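
A rough sketch of the mkstemp() approach, using Python's `tempfile.mkstemp` wrapper. The helper name `anonymous_file_demo` is invented for the example, and it assumes a local filesystem that allows unlinking an open file, which is the usual case on Linux. The file is created with a name, the name is removed at once, and the file stays usable through its descriptor until that descriptor is closed.

```python
import os
import tempfile


def anonymous_file_demo():
    """Create a temp file, drop its only name, and keep using it via the fd."""
    fd, path = tempfile.mkstemp()
    try:
        os.unlink(path)                    # remove the only directory entry
        os.write(fd, b"no name, still a file")
        nlink = os.fstat(fd).st_nlink      # typically 0 on local filesystems
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)
        name_exists = os.path.exists(path)
    finally:
        os.close(fd)                       # the kernel frees the file here
    return nlink, data, name_exists


nlink, data, name_exists = anonymous_file_demo()
print(nlink, data, name_exists)
```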

The two sides of reflink()

Posted May 11, 2009 5:42 UTC (Mon) by butlerm (subscriber, #13312) [Link] (7 responses)

Personally, I would rather not have to reboot every time I installed or
updated virtually any piece of system software. That would be the direct
consequence of discarding the directory entry / inode distinction in Unix:
a regression to the reboot-happy world of Win32.

The two sides of reflink()

Posted May 11, 2009 5:57 UTC (Mon) by giraffedata (guest, #1954) [Link] (6 responses)

That would be the direct consequence of discarding the directory entry / inode distinction in Unix -

But what I described makes the distinction even larger. Today directory entries and inodes are tied together tightly by the kernel.

But I'm curious about how this affects having to reboot when you update system software.

The two sides of reflink()

Posted May 11, 2009 6:09 UTC (Mon) by nix (subscriber, #2304) [Link] (5 responses)

What? Directory entries and inodes aren't tied together in the fs model at
all, except that each directory entry increases i_nlink in the
corresponding inode by one. Reflinks simply would ensure that i_nlink was
*at least* one but would not increment it (probably by maintaining a
separate i_reflink count), and the semantics of unlink() would change to
ensure that a reflink()/unlink() sequence had the same (no) effect on link
count as link()/unlink().

You could no longer rely on unlink() decrementing i_nlink, but I don't
know of *anything* that depends on this (some things doubtless do but it
can't be common).
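
The existing link()/unlink() accounting that a reflink would have to stay consistent with can be observed from user space. This small Python sketch reads `st_nlink` (the user-visible form of i_nlink) before a link, after it, and after the matching unlink. The helper name `link_counts` and the file names are made up for the example, and it says nothing about how a reflink count would actually be implemented.

```python
import os
import tempfile


def link_counts(base_dir):
    """Return st_nlink before link(), after link(), and after unlink()."""
    path = os.path.join(base_dir, "a")
    alias = os.path.join(base_dir, "b")
    with open(path, "w") as f:
        f.write("x")
    counts = [os.stat(path).st_nlink]
    os.link(path, alias)                       # one more directory entry
    counts.append(os.stat(path).st_nlink)
    os.unlink(alias)                           # and one fewer again
    counts.append(os.stat(path).st_nlink)
    return counts


with tempfile.TemporaryDirectory() as scratch:
    print(link_counts(scratch))
```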

It breaks updating running software because that involves unlinking files
that are in use, and because the update process generally consists of
creating a file with a temporary name, filling it out, and rename()ing it
over the original (that's an implicit unlink right there, and it does not
fail). If you break that you break every package manager on the face of
the earth.
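
The update pattern described above can be sketched in Python as follows. This is only an illustration under simple assumptions: an open file object stands in for the running program, and the names `replace_while_open` and `libdemo.so` are invented. The old inode stays readable through the descriptor that was open before the rename, while new opens of the same name see the replacement.

```python
import os
import tempfile


def replace_while_open(base_dir):
    """Replace a file by rename() while an older open handle is still in use."""
    target = os.path.join(base_dir, "libdemo.so")   # hypothetical file name
    with open(target, "w") as f:
        f.write("version 1")

    running = open(target)        # stands in for a process using the old file
    try:
        tmp = target + ".new"
        with open(tmp, "w") as f:  # write the new version under a temp name
            f.write("version 2")
        os.rename(tmp, target)     # atomic replace; old inode loses its name
        old_view = running.read()  # the existing handle still sees the old data
        with open(target) as f:
            new_view = f.read()    # new opens see the new data
    finally:
        running.close()            # the old inode can be freed only now
    return old_view, new_view


with tempfile.TemporaryDirectory() as scratch:
    print(replace_while_open(scratch))
```

If the implicit unlink inside rename() failed because the file was in use, the replacement step would fail and the update would have to wait for the program to exit.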

The two sides of reflink()

Posted May 11, 2009 7:37 UTC (Mon) by giraffedata (guest, #1954) [Link] (3 responses)

What? Directory entries and inodes aren't tied together in the fs model at all, except that each directory entry increases i_nlink in the corresponding inode by one.

That's a pretty tight bond, especially since i_nlink controls when the inode/file gets deleted. Also, you can't make the kernel create an inode without also creating a directory entry, and except temporarily, an inode cannot exist without at least one directory entry associated with it. Those are the bonds that it would be nice to get away from, as pretty much every OS except Unix does.

Reflinks simply would ...

We must be talking about different things. I was just talking about what Unix should do instead of what it always has (as a fundamental design point) done. Nothing to do with reflinks. And I'm also not claiming it would be compatible with any existing Unix application, but I do believe every application could be done at least as well with a kernel without automatic file deletion and directories.

The two sides of reflink()

Posted May 11, 2009 18:32 UTC (Mon) by nix (subscriber, #2304) [Link] (2 responses)

*And directories*? You're dreaming. Directories are in practice essential
for scalability. If they weren't in the kernel, they'd need to be in some
userspace library (ew).

The two sides of reflink()

Posted May 12, 2009 1:22 UTC (Tue) by giraffedata (guest, #1954) [Link] (1 responses)

If they weren't in the kernel, they'd need to be in some userspace library (ew).

They work better in user space -- there's more flexibility there and the basic concept of a directory has nothing to do with resource allocation between users, which is what the kernel is for. Many OSes do them outside the kernel. The only reason they have to be in the kernel in Unix is that the kernel deletes files implicitly based on directory references. And as I've been saying, we'd be better off without that.

The two sides of reflink()

Posted May 12, 2009 19:59 UTC (Tue) by nix (subscriber, #2304) [Link]

Putting directories outside the kernel also means that a whole pile of
things POSIX guarantees become, as near as I can tell, impossible to provide.
I can't see any way to keep cross-directory rename() atomic, for instance.

Also it's a grotesque security hole: now you can't keep stuff secret by
hiding it in unreadable directories anymore.

Periodically there are proposals to introduce an open()-by-inode-number
syscall. They are always shot down. I don't know what sort of system
you're thinking of, but it isn't Unix.

(And if you're going to go that route, make the inums 1024 bits long and
bingo, you've got a capability-based system.)

The two sides of reflink()

Posted May 11, 2009 15:38 UTC (Mon) by butlerm (subscriber, #13312) [Link]

There is no practical way for a filesystem to implement "reflinks" such that
the reflink shares the same inode. The ownership, permissions, and file data
of both the original file and the new file all have to be modifiable
independently. To make any sense, they would also need separate inode
numbers.
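
To make the contrast with a hardlink concrete, here is a small Python sketch of the copy-like behaviour described above. It uses an ordinary `shutil.copyfile` as a stand-in, since it is meant to show only the visible semantics (separate inode numbers, independently modifiable permissions and data), not block sharing or any actual reflink interface. The helper name `copy_is_independent` and the file names are invented for the example.

```python
import os
import shutil
import tempfile


def copy_is_independent(base_dir):
    """Copy a file, then change the copy's mode and data without touching the source."""
    src = os.path.join(base_dir, "orig")
    dst = os.path.join(base_dir, "clone")
    with open(src, "w") as f:
        f.write("shared at first")
    os.chmod(src, 0o644)

    shutil.copyfile(src, dst)     # plain copy standing in for a reflink
    os.chmod(dst, 0o600)          # permissions diverge
    with open(dst, "a") as f:     # data diverges
        f.write(", then diverged")

    s, d = os.stat(src), os.stat(dst)
    with open(src) as f:
        src_data = f.read()
    distinct_inodes = s.st_ino != d.st_ino
    modes = (s.st_mode & 0o777, d.st_mode & 0o777)
    return distinct_inodes, modes, src_data


with tempfile.TemporaryDirectory() as scratch:
    print(copy_is_independent(scratch))
```

A hardlink would fail all three checks: both names would report the same inode number, the same mode, and the same data.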

The two sides of reflink()

Posted May 11, 2009 5:34 UTC (Mon) by butlerm (subscriber, #13312) [Link]

You raise an excellent point. A useful implementation of "reflink" would
have the semantics of a file copy. Since an unprivileged user cannot change
the ownership of an existing file, a general purpose implementation *must*
be able to set the ownership of the new file to that of the current user in
the process.

Otherwise you would get a highly restricted operation that would only be
useful to unprivileged users for making efficient copies of files they
already own.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds