Why not just not support hard links?

Posted Sep 2, 2004 7:08 UTC (Thu) by walles (subscriber, #954)
Parent article: More notes on reiser4

Since hard links seem so problematic within ReiserFS, why not just remove support for them in ReiserFS entirely?

Also, as I haven't used a hard link in all my life, can anybody tell me about any real-life situation where hard links matter?



Why not just not support hard links?

Posted Sep 2, 2004 8:52 UTC (Thu) by Klavs (subscriber, #10563) [Link]

Well, vserver uses them to save space. If you have 10 vservers (virtual servers) on the same filesystem, all mirrored from the same original, then instead of having, for example, 10 × 3 GB of files you would have only 3 GB. And since vserver1 can't know about vserver2 or anything else outside its own vserver, soft links would be bad.

The other nice thing, which I'm not sure symlinks could implement (even setting aside that symlinks pointing outside a vserver are not a good idea :), is the new vserver file attribute "Immutable-unlink". You can mark all files in all vservers with it (two flags, as far as I remember: Immutable and Immutable-Unlink). If one vserver then tries to change a file marked Immutable-Unlink, it simply gets a new file and its old hardlink is removed, meaning the other 9 vservers still share the same file.

To me this is clever use of hardlinks - but I'm no filesystem guru :)
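The "get a new file, drop the old link" behaviour described above can be imitated in user space. The following is only a rough sketch under stated assumptions: the helper name write_breaking_link and the file names are made up for illustration, and this is not how vserver implements the flag inside the kernel.

```python
import os
import shutil
import tempfile


def write_breaking_link(path, data):
    """Give `path` new contents without touching its other hard links.

    Hypothetical helper: a user-space imitation of the unlink-on-write
    idea, not vserver's actual implementation.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a fresh inode in the same directory...
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    shutil.copymode(path, tmp)
    # ...then atomically point the name at the new inode. The old inode
    # loses one link but stays alive for everyone else who shares it.
    os.replace(tmp, path)


def demo():
    with tempfile.TemporaryDirectory() as root:
        original = os.path.join(root, "template.conf")
        guest1 = os.path.join(root, "guest1.conf")
        guest2 = os.path.join(root, "guest2.conf")
        with open(original, "wb") as f:
            f.write(b"shared\n")
        os.link(original, guest1)
        os.link(original, guest2)

        write_breaking_link(guest1, b"private\n")

        with open(guest2, "rb") as f:
            shared = f.read()
        with open(guest1, "rb") as f:
            private = f.read()
        still_shared = os.path.samefile(original, guest2)
        split_off = not os.path.samefile(original, guest1)
        return shared, private, still_shared, split_off
```

Writing to guest1.conf in place (opening it for writing and overwriting) would instead change the file for every guest, which is why the write has to go through a new inode.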

Space savings should use COW

Posted Sep 2, 2004 9:48 UTC (Thu) by walles (subscriber, #954) [Link]

I think space savings would be better done by supporting copy-on-write semantics within the file system, which is just what you describe as Immutable-unlink. I imagine COW should be less error-prone than hard links, but just like you I'm no fs guru (or I wouldn't be asking these questions :-).

Something along these lines has been implemented at http://www.ext3cow.com/ (which is a random project I found on Google; I don't know anything about it except what the web page says).

So do you (or anybody else for that part) know of any other uses for hard symlinks that don't have anything to do with space savings?

My use of hard links

Posted Sep 2, 2004 13:23 UTC (Thu) by utoddl (subscriber, #1232) [Link]

I have a script that makes good use of hard links -- not for space saving so much as time saving, but it saves a lot of space as well. I keep a copy of my RedHat/Fedora/whatever ISO images, and occasionally use wget to grab all the updates into another directory. These updates contain all 19 gazillion versions of the updated packages -- way more than will fit on a CD -- when what I really want of course is the latest version of each. So I use cpio to make a hard link duplicate tree of all those updates (i.e. real directories, hard linked files). That's pretty quick, 'cause it's not moving any data, just creating dir entries. Then my script throws out everything I don't want from that tree -- all the older versions of a given rpm -- and I'm left with a small enough set of rpms to fit onto a CD. I add my own favorite goodies that aren't on the distro (config files, utilities, etc.), and burn an ISO from that. That gives me the original distro CDs plus an extra CD with all current updates and my favorites all on CDs I can carry around so I can install on and update the various boxes I play with at home, work, friends' and family's houses, wherever. (Heck, I'll stick a copy of the scripts here if anybody wants to play with 'em.)

Using hardlinks for this was a natural. Having said that, I recall only using hardlinks once before, a long time ago, and that was specifically for space savings.
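For readers who want to see the "real directories, hard linked files" idea spelled out, here is a minimal Python sketch. It is not the poster's cpio-based script; the function name link_tree and the directory and package names are invented for illustration, and it assumes both trees live on the same filesystem and that the destination is not inside the source.

```python
import os
import tempfile


def link_tree(src, dst):
    """Mirror `src` under `dst`: real directories, hard-linked files."""
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)  # directories are recreated
        for name in filenames:
            # No data is copied; only a new directory entry is made.
            os.link(os.path.join(dirpath, name),
                    os.path.join(target_dir, name))


def demo():
    with tempfile.TemporaryDirectory() as root:
        updates = os.path.join(root, "updates")
        os.makedirs(os.path.join(updates, "i386"))
        for name in ("foo-1.0.rpm", "foo-1.1.rpm"):
            with open(os.path.join(updates, "i386", name), "w") as f:
                f.write(name)

        cd = os.path.join(root, "cd")
        link_tree(updates, cd)
        # Prune the older version from the CD tree only.
        os.unlink(os.path.join(cd, "i386", "foo-1.0.rpm"))

        kept = sorted(os.listdir(os.path.join(cd, "i386")))
        originals = sorted(os.listdir(os.path.join(updates, "i386")))
        shared = os.path.samefile(
            os.path.join(updates, "i386", "foo-1.1.rpm"),
            os.path.join(cd, "i386", "foo-1.1.rpm"))
        return kept, originals, shared
```

Pruning the linked tree removes names only, so the download directory keeps every version while the CD tree ends up with just the latest one.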

My use of hard links

Posted Sep 2, 2004 15:50 UTC (Thu) by fergal (subscriber, #602) [Link]

I think you could have used symlinks for this too. Then run mkisofs -f, which tells it to follow symlinks (it might break something that really was meant to be a symlink, though).

Space savings should use COW

Posted Sep 10, 2004 0:46 UTC (Fri) by roelofs (subscriber, #2599) [Link]

So do you (or anybody else for that part) know of any other uses for hard symlinks that don't have anything to do with space savings?

Assuming you really meant "hard links," I use them to avoid accidentally deleting local mail files. Since such files get updated every day, usually several times a day, even daily incremental backups aren't sufficient to recover from an accidental deletion. But with hard-linked copies in a separate directory (e.g., ../.backups/foo.backup, etc.), you're safe. (And when you really do want to nuke the file, just truncate it to zero bytes--I wrote a trivial "trunc" utility that simply uses truncate() or ftruncate() for this purpose.) Of course, I suppose I could simply keep the "real" copy in the same hidden directory and use local symlinks to append to it...but with hard links you save one letter in every ls(1) command (i.e., -L). :-)
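A small sketch of that backup trick, assuming a POSIX system. The paths mirror the example names in the comment; the use of os.truncate stands in for the "trunc" utility, whose actual source is not shown in this thread.

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as root:
        mailbox = os.path.join(root, "foo")
        backups = os.path.join(root, ".backups")
        os.mkdir(backups)
        backup = os.path.join(backups, "foo.backup")

        with open(mailbox, "w") as f:
            f.write("message 1\n")
        os.link(mailbox, backup)          # second name for the same inode

        with open(mailbox, "a") as f:     # appending keeps the same inode
            f.write("message 2\n")

        os.unlink(mailbox)                # the accidental rm
        with open(backup) as f:
            recovered = f.read()          # the data is still reachable

        os.link(backup, mailbox)          # put the original name back
        os.truncate(mailbox, 0)           # the deliberate "nuke"
        size_after = os.path.getsize(backup)
        return recovered, size_after
```

The protection only holds while programs update the file in place. A tool that writes a new file and renames it over the old name would leave the backup link pointing at the stale inode.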

I've also occasionally used hard links to save a temporary MPEG or PDF file downloaded by Netscape; under some conditions, the temporary file will disappear as soon as the download is complete, but if you make a hard link at any point prior to that, the link will remain. For multi-megabyte downloads, that can be convenient, albeit not absolutely critical...

Greg

Space savings should use COW

Posted Sep 11, 2004 15:44 UTC (Sat) by khim (subscriber, #9252) [Link]

And this too can be covered by COW files. The fact is that in almost all cases where I think about using a hardlink, I find that what I really need is a COW file and the hardlink is just a poor substitute.

Why not just not support hard links?

Posted Sep 2, 2004 9:40 UTC (Thu) by rjw (guest, #10415) [Link]

Mainly useful for things like chrooting nowadays AFAIK.

Why not just not support hard links?

Posted Sep 2, 2004 10:03 UTC (Thu) by walles (subscriber, #954) [Link]

Please forgive my ignorance, but what do hard links have to do with chrooting? How exactly are hard links used together with chroot jails?

Why not just not support hard links?

Posted Sep 2, 2004 11:26 UTC (Thu) by hensema (guest, #980) [Link]

You cannot symlink to files outside a chroot. So, if you want to create a chroot jail without hard links, then you'd have to copy all files you need inside the chroot, effectively duplicating those files.

However, with hard links, you only need one instance of a file on disk, which saves space.

Note that hard linking from inside a chroot to main system files (such as /bin/bash) is not a very smart thing to do, as chrooted users can then modify exactly the files you wanted to prevent them from modifying. So you always need two copies of a file.

Why not just not support hard links?

Posted Sep 2, 2004 12:04 UTC (Thu) by flewellyn (subscriber, #5047) [Link]

So, in other words, security concerns make the space-saving of hardlinks in a chroot environment useless, since duplication is necessary anyway.

Why not just not support hard links?

Posted Sep 2, 2004 12:52 UTC (Thu) by maniax (subscriber, #4509) [Link]

Using hardlinks for chroot jails is a bad idea. Firstly, you don't have a good way to protect the file, and if you make modifications to the chroot environment's structure, you'll have to update all its users (or, if the tool that updates software in it uses unlink() and then open(), you'll have to update the users of the environment on every update).
Just use bind mounts, which will save more space and make it possible to have the environment mounted read-write somewhere for updates and read-only somewhere else for use.

Why not just not support hard links?

Posted Sep 2, 2004 16:35 UTC (Thu) by Ross (subscriber, #4065) [Link]

If the file is not writeable, there is no problem. If the process running
in the jail is under uid 0, then you aren't gaining anything by the jail
anyway.

Why not just not support hard links?

Posted Sep 2, 2004 19:52 UTC (Thu) by oak (subscriber, #2786) [Link]

A good point would be that it's easier to get security-updated versions of
the libraries etc. inside the chroots. The same can of course be achieved
more easily with mount --bind from a chroot "template" directory, but with
hardlinks you can pick and choose what you put into the chroots from the
template directory structure.

Slightly different suggestion

Posted Sep 2, 2004 16:39 UTC (Thu) by Ross (subscriber, #4065) [Link]

How about this?

rule 1: you can't create hard links to a file with streams
rule 2: you can't create streams in a file with more than one directory entry (link)

Problem solved. You get both features on a per-file basis. You just can't
use them both at the same time.

Slightly different suggestion

Posted Sep 2, 2004 18:34 UTC (Thu) by bronson (subscriber, #4806) [Link]

As streams become more and more popular, your proposed solution becomes more and more problematic. Besides, I think that in reiser4 all files have streams.

Why not just not support hard links?

Posted Sep 2, 2004 16:42 UTC (Thu) by piman (subscriber, #8957) [Link]

Every regular file is a hard link to itself. . and .. are hard links to the directories you get when you cd to them.

Why not just not support hard links?

Posted Sep 2, 2004 19:24 UTC (Thu) by xtifr (subscriber, #143) [Link]

A hard link is basically just something (usually a name) that points directly to an inode, rather than linking indirectly to another name. So pretty much every file on your system (except for the symlinks) is a hard link. (Actually, even symlinks are usually hard links to inodes containing the symbolic reference, so every symlink is a hardlink - but I believe there are filesystems where symbolic references can be stored in the directory structure, so this is not a hard-and-fast rule. But it is a common one.)

If you mean that you never use more than one hard link per inode (not counting the automatic "." and ".." hardlinks that all directories have), well, even that's pretty tricky - when a process opens a file, it actually creates a new hard link, internal to the process (not associated with any name on the filesystem). So, if you forbid multiple hard links, you lose the ability to open files (unless you delete them as you open them), which would make the files a bit useless. :)

Also, as others have mentioned, hard links are slightly smaller and faster than symlinks. This may not matter to you but it does matter to some people, especially people working with small embedded systems, and the insane performance fanatics who take a wasted CPU cycle as a personal affront (I'll try not to mention any Gentoo fans by name here.:)

Using symlinks also requires you to have a primary, privileged name (the main hard link). Sometimes this isn't convenient. For example, I'm not entirely sure how I want to organize my music: by artist or by genre. Currently I have two directory trees populated with hard links to the same music files. If I used symlinks, one of those trees would have to be privileged, and would be very hard to get rid of if I decided I didn't need it any more.

Why not just not support hard links?

Posted Sep 2, 2004 19:44 UTC (Thu) by walles (subscriber, #954) [Link]

> when a process opens a file, it actually creates a new hard link, internal
> to the process (not associated with any name on the filesystem)

I'm not sure I'm following this. The article text says that hard links to directories are forbidden, but libc still has an opendir() call. If what you say above is correct, how come opendir() doesn't have to delete directories upon opening them?

> Using symlinks also requires you to have a primary, privileged name

What do you mean with "privileged" name? And why would such files be hard to get rid of?

Why not just not support hard links?

Posted Sep 3, 2004 16:41 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

He's using a rather expansive definition of "hard link." Usually, "hard link" refers only to a reference to a file from a directory entry. The kind of reference you get when you open a file isn't called a hard link. (It's just called a reference).
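The difference between a directory-entry link and an open-file reference can be observed directly. This is a minimal sketch that assumes POSIX unlink semantics (it will not behave this way on Windows); the file name is arbitrary.

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "scratch")
        with open(path, "w") as f:
            f.write("still here\n")

        f = open(path)                    # an open reference, not a link
        os.unlink(path)                   # remove the only directory entry

        # The link count drops (typically to 0 on Linux) because opening
        # the file never added a link in the first place.
        links = os.fstat(f.fileno()).st_nlink
        visible = os.path.exists(path)    # the name is gone
        data = f.read()                   # the contents are not
        f.close()                         # now the storage can be freed
        return links, visible, data
```

The open reference keeps the inode alive without appearing anywhere in the namespace, which is why it is not normally counted as a hard link.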

Incidentally, most of this thread is using "hard link" in a too restrictive way. When you create a file 'foo', you create one hard link to the file (from the directory entry with name 'foo'). When you ln foo bar, you create a second hard link to the file. Note that Unix files themselves do not have names -- not text ones anyway; they are traditionally named by inode number.

Directories have lots of hard links. There's the one from the parent directory, the one from '.', and all the ones from the subdirectories' '..' entries. In some modern models, '.' and '..' aren't actually considered directory entries, but you'll still see them -- for historical purposes -- in the directory's link count (e.g. from ls -l).

But you can't make an arbitrary hard link to a directory. Only the specific ones described above are allowed to exist.
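The link counts described above can be checked with stat(). A short sketch, assuming a Linux-like system; the names foo, bar and sub are placeholders. It deliberately does not print a directory's link count, since not every filesystem maintains the traditional value for directories.

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as root:
        foo = os.path.join(root, "foo")
        bar = os.path.join(root, "bar")
        with open(foo, "w"):
            pass
        after_create = os.stat(foo).st_nlink      # one link: the name 'foo'
        os.link(foo, bar)                         # equivalent of: ln foo bar
        after_ln = os.stat(foo).st_nlink          # two links to one inode
        same_inode = os.stat(foo).st_ino == os.stat(bar).st_ino

        sub = os.path.join(root, "sub")
        os.mkdir(sub)
        # '.' and '..' resolve to the directory itself and to its parent.
        dot_is_dir = os.path.samefile(os.path.join(sub, "."), sub)
        dotdot_is_parent = os.path.samefile(os.path.join(sub, ".."), root)

        # An arbitrary extra link to a directory is refused.
        try:
            os.link(sub, os.path.join(root, "sub2"))
            dir_link_refused = False
        except OSError:
            dir_link_refused = True

        return (after_create, after_ln, same_inode,
                dot_is_dir, dotdot_is_parent, dir_link_refused)
```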

Why not just not support hard links?

Posted Sep 3, 2004 16:46 UTC (Fri) by hppnq (subscriber, #14462) [Link]

What do you mean with "privileged" name? And why would such files be hard to get rid of?

I suppose what is meant is that with symbolic links there is a distinction between the actual file and the link: removing the symbolic link leaves the file intact, while removing the file leaves you with a link pointing nowhere. This, of course, is because there are two separate inodes (the entities that keep the metadata): a symbolic link has its own inode. With hard links, you merely remove one of the references to the file (the inode holds a link count that is increased whenever a hard link is created and decreased whenever one is removed), leaving the file intact until the last link is removed. In this respect, all link names are "equal".
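The asymmetry is easy to reproduce. The sketch below assumes a POSIX filesystem that supports both link types; the file names are invented, loosely following the music-collection example from earlier in the thread.

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as root:
        by_artist = os.path.join(root, "song-by-artist.mp3")
        by_genre = os.path.join(root, "song-by-genre.mp3")
        sym = os.path.join(root, "song-symlink.mp3")

        with open(by_artist, "wb") as f:
            f.write(b"audio")
        os.link(by_artist, by_genre)      # second, equal name
        os.symlink(by_artist, sym)        # indirect reference by name

        # The symlink is a separate object with its own inode.
        own_inode = os.lstat(sym).st_ino != os.stat(by_artist).st_ino

        os.unlink(by_artist)              # remove the "primary" name
        hard_ok = os.path.exists(by_genre)
        sym_dangling = os.path.islink(sym) and not os.path.exists(sym)
        return own_inode, hard_ok, sym_dangling
```

After the unlink, the hard-linked name still reaches the data, while the symlink still exists but points at a name that is no longer there.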

Getting back to your original question: this is also one of the reasons why you would want to use hard links. (In practice, symbolic links are almost always preferable. You should really know what you're doing when using hard links.)

Why not just not support hard links?

Posted Sep 4, 2004 21:38 UTC (Sat) by Ross (subscriber, #4065) [Link]

Because another process can't use that opened directory to traverse the
filesystem (even /proc/self/cwd is just a symlink). The same thing with
open files. They prevent the item from being removed from the disk, but
they don't mess with the namespace.

Why not just not support hard links?

Posted Sep 4, 2004 22:39 UTC (Sat) by jmshh (guest, #8257) [Link]

Here is another scenario, where hard links are useful:

I had to make changes to a config file for a program. There was no source available, the program crashed on reading my version, and the crash handler removed any temporary stuff.

So I started the program, made a hard link to the temporary output and waited for the crash. Now I could see how far the program got and immediately spotted the error.

Disclaimer: Free software makes this unnecessary, good programs provide at least a debug option that makes temporary stuff survive, and even better programs give useful error messages. But one can't always choose the environment.

hard links are a necessity

Posted Sep 11, 2004 13:35 UTC (Sat) by job (subscriber, #670) [Link]

A hard link is basically the same thing as a file name, so what you're
suggesting is to limit files to having only one name. That would put an
artificial restriction in the file system that people are not used to
(except for the win32 crowd, their system is crippled to start with).

As an example I use hard links for my mp3 folder. Song.mp3 is placed both
in mp3/genres/Pop and mp3/artists/Artist, that way I can browse my
collection according to both artist and genre. But the ability to have
several names (and paths) to one file is useful in lots of other places.

No, this has to be solved in a better way.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds