More notes on reiser4
Here is last week's table, with a new line for tests done starting with a reiser4-built tarball:
Filesystem | Untar | Build | Grep | find (name) | find (stat)
---|---|---|---|---|---
ext3 | 55/24 | 1400/217 | 62/8 | 10.4/1.1 | 12.1/2.5
reiser4 | 67/41 | 1583/386 | 78/12 | 12.5/1.3 | 15.2/4.0
reiser4 (new) | 57/35 | 1445/393 | 58/9.9 | 8.4/1.3 | 11.1/4.0
The results do show a significant difference in performance when the files are created in the right order - and the differences carry through all of the operations performed on the filesystem, not just the untar. In other words, the performance benefits of reiser4 are only fully available to those who manage to create their files in the right order. Future plans call for a "repacker" process to clean up after obnoxious users who insist on creating files in something other than the optimal order, but that tool is not yet available. (For what it's worth, restoring from the reiser4 tarball did not noticeably change the ext3 results).
Last week, the discussion about reiser4 got off to a rather rough start. Even so, it evolved into a lengthy but reasonably constructive technical conversation touching on many of the issues raised by reiser4.
At the top of the list is the general question of the expanded capabilities offered by this filesystem; these include transactions, the combined file/directory objects (and the general representation of metadata in the filesystem namespace), and more. The kernel developers are nervous about changes to filesystem semantics, and they are seriously nervous about creating these new semantics at the filesystem level. The general feeling is that any worthwhile enhancements offered by reiser4 should, instead, be implemented at the virtual filesystem (VFS) level, so that more filesystems could offer them. Some developers want things done that way from the start. If there is a consensus, however, it would be along the lines laid out by Andrew Morton: accept the new features in reiser4 for now (once the other problems are addressed) with the plan of shifting the worthwhile ones into the VFS layer. The reiser4 implementation would thus be seen as a sort of prototype which could be evolved into the true Linux version.
Hans Reiser doesn't like this idea.
Somehow, over the years, Hans has neglected to tell the developers that he was, in fact, planning to replace the entire VFS. That plan looks like a difficult sell, but reiser4 could become the platform that is used to shift the VFS in the directions he sees.
Meanwhile, the reiser4 approach to metadata has attracted a fair amount of attention. Imagine you have a reiser4 partition holding a kernel tree; at the top of that tree is a file called CREDITS. It's an ordinary file, but it can be made to behave in extraordinary ways:
    $ tree CREDITS/metas
    CREDITS/metas
    |-- bmap
    |-- gid
    |-- items
    |-- key
    |-- locality
    |-- new
    |-- nlink
    |-- oid
    |-- plugin
    |   |-- compression
    |   |-- crypto
    |   |-- digest
    |   |-- dir
    |   |-- dir_item
    |   |-- fibration
    |   |-- file
    |   |-- formatting
    |   |-- hash
    |   |-- perm
    |   `-- sd
    |-- pseudo
    |-- readdir
    |-- rwx
    |-- size
    `-- uid

    1 directory, 24 files
You can also type "cd CREDITS; cat ." to view the file. (One must set execute permission on the file before any of this works).
What appears to be a plain file also looks like a directory containing a number of other files. Most of these files contain information normally obtained with the stat() system call: uid is the owner, size is the length in bytes, rwx is the permissions mask, etc. Some of the others (bmap, items, oid) provide a window into how the file is represented inside the filesystem. This is all part of Hans Reiser's vision of moving everything into the namespace; rather than using a separate system call to learn about a file's metadata, just access the right pseudo file.
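As a rough illustration (a hypothetical session; the pseudo-file names come from the tree listing above, but the exact output formats may differ), the information normally fetched with stat() can simply be read out of the namespace:

    # the traditional way: ask stat() (here via GNU stat) for owner and size
    $ stat -c '%u %s' CREDITS
    # the reiser4 way: read the same values as ordinary pseudo files
    $ cat CREDITS/metas/uid
    $ cat CREDITS/metas/size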
One branch of the discussion took issue with the "metas" name. Using reiser4 means that you cannot have any file named metas anywhere within the filesystem. Some people would like to change the name; ideas like ..metas, ..., and @ have been tossed around, but Hans seems uninclined to change things.
Another branch, led by Al Viro, worries about the locking considerations of this whole scheme. Linux, like most Unix systems, has never allowed hard links to directories for a number of reasons; one of those is locking. Those interested in the details can see this rather dense explanation from Al, or a translation by Linus to something resembling technical English. Linus's example is essentially this: imagine you have a directory "a" containing two subdirectories dir1 and dir2. You also have "b", which is simply a link to a. Imagine that two processes simultaneously attempt these commands:
Process 1 | Process 2
---|---
mv a/dir1 a/dir2/newdir | mv b/dir2 b/dir1/newdir
Both commands cannot be allowed to succeed, or the filesystem will be tied into a knot. So some sort of locking is required to serialize the above actions. Doing that kind of locking is very hard when there are multiple paths into the same directory; it is an invitation to deadlocks. The problem could be fixed by putting a monster lock around the entire filesystem, but the performance cost would be prohibitive. The usual approach has been to simply disallow this form of aliasing on directory names, and thus avoid the problem altogether.
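To make the race concrete, here is a sketch of the scenario; the "ln a b" step is exactly the kind of directory hard link that Linux refuses to create today, so the session is hypothetical by construction:

    $ mkdir -p a/dir1 a/dir2
    $ ln a b                      # a second name for directory "a" (normally refused)
    # run the two renames concurrently:
    $ mv a/dir1 a/dir2/newdir &   # process 1
    $ mv b/dir2 b/dir1/newdir &   # process 2
    # if both were allowed to complete, dir1 and dir2 would each end up
    # inside the other, detached from the rest of the tree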
In the reiser4 world, all files are also directories. So hard links to files become hard links to directories, and all of these deadlock issues come to the foreground. The concern expressed by the kernel developers - which appears to be legitimate - is that the reiser4 team has not thought about these issues, and there is no plan to solve the problem. Wiring the right sort of mutual exclusion deeply into a filesystem is a hard thing to do as an afterthought. But something will have to be done; Al Viro has made it clear that he will oppose merging reiser4 until the issue has been addressed, and it is highly unlikely that it would go in over his objections. (Linus: "This means that if Al Viro asks about locking and aliasing issues, you don't ignore it, you ask 'how high?'")
One way of dealing with the locking issues (and various other bits of confusion) would be to drop the "files as directories" idea and create a namespace boundary there. Files could still have attributes, but an application which wished to access them would use a separate system call to do so. The openat() interface, which is how Solaris solves the problem, seems like the favored approach. Pushing attributes into their own namespace breaks the "everything in one namespace" idea which is so fundamental to reiser4, but it would offer compatibility with Solaris and make many of the implementation issues easier to deal with. On the other hand, applications would have to be fixed to use openat() (or be run with runat).
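For reference, Solaris reaches a file's extended-attribute namespace through openat() in C or the runat utility in the shell; a session might look roughly like this (the attribute name "icon" is just an example):

    # list the extended attributes attached to a file (Solaris)
    $ runat CREDITS ls -l
    # read one particular attribute
    $ runat CREDITS cat icon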
Another contingent sees the reiser4 files-as-directories scheme as the way to implement multi-stream files. Linux is one of the few modern operating systems without this concept. The Samba developers, in particular, would love to see a multi-stream implementation, since they have to export a multi-stream interface to the rest of the world. There are obvious simple applications of multi-stream files, such as attaching icons to things. Some people are ready to use the reiser4 plugin mechanism and go nuts, however; they would like to add streams which present compressed views of files, automatically produce and unpack archive files, etc. Linus draws the line at that sort of stuff, though:
Which means that the only _real_ technical issue for supporting named streams really ends up being things like samba, which want named streams just because the work they do fundamentally is about them, for externally dictated reasons. Doing named streams for any other reason is likely just being stupid.
Once you do decide that you have to do named streams, you might then decide to use them for convenient things like icons. But it should very much be a secondary issue at that point.
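Under the files-as-directories model, a named stream would presumably be just another name hanging off the file; a purely hypothetical session (the stream name and layout are invented for illustration):

    # attach an icon as a named stream of an ordinary file
    $ cp /usr/share/icons/document.png report.pdf/icon
    # ...and retrieve it later with plain old file tools
    $ cat report.pdf/icon > /tmp/icon.png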
Yet another concern has to do with how user space will work with this representation of file metadata. Backup programs have no idea of how to save the metadata; cp will not copy it, etc. Fixing user space is certainly an issue. The fact is, however, that if reiser4 or the VFS of the future changes our idea of how a file behaves, applications will be modified to deal with the new way of doing things. Meanwhile, it has been pointed out that reiser4-style metadata is probably easier for applications to work with than the current extended attribute interface, which is also not understood by most applications.
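For comparison, here is what the two interfaces look like from the shell - the setfattr/getfattr commands are the existing extended-attribute tools, while the metas lines are a hypothetical reiser4-style equivalent:

    # extended attributes today: special-purpose commands (or setxattr()/getxattr())
    $ setfattr -n user.comment -v "draft 3" report.txt
    $ getfattr -n user.comment report.txt
    # reiser4-style metadata: ordinary shell redirection and cat (hypothetical)
    $ echo "draft 3" > report.txt/metas/comment
    $ cat report.txt/metas/comment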
The discussion looks likely to continue for some time. Regardless of the outcome, Hans Reiser will certainly have accomplished one of his goals: he has gotten the wider community to start really thinking about our filesystems, how they affect our systems, and how we use them.
Index entries for this article | |
---|---|
Kernel | Filesystems/Reiser4 |
Kernel | Named streams |
Posted Sep 2, 2004 6:44 UTC (Thu)
by larryr (guest, #4030)
[Link]
I think it would be ok to have a mount flag which says I want POSIXy semantics when using POSIXy system calls like open/close/rename/link/symlink, and I am willing to lose the ability to access the added intuitive but nevertheless non-POSIX behavior through those system calls. From what I have seen of the VFS layer, it looked pretty tightly coupled to POSIXy semantics, and not easy to shunt past to let the filesystem decide the semantics for itself - which would otherwise sound to me like a nice alternative to changing or replacing the VFS layer.
Larry
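A minimal sketch of what that might look like; the option name is hypothetical, and no such mount flag exists today:

    # mount a reiser4 volume with strictly POSIX semantics
    $ mount -t reiser4 -o posix /dev/hda3 /mnt/data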
Posted Sep 2, 2004 7:08 UTC (Thu)
by walles (guest, #954)
[Link] (23 responses)
Also, as I haven't used a hard link in all my life, can anybody tell me about any real-life situation where hard links matter?
Posted Sep 2, 2004 8:52 UTC (Thu)
by Klavs (guest, #10563)
[Link] (5 responses)
The other nice thing - which I'm not sure symlinks could implement (seeing past the issue that symlinks pointing outside a vserver are not a good idea :) - is the new vserver file attribute "Immutable-unlink", which means you can make all files immutable+unlink in all vservers (two flags, as far as I remember: Immutable and Immutable-unlink) - and if one vserver tries to change a file marked Immutable-unlink, it will simply get a new file and the old hardlink will be removed - meaning the other 9 vservers still share the same file.
To me this is clever use of hardlinks - but I'm no filesystem guru :)
Posted Sep 2, 2004 9:48 UTC (Thu)
by walles (guest, #954)
[Link] (4 responses)
Something along these lines has been implemented at http://www.ext3cow.com/ (which is a random project I found on Google; I don't know anything about it except what the web page says).
So do you (or anybody else for that part) know of any other uses for hard symlinks that don't have anything to do with space savings?
Posted Sep 2, 2004 13:23 UTC (Thu)
by utoddl (guest, #1232)
[Link] (1 responses)
Using hardlinks for this was a natural. Having said that, I recall only using hardlinks once before, a long time ago, and that was specifically for space savings.
Posted Sep 2, 2004 15:50 UTC (Thu)
by fergal (guest, #602)
[Link]
Posted Sep 10, 2004 0:46 UTC (Fri)
by roelofs (guest, #2599)
[Link] (1 responses)
Assuming you really meant "hard links," I use them to avoid accidentally deleting local mail files. Since such files get updated every day, usually several times a day, even daily incremental backups aren't sufficient to recover from an accidental deletion. But with hard-linked copies in a separate directory (e.g., ../.backups/foo.backup, etc.), you're safe. (And when you really do want to nuke the file, just truncate it to zero bytes--I wrote a trivial "trunc" utility that simply uses truncate() or ftruncate() for this purpose.) Of course, I suppose I could simply keep the "real" copy in the same hidden directory and use local symlinks to append to it...but with hard links you save one letter in every ls(1) command (i.e., -L). :-)
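A sketch of that arrangement; paths are illustrative, and shell redirection can stand in for the little "trunc" utility:

    # keep a hard-linked safety copy of the mailbox
    $ mkdir -p ~/.backups
    $ ln ~/mail/inbox ~/.backups/inbox.backup
    # an accidental "rm ~/mail/inbox" removes only one name; the data
    # is still reachable as ~/.backups/inbox.backup
    # to really discard the contents, truncate the (shared) file instead:
    $ : > ~/mail/inbox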
I've also occasionally used hard links to save a temporary MPEG or PDF file downloaded by Netscape; under some conditions, the temporary file will disappear as soon as the download is complete, but if you make a hard link at any point prior to that, the link will remain. For multi-megabyte downloads, that can be convenient, albeit not absolutely critical...
Greg
Posted Sep 11, 2004 15:44 UTC (Sat)
by khim (subscriber, #9252)
[Link]
And this too can be covered by COW files. The fact is, in almost all cases where I think I want a hard link, I find that what I really need is a COW file, and the hard link is just a poor substitute.
Posted Sep 2, 2004 9:40 UTC (Thu)
by rjw (guest, #10415)
[Link] (6 responses)
Posted Sep 2, 2004 10:03 UTC (Thu)
by walles (guest, #954)
[Link] (5 responses)
Posted Sep 2, 2004 11:26 UTC (Thu)
by hensema (guest, #980)
[Link] (4 responses)
However, with hard links, you only need one instance of a file on disk, which saves space.
Note that hard linking from inside a chroot to main system files (such as /bin/bash) is not a very smart thing to do, as chrooted users can then modify exactly the files you wanted to prevent them from modifying. So you always need two copies of a file.
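For concreteness, the two approaches being weighed in this subthread look roughly like this (paths invented; the caveat above about hard-linked system binaries applies):

    # hard link a shared binary into the jail instead of copying it
    # (only possible within a single filesystem)
    $ ln /bin/bash /var/jail/bin/bash
    # the bind-mount alternative suggested later in the thread
    $ mount --bind /usr /var/jail/usr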
Posted Sep 2, 2004 12:04 UTC (Thu)
by flewellyn (subscriber, #5047)
[Link]
Posted Sep 2, 2004 12:52 UTC (Thu)
by maniax (subscriber, #4509)
[Link] (1 responses)
Posted Sep 2, 2004 16:35 UTC (Thu)
by Ross (guest, #4065)
[Link]
Posted Sep 2, 2004 19:52 UTC (Thu)
by oak (guest, #2786)
[Link]
Posted Sep 2, 2004 16:39 UTC (Thu)
by Ross (guest, #4065)
[Link] (1 responses)
How about this?
rule 1: you can't create hard links to a file with streams
rule 2: you can't create streams in a file with more than one directory entry (link)
Problem solved. You get both features on a per-file basis. You just can't use them both at the same time.
Posted Sep 2, 2004 18:34 UTC (Thu)
by bronson (subscriber, #4806)
[Link]
Posted Sep 2, 2004 16:42 UTC (Thu)
by piman (guest, #8957)
[Link]
Posted Sep 2, 2004 19:24 UTC (Thu)
by xtifr (guest, #143)
[Link] (4 responses)
If you mean that you never use more than one hard link per inode (not counting the automatic "." and ".." hardlinks that all directories have), well, even that's pretty tricky - when a process opens a file, it actually creates a new hard link, internal to the process (not associated with any name on the filesystem). So, if you forbid multiple hard links, you lose the ability to open files (unless you delete them as you open them), which would make the files a bit useless. :)
Also, as others have mentioned, hard links are slightly smaller and faster than symlinks. This may not matter to you but it does matter to some people, especially people working with small embedded systems, and the insane performance fanatics who take a wasted CPU cycle as a personal affront (I'll try not to mention any Gentoo fans by name here.:)
Using symlinks also requires you to have a primary, privileged name (the main hard link). Sometimes this isn't convenient. For example, I'm not entirely sure how I want to organize my music: by artist or by genre. Currently I have two directory trees populated with hard links to the same music files. If I used symlinks, one of those trees would have to be privileged, and would be very hard to get rid of if I decided I didn't need it any more.
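Roughly like this, with invented filenames:

    # the same song appears in both trees, with equal standing
    $ mkdir -p mp3/by-artist/Artist mp3/by-genre/Pop
    $ ln mp3/by-artist/Artist/song.mp3 mp3/by-genre/Pop/song.mp3
    # deleting either tree later leaves the other (and the data) intact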
Posted Sep 2, 2004 19:44 UTC (Thu)
by walles (guest, #954)
[Link] (3 responses)
I'm not sure I'm following this. The article text says that hard links to directories are forbidden, but libc still has an opendir() call. If what you say above is correct, how come opendir() doesn't have to delete directories upon opening them?
> Using symlinks also requires you to have a primary, privileged name
What do you mean with "privileged" name? And why would such files be hard to get rid of?
Posted Sep 3, 2004 16:41 UTC (Fri)
by giraffedata (guest, #1954)
[Link]
He's using a rather expansive definition of "hard link." Usually, "hard link" refers only to a reference to a file from a directory entry. The kind of reference you get when you open a file isn't called a hard link. (It's just called a reference).
Incidentally, most of this thread is using "hard link" in a too restrictive way. When you create a file 'foo', you create one hard link to the file (from the directory entry with name 'foo'). When you ln foo bar, you create a second hard link to the file. Note that Unix files themselves do not have names -- not text ones anyway; they are traditionally named by inode number.
Directories have lots of hard links. There's the one from the parent directory, the one from '.', and all the ones from the subdirectories' '..' entries. In some modern models, '.' and '..' aren't actually considered directory entries, but you'll still see them -- for historical purposes -- in the directory's link count (e.g. from ls -l).
But you can't make an arbitrary hard link to a directory. Only the specific ones described above are allowed to exist.
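A quick shell session (output omitted) showing those link counts:

    $ touch foo            # one link: the directory entry "foo"
    $ ln foo bar           # a second hard link to the same inode
    $ ls -li foo bar       # same inode number, link count 2
    $ mkdir dir            # a fresh directory has 2 links: "dir" and "dir/."
    $ mkdir dir/sub        # each subdirectory's ".." adds one more
    $ ls -ld dir           # link count is now 3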
Posted Sep 3, 2004 16:46 UTC (Fri)
by hppnq (guest, #14462)
[Link]
I suppose what is meant is that with symbolic links there is a distinction between the actual file and the link: removing the symbolic link leaves the file intact, while removing the file leaves you with a link pointing nowhere. This, of course, is because there are two separate inodes (the entities that keep the metadata): a symbolic link has its own inode. With hard links, you merely remove one of the references to the file (a count in the inode that is increased whenever a hard link is created), leaving it intact until the last link is removed. In this respect, all link names are "equal" then.
Getting back to your original question: this is also one of the reasons why you would want to use hard links. (In practice, symbolic links are almost always preferable. You should really know what you're doing when using hard links.)
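The difference is easy to demonstrate (output omitted):

    $ echo data > file
    $ ln -s file soft      # a symlink: its own inode, holding the name "file"
    $ ln file hard         # a hard link: a second reference to file's inode
    $ rm file
    $ cat soft             # fails - the symlink now points nowhere
    $ cat hard             # still works - the data survives until the last link goes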
Posted Sep 4, 2004 21:38 UTC (Sat)
by Ross (guest, #4065)
[Link]
Posted Sep 4, 2004 22:39 UTC (Sat)
by jmshh (guest, #8257)
[Link]
Here is another scenario where hard links are useful: I had to make changes to a config file for a program. There was no
source available, the program crashed on reading my version, and the crash
handler removed any temporary stuff. So I started the program, made a hard link to the temporary output and
waited for the crash. Now I could see how far the program got and
immediately spotted the error. Disclaimer: Free software makes this unnecessary, good programs provide
at least a debug option that makes temporary stuff survive, and even
better programs give useful error messages. But one can't always choose
the environment.
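The trick, roughly, with hypothetical program and file names:

    $ ./some-program &                  # starts writing its temporary file
    $ ln /tmp/some-program.tmp saved    # grab a second name for it immediately
    $ wait                              # let the program crash and "clean up"
    $ less saved                        # the partial output is still reachable here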
Posted Sep 11, 2004 13:35 UTC (Sat)
by job (guest, #670)
[Link]
Posted Sep 2, 2004 12:50 UTC (Thu)
by exco (guest, #4344)
[Link] (4 responses)
CREDITS//gid
It sure will break badly written applications, but it does not remove usable filename possibilities.
Posted Sep 2, 2004 13:55 UTC (Thu)
by elanthis (guest, #6227)
[Link] (1 responses)
Posted Sep 2, 2004 19:01 UTC (Thu)
by pflugstad (subscriber, #224)
[Link]
For example, clearcase is a (commercial) version control system. It implements something like this by allowing you to append @@ to any file within it and from there see all the versions and branches that are available. This is very powerful, as you can do things like using tab completion in BASH, direct scripting, etc. It's very nice.
Posted Sep 9, 2004 18:31 UTC (Thu)
by nobrowser (guest, #21196)
[Link] (1 responses)
Posted Sep 11, 2004 15:55 UTC (Sat)
by khim (subscriber, #9252)
[Link]
There is no need: RMS freely admits there is a lot of cruft in Emacs, and there have been plans to reimplement everything in more sane ways for a few years now.
Posted Sep 2, 2004 13:58 UTC (Thu)
by ballombe (subscriber, #9523)
[Link] (2 responses)
Posted Sep 2, 2004 15:54 UTC (Thu)
by RobSeace (subscriber, #4435)
[Link]
Posted Sep 11, 2004 15:57 UTC (Sat)
by khim (subscriber, #9252)
[Link]
Simple. It mostly ignores them when they are present on disk (some filesystems do not even have "." and ".." on disk!) and handles them specially when in memory.
Posted Sep 2, 2004 15:52 UTC (Thu)
by fergal (guest, #602)
[Link] (4 responses)
Posted Sep 2, 2004 16:27 UTC (Thu)
by RobSeace (subscriber, #4435)
[Link]
Posted Sep 2, 2004 17:23 UTC (Thu)
by larryr (guest, #4030)
[Link] (2 responses)
I think maybe the problem is that unix style filesystem semantics
assume a tree structure, meaning one parent edge/entry for
each vertex/node, but having a hard link to a directory
violates that assumption. I think if it was considered typical for
a directory to have multiple parent pointers, and there were
consistent conventions
for performing atomic locking on all the parents of a directory
at once, there might be no problem. But if "the parent"
of a node is assumed by the implementation to be "the node corresponding to the path component to the left of
the path component referencing this node", locking "the parent" of "/x/a/dir1" could be different from locking "the parent" of "/x/b/dir1".
Larry
Posted Sep 3, 2004 1:57 UTC (Fri)
by vonbrand (subscriber, #4458)
[Link] (1 responses)
Posted Sep 3, 2004 15:36 UTC (Fri)
by larryr (guest, #4030)
[Link]
I wrote:
Larry
Posted Sep 2, 2004 21:51 UTC (Thu)
by iabervon (subscriber, #722)
[Link] (1 responses)
It seems to me like prohibiting all the tricky operations in the attribute space would be fine. It's not like Reiser4 gets rid of directories and makes everything equivalent to attributes of the filesystem root.
It should be fine to also prohibit file/.. (which would cause problems with hard-linked files; the same file can be in multiple directories).
I think the more serious problem is how to distinguish the attributes of a directory (which is information about the directory) from the contents of the directory (which is really information about the contents). Personally, I think dir/.../attribute is best, since all of the cases in which I've heard of "..." being used, it hasn't been for files in directories.
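In other words, something like this hypothetical layout, where the "..." entry separates a directory's own attributes from its contents:

    $ ls project/              # the directory's contents, as today
    $ cat project/.../owner    # an attribute of the directory itself (names invented)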
Posted Sep 3, 2004 18:05 UTC (Fri)
by hppnq (guest, #14462)
[Link]
For what it's worth, I think Tivoli Storage Manager uses ... as a wildcard in filenames.
Posted Sep 9, 2004 14:14 UTC (Thu)
by leandro (guest, #1460)
[Link]
I propose that all this is due to it amounting, in effect, to an attempt to fit a hierarchical database management system inside the kernel.
Now, the reason hierarchical databases lost to SQL is that SQL, even without being really relational, was much simpler because it implemented some of the relational ideas.
So the US$1M question is: at this point, wouldn't designing all data structures around the relational model make more sense? Or at least all data storage.
Obviously this doesn't amount to Oracle in the kernel, because SQL is much more complex than a relational implementation would be.
Posted Sep 10, 2004 11:12 UTC (Fri)
by dash2 (guest, #11869)
[Link]
There is a well-known software development management meme out there called "beware of the guy in a room". (I think someone at Microsoft blogged about this recently.) That is, beware of the "genius" who sits on his own coding without communicating with the rest of the team: his ideas may be brilliant, but he may be more interested in implementing his ideas than in the project's needs.
I think maybe Hans Reiser is like a very high level "guy in a room". He is clearly very smart, his ideas are deep - I love reading the namesys website even when I don't get it - but he's very much in love with those ideas, rather than being a pragmatist who just wants to make things work. Which is cool, because open source needs such guys, but... don't let them in the driving seat!
Just my 2c, no disrespect meant to anyone.
Posted Sep 10, 2004 22:18 UTC (Fri)
by tytso (subscriber, #9993)
[Link]
The benchmark I would suggest be tried against reiser4 is compiling a kernel tree from scratch. If you have to tar and untar the kernel under reiser4 first, that's fine. But then unmount and remount the filesystem (so none of the source files are in the page cache), and then try to do kernel compile. My guess is that the results would be extremely enlightening --- and this would certainly be a fair and representative use scenario which every kernel developer would be familiar with, and indeed use every day.
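A sketch of the suggested procedure (device, mount point, and kernel version are invented; the interesting part would be the timings):

    $ mount -t reiser4 /dev/hdb1 /mnt/test
    $ tar -C /mnt/test -xf linux-2.6.8.tar        # populate the filesystem
    $ umount /mnt/test                            # flush everything from the page cache
    $ mount -t reiser4 /dev/hdb1 /mnt/test
    $ cd /mnt/test/linux-2.6.8
    $ make defconfig
    $ time make                                   # cold-cache kernel build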
mount -t reiser4 -o posix
Since hard links seem so problematic within ReiserFS, why not just remove support for them in ReiserFS entirely?
Why not just not support hard links?
Well vserver uses them to save space. If you have 10 vservers (virtual servers) on the same filesystem, mirrored from the same original - instead of having f.ex. 10*3gb of files - you would only have 3gb of files. And since vserver1 can't know of vserver2 or anything outside of the vserver - soft links would be bad.
Why not just not support hard links?
I think space savings would be better done by supporting copy-on-write semantics within the file system, which is just what you describe as Immutable-unlink. I imagine COW should be less error-prone than hard links, but just like you I'm no fs guru (or I wouldn't be asking these questions :-).
Space savings should use COW
I have a script that makes good use of hard links -- not for space saving so much as time saving, but it saves a lot of space as well. I keep a copy of my RedHat/Fedora/whatever ISO images, and occasionally use wget to grab all the updates into another directory. These updates contain all 19 gazillion versions of the updated packages -- way more than will fit on a CD -- when what I really want of course is the latest version of each. So I use cpio to make a hard link duplicate tree of all those updates (i.e. real directories, hard linked files). That's pretty quick, 'cause it's not moving any data, just creating dir entries. Then my script throws out everything I don't want from that tree -- all the older versions of a given rpm -- and I'm left with a small enough set of rpms to fit onto a CD. I add my own favorite goodies that aren't on the distro (config files, utilities, etc.), and burn an ISO from that. That gives me the original distro CDs plus an extra CD with all current updates and my favorites all on CDs I can carry around so I can install on and update the various boxes I play with at home, work, friends' and family's houses, wherever. (Heck, I'll stick a copy of the scripts here if anybody wants to play with 'em.)
My use of hard links
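The hard-linked duplicate tree described above can be built with cpio's pass-through mode; a sketch with invented directory names:

    # mirror the updates tree with hard links instead of copies
    $ cd ~/updates
    $ find . -depth -print | cpio -pdl ~/cd-staging
    # prune the old package versions from ~/cd-staging (the originals
    # are untouched), then build the ISO from the staging tree
    $ mkisofs -o ~/updates.iso -R -J ~/cd-staging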
I think you could have used symlinks for this too. Then use mkisofs -f, which tells it to follow symlinks (it might break something that really was meant to be a symlink though).
My use of hard links
Mainly useful for things like chrooting nowadays AFAIK.
Why not just not support hard links?
Please forgive my ignorance, but what do hard links have to do with chrooting? How exactly are hard links used together with chroot jails?
Why not just not support hard links?
You cannot symlink to files outside a chroot. So, if you want to create a chroot jail without hard links, then you'd have to copy all the files you need inside the chroot, effectively duplicating those files.
Why not just not support hard links?
So, in other words, security concerns make the space-saving of hardlinks in a chroot environment useless, since duplication is necessary anyway.
Why not just not support hard links?
Using hardlinks for chroot jails is a bad idea. Firstly, you don't have a good way to protect the file, and if you make modifications to the chroot environment's structure, you'll have to update all its users (or, if the tool that updates software in it uses unlink() and then open(), you'll have to update the users of the environment on every update).
Just use bind mounts, which will save more space, and make it possible to have the environment mounted somewhere read-write for updates, and somewhere read-only, for use.
Why not just not support hard links?
If the file is not writeable, there is no problem. If the process running in the jail is under uid 0, then you aren't gaining anything by the jail anyway.
Why not just not support hard links?
A good point would be that it's easier to get security-updated versions of the libraries etc. inside the chroots. The same can of course be achieved more easily with mount --bind's from a chroot "template" directory, but with hardlinks you can pick and choose what you put into the chroots from the template directory structure.
Why not just not support hard links?
As streams become more and more popular, your proposed solution becomes more and more problematic. Besides, I think that in reiser4 all files have streams.
Slightly different suggestion
Every regular file is a hard link to itself. "." and ".." are hard links to the directories you get when you cd to them.
Why not just not support hard links?
A hard link is basically just something (usually a name) that points directly to an inode, rather than linking indirectly to another name. So pretty much every file on your system (except for the symlinks) is a hard link. (Actually, even symlinks are usually hard links to inodes containing the symbolic reference, so every symlink is a hardlink - but I believe there are filesystems where symbolic references can be stored in the directory structure, so this is not a hard-and-fast rule. But it is a common one.)
Why not just not support hard links?
Because another process can't use that opened directory to traverse the filesystem (even /proc/self/cwd is just a symlink). The same thing with open files: they prevent the item from being removed from the disk, but they don't mess with the namespace.
Why not just not support hard links?
A hard link is basically the same thing as a file name, so what you're saying is to limit files to having only one name. This would be to put an artificial restriction in the file system that people are not used to (except for the win32 crowd; their system is crippled to start with).
As an example, I use hard links for my mp3 folder. Song.mp3 is placed both in mp3/genres/Pop and mp3/artists/Artist; that way I can browse my collection according to both artist and genre. But the ability to have several names (and paths) for one file is useful in lots of other places.
No, this has to be solved in a better way.
hard links are a necessity
Whatever is chosen to access file attributes, it will break some application. So why use // ?
Using // to access file attributes
Or just use a different system call, which makes sense, since we *already* have system calls for file-system attributes...
Using // to access file attributes
Because if you can just use a special syntax to access the extended attributes, that makes them immediately accessible to the shell, scripts of various types, and programs that don't know how to access the attributes natively.
Using // to access file attributes
Apologize to RMS for implying Emacs is "badly written" :-)
Using // to access file attributes
Aren't '.' and '..' the canonical examples of hard links to a directory? How does the kernel handle locking for those two cases?
More notes on reiser4
Yeah, but those two are easily recognized, and can be special-cased, if
necessary... Basically, the only issue with those is first canonicalizing
your pathname (a la realpath()), and then names will be the same, no matter
what combinations of "." and ".." you use to reach the real destination...
However, with arbitrarily named hard-links, you don't have the ability to
recognize them... Or, rather, I should say you don't have the ability to
canonicalize them into any single specific format, since there is no "one
true name" for them... Ie: if you have "X" and "Y" as hard-links to the
same exact file, which one is the canonical name?? There's no way to decide
that... But, since everyone knows about the special cases of "." and "..",
those can be handled specially, with little trouble...
More notes on reiser4
Why is it that the first example here is a problem but the other two aren't?

Case 1:
Process 1 | Process 2
---|---
mv a/dir1 a/dir2/newdir | mv b/dir2 b/dir1/newdir

Case 2:
Process 1 | Process 2
---|---
mv a/dir1 a/dir2/newdir | mv a/dir2 a/dir1/newdir

Case 3:
Process 1 | Process 2
---|---
cd a | cd b
mv dir1 dir2/newdir | mv dir2 dir1/newdir

puzzled
Well, I don't pretend to know much about kernel internals, but I would assume
the reason your case #2 isn't a problem is because the files canonicalize
to the same exact pathnames in both processes, and presumably there must
be some kernel-level synchronization of such things as renaming, based on
the pathnames... So, basically, either one or the other of those processes
will succeed, and the second will fail, because the target directory no longer
exists at that point, because the first process beat it to the punch... But,
as I understand it, the issue with the arbitrarily-named hard-links is that
there's no way to recognize that "a/dir1" is the exact same thing as "b/dir1",
and hence no way to possibly synchronize these things, and prevent them from
clashing with each other... And, as such, I would think your example #3 is
also a problem...
Re: puzzled
No, the problem is precisely that if a directory has several parents, you need to make sure nobody messes around with one of the paths while you are screwing up another one. For that you'd have to be able to find all parents and lock them before doing anything... and that will have a heavy cost if done naïvely. Al (and Linus) is asking how they solve these problems. If the ReiserFS guys have good answers, symlinks (as second-class citizens) could be on the way out. I for one doubt they have the answers (or are able to come up with them). Time will tell.
Re: puzzled
No, the problem is precisely that if a directory has several parents [...] you'd have to be able to find all parents and lock them before doing anything... and that will have a heavy cost if done naïvely.
if it
was considered typical for a directory to have multiple
parent pointers, and there were consistent conventions for
performing atomic locking on all the parents of a directory
at once, there might be no problem.
There are a number of different things going on. First is that the things Reiser4 puts in the attribute space of a file seem to me to be virtual files, in the sense that they replicate existing metadata in the filesystem in a different namespace. Obviously, "echo test > CREDITS/metas/rwx" shouldn't act like that were an ordinary file, and "mv CREDITS/metas/uid CREDITS/metas/gid" doesn't make any sense. Likewise, "ln CREDITS/metas/rwx MAINTAINERS/metas/rwx" seems like it shouldn't be expected to succeed. There's just more expressive power available in the filesystem interface than can be logically supported. The operations which are, from the point of view of the VFS, problematic seem to me to be exclusively ones which Reiser4's use of an extended namespace over files should prohibit anyway.
More notes on reiser4
Looks like ReiserFS is getting too complicated and running into issues trying to extend POSIX functionality without breaking either POSIX compatibility or kernel internals.
Trying to fit an hierarchical database inside the kernel?
Comment from a non-technical user (non-technical at this level anyway).
More notes on reiser4
Some folks at Berkeley made an abortive attempt at a related idea, called log-structured filesystems, and it similarly required the filesystem to be periodically groomed using a "log cleaner" in order to repack and reoptimize the filesystem. For small benchmarks where the log cleaner doesn't need to run during the duration of the test, the results can look much better than under real-world use, where the cost of the log cleaner has to be included in the overhead of the filesystem.
Log structured filesystems had similar benchmarking issues....