Posted Mar 17, 2009 16:04 UTC (Tue) by xav (guest, #18536)
Parent article: Better than POSIX?
Maybe the answer is a new set of guarantees for Linux's POSIX API, e.g. an overwriting rename() will either leave the old or new version to disk, atomically.
Posted Mar 17, 2009 17:01 UTC (Tue) by drag (subscriber, #31333)
A simple way to say it would be:
In the actual physical file system image committed to disk: don't name partially written files.
------
That would pretty much get what everybody wants. I suppose it's much, much more complicated than that, of course. I'll take "good enough" any day.
Better than POSIX?
Posted Mar 17, 2009 17:52 UTC (Tue) by vonbrand (subscriber, #4458)
"Don't name partially written files" would mean that nothing has a name until the file is closed, and the file has to disappear whenever it is opened for writing... I'd take the current mess situation any time in the face of that.
Better than POSIX?
Posted Mar 17, 2009 23:50 UTC (Tue) by drag (subscriber, #31333)
Who said it has to disappear when being re-opened? It was fully written once and thus had a name. The fact that it was edited again doesn't change that. :)
It certainly will solve the write() then rename() issue. :)
And as I recall, I have heard of file system designers deliberately zeroing out files for various reasons.
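For reference, the write() then rename() sequence under discussion usually looks something like the sketch below. It is written in Python and assumes a POSIX system; `atomic_write` is a made-up helper name, not an API from any library, and whether the explicit fsync() calls should be required is exactly what this thread is arguing about.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` via a temporary file and rename().

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the
    # rename never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the new data to disk before the name points at it.
            os.fsync(tmp.fileno())
        # rename() semantics: a reader sees the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Flush the directory entry as well (assumes a directory can be
    # opened read-only and fsync()ed, as on Linux).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that mkstemp() creates the temporary file with restrictive permissions, so a real application would probably want to copy the mode of the file it is replacing.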
Postponing flush until close()?
Posted Mar 18, 2009 23:17 UTC (Wed) by xoddam (subscriber, #2322)
Am I to understand that you're proposing that any changes to a file (metadata and data blocks alike) not be flushed to disk until close()?
That doesn't really sound like a good way to enhance recoverability. For applications that keep large files (e.g. caches) open for a long time and update them piecemeal, it sounds like sheer madness.
Applications that truncate existing data before rewriting it are asking for trouble, though I appreciate a filesystem that doesn't exacerbate the race condition by promptly truncating the inode but delaying the flush of the new data blocks for several seconds. Ted has already fixed that particular issue heuristically by delaying truncation until it is time to flush the data. Flushing *early* on close() couldn't hurt integrity but could hurt performance quite a bit.
Postponing flush until close()?
Posted Mar 19, 2009 23:34 UTC (Thu) by xoddam (subscriber, #2322)
Sorry, I just read the patch. If a file has been opened with O_TRUNC then it will indeed be flushed (early) when closed. The race condition still exists of course, but flushing on close will keep the risky interval relatively short in the vast majority of cases.
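To make the risky interval concrete, here is a small Python sketch of the truncate-then-rewrite pattern, assuming a POSIX system. `rewrite_in_place` is a hypothetical name, and the size it reports is only the in-memory view of the file; what actually survives a crash depends on the filesystem and on when it flushes.

```python
import os
import tempfile


def rewrite_in_place(path, data):
    """Overwrite `path` the risky way: truncate first, then write.

    Returns the file size observed between the two steps.
    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
    try:
        # The old contents are already gone and the new ones have not
        # been written yet; this is the window the thread is about.
        size_in_window = os.fstat(fd).st_size
        os.write(fd, data)  # short writes are ignored for brevity
        # Explicit flush, so nothing depends on a close() heuristic.
        os.fsync(fd)
    finally:
        os.close(fd)
    return size_in_window
```

Calling it on an existing file returns 0 for the size seen inside the window, which is the state an unlucky crash could leave behind.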
Maybe not
Posted Mar 17, 2009 21:40 UTC (Tue) by man_ls (subscriber, #15091)
Maybe the answer is a new set of guarantees for Linux's POSIX API, e.g. an overwriting rename() will either leave the old or new version to disk, atomically.
Why? As has been pointed out, ext2 is perfectly fine for many applications (for example, in data centers with 3-way redundant power supplies and redundant storage, or for temporary filesystems), and it would never be Linux-POSIX-compliant this way.
Do you really think it is better to force everyone to comply with a new standard than to try to convince ext4 developers to do the (obvious) right thing?
Posted Mar 18, 2009 22:57 UTC (Wed) by xoddam (subscriber, #2322)
I can think of no reason at all why a guarantee of this sort should not be considered desirable for any filesystems that try to ease crash recovery. It may be out of ext2's reach (because its code does not impose a strict partial ordering on disk writes), but it should be achievable as an enhancement to any journaling, log-structured or soft-update filesystem.
Posted Mar 18, 2009 23:25 UTC (Wed) by man_ls (subscriber, #15091)
It is by all means desirable. The proper place for such a standard might be debated though. I have always understood that POSIX is a standard for compatibility, e.g. Wikipedia says:
POSIX or "Portable Operating System Interface for Unix"[1] is the collective name of a family of related standards specified by the IEEE to define the application programming interface (API), along with shell and utilities interfaces for software compatible with variants of the Unix operating system, although the standard can apply to any operating system.
So I don't know if a standard for reliable file systems would fit in.
Better than POSIX?
Posted Mar 17, 2009 22:30 UTC (Tue) by dhess (guest, #7827)
Maybe the answer is a new set of guarantees for Linux's POSIX API, e.g. an overwriting rename() will either leave the old or new version to disk, atomically.
Yeah, I've come to a similar conclusion. Perhaps the rename() semantics alone are sufficient. It's simple enough conceptually that it might be relatively easy to get other operating systems to adopt the new semantics, too, at least for the filesystems that can support it. And it sounds like there's already a quite common belief amongst application developers that all filesystems behave this way, anyway.
In a previous life, I worked on memory ordering models in CPUs and chipsets. During this recent ext4 hubbub, it dawned on me that the issues with ordering and atomicity in high-performance filesystem design may be isomorphic to memory ordering. Even if that's not strictly true, filesystem designers and API writers could probably learn a lot from modern CPU memory ordering models, because memory ordering is a well-explored space by this point in the history of computer engineering. I don't just mean the technical semantics, either, but the whole social aspect, too: how to balance good performance with software complexity, and how much of that complexity to expose to application programmers, who often have neither the time nor the background to understand all of the tradeoffs, let alone dot all the "i"s and cross all the "t"s. Anyway, changing rename's semantics as you suggest would be the equivalent of a "release store" in memory ordering terms, and seems to be exactly the right kind of tradeoff in this situation.
Better than POSIX?
Posted Mar 17, 2009 23:04 UTC (Tue) by quotemstr (subscriber, #45331)
Thanks for that comment --- it's amazing how much knowledge we're rediscovering in computing. It's almost as if we're coming out of some kind of dark age.
One thing that struck me was a comment on a Slashdot story about a "breakthrough" in data center energy optimization. The comment showed that the problem of deciding when to boot up additional servers to meet demand was isomorphic to the problem of steam boiler management --- right down to the start-up and constant energy costs --- and that the problem had already been thoroughly addressed in literature from the turn of the last century.
Better than POSIX?
Posted Mar 17, 2009 23:52 UTC (Tue) by dhess (guest, #7827)
Hmm, that is interesting! I'll file the steam boiler analogy away for later use.
Alan Kay never misses an opportunity to point out that our field has a terrible track record when it comes to learning from, or even being aware of, our history, let alone that of other related fields. There's a lot of unfortunate "rediscovering" of knowledge in computer science and engineering. (I'm as guilty of it as anyone.) I think it's a good habit to consider Alan's admonishment when we're faced with challenges or seeking solutions to problems, so I guess I'll follow his lead by mentioning it here :)
Better than POSIX?
Posted Mar 17, 2009 23:57 UTC (Tue) by rahvin (subscriber, #16953)
That's because FOSS has opened up the gates to allow technology experience and knowledge to flow around society instead of being trapped behind corporate copyright, trade secrets and patents. What was once a segregated system where you could only learn from those who worked with you directly has turned into a system where the experts from every company provide wisdom and training to the new kids at every company. The corporate policies that once locked the experts who built the foundation we are trying to build upon behind the corporate veil have been blasted apart, and their knowledge is now being shared to everyone's benefit.
Imagine for how many years the wheel was reinvented over and over again at hundreds of companies as people relearned how to code something the proper way for a certain scenario. It's scary to think how fast we could have developed software if it had been FOSS all along instead of corporations each trying to slit each other's throats. In the case of computer information systems, the sharing of code advances the technology as a whole much faster than the private corporate system of the past ever did. Of course this isn't always true. Niche software will probably always need the economic support closed systems provide even if it divides the efforts among a few companies who reinvent each other's innovations.
Better than POSIX?
Posted Mar 26, 2009 10:56 UTC (Thu) by massimiliano (subscriber, #3048)
That is one explanation.
Another one is that we don't have an "engineering culture" in software development.
I mean, software developers are not necessarily engineers, so they rarely know about issues like steam boiler management.
But most importantly, a software developer is seldom trained to think at an engineering level. I remember that when I studied for my degree, I was taught about power plants, engines, turbines, cooling plants, pipes, dissipators... none of that has anything to do directly with software development. But after a few years of studying those systems it becomes obvious that there are lots of analogies between them, and very often the mathematical models that describe them are the same.
The teachers themselves pointed this out every time they could, and they did it on purpose, to teach us to recognize the patterns.
Now, I'm not claiming engineers are necessarily better than others in this sense. I know many guys who quit college and are better than me at understanding aspects of different technologies.
What I'm claiming is that very often people reinvent the wheel not because the previous wheel was a secret, but because they do not have this "engineering culture" of knowing different kinds of wheels in advance, and being able to understand correctly in which ways they are similar and when they are relevant.
And without going to different disciplines, how many software developers have a good "culture" about the basic concepts needed in their job, like recurring algorithms and patterns?
I mean, how many actually tried to read Donald Knuth's books (or similar ones), or at least consult them when appropriate?
There are lots of answers already published, but we continue reinventing them anyway...
My 2c,
Massimiliano
Better than POSIX?
Posted Mar 27, 2009 1:01 UTC (Fri) by nix (subscriber, #2304)
HEAR HEAR.
I'd estimate that I spend 20% of my time at work ripping out people's
buggy broken slow reimplementations of wheels and replacing them with a
wheel that uses twenty-to-forty-year-old techniques to do the same thing
faster and more reliably.
(And do the reimplementations stop? No! I ditched a chained hash table
implementation today which had a stupid bug which led to every element
landing in the same bucket. Obviously it was too hard to look
in "include/hash.h" to find that there was already a hash table in the
system with a better API...)
I mean it's not as if computers are bad at searching for things, but half
the people I work with are tentative and reluctant to just grep for a few
plausible terms to see if they can avoid reinventing the damn wheel yet
again.
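For what it's worth, a degenerate chained hash table of this kind is easy to spot by counting chain lengths. The Python sketch below is purely hypothetical: the comment above does not say what the actual bug was, so `broken_index` is an invented example of a mistake that sends every key to bucket 0, and `bucket_sizes` and `sane_index` are made-up names as well.

```python
def bucket_sizes(keys, n_buckets, index_of):
    """Chain `keys` into `n_buckets` lists and return each chain's length."""
    buckets = [[] for _ in range(n_buckets)]
    for key in keys:
        buckets[index_of(key, n_buckets)].append(key)
    return [len(chain) for chain in buckets]


def broken_index(key, n_buckets):
    # Invented bug: scaling by the table size before taking the modulo
    # always yields 0, so every key lands in the same chain.
    return (hash(key) * n_buckets) % n_buckets


def sane_index(key, n_buckets):
    # Plain modulo of the hash spreads keys across all buckets.
    return hash(key) % n_buckets
```

With 100 small integer keys and 8 buckets, the broken version puts all 100 keys in one chain, while the plain modulo version keeps the longest chain at 13.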
Pride
Posted Mar 27, 2009 5:49 UTC (Fri) by quotemstr (subscriber, #45331)
One explanation for all these square wheels is the phase every programmer goes through during which he overestimates his abilities, has no sense of scale, and lacks a sense for robustness. In short, he's proud, ignorant, and dangerous. He believes that libraries are bloated and slow, and that he can out-perform standard implementations. He optimizes prematurely, avoids function calls, abuses the ternary operator, and doesn't use a profiler.
Eventually, these programmers grow up, but in the meantime, they've written a significant amount of horrible code. I've seen this pattern again and again. As the parent mentioned, software developers have no "engineering culture." I imagine that in more established engineering disciplines, students have the above attitude beaten out of them before they graduate.