Ted speaks again
Posted Mar 16, 2009 10:19 UTC (Mon) by regala (subscriber, #15745)
So to you, I'd say, "your mouth spits again". You all rant even though you never read the programming guidelines (the POSIX standards), and that makes me sick. Know your place, for Christ's sake.
Posted Mar 16, 2009 12:29 UTC (Mon) by forthy (guest, #1525)
Maybe some people didn't read the POSIX standard. But this doesn't
actually matter, because Ted Ts'o didn't read it either. He's just
ducking behind it, because POSIX makes no promise in case of a system
crash. That's anal-retentive, because POSIX makes promises about ordering
(ordering is strong), and a supposed-to-be-reliable file system
should keep that order even after a crash. If ext4 doesn't (ext3 in
data=ordered mode does; btrfs should, according to the FAQ, and bugs
permitting actually will in 2.6.30, too; etc.), then the users will
just not use ext4.
In the long thread of the previous discussion about this topic, the
semantics behind the different operations were clearly described.
POSIX promises atomicity of operations like rename(). Should this atomicity
be preserved in case of a crash? Sane file system design says: yes.
People who use create-write-close-rename want atomicity; if they
also want durability (i.e. to know that the new file is actually
committed), they need fsync, too. To be precise: fsync on the file
and on the directory. Atomic operations are part of the POSIX file
system semantics; durability is part of fsync's semantics.
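A minimal sketch of the create-write-close-rename sequence described above, assuming a POSIX system where os.rename() replaces an existing target. The function name replace_file and the durable flag are hypothetical, chosen only for illustration; with durable=False the fsync calls are skipped, and whether the rename then stays ordered after the data across a crash is up to the file system, which is exactly the point under dispute here.

```python
import os
import tempfile


def replace_file(path, data, durable=True):
    """Replace `path` with `data` (bytes) via create-write-close-rename.

    Hypothetical helper: with durable=True it also fsyncs the new file
    and the containing directory, as the comment above suggests.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the new file in the same directory, so the rename never
    # crosses a file system boundary. mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            if durable:
                tmp.flush()             # push Python's buffer to the kernel
                os.fsync(tmp.fileno())  # ask the kernel to commit the data
        os.rename(tmp_path, path)       # switch the name to the new file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    if durable:
        # fsync the directory so the new directory entry is committed too.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Something like replace_file("settings.conf", b"key=value\n", durable=False) would be the configuration-file case; an application that has to know the data is committed would keep the default.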
When do you need durability? E.g. in a networked ordering system: if
you receive an order over the network, you update your books, make the first
half of the booking durable, confirm the order, and when you know that
the confirmation is out, you finalize your booking and make that
durable as well (double handshake). You don't need durability if you
just update your configuration settings, but you need atomicity to avoid
losing all configuration settings.
Note that POSIX still does not
guarantee anything in case of a crash - complete loss of data and
metadata is "allowed". Whether I or anyone else actually wants to
use such a file system is a completely different question.
Posted Mar 16, 2009 13:17 UTC (Mon) by kleptog (subscriber, #1183)
Now we're at the stage of worrying about exactly what the files should look like after a crash. Give it a few years and I'm sure we'll find something else to worry about. Also, POSIX was written a long time ago and was deliberately vague on some points because its authors wanted to support many existing systems which all worked slightly differently.
NB: ISTM the solution to the 'lots of little files on ext3' problem is obvious. Create all the new files, then fsync them (fsync on ext3 may be slow, but it wouldn't be as much of a problem this way because all the data would be written out for all the files in one go). Finally rename them all.
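The three phases suggested above (create everything, fsync everything, rename everything) could be sketched roughly as follows. The name replace_many and the ".new" suffix for the temporary names are assumptions made for this illustration, and the sketch makes no claim about how fast the fsync phase is on any particular file system.

```python
import os


def replace_many(updates):
    """Replace several files at once; `updates` maps final path -> bytes.

    Hypothetical helper illustrating create-all, fsync-all, rename-all.
    """
    staged = []

    # Phase 1: create and write every new file under a temporary name.
    for path, data in updates.items():
        tmp_path = path + ".new"  # assumed naming scheme, not a convention
        with open(tmp_path, "wb") as f:
            f.write(data)
        staged.append((tmp_path, path))

    # Phase 2: fsync each new file, only after all of them are written.
    for tmp_path, _ in staged:
        fd = os.open(tmp_path, os.O_WRONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)

    # Phase 3: rename every new file over its final name.
    for tmp_path, path in staged:
        os.rename(tmp_path, path)
```

Each file is reopened for the fsync phase instead of being held open, so that a large batch does not run into the per-process limit on open file descriptors.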
Posted Mar 16, 2009 15:25 UTC (Mon) by drag (subscriber, #31333)
Well ya. That's progress I guess. People always want better, demand better.
In the case of Linux you're traditionally dealing with half-way decent hardware running with a UPS and run by professionals. That is, you're designing the OS to perform well and reliably when managed by a person who knows, understands, and cares quite a bit about the hardware they are using.
Now with consumer-oriented Linux devices you're dealing with people constantly putting excessive demands and loads on the system (especially graphics, which has been a weak point in stability for all systems, including Linux), and with devices that are cheap and mass-produced, run by people who don't even understand what an OS is, have to operate with as little power as possible, and have users with very low tolerance for anything really technical.
In this specific case you have Ubuntu users using unstable graphics drivers with developer versions of the operating system. They were crashing their systems frequently; several times a day sometimes. They are doing weird things like overclocking RAM and all that crap.
They were finding that Ext4 was eating a significant portion of their file system, whereas Ext3 didn't.
But that is just the tip of the iceberg. You're going to deal with mobile phones with batteries that just 'crap out'. You're going to deal with mobile internet devices that get used in abusive environments. You're going to deal with hand held devices that suspend to RAM a dozen times a minute.
Try explaining to your grandma or to the guy down the street running a Moblin netbook that their system is not bootable anymore, or they can't use most of their applications, because POSIX doesn't give a shit that users get half their file system blown away when they shut their devices down incorrectly.
I don't know the best way to fix it, whether it's best to:
* Get the Kernel developers to care about maintaining a consistent file system image on the disk at all times
* Get the biggest clue stick in the world and collectively drive the "fsync is your friend" point home to all potential Linux developers.
* third option
I don't know.
But certainly demands and expectations change. Just like everything else in the computing landscape changes.
Posted Mar 16, 2009 16:54 UTC (Mon) by kleptog (subscriber, #1183)
Honestly, I don't see why POSIX should care. It's a standard that describes an API that can be used by programs that wish to be portable. In principle it could be implemented on anything from the smallest handheld to the largest mainframe. Reliability after a crash is outside the purview of POSIX since the requirements are vastly different in different situations. People writing software for embedded devices don't rely on POSIX to give them crash safety; they read the manuals for the device to see what the manufacturers say they should do.
POSIX compliance is a property of the OS-userspace boundary; crash-safety is a property of an entire system. They're largely orthogonal.
In my opinion it's wrong for people to say that either behaviour is mandated by POSIX.
IMHO it's neither mandated nor forbidden. Crash reliability is a contract between you and the OS+hardware+kernel. A ramdisk can be POSIX compliant yet is clearly not crash safe. Leave POSIX out of it, decide what Linux wants to guarantee. POSIX provides a way of guaranteeing a certain reliability but Linux is free to provide additional guarantees if it sees fit.
Maybe something for LSB? I'd like to see the language lawyers work out a way of defining "crash-safety" in a way that doesn't exclude things like ramdisks and several existing filesystems.
Posted Mar 16, 2009 14:54 UTC (Mon) by k8to (subscriber, #15413)
Posted Mar 17, 2009 0:08 UTC (Tue) by jlokier (guest, #52227)
"POSIX promises atomicity of operations like rename()"
It promises atomicity of the directory modification done by rename, and every version of ext4 provides that. Renaming is equivalent to an atomic sequence of unlink() and link() calls.
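To illustrate the equivalence mentioned above, here is a rough sketch that spells rename out as separate unlink() and link() steps next to the single rename() call. The function names are hypothetical; the step-by-step version is deliberately not atomic and is shown only to make visible the window that rename() closes.

```python
import os


def rename_in_steps(old, new):
    """Non-atomic emulation of rename(old, new), for illustration only."""
    try:
        os.unlink(new)   # step 1: drop the existing target name, if any
    except FileNotFoundError:
        pass
    # Window: a crash or a concurrent reader here finds no file at `new`.
    os.link(old, new)    # step 2: add the new name for the same inode
    os.unlink(old)       # step 3: remove the old name


def rename_atomically(old, new):
    """The same directory change done as one atomic step by the kernel."""
    os.rename(old, new)
```

Both functions leave the directory in the same final state. Neither says anything about whether the file's contents have reached the disk.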
You're confusing atomicity of the directory modification with serialising against the file content modification. POSIX doesn't promise anything about that in the absence of fsync() or fdatasync() used as a barrier between them. [I can't tell from the standard if fdatasync() is sufficient.]
Posted Mar 16, 2009 15:34 UTC (Mon) by nix (subscriber, #2304)
And, 'know your place'? WTF? This isn't an empire and Ted is not King:
although he is surely worthy of respect, we peons are not forbidden to
talk to the Mighty. This is a *good* thing, ffs.
Posted Mar 20, 2009 14:53 UTC (Fri) by regala (subscriber, #15745)
Reread all this. Respect left this thread behind long ago...
Posted Mar 16, 2009 21:33 UTC (Mon) by bojan (subscriber, #14302)
PS. If you had read any of my comments, you would know that I agree that Ted's interpretation of what is permitted by POSIX is correct.
Posted Mar 20, 2009 13:49 UTC (Fri) by regala (subscriber, #15745)
Know your place
Posted Mar 18, 2009 1:08 UTC (Wed) by xoddam (subscriber, #2322)
Talk like that makes me sick.
Posted Mar 20, 2009 13:53 UTC (Fri) by regala (subscriber, #15745)
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds