Atomicity vs durability

Posted Mar 15, 2009 13:19 UTC (Sun) by bojan (subscriber, #14302)
In reply to: Atomicity vs durability by man_ls
Parent article: Ts'o: Delayed allocation and the zero-length file problem

Look, I'm all for reliability. But, if the manual says: "fsync if you want your data on disk" and we don't fsync, then it is us that are creating the problem.

I think we should come up with a new API that guarantees what people really want. Making the existing API do that on one particular FS will just make applications non-portable: they will break on any FS that implements the existing POSIX API without that behaviour. We've seen this with XFS. Who knows what's lurking out there. Better to do the proper thing, fsync and be done with it. Then we can invent the new, better, smarter API.
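For reference, a minimal sketch of what "fsync and be done with it" looks like for the usual replace-a-file case, written in Python for brevity. The helper name `atomic_write` is made up for illustration, and the final directory fsync is an extra step for making the rename itself durable on Linux, not something the manual quote above asks for.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data`; hypothetical helper, assumes a POSIX system."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename never
    # crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # userspace buffer -> kernel
            os.fsync(tmp.fileno())  # kernel -> disk, before the rename
        os.replace(tmp_path, path)  # rename(2): switch the name in one step
    except BaseException:
        # Best-effort cleanup of the temporary file on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: also fsync the directory so the rename is on disk too.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling `atomic_write("settings.conf", b"...")` is then the whole pattern; the cost is one or two synchronous flushes per update, which is exactly the performance trade-off being argued about in this thread.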



Atomicity vs durability vs reliability

Posted Mar 15, 2009 13:35 UTC (Sun) by man_ls (subscriber, #15091)

No, you are not all for reliability if you cannot see beyond your little POSIX manual. Or if you don't care about system crashes because the manual is silent about this particular point. Sorry to break it to you: reliability is about little details such as having a predictable response to a crash, or surviving the crash while retaining all the nice properties.

> I think we should come up with a new API that guarantees what people really want.

APIs are good enough as they are -- we don't need a special "reliability API" so we can build a special "reliability manual" for guys who just follow the book.

> We've seen this with XFS.

Nope. What we have seen with XFS is how some anal-retentive developers lost most of their user base while trying to argue such points as "POSIX-compliance", and then they finally give in. With ext4 we are hoping to get to the point where the devs give in before they lose most of their user base. Just because ext4 is important for Linux and for our world domination agenda. Meanwhile you can keep waving the POSIX standard in our face. The POSIX standard seems to be about compatibility, not about reliability, and it should keep playing that role. Reliability is left as an exercise for the attentive reader. Let us hope that Mr Ts'o is attentive and can tell atomicity, reliability and durability apart.

Actually it's done deal...

Posted Mar 15, 2009 17:34 UTC (Sun) by khim (subscriber, #9252)

If you read the comments on tytso's blog you'll see that the current position is: "POSIX is right and the applications are broken, yet we'll save them anyway". Even if the "proper way" is to fix thousands of applications, it's just not realistic - so ext4 (starting from 2.6.30) will try to save these broken applications by default. And if you want performance, there is a switch. Good enough for me. Can we close the discussion?

Actually it's done deal...

Posted Mar 15, 2009 21:10 UTC (Sun) by bojan (subscriber, #14302)

Exactly. Ted is a practical man, so he already put a workaround in place, until applications are fixed.

Sorry

Posted Mar 15, 2009 21:20 UTC (Sun) by man_ls (subscriber, #15091)

Sure, I have polluted the interwebs enough with my ignorance, and there is little chance to learn anything else.

Atomicity vs durability vs reliability

Posted Mar 15, 2009 21:06 UTC (Sun) by bojan (subscriber, #14302)

> No, you are not all for reliability if you cannot see beyond your little POSIX manual.

POSIX manual is not little ;-)

Seriously, we tell Microsoft that going out of spec is bad, bad, bad. But, we can go out of spec no problem. There is a word for that:

http://en.wikipedia.org/wiki/Hypocrisy

> What we have seen with XFS is how some anal-retentive developers lost most of their user base while trying to argue such points as "POSIX-compliance", and then they finally give in.

Yep, blame the people that _didn't_ cause the problem. We've seen that before.

Sorry, but I don't see it this way...

Posted Mar 15, 2009 22:08 UTC (Sun) by khim (subscriber, #9252)

I have yet to see anyone ask Microsoft never to go beyond the spec. That would be just insane: if you can never add anything beyond what the spec says, how can any progress occur?

When Microsoft is blamed it's because Microsoft
1. Does not implement spec correctly, or
2. Does not say which parts are spec requirements and which are extensions.

When Microsoft says "JNI is not sexy so we'll provide RMI instead" the ire is NOT about problems with RMI. Lack of JNI is to blame.

I don't see anything of the sort here: POSIX does not require open/write/close/rename to be atomic, but it certainly does not forbid it. And it's a useful thing to have, so why not? It would be best to actually document this behaviour, of course - after that applications can safely rely on it and other systems can implement it as well if they wish. We even have a nice flag to disable this extension if someone wants that :-)

Sorry, but I don't see it this way...

Posted Mar 15, 2009 22:24 UTC (Sun) by bojan (subscriber, #14302)

> 1. Does not implement spec correctly

Which is exactly what our applications are doing. POSIX says, commit. We don't and then we blame others for it.

This is the same thing HTML5 is doing

Posted Mar 15, 2009 22:33 UTC (Sun) by khim (subscriber, #9252)

Sorry, but it's not a problem with POSIX or the FS - it's a problem with the number of applications. Once a lot of applications start to depend on some weird feature (content sniffing in the case of HTML, atomicity of open/write/close/rename in the case of filesystems) it makes no sense to try to fix them all. Much better to document it and make it official. This is what Microsoft did with a lot of "internal" functions in MS-DOS 5 (and it was praised for it, not ostracized), this is what HTML is doing in HTML5 and this is what Linux filesystems should do.

Was it a good idea to depend on said atomicity? Maybe, maybe not. But the time to fix these problems come and gone - today it's much better to extend the spec.

This is the same thing HTML5 is doing

Posted Mar 15, 2009 23:37 UTC (Sun) by bojan (subscriber, #14302)

> But the time to fix these problems come and gone - today it's much better to extend the spec.

Time to fix these problems using the existing API is now, because right now we have the attention of everyone on how to use the API properly. To the credit of some in this discussion, bugs are already being fixed in Gnome (as I already mentioned in another comment). I also have bugs to fix in my own code - there is no denying that :-(

In general, I agree with you on extending the spec. But, before the spec gets extended officially, we need to make sure that _every_ POSIX compliant file system implements it that way. Otherwise, apps depending on this new spec will not be reliable until that's the case. So, can we actually make sure that's the case? I very much doubt it. There are a lot of different systems out there that implement POSIX, some of them very old. Auditing all of them and then fixing them may be harder than fixing the applications.

Why do we need such blessing?

Posted Mar 16, 2009 0:05 UTC (Mon) by khim (subscriber, #9252)

Linux extends POSIX all the time. New syscalls, new features (things like "According to the standard specification (e.g., POSIX.1-2001), sync() schedules the writes, but may return before the actual writing is done. However, since version 1.3.20 Linux does actually wait."), etc. If an application wants to use such an "extended feature" it can do so; if not, it can use POSIX-approved features only.

As for old POSIX systems... it's up to application writers again. And you can be pretty sure A LOT OF them don't give a damn about POSIX compliance. They are starting to consider Linux as a third platform for their products (the first two are obviously Windows and MacOS, in that order), but if you try to talk to them about POSIX it'll just lead to the removal of Linux from the list of supported platforms. Support for many distributions is already hard enough, support for some exotic filesystems gets "we'll think about it but don't hold your breath...", support for old exotic POSIX systems... fuggetaboudit!

Now - the interesting question is: do we welcome such selfish developers or not? This is a hard question because the answer "no, they should play by our rules" will just lead to an exodus of users - because they need these applications and WINE is not a good long-term solution...

Atomicity vs durability

Posted Mar 15, 2009 22:05 UTC (Sun) by dcoutts (guest, #5387)

Remember, we do not care if the data is on disk or not, just that if it does make it to disk that it preserves the atomic property we were after. All that needs to happen is for the rename not to be reordered in front of the write. That hardly restricts performance.
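To make the ordering argument concrete, here is a toy model (not filesystem code) that enumerates what a reader could find at the target name after a crash. It assumes just two operations reach the disk, the data write to the temporary file and the rename, and that a crash can hit after any prefix of whatever order the filesystem picks. The names `write_data`, `rename` and the state labels are made up for illustration.

```python
from itertools import permutations

# The two on-disk effects of "write temp file, then rename over target".
OPS = ("write_data", "rename")


def state_after_crash(committed):
    """Contents visible at the target name, given which ops reached disk."""
    if "rename" not in committed:
        return "old"      # target still names the old file
    if "write_data" in committed:
        return "new"      # rename and data both made it
    return "empty"        # rename made it, data did not: zero-length file


def possible_states(allow_reordering):
    """All post-crash states, with or without rename/write reordering."""
    orders = permutations(OPS) if allow_reordering else [OPS]
    states = set()
    for order in orders:
        # Crash after n operations have been committed to disk.
        for n in range(len(order) + 1):
            states.add(state_after_crash(set(order[:n])))
    return states
```

In this model `possible_states(False)` contains only "old" and "new", while `possible_states(True)` additionally contains "empty". Nothing here requires the data to be on disk at any particular time, which is the difference between asking for ordering and asking for durability.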

As for a new API, yes, that'd be great. There are doubtless other situations where it would be useful to be able to constrain write re-ordering. For example for writes within a single file if we're implementing a persistent tree structure where the ordering is important to provide atomicity in the face of system failure.
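As a rough sketch of the single-file case with today's API: new data is appended first, and only then is a small header updated to point at it, with fsync standing in for the ordering primitive that does not exist yet. fsync is stronger than needed (it forces durability, not just order), and the file layout, the function names and the assumption that the 8-byte header update is not torn by a crash are all invented for this example.

```python
import os
import struct

HEADER = struct.Struct("<Q")  # offset of the current root record; 0 = empty
LENGTH = struct.Struct("<I")  # length prefix in front of each record


def create(path):
    """Create an empty store whose header points at no record."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(0))
        f.flush()
        os.fsync(f.fileno())


def commit_root(path, payload: bytes):
    """Append a record, then make the header point at it, in that order."""
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        offset = f.tell()
        f.write(LENGTH.pack(len(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())  # "barrier": record on disk before it is referenced
        f.seek(0)
        f.write(HEADER.pack(offset))
        f.flush()
        os.fsync(f.fileno())  # make the new root pointer durable as well


def read_root(path):
    """Return the payload of the current root record, or None if empty."""
    with open(path, "rb") as f:
        (offset,) = HEADER.unpack(f.read(HEADER.size))
        if offset == 0:
            return None
        f.seek(offset)
        (length,) = LENGTH.unpack(f.read(LENGTH.size))
        return f.read(length)
```

If the machine dies between the two fsync calls, the header still points at the previous record, so a reader sees the old root. With a real ordering primitive the first fsync could become a cheap barrier and the second could be dropped whenever durability is not needed.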

Having a nice new API does not mean that the obvious cases that app writers have been using for ages are wrong. We should just insert the obvious write barriers in those cases.

Atomicity vs durability

Posted Mar 16, 2009 4:52 UTC (Mon) by dlang (✭ supporter ✭, #313)

remember that the drive has its own buffer (that usually isn't battery backed), and it will tell the OS that the data is written when it's in the buffer, not when it is on the disk. it then can re-order the writes to the disk.

so everything that you are screaming that the OS should guarantee can be broken by the hardware after the OS has done its best.

you can buy/configure your hardware to not behave this way, but it costs a bunch (either in money or in performance). similarly you can configure your filesystem to give you added protection, at a significant added cost in performance.

Atomicity vs durability

Posted Mar 16, 2009 11:00 UTC (Mon) by forthy (guest, #1525)

Any reasonable hard disk (SATA, SCSI) has write barriers which allow file system implementers to actually implement atomicity.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds