Better than POSIX?

Posted Mar 17, 2009 18:20 UTC (Tue) by sf_alpha (guest, #40328)
Parent article: Better than POSIX?

Good and balanced view.

I think it is also the duty of application developers to ensure their applications work on POSIX if they are writing UNIX applications.

There are many filesystems out there that implement delayed allocation. Although those filesystems are not the default filesystem for Linux, we expect applications to work regardless of which STABLE filesystem and operating system is used.

The allocate-on-commit mount option, which provides ext3-like behavior, is a workaround that gives applications time to migrate, as is the patch that commits on rename. But again, application developers get to keep ignoring the POSIX specification and compliance. If they want to write a Linux-only application, there is no problem relying on Linux or ext3 functionality. But I am sure that most applications are not intended to be used only on Linux and a few filesystems.

So if allocate-on-commit becomes the default behavior, we get non-portable (and buggy) applications in exchange.


Better than POSIX?

Posted Mar 17, 2009 19:52 UTC (Tue) by aleXXX (subscriber, #2742) [Link]

> I think it is also the duty of application developers to ensure their
> applications work on POSIX if they are writing UNIX applications.

Sure, sounds reasonable. But how do we do that? Most people have one box around, usually running Linux. How should they/we test that our software works fine everywhere? OK, one can install a virtual machine and run e.g. FreeBSD or Solaris on it. Not sure how many people do this in their spare time. At least for me it's quite at the top of my TODO.

E.g. Kitware has nightly builds and testing for basically all their software on basically all operating systems, i.e. Linux, Windows, OSX, AIX, HP-UX, Solaris, FreeBSD, QNX and others (here's the current dashboard for CMake: http://www.cdash.org/CDash/index.php?project=CMake). But setting this up is quite some effort; you need to find people to host these builds for you. Not every small project can afford this.

Alex

Better than POSIX?

Posted Mar 18, 2009 5:45 UTC (Wed) by sf_alpha (guest, #40328) [Link]

OK. I did not mean testing on every system, just paying attention to POSIX to ensure data integrity and that the application works as expected on most systems.

If an application is designed to be used only on Linux with ext3, that is OK: just ignore this problem and rely on ext3's robustness. The only drawbacks are that the application is not portable and still MAY LOSE DATA IN A CRASH.

Better than POSIX?

Posted Mar 17, 2009 20:17 UTC (Tue) by quotemstr (subscriber, #45331) [Link]

> I think it is also the duty of application developers to ensure their applications work on POSIX if they are writing UNIX applications.

No. I don't support every POSIX system, and it's not my responsibility to do that. I'll decide where I want my application to fall on the spectrum between simplicity and portability, not you.

Better than POSIX?

Posted Mar 17, 2009 22:29 UTC (Tue) by nix (subscriber, #2304) [Link]

Ah, but just use gnulib and your program will work everywhere, including
POSIX environments like mingw. ;}}}}

Better than POSIX?

Posted Mar 17, 2009 22:21 UTC (Tue) by man_ls (subscriber, #15091) [Link]

> I think it is also the duty of application developers to ensure their applications work on POSIX if they are writing UNIX applications.

Those applications do work on POSIX systems. They also happen to leave hordes of little empty files after a crash, something which, as has been argued ad nauseam, is a POSIX-compliant way of dealing with a crash. It is also POSIX-compliant to make the user hunt these little buggers and remove them, or to provide valid contents. It seems that it is even POSIX-compliant to zero the whole disk on crash, something which these applications kindly refrain from doing. See? Nobody is ignoring POSIX specification and compliance.

Just joking. You are right that people should respect the spec, but I think that POSIX compliance is not the problem here. In fact POSIX is just a red herring that Ted Ts'o threw in the way to make the hounds lose the scent. Apparently he failed, but he left the hounds half-crazed and biting each other for a long time.

Better than POSIX?

Posted Mar 18, 2009 0:45 UTC (Wed) by bojan (subscriber, #14302) [Link]

> They also happen to leave hordes of little empty files after a crash, something which, as has been argued ad nauseam, is a POSIX-compliant way of dealing with a crash.

I know you are joking here, but these files are not empty because of some evil "POSIX compliant" way of dealing with a crash by the FS or kernel. They are empty because they were never committed to the disk by running fsync().

So, it is not that POSIX compliant FS decided to _remove_ that data upon crash. It was never _explicitly_ placed there in the first place, so it cannot possibly be there after the crash.

Sure, you could have a very rare situation where you do run fsync() and you get a zero length file, for all sorts of reasons (usually hardware and kernel driver issues). This is not the case here, however.

> POSIX is just a red herring

I have to disagree here and agree with Ted. The manual page of close is very specific and says:

> A successful close does not guarantee that the data has been successfully saved to disk, as the kernel defers writes. It is not common for a file system to flush the buffers when the stream is closed. If you need to be sure that the data is physically stored use fsync(2). (It will depend on the disk hardware at this point.)

So, if you want to have _any_ guarantee that you will see your data on disk _now_, you had better fsync().

The manual page of rename(2) is similarly clear on what is atomic: just the directory entries. Sure, ignore at your peril.

And, finally, the documentation of fsync() is also crystal clear that one is allowed to run it independently on a directory and on a file. Which means that users themselves are allowed to do this separately (and they do). So does the kernel.
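
For readers who want to see the steps discussed above in one place, here is a minimal sketch in Python, whose `os` module wraps the same system calls. It writes a temporary file, fsync()s it, rename()s it over the target, and then fsync()s the containing directory. The function name `replace_file_durably` and the `.tmp` suffix are illustrative assumptions, not anything from the manual pages, and the sketch assumes a Linux-like system where a directory can be opened read-only and passed to fsync().

```python
import os


def replace_file_durably(path: str, data: bytes) -> None:
    """Replace `path` with `data`, syncing the file before the rename."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical temporary name in the same directory

    # 1. Write the new contents to a temporary file and push them to disk.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # os.write may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # close() alone gives no guarantee the data is on disk
    finally:
        os.close(fd)

    # 2. Swap the directory entry; rename() is atomic for the entry only.
    os.rename(tmp_path, path)

    # 3. fsync the directory separately so the rename itself is recorded.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Whether step 3 is needed, and how expensive steps 1 and 3 are, depends on the filesystem; the sketch only shows where each call sits.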

Sure, Ted is being gentle to everyone with bugs in their apps, which is fair enough. But, I'll bet $5 they'll hit the same thing on another Unix-like OS in the future, at which point all this screaming will happen again and people writing buggy code will accuse FS writers that it's their fault.

In the meantime, it is easy to take backups of configuration files rarely and restore them if real config files are broken. And it doesn't require running fsync() all the time.

3 out of 4?

Posted Mar 18, 2009 7:22 UTC (Wed) by man_ls (subscriber, #15091) [Link]

You seem to think that:
  • if you repeat the same red herring long enough then it becomes the truth,
  • you can ignore the distinctions thrown in your face once and again,
  • and that the last person to reply wins the discussion,
Just curious. Do lurkers support you in email?

3 out of 4?

Posted Mar 18, 2009 8:46 UTC (Wed) by bojan (subscriber, #14302) [Link]

Very funny :-)

Look, it is completely up to you to believe what you want of course. I will do the same. OK?

3 out of 4?

Posted Mar 20, 2009 5:02 UTC (Fri) by k8to (subscriber, #15413) [Link]

I'm personally in agreement.

The applications are expecting behavior POSIX does not provide.
The applications should stop expecting this.

It's fine to use a pattern that doesn't request the data be on disk, but you should write the app to deal with the lack of the data being on disk.

This is what I've done many times, in my own software authoring.
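
One way an application might "deal with the lack of the data being on disk" is sketched below in Python: treat a missing or zero-length file as "not written yet" and fall back to an older backup copy. The function name `load_config` and both path parameters are hypothetical, and this is only an illustration of the idea, not a description of any particular application.

```python
from typing import Optional


def load_config(path: str, backup_path: str) -> Optional[bytes]:
    """Return config contents, tolerating a missing or empty primary file."""
    for candidate in (path, backup_path):
        try:
            with open(candidate, "rb") as f:
                data = f.read()
        except FileNotFoundError:
            continue  # nothing there, try the next candidate
        if data:  # a zero-length file is treated as never having been written
            return data
    return None  # caller is expected to fall back to built-in defaults
```

A real application would probably also validate the contents (a checksum or a parse step) rather than only checking for emptiness.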

Congratulations

Posted Mar 20, 2009 11:52 UTC (Fri) by man_ls (subscriber, #15091) [Link]

I am sure you also run your Bash console in POSIX mode, never use ls with long options, and only use cp with the four-and-a-half POSIX options. Congratulations. The rest of the world is not so spartan.

I hope I don't have to use your software that fsync()s after every file operation. Mistaking durability for atomicity can have dire consequences both for durability and for atomicity.

Besides, you are not a lurker and this is not email.

Better than POSIX?

Posted Mar 18, 2009 2:03 UTC (Wed) by ras (subscriber, #33059) [Link]

> You are right that people should respect the spec, but I think that POSIX compliance is not the problem here.

It strikes me as odd that an open source OS uses a non-free spec to define its operations. Doesn't it strike anybody else as odd that we have a whole pile of people here arguing about compliance to a spec they most likely haven't seen? I see statements like "ensure your app only relies on stuff in POSIX". Perfectly good advice, except how is your typical open source developer meant to do that when he can't get access to the bloody thing?

That aside, I gather (since I have not been able to get a copy of POSIX myself) that POSIX doesn't offer much to programmers who want to ensure some combination of consistency and durability. This sort of stuff is a basic requirement if you want to produce a reliable application. The furor here is an indication of just how basic it is. Yet even if you did have access to the spec, I gather it doesn't spell out how to do this. So programmers have learnt a bunch of ad hoc heuristics, like "to get consistency without the slowdowns caused by durability, use open();write();close();rename()". Then we get accused of "not adhering to the spec" when the next version of the FS doesn't implement the heuristic. Give us a break!

Ted's suggestion that you should be using sqlite if you want to write out a few hundred bytes of text reliably is on one level almost a joke. I presume he suggested it because the sqlite authors have taken the time to learn all the heuristics to get data on the disc reliably. Given it _is_ so hard to figure out all those heuristics for the various file systems your application could find itself running on, I guess it is a reasonable suggestion. Unfortunately, as the Firefox programmers found out, it doesn't always work. Yeah, sqlite got the data onto the disc reliably, but only by using fsync(), which killed performance on some platforms. Given you probably don't care if your latest browsing history hit the disc in 5 minutes time, it is a great illustration of why programmers are so fond of "open();write();close();rename()".

From talking to a MySQL developer, I gather the situation is even worse than most posting here realise. Not only does the rename() trick not work, it turns out just about anything beyond fdatasync() doesn't work. For example, you might expect that appending to a file would be fairly safe. Well, apparently not, according to POSIX. He said that if you append to a file, there is a chance on a POSIX system that the entire file could be truncated if you crash at the wrong moment. The only way to guarantee a file can't be corrupted by a write is to ensure you don't affect the metadata (think block allocations), i.e. always write to pre-allocated blocks. Need to extend your 100GB database? Well then you have to copy it, write zeros to the extra space at the end to ensure it isn't sparsely allocated, then use the fsync(); rename() trick.

And that should be a joke. Pity it isn't. Given that filesystems aren't going to implement ACID, we need a set of primitives we can use to build up our own implementations of ACID. Fast, simple things, along the lines of the CPU instruction "Test Bit and Set", which is there so assembly programmers can implement all sorts of complex locking schemes on top of it. And we need them defined in a spec that we can actually access, unlike POSIX.

Given that ain't going to happen, Ted's only way out of this is to publish such a document for his filesystems, the ext* series. Just a series of HOWTOs would be a good start: HOWTO extend a large file reliably, HOWTO get consistent data written to disc (i.e. impose ordering on writes) without the slowdowns of unwanted sync()s, HOWTO ensure a rename() for a file you don't have open has hit the disc. Nothing fancy. Just the basic operations we application writers are expected to implement reliably every day on his file systems.
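
As a rough illustration of the "extend a large file" procedure described in this comment (copy, write real zeros to the new tail, fsync(), then rename()), here is a small Python sketch. The name `extend_by_copy` and the `.grow` suffix are assumptions made up for the example. It does not fsync the directory afterwards, and whether writing zeros really forces block allocation depends on the filesystem, so treat it as a sketch of the idea rather than a guarantee.

```python
import os
import shutil


def extend_by_copy(path: str, extra_bytes: int) -> int:
    """Grow `path` by `extra_bytes` of zeros via a synced copy and a rename.

    Returns the old size, i.e. the offset where the new space begins.
    """
    tmp_path = path + ".grow"  # hypothetical temporary name in the same directory
    old_size = os.path.getsize(path)

    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

        # Write actual zero bytes instead of seeking past the end,
        # so the new tail is not left as a sparse hole.
        zeros = bytes(1 << 20)
        remaining = extra_bytes
        while remaining > 0:
            remaining -= dst.write(zeros[:remaining])

        dst.flush()             # empty Python's userspace buffer
        os.fsync(dst.fileno())  # then ask the kernel to write the file out

    os.rename(tmp_path, path)   # swap the grown copy into place
    return old_size
```

Later writes into the pre-allocated region would then overwrite existing blocks instead of allocating new ones, which is the property the MySQL developer was after.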

Better than POSIX?

Posted Mar 18, 2009 4:35 UTC (Wed) by butlerm (subscriber, #13312) [Link]

Most of the POSIX specs are online these days. Google "POSIX IEEE".

Better than POSIX?

Posted Mar 18, 2009 13:17 UTC (Wed) by RobSeace (subscriber, #4435) [Link]

The actual POSIX specs may not be available anywhere for free, but the Single Unix Specs are, and they are essentially a superset of POSIX, and probably what most people really mean when they say "POSIX" these days...

Better than POSIX?

Posted Mar 18, 2009 15:41 UTC (Wed) by markh (subscriber, #33984) [Link]

POSIX.1-2008 is available here.
POSIX and SUS have merged, and are now the same thing. The link in the last comment, and the first google link, point to the older 2004 edition.

Better than POSIX?

Posted Mar 18, 2009 20:27 UTC (Wed) by sb (subscriber, #191) [Link]

> It strikes me as odd that an open source OS uses a non-free spec to define its operations. Doesn't it strike anybody else as odd that we have a whole pile of people here arguing about compliance to a spec they most likely haven't seen? I see statements like "ensure your app only relies on stuff in POSIX". Perfectly good advice, except how is your typical open source developer meant to do that when he can't get access to the bloody thing?

Read the online specifications, particularly the System Interfaces volume.

Read also the Linux manpage for the system call you are using. It will say which standards the implementation adheres to, and how it departs from those standards.

On a Debian system, install "manpages-dev" and "manpages-posix-dev". For most system interfaces, you will then have the Linux implementation in section 2 and the POSIX spec in section "3posix".

Better than POSIX?

Posted Mar 19, 2009 0:31 UTC (Thu) by ras (subscriber, #33059) [Link]

Thanks to everybody pointing out POSIX is actually available nowadays - with links even.

Times have apparently changed, and it is a big improvement. The last time I went searching for POSIX was when I was referred to it by some man page which just said it implemented "POSIX regex's", and got completely pissed off when I discovered the manual entry for a supposedly free library referred me to a non-free spec.

Better than POSIX?

Posted Mar 19, 2009 18:43 UTC (Thu) by anton (guest, #25547) [Link]

> So if allocate-on-commit becomes the default behavior, we get non-portable (and buggy) applications in exchange.

You might get a few more applications that sync before renaming, but that does not make them any more portable or bug-free. If the OS crashes, that's not an application bug nor a portability problem. If the user uses a file system that gives no crash consistency guarantees (e.g., ext4), that's not an application bug or portability problem. A user using such a file system should just back up frequently and be prepared to restore from backup in case of a crash. Application programming doesn't have anything to do with it.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds