Better than POSIX?

Posted Mar 18, 2009 2:03 UTC (Wed) by ras (subscriber, #33059)
In reply to: Better than POSIX? by man_ls
Parent article: Better than POSIX?

> You are right that people should respect the spec, but I think that POSIX compliance is not the problem here.

It strikes me as odd that an open source OS uses a non-free spec to define its operations. Doesn't it strike anybody else as odd that we have a whole pile of people here arguing about compliance to a spec they most likely haven't seen? I see statements like "ensure your app only relies on stuff in POSIX". Perfectly good advice, except how is your typical open source developer meant to do that when he can't get access to the bloody thing?

That aside, I gather (since I have not been able to get a copy of POSIX myself) that POSIX doesn't offer much to programmers who want to ensure some combination of consistency and durability. This sort of stuff is a basic requirement if you want to produce a reliable application. The furor here is an indication of just how basic it is. Yet even if you did have access to the spec, I gather it doesn't spell out how to do this. So programmers have learnt a bunch of ad hoc heuristics, like "to get consistency without the slowdowns caused by durability, use open();write();close();rename()". Then we get accused of "not adhering to the spec" when the next version of the FS doesn't implement the heuristic. Give us a break!
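For anyone who hasn't met the heuristic, here is a minimal sketch of it in Python, whose os module wraps the same system calls. The function name replace_file, the ".tmp" suffix and the durable flag are made up for illustration. Whether the rename alone gives you consistency after a crash is exactly the behaviour the spec does not promise, so treat this as a picture of the habit, not a guarantee.

```python
import os

def replace_file(path, data, durable=False):
    """Write data to a temp file next to path, then rename it over path."""
    tmp = path + ".tmp"  # hypothetical naming convention, same directory as path
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than asked, so loop until done
            written += os.write(fd, data[written:])
        if durable:
            # Optional: pay for durability by forcing the data out first
            os.fsync(fd)
    finally:
        os.close(fd)
    # rename() atomically replaces the name; readers see old or new contents
    os.rename(tmp, path)
```

With durable=False this is the plain open();write();close();rename() sequence; with durable=True it is the slower variant that forces the data to disc before the rename.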

Ted's suggestion that you should be using sqlite if you want to write out a few hundred bytes of text reliably is on one level almost a joke. I presume he suggested it because the sqlite authors have taken the time to learn all the heuristics to get data on the disc reliably. Given it _is_ so hard to figure out all those heuristics for the various file systems your application could find itself running on, I guess it is a reasonable suggestion. Unfortunately, as the Firefox programmers found out, it doesn't always work. Yeah, sqlite got the data onto the disc reliably, but only by using fsync(), which killed performance on some platforms. Given you probably don't care if your latest browsing history hits the disc in 5 minutes' time, it is a great illustration of why programmers are so fond of "open();write();close();rename()".
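To make the trade-off concrete, here is a rough sketch using Python's sqlite3 module and SQLite's "PRAGMA synchronous" setting, which controls how hard SQLite tries to sync to disc on commit. The function name open_history_db and the visits table are invented for the example, and this is not a claim about what Firefox actually did. Turning synchronous OFF avoids the sync cost, but it also gives up the crash safety that was the reason for picking sqlite in the first place.

```python
import sqlite3

def open_history_db(path, durable=True):
    """Open a small SQLite database, choosing whether commits sync to disc."""
    conn = sqlite3.connect(path)
    # FULL: SQLite syncs at the critical moments of each commit.
    # OFF: SQLite hands data to the OS and does not wait for it to reach disc.
    mode = "FULL" if durable else "OFF"
    conn.execute("PRAGMA synchronous = " + mode)
    # Hypothetical schema, just enough to have something to commit
    conn.execute("CREATE TABLE IF NOT EXISTS visits (url TEXT, ts REAL)")
    return conn
```

A caller that doesn't mind losing the last few minutes of history would pass durable=False; one that cannot lose a commit would leave the default.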

From talking to a MySQL developer, I gather the situation is even worse than most posting here realise. Not only does the rename() trick not work, it turns out just about anything beyond fdatasync() doesn't work. For example, you might expect that appending to a file would be fairly safe. Well, not so, apparently, according to POSIX. He said that if you append to a file, there is a chance on a POSIX system that the entire file could be truncated if you crash at the wrong moment. The only way to guarantee a file can't be corrupted by a write is to ensure you don't affect the metadata (think block allocations), i.e. always write to pre-allocated blocks. Need to extend your 100GB database? Well then you have to copy it, write zeros to the extra space at the end to ensure it isn't sparsely allocated, then use the fsync(); rename() trick.
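Spelled out, the copy, zero-fill, fsync(), rename() dance looks something like the sketch below. The function name extend_preallocated, the ".grow" suffix and the chunk size are all assumptions for the example; this is the procedure as it was described to me, not something any spec or filesystem is documented to guarantee.

```python
import os
import shutil

def extend_preallocated(path, new_size, chunk=1 << 20):
    """Grow path to new_size bytes via a zero-filled copy, then rename."""
    tmp = path + ".grow"  # hypothetical temp name in the same directory
    shutil.copyfile(path, tmp)
    fd = os.open(tmp, os.O_WRONLY | os.O_APPEND)
    try:
        remaining = new_size - os.fstat(fd).st_size
        zeros = bytes(chunk)
        while remaining > 0:
            # Write real zeros so the new space is allocated, not sparse
            n = os.write(fd, zeros[:min(chunk, remaining)])
            remaining -= n
        # Force both the data and the new allocation out before the swap
        os.fsync(fd)
    finally:
        os.close(fd)
    os.rename(tmp, path)
```

The obvious cost is that extending a 100GB file means rewriting 100GB, which is why it ought to be a joke.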

And that should be a joke. Pity it isn't. Given that filesystems aren't going to implement ACID, we need a set of primitives we can use to build up our own implementations of ACID. Fast, simple things, along the lines of the CPU instruction "Test Bit and Set", which is there so assembly programmers can implement all sorts of complex locking schemes on top of it. And we need them defined in a spec that we can actually access - unlike POSIX.

Given that ain't going to happen, Ted's only way out of this is to publish such a document for his filesystems - the ext* series. Just a series of HOWTOs would be a good start - HOWTO extend a large file reliably, HOWTO get consistent data written to disc (i.e. impose ordering on writes) without the slowdowns of unwanted sync()s, HOWTO ensure a rename() for a file you don't have open has hit the disc. Nothing fancy. Just the basic operations our applications are expected to implement reliably every day on his file systems.
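As an example of the sort of thing such a HOWTO would have to confirm or correct, here is the folklore answer to the last one: fsync() the directory that contains the renamed file. This is a sketch of a commonly used Linux technique, not documented behaviour I can point to; the names fsync_dir and durable_rename are invented, and opening a directory for fsync() may not work on every platform.

```python
import os

def fsync_dir(path):
    """fsync the directory containing path, so its entries reach the disc."""
    dirpath = os.path.dirname(os.path.abspath(path))
    fd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

def durable_rename(src, dst):
    """Rename src to dst, then sync the affected directory entries."""
    os.rename(src, dst)
    fsync_dir(dst)
    # If the file moved between directories, the old one changed too
    if os.path.dirname(os.path.abspath(src)) != os.path.dirname(os.path.abspath(dst)):
        fsync_dir(src)
```

Note this never opens the file being renamed, which is the case the HOWTO question is about.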


Better than POSIX?

Posted Mar 18, 2009 4:35 UTC (Wed) by butlerm (subscriber, #13312)

Most of the POSIX specs are online these days. Google "POSIX IEEE".

Better than POSIX?

Posted Mar 18, 2009 13:17 UTC (Wed) by RobSeace (subscriber, #4435)

The actual POSIX specs may not be available anywhere for free, but the Single Unix Specs are, and they are essentially a superset of POSIX, and probably what most people really mean when they say "POSIX" these days...

Better than POSIX?

Posted Mar 18, 2009 15:41 UTC (Wed) by markh (subscriber, #33984)

POSIX.1-2008 is available here.
POSIX and SUS have merged, and are now the same thing. The link in the last comment, and the first google link, point to the older 2004 edition.

Better than POSIX?

Posted Mar 18, 2009 20:27 UTC (Wed) by sb (subscriber, #191)

> It strikes me as odd that an open source OS uses a non-free spec to define its operations. Doesn't it strike anybody else as odd that we have a whole pile of people here arguing about compliance to a spec they most likely haven't seen? I see statements like "ensure your app only relies on stuff in POSIX". Perfectly good advice, except how is your typical open source developer meant to do that when he can't get access to the bloody thing?

Read the online specifications, particularly the System Interfaces volume.

Read also the Linux manpage for the system call you are using. It will say which standards the implementation adheres to, and how it departs from those standards.

On a Debian system, install "manpages-dev" and "manpages-posix-dev". For most system interfaces, you will then have the Linux implementation in section 2 and the POSIX spec in section "3posix".

Better than POSIX?

Posted Mar 19, 2009 0:31 UTC (Thu) by ras (subscriber, #33059)

Thanks to everybody pointing out POSIX is actually available nowadays - with links even.

Times have apparently changed, and it is a big improvement. The last time I went searching for POSIX was when I was referred to it by some man page which just said it implemented "POSIX regex's", and I got completely pissed off when I discovered the manual entry for a supposedly free library referred me to a non-free spec.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds