LWN: Comments on "PostgreSQL's fsync() surprise" https://lwn.net/Articles/752063/ This is a special feed containing comments posted to the individual LWN article titled "PostgreSQL's fsync() surprise". en-us Thu, 06 Nov 2025 14:26:16 +0000 Thu, 06 Nov 2025 14:26:16 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Failed writeback to removable devices https://lwn.net/Articles/841490/ https://lwn.net/Articles/841490/ andrit <div class="FormattedComment"> I wonder how FreeBSD (who keeps the pages dirty in memory on writeback failures) handles this...<br> </div> Thu, 31 Dec 2020 15:13:16 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/780291/ https://lwn.net/Articles/780291/ nybble41 <div class="FormattedComment"> <font class="QuotedText">&gt; Of what interest is a portable program that is not useful?</font><br> <p> It's not a matter of either/or. Programs should be both portable *and* useful.<br> <p> <font class="QuotedText">&gt; A user can expect that qsort sorts, but can they expect that it does so reasonably quickly? How often can you call fsync to maintain a reasonable balance between speed and safety? That's never going to be defined by the standard...</font><br> <p> Why not? Standards do sometimes specify things like algorithmic complexity. C doesn't specify that for qsort(), unfortunately, but C++ does require std::sort() to be O(n log n) in the number of comparisons. What constitutes a "reasonable balance" is up to the user, but there is no reason in principle why there couldn't be a standard for "filesystems useable with PostgreSQL" which defines similar timing requirements for fsync().<br> </div> Tue, 19 Feb 2019 22:39:41 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/780141/ https://lwn.net/Articles/780141/ dvdeug <div class="FormattedComment"> Of what interest is a portable program that is not useful? 
It's trivial to write a portable program; just check uname at the start and exit on any system but the one it was written for. Nobody does that, because it's not useful.<br> <p> As for which parts may vary from version to version, version 3 may adhere to an entirely different standard than version 2. The fact that there is a standard may do you no good if it's evolving rapidly along with the software.<br> <p> From the other side, even if you are standards conforming, that may not be enough. A user can expect that qsort sorts, but can they expect that it does so reasonably quickly? How often can you call fsync to maintain a reasonable balance between speed and safety? That's never going to be defined by the standard, but an understanding needs to be reached by the authors of a program like PostgreSQL.<br> <p> I don't believe it's a question of abstract philosophy. If standards were a tool used in some places and not others in the computer world, and at their best were understood not to be sufficient to be binding on either implementer or user, then it would be reasonable to use unstandardized language in writing standards. But if _all_ APIs should depend on standards, then using an unstandardized language is hard to justify when, again, formal languages like Lojban or simply standardized ones like French exist.<br> </div> Tue, 19 Feb 2019 00:42:33 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/780095/ https://lwn.net/Articles/780095/ nybble41 <div class="FormattedComment"> <font class="QuotedText">&gt; A POSIX-compliant system could have malloc just return an error for all calls. ... Even POSIX-compatible systems aren't perfectly interchangeable.</font><br> <p> True, but irrelevant. I only mentioned POSIX as an example. No one is expecting a complex project like PostgreSQL to work equally well under all POSIX-compliant operating systems; there will be other dependencies. 
<br> <p> Regarding the first point, a POSIX-compliant program would check for malloc() errors and either recover or terminate in a well-defined way. The program is portable as long as the behavior is well-defined for all conforming implementations; this is a separate consideration from being *useful*.<br> <p> <font class="QuotedText">&gt;&gt; Standards are how users and implementers of an API communicate.</font><br> <font class="QuotedText">&gt; In theory, but not in reality. Most of the APIs a major program depends on are implemented by one library and have but vague descriptions of how it works outside the source code and behavior of that library.</font><br> <p> What you are describing is a failure to communicate. Programs written this way are inherently non-portable because they are written to fit the specifics of particular implementations. Any change to an implementation can cause any program to break in unspecified ways. This is the problem which standards exist to solve. They allow implementers and users of an interface to agree on roles and responsibilities; implementers can improve their code without worrying about breaking standards-compliant users, and users know which parts of the interface they can rely on and which parts may vary from one implementation (or version) to the next.<br> <p> <font class="QuotedText">&gt; How can we know what a standard means if the language it is written in is unstandardized?</font><br> <p> "How can digital logic exist when all electronic components have analog characteristics?" This is bordering on abstract philosophy in the "can two people ever truly communicate" sense, but I'll try to answer it seriously anyway: We distinguish between parts of the language we can rely on for clear communication and parts which, while perhaps useful in other contexts, fail to clearly convey our intent, and build up more complex constructs from elements of the first set. 
The subset of natural language used for formal standards is actually pretty tightly constrained compared to literature in general. Even so, the dependency on natural language for formal specifications is a weak point and communication does occasionally break down as a result. We have feedback mechanisms in place to detect such breakdowns and correct them by issuing clarifications or revising the standards.<br> </div> Mon, 18 Feb 2019 20:09:54 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/779860/ https://lwn.net/Articles/779860/ dvdeug <div class="FormattedComment"> A POSIX-compliant system could have malloc just return an error for all calls. A POSIX system that could be reasonable in some circumstances could have malloc return an error for any malloc over a megabyte; the first port of Unix was to the Interdata 8/32, with 256 KB of memory. There is no non-trivial Unix program that doesn't make assumptions about the POSIX system it's running on.<br> <p> <font class="QuotedText">&gt; if someone puts together a new OS which follows all the relevant standards neither you nor they can be confident that your program will work on it unmodified.</font><br> <p> Even POSIX-compatible systems aren't perfectly interchangeable. In the case of a program like PostgreSQL, it's usually important not just that it runs, but that it runs well, and POSIX cannot and does not guarantee speed constraints; even Linux alone can store its filesystems in many different ways on many different media, and some of those combinations may not work in practice for PostgreSQL.<br> <p> <font class="QuotedText">&gt; Standards are how users and implementers of an API communicate.</font><br> <p> In theory, but not in reality. Most of the APIs a major program depends on are implemented by one library and have but vague descriptions of how they work outside the source code and behavior of that library. 
There were many Unixes before POSIX, and many C and C++ compilers before the first standard was written down. Many people still depend on specialized features of GNU C, enough that several compilers have to copy those unstandardized features. Standards are wonderful if they're followed, but many are underspecified or usually just ignored. New versions of the C, C++ and Scheme standards have removed features that older standards mandated because they were not well supported.<br> <p> A huge example is the fact that most of these standards are written in English, an unstandardized language, not Lojban or even French. How can we know what a standard means if the language it is written in is unstandardized? But, for the most part, we manage.<br> </div> Thu, 14 Feb 2019 21:21:44 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/771457/ https://lwn.net/Articles/771457/ immibis <div class="FormattedComment"> That is almost exactly the abstraction that a file is supposed to provide - except that it's a fixed size (and consequently you can't get ENOSPC because you are handling the space allocation yourself). You can still get EIO. 
Or EUSERPULLEDTHEDRIVEOUT.<br> </div> Mon, 12 Nov 2018 04:03:16 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/754063/ https://lwn.net/Articles/754063/ ringerc <div class="FormattedComment"> Further reading at:<br> <p> * <a href="https://lwn.net/Articles/753650/">https://lwn.net/Articles/753650/</a><br> * <a href="https://lwn.net/Articles/752952/">https://lwn.net/Articles/752952/</a><br> * <a href="https://lwn.net/Articles/752613/">https://lwn.net/Articles/752613/</a><br> * <a href="https://www.postgresql.org/message-id/20180427222842.in2e4mibx45zdth5@alap3.anarazel.de">https://www.postgresql.org/message-id/20180427222842.in2e...</a><br> <p> </div> Thu, 10 May 2018 01:25:50 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/754049/ https://lwn.net/Articles/754049/ nilsmeyer <div class="FormattedComment"> It can be configured in MySQL with variables like innodb_flush_method and sync_binlog, also MySQL uses a threading model instead of multiple processes so I suppose some of the issues regarding file descriptors don't crop up, and one would usually use direct io bypassing most other caches (O_DIRECT). Of course this assumes one runs InnoDB, I don't know how RocksDB/MyRocks behaves in this case. <br> <p> Basically MySQL / InnoDB will manage all the buffering and try to bypass the kernel buffering as much as possible. This is why you usually try to allocate most (like 75/80%) of the memory on a MySQL server to the InnoDB buffer pool. <br> </div> Wed, 09 May 2018 20:14:00 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/753619/ https://lwn.net/Articles/753619/ ssmith32 <div class="FormattedComment"> Hmmm. I dunno, I feel like if you're running a DB of any type, with data that you care about, you definitely should be thinking about your storage layer (whatever is below the OS - in today's world you may not know the actual physical setup, but you should know the performance &amp; reliability characteristics of it). 
<br> <p> It's not really the OS's job to make unreliable hardware reliable or slow hardware performant. <br> <p> And I do agree with the thin provisioning.. for critical data.. just don't.<br> </div> Sun, 06 May 2018 06:04:52 +0000 Déjà vu? https://lwn.net/Articles/753618/ https://lwn.net/Articles/753618/ marcH <div class="FormattedComment"> Indeed it took Java a long time between trying and succeeding - to be expected when you're first to do... both? Years before others started to merely express a decent level of interest in formalization and standardization?<br> <p> </div> Sun, 06 May 2018 01:46:47 +0000 Déjà vu? https://lwn.net/Articles/753523/ https://lwn.net/Articles/753523/ ncm <div class="FormattedComment"> Java was also the first language to have an unimplementable memory model. Oops!<br> <p> They tried, bless their hearts.<br> </div> Fri, 04 May 2018 04:53:38 +0000 PostgreSQL's fsync() surprise: Patch proposed https://lwn.net/Articles/753513/ https://lwn.net/Articles/753513/ tech2018 <div class="FormattedComment"> Seems like a patch has already been developed (April 24, 2018)<br> <a rel="nofollow" href="https://patchwork.kernel.org/patch/10358111/">https://patchwork.kernel.org/patch/10358111/</a><br> </div> Thu, 03 May 2018 22:09:06 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/753424/ https://lwn.net/Articles/753424/ james That means you can only run them on systems with that sort of storage available -- which means<br> <tt>dnf install package-that-uses-postgresql-as-a-database-engine</tt> <br> doesn't have a chance of Just Working. Thu, 03 May 2018 11:26:40 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/753405/ https://lwn.net/Articles/753405/ zlynx <div class="FormattedComment"> Raw, unformatted blocks I would suppose. 
We could call it postgresfs.<br> </div> Thu, 03 May 2018 06:20:38 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/753385/ https://lwn.net/Articles/753385/ andresfreund <div class="FormattedComment"> Which would be?<br> </div> Thu, 03 May 2018 00:03:43 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/753383/ https://lwn.net/Articles/753383/ gerdesj <div class="FormattedComment"> Perhaps it has been discussed to death before but why not put DBs on some sort of DB oriented storage instead of say xfs/ext{n}/btrfs/fat16? <br> </div> Wed, 02 May 2018 23:56:35 +0000 Failed writeback to removable devices https://lwn.net/Articles/752921/ https://lwn.net/Articles/752921/ Wol <div class="FormattedComment"> Dunno what relevance this has to my use case - I regularly do multi-gigabyte network copies (24MP raw camera images, HD video etc) - but this absolutely kills my laptop performance.<br> <p> On a twin-core machine, load average will hit 4 or 5 or 6, and system response basically goes through the floor. Actually, the cause could well be that RAM is flooded, but whatever the cause, it's rather frustrating.<br> <p> Cheers,<br> Wol<br> </div> Fri, 27 Apr 2018 10:39:20 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752898/ https://lwn.net/Articles/752898/ nybble41 <div class="FormattedComment"> <font class="QuotedText">&gt; It would be *potentially* non-portable, not necessarily. It would become actually non-portable if an unpleasant implementation appears.</font><br> <p> See, you're talking about level 2 (particular implementations). Portable program *design* happens at level 3 (design/logic). If your program relies on behavior which is undefined according to the standard then it is non-portable, regardless of whether other implementations behave the same way. You can't say "this program works on any POSIX-compatible system", for example. 
You know that it works on Linux version X and maybe BSD version Y, but if someone puts together a new OS which follows all the relevant standards neither you nor they can be confident that your program will work on it unmodified.<br> <p> <font class="QuotedText">&gt; Most programmers do not formally verify their programs, but instead test them.</font><br> <p> Formal verification in this context is a red herring. Tests are also a form of proof, albeit in the weaker courtroom-style, balance-of-evidence sense rather than the strict mathematical sense. The point is that without a standard you don't have a sound basis for reasoning "I called the function with these arguments, therefore the implementer and I both know that it should do this." Standards are how users and implementers of an API communicate. Relying on undefined behavior in your program is like speaking gibberish and expecting the listener to guess what you meant; there is a breakdown in communication, and the problem isn't on the implementer's end.<br> <p> <font class="QuotedText">&gt; Any system worth using (e.g., Linux) maintains in future versions the pleasantness it has supported in earlier versions.</font><br> <p> As zlynx already explained, that is an unreasonable expectation and even Linux doesn't always operate that way.<br> </div> Thu, 26 Apr 2018 22:42:57 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752884/ https://lwn.net/Articles/752884/ andresfreund <div class="FormattedComment"> <font class="QuotedText">&gt; It's tricky. PostgreSQL would like not to have to completely crash and burn if one file on one tablespace becomes impossible to properly flush, so something that gives it options would be nice.</font><br> <p> FWIW, I don't agree that that's a useful goal. 
It'd be nice in theory, but it's not even remotely worth the sort of engineering effort it'd require.<br> <p> <p> <font class="QuotedText">&gt; Nobody really wants a kernel panic or database crash because we can't fsync() some random session table that gets nuked by the app every 15 minutes anyway, after all.</font><br> <p> I don't think that's a realistic concern. If your storage fails, you're screwed. Continuing to behave well in the face of failing storage would require a *LOT* of work. We'd need timeouts everywhere, we'd need multiple copies of the data etc.<br> </div> Thu, 26 Apr 2018 18:02:18 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752879/ https://lwn.net/Articles/752879/ zlynx <div class="FormattedComment"> <font class="QuotedText">&gt; Any system worth using (e.g., Linux) maintains in future versions the pleasantness it has supported in earlier versions. </font><br> <p> No, because that is an unreasonable limit.<br> <p> Simply because of implementation limits, ext3 serialized file and directory updates in a certain way and for many years. So people got used to it. But it never applied to ext2, XFS or FAT or literally ANY other filesystem. Not to mention BSD's UFS or Hammer2, or Apple's HFS. Heck, it didn't even apply to ext3 in certain configurations.<br> <p> And then people tried to require that ext4 work the same way. And btrfs. And even wanted to go back to force XFS to work that way too.<br> <p> The correct answer is to fsync() everything, which would show how bad ext3 was at that particular operation. All those fsyncs make things slower for people using ext3, but that does not mean fsync is the wrong answer. It just means ext3 was a filesystem with a terrible fsync() implementation that people got used to using.<br> <p> "Pleasant behavior" is often simply what programmers have become used to. 
It doesn't make it correct or actually pleasant.<br> </div> Thu, 26 Apr 2018 17:06:23 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752878/ https://lwn.net/Articles/752878/ Wol <div class="FormattedComment"> <font class="QuotedText">&gt; I wonder if the "what POSIX mandates" in the article really refers to a mandate by POSIX, or another case of lack of definition that an implementor sees as a welcome opportunity for an unpleasant surprise. </font><br> <p> As I understand it, POSIX explicitly *avoids* specifying what happens when things go wrong, precisely because POSIX has no idea what's happened.<br> <p> So a Linux standard that says "this is the way we handle errors" will be completely orthogonal to POSIX. And would be a good thing ...<br> <p> The trouble with POSIX is it's an old standard that is out of date, and while I believe there is some effort at updating it, there is far too much undefined behaviour out there.<br> <p> Cheers,<br> Wol<br> </div> Thu, 26 Apr 2018 16:29:34 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752869/ https://lwn.net/Articles/752869/ anton <blockquote>All true, of course, but "unpleasant behavior" can still be a reasonable choice.</blockquote> Yes, as mentioned, when implementing on a system with 64KB, you may not be able to afford the pleasantness. But we would not be discussing this topic if all cases of unpleasant behaviour were reasonable. <blockquote> Any application which *relied* on system-specific "pleasant" behavior would necessarily be non-portable. </blockquote> It would be *potentially* non-portable, not necessarily. It would become actually non-portable if an unpleasant implementation appears. But so what? I am pretty keen on portability, but life's too short for unreasonably unpleasant implementations. 
If your program does not run in 64KB anyway, there is no need to cater to that reasonable unpleasantness; and if you want to cater to unreasonable unpleasantness, it's your time and money to waste (after all, some people write programs in Brainfuck), but I would not recommend it to anyone else. <blockquote> If "pleasant" behavior is desirable then, IMHO, the right solution is to standardize the behavior so that applications can be written against the standard and not one particular implementation. </blockquote> If you think so, go ahead and work on standardizing pleasant behaviours. But as mentioned, there is the issue of constrained systems where you cannot afford the pleasantness. One solution is to specify several levels of the standard. The minimal level allows unpleasantness that is reasonable on constrained systems; a higher level specifies more pleasantness. However, if you have unreasonable implementors in the standards committee, you will be out of luck in your standardization effort. <p>Concerning reporting when undefined behaviour is performed, that's a relatively pleasant way to deal with the situation. It's not appropriate when the application developer actually wants to rely on a specific behaviour and does not want to "fix" it, but it certainly makes it clear that your implementation is not pleasant enough to run this application. <blockquote> In the end, an application which relies on a specific implementation of undefined behavior, pleasant or unpleasant, is broken. </blockquote> No, it isn't. If it behaves as intended in a specific setting, it's working, not broken. It may be unportable, but that does not make it broken. <blockquote>since the application is not in compliance with the standard, one cannot prove that it will work on any standard-compliant system </blockquote> Most programmers do not formally verify their programs, but instead test them. 
There is no way to prove that a program is in compliance with a standard by testing, even if the programmer intends to avoid undefined behavior. But even the few programmers that actually use formal verification for their programs cannot prove that their programs comply with most standards (e.g., POSIX), because most standards are not formally specified. So this whole proof issue is a red herring. <blockquote>including future versions of the same system.</blockquote> Any system worth using (e.g., <a href="https://felipec.wordpress.com/2013/10/07/the-linux-way/">Linux</a>) maintains in future versions the pleasantness it has supported in earlier versions. Thu, 26 Apr 2018 16:22:13 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752874/ https://lwn.net/Articles/752874/ ringerc <div class="FormattedComment"> It's tricky. PostgreSQL would like not to have to completely crash and burn if one file on one tablespace becomes impossible to properly flush, so something that gives it options would be nice.<br> <p> But it also needs to be able to know reliably that "all data from last successful flush is now fully flushed", so it can make decisions appropriately. Right now it turns out we can't know that.<br> <p> Nobody really wants a kernel panic or database crash because we can't fsync() some random session table that gets nuked by the app every 15 minutes anyway, after all. In practice that won't happen because the table is usually created UNLOGGED but there are always going to be tables you don't want to lose, but don't want the whole system to grind to a halt over either.<br> <p> </div> Thu, 26 Apr 2018 16:18:15 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752742/ https://lwn.net/Articles/752742/ xxiao <div class="FormattedComment"> What about MySQL: does it use DIO and never touch fsync()? 
Maybe this is not just postgresql-specific?<br> </div> Wed, 25 Apr 2018 14:12:27 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752688/ https://lwn.net/Articles/752688/ nybble41 <div class="FormattedComment"> <font class="QuotedText">&gt; If the standards do not specify what the implementation should do ("undefined behaviour" or somesuch), there is nothing in the standard that the implementation could follow, and it's the sole responsibility of the implementor to choose a particular behaviour. If, in such a situation, the implementor chooses to implement unpleasant behaviour, it's his fault, and his fault alone; the standard did not make him do it.</font><br> <p> All true, of course, but "unpleasant behavior" can still be a reasonable choice. Any application which *relied* on system-specific "pleasant" behavior would necessarily be non-portable. If "pleasant" behavior is desirable then, IMHO, the right solution is to standardize the behavior so that applications can be written against the standard and not one particular implementation. In the meantime, the most productive choice when undefined behavior is detected is to complain as loudly as possible, or even terminate the process, rather than allow the application to silently continue in an undefined state. This ensures that the application developer is made aware of the issue and has both the opportunity and incentive to fix it. (However, this outcome should remain *undefined* behavior so that this can be changed in the future if and when more pleasant behavior is standardized.) Going out of one's way to make undefined behavior "pleasant" is a form of attractive nuisance, in that it tends to encourage non-portable code.<br> <p> In the end, an application which relies on a specific implementation of undefined behavior, pleasant or unpleasant, is broken. 
A particular installation may do the right thing for certain known inputs; one may even be able to prove that it does the right thing for all possible inputs given perfect knowledge of the implementation in use on a particular system. However, the third layer of software[1]—design/logic—is missing: since the application is not in compliance with the standard, one cannot prove that it will work on any standard-compliant system, including future versions of the same system.<br> <p> [1] <a href="http://www.pathsensitive.com/2018/01/the-three-levels-of-software-why-code.html">http://www.pathsensitive.com/2018/01/the-three-levels-of-...</a><br> </div> Tue, 24 Apr 2018 16:32:02 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752573/ https://lwn.net/Articles/752573/ andresfreund <div class="FormattedComment"> It's not optimal, but what freebsd appears to do is to just clear the error *and* mark the buffer as dirty again. So the error will be hit again and again (unless the device is gone):<br> <a href="https://github.com/freebsd/freebsd/blob/master/sys/kern/vfs_bio.c#L2633">https://github.com/freebsd/freebsd/blob/master/sys/kern/v...</a><br> </div> Mon, 23 Apr 2018 21:19:38 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752571/ https://lwn.net/Articles/752571/ helsleym <div class="FormattedComment"> I am not contesting your assertion but I am curious -- would you care to elaborate on what FreeBSD does that "[gets] it (mostly) right"?<br> </div> Mon, 23 Apr 2018 21:00:32 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752452/ https://lwn.net/Articles/752452/ anton Yes, ideally standards would be complete. 
In practice, they tend to specify just the intersection of the behaviour of the existing implementations (in line with the requirement that a standard should standardize common practice), as well as considering various constraints on outlier systems; e.g., "We want this standard to be implementable on a system with 64KB RAM, and mandating the pleasant behaviour would cost several KB for this subfeature alone, so we leave the behaviour unspecified." And then a bloody-minded implementor for systems that use multiple GBs of RAM uses the lack of specification as justification to implement unpleasant behaviour. <p>And don't forget that standards are decided through consensus in the committee, so it takes just a few bloody-minded implementors on the standards committee to block any progress towards pleasantness. <blockquote>If everyone is expected to be nice instead of following the standards</blockquote> That's an excellent example of what I mean by "hiding behind the standard", and why I suspect that "what POSIX mandates" is in reality different from what was claimed in the discussion described in the article. If the standards do not specify what the implementation should do ("undefined behaviour" or somesuch), there is nothing in the standard that the implementation could follow, and it's the sole responsibility of the implementor to choose a particular behaviour. If, in such a situation, the implementor chooses to implement unpleasant behaviour, it's his fault, and his fault alone; the standard did not make him do it. Sat, 21 Apr 2018 14:54:01 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752420/ https://lwn.net/Articles/752420/ zlynx <div class="FormattedComment"> If the user-friendly, pleasant behavior is expected then it should be in the standard. 
If it isn't, there's a reason for that and implementors should be able to be as bloody-minded as they please.<br> <p> If everyone is expected to be nice instead of following the standards, then there's no point in the current standard and it should be replaced with the "be nice" version.<br> <p> For example, there are people who expect TCP/IP to deliver their packets in the same-sized chunks they were sent. These people are simply wrong. But by the "be nice" standard we'd have to write stupid networking stacks because some people expect behavior that isn't required.<br> <p> Maybe it's time for a POSIX 2020 standard. But if it isn't in there, don't expect it to work like anything else.<br> </div> Fri, 20 Apr 2018 18:12:02 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752388/ https://lwn.net/Articles/752388/ cornelio <div class="FormattedComment"> It appears only FreeBSD got it (mostly) right.<br> <p> The advantage of keeping around the most experienced filesystem developer ever.<br> </div> Fri, 20 Apr 2018 14:38:36 +0000 Failed writeback to removable devices https://lwn.net/Articles/752379/ https://lwn.net/Articles/752379/ ringerc <div class="FormattedComment"> From a PostgreSQL point of view that's actually nearly ideal, so long as we can protect the postmaster. Since it doesn't do much regular I/O that should be fine. Plus, on systemd systems the postmaster will get restarted if killed.<br> <p> We'd receive SIGCHLD for the killed user backend worker(s)/checkpointer/etc, which would trigger crash recovery where we kill all other backends then execute redo. That's perfect. 
Something portable would be better, of course, but something that covers 95% of users is pretty darn good.<br> <p> I was unaware of the hwpoison mechanism.<br> </div> Fri, 20 Apr 2018 11:43:10 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752377/ https://lwn.net/Articles/752377/ anton <blockquote> POSIXly correct filesystems have surprised users in unpleasant ways in the past; recall early ext4 eating people's DE config files, all because the standard had some undefined behaviour around file writes and renames. </blockquote> If a standard does not define something, it's up to the implementation to do it; i.e., it's their responsibility. Sufficiently bloody-minded implementors produce unpleasant surprises, and then point to standards or benchmarks as an excuse; but as long as the standard does not require the unpleasant behaviour (in which case it would be defined, not undefined), the implementor has the choice, and therefore the responsibility. Of course, implementors who blame the standard don't want you to recognize this, and often argue as if lack of definition in the standard required them to behave unpleasantly. It doesn't. <p>I wonder if the "what POSIX mandates" in the article really refers to a mandate by POSIX, or another case of lack of definition that an implementor sees as a welcome opportunity for an unpleasant surprise. Fri, 20 Apr 2018 10:19:58 +0000 Failed writeback to removable devices https://lwn.net/Articles/752373/ https://lwn.net/Articles/752373/ Cyberax <div class="FormattedComment"> Can unflushable dirty pages be "poisoned" instead? Just re-use the hwpoison mechanism to kill all the processes that might refer to them. Perhaps make this killing optional through some prctl() option.<br> </div> Fri, 20 Apr 2018 09:24:31 +0000 Déjà vu? 
https://lwn.net/Articles/752366/ https://lwn.net/Articles/752366/ marcH <div class="FormattedComment"> <font class="QuotedText">&gt; error reporting nightmare</font><br> <p> By the way, Java makes a decent attempt with "Futures"<br> <a href="https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Future.html">https://docs.oracle.com/javase/7/docs/api/java/util/concu...</a><br> <p> Java was also the first language to have a formal memory model. These "performance" features may explain why Java was more successful on the server side than in embedded, for which it was targeted initially.<br> <p> Ugly[*] and not fun but doing the job!<br> <p> [*] <a href="https://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html">https://steve-yegge.blogspot.com/2006/03/execution-in-kin...</a><br> </div> Fri, 20 Apr 2018 07:46:18 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752348/ https://lwn.net/Articles/752348/ andresfreund <div class="FormattedComment"> <font class="QuotedText">&gt; Andres Freund, like a number of other PostgreSQL developers, has acknowledged that DIO is the best long-term solution.</font><br> <p> Worth noting that that'll probably have to be an opt-in configuration. Using DIO one certainly has more control and can get higher performance, but it also requires that the database is more carefully configured. But a lot of people use PostgreSQL without configuring the size of its own buffer cache at all - the OS adaptively providing a second level of caching makes that OK for a lot of scenarios. Postgres can't realistically figure out how much memory it should use on a given system. 
It doesn't, and shouldn't, have the information to make such a policy decision.<br> </div> Thu, 19 Apr 2018 20:30:23 +0000 Failed writeback to removable devices https://lwn.net/Articles/752329/ https://lwn.net/Articles/752329/ nix <div class="FormattedComment"> External USB drives also have the advantage that they can be swapped out for offsite backup, without blowing your network usage cap trying to back up to some cloud service you don't control. (Also, you can be fairly sure you can get them *back* again, unlike some cloud service you don't control.)<br> <p> Honestly, I suspect most of us here are doing last-ditch USB drive backups of *something*, at least.<br> <p> </div> Thu, 19 Apr 2018 16:43:05 +0000 Failed writeback to removable devices https://lwn.net/Articles/752311/ https://lwn.net/Articles/752311/ farnz <p>Ideally, you'd be able to tune the dirty pages cap to match the throughput the target can handle in a sensible timescale - say 100 ms for removable devices. That way, your device still has a lot of data to handle compared to its throughput, but it'll block applications when they're able to generate dirty pages far faster than your device can handle - no more applications that appear finished while 60 seconds' worth of data is still waiting to be written out to your USB device. 
<p>Something like <a href="https://lwn.net/Articles/682582/">less-annoying background writeback</a> definitely takes you in the right direction… Thu, 19 Apr 2018 15:07:04 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752307/ https://lwn.net/Articles/752307/ ringerc <div class="FormattedComment"> Before any users panic, note that you will only run into problems if your storage system fails in an abnormal way, OR you're running a few potentially unsafe configurations that may raise errors on writeback during normal operation.<br> <p> I suggest taking extra care and doing extra testing if you use:<br> <p> * Any sort of network block device<br> * Thin-provisioned storage<br> * multipath I/O (especially if you haven't set queue_if_no_path etc)<br> <p> Also, take care not to run out of space in your file system, or test disk-exhaustion behaviour in advance, if you use NFS. Or, preferably, don't do that.<br> <p> But while this is not cool, it's NOT going to be randomly corrupting PostgreSQL installations all over the place. It's also likely that PostgreSQL is far from the only thing affected.<br> </div> Thu, 19 Apr 2018 14:32:52 +0000 Déjà vu? https://lwn.net/Articles/752306/ https://lwn.net/Articles/752306/ ringerc <div class="FormattedComment"> It could, to a degree. When testing this, I used dmsetup and the 'error' target to introduce errors. The dmsetup 'flakey' target is also spectacularly useful.<br> </div> Thu, 19 Apr 2018 14:27:17 +0000 PostgreSQL's fsync() surprise https://lwn.net/Articles/752294/ https://lwn.net/Articles/752294/ oseemann <div class="FormattedComment"> In this context, a detailed and worthwhile writeup on file system error handling with links to the corresponding research papers can be found here:<br> <p> <a href="https://danluu.com/filesystem-errors/">https://danluu.com/filesystem-errors/</a><br> </div> Thu, 19 Apr 2018 13:26:22 +0000