> I know that this issue has cropped up again due to the fact that in
> Ubuntu the dpkg program detects if it's running on Ext4 and goes into
> paranoid mode were it runs 'fsync' were as with Ext3 it does not. This
> causes Ubuntu installs to last significantly longer if you choose
> 'Ext4' file system.
> If the dpkg folks were smart they'd enable paranoid mode on all file
> systems, except maybe Ext3 (due to Ext3's poor ability to handle that
> sort of workloads)
dpkg has always done fsync() on the internal database, it was only
missing doing fsync() for the extracted control files from a package
to be installed/upgraded (which include maintainer scripts for example).
As of recently, dpkg started doing fsync() before rename on *all*
file systems for all extracted files from a package (there's actually
never been any kind of file system detection or special "paranoid mode").
It also does now fsync() on all database related directories.
The reason for this has been mainly the zero-length issues with ext4
(appearing even with the recent rename heuristic fixes), as we've had
no previous bug reports of broken systems due to zero-length files on
any other file system. But I consider it was still a bug for something
like dpkg to not fsync() files, just because the package status would
not match the package installed data, which is an issue, but not as
grave as having empty files left around (think boot loader, kernel or
libc as example).
But those changes produced major performance regressions *only* on
ext4 (that we know as of now), so we implemented per package delayed
fsync()s + rename()s, which helped a bit with ext4, but not enough. We
have now switched to use delayed sync() + rename()s *only* on Linux
(because it's the only place were sync() is synchronous) which brings
performance closer to the initial values. ext3 didn't have a noticable
performance degradation during the implementation iterations.
The still present zero-length issues and performance issues with fsync()
have been reported to ext4 upstream, the solutions offered were to either
not use fsync() because it's slow and it's not feasible to make it faster,
use non-portable sync() or ignore the problem as it's not a usual case...
(most of the hundreds of duped reports in Ubuntu, which happens to have
ext4 as default file system in latest releases, were due to sudden power
off, and not to system crash which were a minority).
Not to mention this will be an issue if someone happens to port ext4 to
any non-Linux kernel where sync() is asynchronous, then the only options
for developers are either massive performance degradation or possible
data loss in case of abrupt system crashes/shutdown...
> As far as my personal opinion this is a advantage for using Ext4 over
> Ext3 since upgrades will be much safer on my laptop...
Well, whatever happens in maintainer scripts for example is not synced,
so there's still room for data loss with dpkg on ext4...
I've just checked if rpm is doing any kind of sync for extracted files
before rename() and it does not seem so, I'm guessing other packaging
systems might be susceptible to this issue too, but I've not checked.
This is something they might also want to consider doing, in case those
systems start offering ext4 as installation file system, or they might
start suffering the same kind of bug reports as Ubuntu saw. :/