Trading off safety and performance in the kernel
Posted May 14, 2015 17:13 UTC (Thu) by marcH (subscriber, #57642)
In reply to: Trading off safety and performance in the kernel by neilbrown
Parent article: Trading off safety and performance in the kernel
> And of course, any real app would have auto-saved every few minutes so even in a disaster you wouldn't lose more than a few minutes work.
"Don't break userspace" - even userspace bugs.
Posted May 14, 2015 22:31 UTC (Thu)
by neilbrown (subscriber, #359)
If userspace needs the kernel to call sync before crashing then the user-space is already broken. Systems can crash without entering suspend first.
But that is a big "if". Are there actually any non-trivial apps which don't save their data properly?
Posted May 21, 2015 21:58 UTC (Thu)
by Wol (subscriber, #4433)
Okay, it's not Linux, but ... MS Word?
(Maybe it's changed, but I OFTEN lose data if I'm working on a document and it crashes; often it's the attempted auto-save that causes the crash. :-( )
Cheers,
Posted May 23, 2015 16:20 UTC (Sat)
by anton (subscriber, #25547)
Anyway, one example of a broken file system losing data of a popular application (including the autosave files that the application produces regularly) is here.
Posted May 25, 2015 6:52 UTC (Mon)
by neilbrown (subscriber, #359)
That makes no sense.
I agree that "If a filesystem needs the kernel to call sync before crashing then the filesystem is already broken" with the understanding that "needs" means "needs in order to protect the data that it is responsible for."
> one example of a broken file system losing data of a popular application
That is a filesystem from decades ago. Yes, it was broken, no question. Linux filesystems aren't like that. All non-trivial Linux filesystems do journalling of metadata, which is much safer than synchronous metadata updates. I cannot promise they are all 100% bug-free in every release, but I am certain that calling 'sync' in the suspend path isn't going to usefully fix any bug that they might have.
It also sounds like that "popular application", which was emacs, wasn't calling 'fsync' as it should and as it certainly now does.
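For readers who have not seen the pattern, here is a minimal sketch of what "calling 'fsync' as it should" could look like for an autosave, in Python on a POSIX system. The name `durable_save` and the `.autosave-` prefix are hypothetical, chosen for illustration; this is not how emacs or any other specific application does it.

```python
import os
import tempfile


def durable_save(path, data):
    """Replace the file at `path` with `data` (bytes) and push it to disk.

    Hypothetical helper for illustration only. Assumes a POSIX system,
    where a directory can be opened read-only and passed to fsync().
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Write to a temporary file in the same directory, so that the final
    # rename stays within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".autosave-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # drain Python's userspace buffer
            os.fsync(tmp.fileno())   # ask the kernel to write the file data
        # Swap the new file into place; readers see the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # fsync the containing directory so the rename itself is written out.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An application that saves this way is not relying on the kernel to call sync before a suspend or a crash: once `durable_save` returns, the data has been explicitly handed to the filesystem with a request to make it persistent.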
Posted May 25, 2015 9:00 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
Resume failed (yet again) and after reboot my BTRFS filesystem refused to mount.
Wol
> If userspace needs the kernel to call sync before crashing then the user-space is already broken.
No, the file system is broken.
> Are there actually any non-trivial apps which don't save their data properly?
No, there are just file systems (e.g., ext4) which do not provide decent guarantees and use this kind of rhetoric to justify their poor behaviour. I expect that pretty much all non-trivial applications do not jump through all the hoops, all the time, that some file system developers expect of them; that's because they have no good way to test that they meet the expectations of these file system developers, and most application developers probably have many more urgent things to care about.
Data that has not yet been written to the filesystem is certainly not the filesystem's responsibility.
Data that has been written but hasn't been the subject of 'fsync' is also not completely the filesystem's responsibility (unless you mount with '-o sync').
Yes - bugs should be fixed. But let's not scatter "sys_sync" calls around and pretend that fixes them.
As if on cue, today my laptop corrupted my filesystem during suspend/resume. I started synchronization of several large (~200G) directories with lots of small files from our local network and then totally forgot about it. Then I closed the laptop's lid and went home.