Trading off safety and performance in the kernel

Posted May 13, 2015 1:32 UTC (Wed) by zblaxell (subscriber, #26385)
In reply to: Trading off safety and performance in the kernel by dlang
Parent article: Trading off safety and performance in the kernel

> that still doesn't address the problem of the battery running out while it's in the backpack.

...because it's *not* a problem.

Really, it's not. It's been years since I had a healthy laptop run out of battery. They last for hours at full load and days on suspend.

> Far better to generate some extra heat for a little bit than losing hours of data because it didn't get flushed out.

No, it's not better.

If the sync takes longer than 20 seconds, the suspend fails completely and the laptop stays on (unless you've set up your ACPI scripts to forcibly kill the power at that point).

While the laptop is on, it's damaging its battery, reducing the charge it can hold *forever*. This also conveniently breaks the battery charge estimation function, so you get to be surprised when your battery abruptly shuts down at "40% charge" in the future.

There's no "hours of uncommitted data" either. There's one filesystem commit interval at most. If you're sane that's not more than 30 seconds or so. If you're not sane, you can configure laptop-mode-tools to run sync() from userspace.
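Which interval applies depends on the setup. As a rough sketch, assuming the limit in question is the kernel's `vm.dirty_expire_centisecs` sysctl (ext4 additionally has its own `commit=` mount option), the age at which dirty pages become eligible for writeback can be read like this. The function names are made up for illustration, and the optional file argument exists only so the function can be tried against a copy of the value:

```shell
#!/bin/sh
# Convert a centisecond value, as stored in the vm.dirty_* sysctls,
# to whole seconds.
centisecs_to_secs() {
    echo $(( $1 / 100 ))
}

# Print the age (in seconds) at which dirty pages become eligible for
# writeback. $1 is an optional file to read instead of the live sysctl.
dirty_expire_secs() {
    file=${1:-/proc/sys/vm/dirty_expire_centisecs}
    centisecs_to_secs "$(cat "$file")"
}
```

Calling `dirty_expire_secs` with no argument reads the live value; a stored value of 3000 corresponds to 30 seconds.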



Trading off safety and performance in the kernel

Posted May 13, 2015 2:01 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

> Really, it's not. It's been years since I had a healthy laptop run out of battery. They last for hours at full load and days on suspend.
Oh, what BS.

I've had resume problems with ALL laptops that I had. Including MacBook Pro with OS X. Linux and Windows laptops tend to be even crashier.

Trading off safety and performance in the kernel

Posted May 13, 2015 3:02 UTC (Wed) by pizza (subscriber, #46) [Link] (3 responses)

> Really, it's not. It's been years since I had a healthy laptop run out of battery. They last for hours at full load and days on suspend.

Until you leave it in your backpack for *four* days instead of three, thanks to a long weekend.

Or the battery gets jostled. Or the battery isn't so healthy any more (do you get a nice email when it crosses that threshold?) Or you suspend it when the battery is relatively low. Or it doesn't resume properly because something plugged in was unplugged. Or one of many, many, many failure modes.

Or when the "low battery" threshold wakes the system up and the thing dies hard when trying to write the buffers back out.

Every single one of these situations has happened to me. (Funnily enough, in my experience, Linux suspend/resume is actually *more* reliable than Windows on the last couple of Tier-1 laptops I've owned.)

I absolutely *want* any dirty buffers to be flushed to disk and the filesystems synced into a safe state before a system suspends. More than want; that is a hard requirement. Data loss is never acceptable, but when it's so damn easily preventable there is simply no excuse.

Trading off safety and performance in the kernel

Posted May 13, 2015 6:14 UTC (Wed) by tpo (subscriber, #25713) [Link] (2 responses)

My experience is similar to pizza's.

The single most frequent cause of "loss of data" here is when I work with the laptop not attached to power for some reason, notice that the battery is nearly empty, and close the lid.

At some point my laptop will wake up by itself and try to suspend to disk, which doesn't work here and never has; it then runs out of power while hanging in that state.

I had managed to disable this behavior somehow in the past but some well meaning part of the system switched that on again and I am currently unwilling to spend my time to find out how to disable it again.

And the loss for me isn't usually the "data" but the state and context of the desktop: what shells did I have open with what content? Which files was I editing? Which applications were running? This can be more than annoying when I had my laptop prepared with everything I needed open and in the right place to do a presentation, only to find out when opening the laptop in front of people that it's dead.

Of course, syncing file buffers out to disk or not will not change anything with respect to the problem described in the last paragraph.

And it doesn't help that XFce's power indicator is too badly designed for me to be able to notice that the battery is running too low.

Trading off safety and performance in the kernel

Posted May 13, 2015 13:40 UTC (Wed) by jospoortvliet (guest, #33164) [Link]

At least syncing buffers to disk means you don't have to lose any data you wrote or worked on... ideally. For me, that's a huge deal and I certainly wouldn't like to lose stuff.

Trading off safety and performance in the kernel

Posted May 13, 2015 16:03 UTC (Wed) by zblaxell (subscriber, #26385) [Link]

> I had managed to disable this behavior somehow in the past but some well meaning part of the system switched that on again and I am currently unwilling to spend my time to find out how to disable it again.

I fired my distro's ACPI event-handling code. Several years ago it started being not merely useless, but an active source of failure. After several rounds of patches that consisted only of deletions, I gave up and replaced the entire thing with:

#!/bin/sh
# Suspend to RAM directly via the kernel's sysfs interface (must run as root).
echo mem > /sys/power/state

I have "find out where the acpi-support package is hiding today and kill it" on my to-do list for every dist-upgrade because the machine can be physically damaged if I don't.

Trading off safety and performance in the kernel

Posted May 13, 2015 19:56 UTC (Wed) by kleptog (subscriber, #1183) [Link]

> If the sync takes longer than 20 seconds, the suspend fails completely and the laptop stays on (unless you've set up your ACPI scripts to forcibly kill the power at that point).

Aah, so that's what happens. So what I need is a script that does this: if suspend fails and the laptop lid is closed, start playing an alarm at maximum volume, and cut power if there is no response within a few minutes.

Much better than hearing a loud whirring noise a few hours later and then pulling an overcooked laptop out of your bag.
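The decision part of such a script could look roughly like the sketch below. It is only an outline under a few assumptions: the lid state file path is a guess (the button name, LID0 or LID or something else, varies between machines), the function names are invented here, and the alarm and power-off steps are left to the caller.

```shell
#!/bin/sh
# Return success if the given ACPI lid state file reports a closed lid.
# The default path is an assumption; the button name varies by machine.
lid_is_closed() {
    grep -q closed "${1:-/proc/acpi/button/lid/LID0/state}" 2>/dev/null
}

# Decide what to do after a suspend attempt.
#   $1: exit status of the suspend attempt (0 = it worked)
#   $2: lid state file (optional, see lid_is_closed)
# Prints "ok", "ignore" (failed, lid open) or "alarm" (failed, lid closed).
after_suspend() {
    if [ "$1" -eq 0 ]; then
        echo ok
    elif lid_is_closed "$2"; then
        echo alarm
    else
        echo ignore
    fi
}
```

A caller would attempt the suspend, pass the exit status to `after_suspend`, and on "alarm" start making noise and run `poweroff` if nobody opens the lid within a few minutes.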

> There's no "hours of uncommitted data" either. There's one filesystem commit interval at most.

Well, not everything is saved on disk. If you have a document open that isn't saved then sync won't help anyway. It'd be great if there was a way to announce to running programs that the system is being suspended and to dump state, but that doesn't exist or isn't widely supported. Currently suspend (for me) is primarily a way to avoid the startup time. It's not reliable enough to rely on.

Mind you, I just found a basic-pm-debugging.txt in the kernel documentation which describes steps that can be used to debug issues. My current problem is that ext4 is trying to read a directory inode on resume while the disk is not ready, and it remounts the rootfs readonly. The machine is then essentially unrecoverable (neither su nor sudo work with a readonly fs).

Trading off safety and performance in the kernel

Posted Jun 12, 2015 17:32 UTC (Fri) by bluefoxicy (guest, #25366) [Link] (1 responses)

> If the sync takes longer than 20 seconds, the suspend fails completely and the laptop stays on

To keep a sync busy for 20 seconds at the raw link rate, you would need this much recently written dirty data:

SATA 3 (6 Gb/s): 15 gigabytes

SATA 1 (1.5 Gb/s): 3.75 gigabytes

ATA/100 (100 MB/s): 2 gigabytes

I'm pretty sure this is a non-issue for any hardware made since 2003. /proc/meminfo shows 624 kB of dirty pages on a big ass database server, 2712 kB on a busy Web server in a cluster. It's rare to have several gigabytes of unflushed data just hanging around in memory; I've never seen more than a few megabytes.
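For anyone who wants to check their own machine, a small sketch for pulling those numbers out of /proc/meminfo follows. The function name is invented for illustration, and the optional file argument is only there so it can be tried on a saved copy:

```shell
#!/bin/sh
# Print the value (in kB) of one field from a meminfo-style file.
#   $1: field name, e.g. Dirty or Writeback
#   $2: file to read (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}
```

`meminfo_kb Dirty` prints the amount of dirty page cache waiting to be written, and `meminfo_kb Writeback` the amount currently being written out.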

Trading off safety and performance in the kernel

Posted Jun 12, 2015 17:50 UTC (Fri) by raven667 (subscriber, #5198) [Link]

That's just the link speed between the storage device and the main system and has little bearing on how fast you can actually write data to disk, especially if the disk is spinning rust. If the disk has to seek between writes then you aren't going to see more than around 100 writes per second which can be just a handful of megabytes, regardless of what the link speed is.
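To put rough numbers on the difference, here is a back-of-the-envelope sketch. The function names are made up, and the write sizes in the usage note are assumptions, not measurements; only the 100 writes per second and the 20-second window come from the discussion above.

```shell
#!/bin/sh
# Megabytes that fit through a link in a time window, at the raw link rate.
#   $1: link rate in megabits per second, $2: seconds
link_bound_mb() {
    echo $(( $1 * $2 / 8 ))
}

# Mebibytes written in a time window when every write needs a seek.
#   $1: writes per second, $2: KiB per write, $3: seconds
# Integer arithmetic, so the result is rounded down.
seek_bound_mib() {
    echo $(( $1 * $2 * $3 / 1024 ))
}
```

`link_bound_mb 6000 20` gives the 15000 MB quoted for SATA 3, while `seek_bound_mib 100 4 20` (assuming scattered 4 KiB writes) gives about 7 MiB and `seek_bound_mib 100 64 20` (assuming 64 KiB writes) gives 125 MiB for the same 20 seconds.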

Your examples include a read heavy, write little web server and a db server which is probably explicitly flushing every IO to disk so that there is little data to write back in either case, neither of which is representative of how a laptop is used. It's easy to create a bunch of buffered writes, by copying a DVD image or compiling software or copying memory to disk for suspend, and on a laptop you may delay writes longer than normal to keep the disk subsystem in a low power state for as long as possible, leading to a storm of activity.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds