User-space software suspend

Posted Sep 29, 2005 9:13 UTC (Thu) by hawk (subscriber, #3195)
Parent article: User-space software suspend

One of the fairly big benefits of swsusp2 is that it doesn't throw away all the memory that can be thrown away. Throwing it away may be ideal from some point of view (it probably simplifies things), but it is definitely not ideal for the user!

After a suspend/resume cycle with swsusp2 (which is actually slightly quicker than a swsusp1 cycle!) the machine is in the same state as it was before suspending: it still has the running programs in memory, stuff cached, etc.

Swsusp1 may work "just as well" (for me at least), but it puts the system back in a very sorry state, where the system is on the verge of being unusable for some time after resuming.



User-space software suspend

Posted Sep 29, 2005 11:45 UTC (Thu) by rise (subscriber, #5045) [Link]

Good points, though I'd like to note that in my experience a suspend2 suspend & resume cycle is much faster than a swsusp1 cycle even when keeping cache and buffers. Suspend2 also has the option to throw away both, which dramatically speeds up the cycle at the cost of a system that's initially a bit sluggish after resume as it faults everything back in - though no worse than swsusp1.

User-space software suspend

Posted Sep 30, 2005 3:03 UTC (Fri) by zblaxell (subscriber, #26385) [Link]

I do like the fact that swsusp2 resumes with caches and buffers intact. If I wanted to wait while the system painfully restored this data one 4K page fault from swap at a time, I might as well reboot--it could actually be faster.

On the other hand, I generally like to run a small application before suspending, which allocates memory until a few hundred pages are swapped (it is a loop of malloc() and reading paging statistics out of /proc), then exits. This dumps out some of the more useless 400MB or so of caches on my system, and cuts resume time in half (it does add a second or two to suspend), without the extreme pain of having to swap _everything_ back in on resume.

I'm not sure what benefit there is in pushing too much of the suspend and resume functions into user space. After a while we start to need a whole lot of system calls to tell the kernel which of its "user space" processes are in fact absolutely critical to the continued functioning of the kernel, at which point IMHO it would be much simpler, safer, smaller, and swifter to just push the whole thing back into kernel-space. If you combine user-space suspend and resume with user-space block devices, user-space network devices, user-space encryption (on either), user-space device configuration, network storage devices, and device drivers that live partly or entirely in user-space, there's a whole lot of stuff that is just bouncing back and forth between user-space and kernel-space with no really sane reason to do so other than "we don't have to do all of it in the kernel."

In one special case of user space--monolithic user-space applications--there is a similar question of what to include in the main application's space and what to farm out to other processes. Sometimes the monolithic application is even called a "kernel." One solution in common between the Linux kernel and other large applications is to dynamically load code into the application's address space (.ko's or .so's). Another solution is to initiate another process with a separate address space, then communicate with the main application over some kind of IPC (netlink, /proc, /sys, dbus, hotplug, mmap...or sockets, pipes, shared memory, mmap).

There is a third option which is used by big applications but not the Linux kernel: embedded interpreted languages. Modern applications, once they cross a certain size threshold, tend to suddenly sprout a language interpreter to cope with their more advanced configuration options (where "configuration" sometimes amounts to "when I press this button, execute 1500 lines of custom workflow code"). Things like netfilter get close to this--iptables is almost Turing-complete, the chains are analogous to functions, some of the experimental netfilter modules implement dictionary lookups analogous to variables, and the non-experimental modules can do basic boolean logic on packets combining the results from multiple rules, as long as you don't need more than 8 levels of nesting or 32 bits of storage per packet. Netfilter in particular could benefit a lot from having a compiler in user-space generate an optimized (not every netfilter chain entry *needs* to look at the source and destination network/netmask, but they do nonetheless) bytecode (or even machine code) filter configuration, then pushing that code into a much simpler kernel-space implementation. I'm surprised the Linux kernel doesn't have at least one interpreted configuration language, not even as a module--other Unixish kernels and their bootloaders do.

Most of the time, the only advantage I ever see from having things like root filesystem configuration, device mapping, encryption, firmware loading, etc. configured from or provided by user-space is that it is then possible to do non-trivial configurations or experimental implementations. For example, the md-RAID setup allows a number of straightforward RAID configurations to be set up automatically by the kernel, while the device-mapper and other LVM flavors are configured from user-space and can (in theory) be a lot more flexible. Another example is encrypted filesystem setup, where you almost certainly want to have a custom user-space script to retrieve the decryption keys from whatever they're stored on, match them up with the right partitions, and of course collect the passphrase from the console. All this stuff can easily be handled by even a minimal scripting language with the right set of primitives--most of which would just be wrappers around existing kernel code, e.g. open() or sha1().

I currently do this kind of userspace configuration on an initrd with busybox (almost but not quite as painful as custom C code), custom binaries (which are comparatively hard to fix when they break, unless you have the presence of mind to keep a working development environment on a bootable CD with you at all times), or even shell scripts (which work, but take up megabytes of space for the 99% of the code you're not using). IMHO they all suck. The amount of stuff that I have to put into the initrd keeps getting bigger while the amount of stuff in the kernel keeps getting...well, bigger, and yet the amount of stuff that the kernel can do without help from user-space seems to be decreasing with each new major kernel subsystem. Also, I have to go through some weird flaky gymnastics to reconfigure user space (pivot_root and real-root-dev come to mind here) without leaving dangling references to multi-megabytes of initrd crap taking up RAM and swap. I'd rather just put 20K of some simple script language runtime into the kernel, have the kernel read and execute a 4K boot script, and be done with it. It can't take more than that much code to prompt for a password, run it through the appropriate salt and hash functions, set up two loop device AES keys, then exec "/sbin/init".

User-space software suspend

Posted Oct 6, 2005 17:36 UTC (Thu) by peschmae (guest, #32292) [Link]

> I do like the fact that swsusp2 resumes with caches and buffers intact.
> If I wanted to wait while the system painfully restored this data one 4K
> page fault from swap at a time, I might as well reboot--it could actually
> be faster.

Me too. But on my machine (laptop - harddisk is slow) rebooting would still be slower ;-)

> On the other hand, I generally like to run a small application before
> suspending, which allocates memory until a few hundred pages are swapped
> (it is a loop of malloc() and reading paging statistics out of /proc),
> then exits. This dumps out some of the more useless 400MB or so of caches
> on my system, and cuts resume time in half (it does add a second or two
> to suspend), without the extreme pain of having to swap _everything_ back
> in on resume.

Isn't that exactly what the "# ImageSizeLimit 200" item in hibernate.conf (or /proc/software_suspend/image_size_limit, respectively) is there for?

Does your way of doing more or less the same thing have an advantage over that? (Faster maybe?)
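
For readers who have not used that knob: a minimal sketch of setting the limit by hand, assuming a Suspend2-patched kernel that exposes the /proc file named above and assuming the value is in megabytes. The helper name set_image_size_limit is made up for illustration; it is not part of the hibernate script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Path quoted in this thread; it exists only on Suspend2-patched kernels.
my $LIMIT_FILE = '/proc/software_suspend/image_size_limit';

# Hypothetical helper: write a numeric limit to the control file.
# The optional second argument overrides the path (handy for dry runs).
sub set_image_size_limit {
    my ($limit, $path) = @_;
    $path = $LIMIT_FILE unless defined $path;
    die "limit must be a non-negative integer\n" unless $limit =~ /^\d+$/;
    open(my $fh, '>', $path) or die "open: $path: $!";
    print {$fh} "$limit\n";
    close($fh) or die "close: $path: $!";
    return $limit;
}

# Run as root, e.g.: perl set_limit.pl 200
set_image_size_limit($ARGV[0]) if @ARGV;
```

Passing a scratch file as the second argument lets you check what would be written without touching /proc.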

> I'm not sure what benefit there is in pushing too much of the suspend and
> resume functions into user space.

I agree here, because it still seems to need a lot of code in the kernel - only a minimal part is a user-space application.
And I don't really like it if the kernel depends on user-space apps to boot - it only makes for trouble (the tool has to be on the initrd (I don't like initrds anyway - at least not for my custom-built kernels))

Peschmä

User-space software suspend

Posted Oct 6, 2005 19:11 UTC (Thu) by zblaxell (subscriber, #26385) [Link]

Normally suspend2 writes all non-free pages (including clean cache pages and cached swap pages). This is a bit annoying for me, since 90% of the time I use less than 40% of my laptop's memory, but I have to wait for the other 60% of the RAM to be read and written at suspend and resume time.

ImageSizeLimit is an upper bound on the image size. If the image would be larger than this, then there is a pre-suspend forcing of pages--dirty or not--to disk. If the value is not dynamically chosen, it is inefficient--too high, and unnecessary pages are written in the suspend image; too low, and suspend and resume time is significantly increased since a bunch of stuff has to be swapped out before suspend and back in after resume, and page for page the swapper is much slower than Suspend2's image writer. Dynamically choosing the value is apparently non-trivial...at least I tried to do it for a while, then gave up.
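
To give an idea of why a dynamically chosen value is awkward, here is a deliberately naive sketch (not the approach I tried, and not anything Suspend2 does): it guesses a limit in megabytes as the memory that is neither free nor counted as buffers or page cache in /proc/meminfo. The function names are invented for illustration, and the guess is wrong in both directions, since "Cached" includes dirty and shared-memory pages that cannot simply be dropped.

```perl
use strict;
use warnings;

# Parse /proc/meminfo text into a hash of field name => value in kB.
sub parse_meminfo {
    my ($text) = @_;
    my %kb;
    while ($text =~ /^(\w+):\s+(\d+)/mg) {
        $kb{$1} = $2;
    }
    return %kb;
}

# Naive guess (an assumption, not Suspend2's logic): memory that is
# neither free nor buffer/page cache, rounded up to whole megabytes.
sub guess_image_limit_mb {
    my (%kb) = @_;
    my $in_use = $kb{MemTotal} - $kb{MemFree} - $kb{Buffers} - $kb{Cached};
    $in_use = 0 if $in_use < 0;
    return int(($in_use + 1023) / 1024);
}

# Usage idea: slurp /proc/meminfo into $text, then
#   my $mb = guess_image_limit_mb(parse_meminfo($text));
```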

My application forces all the clean pages (600MB as I write this) to go away, without losing active program text pages or forcing dirty pages to swap. It stops as soon as there are more than 100 pages written to swap since the program started running, so it does not significantly extend the suspend time (a few hundred pages are swapped before the application notices and exits, which does take a second or so).

This approach doesn't need prior configuration--it automatically discovers just how much RAM can be cheaply freed by allocating as much as the system can spare without swapping, then it exits and leaves thousands of free pages.

Without all the extra pages, the suspend image is much smaller, so suspend and resume are faster. Since only a few dirty or active pages were actually swapped, it doesn't noticeably slow down the machine after resume (there is more overhead when xscreensaver wakes up after noticing the wall clock time jumping well past the inactivity threshold, than there is from post-resume swapping ;-).
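
The stopping rule described above (quit once more than 100 pages have gone to swap) could be sketched like this. It is an illustration only: the Perl script posted further down this thread is simpler and stops at the first drop in SwapFree. The 4 kB page size and the function names are assumptions, and a drop in SwapFree is only an approximation of "pages written to swap".

```perl
use strict;
use warnings;

# Assumed page size in kB (4K pages, as on typical x86 systems).
my $PAGE_KB = 4;

# Extract SwapFree (in kB) from /proc/meminfo text; undef if missing.
sub swapfree_kb {
    my ($meminfo) = @_;
    return $meminfo =~ /^SwapFree:\s+(\d+)\s+kB/m ? $1 : undef;
}

# Pages that went to swap since the starting reading (never negative).
sub pages_swapped {
    my ($start_kb, $now_kb) = @_;
    my $delta = $start_kb - $now_kb;
    return $delta > 0 ? int($delta / $PAGE_KB) : 0;
}

# True once more than $threshold pages (default 100) have been swapped.
sub should_stop {
    my ($start_kb, $now_kb, $threshold) = @_;
    $threshold = 100 unless defined $threshold;
    return pages_swapped($start_kb, $now_kb) > $threshold;
}
```

An allocation loop would record swapfree_kb() once at startup, then call should_stop() with a fresh reading after each allocation.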

User-space software suspend

Posted Oct 30, 2005 1:51 UTC (Sun) by NinjaSeg (subscriber, #33460) [Link]

Errr, care to share it with us?

User-space software suspend

Posted Nov 4, 2005 0:43 UTC (Fri) by zblaxell (subscriber, #26385) [Link]

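#!/usr/bin/perl
# Allocate memory until the kernel starts swapping, then exit, leaving
# the pages that held clean cache free.
use strict;
use warnings;
use Time::HiRes qw(time);

# Return the SwapFree value (in kB) from /proc/meminfo.
sub swapfree {
    open(my $proc, '<', '/proc/meminfo') or die "open: /proc/meminfo: $!";
    my ($swapfree) = grep(/^SwapFree:/, <$proc>);
    close($proc);
    defined $swapfree or die "no SwapFree line in /proc/meminfo";
    $swapfree =~ s/\D+//g;
    print STDERR "swapfree=$swapfree\n";
    return $swapfree;
}

my $last_swapfree = swapfree();
my @blobs;    # holds every allocation so nothing is freed before exit

my $count = 0;
my $total = 0;

my $start_time = time;

# Keep allocating as long as free swap has not dropped since the last check.
while ($last_swapfree <= (my $new_swapfree = swapfree())) {
    ++$count;
    # Each pass allocates one more megabyte than the previous pass.
    push(@blobs, ('.' x (1048576 * $count)));
    $total += $count;
    print STDERR "${total}M allocated\n";
    $last_swapfree = $new_swapfree;
}
system("ps m $$");
print STDERR time - $start_time, " seconds\n";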

User-space software suspend

Posted Sep 30, 2005 16:41 UTC (Fri) by richardfish (subscriber, #20657) [Link]

Could not agree more! The biggest reason I prefer suspend2 is that it preserves cached memory.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds