We are actually measuring systemd very frequently and comparing it with what was there before. But I am a bit reluctant to publish those results, because I don't really believe they have much value: they are difficult to reproduce and have little relevance to what people will actually experience.
The results range from one extreme (on an X300 with an SSD we can reliably boot the full stack in < 7s) to the other (virtually no change if you have additional sysv services installed and rotating media). I prefer reading the bootchart measurements as a simple indicator of where we need to fix things, not so much as something where we can tell people "hey, install systemd and your system will be as fast as this", because, well, it won't.
systemd is just a tool to make things faster. It's not a magic potion you apply and which then magically makes everything go faster. There's no doubt that systemd is the right approach, but before we are happy with the end result we need to fix quite a number of other things all over the place. For example, one thing we learned is that maximizing parallelization the way we do has little benefit on the disk elevator on rotating media. Naive people like me assumed that providing the current Linux IO scheduler with a larger number of simultaneous requests to choose from would improve its performance. Turns out it currently doesn't really. There are things one can tweak to improve the situation, but I guess we can safely say that the current Linux disk scheduler for rotating media is not optimized for the kind of workload systemd now pushes onto it during boot. (And yes, I hope to shed more light on this during LPC.)
If you take a stock F14 system on rotating media, you'll probably measure little difference from F13, simply because only a small subset of the services have been converted to become parallelizable and get rid of the shell (i.e. ship proper native systemd files). However, if you run my development system things look much different, because I did the full conversion and the shell usage is very much reduced.
So, I guess what I want to say: consider this all work in progress. I won't stop people from measuring the boot times, but I am not planning to publish a lot of data in this area any time soon. Because I could provide you with both: graphs that show a super-duper speed-up and graphs that show virtually no speed-up at all. And both could rightfully be called systemd performance measurements.
Anyhow, I'd prefer if people would not reduce systemd to the speed issue. It's a lot more. It's an attempt to do things the right way, to simplify things, and to make the boot a lot more powerful, for users, developers and administrators alike.
Posted Aug 23, 2010 14:41 UTC (Mon) by mmcgrath (guest, #44906)
> Anyhow, I'd prefer if people would not reduce systemd to the speed issue
If that's not a feature... I'd prefer developers not promote it as such.
A systemd status update
Posted Aug 23, 2010 14:49 UTC (Mon) by mezcalero (subscriber, #45103)
Well, I am not saying it wasn't a feature. I am just asking folks to not reduce it to this.
A systemd status update
Posted Aug 23, 2010 15:24 UTC (Mon) by obi (guest, #5784)
With hindsight, are there any things in the current systemd design that stand in the way of boot-up speed, or are we really going as fast as possible in your opinion (assuming a generic boot system, not a heavily customized one for embedded or eeepc or the like)?
I/O scheduler performance
Posted Aug 23, 2010 15:30 UTC (Mon) by epa (subscriber, #39769)
> one thing we learned is that maximizing parallelization the way we do has little benefit on the disk elevator on rotating media. Naive people like me assumed that providing the current Linux IO scheduler with a larger amount of requests at the same time it can choose from would improve its performance. Turns out it currently doesn't really.
This is backed up by folk wisdom about how to get fast I/O on Unix-like systems. Do we run multiple cp commands in parallel to give the I/O scheduler more choice? When browsing a directory of image files, does the thumbnail viewer try to open them all simultaneously? We all know that this will tend to make things slower, not faster.
So has research in disk elevator algorithms reached the point where it's possible to do better - to throw large numbers of requests at the system and have it respond to them faster than it would if given one at a time? Or are we stuck (on rotating media at least) with the practical reality that most of the time, requests to the same disk are best made one at a time?
I/O scheduler performance
Posted Aug 23, 2010 16:31 UTC (Mon) by dlang (✭ supporter ✭, #313)
the problem is that the system has no way of knowing when you submit all this I/O if you mean
do all of these, and minimize the overall time
or
I need these to all make progress at the same time, even if it means taking longer overall.
current algorithms tend to assume the second: they try to split the available I/O bandwidth between all the requests. since this ends up resulting in lots of seeks, it hurts on traditional media with massively parallel requests
a small amount of parallelism helps by giving the drive something to do when it would otherwise be idle, however once you pass the saturation point it hurts because it adds additional seeks as the system jumps from one set of requests to the next.
this is the same sort of thing that makes hyperthreading anywhere from a noticeable benefit to a mild loss depending on the workload
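The seek argument above can be illustrated with a toy model. This is only a sketch under loud assumptions: the disk is reduced to a single head position, cost is the sum of distances between consecutive block numbers, and the layout (`make_streams`, four contiguous runs spaced a million blocks apart) is hypothetical. Real schedulers hand out time slices rather than one block per turn, so the numbers only show the direction of the effect.

```python
def total_seek_distance(requests):
    """Sum of absolute head movements when serving block numbers in order."""
    distance = 0
    position = requests[0]
    for block in requests[1:]:
        distance += abs(block - position)
        position = block
    return distance


def make_streams(n_streams, blocks_per_stream, gap):
    """Hypothetical layout: each stream is one contiguous run, runs are `gap` blocks apart."""
    return [
        [s * gap + i for i in range(blocks_per_stream)]
        for s in range(n_streams)
    ]


def one_after_another(streams):
    """Finish each stream completely before starting the next one."""
    return [block for stream in streams for block in stream]


def round_robin(streams):
    """Give every stream one block per turn, so all make progress together."""
    return [block for turn in zip(*streams) for block in turn]


streams = make_streams(n_streams=4, blocks_per_stream=100, gap=1_000_000)
sequential = total_seek_distance(one_after_another(streams))
interleaved = total_seek_distance(round_robin(streams))
print("one after another:", sequential)
print("round robin:      ", interleaved)
```

In this model the round-robin order travels far more than the one-after-another order, because the head crosses the gaps between runs on every turn instead of once per stream.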
I/O scheduler performance
Posted Aug 23, 2010 20:04 UTC (Mon) by axboe (subscriber, #904)
A scheduler like CFQ will attempt to provide a mix of what you describe, depending on how you submit it. If the submission is done from one process, it will assume that you want it to be done as fast as possible. It'll be sorted accordingly. If done from multiple processes or threads, it will attempt to provide equal progress while preserving overall throughput.
What you describe is true of classical work-conserving IO schedulers; it's not the case for the default Linux IO scheduler.
I/O scheduler performance
Posted Sep 8, 2010 11:15 UTC (Wed) by epa (subscriber, #39769)
So, then, the way to get fast I/O is to make asynchronous I/O calls from a single thread (so that the scheduler knows that fairness doesn't matter) rather than spawning multiple threads or processes.
Is there any way to fork subprocesses but still let CFQ know that they're all related and happy to altruistically share I/O bandwidth between them, so it doesn't try to slice up I/O requests fairly at the expense of total throughput?
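One way to picture "submit everything from a single process" is to queue readahead hints for all files first and only then read them. The sketch below uses `os.posix_fadvise` with `POSIX_FADV_WILLNEED`, which exists in Python on Linux; the helper name `prefetch_then_read` is made up for illustration. Whether CFQ actually treats this better than several processes is not established in this thread, and the kernel is free to ignore the hint.

```python
import os


def prefetch_then_read(paths):
    """Queue readahead for every file from one process, then read them in turn.

    Returns the total number of bytes read.
    """
    fds = [os.open(path, os.O_RDONLY) for path in paths]
    try:
        # Advisory only: offset 0 with length 0 means "the whole file".
        for fd in fds:
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        total = 0
        for fd in fds:
            while True:
                chunk = os.read(fd, 1 << 16)
                if not chunk:
                    break
                total += len(chunk)
        return total
    finally:
        for fd in fds:
            os.close(fd)
```

All requests come from one submitter here, so there are no competing processes for the scheduler to balance; measuring on the target disk is the only way to tell whether it helps.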
I/O scheduler performance: not good enough!
Posted Aug 23, 2010 23:15 UTC (Mon) by renox (subscriber, #23785)
I remember being very impressed by a paper where userspace applications can schedule their I/O by knowing (an approximation of) the block number for the files they use:
>>As an example, a tar of the Linux kernel tree was 82.5 seconds using GNU tar, while our modified tar completed in 17.9 seconds.<<
I wonder if there will be eventually a system call to know the block number of a file?
I/O scheduler performance: not good enough!
Posted Aug 24, 2010 0:12 UTC (Tue) by wmf (guest, #33791)
The FIEMAP ioctl does this, but in general apps probably shouldn't try to implement such optimizations; Linux isn't an exokernel.
I/O scheduler performance: not good enough!
Posted Aug 24, 2010 21:53 UTC (Tue) by mhelsley (subscriber, #11324)
I don't think FIEMAP provides block numbers. Block numbers would be unique on the partition/device whereas I think FIEMAP provides "extents" (and FIBMAP provides bits) which effectively describe offsets within the file -- not block numbers.
Posted Aug 25, 2010 11:28 UTC (Wed) by etienne (subscriber, #25256)
Both FIEMAP and FIBMAP report block numbers, i.e. the position of the file on the device.
You have to work out yourself the position of the device on the hardware device: simple if they are the same or there is a simple partition table, difficult if there is LVM or MD (RAID) in between.
FIEMAP is a lot quicker than FIBMAP, noticeable when getting the mapping of ISO file images.
As an example:
$ wget http://www.mirrorservice.org/sites/download.sourceforge.n...
$ gcc -O2 showmap.c -o showmap
$ ./showmap ./showmap.c
File "./showmap.c" of size 15013 (32 blocks512) is on filesystem 0x302.
According to /proc/diskstats, file './showmap.c' is on device '/dev/hda2'
/dev/hda2: Permission denied
$ su
Password:
# ./showmap ./showmap.c
File "./showmap.c" of size 15013 (32 blocks512) is on filesystem 0x302.
According to /proc/diskstats, file './showmap.c' is on device '/dev/hda2'
Device block size: 512, FS block size: 4096, device size: 61432560 blocks
Device length: 31453470720 bytes
The device start at 61432560 sectors, C/H/S: 65535/16/63.
First FIEMAP says 1 extents
second FIEMAP success, 1 extents filled:
0: logical offset 0 (0 * 4096), physical offset 25904218112 (6324272 * 4096),
length 16384 (4 * 4096) flags 0x1
flags meaning: FIEMAP_EXTENT_LAST 0x1, FIEMAP_EXTENT_UNKNOWN 0x2,
FIEMAP_EXTENT_DELALLOC 0x4, FIEMAP_EXTENT_ENCODED 0x8,
FIEMAP_EXTENT_DATA_ENCRYPTED 0x0, FIEMAP_EXTENT_NOT_ALIGNED 0x100,
FIEMAP_EXTENT_DATA_INLINE 0x200, FIEMAP_EXTENT_DATA_TAIL 0x400,
FIEMAP_EXTENT_UNWRITTEN 0x800, FIEMAP_EXTENT_MERGED 0x1000.
FIGETBSZ: block size 4096 bytes
File (4 blocks of 4096) start at block 6324272 for 4 blocks,
last block 6324275 and file has 1 fragments.
FIBMAP succeeded after end of file, block index 4 give block 0
#
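For readers who want to see what the FIEMAP call returns without fetching showmap.c, here is a minimal Python sketch. It is not the showmap implementation. The struct layouts and the ioctl number follow `struct fiemap` and `struct fiemap_extent` in `<linux/fiemap.h>` and `<linux/fs.h>`; the helper names `parse_fiemap` and `file_extents` are hypothetical, and the filesystem must support FIEMAP for the ioctl to succeed.

```python
import fcntl
import struct

# _IOWR('f', 11, struct fiemap), as defined in <linux/fs.h>.
FS_IOC_FIEMAP = 0xC020660B
# fm_start, fm_length, fm_flags, fm_mapped_extents, fm_extent_count, fm_reserved
FIEMAP_HEADER = struct.Struct("=QQIIII")
# fe_logical, fe_physical, fe_length, 2 reserved u64, fe_flags, 3 reserved u32
FIEMAP_EXTENT = struct.Struct("=QQQQQIIII")


def parse_fiemap(buf):
    """Decode a filled fiemap buffer into (logical, physical, length, flags) tuples."""
    mapped_extents = FIEMAP_HEADER.unpack_from(buf, 0)[3]
    extents = []
    for i in range(mapped_extents):
        offset = FIEMAP_HEADER.size + i * FIEMAP_EXTENT.size
        fields = FIEMAP_EXTENT.unpack_from(buf, offset)
        extents.append((fields[0], fields[1], fields[2], fields[5]))
    return extents


def file_extents(path, max_extents=32):
    """Ask the kernel for up to max_extents extents of the whole file."""
    buf = bytearray(FIEMAP_HEADER.size + max_extents * FIEMAP_EXTENT.size)
    # Map from offset 0 to the maximum offset, no flags, room for max_extents.
    FIEMAP_HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, 0, 0, max_extents, 0)
    with open(path, "rb") as f:
        fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf)
    return parse_fiemap(buf)
```

Dividing the physical byte offset by the filesystem block size gives the block number: in the run above, 25904218112 / 4096 = 6324272, which matches the "start at block" line.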
I/O scheduler performance: not good enough!
Posted Aug 26, 2010 12:48 UTC (Thu) by renox (subscriber, #23785)
Interesting, thanks.
It didn't work on my computer though:
>>
./showmap ./showmap
Boot record of size 512 bytes read successfully from './showmap'
First bytes 0x7F 0x45 0x4C, signature 0x804, FAT16 sig 0x0, FAT32 sig 0x4,
No FAT signature recognised, cannot analyse header.
Partition table: WindowsNTmarker 0x3C, Unknown 0x0, signature 0x804
0: indicator 0x0 i.e. non bootable, start 5570560 length 0
(i.e. end at 5570560); start 513/0/43 end 0/0/18 name 'empty'
1: indicator 0x0 i.e. non bootable, start 3014656 length 0
(i.e. end at 3014656); start 768/0/29 end 0/0/18 name 'empty'
2: indicator 0x0 i.e. non bootable, start 6750208 length 0
(i.e. end at 6750208); start 0/0/54 end 0/0/18 name 'empty'
3: indicator 0x0 i.e. non bootable, start 4587520 length 2438987776
(i.e. end at 2443575296); start 256/0/60 end 0/0/18 name 'empty'
<<
Probably not a recent enough version.
I/O scheduler performance: not good enough!
Posted Aug 26, 2010 13:33 UTC (Thu) by etienne (subscriber, #25256)