Book review: Linux System Programming

By Jake Edge
December 5, 2007

"System programming" is not easily defined, but it is typically considered to be programming at a lower level than regular application programming. As Robert Love points out in the introductory chapter of Linux System Programming, there is no technical difference between the two, since the same system calls are used; the distinction is more between programs that implement the infrastructure and programs that use it. Programmers faced with either task need to understand how best to use the system call interface, and Love sets out to provide that understanding in his book.

The book is organized into ten chapters: an introduction, three on I/O, two on process management, and one each on file and directory handling, memory management, signals, and time handling. Each chapter covers its subject at a level that will help programmers make good choices among the available trade-offs. The main focus of each chapter is the system calls that Linux provides for tasks in that area.

The history of each call is described, along with information about which members of the UNIX family make it available, so that the right choices can be made for portability. Also, various historical (perhaps vestigial is more accurate) calls are documented, with readers being warned away from using them. Each call itself is given a treatment similar to a man page, but with greater detail. Where the book really shines is in its comparisons of "similar" system calls.

The trade-offs between using select() and poll() or the advantages and disadvantages of using mmap() vs. traditional file I/O mechanisms are just two of the comparisons presented. For example, after listing five bulleted advantages of poll(), select() gets its due:

The select() system call does have a few things going for it, though:
  • select() is more portable, as some Unix systems do not support poll().
  • select() provides better timeout resolution: down to the microsecond. Both ppoll() and pselect() theoretically provide nanosecond resolution, but in practice, none of these calls reliably provides even microsecond resolution.
Superior to both poll() and select() is the epoll interface, a Linux-specific multiplexing I/O solution that we'll look at in Chapter 4.

This is the kind of information that only comes with experience; this book will help a programmer get to that point more quickly. Even for experienced programmers, the comparisons will help crystallize some thoughts that have been floating around. It is definitely one of the better features of the book.
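
As a rough illustration of the timeout difference in the quoted passage, here is a minimal C sketch (not taken from the book) that waits for a descriptor to become readable with each call. The helper names wait_readable_select() and wait_readable_poll() are made up for this example; the point is only that select() takes a struct timeval with a microsecond field, while poll() takes a plain int in milliseconds.

```c
#include <poll.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable using select().
 * The timeout is given in microseconds, matching struct timeval.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * Note: select() only works for fd values below FD_SETSIZE.
 */
int wait_readable_select(int fd, long timeout_us)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    tv.tv_sec = timeout_us / 1000000;
    tv.tv_usec = timeout_us % 1000000;

    /* The first argument is the highest descriptor number plus one. */
    return select(fd + 1, &readfds, NULL, NULL, &tv);
}

/*
 * Hypothetical helper: the same wait using poll().
 * The timeout is an int in milliseconds; there is no finer unit.
 * Returns 1 if an event is pending, 0 on timeout, -1 on error.
 */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    return poll(&pfd, 1, timeout_ms);
}
```

Called on the read end of an empty pipe, both helpers should return 0 once the timeout expires; how close the actual wait comes to the requested interval depends on the kernel's timers, not on the units of the interface.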

The book is not without its faults, though, especially in the example code. For each system call, a small example of calling it is provided, but the code snippets are simplistic and do not really provide much meat. There are very few code examples that tie together the various concepts. Had Love done that, there might have been complaints about the size of the resulting book, but the benefit to budding system programmers would be huge.

There are other problems with the book; for instance, the pirate motif in the examples did not seem to provide anything useful. More seriously, some of the major problems faced by system programmers (race conditions, synchronization of concurrent data access, reentrant code, and so on) were not covered in much detail. These are certainly topics a system programmer will need to understand, but they will have to be learned elsewhere.

The back cover of the book describes it as "an insider's guide to writing smarter, faster code" – it lives up to some of that, but not all. It is a useful book, however, that will find a home on the bookshelf of many Linux programmers. For those who are relatively new to the topic, there will be a wealth of information. But, even for those who are old hands, there will be useful tidbits, system calls that had escaped notice, and lots of reference material.



Don't forget the classic

Posted Dec 6, 2007 16:55 UTC (Thu) by vmole (guest, #111)

While not Linux specific, and missing some of the shiny new stuff such as epoll(), you can't go wrong with Richard Stevens's _Advanced Programming in the Unix Environment_. Good examples, often comparing subtle differences in the way things are called, consideration of race conditions and re-entrancy, and, at the end, four good-sized example projects tying it all together. And despite the title, it's quite suitable for the beginning Unix/Linux/Posix programmer, although it does more-or-less assume knowledge of C. But even if you're a Python programmer, APUE provides excellent discussion of how Unix works, and you won't regret the time spent.

Of course, that's true of *any* Richard Stevens book.

Don't forget the classic

Posted Dec 6, 2007 17:57 UTC (Thu) by nix (subscriber, #2304)

It also horrifies the newbie with indications of just how *bad* things 
were before POSIX standardized away much of the SysV/BSD gulf. The 
chapters on signal handling and terminal I/O may as well be divided into 
completely separate pieces, SysV and BSD differed so much. (In both cases 
SysV basically won. Thankfully it didn't win in all areas. I wish SysVIPC 
had never existed, ick.)

Don't forget the classic

Posted Dec 6, 2007 22:55 UTC (Thu) by vmole (guest, #111)

"I wish SysVIPC had never existed, ick."

Ick? Ick? If "ick" satisfactorily expresses your opinion of SysV IPC, then you clearly haven't suffered enough :-). I *could* use the words that come into my mind, but then Corbet would have to delete this comment.

For those who haven't had the pleasure, among the problems is that the namespace is completely independent of the normal Unix namespace (files), and there's no way to programmatically query it. Also, the associated resources were limited, and they weren't automatically released when all the programs using one exited uncleanly (the way file descriptors are). And by "limited" I mean on the order of 16 or 32. System wide. Also, while you could list the in-use resources (ipcs), there was pretty much no way to find out who had created or was using a particular id. Imagine the fun.

Don't forget the classic

Posted Dec 7, 2007 0:47 UTC (Fri) by nix (subscriber, #2304)

There's more fun. While SysV message queues can fill up because, well, 
they're full (they have a bounded maximum size), they can also fill up 
because *other* queues have filled up to the point where the system won't 
allow any more messages into any queues. Most Unixes have ridiculously low 
limits (64 messages or 64KB across all queues is not unusual) so it is 
utterly trivial to produce horrible deadlocks with this system.

(Of course they don't work with select(). That would be *useful*.)

(I would have used stronger language than `ick', too, but I didn't want to 
get the bills for setting a thousand people's computers on fire.)
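
For readers who have not met SysV message queues, the two complaints in this thread (bounded queues, and resources that outlive their users) can be seen in a short C sketch. This is only an illustration under assumptions: struct demo_msg and the helpers create_private_queue(), fill_queue(), and remove_queue() are hypothetical names invented here, and the number of messages that fit depends on the limits of the system it runs on.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * Hypothetical message layout for this sketch. msgsnd() only requires
 * that the buffer start with a positive long message type.
 */
struct demo_msg {
    long mtype;
    char mtext[64];
};

/* Create a queue that has no key in the SysV namespace at all. */
int create_private_queue(void)
{
    return msgget(IPC_PRIVATE, IPC_CREAT | 0600);
}

/*
 * Send without blocking until the kernel refuses with EAGAIN, which
 * is how a full queue shows up when IPC_NOWAIT is set.
 * Returns the number of messages queued, or -1 on any other error.
 */
int fill_queue(int qid)
{
    struct demo_msg msg;
    int sent = 0;

    msg.mtype = 1;
    memset(msg.mtext, 'x', sizeof msg.mtext);

    for (;;) {
        if (msgsnd(qid, &msg, sizeof msg.mtext, IPC_NOWAIT) == 0)
            sent++;
        else if (errno == EAGAIN)
            return sent;   /* queue is full */
        else
            return -1;     /* e.g. the queue no longer exists */
    }
}

/*
 * The queue is not tied to any file descriptor, so it must be removed
 * explicitly; otherwise it stays around after the process exits.
 */
int remove_queue(int qid)
{
    return msgctl(qid, IPC_RMID, NULL);
}
```

A program that calls create_private_queue() and fill_queue() and then crashes before reaching remove_queue() leaves the full queue behind, where it shows up in ipcs output until someone removes it by hand.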

Huh?

Posted Dec 6, 2007 17:42 UTC (Thu) by clugstj (subscriber, #4020)

"select() provides better timeout resolution: down to the microsecond. Both ppoll() and
pselect() theoretically provide nanosecond resolution, but in practice, none of these calls
reliably provides even microsecond resolution."


Microsecond resolution on the interface is worthless when the granularity of the timer
utilized is in the millisecond range.

Huh?

Posted Dec 6, 2007 20:11 UTC (Thu) by njs (guest, #40338)

1) Microsecond resolution is far from worthless even when the timer's granularity is in the
millisecond range... given that the alternative API, poll(2), has *second* resolution.  (What
were POSIX guys thinking?)

2) You're just repeating his point anyway...?  Maybe I missed something.

3) Timers have better than millisecond resolution these days anyway, if you're running a
tickless kernel.  At least that's my understanding.

Huh?

Posted Dec 10, 2007 16:35 UTC (Mon) by pphaneuf (guest, #23480)

According to the poll(2) I have here, the timeout is actually in milliseconds.

Huh?

Posted Dec 11, 2007 5:11 UTC (Tue) by njs (guest, #40338)

...Quite right.  Not sure how I misread that man page myself.

Well, points (2) and (3) still stand, I guess :-).

We call that "thinking ahead"

Posted Dec 8, 2007 0:40 UTC (Sat) by vmole (guest, #111)

Of course, the interface wasn't *designed* for Linux. Someone had the foresight to see that someday there could be systems that provide microsecond resolution, and decided that allowing for that might possibly be a good idea.

Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds