Review: The Linux Programming Interface
Michael Kerrisk's (relatively) new book, The Linux Programming Interface (TLPI), is targeted at Linux system programmers, but it is not just those folks who will find it useful. While it is a hefty tome ("thick enough to stun an ox", as Laurie Anderson might say), it is eminently readable, either by browsing through it or by biting the bullet and reading it straight through. The coverage of the Linux system call interface is encyclopedic, but the writing style is very approachable. It is, in short, an excellent reference that will likely find its way onto the bookshelves of user-space developers and kernel hackers, including some who aren't necessarily primarily focused on Linux.
Kerrisk has been the maintainer of the Linux man pages since 2004, which gives him a good perspective on the Linux API. As he says in the preface, it is quite likely that you have already read some of his work in sections 2, 3, 4, 5, and 7 of those pages. But the book is not a collection of man pages, though it covers much of the same ground. The style and organization are much less dry, and more explanatory, than a typical man entry.
The book is some 1500 pages in length, which makes it a rather daunting prospect to review. Once I started reading it, though, it was quite approachable. Kerrisk's clear descriptions of various system calls and other parts of the Linux API made it easy to keep reading. I set out to pick and choose certain chapters to read, and just skim the others, but found myself reading quite a bit more than that—which might partially explain the lateness of this review.
The book is organized into 64 chapters of around 20 pages each, which makes for nice bite-sized chunks that allow for reading the book around other tasks. While the focus is on Linux, Kerrisk doesn't neglect other Unix varieties and notes where they differ from Linux. He also pays careful attention to the various standards that specify Unix behavior—like POSIX and the Single Unix Specification (SUS)—pointing out where Linux does and does not follow those standards.
TLPI was written for kernel version 2.6.35 and glibc 2.12. In the text, though, Kerrisk is careful to indicate which kernel version introduced a new feature, so that those working with older kernels will know which features they can use. While the book primarily looks at the 2.6 series, 2.4 is not neglected, and the text notes features that were introduced at various points in the 2.4 kernel history.
The book starts with a bit of history, going all the way back to Ken Thompson and Dennis Ritchie and then moving forward to the present, looking at the various branches of the Unix tree. It then moves into a description of what an operating system is, the role that the kernel plays, and some of the overarching concepts that make up Unix (and Linux). While this information may be unnecessary for most Linux hackers, it will come in handy for those coming to Linux from other operating systems. The ideas that "everything is a file" and that files are just streams of bytes are described in ways that will quickly get a system programmer up to speed on the "Unix way".
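As a small illustration of the "everything is a file" idea (this sketch is not taken from the book), the same read and write calls can be pointed at a pipe or at a regular file. It uses Python's `os` module, which wraps the underlying system calls fairly directly; the helper name `roundtrip_via_fd` is made up for this example.

```python
import os
import tempfile

def roundtrip_via_fd(write_fd, read_fd, payload):
    """Write payload to one descriptor and read it back from another.

    Hypothetical helper: it knows nothing about what kind of object
    the descriptors refer to, only that they are byte streams.
    """
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))

# Case 1: a pipe, which is a kernel buffer with a read end and a write end.
pipe_r, pipe_w = os.pipe()
pipe_result = roundtrip_via_fd(pipe_w, pipe_r, b"hello")
os.close(pipe_r)
os.close(pipe_w)

# Case 2: a regular file, opened twice so that the reading descriptor
# has its own file offset starting at zero.
file_w, path = tempfile.mkstemp()
file_r = os.open(path, os.O_RDONLY)
file_result = roundtrip_via_fd(file_w, file_r, b"hello")
os.close(file_r)
os.close(file_w)
os.unlink(path)

print(pipe_result, file_result)
```

The helper is identical in both cases; only the way the descriptors are obtained differs.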
After that introductory material, Kerrisk launches into the chapters that cover aspects of the system call interface. This makes up the vast majority of the book and each of these chapters is fairly self-contained. They build on the earlier chapters, but the text is replete with references to other sections. In the preface, Kerrisk says that he attempted to minimize forward references, but that clearly was a difficult task as there are often as many forward as backward references in a chapter.
Navigating within the book is easy to do because there are frequent numbered section and subsection headings, along with the chapter number on each page. Other technical books could benefit from that style. There is also an almost too detailed index that runs to more than 50 pages.
Each chapter comes with sample code that is easy to read and understand. Importantly, the examples also do a good job of demonstrating the topic at hand and some of them could be adapted into useful utilities. The code is available from the TLPI web site and is free software released under the Affero GPLv3. Each chapter also has a handful of exercises for the reader, some of which have answers in one of the appendices.
So, what does the book cover? It would be easy to say "all of it", but that would be something of a cop-out, and a bit inaccurate as well. There are multiple chapters on files, file I/O, filesystems, file attributes, extended attributes, and access control lists (ACLs). There is a chapter covering directories and links, as well as one that looks at the inotify file event notification API.
There are multiple chapters on processes, threads, and signals, as well as chapters covering process groups and sessions, and process priorities and scheduling. Of particular interest to me were a chapter on writing secure privileged programs and one on Linux capabilities. There are two chapters on shared libraries, the first of which is more about the ideas underlying libraries and shared libraries along with how to build them, rather than the dlopen() library function (and friends), which is covered in the second.
There are, perhaps, too many chapters covering interprocess communication (IPC), with separate chapters devoted to each System V IPC mechanism (shared memory, message queues, and semaphores). There is also a chapter for each of the POSIX variants of those three IPC types. Both POSIX and System V IPC get their own introductory chapter in addition to the chapters focusing on the details of each type. Sandwiched between the System V and POSIX IPC mechanisms are two chapters on memory mapping and virtual memory operations that might have been better placed elsewhere in the book. There is also a chapter devoted to an introduction to IPC and one that looks at the more traditional Unix pipes and FIFOs. In all, there are twelve chapters on IPC before we even get to the sockets API.
After IPC comes a chapter on file locking, followed by six chapters covering sockets. Those chapters look at Unix and internet domain sockets, along with server design and advanced sockets topics. The book wraps up with a chapter on each of terminals and pseudoterminals, with a somewhat oddly placed "Alternative I/O Models" chapter in between them. It's an interesting chapter, covering select(), poll(), the epoll API, signal-driven I/O, and a few other topics, but it seems weird where it is.
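For readers who have not met these interfaces, here is a minimal sketch (not from the book) of what poll() offers: asking the kernel whether a descriptor is readable before committing to a read. It uses Python's `select.poll`, which wraps the poll() system call on Linux; the helper name `ready_for_read` is made up for this example.

```python
import os
import select

def ready_for_read(fd, timeout_ms=0):
    """Return True if fd has data to read within timeout_ms milliseconds.

    Hypothetical helper built on select.poll().
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(f == fd and bool(ev & select.POLLIN) for f, ev in events)

read_end, write_end = os.pipe()

# Nothing has been written yet, so the read end is not ready.
before = ready_for_read(read_end)

# After one byte is written, poll() reports the read end as readable.
os.write(write_end, b"x")
after = ready_for_read(read_end, timeout_ms=1000)

os.close(read_end)
os.close(write_end)

print(before, after)
```

A server would typically register many descriptors with one poller and loop over the returned events rather than checking one descriptor at a time.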
There is more, of course, and looking at the detailed table of contents will fill out the list. One thing that stands out from the book is the vast size of the Linux/Unix API. It also points out some of the warts and historical cruft that are carried along in that API. Kerrisk is not shy about noting things like that where appropriate in the text: "In summary, System V message queues are often best avoided."
There were two specific topics that I looked forward to reading about but that were only marginally covered by the book. The first is containers and namespaces, which are very briefly mentioned in a discussion of the flags to the clone() system call. A more puzzling omission is that there is almost no mention of the ptrace() system call. In the few places it does come up, readers are referred to the ptrace(2) man page.
There are certainly other parts of the Linux API that could have been covered, beyond the system call interface—sysfs, splice(), and perf come to mind—but Kerrisk undoubtedly needed to draw the line somewhere. Overall, he did an excellent job of that. Technical books, especially those covering Linux, have a tendency to get stale rather quickly, but TLPI shouldn't suffer from that as much as a kernel internals book would, for example. There should really only be additions down the road as the user-space API is maintained by the kernel developers "forever", but updates will presumably need to be made eventually.
There are a handful of additional complaints I could make about the book, but they are all quite minor, as were those mentioned above. The biggest nit is that the "asides" in the text, which are numerous, are often much more than just asides. Each is set off from the rest of the text, indented and rendered in a slightly smaller font (which is typographically a bit annoying to me), and is meant to contain additional information that is not necessarily critical to understanding the topic. In my experience, though, many of them might best have been worked into the main text. See what I mean about minor complaints?
This is a book that will be useful to application and system-level
developers, primarily, but there is much of interest for others as well.
Kernel hackers will find it useful to ensure their new feature (or fix)
doesn't break the existing API. Programmers who are primarily targeting
other Unix systems may also find it useful for making their code more
portable. I found it to be extremely useful and expect to return to it
frequently. Anyone who has an interest in programming for Linux will likely
feel the same way.
