
A look at "BPF Performance Tools"

By Jake Edge
February 26, 2020

BPF has exploded within the Linux world over the last few years, growing from its networking roots into the go-to tool for running custom in-kernel programs. Its role seems to expand with every kernel release into diverse areas such as security and device control. But none of that is the focus of a relatively new book from Brendan Gregg, BPF Performance Tools; it looks, instead, at how BPF provides visibility into the guts of the kernel. Finding performance bottlenecks of various sorts on (generally large) production systems is an area where BPF and the tool set that has grown up around it can excel; Gregg's book describes that landscape in great depth.

[Book cover]

The book is meant to be both a way to learn about what BPF can do to improve the observability of Linux systems and applications and a reference guide to a large body of tools that Gregg and others have built up to peer inside the running system. Interestingly, it does not actually cover the underlying BPF virtual-machine instructions all that much, except in an appendix; the focus is on how to use BPF at a higher level. Even then, learning to actually write tools using the high-level environments (BCC and bpftrace) is not truly the intent either, though code samples for bpftrace abound. The book is definitely geared toward finding problems at multiple levels on Linux systems running in production.

It begins by introducing BPF, noting its origin as the Berkeley Packet Filter and its eventual upgrade to extended BPF (eBPF), before giving a quick overview of the tracing and sampling techniques available on Linux. It then gives a taste of what the BPF Compiler Collection (better known as BCC) can actually do by using canned tools to examine system-wide execve() calls and block I/O latency. The different levels of tracing available in a Linux system, from applications through system libraries and the system-call interface down to internal kernel tracepoints and hardware counters, are briefly described with an eye toward a few bpftrace "one-liners" to examine open() and openat() system calls. Examples of bpftrace one-liners (and more) can be found in Gregg's LWN article from July 2019 and a report on his talk at the 2019 Linux Storage, Filesystem, and Memory-Management Summit.

That first chapter would be useful to anyone who is curious what the BPF fuss is all about. The concepts introduced in the first chapter (and more) are spelled out in greater detail in the rest of Part I ("Technologies"). The book is meant to be read straight through, if desired, or simply used as a reference of the tools and techniques that can be used to track down problems in a system. That leads to a bit of repetitiveness here and there throughout, so that readers popping into a particular place will not be completely lost. It can be a bit irritating at times for those just reading through it, but it is probably unavoidable in a dual-purpose book like this.

BPF itself is a complicated beast that hooks into a wide variety of facilities for gathering tracing information. Those include both static options (kernel tracepoints and user-level statically defined tracing (USDT) markers) and ways to insert dynamic instrumentation into the kernel (kprobes) or user-space programs (uprobes). BPF programs can collect information from those sources (and others, such as hardware performance-monitoring counters (PMCs) and perf_events), summarize it in-kernel, and display the results in a variety of forms. Chapter 2 describes all of them in some detail.

One of the key advantages of BPF over other tracing techniques is that it does its work efficiently in the kernel and can simply present its results; many other tools require storing lots of information in memory or log files and then post-processing it to actually pull out the data of interest. Some also require adding code to the kernel, either by rebuilding it with a different configuration or by adding a kernel module; BPF dispenses with all of that. In addition, BPF has data structures and helper functions to collect the kinds of information that might be of interest (e.g. stack traces); a description of all of that is gathered up in Chapter 2 as well.
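The difference between summarizing events as they happen and post-processing a log can be illustrated with a small user-space analogy. The Python sketch below is not BPF code and is not taken from the book; it only mimics the power-of-two histograms that BPF tools commonly print, folding each event into a small set of buckets as it arrives instead of storing every event for later analysis. The function names (log2_bucket, summarize, format_hist) and the sample latencies are made up for illustration.

```python
from collections import Counter


def log2_bucket(value):
    """Return the index of the power-of-two bucket that holds value.

    Bucket 0 holds 0 and 1; bucket k (k >= 1) holds 2**k .. 2**(k+1) - 1.
    """
    if value < 0:
        raise ValueError("latency cannot be negative")
    return max(value.bit_length() - 1, 0)


def summarize(latencies_us):
    """Fold a stream of latencies into per-bucket counts, one event at a time.

    Only the counters are kept; individual events are discarded as soon as
    they have been counted, which is the point of the analogy.
    """
    hist = Counter()
    for latency in latencies_us:
        hist[log2_bucket(latency)] += 1
    return hist


def format_hist(hist):
    """Render the buckets as '[low, high] count' lines, smallest bucket first."""
    lines = []
    for bucket in sorted(hist):
        low = 0 if bucket == 0 else 2 ** bucket
        high = 2 ** (bucket + 1) - 1
        lines.append(f"[{low}, {high}] {hist[bucket]}")
    return lines


if __name__ == "__main__":
    # Hypothetical latencies in microseconds, passed as an iterator so that
    # nothing but the histogram survives the pass over the data.
    sample = iter([3, 5, 6, 120, 130, 900, 1500])
    for line in format_hist(summarize(sample)):
        print(line)
```

However many events flow through summarize(), the state it keeps is bounded by the number of buckets rather than the number of events, which is roughly why an in-kernel summary can be much cheaper than writing every event to a log and processing it afterward.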

While using BPF is the focus of the book, Gregg does not ignore the other tools available for diagnosing problems. The chapter on the process of analyzing a system starts with a look at the goals and methodologies that can be used to narrow things down. There are two separate checklists that are presented as starting points. The first uses standard Linux tools (e.g. vmstat, pidstat, and sar) in a "60-second checklist", before moving into a checklist of BCC tools (e.g. execsnoop, biosnoop, and tcpaccept). Each of the entries on the checklists is described along with how the output can be useful in pinpointing where problems might be; the BCC tool descriptions also reference other parts of the book where they are described in even more detail.

Rounding out Part I are a chapter each for BCC and bpftrace covering their installation, internal operation, and how they can be used; each chapter has multiple examples of them in action. These days, many Linux distributions provide packages for both of these interfaces, including the tools developed using them. There is also a large set of tools that Gregg developed specifically for the book; those are shown in red in the diagram below, while the existing tools are shown in black. All of the new tools can be found in his GitHub repository.

[Tools diagram]

While the first part of the book gives a lot of useful context and a large, tasty bite of what BPF can do, the meat of the book is contained in Part II ("Using BPF Tools"). There are 11 separate chapters, each looking at a different area of the system with an eye toward how to use the tools and bpftrace one-liners to dig into the operation of that area. For example, there are chapters covering the CPU, memory, I/O, networking, security, applications, languages (e.g. Java), containers, and hypervisors.

Each chapter gives some background information to help understand the role of the area covered in the chapter; it also describes aspects of it that might lead to performance or other problems. The traditional tools for investigating problems are introduced with examples given of the kinds of information they can provide. The chapters then move into BPF tools and bpftrace programs (or one-liners) that can be used for troubleshooting and pinpointing problem areas. Many of the chapters have an "Optional Exercises" section with ideas for ways to extend the existing tools or write new ones either using BCC or bpftrace; the ones marked "advanced, unsolved" are, of course, particularly challenging.

The remaining parts of the book are supplemental material at some level. Part III ("Additional Topics") has a chapter on other BPF-based performance-analysis tools and one on "Tips, Tricks, and Common Problems". The final part is appendixes, including a list of all the one-liners used in Part II, a bpftrace cheat sheet, information on developing BCC-based and C-based BPF programs, and a reference on the BPF instruction set. That is followed by a glossary, bibliography, and an index.

I have a couple of nits to pick, but overall it is an excellent book with comprehensive coverage of BPF-based tools and how to use them for investigating performance and other problems. The book can be a bit overwhelming at times, but that is really due to the subject matter at hand; there are lots of parts and pieces in the BPF landscape, so trying to keep them all straight can be a challenge.

The publisher provided a review copy of the EPUB version of the book, which I read in two different ways: on a tablet using Lithium and on my desktop with calibre; I did not try it on my Kobo E Ink reader as the layout of the book did not seem conducive to a small, monochrome screen. I encountered some rendering problems on Lithium, which I have used successfully with other technically oriented books; when an example or some tool output spanned a page boundary on the screen, the portion that belonged on the next page would not be displayed. But there were links that would take you to a full-page rendering of the item, which could then be tapped to return to the right place. Calibre did not have that flaw and presumably other EPUB readers would not either, but it was obviously not annoying enough for me to go search out another reader.

The book has quite a number of in-line footnotes, which are useful; they often highlight the history and developer behind a particular tool. But the use of square-bracket-style links in the text left something to be desired. Clicking (or tapping) those would take you to an entry in the list after the bibliography, but each listed item was itself simply a link to a web URL. Some way to directly go to the linked-to item would have been a bit easier to navigate. Obviously, a dead-tree version of the book would not suffer from that lack, but paging to the list might be a bit painful as well. Perhaps newer editions could simply use regular footnotes for the links as well, making them directly selectable in electronic copies and saving the paging on paper copies.

While the book focuses on performance problems on "big iron"—many of the examples show output from 48-CPU systems—the techniques and tools will be useful for a wide variety of other environments. Tracking down bugs on a desktop system or gaining familiarity with the internals of the kernel are just two of the possibilities that the book helps unlock. Nearly anyone running Linux will find a bpftrace one-liner (or three) that will pique their curiosity. BPF Performance Tools is definitely worth a look for anyone curious about the workings of their Linux systems.


Index entries for this article
Kernel: BPF



A look at "BPF Performance Tools"

Posted Feb 26, 2020 18:50 UTC (Wed) by hkario (subscriber, #94864)

> While the book focuses on performance problems on "big iron"—many of the examples show output from 48-CPU systems

* AMD wants to know your location

seriously though, Ryzen Threadripper 2990WX is a desktop-oriented 1.5 years old CPU and it delivers 64 logical CPUs...

A look at "BPF Performance Tools"

Posted Feb 26, 2020 22:52 UTC (Wed) by edeloget (subscriber, #88392)

> Ryzen Threadripper 2990WX is a desktop-oriented 1.5 years old CPU

With a price tag of 1700€, I would consider it "big iron" as well - that's not exactly your average desktop-oriented CPU... :)

A look at "BPF Performance Tools"

Posted Feb 27, 2020 2:46 UTC (Thu) by gus3 (guest, #61103)

Once enough people say "oooh, shiny!" and the market flow rises, it will be. Just like CD/DVD recorders (late 90's) and 2G DIMM's (early 2010's).

A look at "BPF Performance Tools"

Posted Feb 27, 2020 15:06 UTC (Thu) by hkario (subscriber, #94864)

I'd say that "big iron" starts if your computer (or cluster) is 48U high or occupies multiple racks...

I mean, it's not like mainframes—the original big iron—aren't a thing any more.

A look at "BPF Performance Tools"

Posted Feb 27, 2020 22:02 UTC (Thu) by bgregg (guest, #46639)

I might not have been clear in the book, but the 48-CPU systems are EC2 instances and Netflix runs many thousands of them (over 200k instances of varying sizes). I used a lot of 48-CPU examples since it's a typical instance size for a busy microservice (where the instance count can range from tens to thousands of such instances).

A look at "BPF Performance Tools"

Posted Feb 29, 2020 12:01 UTC (Sat) by dowdle (subscriber, #659)

Mainframes ARE still a thing. Just ask IBM.

In fact, there were a couple of presentations about Mainframes at FOSDEM... or was it linux.conf.au?

There was a claim made that building large systems out of clusters of PC hardware was a problematic way to go... and that the Mainframe offered many advantages over such PC clusters.


Copyright © 2020, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds