Displaying QR codes for kernel crashes
A proposal from Cong Wang to discuss the various mechanisms to store the kernel's "dying breath" spawned a rather large thread on the ksummit-2012-discuss mailing list. While things like pstore were set up specifically to provide a means to store kernel crash information, that doesn't necessarily make it easy for users to access and report kernel crashes. That led to suggestions and discussion of better ways for users to get the information out of their crashed systems—including using QR codes to facilitate the process.
Most regular users do not have a serial console set up to record crash information on a separate machine. So the kernel backtrace that appears after a crash is just written to the console, which means that much of it will have scrolled off the screen. Even the data that is there is hard to extract, with some folks trying to type the information in, which is tedious, not to mention error-prone. A QR code that encoded the relevant data could certainly help there.
Konrad Rzeszutek Wilk was the first to broach the QR code idea, though he said it did not originate with him. It turns out that H. Peter Anvin and Dirk Hohndel have been "messing with" the idea, but Will Deacon and Marc Zyngier actually showed something along those lines at the recent Linaro Connect in Hong Kong. Deacon was hesitant to call it a prototype, but said that there was some work done on encoding a kernel crash backtrace as a QR code. There were two problems with their approach:
- Even without any error correction, the QR code started to get pretty large (and unreadable) after more than a few lines of backtrace. This should be fairly easy to fix by encoding the data in a more sensible manner rather than just verbatim (especially since a backtrace is a well-structured log). Maybe you could even gzip the whole thing after that too (then sell an android app to gunzip it :p)
-
Displaying the QR code on a panic could be problematic. We tried using the ASCII option of libqrencode but we couldn't find any phone that would read the result. So we need a way to get to the framebuffer once we've sawn our head off (maybe this is easier with x86 and VGA modes?).
One of the original motivations for kernel modesetting (KMS) was to get readable oops information to the screen. Using KMS to display a fairly simple QR code graphic instead should be workable, rather than creating an ASCII version as Deacon describes. Matthew Garrett noted that it should be fairly straightforward at least for hardware that has KMS support:
There is some disagreement about where the decoding of any QR code should take place. Garrett believes that existing QR apps in phones should be used, while others are not convinced they can be coerced into being flexible enough to deal with the large QR codes that might result from a kernel backtrace. Garrett has also done some work on the problem and described his approach:
Anvin would rather see some kind of web
application that accepts a photo of the QR code and decodes it on the server. For
one thing, having one (working) decoding code base is desirable: "I can tell you just how bad a lot of the QR decoder software running on
smartphones are -- because I have tried them.
" In addition, though,
a web application would also have the photo itself, so even if it didn't
decode because of picture quality or other reasons, those photos could be
used to improve the quality of the decoder.
But that implies that a user would need to download an app to their phone or use some web application as suggested by John Hawley. Garrett was not in favor of either solution, noting that requiring an app makes its harder for users, while a web application doesn't really make it any better:
Given that many users already use photos to report crashes—taking a
picture of the screen with the last part of the backtrace—the QR code
mechanism, even if a bit cumbersome, might be able to provide the full
backtrace. But, as Dave Jones suggested,
just having scrollback available on the console after a crash would make
much of the problem disappear:
"What would be a thousand times more useful would be having working scrollback
when we panic, like we had circa 2.2
".
Users could then take a photo, scroll back a ways, take another, and so on. In the thread, there was widespread agreement that console scrollback would be desirable. But it turns out that the advent of USB keyboards caused the loss of that feature. Doing USB handling inside the panic code would be messy, so bringing that feature back is difficult. Other ideas were mentioned, like providing enough of the USB stack to write the crash information to a USB stick as Anvin suggests, or to "auto-scroll" the console output after a crash without requiring keyboard input as proposed by Paul Gortmaker.
Making it easier for users to report crashes with useful information was one branch of the discussion, but the folks who work on the embedded side are looking for more developer-oriented solutions as well. Tony Luck outlined the pstore back-ends that are currently available to store crash and other information in various places (ERST, EFI variables, RAM) that are accessible after a reboot. Wang, Tim Bird, Jason Wessel, and others are interested in discussing that piece of the puzzle.
While QR codes may seem like something of gimmick, they can compress a fair amount of data into a form that can be digested elsewhere. Getting useful information out of an unresponsive, crashed Linux system is fairly difficult at this point, so finding better ways to do so would be good. Should the program committee decide to add this topic, a lively discussion seems likely. If not, though, enough people are looking into the idea that something will emerge sooner or later.
| Index entries for this article | |
|---|---|
| Kernel | Debugging |
| Kernel | Development tools/Kernel debugging |
