The USB debugging arsenal

By Jake Edge
September 11, 2019

At the 2019 Embedded Linux Conference North America, which was held in San Diego in August, Krzysztof Opasiak gave a presentation on demystifying the ways to monitor—and even change—USB traffic on a Linux system. He started with the basics of the USB protocol and worked up into software and hardware tools to observe, modify, and fuzz the messages that get sent. Those tools are part of the arsenal that is available to those interested in looking deeply into USB.

Opasiak works in Poland for what he called a "small Korean company" (Samsung). He noted that it is not that easy to sniff USB traffic and that the ways to do so are not well known. But "there are no dragons"; nothing bad will happen if you do so. In some ways, USB is like the internet and some of the same tools can be used for both.

USB basics

A USB device provides one or more services (e.g. printing, storage, network interface, camera) to the USB host. Each device can only have a single host, but a host can have multiple client devices at any given time. In the USB model, "endpoints" serve the same role as ports do for the internet. Devices can have up to 31 endpoints, each providing a separate channel for the device. Endpoint zero (ep0) is required and is the only bidirectional endpoint; all of the rest are either IN (from device to host) or OUT (from host to device).
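
As a rough illustration of the endpoint numbering described above, the sketch below decodes an endpoint address byte as it appears in a standard endpoint descriptor (bit 7 carries the direction, the low four bits carry the endpoint number). The function and class names are invented for this example and do not come from the talk.

```python
from typing import NamedTuple

USB_DIR_IN = 0x80            # bit 7 set: device-to-host (IN)
ENDPOINT_NUMBER_MASK = 0x0F  # bits 3..0: endpoint number


class EndpointAddress(NamedTuple):
    number: int
    direction: str  # "IN" or "OUT"


def decode_endpoint_address(b_endpoint_address: int) -> EndpointAddress:
    """Split an endpoint address byte into endpoint number and direction."""
    number = b_endpoint_address & ENDPOINT_NUMBER_MASK
    direction = "IN" if b_endpoint_address & USB_DIR_IN else "OUT"
    return EndpointAddress(number, direction)


def max_endpoints() -> int:
    """ep0 (bidirectional) plus 15 IN and 15 OUT endpoints."""
    return 1 + 15 + 15
```

With four bits of endpoint number and ep0 shared by both directions, the count works out to the 31 endpoints mentioned above.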

There are four types of endpoints: control, bulk, interrupt, and isochronous. The control endpoint is ep0, which is used for the enumeration of other aspects of the device. It could be used for services as well, but typically is not, because of various bugs in the implementations. A bulk endpoint is used to transfer a large amount of data that is not time sensitive; no reservation of bus bandwidth is done for these transfers. Unlike bulk, the interrupt endpoint type does reserve bus bandwidth; it is for small amounts of low-latency data. Finally, the isochronous endpoint is for large amounts of time-sensitive data, but delivery is not guaranteed; it is better to drop a frame than to delay this data.
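
The four types can be summarized in a small lookup table. The sketch below assumes the standard encoding of the transfer type in the low two bits of an endpoint descriptor's attributes byte; the one-line summaries simply restate the descriptions above, and the names are made up for illustration.

```python
# Transfer type codes from the low two bits of the attributes byte.
TRANSFER_TYPES = {
    0: "control",
    1: "isochronous",
    2: "bulk",
    3: "interrupt",
}

# Short summaries restating the article's descriptions.
SUMMARY = {
    "control": "ep0; used for enumeration",
    "bulk": "large, not time-sensitive; no bandwidth reservation",
    "interrupt": "small, low-latency; bandwidth reserved",
    "isochronous": "large, time-sensitive; delivery not guaranteed",
}


def transfer_type(bm_attributes: int) -> str:
    """Return the transfer type name encoded in an attributes byte."""
    return TRANSFER_TYPES[bm_attributes & 0x03]


def describe(bm_attributes: int) -> str:
    name = transfer_type(bm_attributes)
    return f"{name}: {SUMMARY[name]}"
```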

There is a need for a way to discover the endpoints that a device provides, which is done via enumeration on ep0. Endpoints are organized into two higher-level constructs. The endpoints themselves are grouped into "interfaces"; those interfaces are then grouped into "configurations". Only one configuration can be active for a device; so a device may have multiple functions (services) available via different configurations, but only one of them is available at any given time. The ep0 control endpoint is always available, however, so that devices can be switched to a new active configuration.
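
The grouping can be pictured with a toy data model. This is only a sketch of the hierarchy as described above; the class names, the endpoint labels, and the two-configuration example device are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interface:
    name: str
    endpoints: List[str]


@dataclass
class Configuration:
    value: int
    interfaces: List[Interface]


@dataclass
class Device:
    configurations: List[Configuration]
    active: Optional[int] = None  # no configuration selected yet

    def set_configuration(self, value: int) -> None:
        """Switch the single active configuration (requested over ep0)."""
        if value not in {c.value for c in self.configurations}:
            raise ValueError(f"no configuration {value}")
        self.active = value

    def available_endpoints(self) -> List[str]:
        """ep0 is always usable; the rest depend on the active configuration."""
        endpoints = ["ep0"]
        for config in self.configurations:
            if config.value == self.active:
                for interface in config.interfaces:
                    endpoints.extend(interface.endpoints)
        return endpoints


# Hypothetical device: storage in configuration 1, networking in 2.
device = Device([
    Configuration(1, [Interface("storage", ["ep1-in", "ep2-out"])]),
    Configuration(2, [Interface("network", ["ep1-in", "ep2-out", "ep3-in"])]),
])
```

Calling `device.set_configuration(2)` replaces the storage endpoints with the network ones, while `ep0` stays in the list throughout.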

The USB bus is host controlled; the host initiates all data exchange and there is no device-to-device communication. At the link layer, the host will send an IN token to the device; if the device has data ready to send, it will send it, otherwise it will send a NAK. The host will then retry the IN token until it times out.
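
The polling behavior can be modeled in a few lines. In this sketch a `None` reply stands in for a NAK, and the timeout is simplified to a maximum number of attempts; both are modeling assumptions, and the helper names are invented.

```python
from typing import Callable, Optional, Tuple


def poll_in_endpoint(send_in_token: Callable[[], Optional[bytes]],
                     max_attempts: int) -> Tuple[bytes, int]:
    """Send IN tokens until the device returns data.

    Returns the data and the number of NAKs seen before it arrived.
    """
    naks = 0
    for _ in range(max_attempts):
        reply = send_in_token()  # None models a NAK from the device
        if reply is not None:
            return reply, naks
        naks += 1
    raise TimeoutError(f"no data after {max_attempts} IN tokens")


def device_ready_after(nak_count: int,
                       payload: bytes) -> Callable[[], Optional[bytes]]:
    """Build a fake device that NAKs nak_count times, then sends payload."""
    remaining = [nak_count]

    def respond() -> Optional[bytes]:
        if remaining[0] > 0:
            remaining[0] -= 1
            return None
        return payload

    return respond
```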

For an OUT endpoint, the host will send an OUT token, followed by the data, to the device; if the device was ready to receive the data, it does so and sends an ACK, otherwise it NAKs the data and the host will retry sending until a timeout is reached. That is not particularly bandwidth-efficient (the OUT token is eight bits but the data is around 500 bytes), so there is a way for the host to ping the device to see if it is ready to accept data rather than to blindly send it over and over.
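
A back-of-the-envelope model shows why pinging helps. The sketch below counts only the bits the host puts on the wire and assumes that a ping costs about as much as a token; handshakes and framing overhead are ignored, and the sizes are parameters rather than values taken from the specification.

```python
def out_bits_blind(token_bits: int, data_bits: int, naks: int) -> int:
    """Host resends token plus data on every attempt until the ACK."""
    return (naks + 1) * (token_bits + data_bits)


def out_bits_with_ping(token_bits: int, data_bits: int, naks: int) -> int:
    """Host probes with token-sized pings, then sends the data once.

    The device NAKs `naks` pings and ACKs the next one.
    """
    pings = naks + 1
    return pings * token_bits + (token_bits + data_bits)


def bits_saved(token_bits: int, data_bits: int, naks: int) -> int:
    return (out_bits_blind(token_bits, data_bits, naks)
            - out_bits_with_ping(token_bits, data_bits, naks))
```

In this model the blind cost grows with the data size on every retry, while the ping cost grows only with the token size, so the saving increases with both the payload size and the number of NAKs. With no NAKs at all, the ping is pure overhead.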

Those are the low-level USB "transactions", but repeating the code to implement that in all of the drivers is not a good idea, Opasiak said. So drivers act at the level of USB "transfers", which consist of one or more transactions; typically the device or host hardware implements the transactions directly.
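
The relationship between a transfer and its transactions can be sketched as a simple chunking step: a transfer's payload is cut into packets no larger than the endpoint's maximum packet size, and each packet travels in its own transaction. The function name is invented, and details such as zero-length packets are deliberately left out.

```python
from typing import List


def split_transfer(payload: bytes, max_packet_size: int) -> List[bytes]:
    """Cut one transfer's payload into per-transaction data packets."""
    if max_packet_size <= 0:
        raise ValueError("max_packet_size must be positive")
    return [payload[offset:offset + max_packet_size]
            for offset in range(0, len(payload), max_packet_size)]
```

For example, a 1300-byte transfer on an endpoint with a 512-byte maximum packet size becomes three transactions carrying 512, 512, and 276 bytes.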

The Linux kernel provides a hardware-independent API for drivers that is based around the USB request block (URB), which provides a kind of envelope for the data. The API is asynchronous; transfers are initiated and a completion function is called when the transfer has finished. Since the kernel only works with URBs, and not the link-layer transactions, that is the layer at which any kernel-side USB sniffing can be done.
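
The submit-then-complete pattern can be modeled in user space. The sketch below is a toy, not the kernel API (which is written in C); the `Urb` fields, the `ToyHostController` class, and the status values are all hypothetical stand-ins chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque

IN_FLIGHT = -1  # hypothetical marker for "not completed yet"


@dataclass
class Urb:
    endpoint: str
    buffer: bytes
    complete: Callable[["Urb"], None]  # completion function
    status: int = IN_FLIGHT


class ToyHostController:
    def __init__(self) -> None:
        self._queue: Deque[Urb] = deque()

    def submit(self, urb: Urb) -> None:
        """Queue the request and return immediately (asynchronous)."""
        self._queue.append(urb)

    def run_pending(self) -> None:
        """Pretend the hardware finished; call each completion function."""
        while self._queue:
            urb = self._queue.popleft()
            urb.status = 0  # success in this toy model
            urb.complete(urb)
```

A sniffer placed at this layer sees two events per request, one when `submit()` is called and one when the completion function runs, which matches the event types that kernel-side capture reports.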

Sniffing USB

The first tool he described for looking at USB traffic is usbmon, which is a kind of in-kernel logger for URB-related events; it shows URB submission, completion, and error events. He noted that whether the data buffer in a URB is valid for a given event depends on the direction of the endpoint. OUT transfer buffers are valid for submit events, while IN transfer buffers are only valid as part of completion events.
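
The buffer-validity rule is easy to apply when reading usbmon's text output. The sketch below parses only the first four words of a line (URB tag, timestamp, event type, and an address word such as `Ci:1:001:0`), following the layout described in the kernel's usbmon documentation as best understood here; the sample lines used in the usage note are written in that style and are illustrative, and the helper names are invented.

```python
from typing import NamedTuple

TRANSFER_TYPES = {"C": "control", "Z": "isochronous",
                  "I": "interrupt", "B": "bulk"}


class UsbmonEvent(NamedTuple):
    urb_tag: str
    timestamp_us: int
    event: str          # "S" submit, "C" completion, "E" submission error
    transfer_type: str
    direction: str      # "IN" or "OUT"
    bus: int
    device: int
    endpoint: int


def parse_usbmon_line(line: str) -> UsbmonEvent:
    """Parse the leading fields of one usbmon text-format line."""
    tag, timestamp, event, address = line.split()[:4]
    type_dir, bus, device, endpoint = address.split(":")
    return UsbmonEvent(
        urb_tag=tag,
        timestamp_us=int(timestamp),
        event=event,
        transfer_type=TRANSFER_TYPES[type_dir[0]],
        direction="IN" if type_dir[1] == "i" else "OUT",
        bus=int(bus),
        device=int(device),
        endpoint=int(endpoint),
    )


def buffer_is_valid(ev: UsbmonEvent) -> bool:
    """OUT data is meaningful on submit, IN data on completion."""
    return ((ev.direction == "OUT" and ev.event == "S")
            or (ev.direction == "IN" and ev.event == "C"))
```

For a control IN request, a submit line such as `d5ea89a0 3575914555 S Ci:1:001:0 s a3 00 0000 0003 0004 4 <` carries no valid data buffer, while the matching completion line `d5ea89a0 3575914560 C Ci:1:001:0 0 4 = 01050000` does.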

Both binary and text output are available from usbmon, but Wireshark can be used instead to capture and parse the URB event data. He gave a short demo, starting with loading the usbmon kernel module. That sets up various devices in debugfs, one per USB bus in the system; there is also a device that can be used to capture the traffic from all of the buses. He then plugged in a USB device and checked the kernel log to see which bus it had connected to. He used Wireshark to capture data from the binary interface for the bus of interest and then further filtered the data by the device ID of the newly connected device.
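
The two choices made in the demo, which bus to capture and which device to filter on, can be expressed as a pair of small helpers. This sketch assumes the usual Linux naming of per-bus capture interfaces (`usbmon1`, `usbmon2`, and so on, with `usbmon0` covering all buses) and Wireshark's `usb.bus_id` and `usb.device_address` display-filter fields; the bus and device numbers in the usage note are hypothetical.

```python
def capture_interface(bus: int) -> str:
    """Name of the capture interface for a bus; bus 0 means all buses."""
    if bus < 0:
        raise ValueError("bus number cannot be negative")
    return f"usbmon{bus}"


def display_filter(bus: int, device: int) -> str:
    """Wireshark display filter selecting one device on one bus."""
    return f"usb.bus_id == {bus} && usb.device_address == {device}"
```

If the kernel log reported the new device as device 5 on bus 1, capturing on `capture_interface(1)` and applying `display_filter(1, 5)` would narrow the view to that device's traffic.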

Using usbmon is a simple, no-cost solution, but it does have some limitations. You cannot see the low-level transaction packets, just the URB transfers, for one thing. In addition, you need to be able to run code on the host that talks to the device, so you cannot see the traffic between the device and, say, a PlayStation, he said.

A hardware solution is one way around those limitations. USBProxy is code that runs on a single-board computer (SBC) of some sort, which can sit in between the device and the host and, as the name implies, proxies the data between them. In theory, it should work on any SBC that has the proper interfaces, but he was only able to get it working on a BeagleBone Black with a custom kernel. Using that, an attacker could inject malware into an Android device by doing a man-in-the-middle attack on adb, for example. USBProxy is not an out-of-the-box solution for USB interception, though; it "needs some love" to make it work.

The idea of using a logic analyzer for USB capture is often raised. They can be used to capture the traffic as well, he said, but typically only for low-speed or full-speed traffic. The newer high-speed signaling is done at 480Mbps, which, due to the Nyquist-Shannon theorem, means that it would need to be sampled at roughly 1GHz, something that is not common for reasonably priced logic analyzers.
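
The sampling estimate is simple arithmetic: sample at no less than twice the signaling rate. The sketch below applies that rule to the three USB 2.0 speeds, treating the bit rate as the highest frequency of interest in the same way the estimate above does; practical analyzers generally oversample further, which this sketch ignores.

```python
# Signaling rates for the USB 2.0 speeds, in bits per second.
SPEEDS_BPS = {
    "low-speed": 1_500_000,
    "full-speed": 12_000_000,
    "high-speed": 480_000_000,
}


def min_sample_rate_hz(bit_rate_bps: int) -> int:
    """Lower bound on the sample rate: twice the signaling rate."""
    return 2 * bit_rate_bps


def report() -> None:
    for name, rate in SPEEDS_BPS.items():
        print(f"{name}: at least {min_sample_rate_hz(rate) / 1e6:g} MHz")
```

For high-speed traffic the bound is 960MHz, which is where the figure of about 1GHz comes from; full-speed needs only 24MHz, well within reach of inexpensive analyzers.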

OpenVizsla is a low-cost, open-hardware USB analyzer that connects between the host and device of interest. It cannot inject USB traffic, but it can capture data from any kind of host and device without either side knowing that it has been done. It has an FPGA that writes the captured data to on-board RAM, which can then be retrieved over USB by an analysis host. There are tools that can be run on the analysis host to retrieve the data, including a Python script, available in the OpenVizsla GitHub repository, that retrieves the data and displays it on the console. The Virtual USB Analyzer can be used to view the data, but Opasiak is skeptical that anyone will maintain a separate project just to view that USB data.

A better choice is the Wireshark integration that he and a student added. It can parse all of the traffic; it reuses the URB parsing that is already present and adds parsing for the low-level USB transactions. So, without expensive hardware, one can easily capture and analyze all of the data that flows between a USB device and the host it is connected to. He demonstrated the use of Wireshark to get the data from the OpenVizsla and showed it dissecting the traffic between two devices. He noted that if the higher-level traffic being handled is something that has a dissector available (e.g. adb), everything from the low-level to the actual application protocol data will be decoded and displayed.

Fuzzing

He then moved on to the FaceDancer device that can be used to do fuzz testing against host USB stacks. It has a custom USB stack in Python that is run on the host, which turns the FaceDancer into a researcher-controlled—or attacker-controlled—USB device. Using two FaceDancers—one emulating a host and the other emulating a device—allows a system to sit in between a USB host and device like the USBProxy. It can perform man-in-the-middle attacks against both ends of the connection.

Two other, related hardware devices are the GreatFET and the GreatFET Rhododendron. The GreatFET was originally used for radio hacking, he said. Each is compatible with the Python stack used by FaceDancer.

The Umap2 tool has been developed for FaceDancer, GreatFET, and a few other devices. It will emulate USB devices of various sorts, scan hosts to determine what types of USB devices are supported, and do some basic fuzzing using the Kitty framework.

Another fuzzing approach does not require any additional hardware. The Linux USB stack can be fuzzed with syzkaller on a Linux system using dummy_hcd, which acts as a kind of USB loopback device. With it, you can generate and handle USB traffic under the control of syzkaller to find problems in the USB stack. There are more tools out in the wild to do various kinds of USB fuzzing, he said.
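
The basic idea behind descriptor fuzzing can be shown without any hardware. The sketch below builds a standard 18-byte USB device descriptor and flips a few random bits in it; it is a generic mutation example, not the method used by Umap2, Kitty, or syzkaller, and the vendor and product IDs are hypothetical placeholders. Actually delivering the mutated descriptor to a host would require one of the emulation setups described above.

```python
import random
import struct

# Standard device descriptor layout: 18 bytes, little-endian.
DEVICE_DESCRIPTOR_FORMAT = "<BBHBBBBHHHBBBB"


def build_device_descriptor(vendor_id: int, product_id: int) -> bytes:
    """Pack a plain USB 2.0 device descriptor with one configuration."""
    return struct.pack(
        DEVICE_DESCRIPTOR_FORMAT,
        18,          # bLength
        1,           # bDescriptorType: DEVICE
        0x0200,      # bcdUSB: USB 2.0
        0, 0, 0,     # class, subclass, protocol (defined per interface)
        64,          # bMaxPacketSize0
        vendor_id,
        product_id,
        0x0100,      # bcdDevice
        1, 2, 3,     # string descriptor indexes
        1,           # bNumConfigurations
    )


def mutate(data: bytes, rng: random.Random, flips: int = 2) -> bytes:
    """Return a copy of data with up to `flips` randomly chosen bits flipped."""
    out = bytearray(data)
    for _ in range(flips):
        position = rng.randrange(len(out))
        out[position] ^= 1 << rng.randrange(8)
    return bytes(out)


# Hypothetical IDs; a seeded generator keeps test cases reproducible.
original = build_device_descriptor(0x1234, 0x5678)
fuzzed = mutate(original, random.Random(0))
```

Seeding the random generator matters in practice: when a mutated descriptor crashes the host stack, the same seed reproduces the exact input that triggered the problem.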

Opasiak summarized his talk by noting that you can investigate USB in various ways without spending a lot of money. The OpenVizsla and GreatFET-based boards are on the order of $100, he said during the Q&A after the talk. There are various open-source-software and open-hardware projects that can be used, but the precise architecture needed is dependent on what you are trying to do; there is no perfect solution for everyone. USB is everywhere these days—cars, phones, laptops, etc.—so it is important to ensure that it is functioning correctly. He hopes to see much more effort put into USB testing and fuzzing in the future.

[I would like to thank LWN's travel sponsor, the Linux Foundation, for funding to travel to San Diego for ELC.]

The USB debugging arsenal

Posted Sep 11, 2019 18:32 UTC (Wed) by dougg (guest, #1894)

Does anything in the above apply to USB PD? That is USB type C's Power Delivery protocol that runs along the CC line. IMO that is the most interesting (new-ish) USB protocol that is often forgotten. It doesn't seem to fit in with the architecture of the USB subsystem outlined here. Should it be regarded as a separate subsystem? Alternate mode (e.g. PCIe<->DP) is controlled via USB PD so it is not clear that USB PD should be part of the power subsystem, but it is related.

The USB debugging arsenal

Posted Sep 12, 2019 14:13 UTC (Thu) by marcH (subscriber, #57642)

> Should it be regarded as a separate subsystem?

I think so. There's a lot in Type-C that has basically nothing to do with USB (which is why "USB-C" is a bad name).

This is the best Type-C resource I know: https://medium.com/@leung.benson

It's funny to observe this two-phase realization with Type-C:
1. "Oh great, a single cable for everything! Simple at last!"
2. "Wait, why does this combination not display/multi-display/go high-speed/charge/etc. ?"

And let's not even get into "Type-C cable fried my laptop" or manufacturers refusing to use logos not to stain their beautiful design...

USB C one connector for everything

Posted Sep 15, 2019 18:31 UTC (Sun) by giraffedata (guest, #1954)

"USB C" single connector was a really bad development for system builders. It used to be you could tell what was compatible by seeing if the connectors fit. With USB C, the same connector can carry lots of different protocols, but doesn't have to carry them all and usually doesn't. And the systems are not smart enough to give you diagnostic information even if you can get the software working. You don't get a message somewhere that says, "You've plugged in an HDMI device to a port that can't do HDMI. This port does USB 3.1 and Thunderbolt 3, though."

This was the situation a long time ago. 40 years ago, a DB-25 connector could be just about anything, from RS-232 to SCSI. But I thought we had learned our lesson.

When I first heard about USB C for everything, I thought it meant USB was being expanded to do what all those other protocols do, so if you were lucky enough to have a device with a C connector and a computer with one, you were done.

The USB debugging arsenal

Posted Jul 2, 2021 0:11 UTC (Fri) by kevinpr (subscriber, #139762)

> There's a lot in Type-C that has basically nothing to do with USB

They're actually almost completely orthogonal systems. You can swap USB PD data role (host/device) and power role (source/sink) independent of the existing USB2.0/3.0 host/device role. You can do USB PD _only_ on a Type-C cable with no USB2.0/3.0 traffic whatsoever (like your USB-C charger) or you can do USB2.0/3.0 traffic _only_ on a Type-C cable with no USB PD traffic, in which case you'll just get the standard 5V on VBUS.

The only place they kind of combine is with Alternate Modes, some of which mandate use of the SuperSpeed lines to carry different data, like your DP video signals. But even then the SuperSpeed controller doesn't really _interact_ with the USB PD controller, it's just told to get out of the way so the physical lines can be muxed to your DP controller instead.

The USB debugging arsenal

Posted Jun 14, 2021 18:38 UTC (Mon) by grundler (guest, #23450)

> Does anything in the above apply to USB PD?

I didn't see anything - but I know USB PD snooping isn't expensive:
https://www.chromium.org/chromium-os/twinkie

1) Plugable: https://archive.plugable.com/products/usbc-tkey/plugable-...
2) Totalphase: https://www.totalphase.com/products/usb-power-delivery-an...
3) Twonkie: https://hackaday.com/2021/02/20/google-inspired-usb-pd-sn...

Plugable unit is no longer available. Check the alternatives.