
X11 wire-level analysis with x11vis

May 4, 2011

This article was contributed by Michael Stapelberg

A few weeks ago, the initial announcement of x11vis, an X11 protocol visualizer, was posted to the xorg mailing list. Only a few people are developing low-level X programs these days (think of xkill, xwininfo, etc.), and all these tools actually work. Why would anyone need a tool to visualize X protocol traffic? In a way, x11vis is similar to tcpdump: both are wire-level analysis tools, with tcpdump showing network traffic and x11vis showing X traffic. Even though most people are not network engineers, from time to time it comes in handy to check if a web application is really using SSL.

X basics

When an X client, say Firefox, connects to the X server, it first has to authenticate itself. As soon as the connection is established, data flows in two directions: the client can only send requests while the server sends replies (to those requests), errors, and events.

Let's have a look at the very basic task of creating an empty window on your screen. The client starts by sending a CreateWindow request which initializes a specified X ID (an unsigned 32-bit integer) with a position/size, a parent window, border, etc. Afterwards, properties such as the window title or icon are set with the ChangeProperty request. To actually display the window on your screen, a MapWindow request is sent.

These are the requests which the client sends. None of them actually has a reply, but any request can generate an error, for example if you pass an invalid parent window in the CreateWindow request. After the window has been mapped (made visible on your screen), the client will receive a MapNotify event. Other frequently used events include KeyPress and ButtonPress, which are generated by keyboard and mouse input.
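To make the request/reply/error/event distinction concrete, here is a minimal Python sketch of the wire encoding. It assumes a little-endian connection (the client chooses the byte order during connection setup) and uses only a handful of core-protocol codes; the function names are invented for this illustration and are not part of x11vis.

```python
import struct

# Core-protocol codes used in this section.
MAP_WINDOW = 8
EVENT_NAMES = {2: "KeyPress", 4: "ButtonPress", 19: "MapNotify"}
ERROR_NAMES = {2: "BadValue", 3: "BadWindow", 6: "BadCursor"}


def encode_map_window(window_id: int) -> bytes:
    """Build a MapWindow request: opcode, unused byte, length in 4-byte units, window ID."""
    return struct.pack("<BxHI", MAP_WINDOW, 2, window_id)


def classify_server_message(data: bytes) -> str:
    """Tell errors, replies and events apart by the first byte of a server message."""
    kind = data[0]
    if kind == 0:
        # For errors, the second byte carries the error code.
        return "error " + ERROR_NAMES.get(data[1], str(data[1]))
    if kind == 1:
        return "reply"
    # The top bit is set when the event was generated by a SendEvent request.
    code = kind & 0x7F
    return "event " + EVENT_NAMES.get(code, str(code))


request = encode_map_window(0xF00)
print(request.hex())
print(classify_server_message(bytes([19]) + bytes(31)))
```

The first byte of everything the server sends is enough to tell the three message kinds apart, which is what lets a proxy sort traffic without knowing every packet layout.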

Before x11vis: xtrace

Before x11vis, the standard tool for analyzing X applications was xtrace. Like strace, it prints a textual representation of what happens. While that works just fine and fits well in the Unix text world, it is not easy to use for analyzing problems for several reasons:

  • The vast amount of plain text output is very hard to understand or even to navigate in. Each line of xtrace output starts with a number representing the connection which is used for this particular packet (Firefox could be number 1, GVim number 2, and so on). The rest of the line contains a full dump of the packet, including all data. For some requests, this data can be more than I can fit on my 1280x800 screen.

  • In the X protocol, a lot of IDs are used. There are IDs for windows, atoms, fonts, pixmaps, and so on. xtrace translates atom IDs to human-readable names, so you will see "UTF8_STRING" instead of 0x113. However, such a translation is completely absent for window IDs. Analyzing X sessions with more than one client quickly becomes difficult.

  • While it is possible, hiding specific events is tedious in a text editor. When debugging a real-world problem, you are usually not interested in packets such as InternAtom or PropertyNotify.

  • A user might want to display other packets related to the one currently being inspected (for example all events for an affected window, not only the CreateWindow packet). This is not possible with xtrace, as it presents only textual, non-interactive output.
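To illustrate the kind of post-processing this forces on the user, the following Python sketch groups xtrace-style lines by connection and drops noisy packet types. It assumes only that each packet line starts with the connection number followed by a colon, as in the sample xtrace line quoted later in this article; the function name and the two extra sample lines are made up for illustration.

```python
from collections import defaultdict

NOISE = ("InternAtom", "PropertyNotify")


def filter_trace(lines, noise=NOISE):
    """Group xtrace-style lines by connection number, dropping noisy packet types."""
    by_connection = defaultdict(list)
    for line in lines:
        prefix, sep, _ = line.partition(":")
        if not sep or not prefix.isdigit():
            continue  # not a packet line (banner, blank line, ...)
        if any(name in line for name in noise):
            continue
        by_connection[int(prefix)].append(line)
    return dict(by_connection)


# The first line is quoted from the article; the other two are made-up
# lines in the same shape, for illustration only.
sample = [
    "000:>:0003:32: Reply to GetGeometry: depth=0x18 root=0x000000be "
    "x=3362 y=1112 width=155 height=21 border-width=0",
    "000:>:0004:32: Reply to InternAtom: atom=0x113",
    "001:>:0002:32: Event PropertyNotify(28) window=0x00000f00",
]
kept = filter_trace(sample)
print(sorted(kept), len(kept[0]))
```

Even with such a filter, the result is still static text, so it does not address the last point in the list.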

Inside x11vis

x11vis strives to be better in those areas outlined above. It consists of two main parts: the so-called "interceptor" and the GUI. The interceptor is a Perl daemon that implements a proxy between your client(s) and the X server, dissecting all packets that are sent through it and dumping them into a JSON file. The code that dissects the raw bytes into a nice data structure is auto-generated from the XML protocol description in xcb-proto. The GUI parses this JSON file and displays all packets in a well-arranged fashion.
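The real interceptor is written in Perl and generates its dissectors from xcb-proto. As a rough idea of what dissecting a request stream into JSON involves, here is a hedged Python sketch. The record fields ("opcode", "name", "length") are hypothetical and not x11vis's actual JSON schema, the opcode table is a small hand-written subset, and the stream is assumed to be little-endian and to start after connection setup.

```python
import json
import struct

# A few core-protocol request opcodes; a hand-written subset for this sketch.
REQUEST_NAMES = {1: "CreateWindow", 8: "MapWindow", 14: "GetGeometry", 18: "ChangeProperty"}


def dissect_requests(stream: bytes):
    """Split a client-to-server byte stream into one record per request."""
    records = []
    offset = 0
    while offset + 4 <= len(stream):
        # Header: opcode, one request-specific byte (skipped), length in 4-byte units.
        opcode, length_units = struct.unpack_from("<BxH", stream, offset)
        if length_units == 0:
            break  # BIG-REQUESTS extended length: not handled in this sketch
        size = length_units * 4
        if offset + size > len(stream):
            break  # incomplete request, more bytes are needed
        records.append({
            "opcode": opcode,
            "name": REQUEST_NAMES.get(opcode, "unknown"),
            "length": size,
        })
        offset += size
    return records


# Two 8-byte requests on window 0xf00: GetGeometry followed by MapWindow.
stream = struct.pack("<BxHI", 14, 2, 0xF00) + struct.pack("<BxHI", 8, 2, 0xF00)
records = dissect_requests(stream)
print(json.dumps(records, indent=2))
```

Because every request carries its own length, a proxy can find packet boundaries without understanding the payload; the generated code is only needed to name the fields inside each packet.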

The GUI is not a stand-alone application, but is implemented as a web application using jQuery. This decision was made because building the GUI on top of the HTML Document Object Model (DOM) with CSS is a lot quicker than writing custom widgets in Qt or GTK (in terms of development time). Also, it makes x11vis easily usable on computers on your local network, which is a common setup when debugging X problems.

Example: Comparing XCB and Xlib

I mentioned XCB as the project which includes the XML protocol description. XCB stands for X C-language Bindings and is the successor of Xlib. By automatically generating the bindings from the protocol description, XCB achieves multiple goals. First of all, every function has a predictable name, and by using xcb_ as a prefix, XCB does not clutter the namespace (unlike Xlib with types such as Font and Display). More importantly, XCB does not hide the asynchronous nature of the X protocol from the programmer. When a typical X application starts, it has to request the Atom IDs for a number of atoms, say 20. With Xlib, there is the XInternAtom function that returns the ID for a given name. XCB instead provides two functions: xcb_intern_atom() and xcb_intern_atom_reply(). The former returns a cookie which you pass to the latter to get the actual result. The idea is that you place your requests as early as possible, do something else, then fetch all the replies.
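The effect of the cookie pattern on round-trips can be shown with a toy model. The Python sketch below does not talk to an X server; FakeConnection and its methods are hypothetical stand-ins that merely mimic the shape of xcb_intern_atom() and xcb_intern_atom_reply() and count how often the client has to wait for the server.

```python
class FakeConnection:
    """Toy stand-in for an X connection that only counts round-trips."""

    def __init__(self):
        self.round_trips = 0
        self._pending = []      # requests queued but not yet sent
        self._replies = {}      # cookie -> atom ID
        self._atoms = {}        # name -> atom ID, the simulated server state
        self._next_cookie = 0

    def intern_atom(self, name):
        """Queue the request and return a cookie immediately."""
        cookie = self._next_cookie
        self._next_cookie += 1
        self._pending.append((cookie, name))
        return cookie

    def intern_atom_reply(self, cookie):
        """Return the reply, sending all queued requests first if it is not here yet."""
        if cookie not in self._replies:
            self.round_trips += 1
            for queued_cookie, name in self._pending:
                atom = self._atoms.setdefault(name, 0x100 + len(self._atoms))
                self._replies[queued_cookie] = atom
            self._pending.clear()
        return self._replies.pop(cookie)


names = [f"ATOM_{i}" for i in range(20)]

# Xlib style: ask for one atom, wait for the answer, repeat.
blocking = FakeConnection()
for name in names:
    blocking.intern_atom_reply(blocking.intern_atom(name))

# XCB style: send all requests first, then collect the replies.
pipelined = FakeConnection()
cookies = [pipelined.intern_atom(name) for name in names]
atoms = [pipelined.intern_atom_reply(cookie) for cookie in cookies]

print(blocking.round_trips, pipelined.round_trips)
```

In this model, the blocking loop waits once per atom, while the pipelined version waits once in total; the latency saved grows with the number of requests.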

A good example of both XCB's asynchronous nature and x11vis is analyzing the xwininfo(1) program. When started as:

    xwininfo -id 0xf00 -children

the program will first query the given window (an iceweasel window in this case) for all of its children and then request some properties for every child.

[x11vis Xlib]

The screenshot above shows the x11vis output when using xwininfo 1.0.5, which uses Xlib. On the left, you can see all the requests and replies, organized in bursts. As Xlib is blocking, each burst contains only one packet.

[x11vis XCB]

Compare the Xlib shot to the one above, where xwininfo 1.1.0 uses XCB to talk to the X server. While you can still identify three round-trips, you can see that the burst at the bottom of the screenshot contains requests for different pieces of information about more than one window.
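The article does not spell out how x11vis delimits a burst. As an illustration only, the Python sketch below assumes a burst is a run of consecutive packets on the same connection travelling in the same direction; the traces are invented and much shorter than a real xwininfo session.

```python
from itertools import groupby


def group_into_bursts(packets):
    """Merge consecutive packets that share connection and direction into one burst."""
    bursts = []
    for (conn, direction), group in groupby(packets, key=lambda p: (p[0], p[1])):
        bursts.append({
            "connection": conn,
            "direction": direction,
            "packets": [name for _, _, name in group],
        })
    return bursts


# (connection, direction, packet name); both traces are made up for illustration.
xlib_trace = [
    (0, "request", "QueryTree"), (0, "reply", "QueryTree"),
    (0, "request", "GetGeometry"), (0, "reply", "GetGeometry"),
]
xcb_trace = [
    (0, "request", "QueryTree"), (0, "reply", "QueryTree"),
    (0, "request", "GetGeometry"), (0, "request", "GetWindowAttributes"),
    (0, "reply", "GetGeometry"), (0, "reply", "GetWindowAttributes"),
]
print([len(b["packets"]) for b in group_into_bursts(xlib_trace)])
print([len(b["packets"]) for b in group_into_bursts(xcb_trace)])
```

Under this assumption, a blocking client produces only single-packet bursts, while a pipelining client produces bursts holding several requests or several replies, which matches what the two screenshots show.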

You can see that in the first burst, x11vis displays "Iceweasel" instead of the window ID 0xf00, even though that information is only available later on. Also, the description of each packet is a short representation of the most important facts. The GetGeometry reply is labeled "(3362, 1112) 155 x 21" and can be expanded by clicking on it. In the xtrace output, the equivalent is the following line:

    000:>:0003:32: Reply to GetGeometry: depth=0x18 root=0x000000be \
        x=3362 y=1112 width=155 height=21 border-width=0
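As a sketch of how such a short label and the delayed naming could be derived, the following Python code condenses the xtrace line above into the "(x, y) width x height" form and substitutes names for IDs in a second pass. The helper names are hypothetical and this is not x11vis's actual code; it only assumes that fields appear as key=value pairs, as in the line above.

```python
import re


def parse_fields(trace_line):
    """Collect the key=value pairs of an xtrace-style line into a dict of strings."""
    return dict(re.findall(r"([\w-]+)=(\S+)", trace_line))


def geometry_label(fields):
    """Condense a GetGeometry reply into '(x, y) width x height'."""
    return "({x}, {y}) {width} x {height}".format_map(fields)


def backfill_names(window_ids, names):
    """Second pass: show a name wherever one is known, even if it was learned later."""
    return [names.get(window_id, hex(window_id)) for window_id in window_ids]


line = ("000:>:0003:32: Reply to GetGeometry: depth=0x18 root=0x000000be "
        "x=3362 y=1112 width=155 height=21 border-width=0")
print(geometry_label(parse_fields(line)))
print(backfill_names([0xF00, 0xF01], {0xF00: "Iceweasel"}))
```

Doing the substitution in a second pass is what allows a name that only becomes known late in the session to appear on packets from the very first burst.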

Example: Identifying a race condition in i3-wm

x11vis has been used multiple times to solve real-world X problems. For example, in the i3 window manager, there was a problem with themed mouse cursors: they would not show up on the very first window decorations that were created around already existing windows, but only on window decorations for windows that were created later on.

[Cursor error]

We know that the problem is related to creating windows and setting the cursor for these windows. Therefore, I started by scrolling down to the first CreateWindow request and checked if there were any X errors (pink background in x11vis). And in fact, as you can see in the screenshot, there is one corresponding Error packet for every request trying to use the themed cursor. The bad_value of the Cursor error is c_0 (unnamed cursor 0), which is precisely the cursor ID we are setting in the ChangeWindowAttributes request above.

[Cursor initialization]

I then used the search function of my browser to see where c_0 was actually initialized. The location of the CreateGlyphCursor request for c_0 was actually after the X errors. Now this explains the symptom, but in the code, the order is correct: first, the cursor is initialized (line 291), then the existing windows are handled (line 425). Having a closer look at the burst reveals that the cursor initialization is actually sent via the separate Xlib connection instead of the main XCB connection. As both connections buffer, my next guess was that the code neglects to flush the Xlib connection. As it turned out, the guess was correct.
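The race is easy to reproduce in a toy model. The Python sketch below is not i3 code: FakeServer and BufferedConnection are hypothetical classes that only model two client-side buffers feeding one server, with the cursor created on one connection and used on the other.

```python
class FakeServer:
    """Toy server that only remembers which cursor IDs exist."""

    def __init__(self):
        self.cursors = set()
        self.errors = []

    def handle(self, request, cursor_id):
        if request == "CreateGlyphCursor":
            self.cursors.add(cursor_id)
        elif request == "ChangeWindowAttributes" and cursor_id not in self.cursors:
            self.errors.append(("Cursor", cursor_id))


class BufferedConnection:
    """Requests stay in a client-side buffer until flush() is called."""

    def __init__(self, server):
        self.server = server
        self.buffer = []

    def send(self, request, cursor_id):
        self.buffer.append((request, cursor_id))

    def flush(self):
        for request, cursor_id in self.buffer:
            self.server.handle(request, cursor_id)
        self.buffer.clear()


def run(flush_cursor_connection):
    server = FakeServer()
    xlib_conn = BufferedConnection(server)   # creates the cursor
    xcb_conn = BufferedConnection(server)    # manages the windows
    xlib_conn.send("CreateGlyphCursor", 0)
    if flush_cursor_connection:
        xlib_conn.flush()
    xcb_conn.send("ChangeWindowAttributes", 0)
    xcb_conn.flush()
    xlib_conn.flush()  # the buffered request reaches the server eventually
    return server.errors


print(run(flush_cursor_connection=False))
print(run(flush_cursor_connection=True))
```

Without the explicit flush, the program order and the order seen by the server differ, so the server reports a Cursor error for an ID that the client believes it has already created.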

This bug was found in a short time due to two factors. On the one hand, X errors can be spotted very easily in x11vis. On the other hand, distinguishing the different connections requires only a quick glance at the top of each burst.

Conclusion

In this article, I explained how x11vis tries to help X developers: it visualizes the X protocol at the wire level, providing some helpful features like markers, as well as folding or mapping human-readable names to connections and X IDs. x11vis is still a young project and is looking for contributors. If you want to help make x11vis a better tool for you, please do not hesitate to contact me at michael@x11vis.org or go get the source and documentation at the project web site.



X11 wire-level analysis with x11vis

Posted May 6, 2011 7:58 UTC (Fri) by tfheen (subscriber, #17598)

This looks interesting. I do wonder why the author didn't do this as a wireshark plugin instead, though.

X11 wire-level analysis with x11vis

Posted May 6, 2011 10:35 UTC (Fri) by mstapelberg (subscriber, #66308)

I do see the benefits of a wireshark plugin. However, I decided against it because of the user interface constraints. With the current x11vis GUI, I am free to develop in any direction that I see fit. I especially thought of displaying the different x11 clients next to each other horizontally (useful on big monitors only), for example :).

I am also not sure if wireshark allows packets to be modified afterwards. x11vis kind of jumps back to modify packets as soon as it gets new information.

X11 wire-level analysis with x11vis

Posted May 18, 2011 20:23 UTC (Wed) by oak (guest, #2786)

Looks very interesting, but I'm not familiar with Perl, its modules or their typical package names. Which of these are lacking e.g. from Debian (Stable):
http://x11vis.org/docs/manual.html#_requirements
?

(For example "apt-cache search perl|grep -i dancer" doesn't return anything.)

X11 wire-level analysis with x11vis

Posted May 19, 2011 4:02 UTC (Thu) by dtlin (✭ supporter ✭, #36537)

Dancer is not yet packaged in Debian, but there's an ITP. Same for Twiggy (ITP).

AnyEvent, AnyEvent::Socket, IO::All, JSON::XS, Moose, MooseX::Singleton, and XML::Twig are all in Squeeze already.

X11 wire-level analysis with x11vis

Posted May 19, 2011 5:45 UTC (Thu) by mstapelberg (subscriber, #66308)

The best way to install all the modules right now is with cpanminus. See http://cpanmin.us/

After installing, just run cpanm Dancer Twiggy … as root and there you go.

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds