
Multi-touch support landing in X

January 18, 2012

This article was contributed by Nathan Willis

X.Org 1.12, the next release of the reference X server, is currently in release candidate status. With it come several new features, but the most anticipated is probably multi-touch support, courtesy of version 2.2 of the XInput2 extension. Peter Hutterer, the maintainer of XInput2, has been writing about adding multi-touch support on his blog since December 2011, covering both the architecture and what application developers will need to address before they can bring multi-touch and gesture support to users.

Of course, any discussion of multi-touch X begins by explaining what it is and what it isn't. Multi-touch refers specifically to the ability to recognize and use multiple input points on a single hardware device. For example, using more than one finger to manipulate objects on a touchscreen device, or multi-finger gestures on a touchpad. It is a different animal entirely from multi-pointer X (MPX), which is the ability to use two or more on-screen cursors at the same time, controlled by separate hardware devices. MPX support was added — also by Hutterer — to XInput2 in 2009, and was first released with X.Org 1.7.

Touchpad users are probably already familiar with two-finger scrolling and two- or three-finger mouse clicks. Although detecting multiple points of contact is involved, this is not genuinely multi-touch either: detecting the multiple taps or the scrolling motion simply triggers a different event from the touchpad driver. A simple way to tell the difference is that with a two-finger middle-click, the positions of the user's fingers do not matter; the cursor stays (more or less) in one place. With a multi-touch gesture like a pinch, however, tracking and interpreting the motion and relative positions of the fingers makes all the difference in the world.

Devices and events

XInput2.2 defines two distinct classes of multi-touch device, corresponding to the two major modes of multi-touch user interaction. A direct touch device (XIDirectTouch) is one where the touch event occurs on screen, as is the case with tablets, touch-screen monitors, and the like. In these cases, the coordinates where the event happens come directly from the position of the touch. The other class is the dependent touch device (XIDependentTouch), typically a non-display input device like a touchpad. A dependent device controls a cursor in the "normal" fashion most of the time, but supports multi-touch events, too. The positions of touch events on a dependent device are interpreted relative to the cursor position.
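
As an illustration, here is a minimal sketch (not taken from Hutterer's posts) of how a client might use XIQueryDevice() to tell the two classes apart by inspecting each device's touch class; error handling is omitted:

    #include <stdio.h>
    #include <X11/Xlib.h>
    #include <X11/extensions/XInput2.h>

    /* List every touch-capable device, whether it is direct or
     * dependent, and how many simultaneous touches it can track. */
    static void print_touch_modes(Display *dpy)
    {
        int ndevices, i, j;
        XIDeviceInfo *devices = XIQueryDevice(dpy, XIAllDevices, &ndevices);

        for (i = 0; i < ndevices; i++) {
            for (j = 0; j < devices[i].num_classes; j++) {
                if (devices[i].classes[j]->type != XITouchClass)
                    continue;
                XITouchClassInfo *t =
                    (XITouchClassInfo *) devices[i].classes[j];
                printf("%s: %s device, up to %d touches\n",
                       devices[i].name,
                       t->mode == XIDirectTouch ? "direct" : "dependent",
                       t->num_touches);
            }
        }
        XIFreeDeviceInfo(devices);
    }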

XInput2.2 also defines three event types that together describe a touch sequence: XI_TouchBegin, XI_TouchUpdate, and XI_TouchEnd. By definition, each touch sequence starts with an XI_TouchBegin, is followed by zero or more XI_TouchUpdate events, and ends with an XI_TouchEnd. Client applications that want to catch touch events must use XISelectEvents to register for all three event types.
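
In code, that registration might look something like the following sketch, which selects the three touch events on a window for all master devices (the display and window are assumed to have been created already):

    #include <X11/Xlib.h>
    #include <X11/extensions/XInput2.h>

    static void select_touch_events(Display *dpy, Window win)
    {
        /* One bitmask large enough for every XI2 event type. */
        unsigned char bits[XIMaskLen(XI_LASTEVENT)] = { 0 };
        XIEventMask mask = {
            .deviceid = XIAllMasterDevices,
            .mask_len = sizeof(bits),
            .mask     = bits,
        };

        XISetMask(bits, XI_TouchBegin);
        XISetMask(bits, XI_TouchUpdate);
        XISetMask(bits, XI_TouchEnd);

        XISelectEvents(dpy, win, &mask, 1);
        XFlush(dpy);
    }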

To a client application listening for the new event types, touches appear no different from any other XInput event — with the addition of a "touch ID" that is returned in the event's detail field. This ID is a 32-bit value that the application must use to keep track of multiple, simultaneous touches. The touch events use the XIDeviceEvent structure that is also used for pointer and keyboard events; handling touches separately (including interpreting gestures) is left up to the application, library, or toolkit.
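
A skeletal event loop built on that structure could look like the sketch below; xi_opcode is assumed to have come from an earlier XQueryExtension() call, and the per-touch bookkeeping is left as comments:

    #include <stdio.h>
    #include <X11/Xlib.h>
    #include <X11/extensions/XInput2.h>

    static void touch_event_loop(Display *dpy, int xi_opcode)
    {
        XEvent ev;

        for (;;) {
            XNextEvent(dpy, &ev);
            if (ev.xcookie.type != GenericEvent ||
                ev.xcookie.extension != xi_opcode ||
                !XGetEventData(dpy, &ev.xcookie))
                continue;

            XIDeviceEvent *de = ev.xcookie.data;

            switch (ev.xcookie.evtype) {
            case XI_TouchBegin:
                /* de->detail is the touch ID; start tracking it. */
                printf("touch %d begins at %.0f,%.0f\n",
                       de->detail, de->event_x, de->event_y);
                break;
            case XI_TouchUpdate:
                /* Look up de->detail; the new position is in
                 * de->event_x and de->event_y. */
                break;
            case XI_TouchEnd:
                /* Forget de->detail. */
                break;
            }
            XFreeEventData(dpy, &ev.xcookie);
        }
    }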

Device grabbing and ownership

Although the Begin, Update, and End events cover almost all touch event cases, XInput2.2 defines a fourth touch event, XI_TouchOwnership, to provide low-latency touch event processing in the unusual situations that remain.

The need arises because it is not always clear which X client ought to process a series of touch events. For example, on a tablet, a window manager and an image viewer might both reasonably be expected to listen for touch events on the same window; the user could drag the window, or perhaps pinch to zoom in on the image. XInput allows a client to stake out a claim to the events that happen in a window by "grabbing" touch events.

Calling the grab API tells the X server that the grabby client wants exclusive access to the input device. But a client application that holds a grab on a device could begin to process a touch event sequence and only then discover that the sequence is not intended for it. To follow up on the example above, the image viewer might grab a touch event sequence, then process it and determine that it is not a gesture it recognizes. In that case, the grabby client sends a reject to the server, and the server then sends the sequence to the next client (i.e., the window manager), which hopefully does recognize the gesture and can react accordingly. When a client decides that it does want to process the touch sequence, it sends an accept instead.
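
The accept/reject decision itself goes through the XIAllowTouchEvents() call, as in this hypothetical helper that a grabbing client might invoke once it has analyzed a sequence (the recognized flag is an assumption for illustration):

    #include <X11/extensions/XInput2.h>

    /* Tell the server whether we claim this touch sequence; on a
     * reject, the server replays the sequence to the next client. */
    static void resolve_touch(Display *dpy, XIDeviceEvent *de,
                              int recognized)
    {
        XIAllowTouchEvents(dpy, de->deviceid, de->detail, de->event,
                           recognized ? XIAcceptTouch : XIRejectTouch);
    }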

That process works reasonably well when only one or two clients are involved, but every client that studies and rejects the gesture adds lag, which is exacerbated because the X server re-sends the touch sequence to each subsequent client. If several clients are involved, the UI suddenly becomes sluggish to the user. XInput2.2's solution is the XI_TouchOwnership event. A client can select to listen for this event in addition to the usual touch sequence events. When it does so, the X server sends it a copy of each event sequence immediately, even if another client has a grab on the device.

The result is that the client listening for XI_TouchOwnership can begin to process touch events immediately and lower its response times. However, because one of the other clients may accept the touch event sequence, whatever action the listening client takes needs either to be invisible and reversible, or to be held off until the X server tells the client that its turn has come by sending it an XI_TouchOwnership event.
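
A sketch of what an ownership-aware client might do, after also setting XI_TouchOwnership in the event mask shown earlier (the buffer_touch() and commit_touch() helpers are hypothetical stand-ins for application logic):

    #include <X11/extensions/XInput2.h>

    extern void buffer_touch(XIDeviceEvent *de);
    extern void commit_touch(unsigned int touchid);

    static void handle_touch_cookie(XGenericEventCookie *cookie)
    {
        switch (cookie->evtype) {
        case XI_TouchBegin:
        case XI_TouchUpdate:
            /* Tentative only: any visible effect must stay reversible,
             * since another client may still accept the sequence. */
            buffer_touch(cookie->data);
            break;
        case XI_TouchOwnership:
            /* The server has made us the owner; commit for real. */
            commit_touch(((XITouchOwnershipEvent *) cookie->data)->touchid);
            break;
        case XI_TouchEnd:
            /* If no ownership event arrived first, another client
             * accepted the sequence; discard the buffered events. */
            break;
        }
    }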

Toolkit and application support

As the grabbing/ownership scenario illustrates, multi-touch support in X is difficult precisely because X handles multiple active applications running simultaneously. In 2010, Hutterer observed that this was a higher bar than the one faced by Apple's iOS or Microsoft's Surface project, both of which function in pre-defined environments with only a single full-screen application active at any one time. X is also expected to make multi-touch devices co-exist peacefully with keyboards and standard pointers, a combination that touch-only platforms generally do not have to support.

The result is that XInput2.2's multi-touch support arrives via a separate set of input device types and events that applications will need to add support for explicitly. Consequently, it may be several release cycles after X.Org 1.12 before window managers, GIMP, Firefox, and the rest fully support the new multi-touch features.

GUI toolkit support is closer. Qt already has a stable multi-touch API that supports gestures, introduced in version 4.6. Ubuntu recommended Qt to application developers in 2011, while it maintained its own uTouch framework for the Unity environment during the XInput2.1 development cycle. Although uTouch included a patchwork of components in the past, the distribution is migrating it to XInput2.2 for its 12.04 release.

That leaves the GNOME side of the field. Carlos Garnacho and others are working on adding multi-touch support to GTK+ and GDK in the multitouch branch. However, it is not clear when this work is expected to land. Garnacho noted his progress on his blog twice in early 2011, but there have been no formal updates there or on the mailing lists for months. Hutterer indicated on the Fedora wiki's multi-touch page that support may land as soon as GTK+ 3.4. Interested parties can keep up to date by watching the GNOME Shell Touchscreen wiki page.

The essential building blocks are lined up (if not all in place) for true multi-touch on X.Org systems, so it is plausible that before 2013 arrives, desktop and mobile Linux users will consider multi-touch applications commonplace. But that does not mean that they will be happy about it. As Hutterer mentions on his blog, one of the most challenging aspects of multi-touch is coming up with good, intuitive user interface designs to fit together with the technical underpinnings. On that front, 2012 may also be the year of multi-touch experimentation.





Multi-touch support landing in X

Posted Jan 19, 2012 9:37 UTC (Thu) by dgm (subscriber, #49227)

I have long been interested in how the X11 developers were going to solve this (quite difficult) problem. I have to say that I'm impressed; the TouchOwnership solution is brilliant! Good job.

Multi-touch support landing in X

Posted Jan 21, 2012 5:51 UTC (Sat) by cnd (guest, #50542)

We (the uTouch team) wanted a solution for low latency through XInput, but we couldn't think of a good way to do it if we had these accept/reject round trips to the server. IIRC, at the X Developer's Conference in 2010 I was giving a talk on how we hoped to architect our gesture stack. I think I mentioned this issue, and Keith Packard mentioned on IRC that maybe we could send events to potential clients before they became the "owners" of a touch sequence.

After many discussions over beers, I went back home and drafted the touch ownership concept. It has been in Ubuntu's prototype implementation since it debuted in Ubuntu 11.04, and we were able to demonstrate its usage. This gave us all a good feeling that it was the right approach.

Now, we're almost done with our rearchitecture of the uTouch gesture stack, built on top of XInput multitouch. This functionality allows us to have a single process dedicated to gesture recognition, which then dispatches gesture events to clients who want them. The clients can then take them as they are, or analyze them further to see if they match their own filters for gestures. None of that would be possible without huge latencies if we hadn't added touch ownership support :).

-- Chase

Multi-touch support landing in X

Posted Jan 19, 2012 19:52 UTC (Thu) by daniels (subscriber, #16193)

Nice article, but the example in your WM + drawing app is backwards: grabs are traversed from the root window downwards, rather than bubbling up from the child window. So, in that example, the WM would get ownership first, then the drawing app.

Multi-touch support landing in X

Posted Jan 20, 2012 15:28 UTC (Fri) by jrb (subscriber, #31610)

Carlos Garnacho just did a pretty nice update video of multi-touch progress in a GTK+ branch on his blog in response to this article:

http://blogs.gnome.org/carlosg/2012/01/20/multitouch-is-n...

It shows quite a lot of needed functionality, and is reportedly still on track to land in GTK+ 3.4.

