By Jonathan Corbet
February 3, 2008
Our graphical interfaces, as implemented through the X Window System, are
designed around a single keyboard and a single mouse. But humans are
social creatures who want to work together and share systems; they also
tend to design their activities around the fact that we have two hands.
Moving X out of the single-device model is not a task for the faint of
heart, but Peter Hutterer is making a go of it. His LCA talk on
multi-pointer X was an interesting update on where this work stands.
The X device model is based on the idea of a core keyboard and a core
pointer. Even in a situation where multiple input devices are present (a
second mouse plugged into a laptop, say), the application still only sees a
single, core device. There is no way to tell, using these core devices,
which physical device generated any given event. This, of course, will be
an obstacle for any application wanting to provide multi-device support.
As it happens, the XInput extension has provided basic multiple-device
support for many years. XInput events look much like core
device events, except that (1) applications must register to receive
them separately, and (2) they include an ID number identifying the
device which generated the event. XInput does not solve the problem by
itself, though, for a couple of reasons. Beyond the fact that it does not
provide a way for users to specify how different devices should be handled,
XInput suffers from the little difficulty that approximately 100% of X
applications do not make use of it. So nobody is listening to all those
nice XInput events with associated device IDs. The one exception Peter
mentioned is the GIMP, which uses XInput to deal with tablets.
Of course, multiple devices work on current systems; that is because the X
server also generates core events for all devices. That causes the device
ID to be lost, but, since applications do not care, this is not a problem,
for now. But it does mean that we are still stuck in a world where systems have
a single pointer and a single keyboard.
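The loss of device identity can be illustrated with a toy model. The Python sketch below is not Xlib code; the `XInputEvent` and `CoreEvent` classes and the `to_core` helper are hypothetical names made up for illustration. It assumes only what is described above: XInput events carry a device ID, and core events do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XInputEvent:
    """Hypothetical stand-in for an XInput event: it names its device."""
    device_id: int
    kind: str
    x: int
    y: int


@dataclass(frozen=True)
class CoreEvent:
    """Hypothetical stand-in for a core event: no device field at all."""
    kind: str
    x: int
    y: int


def to_core(event: XInputEvent) -> CoreEvent:
    """Model the server producing a core event; the device ID is dropped."""
    return CoreEvent(event.kind, event.x, event.y)


# Two different mice reporting the same motion.
mouse_a = XInputEvent(device_id=2, kind="motion", x=10, y=20)
mouse_b = XInputEvent(device_id=3, kind="motion", x=10, y=20)

# As XInput events they are distinguishable ...
assert mouse_a != mouse_b
# ... but once turned into core events they look identical.
assert to_core(mouse_a) == to_core(mouse_b)
```

An application that listens only for core events sees the second form, which is why it cannot tell the two mice apart.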
Luckily for us, says Peter, multi-pointer X is on the horizon. MPX extends
X through the creation of the concept of "master" and "slave" devices.
Master devices are those which generate events seen by MPX-aware clients;
they are virtual devices which can be created and destroyed by the user at
will. Slave devices, instead, correspond to the physical devices attached
to the system. Through the use of a modified xinput command,
users can create masters and attach specific slaves to them.
In the MPX world, whenever something is done with a physical (slave)
device, the X server works through up to three delivery steps, in order:
- The X server will create an XInput event from the slave device and
deliver it to any applications which have asked for such events.
- If that event is not delivered (because nobody was interested), a core
event from the associated master device is created and queued for
delivery.
- If the event is still undelivered, the server will create an XInput
event from the master device to which the slave is attached and attempt
to deliver that.
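The fallback order in the list above can be sketched as a small function. This is a simplified model, not the server's actual code: the `deliver` function, its parameters, and the device names are all hypothetical, and client interest is reduced to a set of devices with XInput listeners plus a flag for whether anybody wants core events.

```python
def deliver(slave, master, xi_selected, core_selected):
    """Return (event_type, device) for the first step that finds a listener.

    slave         -- name of the physical device that generated the input
    master        -- name of the master device the slave is attached to
    xi_selected   -- set of device names some client asked XInput events for
    core_selected -- True if some client is listening for core events
    Returns None if no step finds an interested client.
    """
    # Step 1: XInput event from the slave device itself.
    if slave in xi_selected:
        return ("xinput", slave)
    # Step 2: core event from the associated master device.
    if core_selected:
        return ("core", master)
    # Step 3: XInput event from the master device.
    if master in xi_selected:
        return ("xinput", master)
    return None


# A legacy application only listens for core events.
legacy = deliver("mouse-2", "master-pointer-2", set(), True)

# An application that asked for XInput events from the physical mouse.
aware = deliver("mouse-2", "master-pointer-2", {"mouse-2"}, True)

# An application that asked only for XInput events from the master.
master_only = deliver("mouse-2", "master-pointer-2", {"master-pointer-2"}, False)

print(legacy, aware, master_only)
```

In this model a legacy application still gets its core event, while an application that has asked for XInput events from the slave sees the physical device directly.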
The end result is a scheme where multiple devices still work as expected
with non-MPX-aware applications. But when an application which does take
advantage of MPX shows up, it will have access to the real information about what
the user is doing.
Peter ran a demo of some of the things he was able to do. By default,
there is still only one pointer and one keyboard. Once a new master is
created, though, and slave devices attached to it, things get more
interesting. Two mouse pointers exist on the screen, each of which can be used
independently. It's possible to be typing into two separate windows at the
same time. Or, with the right window manager, the user can move windows
simultaneously, or resize a window by grabbing two corners at the same
time. It was great fun to watch.
MPX brings with it an API for writing multi-device applications. When
applications use it, says Peter, the result is "eternal
happiness." That just leaves the problem of "the other 100%" of the
application base which lacks this awareness. To a certain extent, things
just work, even when independent pointers are used in the same
application. There are some exceptions, though, which have required some
workarounds in the system.
For example, applications typically respond when the pointer enters a
specific window - illuminating a button within the application, say.
Things work fine when two pointers enter that button. But, likely as
not, once the first pointer leaves the button, it will go dark and
refuse to respond to events from the other pointer. The solution is to
nest enter and leave events, so that only the first entry is reported to
the application, and only the final exit. Another problem results when a
mouse button is pushed while another button is being held down (for a drag operation,
perhaps) on a different device. Do that within Nautilus, and the
application simply locks up - not the eternal happiness Peter was hoping
for. As a workaround, while the application holds a grab on one
device (as happens when buttons are held down), no other button events will
be reported. Also problematic is what to do when the application asks
where the pointer is: which pointer should be reported? In this case, the
server simply assigns one pointer as the one to report on. All of this
makes standard applications work - almost all the time.
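The nesting of enter and leave events described above amounts to tracking which pointers are currently inside a window. The sketch below is one way to model that idea, not the MPX implementation; the `NestedCrossing` class and its method names are hypothetical.

```python
class NestedCrossing:
    """Track pointers inside one window and decide which crossings to report.

    Only the first entry and the final exit are reported to the
    (non-MPX-aware) application; everything in between is suppressed.
    """

    def __init__(self):
        self.inside = set()  # pointers currently within the window

    def enter(self, pointer):
        """Return True if an enter event should reach the application."""
        first = not self.inside
        self.inside.add(pointer)
        return first

    def leave(self, pointer):
        """Return True if a leave event should reach the application."""
        if pointer not in self.inside:
            return False  # this pointer was never inside; nothing to report
        self.inside.remove(pointer)
        return not self.inside


button = NestedCrossing()
reports = [
    button.enter("pointer-1"),  # first pointer in: reported
    button.enter("pointer-2"),  # second pointer in: suppressed
    button.leave("pointer-1"),  # one pointer remains: suppressed
    button.leave("pointer-2"),  # last pointer out: reported
]
print(reports)
```

With this bookkeeping the button stays lit until the last pointer has left it, which is the behavior the workaround is after.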
Some interesting problems remain, though. How, for example, should a
window manager place new windows in a multi-user, multi-device situation?
Users will want their windows in their part of the display space, but the
window manager has no real way of knowing where that is - or even which
user the window "belongs" to. In general, the
whole paradigm under which desktop applications have been developed is
unprepared to deal with a multi-device world.
Things will get worse as more types of input devices enter the picture.
Touch screens are bad enough; they have no persistent state, so things
change every time the user touches the device. But touch screens of the
future will report multiple touch points simultaneously, and each of those
will have attributes like the area of the touch, the pressure being
applied, etc. Perhaps the device will sense elevation - a third dimension
above the device itself.
All of this is going to require a massive rethinking of how our
applications work. There are going to be a lot of big problems. But that,
says Peter, is what happens when one explores new areas. One gets the
sense that he is looking forward to the challenge.