By Nathan Willis
February 11, 2013
Linux.conf.au 2013 in Canberra
provided an interesting window into the world of
display server development with a pair of talks about the X Window
System and one about
its planned successor Wayland (a talk which will be the subject of its own
article shortly). First, Keith Packard discussed coming
improvements to compositing and rendering. He was followed by David
Airlie, who talked about recent changes and upcoming new features for
the Resize, Rotate and Reflect Extension (RandR), particularly to cope
with multiple-GPU laptops. Each talk was entertaining enough in
its own right, but they worked even better together as the speakers
interjected their own comments into one another's Q&A period (or, from
time to time, during the talks themselves).
Capacitance: sworn enemy of the X server
Packard kicked things off by framing recent work on the X server as
a battle against capacitance—more specifically, the excess power
consumption that adds up every time there is an extra copy operation
that could be avoided. Compositing application window contents and
window manager decorations together is the initial capacitance sink,
he said, since historically it required either copying an
application's content from one scanout buffer to another, or
repainting an entirely new buffer then doing a page-flip between the
back (off-screen) buffer and the front (on-screen) buffer. Either
option requires significant memory manipulation, which has steered the
direction of subsequent development, including DRI2, the rendering
infrastructure currently used by the X server.
But DRI2 has its share of other problems needing attention, he
said. For example, the Graphics Execution Manager (GEM) assigns its
own internal names called global GEM handles to the graphics memory it
allocates. These handles are simply integers, not references to
objects (such as file descriptors) that the kernel can
manage. Consequently, the kernel does not know which applications are
using any particular handle; it instead relies on every application to
"remember to forget the name" of each handle when it is
finished with it. But if one application discards the handle while
another application still thinks it is in use, the second application
will suddenly see whatever data happens to be placed in that
graphics memory next, presumably by some unrelated application.
GEM handles have other drawbacks, including the fact that they bypass
the normal kernel security mechanisms (in fact, since the handles are
simple integers, they are hypothetically guessable). They are also
specific to GEM, rather than using general kernel infrastructure like
DMA-BUFs.
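The contrast between the two sharing models is easy to sketch in
code. What follows is a minimal illustration, not code from any
driver: it assumes a GEM handle already obtained from some
driver-specific allocation ioctl, and it omits all error handling.
The flink name is a bare integer with system-wide scope, while the
DMA-BUF export produces an ordinary file descriptor that the kernel
reference-counts like any other.

    #include <fcntl.h>      /* O_CLOEXEC, used by DRM_CLOEXEC */
    #include <stdint.h>
    #include <xf86drm.h>    /* libdrm: drmIoctl(), drmPrimeHandleToFD() */
    #include <drm.h>        /* struct drm_gem_flink, DRM_IOCTL_GEM_FLINK */

    /* Old way: publish a global integer name for a buffer.  Any
     * process that can guess the integer can open the buffer, and
     * the kernel has no idea who is still using it. */
    static uint32_t publish_flink_name(int drm_fd, uint32_t gem_handle)
    {
        struct drm_gem_flink flink = { .handle = gem_handle };

        drmIoctl(drm_fd, DRM_IOCTL_GEM_FLINK, &flink);
        return flink.name;  /* a bare integer, outside fd-based security */
    }

    /* New way: export the buffer as a DMA-BUF file descriptor.  The
     * fd can be passed to another process over a socket, is
     * reference-counted by the kernel, and goes away once every
     * holder closes it. */
    static int export_dma_buf(int drm_fd, uint32_t gem_handle)
    {
        int prime_fd = -1;

        drmPrimeHandleToFD(drm_fd, gem_handle, DRM_CLOEXEC, &prime_fd);
        return prime_fd;
    }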
DRI2 also relies on the X server to allocate all buffers, so
applications must first request an allocation, then wait for the X
server to return one. The extra round trip is a problem on its own,
but server allocation of buffers also breaks resizing windows, since
the X server immediately allocates a new, empty back buffer. The
application, however, does not find out about the new allocation
until it receives and processes the asynchronous event message from
the server, so whatever frame it was drawing at the time can simply
be lost.
The plan is to fix these problems in DRI2's successor, which
Packard referred to in slides as "DRI3000" because, he said, it
sounded futuristic. This DRI framework will allow clients, not the X
server, to allocate buffers, will use DMA-BUF objects instead of
global GEM handles, and will incorporate several strategies to reduce
the number of copy operations. For example, as long as the client
application is allocating its own buffer, it can allocate a little
excess space around the edges so that the window manager can draw
window decorations around the outside. Since most of the time the
window decorations are not animated, they can be reused from
one frame to the next. Compositing the window and decoration will thus
be faster than in the current model, which copies the application
content on every frame just to draw the window decorations around it.
Under the new scheme, if the client knows that the application state has not
changed, it does not need to trigger a buffer swap.
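The shape of client-side allocation can be sketched with the GBM
buffer-management API. This is only an illustration of the idea
under those assumptions, not the actual DRI3000 code path; the
device node is hard-coded, and cleanup and error handling are
omitted.

    #include <fcntl.h>
    #include <stdint.h>
    #include <gbm.h>

    /* Allocate a buffer in the client and export it as a DMA-BUF fd
     * that could then be handed to the X server. */
    int allocate_and_export(uint32_t width, uint32_t height)
    {
        /* Open the DRM device directly; no X server round trip. */
        int drm_fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
        struct gbm_device *gbm = gbm_create_device(drm_fd);

        /* The client picks the size itself, so a window resize can
         * never race against a server-side allocation. */
        struct gbm_bo *bo = gbm_bo_create(gbm, width, height,
                                          GBM_FORMAT_XRGB8888,
                                          GBM_BO_USE_RENDERING);

        /* Export as a DMA-BUF file descriptor rather than a global
         * GEM name. */
        return gbm_bo_get_fd(bo);
    }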
Moving buffer management out of the X server and into the client
has other benefits as well. Since the clients allocate the buffers
they use, they can also assign stable names to the buffers (rather than the
global GEM handles currently assigned by the server), and they can be
smarter about reusing those buffers, for instance by tracking the
freshness of each one through the EGL_buffer_age extension. If the X
server has just performed a swap, it can report back that the previous
front buffer is now idle and available. But if the server has just
performed a blit (copying only a small region of updated pixels), it
could instead report back that the just-used back buffer is idle
instead.
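On the client side, querying the age is nearly a one-liner. Here is
a hedged sketch, assuming the EGL_EXT_buffer_age extension has been
advertised by the implementation:

    #include <EGL/egl.h>
    #include <EGL/eglext.h>

    #ifndef EGL_BUFFER_AGE_EXT
    #define EGL_BUFFER_AGE_EXT 0x313D
    #endif

    /* Returns nonzero if the back buffer still holds a recent frame,
     * so the client can repaint only the damaged region. */
    int can_do_partial_redraw(EGLDisplay dpy, EGLSurface surface)
    {
        EGLint age = 0;

        /* Age 0 means the contents are undefined (a fresh or
         * reclaimed buffer); age N means the buffer holds the frame
         * from N swaps ago, so only the intervening damage needs
         * redrawing. */
        if (!eglQuerySurface(dpy, surface, EGL_BUFFER_AGE_EXT, &age))
            return 0;

        return age > 0;
    }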
There are also copy operations to be trimmed out in other ways,
such as by aligning windows with GPU memory page boundaries. This
trick is currently only doable on Intel graphics hardware, Packard
said, but it yields about 50% of the possible gain between the status
quo and the hypothetical upper limit. He already has much of the
DRI-replacement work functioning ("at least on my
machine") and is targeting X server 1.15 for its release. The
page-swapping tricks are not as close to completion; a new kernel
ioctl() has been written to allow exchanging chunks of GPU
pages, but the page-alignment code is not yet implemented.
New tricks for new hardware
Airlie's talk focused more on supporting multiple displays and
multiple graphics cards. This was not an issue in the early days, of
course, when the typical system had one graphics card tied to one
display; a single Screen (as defined by the X Protocol) was
sufficient. The next step up was simply to run a separate Screen for
the second graphics card and the second display—although, on the
down side, running two separate screens meant it was not possible to
move windows from one display to the other. A similar arrangement was
"Zaphod" mode, a configuration in which one graphics card drove two
displays on two separate Screens. The trick was that Zaphod mode used
two copies of the GPU driver, with one attached to each screen. Here
again, two Screens meant that it was not possible to move windows
between displays.
Things started getting more interesting with Xinerama, however.
Xinerama mode introduced a "fake" Screen wrapped around the
two real Screens. Although this approach allowed users to
move windows between their displays, it did this at the high cost of
keeping two copies of every window and pixmap, one for each real
Screen. The fake Screen approach had other weaknesses, such as the
fact that it maintained a strict mapping to objects on the real,
internal Screens—which made hot-plugging (in which the
real objects might appear and disappear instantly) impossible.
Thankfully, he said, RandR 1.2 changed this, giving us for the
first time the ability to drive two displays with one graphics card,
using one Screen. "It was like ... sanity,"
he concluded, giving people what they had long wanted for
multiple-monitor setups (including temporarily connecting an external
projector for presentations). But the sanity did not last, he continued,
because vendors started manufacturing new hardware that made his life
difficult. First, multi-seat/multi-head systems came out of the
woodwork, such as USB-to-HDMI dongles and laptop docking stations.
Second, laptops began appearing with multiple GPUs, which "come
in every possible way to mess up having two GPUs." He has one
laptop, for example, which has the display-detection lines connected
to both GPUs ... even though only one of the GPUs could actually
output video to the connected display.
RandR 1.4 solves hot-plugging of displays, work that required adding
support for udev, USB, and other standard kernel interfaces to the X
server, which up until then had used its own methods for bus probing
and other such tasks. RandR 1.4's approach to USB
hotplugging worked by having the main GPU render everything, then
having the USB GPU simply copy the buffers out, performing its own
compression or other tricks to display the content on the USB-attached
display. RandR 1.4 also allows the X server to offload the rendering
of part of the screen (such as a game running in one window) to a
second GPU while displaying the result on the main display.
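Both relationships are configured through the provider API that
RandR 1.4 added to libXrandr: an output source for a scanout-only
device like a USB dongle, and an offload sink for render offloading.
The sketch below wires up the USB case; it assumes exactly two
providers with the first being the main GPU, whereas real code would
inspect each provider's capabilities before choosing.

    #include <X11/Xlib.h>
    #include <X11/extensions/Xrandr.h>

    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);
        Window root = DefaultRootWindow(dpy);
        XRRProviderResources *res = XRRGetProviderResources(dpy, root);

        if (res->nproviders >= 2) {
            /* Assumption: providers[0] is the main GPU and
             * providers[1] is the hotplugged USB device.  The main
             * GPU renders; the USB GPU only scans out the copied
             * buffers. */
            XRRSetProviderOutputSource(dpy, root,
                                       res->providers[1],  /* sink   */
                                       res->providers[0]); /* source */
        }

        XRRFreeProviderResources(res);
        XCloseDisplay(dpy);
        return 0;
    }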
The future of RandR includes several new functions, such as
"simple" GPU switching. This is the relatively
straightforward-sounding action of switching display rendering from
running on one GPU to the other. Some laptops have a hardware switch
for this function, he said, while others do it in software.
Another new feature is what Airlie calls "Shatter," which splits up
rendering of a single screen between multiple GPUs.
Airlie said he has considered several approaches to getting to
this future, but at the moment Shatter seems to require adding a
layer of abstraction he called an "impedance" layer between X server
objects and GPU objects. The impedance layer tracks protocol objects
and damage events and converts them into GPU objects. "It's
quite messy," he said, describing the impedance layer as a
combination of the X server's old Composite Wrapper layer and the
Xinerama layer "munged" together. Nevertheless, he said, it is
preferable to the other approach he explored, which would rely on
pushing X protocol objects down to the GPU layer. At the moment, he
said, he has gotten the impedance layer to work, but there are some
practical problems, including the fact that so few people do X
development that there are only one or two people who would be
qualified to review the work. He is likely to take some time off to
try and write a test suite to aid further development.
Marking the spot
It is sometimes tempting to think of X as a crusty old
relic—and, indeed, both Packard and Airlie poked fun at the
display server system and its quirks more than once. But what both
talks made clear was that even if the core protocol is to be replaced, that
does not reduce window management, compositing, or rendering to a
trivial problem. The constantly changing landscape of graphics
hardware and the ever-increasing expectations of users will certainly
see to that.