By Nathan Willis
September 6, 2012
The development cycle for GStreamer 1.0 took longer than many
(including some in the project itself) had originally anticipated. A
big part of the reason was the GStreamer team's desire to deliver a
stable and well-rounded 1.0 release — but that does not mean
that the 1.0 milestone designated a "completed" product with no room
for improvement. Several sessions at the 2012 GStreamer Conference in
San Diego explored what is yet to come for the multimedia framework,
including technical improvements and secure playback for untrusted
content.
Shiny new features
GStreamer release manager Tim-Philipp Müller gave the annual status
report, which both recapped the previous year's developments and set
the stage for what lies ahead. Several new content formats need
support, including the various flavors of Digital Video Broadcasting
(DVB) television (all of which are based on MPEG-2), the new streaming
standard Dynamic Adaptive Streaming over HTTP (MPEG-DASH) in both
server and client implementations, and the 3D Multiview Video Coding
(MVC) format. There is also room for improving the MPEG Transport
Stream (MPEG TS)
demultiplexer, where Müller said "lots of stuff is
happening," to the point where it can be confusing to follow.
GStreamer also still lacks support for playlists, which is a
very common feature that ends up being re-implemented by
applications.
In addition to the media formats, there are several additional
subtitle formats that the framework needs to support. But subtitle
support requires extension in other areas as well, such as porting all of
the subtitle elements to the new overlay compositor API, which allows
an application to offload compositing to the video hardware. A
related feature request is for a way to overlay subtitles at the native
resolution of the display hardware, rather than at the video content's
resolution. The two resolutions can differ, and, to be readable,
subtitles should be rendered at the sharpness provided by the display
resolution. The project also wants to expose more control over
subtitle rendering options to the application, again to provide
smarter choices and clearer rendering.
Hardware-accelerated rendering has taken major steps forward in recent
releases, but it, too, has room for improvement. Müller mentioned
NVIDIA's Video Decode and Presentation API for Unix (VDPAU) as needing
work, and said the libva plugin that implements Video Acceleration API
(VA-API) support needed to be moved to the "good" plugin module and be
used for playing back more content. He also said more work was
required on GStreamer's OpenGL support. Although OpenGL output is
possible, a lot could be done to improve it and make integration more
natural. For starters, all OpenGL-based GStreamer elements must
currently be routed through special glupload or
gldownload elements; being able to directly connect OpenGL
elements to other video elements would simplify coding for application
developers. Second, OpenGL coding is easier when operations remain in
a single thread, which conflicts with GStreamer's heavy use of
multi-threading. There is a long
list of other proposed OpenGL improvements, including numerous
changes to the OpenGL structures.
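To make that first point concrete, here is a minimal sketch of the
extra plumbing (the pipeline is hypothetical, and it assumes the
glupload and glimagesink elements from the GL plugin set are
installed):

    #include <gst/gst.h>

    /* Sketch only: a GL-based sink cannot currently be linked directly
     * to ordinary video elements, so buffers must pass through glupload
     * first. */
    int main(int argc, char *argv[])
    {
        GstElement *pipeline;

        gst_init(&argc, &argv);
        pipeline = gst_parse_launch(
            "videotestsrc ! glupload ! glimagesink", NULL);
        gst_element_set_state(pipeline, GST_STATE_PLAYING);
        g_main_loop_run(g_main_loop_new(NULL, FALSE));
        return 0;
    }

If GL elements could negotiate memory directly with other video
elements, the glupload stage in a pipeline like this one would simply
disappear.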
Refactoring, streamlining, and such
The project is also intent on playing better with other device form
factors, including set-top boxes and in-vehicle systems. In some cases,
there is already outside work that simply needs to be tracked more
closely — for example, developers of the MeeGo project's in-vehicle
infotainment (IVI) platform wrote their own metadata-extraction
plugin, which reportedly has excellent performance but
has not been merged into GStreamer. In other cases, the project will
need to implement entirely new features, such as Digital Living
Network Alliance (DLNA) functionality and other "smart TV" standards
from the consumer electronics industry.
GStreamer developers have plenty of room to improve the framework's
day-to-day functionality, too. Müller noted that the new GStreamer
1.0 API introduces reworked memory management features (to reduce
overhead by cutting down costly buffer copy operations), but that many
plugins still need optimization in order to fully take advantage of
the improvements. It is also possible that the project could speed up
the process of probing elements for their features by removing the
current GstPropertyProbe and relying on D-Bus discovery.
There is also room for improvement in stream switching. As he
explained, you certainly do not want to decode all eight audio tracks
of a DVD video when you are only listening to one of them, but when
users switch audio tracks, they expect the new one to start playing
immediately and without hiccups.
Some refactoring work may take place as well. A big target is the
gstreamer-plugins-bad module, which is huge in comparison to
the gstreamer-plugins-good and
gstreamer-plugins-ugly modules. Historically, the "good" module
includes plugins that are high-quality and freely redistributable, the
"ugly" module includes plugins with distributability problems, and the
"bad" module contains everything else not up to par. But plugins can
end up in -bad for a variety of reasons, he said — some
because they do not work well, others because they are missing
documentation or are simply in development. Splitting the module up
(perhaps adding a gstreamer-plugins-staging) would simplify
maintenance. The project is also considering moving its Bluetooth
plugins out of the BlueZ code base and into GStreamer itself, again
for maintenance reasons. Post-1.0 development will also allow the
project to push forward on some of its add-on layers, such as the
GStreamer Streaming Server (GSS) and GStreamer Editing Services (GES)
libraries.
Finally, there have been several improvements to the GStreamer
developer tools in the past year, including the just-launched SDK and
Rene Stadler's log visualizer. Collecting those utilities into some
sort of "GStreamer tools" package could make life easier for
developers. The project is committed to accelerating its development
cycle for the same reason: faster releases mean improvements get
pushed out to application authors sooner, and applications do not
stagnate on old releases. Müller announced that the project was
switching to a more mainstream N.odd unstable,
N.even stable numbering scheme, with the addendum that the
framework will stick to 1.x numbers until an ABI break forces a move
to 2.0.
Sandboxed streams
On a different note, Guillaume Emont presented a session about his
ongoing
experiments with sandboxing GStreamer media playback. The
principal use case is playing web-delivered content inside a browser.
The Internet may have been invented to watch videos of cute animals,
he said, but that does not mean you should trust arbitrary data found
online. In particular, untrusted data is dangerous when used in
combination with complex pieces of software like media decoders.
For media playback, the security risk stems from the fact that
although the decoder itself should not be considered evil and
untrustworthy (as one might regard a Java applet), the process
becomes untrustworthy when it must handle untrusted data. Thus,
GStreamer should be able to use the same decoder plugin on untrusted
and trusted content, but when handling untrusted content the framework
must have a way to initialize the player, then drop its privilege
level to isolate it.
Emont's work with sandboxing GStreamer playback started with setuid-sandbox,
a standalone version of the sandbox from Google's Chromium browser.
Setuid-sandbox creates a separate PID namespace and chroot for the
sandboxed process. Although it is not very fine-grained, Emont
thought it a good place to start and produced a working implementation
of a sandboxed GStreamer playback pipeline.
The pipeline takes the downloaded content as usual and writes it to a
file descriptor sink (fdsink element). When the
fdsink element reaches the READY state, the descriptor is
opened by an fdsrc element inside a setuid-sandbox, where the
content is demultiplexed and decoded into GStreamer buffers and handed
to a shmsink shared memory sink. The shmsink is the last
stage in the sandboxed process; outside the sandbox, the pipeline
accesses the shared memory and plays back the contents within. This
design sandboxes the demultiplexing and decoding steps in the
pipeline, which Emont said were the most likely to contain exploitable
bugs.
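In rough C terms, the design might look like the sketch below. The
element names are real GStreamer elements, but the URI, descriptor
number, and socket path are invented, and this is not Emont's actual
code:

    #include <gst/gst.h>

    /* Sketch of the split pipeline; illustrative only. In the real
     * design these three pieces run in separate processes: the first
     * and third in the trusted "broker", the second inside the
     * setuid-sandbox. */
    int main(int argc, char *argv[])
    {
        GstElement *feed, *decode, *play;

        gst_init(&argc, &argv);

        /* Broker: hand the raw, untrusted download to the sandbox. */
        feed = gst_parse_launch(
            "souphttpsrc location=http://example.com/a.ogv ! fdsink fd=3",
            NULL);

        /* Sandbox: demultiplex and decode (the risky steps), then
         * write the raw frames into shared memory. */
        decode = gst_parse_launch(
            "fdsrc fd=3 ! decodebin ! shmsink socket-path=/tmp/gst-shm",
            NULL);

        /* Broker: play the already-decoded frames; a working version
         * would also need explicit caps on shmsrc. */
        play = gst_parse_launch(
            "shmsrc socket-path=/tmp/gst-shm ! autovideosink", NULL);

        gst_element_set_state(feed, GST_STATE_PLAYING);
        gst_element_set_state(decode, GST_STATE_PLAYING);
        gst_element_set_state(play, GST_STATE_PLAYING);
        g_main_loop_run(g_main_loop_new(NULL, FALSE));
        return 0;
    }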
The playback pipeline worked, he said, but there were several issues.
First, he discovered that many GStreamer elements do not acquire all
of their resources by the time they reach the READY state,
though they do by the time they reach the PAUSED state that
follows. It might be possible to modify these elements to get their
resources earlier, he said, or to add an
ALL_RESOURCES_ACQUIRED signal. Next, he noted that the
memory created by the shmsink inside the sandbox could not be
cleaned up by the sandboxed process, but only by the "broker" portion
of the pipeline outside the sandbox. A more noticeable problem was
that sandboxing the decoder made it impossible to seek within the
file. Finally, the sandboxing process as a whole adds significant
overhead; Emont reported that a 720p Theora video would consume 30-40%
of the CPU inside the sandboxed pipeline, compared to 20-30% under
normal circumstances.
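The state sequence at issue looks like this from application code (a
sketch; the comment marks where the broker would hand the descriptor
to the sandbox):

    #include <gst/gst.h>

    /* Emont's broker opens the descriptor when fdsink reaches READY,
     * but many elements only acquire their resources on the
     * READY->PAUSED transition. */
    static void start_pipeline(GstElement *pipeline)
    {
        gst_element_set_state(pipeline, GST_STATE_READY);
        /* ... hand the file descriptor to the sandboxed process ... */
        gst_element_set_state(pipeline, GST_STATE_PAUSED);
        gst_element_set_state(pipeline, GST_STATE_PLAYING);
    }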
Some of the problems (such as the READY/PAUSED state
issue and the lack of seekability) might be solvable by sandboxing the
entire pipeline, he said, or by adding proxy elements to allow for
remote pipeline control. Either way, going forward there is still a
lot of work to do.
It is also possible that setuid-sandbox is simply not the
best sandboxing solution. There are others that Emont said he was
interested in trying out for comparison. He outlined the options and
their various pros and cons. Seccomp, for example, is even less
flexible, which probably makes it a poor replacement. On the other
hand, seccomp's new mode that combines
with Berkeley Packet Filters (BPF) provides a
much greater degree of control. It also has the advantage of being
usable without end-user intervention. SELinux, in contrast, could be
used to define a strict playback policy, but it is under the control
of the machine's administrators. GStreamer and application developers
could make suggestions for users, but ultimately SELinux is not under
the developers' control. Finally, Emont did his experiments on Linux,
but in the long term GStreamer really needs a sandboxing framework
that is cross-platform, and perhaps provides some sort of fallback
mechanism between different sandboxing options.
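To give a flavor of the seccomp/BPF mode, the sketch below installs a
minimal filter. The three-call whitelist is arbitrary and far smaller
than anything a real media sandbox would need:

    #include <stddef.h>
    #include <unistd.h>
    #include <sys/prctl.h>
    #include <sys/syscall.h>
    #include <linux/seccomp.h>
    #include <linux/filter.h>

    /* Allow read, write, and exit_group; kill the process on any other
     * system call. A real filter would also check seccomp_data.arch. */
    int main(void)
    {
        struct sock_filter filter[] = {
            /* Load the system call number. */
            BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                     offsetof(struct seccomp_data, nr)),
            BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_read,       2, 0),
            BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_write,      1, 0),
            BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_exit_group, 0, 1),
            BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
            BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        };
        struct sock_fprog prog = {
            .len = sizeof(filter) / sizeof(filter[0]),
            .filter = filter,
        };

        /* No-new-privs is what lets an unprivileged process confine
         * itself, hence "usable without end-user intervention". */
        prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
        prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);

        write(1, "sandboxed\n", 10);   /* allowed */
        return 0;                      /* exit_group: allowed */
    }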
Emont's work is still experimental, and more to the point he is not
conducting it as part of GStreamer's core development. But he did
make a good case for its eventual inclusion. Certainly any part of a
large framework like GStreamer has bugs and therefore the potential to
be exploited by an attacker. But isolating the un-decoded media
payload from the rest of the system already goes a long way toward
protecting the user. Like Müller's talk, Emont's presentation shows
that GStreamer may reach 1.0 soon, but it is still far from
"complete."
By Jonathan Corbet
September 6, 2012
UEFI secure boot is a much-discussed mechanism by which a system's firmware
will refuse to run a bootloader that is not signed with a recognized key.
Its stated purpose is to thwart boot-time malware; in the absence of
boot-time checks, it is said, suitably privileged code could hide itself
deeply within the system. In the real world, secure boot is also useful as
a platform lockdown mechanism. It now seems that secure boot will not be
used to lock down x86-based systems or to prevent them from running Linux;
the story on the ARM architecture is less encouraging. But, even on x86,
"running Linux" is not quite the same as running the Linux system you have
now; we are now beginning to see what kinds of changes will be needed to
fit Linux into the secure boot environment.
The problem, simply put, is this: the objective of secure boot is to
prevent the system from running any unsigned code in a privileged mode.
So, if one boots a Linux system that, in turn, gives access to the machine
to untrusted code, the entire purpose has been defeated. The consequences
could hurt both locally (bad code could take control of the machine) and
globally (the signing key used to boot Linux could be revoked), so it is an
outcome that is worth avoiding. Doing so, however, requires placing
limitations in the kernel so that not even root can circumvent the secure
boot chain of trust.
The form of those limitations can now be seen in Matthew Garrett's secure boot support patch set. These patches
may see some changes before finding their way into the mainline, but
chances are that their overall form will not evolve that much.
The first step is to add a new capability bit. Capabilities describe
privileged operations that a given process can perform; they vary from
CAP_DAC_OVERRIDE (able to override file permissions) to
CAP_NET_BIND_SERVICE (can bind to a low-numbered TCP port) to
CAP_SYS_ADMIN (can do a vast number of highly privileged things).
The new capability, called CAP_SECURE_FIRMWARE, enables actions
that are not allowed in the secure boot environment. Or, more to the
point, its absence blocks actions that might otherwise enable the running
of untrusted code.
Naturally, the first thing reviewers complained about was the name. It
describes actions that can be performed in the absence of "secure
firmware"; some reviewers have also disputed whether it has anything to do
with security in the first place. So the capability will probably be
renamed, though nobody has come up with an obvious replacement yet.
Whatever it is eventually called, this capability will normally be
available to privileged processes. If the kernel determines (by asking the
firmware) that it has been booted in the secure mode, though, this
capability will be removed from the bounding set before init is
run; once a capability is removed from that set, no process can ever obtain
it. Matthew's patch set also adds a boot-time parameter
(secureboot_enable=) that can be used to simulate a secure boot on
hardware that lacks that feature.
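The bounding-set primitive itself is not new, and it is reachable from
user space via prctl(). The sketch below drops an existing capability
(CAP_SYS_RAWIO stands in for the new bit, which exists only in the
patch set) and must run as root, since dropping requires CAP_SETPCAP:

    #include <stdio.h>
    #include <sys/prctl.h>
    #include <linux/capability.h>

    /* Once a capability is dropped from the bounding set, no process
     * in this subtree can ever reacquire it, not even by exec'ing a
     * setuid-root binary. */
    int main(void)
    {
        if (prctl(PR_CAPBSET_DROP, CAP_SYS_RAWIO, 0, 0, 0) != 0)
            perror("PR_CAPBSET_DROP");

        printf("CAP_SYS_RAWIO still in bounding set: %ld\n",
               (long) prctl(PR_CAPBSET_READ, CAP_SYS_RAWIO, 0, 0, 0));
        return 0;
    }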
In the secure boot world, processes lacking the new capability can no longer access
I/O memory or x86 I/O ports. Either of those could be used to convince a
device to overwrite the running kernel with hostile code using DMA, compromising the
system, so they cannot be allowed. One
consequence is that graphics cards without kernel mode setting (KMS)
support cannot be used; fortunately, the number of systems with
(1) UEFI firmware and (2) non-KMS graphics is probably countable
using an eight-bit signed value. Other user-space device drivers will be
left out in the cold as well. Someday, Matthew says, it may be possible to
enable I/O access on systems where the I/O memory management unit can
enforce restrictions on the range of DMA operations, but, for now, all such
access is denied.
Similarly, all write access to /dev/mem and /dev/kmem
must be disabled, even if the kernel configuration would otherwise allow
such access.
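In kernel terms, each of these restrictions amounts to a capability
check in front of the relevant operation. A hedged sketch of the
pattern (this is not Matthew's actual patch, and the function shown
here is hypothetical):

    #include <linux/capability.h>
    #include <linux/errno.h>

    /* CAP_SECURE_FIRMWARE comes from the patch set; it is not in
     * mainline headers, and the name may yet change. */
    static long allow_io_port_access(unsigned long from,
                                     unsigned long num)
    {
        /* Without the capability, privileged I/O is denied outright. */
        if (!capable(CAP_SECURE_FIRMWARE))
            return -EPERM;

        /* ... proceed to grant access to the requested port range ... */
        return 0;
    }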
The strongest comments came in response to another limitation — the
disabling of the kexec() system call. This call replaces the
running kernel with a new kernel and boots the result without going through
the system's firmware. It can be used for extra-fast reboots, though the
most common use, arguably, is to boot a special kernel to create a crash
dump after a system failure. Booting an arbitrary kernel obviously goes
against the spirit of secure boot, so it cannot be allowed.
Eric Biederman, in particular, complained about this limitation, saying:
This is Unix. In Unix we give root rope and let him hang himself
or shoot himself in the foot (not that we encourage it). Why are
we now implementing a security model where we don't trust root?
Matthew responded that, in fact, we can't
always trust root, and never have trusted it fully:
Because historically we've found that root is also often someone
who has determined a mechanism for running arbitrary code on your
machine, rather than someone you trust. Root and the kernel aren't
equivalent, otherwise root would just be able to turn off memory
protection in their userspace processes. This patchset merely
strengthens that existing dividing line.
In this case, the proper solution would appear to be to allow
kexec() to succeed if the target kernel has been properly signed.
That support has not yet been implemented, though. It's apparently on the
to-do list, but, as Matthew said: "We
ship with the code we have, not the code we want."
One other important piece of the puzzle, of course, is module loading; if
unsigned modules can be loaded into the kernel, the game is over. But, unlike
kexec(), module loading cannot simply be turned off, so the
implementation of some sort of signing mechanism cannot be put off. The
module signing implementation is not part of Matthew's patch set, though;
instead, David Howells has been working on the
problem for some time now. This code has been delayed as the result of
strong disagreements on how signing should be implemented; a solution was
worked out at the 2012 Kernel Summit and
this feature, in the form of a new patch set from Rusty Russell, should
find its way into the mainline as soon as the 3.7 development cycle.
The end result is that, by the time users have machines with UEFI secure
boot capabilities, the kernel should be able to do its part. Whether users
will like the result is another story. There is great value in knowing
that the system is running the software you want it to be running, and many
users will appreciate that. But others may find that the system is
refusing to run the software they want; that is harder to appreciate.
If things go well, the restrictions required by UEFI secure boot will come
to be seen like other capability-based restrictions in Linux: occasionally
obnoxious, but good for the long-term stability of the system and
ultimately circumventable if need be.
By Nathan Willis
September 6, 2012
At LinuxCon 2012, Bradley Kuhn, executive director of the Software
Freedom Conservancy (SFC), presented a session on funding free software
development. SFC's primary mission is to provide organizational and
legal support to free software projects, but it has also been
successful at raising funds to support development time — a task
that many projects find difficult.
Ancient history
Kuhn started the discussion with an account of his introduction to
free software, which began when he accidentally hit a key sequence in
Emacs that brought up the text of Richard Stallman's GNU Manifesto.
Reading the Manifesto was inspirational, said Kuhn, who has subsequently
pursued a career in free software — even serving as director of the
Free Software Foundation (FSF).
But on this occasion, he told the story not just as an
introduction, but also to point out an oft-overlooked section of the
document. Toward the end of the Manifesto, Stallman discusses several
possible alternatives to the proprietary software funding model.
Stallman argues that (contrary to the common objection that "no one
will code for free") free software
will always have developers; they will just earn smaller salaries than
they would writing proprietary software. He cites examples of people
who take jobs writing software in not-for-profit situations like MIT's
Artificial Intelligence Lab, and says that free software is no
different. Developers do tend to move to higher-paying jobs where they
can work on the same projects, he said, but there are many who write
free software out of commitment to the ideals.
Stallman suggests several alternative funding models under which
developers could make money working on free software. One is a
"Software Tax" in which software users each pay a small
amount into a general National Science Foundation (NSF)-like fund that
makes grants to developers. Another is that hardware manufacturers
will underwrite porting efforts; a third is that user groups will
form and collect money through dues, then pay developers with it.
Few people remember it, Kuhn said, but in the early days FSF itself
functioned much like one of the user groups Stallman describes in the
Manifesto. It accepted donations and directly paid developers to work
on GNU software. A long list of core projects, including GNU Make,
glibc, and GDB, were originally written by paid FSF employees. It
was only later, as these original developers took jobs working on free
software at companies like Red Hat and Google, that FSF turned its
primary attention to advocacy issues.
The non-profits
Today, Kuhn said, the majority of free software is written by
for-profit companies. Although that situation is a boon for free
software, the resulting code bases tend to drift in the direction of
the company's needs. He then quoted Samba's Jeremy Allison (a Google
employee) as saying "It's the duty of all Free Software
developers to steal as much time as they can from their employers for
software freedom." Since not everyone is in a position to
"be a Jeremy," Kuhn said, some developers need to be
funded by non-profit organizations in order to mitigate the risks of
for-profit control.
But proliferation of free software non-profits can be detrimental: it
confuses users, and each organization has administrative overhead
(boards, officers, and legal filings) that can steal time from
development. There are several "umbrella" non-profits that attempt to
offload the administrative overhead from the developers, including
the Apache Software Foundation (ASF), Software in the Public Interest
(SPI), and the SFC.
In addition to the administrative and legal functions of these
organizations, each has some mechanism for funding or underwriting
software development for its members. Donations to the ASF go into a
general fund, from which individual member projects can apply for
disbursement for specific work. SFC and SPI use a different model, in
which each member project has separate earmarked funds.
Most of SFC's disbursement goes toward funding developer travel to
conferences and workshops, Kuhn said. It also handles financial
arrangements for conference organizing, Google Summer of Code, and
other contracts, but the most interesting thing it does is manage
paid contracts for software developers. Typically these contracts are
fixed-length affairs that raise targeted funds for the contract
through donation drives — as opposed to, for example, earmarking
funds that accumulate through an ongoing donation button on the
project's web site.
Fundraising successes
Kuhn recounted several recent success stories from different SFC
member projects. The first was the Twisted engine for Python. Back
in 2008, the project was confronted with a familiar scenario: it was
successful enough that many core developers got high-paying jobs
doing Twisted consulting work, which in turn led to bit-rot of core
functionality. The project decided to hold a fundraising drive, and
collected enough donations to pay founder Jean-Paul Calderone to work
for two years on bug-squashing, integration, and maintenance of the
core — work that was vital to the project, but not exciting
enough for the typical corporate Twisted user to fund as a full-time
position.
In 2010, SFC did a similar fundraising drive to pay Matt Mackall to
maintain the Mercurial source code management system. Mackall said he
was able to support himself full-time on Linux kernel-space
development, but that it was hard to repeatedly "context
switch" to Python userspace and work on Mercurial. The SFC
fundraising drive funded Mackall full time from April 2010 through
June 2012.
The PyPy Python interpreter project launched three successful
fundraising initiatives in one year to support specific development
projects. The initiatives for PyPy's Py3k implementation of Python 3
and its port of the Numpy scientific computing package each raised
$42,000 in drives held a month apart in late 2011. The project has
also raised more than $21,000 and counting this year to fund
development of software transactional memory support. Kuhn related
that he had been concerned at one point that the frequency of the
fundraising drives would wear out the potential donor pool, but the
project forged ahead, and SFC is now funding four PyPy developers.
Fundraising challenges
A member of the audience asked what SFC thought about using
Kickstarter for fundraising, to which Kuhn replied "who is going
to Kickstarter for Python stuff who isn't also reading your
blog?" PyPy's recent success, he explained, probably owes more
to the fact that PyPy is a hot commodity in Python circles right now.
It has little trouble finding donors as a result, but by raising the
funds through drives hosted at its own site, it avoids having to pay
Kickstarter or another broker a potentially hefty cut of the
donations.
The tough part, he continued, is what to do when you are no longer on
top of the popularity bubble. Free software has a big "I gave
at the office" problem, he said. Many of free software's most
passionate users (and thus potential donors) already spend their own
time working on free software. Consequently, they react to
fundraising efforts with questions like "I code all day long, now you
want me to give money, too?"
Kuhn did not offer any simple solutions to the ongoing fundraising
problem, perhaps because there are none. Like Yorba, SFC is interested
in exploring the possibility of funding free software projects, which
makes Kuhn's report on SFC's successes an interesting counterpart to
Yorba director Adam Dingle's examination of other funding methods.
It is clear that SFC's success stories differ from generic Kickstarter
or bounty-style drives in a few key respects. First, they are tied to
funding work by well-known contributors with good standing in the
projects — often key maintainers. Second, they are
tied to a development contract of specific length. But the drives
differ from one another in other ways: while the PyPy initiatives were
also tied to a specific feature set, the Twisted and Mercurial drives
were done to fund the harder-to-price tasks of bug fixing and routine
maintenance. Free software development is not a homogeneous
process, so there is certainly no one-size-fits-all answer to the
fundraising question. But it is reassuring to know that organizations
like SFC (with its commitment to software freedom) can still find
success where money is involved.