GStreamer has come a long way in the ten-plus years it has been in
existence. The push is on to finally put out a 1.0 release, possibly
before the end of this year, while work has not stopped on the existing
0.10.x code base. In two keynotes from the second-ever GStreamer conference,
which was held in Prague October 24-25, Wim Taymans outlined the future, while
Tim-Philipp Müller looked at developments in 0.10.x.
History and background
Taymans started his talk with a bit of history and background. GStreamer
is a library that makes it easy to create multimedia applications. It does
so by providing a pipeline architecture with multiple components that can
be plugged together in many different ways, which is what provides
GStreamer with its "power and flexibility", he said.
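The pipeline idea can be illustrated with a toy model in plain Python. This is a sketch of the concept only and does not use the GStreamer API; the Element class, its link() and push() methods, and the three example stages are hypothetical names chosen for illustration.

```python
# Toy model of a pipeline built from pluggable elements.
# Hypothetical names throughout; this is not the GStreamer API.


class Element:
    """One processing stage: 'func' transforms a buffer before it moves on."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.downstream = None  # next element in the chain, if any

    def link(self, other):
        """Connect this element's output to 'other'.

        Returns 'other' so that several links can be chained in one line.
        """
        self.downstream = other
        return other

    def push(self, buf):
        """Process one buffer and hand the result to the next element."""
        out = self.func(buf)
        if self.downstream is not None:
            return self.downstream.push(out)
        return out


# Three made-up stages: turn bytes into samples, halve the volume,
# and collect whatever arrives at the end of the chain.
received = []
decoder = Element("decoder", lambda data: list(data))
volume = Element("volume", lambda samples: [s // 2 for s in samples])
sink = Element("sink", lambda samples: received.append(samples) or samples)

decoder.link(volume).link(sink)
result = decoder.push(bytes([10, 20, 30]))
```

The same elements could be linked in a different order, or with a different middle stage, without changing any of them, which is the kind of flexibility Taymans described.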
Applications that use GStreamer have generally set the tone for the
direction of the framework. GStreamer came into the GNOME project in 2002
with the Rhythmbox music player, and the audio capabilities worked well at
that time. Video was a different story, but by 2004, the Totem video
player ensured that the video in GStreamer worked, Taymans said.
Since GStreamer is a library, it can be integrated with many different
applications, frameworks, toolkits, development environments, and so on.
Whatever gets integrated with GStreamer gains "many new
features". That has led to browsers that use GStreamer for playing
<video> and <audio> tags. In addition,
GStreamer is integrated with (and ported to) Android to bring its
capabilities to that
mobile platform. Integration is the "strong point of
GStreamer", he said.
Taymans then went through all of the different types of applications that
GStreamer is being used for, some where it shines and other areas where it
still needs some work. For example, transcoding is in good shape; he
pointed to Transmageddon as one
example. Communication tools that provide voice and video calls, such as Empathy, are another area where
GStreamer works well.
On the other hand, complicated audio production, like
what Buzztard is doing, is an area that
needs more work. Streaming and media distribution are areas for
improvement as well. Various companies are using parts of GStreamer to do
media distribution, but it is "not quite there yet", he said.
Overall, GStreamer is "everywhere" in the multimedia space,
but there is plenty more to be done.
Getting to 1.0
Taymans then turned to the future and wanted to "look at what needs
to be done to get to 1.0". A 1.0 release has been discussed since
2007 or 2008, he said, but there has been a problem: 0.10 (and earlier) has been "doing
things very well", so there was a great temptation to just keep
extending that code.
But there are "new challenges", Taymans said. There are
things that can be done with 0.10 but aren't easy or optimal, partly
because they weren't envisioned back when the project started. Embedded
processors aren't really a new challenge, but
GStreamer could be better on today's embedded hardware, especially in the
areas of power consumption and offloading work to other specialized
processors (like DSPs) that are often available.
Video cards with graphical processing units (GPUs) are another resource
that could be used better by the framework. Moving video decoding to GPUs
when possible would help with power consumption as well. It can be done
with 0.10, but "it's hard to do", Taymans said. Other things
like video effects could be off-loaded to the GPU. Effects can be
done on the main CPU, and are today, but it consumes more power and cannot
handle higher resolution video.
Memory management is another area where improvements are needed. GPUs have
their own memory, so
the challenge is to avoid copying memory "from the CPU to GPU and back again and
again". It is important for performance and GStreamer doesn't currently
handle the situation well, Taymans said.
The formats of video in memory
can't be specified flexibly enough to match what the hardware is
producing or consuming. That means that GStreamer elements have to
copy the data to massage it into the right format. Better
memory management leads to better integration with the hardware and
increased performance, all of which leads to better battery life, he said.
Dynamic pipelines, where the elements in a multimedia pipeline are removed
or changed, are another feature that is possible with 0.10, but difficult
to do correctly. Examples that Taymans gave include the Cheese webcam
application, which can apply various effects to the video data; those effects
can be added or changed on the fly. Another example is the PulseAudio
pass-through element, which may be decoding mp3 data in software when a
Bluetooth headset is plugged in. The headset can do the decoding in
hardware, so the pipeline should be changed to reflect that. There are many
more examples of dynamic pipelines, he said, but making them work right has
been difficult for application developers—something that should
change in 1.0.
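To make the headset example concrete, here is a toy model of swapping a stage while data keeps flowing. It is a sketch in plain Python, not GStreamer code: the Pipeline class, its replace() method, and the two stage functions are hypothetical, and the single lock stands in for the far more involved coordination a real multi-threaded framework needs.

```python
# Toy model of changing a pipeline between buffers.
# Hypothetical names throughout; this is not the GStreamer API.
import threading


class Pipeline:
    """An ordered list of stages that can be modified while in use."""

    def __init__(self, stages):
        self._stages = list(stages)
        self._lock = threading.Lock()

    def replace(self, index, new_stage):
        """Swap one stage for another; takes effect for the next buffer."""
        with self._lock:
            self._stages[index] = new_stage

    def push(self, buf):
        """Run one buffer through a consistent snapshot of the stages."""
        with self._lock:
            stages = tuple(self._stages)
        for stage in stages:
            buf = stage(buf)
        return buf


def software_decode(buf):
    """Pretend to decode mp3 data to raw audio on the CPU."""
    kind, payload = buf
    return ("pcm", payload) if kind == "mp3" else buf


def passthrough(buf):
    """Leave the data compressed so the output device can decode it."""
    return buf


pipeline = Pipeline([software_decode])
outputs = [pipeline.push(("mp3", n)) for n in range(2)]

# The headset is plugged in: stop decoding in software.
pipeline.replace(0, passthrough)
outputs += [pipeline.push(("mp3", n)) for n in range(2, 4)]
```

Each buffer sees either the old stage or the new one, never a mixture. Getting that guarantee when buffers are in flight on several threads is the part that application developers have found hard.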
The 0.11 branch (which will eventually become 1.0) opened after last year's
GStreamer conference (in
Cambridge, UK in October 2010). By June it had "everything
needed" to start porting elements (aka plugins) and applications to
use the new features. A 0.11 release was made in August, followed by
0.11.1 in September.
There are "many many cleanups" in the 0.11 branch, Taymans
said, "too many to list". Over the years, the API has
accumulated lots of "crazy" things that needed changing.
Methods were removed, signatures changed, parameters removed, and so on.
In fact, Taymans said that they "implemented everything we said we
would do at the last GStreamer conference—more actually".
Part of the "more" was the memory management changes which were not fully
thought out a year ago.
The core, base, and FFmpeg parts of GStreamer are 100% working and passing
the unit tests in 0.11. The "good", "bad", and "ugly" sets of plugins are
respectively 68%, 76%, and 11% ported to the new API. The video mixer is
not yet ported as the team "focused first on [video] playback so we
can start shipping something". Lots of other applications have been
ported, but there is still plenty to do. Many of the changes, both for
plugins and applications, can be done semi-automatically, Taymans said.
All of that puts the project on track for a 1.0 release, possibly later this
year, though that will all be worked out at the conference, he said.
Not all of the elements will be ported in that time frame, as doing one per
day would finish the job "sometime in June". So there will
need to be a prioritization of the plugins and applications that get ported
before the release. "People who want their application to work are
welcome to help porting", Taymans said.
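The "sometime in June" estimate implies a rough count of elements still to be ported, a number the talk did not give. As a back-of-the-envelope check, the sketch below assumes the clock starts on the first day of the conference (October 24, 2011, inferred from the dates mentioned in this article) and that "June" means any day in June 2012.

```python
# Back-of-the-envelope check of the "one element per day" estimate.
# The start date and the June 2012 window are assumptions, not figures
# from the talk.
from datetime import date

start = date(2011, 10, 24)    # assumed: first day of the conference
earliest = date(2012, 6, 1)   # assumed: earliest reading of "sometime in June"
latest = date(2012, 6, 30)    # assumed: latest reading of "sometime in June"

# At one element per day, the number of days equals the number of elements.
min_elements = (earliest - start).days
max_elements = (latest - start).days

print(f"Implied backlog: {min_elements} to {max_elements} elements")
```

Under those assumptions the backlog would be somewhere between roughly 220 and 250 elements, which is why prioritization is needed.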
The plan is
for 0.10 and 1.0 to co-exist; both can be installed and applications will
choose the right one for their needs. That will allow for a gradual
transition to 1.0 over "half a year or maybe a year".
Taymans would like to see 1.0 make it into the next releases of the major
distributions along with some applications ported to 1.0, specifically
mentioning Cheese and Totem.
Back on the 0.10 branch
On day two of the conference, Tim-Philipp Müller updated attendees on
what has been going on in the stable branch and noted that "almost
everything applies to the new branch" because it will all be merged
over. By way of introduction, Müller said that his work for the
project has been
"managing releases — or not — trying anyway". He
also likened GStreamer to Lego, noting that it allows applications to
connect up all kinds of complicated "multimedia handling"
pieces to provide the right experience for its users.
The 0.10 series has been API stable for five or six years, he said, but
0.11 changes the API to fix some longstanding problems. Both can be
installed together with some applications using the old and some the new,
which "works fine". There have been "some possibly
interesting changes" that have gone on in the 0.10 series and since
not everyone can follow bugzilla, mailing lists, and IRC, Müller's talk
was meant to fill them in.
He presented some statistics on the development of 0.10 since last year's
conference. There were 962 bugzilla bugs fixed and 2747 files changed.
That resulted in 350K lines inserted and 100K deleted in a code base of 1.4
million lines of C and C++ code. Müller said that he was surprised to
find that there had been 201 unique patch contributors in that time.
Since Taymans has been working on the new development branch, development
in the core has slowed down, Müller said, but there are some
significant new core library features. The gst_pad_push()
function (which pushes out a buffer to the next stage in the pipeline) has
been made lockless by using atomic operations. Previously, multiple locks were needed.
Progress and quality of service (QoS) messages have been added so that
elements can post status information to applications. Progress messages
could be the
status of a data transfer or a mounting operation. QoS messages allow the
internal QoS events (which indicate if a frame was dropped, rendered too
late, or too early) to propagate out to applications, which will allow them
to display statistics.
There have been new base classes added, for the obvious advantage of code
reuse, he said. There is a need for APIs that work for all of the
different use cases and are reasonably CPU-efficient, rather than having
each plugin or element write its own. The reason that it makes sense to
add new base classes so close to the end-of-life for 0.10 is that it will
make porting to 1.0 easier because some of the new code can be used early,
then you can "go to 1.0 for free", he said.
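The code-reuse argument can be shown with a small, generic example. The sketch below is plain Python with hypothetical class names (BaseFramer, FixedSizeFramer, LengthPrefixedFramer); it does not mirror any actual GStreamer base class. The base class owns the buffering loop, and each subclass supplies only the format-specific decision.

```python
# Toy illustration of sharing common logic in a base class.
# Hypothetical names throughout; these are not GStreamer classes.


class BaseFramer:
    """Shared logic: accumulate incoming bytes and emit complete frames."""

    def __init__(self):
        self._pending = b""

    def frame_size(self, data):
        """Return the size of the first complete frame in 'data',
        or None if more data is needed. Subclasses implement this."""
        raise NotImplementedError

    def feed(self, chunk):
        """Add a chunk of input and return the list of complete frames."""
        self._pending += chunk
        frames = []
        while True:
            size = self.frame_size(self._pending)
            if size is None:
                break
            frames.append(self._pending[:size])
            self._pending = self._pending[size:]
        return frames


class FixedSizeFramer(BaseFramer):
    """Every frame has the same, positive size."""

    def __init__(self, size):
        super().__init__()
        if size <= 0:
            raise ValueError("frame size must be positive")
        self.size = size

    def frame_size(self, data):
        return self.size if len(data) >= self.size else None


class LengthPrefixedFramer(BaseFramer):
    """Every frame starts with one byte giving the payload length."""

    def frame_size(self, data):
        if not data:
            return None
        total = 1 + data[0]
        return total if len(data) >= total else None


fixed = FixedSizeFramer(4)
fixed_frames = fixed.feed(b"abcdef") + fixed.feed(b"gh")

prefixed = LengthPrefixedFramer()
prefixed_frames = prefixed.feed(b"\x02hi\x03y") + prefixed.feed(b"es")
```

Neither subclass has to handle partial input itself, so a fix to the shared loop benefits both, which is the benefit Müller attributed to the new base classes.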
GstBaseParse is a new class to move the code that
parses the "caps" (capabilities of elements), which
has been repeated in many different places. The quality of the parsers
has increased dramatically because of the change, but they have already
seen some drawbacks in the new base class. That may result in a change to
the class or adding a new GstBaseParse2.
GstVideoEncoder and GstVideoDecoder are new base classes
to move some of the repeated code into one place. Right now, they live in
the "bad" plugins, and need some API changes before moving into "good".
Likewise, GstCollectPads2 (for notification of buffer
availability) and GstAudioVisualizer are new base classes.
There are some base classes that may be added in the future including
GstMuxer and GstDemuxer to encapsulate multiplexing (and
de-multiplexing) functionality. Some "proper filter base
classes" and an N-to-1 synchronization base class (for syncing
various streams like video and a text overlay) are also possibly on the horizon.
There is also a "new and improved high-level API", Müller
said. The playbin2 high-level playback API has some additional
features including progressive download, new buffering methods for network
streams, and support for non-raw streams. camerabin2 replaces
camerabin as the camera input high-level API. The latter
"wasn't bad, but was tricky to use". camerabin2 is a
new design with new features including support for things like viewfinder
streams, snapshotting, and other new hardware capabilities.
GstDiscoverer is a new "fast(ish)" metadata
extraction API. It allows queueing files of interest and will give the
stream topology, which allows better decisions regarding decoding,
transcoding, and transmuxing. There are still some issues with it, he
said, including getting an accurate duration for a stream without decoding
it. There is also no way yet to tell an application whether the duration
reported is an
estimate and how good an estimate it is.
Müller also mentioned the GStreamer
Editing Services (g-e-s), which is a "very nice API" that is
used by the PiTiVi video editor. The GStreamer RTSP server (which is
really a library, he said) allows you to "share your
video camera or stream your files". There are Python and Vala
bindings for the library and it comes with an example program that means
that you can do those things "really easily".
The "middleware" layer (which is the "plumbing" of GStreamer)
has also seen improvements, especially in the areas of parsing (using the
new base class) and moving parsing out of decoders and into parsers that
get plugged in automatically to handle metadata extraction. The PulseAudio
pass-through was mentioned again as part of the middleware changes. In
addition, Orc code
has been added to produce "just in time" machine code for things like video
scaling and conversion.
Replacing the FFmpeg color space code is currently a work in progress. It
was forked from the FFmpeg code six years ago and GStreamer should probably
apologize to the project because the fork gives FFmpeg "a bit of a
bad name" because of how old it is, he said. Support for the VA API and VDPAU video
acceleration interfaces is also under development and will eventually be plugged in
automatically on systems with video cards that support those APIs. Lastly, the wrapper for
OpenMAX (which is a multimedia framework for embedded devices including
Android) is being worked on. Müller said that no matter what
GStreamer folks might think of OpenMAX, "it's there and people want
to use it".
There were a number of other future features that he listed: playlist
handling (so that each application doesn't need to do
its own), proper chapter and track handling, and device discovery and
probing.
3D video is something that can currently be done with 0.10, but it doesn't
integrate well with hardware and other components and will likely be worked
on for 1.0.
Müller wrapped up his talk with a vote of confidence in both the 0.10
and 0.11 branches. He thinks that people will fairly quickly be switching
away from 0.10 once 1.0 is out, which is in keeping with what was seen at
the last release that changed the API (from 0.8 to 0.10). He said that Taymans had
"undersold 1.0 yesterday" in his talk, "it's gonna
rock". All in all, it seems that the GStreamer project is very
healthy and making strides in lots of interesting directions.
[ I would like to thank the Linux Foundation for travel assistance to
attend "conference week in Prague". ]