Building APIs on top of GStreamer

By Nathan Willis
October 23, 2013

GStreamer Conference

The GStreamer multimedia framework is around twelve years old, but it made its official 1.0 release only last year. The 1.0 milestone was, among other things, a statement of the project's commitment to keeping the API and ABI stable throughout the 1.x cycle. As several talks at the 2013 GStreamer Conference demonstrated, however, such stability does not mean that there is nothing left to be done in order to make GStreamer an appealing framework for developers. Instead, several ancillary projects have taken on the task and are building higher-level APIs meant to attract further development.

Getting nonlinear

One of the higher-level API projects is GStreamer Editing Services (GES), which is a framework designed to support nonlinear video editing. There have been several GStreamer-based nonlinear editors (NLEs) in the past—most notably PiTiVi—but even fans of those applications would have to admit that they have not experienced the same level of (relative) popularity as seen by GStreamer-based media players.

Within the GStreamer community, the generally accepted reason for this disparity is that GStreamer itself is optimized for tasks like playback, capture, and transcoding. NLEs, while they obviously need to make use of such functionality, have their own set of primitives, such as video effects and transitions between two (or more) tracks.

GES is a framework that implements NLE functions. Mathieu Duponchelle and Thibault Saunier spoke about it on the first day of the conference. While GES is now used as the NLE framework for PiTiVi, the two said, it is intended to serve as a general-purpose framework that developers can use to add video editing functionality to their own applications or to build other editors.

In fact, they explained, GES itself is a wrapper around another intermediary framework, GNonLin. The goal is for GES to offer just the higher-level NLE APIs, while GNonLin serves as the glue between GES and GStreamer itself. As such, GNonLin implements things like the GnlObject base class, which adds properties that base GStreamer elements lack: duration, start and stop positions, and an adjustable output rate (for speeding up or slowing down playback without altering the underlying file). Similarly, the GnlOperation class encapsulates a set of transformations on GnlObjects, as is needed to define filters and effects.

GES, in turn, defines the objects used to build an editing application. The most important, they said, is the idea of the GESTimeline. A timeline is the basic video editing unit; it contains a set of layers (GESLayers in this case) stacked in a particular priority. GESLayers contain audio or video clips, and the timeline can be used to re-arrange them, to change the layer stacking order, and to composite layers together. But ultimately a GESTimeline is just a GStreamer element, the speakers said, so it can be used like any other element: its output can be plugged into a pipeline, which makes it easy for any NLE to output video in a supported GStreamer format or to send it to another application.
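To make the timeline-and-layers model concrete, here is a minimal sketch in plain Python. It does not use GES at all; the class names (Clip, Layer, Timeline), the visible_stack() helper, and the convention that a lower priority number means a layer closer to the top are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    name: str
    start: float      # position on the timeline, in seconds
    duration: float   # length of the clip, in seconds

    def covers(self, t):
        """True if the clip is active at timeline position t."""
        return self.start <= t < self.start + self.duration


@dataclass
class Layer:
    priority: int     # assumption: lower number = closer to the top
    clips: list = field(default_factory=list)


@dataclass
class Timeline:
    layers: list = field(default_factory=list)

    def visible_stack(self, t):
        """Names of the clips active at time t, topmost layer first."""
        ordered = sorted(self.layers, key=lambda layer: layer.priority)
        return [c.name for layer in ordered for c in layer.clips if c.covers(t)]


title = Layer(priority=0, clips=[Clip("title", 0.0, 3.0)])
video = Layer(priority=1, clips=[Clip("intro", 0.0, 5.0), Clip("scene", 5.0, 10.0)])
timeline = Timeline([video, title])

print(timeline.visible_stack(1.0))   # ['title', 'intro']
print(timeline.visible_stack(6.0))   # ['scene']

# Changing the stacking order is just a matter of changing priorities.
title.priority, video.priority = 1, 0
print(timeline.visible_stack(1.0))   # ['intro', 'title']
```

A real compositor would blend the frames of every clip in the returned stack; the sketch only shows how stacking priority decides the order in which they are combined.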

GES also defines several features of interest to NLE users, they said. First, it has a high-level Effects API, which is a wrapper around GStreamer filter elements. The Effects API exposes features necessary for using video effects in an editor, such as keyframes. Keyframes are control points in a media track, where the user can set a property of interest (for example, audio volume). GES will automatically interpolate the property's value between keyframes, allowing smooth changes. GES also implements some of the most common transition effects, like cross-fading and screen swipes, making those effects trivial to use. The previous version of GES was not nearly as nice, they said; it required the user to manually create and delete even simple transition effects.
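The keyframe idea can be illustrated with a few lines of plain Python. This is only a sketch of linear interpolation between control points, not GES's implementation; the interpolate() function and the (time, value) tuple layout are assumptions chosen for the example, and real editors typically offer other interpolation modes as well.

```python
from bisect import bisect_right


def interpolate(keyframes, t):
    """Linearly interpolate a property value at time t.

    keyframes is a list of (time, value) pairs sorted by time.
    Before the first keyframe and after the last one, the value is held.
    """
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    times = [time for time, _ in keyframes]
    i = bisect_right(times, t)        # index of the first keyframe after t
    if i == 0:
        return keyframes[0][1]        # before the first keyframe
    if i == len(keyframes):
        return keyframes[-1][1]       # at or after the last keyframe
    (t0, v0), (t1, v1) = keyframes[i - 1], keyframes[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Audio volume: fade in over two seconds, hold, fade out over the last two.
volume = [(0.0, 0.0), (2.0, 1.0), (10.0, 1.0), (12.0, 0.0)]

print(interpolate(volume, 1.0))    # 0.5
print(interpolate(volume, 5.0))    # 1.0
print(interpolate(volume, 11.0))   # 0.5
```

The user sets only the four control points; every value in between is computed on demand, which is what makes the volume change smooth instead of stepped.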

GES's other editing APIs include an advanced timeline editing API, which implements trimming a clip, "rippling" a timeline (which shifts all of the clips further down the timeline whenever a change is made to a clip earlier in the timeline), and "rolls" (switching instantly between two clips on different tracks). GES attempts to implement the most-often-used features by default, so for instance it automatically rescales imported clips to be the same size, but this behavior can be manually switched off when not needed. There is also a titling API, which overlays a text layer on top of a video track.
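As a rough illustration of what rippling does, the following plain-Python sketch trims the end of one clip and shifts every later clip by the same amount. It models a single track and is not GES code; the Clip class and the ripple_trim_end() function are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    start: float      # position on the timeline, in seconds
    duration: float   # length of the clip, in seconds


def ripple_trim_end(clips, index, new_duration):
    """Set the duration of clips[index] and ripple the change downstream.

    Every clip that started at or after the old end of the trimmed clip
    is shifted by the change in duration, so no gap or overlap appears.
    Returns the shift that was applied.
    """
    if new_duration <= 0:
        raise ValueError("duration must be positive")
    clip = clips[index]
    old_end = clip.start + clip.duration
    delta = new_duration - clip.duration
    clip.duration = new_duration
    for other in clips:
        if other is not clip and other.start >= old_end:
            other.start += delta
    return delta


track = [Clip("a", 0.0, 4.0), Clip("b", 4.0, 6.0), Clip("c", 10.0, 2.0)]
ripple_trim_end(track, 0, 3.0)     # shorten "a" by one second
print([(c.name, c.start) for c in track])
# [('a', 0.0), ('b', 3.0), ('c', 9.0)]
```

Without the ripple step, shortening the first clip would leave a one-second hole before the second; an editing API that handles this saves each application from re-implementing the bookkeeping.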

GES is currently at version 1.1.90 and should reach 1.2.0 shortly; that release will be compatible with GStreamer 1.2 (which was released September 25). It represents nearly two years of work, they said, and although they are doing a lot of testing and debugging, GES naturally needs testing on real-world media clips in order to uncover all of its bugs. They have an integrated test suite that tests a lot of media formats (input and output) and the various effects, but real-life scenarios are often quite different.

PiTiVi is meant to be a general-purpose NLE, they said, but there are several different editing scenarios they hope GES will be useful for, such as an online video editor (perhaps a collaborative one) and a live-editing NLE for use with broadcasting. GES should also be useful for any GStreamer application that needs to mix video tracks; even if you just have two tracks, they said, mixing them in GES will be easier than doing it in lower-level GStreamer pipelines.

The work on GES is not finished, they said. Things still on the to-do list include playback speed control (the first implementation of which is being worked on by a Google Summer of Code student), automatic hardware acceleration (which is scheduled for GStreamer itself), nesting timelines (for combining multiple scenes that are edited separately into a longer finished product), and proxy editing (where a low-resolution version of a video is used in the editor but the high-resolution version is used for the final output). The latter two features are important for high-end video work.

Playback made simple

In contrast to GES, which has been developed in the open for several years, the other new GStreamer API layer discussed at the conference was Fluendo's new media player API, FluMediaPlayer, which is not open source ... yet. As Julien Moutte explained it in his session, the goal of the player is to fill in a missing piece that keeps GStreamer from being used in more third-party projects.

Ultimately, Fluendo wants world domination for GStreamer, Moutte said. So when the company sees a recurring problem, it wants to do something to fix it. Consequently, Fluendo has been spending time at developer events for Android, Windows, and OS X, the platforms where GStreamer is available but not dominating. One of the most common problems encountered by people who incorporate GStreamer into their products is that they want to use GStreamer's powerful playback functionality to put a video on screen, but they do not want to take a course or spend a lot of time learning GStreamer internals to do so. In other words, GStreamer needs to improve its "ease of integration" offerings with simple, high-level APIs.

Of course, there is already an abstraction intended to do drop-it-in-and-run playback: playbin. But using playbin still requires developers to learn about GStreamer events, scheduling, and language-specific bindings, he said. There are some other good options, such as the Bacon Video Widget, but it is very GTK+-specific and is GPL-licensed, which means many third-party developers will not use it.

Fluendo's solution is FluMediaPlayer, a new player library that implements the same feature set as playbin2 (the current incarnation of playbin), and is built on top of the GStreamer SDK. The SDK is a bit of a controversial topic on its own; it was created by Fluendo and Collabora specifically to target third-party developers, many of whom are on proprietary operating systems. It is also not up-to-date with the latest GStreamer release (relying instead on GStreamer 0.10), but Moutte said the company intends to handle the transition from 0.10 to 1.x transparently with FluMediaPlayer. The player also adds some new features, such as platform-specific bindings and the ability to connect playback to a DRM module.

FluMediaPlayer uses a simple C API; there are no dependencies to worry about, Moutte said, "just use the header." The player object takes a URI as input, creates the output media stream, and listens for a simple set of events (_play, _stop, _close, etc.). Media streams themselves are created by the player on demand (alleviating the need for the developer to set up parameters by hand), newer streaming protocols like DASH are supported, and multiple players can be run simultaneously, each with its own controls and its own error events.

Moutte then said that his aim for the talk was to get feedback from the GStreamer community: was this a good approach, for example, and if Fluendo were to open the source code up, would others be interested in participating? The company was already moving forward with the product, Moutte said, but he hoped to take the temperature of the GStreamer community and make a case to management for releasing the source. By a show of hands, it seemed like most people liked the approach, but opinion was more divided about participating. One audience member observed that several of the features Moutte had described should be landing in upcoming GStreamer releases, which makes the player seem less appealing. Another commented that the audience present might not be the best group to ask—after all, it is a self-selected group of people who are quite comfortable digging into GStreamer itself. Ideally, such a player would draw in new developers not already working with GStreamer.

Of course, the FluMediaPlayer product can certainly coexist with other GStreamer initiatives. Furthermore, if GStreamer itself does implement several of the higher-level features built into FluMediaPlayer, that will not reduce the appeal of the product to outside developers, but will simplify Fluendo's maintenance. There does seem to be general agreement that GStreamer itself is technically sound at this point in its history, and that the next big hurdle to overcome is building a layer of services on top of it: services that, up until now, many GStreamer users have had to re-implement on their own.

[The author would like to thank the Linux Foundation for travel assistance to Edinburgh for GStreamer Conference 2013.]

Copyright © 2013, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds