November 2, 2011
This article was contributed by Nathan Willis
Geoffroy van Cutsem from Intel presented
work on the Unified Multimedia
Service (UMMS) on the first day of LinuxCon Europe. UMMS is a
high-level abstraction layer for audio/video operations, which is meant to provide an API for application developers that is independent of both the playback engine used on the back end and of the output target. Van Cutsem described it as analogous to CUPS for printing or SANE for scanners.
UMMS was initially developed for the MeeGo Smart TV user experience (UX) and, although it will be developed from this point forward as a framework for the Tizen successor to MeeGo Smart TV, it could also be useful on desktop systems and other Linux environments. At present, the design covers media playback and basic media capture features, although more advanced features needed for the smart TV platform are on the roadmap.
As most end-users know, there is no shortage of video playback engines available for Linux: GStreamer, FFmpeg, Xine, MPlayer, and so on. However, they provide no uniform API: front-end applications must be specifically written to tie in to each back-end. Most application projects choose just one, and those that choose several undertake a major duplication of effort. For any particular engine there also need to be language-specific bindings and, in most cases, the application is still responsible for low-level details like constructing its own GStreamer pipelines.
Video capture is similarly fragmented. It requires the application to manage the capture hardware and, in the case of TV tuners, to know the format (DVB, ATSC, QAM, IPTV, etc.) and manage the frequency tables and other tuning details. Commercial smart TV and set-top box OEMs also face a challenge when implementing video-on-demand (VOD) applications and playback engines for protected content streams like Blu-ray in their products, because content-production companies demand that such code be isolated from GPL and LGPL modules.
UMMS is an attempt to solve all of these problems at once by
constructing a consistent playback and capture API. It provides a D-Bus
service, so it is both language-independent and capable of providing
license isolation. There are playback, recording, and time-shifting
functions, as well as methods to query media properties. Applications can
access media by URI, without regard to whether the source is local or
remote, whether it uses a protected VOD playback engine or GPLed media framework, or the output format used — including the availability of hardware acceleration for the file in question. The latter capability is intended to future-proof UMMS so that it can provide applications with transparent access to new advances in video processing; Van Cutsem cited processing video as OpenGL textures as one example.
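The idea of hiding the playback engine behind a URI-based front end can be illustrated with a minimal Python sketch. It is not the UMMS API: the class names (Backend, LocalFileBackend, StreamBackend, MediaService) and the scheme-based dispatch rule are all assumptions invented for illustration, and a real service would sit behind D-Bus rather than an in-process call.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class Backend(ABC):
    """Hypothetical wrapper around one playback engine (not a UMMS class)."""

    @abstractmethod
    def can_handle(self, uri: str) -> bool:
        """Report whether this engine accepts the given URI."""

    @abstractmethod
    def play(self, uri: str) -> str:
        """Start playback and return a short status string."""


class LocalFileBackend(Backend):
    """Stands in for an engine that plays local files."""

    def can_handle(self, uri: str) -> bool:
        return urlparse(uri).scheme == "file"

    def play(self, uri: str) -> str:
        return f"local engine playing {urlparse(uri).path}"


class StreamBackend(Backend):
    """Stands in for an engine that plays network streams."""

    def can_handle(self, uri: str) -> bool:
        return urlparse(uri).scheme in ("http", "https", "rtsp")

    def play(self, uri: str) -> str:
        return f"stream engine playing {uri}"


class MediaService:
    """Front end: the caller supplies a URI and never names an engine."""

    def __init__(self, backends):
        self._backends = list(backends)

    def play(self, uri: str) -> str:
        # Hand the URI to the first registered engine that accepts it.
        for backend in self._backends:
            if backend.can_handle(uri):
                return backend.play(uri)
        raise ValueError(f"no back end available for {uri}")
```

An application written against MediaService.play() would not change if a new engine were registered, which is the property the D-Bus service is meant to provide across process and license boundaries.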
Initial API
The project is currently hosted
in the MeeGo build service, but Van Cutsem said it will be migrating to
Gitorious soon. Unfortunately, online resources for the work are still
scarce at the moment: there is a draft
version of the requirements document on the MeeGo wiki, but the best
documentation of the API itself is contained inside the spec/
directory of the source repository.
The API provides a way to create and manipulate "MediaPlayer"
objects. Two types are available: "attended" and "unattended"
MediaPlayer objects.
For attended objects, the application must remain active during execution
(as in most video playback scenarios), and can manipulate the video. With
unattended objects, the application registers an event with UMMS, then
shuts down. The canonical unattended example is scheduling a DVR
recording: the application provides a "time to execute" to the UMMS service, along with an input URI and a destination file name.
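The unattended case boils down to a queue of requests ordered by execution time. The following Python sketch shows one way a service could hold such requests; the three fields mirror the "time to execute", input URI, and destination file name described above, but the class names, the dvb:// URI form, and the polling-style due() method are assumptions made for illustration and are not taken from the UMMS specification.

```python
import heapq
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(order=True)
class RecordingRequest:
    """Hypothetical unattended request; only the time is used for ordering."""

    time_to_execute: datetime
    input_uri: str = field(compare=False)
    destination: str = field(compare=False)


class Scheduler:
    """Keeps registered requests in a min-heap keyed on execution time."""

    def __init__(self):
        self._queue = []

    def register(self, request: RecordingRequest) -> None:
        # The application can exit after this call; the service owns the request.
        heapq.heappush(self._queue, request)

    def due(self, now: datetime) -> list:
        # Pop and return every request whose execution time has arrived.
        ready = []
        while self._queue and self._queue[0].time_to_execute <= now:
            ready.append(heapq.heappop(self._queue))
        return ready


if __name__ == "__main__":
    scheduler = Scheduler()
    scheduler.register(
        RecordingRequest(datetime(2011, 11, 2, 20, 0), "dvb://channel-1", "/tmp/news.ts")
    )
    for request in scheduler.due(datetime(2011, 11, 2, 20, 30)):
        print("start recording", request.input_uri, "to", request.destination)
```

The point of the design is that nothing in the request refers back to the application that created it, so the service can act on it long after that application has shut down.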
The code from the MeeGo build service stands at version 0.0.1, and implements sample applications of each type. There is a media player using GStreamer as the back-end framework, and a video recorder that can schedule recordings from a DVB video source. UMMS is licensed under the LGPL v2.1.
Each MediaPlayer object supports methods to report the codec used, the
height and width, the playback rate, whether the content is seekable,
whether it is allowed (by the copyright holder) to be displayed at full-screen
resolution, and the presence and location of all audio, video, and subtitle
tracks (although much of the focus deals with video content, UMMS supports
audio-only media just as easily). Applications can use UMMS to query or
set the playback position, adjust volume or playback speed, and do basic
fast-forward/reverse scrubbing.
UMMS defines a "target" as the output destination of any MediaPlayer stream. For PC usage, this would be either an X window or an OpenGL (or other hardware acceleration) pipeline. For direct connection to TVs, considerations such as non-square pixels and overscan mandate handling HDMI and other output signals differently. But a target could also be a UI element inside another application, for example a <video> object on a web page, in the case of a browser.
Although UMMS is designed to abstract away many of the details of a
media file from the application, that may not always be possible. In a discussion
on the meego-dev list in March, developer Dominig ar Foll explained that
some content sources will still demand that the application inspect codec
settings, bit rates, buffer depths, and other specifics — in order to
manage hardware resources on the device. For example, some sources are
expected to provide multiple video tracks using different codecs all in a
single multiplexed stream, allowing the application to choose between them. The plan is for UMMS itself to also support automatically selecting the codec in such a situation, based on a pre-defined policy — such as whether an unoccupied hardware decoder for one codec is available, or whether hardware-decoding of a codec would consume less power than software-decoding of another.
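A policy of this kind could be expressed as a simple cost comparison. The Python sketch below picks, from the tracks in a multiplexed stream, the one that can be decoded at the lowest estimated power cost, using a hardware decoder only when one is unoccupied. The Track fields, the milliwatt figures, and the select_track() function are hypothetical and exist only to illustrate the two criteria mentioned above; nothing here reflects how UMMS actually implements or plans to implement the policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Track:
    """Hypothetical description of one video track in a multiplexed stream."""

    codec: str
    hw_decoder_free: bool          # is a hardware decoder for this codec unoccupied?
    hw_power_mw: Optional[int]     # estimated hardware decode cost; None if no such decoder
    sw_power_mw: int               # estimated software decode cost

    def best_cost(self) -> int:
        # Hardware decoding counts only if a decoder exists and is currently free.
        if self.hw_decoder_free and self.hw_power_mw is not None:
            return min(self.hw_power_mw, self.sw_power_mw)
        return self.sw_power_mw


def select_track(tracks):
    """Return the track with the lowest achievable decode cost."""
    if not tracks:
        raise ValueError("stream contains no video tracks")
    return min(tracks, key=lambda track: track.best_cost())


if __name__ == "__main__":
    candidates = [
        # The H.264 hardware decoder is busy, so only software decoding is possible.
        Track("h264", hw_decoder_free=False, hw_power_mw=300, sw_power_mw=1500),
        # An MPEG-2 hardware decoder is free.
        Track("mpeg2", hw_decoder_free=True, hw_power_mw=400, sw_power_mw=900),
    ]
    print(select_track(candidates).codec)
```

In this made-up example the MPEG-2 track wins even though H.264 would be cheaper on idle hardware, because the H.264 decoder is occupied and software decoding costs more than the free MPEG-2 hardware path.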
Extending the concept
Beyond the basic playback and scheduled recordings already set out in the reference applications, the plan is to extend UMMS to cover a few other TV-specific features, starting with additional functions for DVR applications. There will need to be methods to work with electronic program guides (EPG), as well as to support time-shifting and conditional-access restrictions (think pay-per-view and VOD features, where the content provider might want to ensure that a media file is not watched multiple times).
Smart TVs must also implement industry standards like parental controls and channel locks. In some countries, this is not merely an issue of conforming to expectations, but of adhering to mandatory regulations. Finally, as in the codec-selection example above, one of the goals of UMMS is to provide a framework for managing hardware resources for access by multiple applications. Thus it will need to be able to report status back to applications when there are no tuners or decoders currently available, as well as distinguish between the capabilities of various playback and capture resources, and prioritize requests based on policies.
Although UMMS is designed to meet the needs of the smart TV UX, both Van Cutsem and the developers on the MeeGo lists emphasized that they hope it will prove useful on other form factors as well, in-vehicle systems and tablets in particular. But it fills a gap in PC-based Linux systems, too. The ability to abstract away the playback engine would simplify development of desktop media players, especially those wanting to use hardware video decoding. TV capture cards are more of a specialty item, at least for now. However, the open source DVRs MythTV and Freevo, where development has been slow, could probably benefit from an abstraction layer like UMMS as well.
To be sure, a portion of the free software community may always chafe at the prospect of building a framework that explicitly enables proprietary playback engines and applications, but UMMS is not substantially different from CUPS or other system frameworks in that regard. In this case, supporting the needs of set-top box makers who are beholden to the content industry pays dividends for open source, too, by clearly defining an API layer that Linux has been missing for too long.
Van Cutsem ended his talk promising that more details would be coming
online soon — UMMS happened to land at an awkward point in the
transition between MeeGo and Tizen, after all. The MeeGo build system,
lists, and wiki are slated to be taken offline in one year, but the Tizen
project infrastructure has not yet rolled out. "Stay tuned" seems like the appropriate message.
[The author would like to thank the Linux
Foundation for assisting with his travel to LinuxCon Europe 2011.]