New Theora encoder makes its first public release

April 1, 2009

This article was contributed by Nathan Willis

Xiph.org achieved a milestone last week, unveiling the first public release of its new encoder for Theora video. The new encoder is codenamed Thusnelda to distinguish it from previous work, and makes several big improvements, including fixes to constant bitrate and variable bitrate encoding.

Theora is derived from a video codec called VP3 created by On2 Technologies. On2 donated the VP3 code to the public under an open source license in 2001, and agreed to help Xiph.org develop Theora as its successor. The specification for the Theora codec's format was finalized in 2004, but the reference encoder itself (the actual binary that converts a video file into Theora format) only reached 1.0 in November of 2008. Work on Thusnelda began shortly thereafter, spearheaded by Xiph.org's Christopher Montgomery, and was bolstered by a grant from Mozilla and the Wikimedia Foundation that allowed lead Theora developer Tim Terriberry to focus on improving the encoder to coincide with the built-in Theora support slated for Firefox 3.5.

What's new

The Thusnelda encoder is denoted 1.1 alpha, and is available for download from Xiph.org in several formats: source code for the libtheora library, binaries of the ffmpeg2theora command-line conversion utility, and even a Mac OS X QuickTime component.

According to Xiph.org's Ralph Giles, the most noticeable improvement in 1.1 is proper rate control, particularly for fixed bitrate encoding, where the user specifies either the number of bits per second desired in the output (a common use case for streaming applications), or the desired file size. "The 1.0 encoder relies a lot on heuristics, instead of trying to optimize directly the trade-off between quality of the coded images and the number of bits used to represent them," he said. "More significantly, the fixed bitrate mode in the 1.0 reference encoder didn't really work; it just guessed how to meet its target and often missed the requested bitrate, sometimes by quite a bit, which was a problem for streaming and fixed-size encodes."
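The two fixed-rate targets described above, bits per second and total file size, are related by simple arithmetic. The following Python sketch shows that relationship and one way a missed target could be measured. The function names are made up for illustration, and nothing here is taken from libtheora or describes how Thusnelda's rate control works internally.

```python
def target_bitrate(file_size_bytes: int, duration_seconds: float) -> float:
    """Bits per second needed to fill file_size_bytes over duration_seconds."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return file_size_bytes * 8 / duration_seconds


def bitrate_error(actual_bytes: int, duration_seconds: float,
                  target_bps: float) -> float:
    """Relative miss: 0.0 is on target, 0.25 means 25% over the target."""
    if duration_seconds <= 0 or target_bps <= 0:
        raise ValueError("duration and target must be positive")
    actual_bps = actual_bytes * 8 / duration_seconds
    return (actual_bps - target_bps) / target_bps
```

For instance, fitting a ten-minute clip into 45,000,000 bytes calls for 600,000 bits per second; an encoder that produces 56,250,000 bytes for the same clip has overshot that target by 25%.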

But Montgomery's work, supported for a year by his employer Red Hat, also included extensive refactoring of the code, which should result in improvements today and allow for easier changes moving forward. "The older encoder was structured as a bunch of nearly independent passes," Giles said. "[It] made something like 8 passes over each frame. This made some forms of decision making hard, i.e. if an earlier decision caused you problems (higher bitrate) in a later stage you were out of luck. The new encoder collapses most of the passes."

The restructuring also allows Thusnelda to take advantage of features in the Theora specification that had never been implemented before, such as "4MV" macroblocks, a motion compensation scheme that adaptively chooses whether to encode motion information for an entire segment of the picture, for a sub-segment, or for none of the segment. "Theora always breaks each image up into square blocks," Giles explained, "one of those blocks then can be split into four motion vectors, or use an average, and if any of those four don't need to be coded, the alpha encoder can skip coding a corresponding motion vector. Making a change like that was too difficult with the 1.0 codebase."
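As a rough illustration of the saving Giles describes, the Python sketch below lists which motion vectors would be written for a single macroblock of four blocks. It is a toy model only: the function name, the data layout, and the use of a rounded average in single-vector mode are assumptions made for this example, not the logic of the Thusnelda encoder or the bitstream syntax in the Theora specification.

```python
from typing import List, Tuple

Vector = Tuple[int, int]


def vectors_to_code(block_vectors: List[Vector],
                    block_coded: List[bool],
                    four_mv: bool) -> List[Vector]:
    """Motion vectors to emit for one macroblock made of four blocks.

    Hypothetical helper for illustration only; not libtheora code.
    """
    if len(block_vectors) != 4 or len(block_coded) != 4:
        raise ValueError("a macroblock here has exactly four blocks")
    coded = [v for v, keep in zip(block_vectors, block_coded) if keep]
    if not coded:
        return []  # nothing in this macroblock is coded, so no vectors
    if four_mv:
        # One vector per coded block; blocks that are not coded cost nothing.
        return coded
    # Single-vector mode: one vector stands in for the whole macroblock.
    # Averaging the coded blocks' vectors is an assumption of this sketch.
    n = len(coded)
    return [(round(sum(x for x, _ in coded) / n),
             round(sum(y for _, y in coded) / n))]
```

With three coded blocks out of four, the four-vector mode in this model writes three vectors rather than four, while the single-vector mode writes one approximate vector for the whole macroblock.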

Measuring success

Naturally, real-world performance and not a feature list is the primary means of assessing an encoder. Theora has been the object of criticism in years past, especially when compared against proprietary offerings such as H.264. Reader comments on news stories at Slashdot often dismissed Theora as a poor alternative, producing larger files than the competition for the same subjective quality.

Codec testers are always at the mercy of the encoder, however, and as noted above Theora's 1.0-series encoder had significant flaws, especially with respect to constant bitrate encoding. In the oft-cited doom9.org 2005 codec shootout, the Theora encoder performed poorly by failing to meet the target file size due to poor rate control, the very feature targeted in the 1.1 branch. Similarly, Eugenia Loli-Queru's 2007 Theora versus H.264 test for OSNews repeatedly cited problems with the encoder that made direct comparison close to impossible.

Both tests pre-date the 2008 release of the final 1.0 encoder, let alone the 1.1 alpha. Shortly after the Thusnelda alpha was released, Jan Schmidt posted the results of his personal tests on his blog, indicating a 20% reduction in file size and 14% reduction in encoding time over the 1.0 encoder. Those are significant numbers, even without accounting for better rate control and other encoding parameter improvements. As commenters to the blog pointed out, Schmidt's test was not scientific, particularly as it involved re-encoding an H.264 file rather than a lossless original, and showed example still frames rather than video results.

Video quality is ultimately a subjective, human-centric measure. Although there are attempts to quantify video encoding quality, such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), they rarely replace subjective evaluations of quality. Xiph's Gregory Maxwell said that Thusnelda improves on Theora's PSNR, but that it was a mistake to assume that that equated to a subjective improvement for any particular use case.

To an extent the objective metric problem is equal to the coding problem. If we had a perfect metric we could probably make a perfect encoder (ignoring a lot of engineering details) ... If we could objectively know what 'looks good' then we could make a coder which uses that metric to decide what to code. Then the problem of coding largely reduces to efficiently packing information, which is well understood. So in any case, objective metrics are usually useful for measuring the results of small changes which are mostly 'objective' in nature; they aren't very useful for measuring perceptual changes, nor are they useful for comparing dramatically different codecs.

Terriberry concurred, noting that none of the simple objective metrics take any kind of temporal effects into account, and they are still less trustworthy than the processing done in the brain. "Like most things, it's a matter of knowing what the limitations of your tools are. PSNR and SSIM are useful for monitoring day-to-day changes in the code to identify regressions and optimize parameters. But for evaluating fundamentally different approaches, there's currently no substitute for using real humans."
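For readers unfamiliar with the metric, PSNR compares a decoded frame to its reference through the mean squared error: PSNR = 10 * log10(peak^2 / MSE), with peak equal to 255 for 8-bit samples. The Python sketch below computes it for two flat lists of sample values; the function name and the list-based frame representation are choices made for this example only.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], decoded: Sequence[int],
         peak: int = 255) -> float:
    """Peak signal-to-noise ratio, in decibels, between two frames."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```

Because the calculation looks at one frame at a time, a sequence-level score built by averaging per-frame values is the same whatever order the frames are played in, which is one way to see Terriberry's point about temporal effects.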

Theora took hits from critics on subjective quality in the 2005 and 2007 tests, too, points which Montgomery responded to in 2007 with a page on his personal Web site. Although some subjective quality issues like discernible blockiness are not the result of problems with the 1.0 encoder, he argued, many of the most visible problems are, and he urged readers to watch the progress made in the 1.1 series.

What's next

There are several improvements still to come before 1.1 is declared final, according to the Theora team. Giles said the next major feature will be per-block quantizers, the functions that simplify a block of input data into smaller values for output. "[Theora precursor] VP3 used a fixed set of quantizers, and the 'quality' knob was the only way you could change things. When VP3 became Theora, back in 2004, we added support for varying those quantizers both per video, and per frame type. The 1.0 encoder was able to support alternate quantizer matrices, because you just switch them out, but there were some tuning issues."

"1.1alpha1 is still using the same set, but we expect that to change soon," Giles said. The newly restructured codebase makes it easy to vary the quantizer used, not just on a per-file or per-frame basis, but block-by-block. Terriberry added that the new code will support 4:2:2 and 4:4:4 pixel formats, which will allow higher color quality, and the ability to use different quantization matrices for different color channels and frames.

Giles and Terriberry agreed that 1.1 final will be significantly better than even the current alpha release once all of the changes are incorporated. Terriberry noted that many of the remaining improvements are "minor things" but that added together they will be substantial. "And that's not even mentioning things like speed optimizations, which also have real practical benefits."
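To make the quantizer discussion above more concrete, the Python sketch below divides each value in a block by the matching entry of a quantization matrix and rounds the result, so larger matrix entries leave smaller numbers to store. The matrix values, the tiny 2x2 block, and the per-block `scale` parameter are all invented for this example; Theora's real quantization tables and the way Thusnelda will choose a quantizer for each block are not modeled here.

```python
from typing import List

Block = List[List[int]]


def _round_half_away(x: float) -> int:
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(x + 0.5) if x >= 0 else int(x - 0.5)


def quantize(coeffs: Block, matrix: Block, scale: float = 1.0) -> Block:
    """Divide each coefficient by its (scaled) matrix entry and round."""
    return [[_round_half_away(c / (q * scale)) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, matrix)]


def dequantize(levels: Block, matrix: Block, scale: float = 1.0) -> Block:
    """Approximate the original coefficients from the quantized levels."""
    return [[_round_half_away(v * q * scale) for v, q in zip(v_row, q_row)]
            for v_row, q_row in zip(levels, matrix)]
```

Calling `quantize` with a larger `scale` for one block than for its neighbor is a stand-in for the block-by-block variation described above: the coarser block needs fewer bits but is reconstructed less exactly by `dequantize`.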

"There are other things still on the docket as well — we're not done yet!" added Montgomery, "However, we're finally to the point of putting together a release solidly better than 1.0 in every way, along with a much higher future ceiling."

Between now and then, the team is soliciting user input from real-world encoding tests. "We put it out to show what we've been up to, and to make it easier to give it a try," said Giles. "We're interested in samples where it really does poorly, especially relative to 1.0, compatibility testing with current decoders, and general build and integration issues which of course can only be found through people trying your software in their own environments." He encouraged users to submit concrete issues through the bug tracker, but to share other experiences through the project mailing list, or simply to blog about them for all to read.

Web video is poised to start changing dramatically once Firefox 3.5 ships with a built-in Theora decoder underlying the HTML5 video element. That makes it all the more important to get the Theora encoder right. Xiph.org does not have the full-time staff or resources of larger activist groups like the Free Software Foundation or Creative Commons; it has only software developers. Consequently, without the support of Red Hat, Mozilla, and the Wikimedia Foundation, it might not have been able to get up to speed. It remains to be seen whether the final build of Thusnelda will beat Firefox 3.5 to release, but the progress made already is encouraging.



Dirac vs. Theora

Posted Apr 2, 2009 2:20 UTC (Thu) by ncm (subscriber, #165) [Link]

I haven't been keeping track. Is there an encoder that puts a Dirac stream in an Ogg container? How does Dirac differ from Theora such as to lead one to choose one or the other? I presume that Theora holds its own in some ways or Tim wouldn't be working on it. Will they both be used indefinitely, or is one likely to be retired in favor of the other?

Dirac vs. Theora

Posted Apr 2, 2009 4:33 UTC (Thu) by gmaxwell (subscriber, #30048) [Link]

Ffmpeg2dirac produces Ogg/Dirac(+Vorbis) files. These files should play out of the box on modern Linux distributions; they do on Fedora at least.

So, some comparison points:
* Dirac is able to achieve very high quality at accordingly high bitrates (it can even be run losslessly); Theora has a maximum quality which is not quite unconditionally visually lossless.
* Dirac is more cpu intensive than Theora.
* Dirac tools are somewhat less mature (in particular, there exists a decently performing Theora decoder in Java, which means that a majority of internet users can view Theora today with no software install)
* With the encoders existing today Theora clearly out-performs Dirac for low to moderate bitrates.

Which codec you should use today depends on your application. If you are archiving video, building an unencumbered Blu-ray, or squeezing production-grade 1080 HD video into 270 Mbit/s SD channels, Dirac is what you want. If you are webcasting with under a couple of megabits per second, Theora is what you want: not only will it have better quality at the applicable bitrates, but you have more playback options. (Though you might also wish to offer a high-quality Dirac download option…)

For example, I recently put a Theora video online (http://www.celt-codec.org/presentations/). At 150 kbit/sec Theora is perfectly adequate for this application, while the slides are not always readable with the current Dirac 'Schroedinger' encoder at that bitrate. Yet a full-resolution version at several megabits per second looks better in Dirac than Theora. Unless you're using Firefox 3.5beta, the video is only playing for you on that page because the page is able to fall back from the video tag to a Java-based decoder, something which isn't yet possible for Dirac. (And might never be reasonable due to CPU use?) I'm planning on putting up a Dirac download version… but I'm still playing around with the encoder. (…and it's a fact of life that no one manages to do multiple format parallel distribution well, myself included)

The gap between Theora and Dirac at the lower bitrate end of the spectrum may persist indefinitely, as that's really what Theora was designed for … future improvements to Dirac encoders may close the gap, but it is difficult to speculate about that as that isn't an active area of development for Dirac today.

If I had to place a bet I'd guess that Dirac would continue to be used for "artefact free and better" video for the foreseeable future; the use of which will increase over time as IP transit costs decrease and bandwidth becomes more plentiful, while Theora will be obsoleted in a number of years by a replacement or an extension… It's only natural for lower bitrate distribution formats to have a shorter life compared to archive grade formats, and there is more to be gained from small improvements in formats used where every kilobit counts.


Dirac vs. Theora

Posted Apr 2, 2009 6:36 UTC (Thu) by DonDiego (subscriber, #24141) [Link]

> there exists a decently performing Theora decoder in Java, which means
> that a majority of internet users can video Theora today with no software
> install)

Who has Java installed in their browser these days? IME nobody, so where do you get these numbers from?

Dirac vs. Theora

Posted Apr 2, 2009 7:11 UTC (Thu) by gmaxwell (subscriber, #30048) [Link]

In this case my numbers, greatly hedged by simply claiming a majority, came from directly measuring many millions of visitors to Wikipedia.

It is true that Java penetration is in decline, but it still ships as a standard pre-installed component on many systems and it is still required for access to many business web applications. While the geekerati still hate Java, as they always have, Joe Sixpack simply doesn't care. For all the same reasons that he has Flash he has Java. (And, in any case, Theora in the Flash 10 VM appears completely possible, if anyone cares to do it… Vorbis is already done)

Dirac vs. Theora

Posted Apr 10, 2009 11:49 UTC (Fri) by robbe (subscriber, #16131) [Link]

> In this case my numbers [...] came from directly measuring many
> millions of visitors to Wikipedia.

Niiice argument. Do you have details (or a link thereto)? I confess, I'd
have questioned your statement (a little less belligerently), had not
DonDiego beat me to it.

> While the geekerati still hate Java, [...]

I think this is mainly a thing of the past. It's FOSS goodness now, and the
alternatives are much worse.

Still, I fear that the decline of the Java applet will only accelerate.

Dirac vs. Theora

Posted Apr 2, 2009 4:55 UTC (Thu) by rillian (subscriber, #11344) [Link]

There are. You can use gstreamer, with the plugins from the schrodinger implementation, or the ffmpeg2dirac transcode application to put Dirac video in an Ogg container. The oggz tools suite also has support for splitting and remuxing Ogg Dirac streams. See diracvideo.org for pointers.

Theora and Dirac will coexist for some time. Theora requires fewer computational resources than Dirac, so it's more appropriate for lightweight devices. Theora does better at low bitrates and is a better choice for streaming and web-embedded video, while most of the effort with Dirac has focussed on high bitrate encoding, visually indistinguishable from the original--or even lossless--and is a better choice for production intermediates and high quality distribution. Theora has also been around a lot longer. It's a more mature format, with a bitstream frozen in 2004. Dirac's Ogg mapping was last revised something like six months ago.

So while Dirac will offer better compression in the long run, there are reasons to use each format today, and tomorrow. Neither is a replacement for the other.

Dirac vs. Theora

Posted Apr 2, 2009 6:12 UTC (Thu) by tajyrink (subscriber, #2750) [Link]

OggConvert is the nicest I know of, and readily available in e.g. the Debian repositories.

great article

Posted Apr 2, 2009 10:57 UTC (Thu) by coriordan (guest, #7544) [Link]

Great article. I'm always interested to hear about Xiph/Ogg software (and GCC, tor, GNU radio, rockbox, emacs).

great comments

Posted Apr 3, 2009 7:52 UTC (Fri) by erwbgy (subscriber, #4104) [Link]

Agreed, and backed up by excellent comments. The insights of the LWN community (rather than just editors) are what makes it so much better than other "news sites". Thank you.

great comments

Posted Apr 4, 2009 6:16 UTC (Sat) by ncm (subscriber, #165) [Link]

Yes, thanks to LWN for keeping up with this, and to all the helpful commenters who filled in the off-topic context.

New Theora encoder makes its first public release

Posted Apr 3, 2009 11:37 UTC (Fri) by jmspeex (subscriber, #51639) [Link]

> only reached 1.0 in November of 2008. Work on Thusnelda began shortly thereafter, spearheaded by Xiph.org's Christopher Montgomery

Actually, Monty started his work on Thusnelda even before 1.0 was released. Besides the page that the article already links, there are also parts two, three, four and five. A good read to see some of the things that have improved. Otherwise, it's nice to see an article on Xiph that doesn't get it completely wrong. Thanks!

New Theora encoder makes its first public release

Posted Apr 3, 2009 15:11 UTC (Fri) by gmaxwell (subscriber, #30048) [Link]

...and yesterday brings six.

New Theora encoder makes its first public release

Posted Jun 29, 2009 7:33 UTC (Mon) by tekNico (guest, #22) [Link]

... and seven.

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds