By Nathan Willis
February 20, 2013
On February 4, the Mozilla Firefox and Google Chrome teams
demonstrated interoperability between their browsers by conducting a live video chat
between their two offices. This is possible because Firefox and Chrome
have both implemented support for WebRTC, the real-time multimedia
communication framework being developed by both projects, in
conjunction with the World Wide Web Consortium (W3C) and Internet
Engineering Task Force (IETF). WebRTC's JavaScript
API will allow web developers to write audio/video
chat applications that function without extensions or
plugins—and, in theory, with reliable interoperability between
browser implementations.
Mozilla documented
the test call on its Mozilla Hacks blog, and the Google team did
the same on the Chromium blog. A recording of the test call
is viewable in both posts as a YouTube video. It lasts about a minute, and
although it looks simple enough, as Mozilla Chief Innovation Officer
Todd Simpson explains, there are quite a few important details under
the surface. The call used the royalty-free VP8 and Opus codecs for video and audio,
respectively, used Interactive Connectivity Establishment (ICE), Session Traversal
Utilities for NAT (STUN), and Traversal
Using Relays around NAT (TURN) for firewall
traversal, and was encrypted with Secure Real-time Transport Protocol (SRTP) and Datagram
Transport Layer Security (DTLS). ICE is a
higher-level NAT-traversal protocol that makes use of STUN and TURN to
select from among several possible connection methods; DTLS is a
datagram-oriented secure transport layer that SRTP relies on for key
exchange. The actual media streams are sent over a WebRTC PeerConnection.
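To make the selection step more concrete, here is a minimal sketch of how ICE ranks candidate addresses, using the priority formula and recommended type preferences from RFC 5245. The function and variable names are illustrative rather than part of any browser API, and the example candidates are invented.

```javascript
// Recommended type preferences from RFC 5245: direct (host) addresses
// rank above STUN-discovered (server-reflexive) ones, which rank above
// TURN relays.
const TYPE_PREFERENCE = new Map([
  ["host", 126],
  ["prflx", 110],
  ["srflx", 100],
  ["relay", 0],
]);

// priority = 2^24 * typePref + 2^8 * localPref + (256 - componentId)
// Multiplication is used instead of bit shifts because the result can
// exceed the signed 32-bit range that JavaScript's << operator works in.
function candidatePriority(type, localPref = 65535, componentId = 1) {
  if (!TYPE_PREFERENCE.has(type)) {
    throw new Error("unknown candidate type: " + type);
  }
  return (
    TYPE_PREFERENCE.get(type) * 2 ** 24 +
    localPref * 2 ** 8 +
    (256 - componentId)
  );
}

// Hypothetical candidates gathered for one endpoint.
const candidates = [
  { type: "relay", address: "203.0.113.7" },  // allocated on a TURN server
  { type: "host", address: "192.168.1.20" },  // local interface
  { type: "srflx", address: "198.51.100.4" }, // public address learned via STUN
];

const ranked = candidates
  .map((c) => ({ ...c, priority: candidatePriority(c.type) }))
  .sort((a, b) => b.priority - a.priority);

console.log(ranked.map((c) => c.type).join(" > ")); // host > srflx > relay
```

In a real session, each endpoint pairs its own candidates with the peer's and runs connectivity checks on those pairs, so a TURN relay would typically be used only when the more direct paths fail.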
The application used in the demo call is AppRTC, which runs on Google's
App Engine service. Interested parties can test it out for
themselves, but will need either a recent nightly build of
Firefox (the desktop edition only, for now) or a Chrome 25 beta
to do so. For curious developers, AppRTC's source
code is available
for inspection; the cross-browser interoperability is made possible by
a short JavaScript adapter
that smooths over the differences between Firefox's
and Chrome's function names: Firefox prefixes its interfaces with
moz and Chrome with webkit (note that, at the moment,
Chrome appears to be the only WebKit-based browser with a WebRTC
implementation). Such prefixing behavior is a
familiar sight to web developers, although the WebRTC interoperability page says
both browsers will drop the prefixes when the specification gets "more
finalized." According to that page, there are a few other syntactic
differences between the browsers' implementations, as well as
differences in STUN support and SRTP connection negotiation. The
Mozilla blog entry also includes code snippets from another sample
application, which appears to be a Firefox-only affair.
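The prefix-smoothing idea described above can be sketched in a few lines. This is an illustration rather than AppRTC's actual adapter: the helper name resolvePrefixed is hypothetical, and the fake global objects exist only so the snippet runs outside a browser. The prefixed constructor names mozRTCPeerConnection and webkitRTCPeerConnection are the ones the two browsers exposed at the time.

```javascript
// Return the first implementation of `baseName` found on `scope`, trying
// the unprefixed name first, then the Firefox and Chrome prefixes.
// Note: this simple concatenation suits constructor names only; prefixed
// method names are re-capitalized (getUserMedia becomes mozGetUserMedia).
function resolvePrefixed(scope, baseName) {
  const names = [baseName, "moz" + baseName, "webkit" + baseName];
  for (const name of names) {
    if (typeof scope[name] !== "undefined") {
      return { name, impl: scope[name] };
    }
  }
  return null; // no implementation detected
}

// Stand-ins for the global object in each browser (assumption: real code
// would pass `window` instead).
const fakeFirefox = { mozRTCPeerConnection: function () {} };
const fakeChrome = { webkitRTCPeerConnection: function () {} };

console.log(resolvePrefixed(fakeFirefox, "RTCPeerConnection").name);
console.log(resolvePrefixed(fakeChrome, "RTCPeerConnection").name);
console.log(resolvePrefixed({}, "RTCPeerConnection"));
```

Checking the unprefixed name first means the same lookup should keep working once the browsers drop their prefixes.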
Beyond words
Live video chatting is nice, and for Linux users in particular, having
the functionality "baked in" to two of the most popular cross-platform
browsers is a far sight more appealing than installing binary plugins. But
WebRTC's functionality offers more than just conversation. The
getUserMedia API, which provides access to video and audio data from
webcam hardware, can be used in other classes of applications. Mozilla
has a tutorial
implementing simple photo-booth functionality, for example, and Marko
Dugonjić recently speculated
that it could be used to implement proximity detection.
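As a rough sketch of how a page might request the camera with the callback-style, prefixed getUserMedia calls, the helper below wraps the request in a Promise. The name requestCamera and the stub navigator are assumptions for illustration; in a browser, the real navigator object would be passed in and the resulting stream attached to a <video> element.

```javascript
// Ask for camera access using whichever getUserMedia variant is present.
function requestCamera(nav, constraints = { video: true, audio: false }) {
  const getUserMedia =
    nav.getUserMedia || nav.mozGetUserMedia || nav.webkitGetUserMedia;
  return new Promise((resolve, reject) => {
    if (!getUserMedia) {
      reject(new Error("getUserMedia is not available"));
      return;
    }
    // Invoke with `nav` as `this`, as the browser methods expect.
    getUserMedia.call(nav, constraints, resolve, reject);
  });
}

// Stub navigator that "grants" access immediately (assumption, so the
// snippet can run outside a browser).
const stubNavigator = {
  mozGetUserMedia(constraints, onSuccess, onError) {
    onSuccess({ id: "fake-stream", constraints });
  },
};

requestCamera(stubNavigator).then((stream) => {
  console.log("got stream:", stream.id);
});
```

A photo-booth page along the lines of Mozilla's tutorial would then draw frames from the playing video onto a canvas to capture a still image.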
WebRTC also specifies a general-purpose DataChannel
API in addition to the PeerConnection media stream. Clients
can use any underlying data transport protocol they choose; WebRTC
only specifies that they agree on its setup, teardown, and
reliability. Mozilla is the first browser vendor to implement
DataChannels for WebRTC; back in November 2012, Simpson demonstrated
Firefox using DataChannels to share content over Firefox's Social
API, including live text chat and peer-to-peer file transfer.
The codec wars
The ability to use WebRTC with royalty-free codecs like Opus and
VP8 can also be seen as a partial vindication of Mozilla's 2012 decision to implement OS-fallback
support for the patent-encumbered H.264 codec. The decision enabled
playback of H.264 content by passing the necessary decoding duties
down to the operating system—including, particularly on mobile
clients, hardware video decoders. Prior to that decision, Mozilla had
argued that it would not support H.264 because doing so would require
it to pay royalties to H.264's patent holders. Mozilla instead fought
for the adoption of the royalty-free Theora and VP8 codecs, including
arguing for the inclusion of such a free codec as a requirement in the
HTML5 <video> element.
When it announced in March 2012 that it would implement a fallback
mechanism for H.264 playback, Mozilla justified the decision by
saying it needed to focus its resources on emerging media standards
rather than continue to fight against an entrenched one. Brendan
Eich cited
WebRTC as the next major battlefield. The battle appears to be going
in favor of unencumbered codecs, as the IETF draft specification
requires
Opus, but it is clearly still not over. The corresponding draft that
addresses video
requirements mentions VP8, but it requires neither VP8 nor any
other specific codec.
No doubt proponents of H.264—particularly those who stand to
reap royalty payments—will continue to lobby in favor of H.264.
But the playing field is different; unlike with the <video>
element, consumer video cameras (many of which record to H.264
directly via hardware encoders) do not factor into the basic WebRTC
use case. And, just as importantly, the development of WebRTC is
spearheaded by two free software browser projects. That gives
what-the-browsermakers-want an intrinsic head start against competing
codecs. The fact that users can already download VP8-powered WebRTC
and use it for free, real-time video chats gives the royalty-free codecs an even
bigger advantage: the sole working implementation.