Your editor has never been a big fan of video cameras. They have a very
strong observer effect - they distort the social dynamics of events where
they are present. It is also sad to see vacationers who, on the rare
occasions when they get out, capture their every step on video; even when
they leave the house, they watch their lives on television. So your editor
has a strong preference for old-style, organic video memory.
The fact of the matter, however, is that your editor does not always get
the final vote, especially in any area related to the raising of children.
So your editor's household contains two video cameras - one ancient, one
less so - and a set of tapes with no end of priceless memories. Alcohol
may have dimmed the experience of some of those early musical performances
and such, but video tapes are forever.
Except, of course, that they are not. In particular, the older camera,
being the only device in the house which can play those old 8mm analog
tapes, is starting to make some very strange noises. The kind of noises
which generally come just before an extended session dedicated to the
extrication of a terminally crinkled tape which has just been firmly
wrapped around and embedded within a surprising amount of severely jammed
helical scan hardware. The spouse
and the grandparents have all let it be known that this is not an
acceptable course of events, so your editor has been tasked with saving
all of this legacy data.
One could, of course, go to a local merchant who, for an amount of money
obtainable via an hour or two of consulting work, would transfer this data
safely to some sort of optical digital media, where it would be guaranteed
to survive for at least a few months. Or one could spend an order of
magnitude more time figuring out how to do the work on a Linux system
without the intervention of said merchant. Needless to say, your editor
never thought twice - something which explains a number of difficult
situations in which he has found himself over the years.
This article is the first of (probably) three which describe your editor's
odyssey through the hazards of video processing on Linux. The topic this
time around is the capture of video data - how does one get imagery from a
video tape onto a disk drive? The second segment will look at video
editing, turning a disk full of home movies into something moderately more
professional in appearance. Then the final installment will go into DVD
authoring, otherwise known as the process of getting all that old footage
into the hands (and players) of the grandparents.
The older camera is an analog-only device, necessitating the use of some
sort of analog-to-digital conversion on the way into the computer. As it
happens, your editor is in possession of a Hauppauge WinTV PVR-250 card
which, one would think, is ideally suited to this task. Hauppauge is known
for working with the free software community, with the effect that its
hardware is well supported by the IVTV driver which, after a long
development process, was merged into the 2.6.22 kernel. So, one would
think, grabbing the data from this device should be easy. And it is,
though it took your editor some time to figure out how.
As it turns out, there are very few video capture applications for Linux.
And there is nothing which is really aimed at people trying to bring in
data from analog cameras. One could use a PVR system like MythTV or Freevo
for this purpose, but they are not really intended for this use case. Your
editor, who has been through the process of setting up MythTV in the past,
chose not to take this approach.
One possible candidate was dvgrab, a tool which is part of the Kino project. This tool, however, is
intended for use with FireWire-attached video cameras - we will see how
well it works in that mode shortly. There is also a -v4l2 option
which claims to capture via Video4Linux2, seemingly ideal for this
purpose. Alas, dvgrab is written to use the V4L2 streaming mode, and,
amazingly, the IVTV driver does not support that mode. So dvgrab refuses
to work with the Hauppauge devices. A look at the code suggests that
convincing it to use the V4L2 read/write mode should not be too hard, but
that was beyond the scope of your editor's ambitions at this time.
As an aside, this sort of glitch seems to be a common problem with the
Video4Linux2 API. V4L2 is well suited to letting applications drive video
hardware to the very fullest extent of its capabilities, but that
flexibility comes at the cost of forcing quite a bit of complexity onto the
application side. A truly flexible V4L2 application must be prepared to
cope with a wide variety of hardware and to operate in very different ways
depending on what it finds. Most application developers do not make that
effort, with the result that incompatibilities between applications and
specific video devices are distressingly common. The V4L2 API is, in some
ways, similar to the approach taken by X11, with some similar results:
there was a long period where many applications performed badly when the
display was not running in an 8-bit pseudocolor mode. X11 has worked out
in the end; hopefully the same will happen with V4L2.
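To make the point about application-side complexity concrete, here is a minimal Python sketch of the check an application would need to make before choosing an I/O method. It asks the driver for its capabilities with the VIDIOC_QUERYCAP ioctl() and looks at the read/write and streaming bits. The constants are copied from the kernel's videodev2.h header; the function names and the /dev/video0 path are assumptions made for illustration and do not come from any of the tools discussed here.

```python
import fcntl
import struct

# Values from <linux/videodev2.h>.
VIDIOC_QUERYCAP = 0x80685600         # _IOR('V', 0, struct v4l2_capability)
V4L2_CAP_VIDEO_CAPTURE = 0x00000001
V4L2_CAP_READWRITE = 0x01000000      # plain read()/write() I/O
V4L2_CAP_STREAMING = 0x04000000      # memory-mapped or user-pointer streaming I/O

# struct v4l2_capability: driver[16], card[32], bus_info[32], then six
# 32-bit words (version, capabilities, device_caps, reserved[3]).
CAP_STRUCT = struct.Struct("=16s32s32s6I")


def decode_capabilities(raw):
    """Turn the raw bytes of a v4l2_capability structure into a dict."""
    driver, card, _bus, _version, caps, *_rest = CAP_STRUCT.unpack(raw)
    return {
        "driver": driver.split(b"\0", 1)[0].decode("ascii", "replace"),
        "card": card.split(b"\0", 1)[0].decode("ascii", "replace"),
        "capture": bool(caps & V4L2_CAP_VIDEO_CAPTURE),
        "readwrite": bool(caps & V4L2_CAP_READWRITE),
        "streaming": bool(caps & V4L2_CAP_STREAMING),
    }


def query_capabilities(path="/dev/video0"):
    """Ask the driver behind path what it can do (the path is an assumption)."""
    with open(path, "rb", buffering=0) as dev:
        buf = bytearray(CAP_STRUCT.size)
        fcntl.ioctl(dev, VIDIOC_QUERYCAP, buf)   # the kernel fills buf in place
        return decode_capabilities(bytes(buf))


def pick_io_method(info):
    """Prefer streaming, fall back to read(), or give up (hypothetical policy)."""
    if info["streaming"]:
        return "streaming"
    if info["readwrite"]:
        return "read"
    return None
```

On real hardware one would call pick_io_method(query_capabilities()) and branch on the result. An application which only implements the streaming branch is the kind which refuses to work with a read-only driver.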
Another possibility was mencoder, a tool which is packaged with mplayer. Your editor does not doubt
that mencoder is capable of acquiring a video stream from this device,
converting it into any format one could imagine, and, while it's at it,
changing the camera angle and improving the musical talents of the children
being filmed. But anybody who has read the
mplayer/mencoder man page knows that it is a masterpiece of its kind -
a work written to a length that less verbose authors (Neal Stephenson, say)
could only dream about - though Stephenson does do a better job of keeping
the plot moving.
The length of the manual reflects the complexity of the tool.
A typical mencoder command seems to run to about four
terminal lines - and that's for a relatively wide terminal. An example
from the mencoder documentation reads like this:
mencoder -oac lavc -ovc lavc -of mpeg -mpegopts format=xsvcd -vf \
scale=480:480,harddup -srate 44100 -af lavcresample=44100 -lavcopts \
vcodec=mpeg2video:mbd=2:keyint=18:vrc_buf_size=917:vrc_minrate=600:\
vbitrate=2500:vrc_maxrate=2500:acodec=mp2:abitrate=224 \
-ofps 30000/1001 -o movie.mpg movie.avi
The end result is that nobody who has not developed significant expertise
in video technology, codecs, formats, and more will be able to create one
of these commands. Mencoder is a highly capable tool, but approaching it
for a task like this is reminiscent of trying to get to the corner store
starting with a build-your-own-automobile kit. There are just too many
pieces (incomprehensible pieces at that) to put together.
Then, there is transcode. The man
page for this utility formats up to a good 50 pages, so it is not the
simplest tool either. This problem space, it would appear, forces the
creation of complex interfaces. Transcode has a V4L2 input module, which should do the
trick, but, like the dvgrab version, it requires streaming I/O capability.
So transcode, too, fails to work for this purpose; your editor is starting
to think that it might be time to hack a bit on the IVTV driver.
Another candidate was cinelerra - a
video editing tool which we will see again in future installments. Your
editor tried cinelerra on a few different platforms, using both binary
distributions and building from source. Suffice to say that building
cinelerra from source is not something to attempt when one is short on time
or short on temper. Cinelerra has a record mode, but it requires the V4L2
streaming capability. Of course, it does not bother to check whether that feature is
available or not, with the result that attempts to record video yield only silent blackness.
Cinelerra is a
vastly powerful editing tool, but it was not usable for this task.
So how did your editor finally succeed in getting the analog video data to
disk? The first step was to locate the highly useful v4l2-ctl
application which, seemingly, is only available from the V4L-DVB code repository. This tool
provides command-line access to the extensive set of V4L2 ioctl()
calls, enabling detailed configuration of the device. In particular, your
editor made use of it to switch the device to its composite video input.
The second step, then, is decidedly low-tech:
cp /dev/video priceless-video-data.mpg
The end result is a file containing just the video and audio data desired,
in a form which, as it turns out, can be burned directly to DVD. There is
no preview of incoming data, no computer-based camera control, no little
flashing counters. But it works.
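For the curious, the cp command above amounts to a loop of plain read() calls on the device node, which is the I/O method the IVTV driver does support. A minimal Python sketch of the same loop follows. The function name, chunk size, and byte limit are assumptions made for illustration; the limit is there only so that the copy can stop on its own rather than waiting for an interrupt from the keyboard.

```python
def capture(device_path, output_path, max_bytes=None, chunk_size=64 * 1024):
    """Copy a stream from device_path to output_path with read() calls.

    Stops at end of file, when the device stops delivering data, or after
    max_bytes bytes if a limit is given. Returns the number of bytes copied.
    """
    copied = 0
    with open(device_path, "rb", buffering=0) as src, \
            open(output_path, "wb") as dst:
        while max_bytes is None or copied < max_bytes:
            want = chunk_size
            if max_bytes is not None:
                want = min(chunk_size, max_bytes - copied)
            chunk = src.read(want)
            if not chunk:          # nothing more to read
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

A call like capture("/dev/video", "priceless-video-data.mpg", max_bytes=100 * 1024 * 1024) would grab roughly the first 100MB of the stream. As with cp, the input has to be selected with v4l2-ctl beforehand.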
The current state of the art for video camcorders is to provide digital
data via an IEEE 1394 (FireWire) port. When one has this sort of device,
life is rather easier - though it seems that there really is only one game
in town. That game is kino - a video
editing tool - and its associated dvgrab tool. Either tool will work for
capture from a digital video device. They can control the camera, split
the incoming data into scenes, and generally make the process painless.
Technology does actually get better sometimes. Kino and dvgrab will only
store data in the DV format,
necessitating a transcoding operation before writing DVDs, but that is a
minor difficulty.
Your editor has learned a few things from this process. One is that the
IVTV driver needs some work. But the real lesson is that working with
video data under Linux involves dealing with a level of complexity that is
far beyond what most people have any desire to understand. And this
complexity hits hardest at the very front end: trying to get video data
onto the system and into a workable format. Your editor suspects that most
people who run into this wall quickly give up and buy a proprietary system
for this kind of work. In other words, there's a whole world full of
creative people doing interesting things with video, and Linux, despite
having many of the basic capabilities these people need, is not an option for
them.
Meanwhile, your editor has a disk full of video imagery - and a healthy
appreciation for just how nice the storage explosion of the last few years
has been. Now it's just a matter of bashing all of that data into a useful
form for grandparental distribution - a process which looks like it might
just take a bit of time. Stay tuned for your editor's video editing
experience, due to appear on these pages within the next few weeks.
By Jake Edge
December 12, 2007
Audio and video content are increasingly important components of the World
Wide Web, which some of us remember, initially, as a text-only experience.
Users of free software need not be told that the multimedia aspect of the
net can be hard to access without recourse to proprietary tools. So the
decisions which are made regarding multimedia support in the next version
of the HTML specification are of more than passing interest. A current
dispute over the recommended codecs for HTML5 shows just how hard
maintaining an interoperable web may be.
In particular, several big players have complained about the inclusion of
Ogg Vorbis and Theora in the standard, causing a predictable uproar in
the free software community. To many, it looks like a classic
free-versus-proprietary standards showdown. In truth, the issue is not
clear cut; there are nuances that are difficult to turn into a banner
headline. The heart of the problem is patents, but, unexpectedly, it is
the Ogg codecs that are claimed to be at risk.
Nokia fired a very public shot at the Ogg family with a position
paper, calling it "proprietary". It is unclear what Nokia hoped to
gain with this statement, other than inflaming the community, as Ogg Vorbis
and Theora are clearly open codecs, with free reference implementations
– just the opposite of proprietary. In addition, unlike most (or
all) other
codecs, a patent search was done to look for relevant patents for Vorbis
and Theora, with the Xiph.Org Foundation
claiming that none could be found. Some contend that an exhaustive patent
search is essentially impossible, but most
codecs (MP3, H.264, etc.) are known to be patent-encumbered, which
would seem to make them a poor choice for HTML5.
Ogg, Vorbis, and Theora
Ogg is a container format that can contain multiple chunks of data,
typically multimedia data. Ogg is designed so that it can be processed as
it is received, rather than having it all available at once, to facilitate streaming.
Vorbis is a codec (short for coder-decoder) that encodes audio data
at various bitrates. Vorbis is a lossy, compressed format that saves space
at the expense of perfect reproduction, much like MPEG-1 Audio Layer 3 aka
MP3. Theora is a codec for video data, also lossy, akin to MPEG-4. An Ogg file
could contain a mixture of Theora and Vorbis data to handle the video and
audio of a particular work, but it is not in any way tied to those
formats. An Ogg file could instead contain MP3 and MPEG-4 data or data from any
other codec.
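The container layout is simple enough to sketch. An Ogg stream is a sequence of pages, each starting with the capture pattern "OggS" and a 27-byte header followed by a segment table, so a reader can deal with each page as it arrives. The Python below is a minimal sketch of that header layout as described in the Ogg specification (RFC 3533). The function names are assumptions made for illustration, the page checksum is read but not verified, and the codec guess relies on the signatures which open the Vorbis and Theora identification headers.

```python
import struct

# Fixed part of an Ogg page header (RFC 3533), little-endian:
# capture pattern, version, header type, granule position,
# stream serial number, page sequence number, CRC, segment count.
PAGE_HEADER = struct.Struct("<4sBBqIIIB")


def parse_page(data, offset=0):
    """Parse one Ogg page; return (header, payload, offset of the next page)."""
    (magic, version, header_type, granule, serial,
     sequence, crc, n_segments) = PAGE_HEADER.unpack_from(data, offset)
    if magic != b"OggS":
        raise ValueError("not at an Ogg page boundary")
    table_start = offset + PAGE_HEADER.size
    segment_table = data[table_start:table_start + n_segments]
    payload_start = table_start + n_segments
    payload_end = payload_start + sum(segment_table)   # one length byte per segment
    header = {
        "version": version,
        "first_page": bool(header_type & 0x02),   # beginning of a logical stream
        "last_page": bool(header_type & 0x04),    # end of a logical stream
        "granule_position": granule,
        "serial": serial,
        "sequence": sequence,
        "crc": crc,                               # not checked in this sketch
    }
    return header, data[payload_start:payload_end], payload_end


def guess_codec(first_payload):
    """Guess the codec from the start of a stream's first page payload."""
    if first_payload.startswith(b"\x01vorbis"):
        return "Vorbis"
    if first_payload.startswith(b"\x80theora"):
        return "Theora"
    return "unknown"
```

Nothing in the page header names a codec; that information lives in the payload, which is why the container can carry data in other formats just as easily.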
The draft of an HTML5 specification under construction by the Web Hypertext Application Technology Working Group
(WHATWG) contained, up until this week, a
recommendation for the Ogg codecs. Ogg was not required, only listed as
something that SHOULD (i.e. not MUST) be implemented by conforming
browsers. That recommendation was dropped from the draft this week, replaced with the
following:
It would be helpful for interoperability if all
browsers could support the same codecs. However, there are no known
codecs that satisfy all the current players: we need a codec that is
known to not require per-unit or per-distributor licensing, that is
compatible with the open source development model, that is of
sufficient quality as to be usable, and that is not an additional
submarine patent risk for large companies. This is an ongoing issue
and this section will be updated once more information is
available.
Some of the big browser makers, notably Microsoft and Apple, have said that
they will not support Ogg Theora – Vorbis is less of an issue –
out of a concern for patents, particularly submarine patents. Ian Hickson,
WHATWG spokesperson, points
to the Eolas and MP3 patent attacks against Microsoft (with damages in
excess of a billion dollars) as examples of what the large, deep-pocketed
companies are concerned about. If there is a patent covering (or appearing
to cover) any of the techniques used in Theora, it is the large companies
that are going to be on the hook.
Some in the community believe
this move is part of a proprietary lock-in play:
Vorbis
provides the perfect escape for proprietary audio prisons. Apple and Nokia
are having problems with consumers and authors actually waking up and using
free, non-patent-encumbered, widely available, unrestricted,
non-proprietary
technology. Since Vorbis directly threatens their ability to sell traps,
they are extorting your compliance with threats of not supporting the HTML5
spec.
There may be some truth to that, but there are some legitimate
problems with Theora as well. The technical complaints tend to compare it
to H.264 (the most popular MPEG-4 codec), but that is something of a red
herring. Neither the WHATWG nor the World
Wide Web Consortium (W3C) is going to allow a technology known to be
licensed only on a royalty basis into HTML5. W3C, which will eventually make
the final decision on what goes into HTML5, has a policy of requiring
technology to be licensed in a royalty-free (RF) mode before it can be approved for
inclusion in a standard.
All members of a particular W3C working group are required to disclose
patents they believe to be relevant and to provide them to implementors on
an RF basis. There may be relevant patent holders who are not members of the
working group, thus not subject to that requirement, but if they have
enforced their patent on a particular technology, the W3C will try to find
an alternative. There may also be patent trolls waiting for someone with
deep pockets to implement something covered by a patent they hold –
this is the submarine patent threat.
Apple, Nokia, Microsoft and others have already implemented (and licensed)
MPEG-4, so there would be no additional risk to them if that were used as
the baseline video codec for the web. Using Theora as an alternative is seen
by the larger players as a huge increase in their risk, with no benefit to
their customers because there is, for all intents and purposes, no Theora
content out there. For free software and smaller
companies, the situation is clearly quite different.
The lack of Theora-encoded content is the crux of the matter. There might
be lots of whining, but big companies would be forced by their customers
to support Theora, patent suit risk or no, if there were interesting
content available in only that form. This has led to a call
for more Theora content:
Do compelling demos. Release video in Theora format. It may be easy to use
a service that provides video for you in exchange for giving them certain
rights but if you want your format to succeed, then increased usage is the way.
The WHATWG folks seem to have the needs of free software firmly in mind;
certainly the W3C RF policy makes it abundantly clear that a proprietary
solution will not be required, or even recommended, for HTML5. The
participants on the mailing list, and Hickson in
particular, have been very patient with the onslaught of flamers
screaming about the change. The whole HTML5 effort is centered around
interoperability for the web, so any technology that will not be
implemented by Microsoft and Apple runs directly counter to that goal.
WHATWG seems to be between the proverbial rock and hard place.
Several potential solutions are being considered. Possibilities include
leaving a video codec recommendation out of HTML5 – not a
particularly interoperable solution – or finding a codec that is old
enough that any patents covering it must have expired. Another alternative
would be to get some other current codec (MPEG-4 for instance) licensed on
an RF basis. This issue will undoubtedly be discussed at the W3C Video on the Web
Workshop currently being held in San Jose and Brussels. Stay tuned.
December 11, 2007
This article was contributed by Biju Chacko
In the last few years FOSS.in has
established itself as one of the largest open source conferences in
Asia. This year the organizers re-orientated the conference to address
what they see as the Indian open source community's biggest challenge. LWN
dropped by the conference to see the changes and get an impression of
the results.
FOSS.in was started in 2001 under the name "Linux Bangalore" in the centre
of India's software industry. At that time it was difficult to get
information about free software in India -- internet access was still not
widespread, the software industry was focused on proprietary tools and
the publishing industry had not picked up on FOSS yet. Linux Bangalore
addressed an untapped market for FOSS education and was an unqualified
success from the start.
LB, as it was known, was focused on encouraging the use of free software
in India. The content was a mix of tutorials, howtos and advocacy. The
conference retained a user orientation for many years -- the only
significant developer activity was from the Indian localization
community.
By 2005 FOSS had hit the mainstream. The Linux Bangalore organizers began
to feel that it needed a greater raison d'être than advocacy
and popularization. Despite changing its name to FOSS.in to reflect a
larger scope, the danger remained that the conference would soon be lost
among a host of other sources of open source information.
It was then that the FOSS.in team, led by Atul Chitnis, turned its
attention to another problem. The Indian free and open source community
had long worried that its level of participation in the open source
process was very low in relation to its size. There were very few
visible Indian hackers -- India was beginning to develop a reputation of
being a nation of FOSS consumers that did not contribute back.
This was especially alarming because many sections of the local
software industry had wholly moved to free software. The embedded
software industry, for example, had discarded proprietary alternatives
in favor of Linux. So there was a large base of qualified developers
who did not seem to be getting involved.
After a favorable response to the developer-oriented tracks in
FOSS.in/2005 and 2006, the FOSS.in team decided to refocus the event on
encouraging FOSS contributions. The key, they decided, was exposure and
communication. They felt that if Indian developers had an opportunity to
meet and interact with active contributors they'd be inspired to do more
themselves.
To this end, they made a number of changes to the format. They added
'Project Days' -- day-long tracks on a specific FOSS project. They slowed
the usually hectic pace of the conference by reducing the number of talks.
This gave the
audience more time to talk to speakers between talks. The more leisurely
pace encouraged lots of interesting conversations in the corridors.
Other facilities -- a "hack centre" containing machines, tents outside
the venue and a lounge area -- provided space for corridor conversations
and post-talk discussions to develop further.
The results were mixed. Attendance took a major hit. Previous editions
averaged about 3000 attendees; this year attendance dropped by over half
to about 1200. It was, however, a far more clued-in crowd which did not plague
speakers with off-topic questions. There were some complaints that
some talks were pitched at a far more basic level than was needed.
The Project Days seemed to have more participation
than was originally expected. There were tracks on Debian,
Mozilla, Gnome, OpenSolaris, Fedora, KDE, OpenOffice and the IndLinux project. In contrast, energy
levels at the main conference seemed muted. This was partly due to the
smaller crowds.
However, in the opinion of this correspondent, it was also partly due
to scheduling and content. The tone of a conference is set early
on. The conference would have been better served by an initial
keynote that was flamboyant and inspiring rather than the low-key
technical talk by the decidedly non-flamboyant Naba Kumar (the Anjuta lead).
The insistence on purely technical talks provided context and guidance
to potential contributors but may have failed to communicate the
motivation: fun and high ideals. I think it's fair to say that the most
effective recruitment tool was when the always entertaining Rusty
Russell made a hapless member of the audience create a kernel patch
onstage and mail it to LKML.
The success of FOSS.in/2007 may not be measurable. It may be years
before the Indian FOSS community is proportional in size to the Indian software
industry. There are probably many other factors that will affect this
outcome. But the transition of FOSS.in to a true hacker conference can only
help this to happen.
Page editor: Jonathan Corbet