It is something of a tradition to have a "State of Embedded Linux" talk at
each Embedded Linux Conference (ELC), and the recently concluded ELC
Europe did not disappoint. In his keynote, MIPS architecture
maintainer Ralf Baechle looked at the "pain points" for
embedded developers, as well as what was being done to address them. He
also looked to the future and made some predictions of what was coming for
the embedded Linux landscape.
Baechle started working with MIPS Linux in 1993 or 1994, but was using
Linux on x86 even earlier than that. He started off his talk by reporting
on two embedded Linux summits that were
recently held. One was at LinuxCon Japan and the other was held in
Cambridge two days before his talk. There were a large number of companies
represented at the summits, and "a lot of the big players".
There were 16 attendees at the Tokyo summit and 12 at the one in Cambridge.
That was a "fairly good representation of the industry", he said, along
with a bunch of architecture maintainers and users.
The summits were organized to talk about problems, Baechle said, as
"the good stuff doesn't need to be talked about". The
meetings were held "off the record" so that the discussions could be frank.
The attendees identified a number of pain points for embedded Linux.
The first is the problem with "IP blocks", which are particular components
that are licensed for use in system-on-chip (SoC) devices. A typical SoC
"consists of a number of licensed IP blocks", and it is very
hard to determine which of the blocks included on a particular SoC are
supported by the kernel. In addition, developers often don't know that a particular IP block
is supported, so drivers and other support code get developed multiple
times. There is a plan to maintain a list of these IP blocks in a wiki,
along with their support status and device tree bindings, Baechle said.
Another problem area is "legal pain", mostly surrounding the
GPL. That has caused developers to look at alternatives to glibc because
they fear it moving to the LGPLv3. In addition, the GPLv3 has been interpreted by
an unnamed company as being targeted at voiding its patents. Baechle
doesn't agree with that interpretation, but GPLv3 certainly makes some
companies uncomfortable. Android avoids all GPL code where it can, he
said. Also, the BusyBox lawsuits have caused some consternation in the
embedded world because of the demands for Makefiles and installation
instructions. Not everyone interprets the GPL to require those things, but
the question is, as yet, unresolved.
There is also a fair amount of "kernel pain" in the embedded
community, starting with the "huge version gap" among the
kernels used in embedded Linux devices. Kernels from 2.6.11 up through
recent kernels were mentioned as being used, and "not even 2.4 is
really really dead", he said. But, Linux is finding its way into
more and more products. There is a large company that has made it a policy
to put Linux in any of its products that will need to be supported for more
than 10 years.
Another part of the kernel pain is the large amount of out-of-tree code
that embedded Linux developers are working with. Part of the problem comes
from multiple groups within companies, each with its own fairly small set
of patches. There is little communication between those groups, so that
causes a "huge group of patches to build up" within the
company. But the single largest patchset that is carried around by
embedded Linux developers is the RT_PREEMPT patchset and the summit
participants "really would like to see it go upstream", he
said. There may be an effort among the participating companies to try to
help make that happen.
But, there were not only pains discussed at the summits, there was also
discussion of various things that had been added to the kernel recently,
many of them with the support of the CE Linux Forum (CELF). SquashFS, which is a compressed
read-only filesystem, was merged, as was LZO
support for it. LZMA support for the filesystem made it as far as -next
before that particular implementation was rejected by Linus Torvalds.
There is hope for the YAFFS2
flash filesystem to be merged, as the work is now being cosponsored by CELF and others.
A way to remove unused functions from kernel builds
(i.e. -ffunction-sections support) for saving
space is getting close to being merged as well, though it is currently held
up by some PA-RISC linker problems. Using that can result in savings of
around 7% of the kernel size, he said. While the merger of CELF and the
Linux Foundation was not known until the Cambridge summit, Baechle
expressed optimism that it would be good for the embedded Linux community.
Linaro presented itself at one of the summits. It considers itself a
"community facing group", he said, that is working to reduce
pain in the ARM world. It has 70 full-time engineers doing open source
work. Right now, Linux can "at best produce one image per SoC
family", which results in some projects needing as many as 50
images, all of which are slightly different variations. Linaro wants to
reduce that pain so that companies can "differentiate themselves not
by fixing random bugs, but by adding new features".
One thing that may help reduce the proliferation of slightly different
variations is the device tree work. Device trees describe the buses,
devices, memory, interrupts, and so on for a particular SoC. That tree gets
passed to the kernel at boot time, which will allow kernels to support more
SoCs within a single image.
It is currently being used by PowerPC across
all of its platforms and MIPS is using it as of 2.6.37-rc1. Baechle said
that ARM maintainer Russell King is "not quite convinced"
about device trees, but he believes that King eventually will be.
Virtualization is a hot topic in the embedded Linux world these days, but
it is "not going to be for everybody". Systems that are too
resource constrained will not be interested in virtualization, but others
will be. He went through various virtualization technologies available for
Linux including containers, Xen, KVM, and two proprietary solutions from
Wind River and MontaVista. Each has its place, but containers for OS-level
virtualization and KVM for full virtualization are likely to be the
dominant players for embedded devices, at least partially because they are
part of the mainline kernel.
Baechle sees virtualization as a game changer for larger embedded systems.
For example, high availability systems can use a pair of guests that can
fail over to each other. That will also allow in-service software upgrades.
Alternatives to glibc were next on the agenda. Embedded developers are
looking for those alternatives because glibc is "the size of an
aircraft carrier". It complies with all of the standards but that
comes at a heavy price. uClibc is one alternative, but the problem is that
it is "yet another API" that application developers need to learn.
But Embedded GLIBC (EGLIBC) offers
an alternative for embedded developers that doesn't require a separate
API. It is a variant of glibc that is maintained by CodeSourcery and is
"embedded friendly". Unlike glibc (whose maintainer "says 'embedded
crap' frequently"), it can be configured without some
features, which leads to a reduction in code size, while still allowing
applications that don't use those features to run without modification.
In many cases, the same application can run
on the desktop or the embedded device and there aren't two different
toolchains required. EGLIBC is another game changer, according to Baechle,
though it is not for the smallest systems. But it simplifies development
which leads to "instant ISV [independent software vendor] support".
In a look at the mobile distribution space, Baechle was clearly impressed
with MeeGo. He thinks that it will be a "fairly hot commodity in the
future" because it uses the typical Linux software stack. Android,
on the other hand, "feels alien", though Google does a good
job with its development tools. Because MeeGo is stewarded by the Linux
Foundation, it is in more neutral hands than Intel's would be, he said.
The "working upstream" policy of MeeGo is very important, he said. That policy is
increasing the pressure on other embedded Linux community members to get
their code upstream. MeeGo has the most push from the industry and a
tremendous amount of money behind it. He is optimistic about its future,
saying that "MeeGo is going to change the game a little bit".
The embedded world is changing, Baechle said.
"Embedded" used to be a synonym for "resource-constrained", with functionality
that was reasonably easy to implement. But modern devices are multi-functional
and share a lot of technology with desktop and server systems. There are
devices using the NUMA code to get good performance from multiple memory
banks, for example. SMP was originally developed for servers, moved into
the desktop world, and is now being used by embedded devices.
In wrapping up his talk, Baechle looked into his crystal ball and made a
few predictions. Over the next year or so, he believes that three more
architectures will get merged, as will YAFFS2, but that the RT_PREEMPT
patchset won't be. The
pressure to work upstream will continue to increase which will lead
embedded companies to rethink how they handle source code and how they put
together their development teams.
"Feature-wise, Linux has become rather mature [and] very
stable", but "the complexity of the code has increased quite
dramatically over the last few years", Baechle said. There has been
progress made everywhere in the kernel, with no one feature that stands
out. That is likely to continue over the next few years, and we will be
seeing Linux in even more devices.
Hugin, the open source photo blending-and-stitching tool, made its second major release of 2010 this week. Among the bullet points are new visualization features, more automation for tricky parts of the image-alignment process, and two new major modes that continue to extend Hugin's functionality beyond the "panorama generator" label it typically wears.
Several release cycles ago, the Hugin project adopted a hybridized version-numbering scheme that blends release dates and traditional incremental numbering; as a result last Monday's release is designated Hugin 2010.2.0, which means it is the second stable release made in 2010 (rather than, for example, a February 2010 release). Source code packages as well as Mac OS X and Windows binaries are available for download directly from the project. Linux users can either consult compilation instructions tailored by distribution on the download page, or look for third-party builds. Regular snapshots and nightly builds are available for Fedora and Ubuntu.
Installation and setup issues
Hugin depends on a suite of external tools for the core tasks of remapping, stitching, blending, and exposure-fusing photographs. These include the PanoTools library (as of Hugin 2010.2.0, libpano12 is deprecated in favor of libpano13), Enblend and Enfuse, and several OpenGL libraries (freeglut, libGLU, and GLEW). Those compiling from source will also need version 2.7.0 or newer of the wxWidgets toolkit.
An ongoing struggle for the project is the lack of a patent-unencumbered
tool to automatically find and mark "control points" in images —
scene features shared between neighboring images in a panorama, which Hugin
uses to calculate the transformations that warp overlapping regions
together. This is particularly important for community distributions (such
as Fedora) with rules prohibiting patented software packages. The default
control point generator is Autopano-SIFT, which is
covered by a patent. For distributions that don't have Autopano-SIFT, it and other options can be installed manually, or users can simply pick control points by hand.
I tested Hugin 2010.2.0 on Ubuntu using the Hugin PPA repository. On Ubuntu, a full update includes not just the hugin, hugin-tools, and hugin-data packages, but also the libpano13 library package, without which the Hugin build will install, but fail to run due to a missing linked library. Also important to note is the autopano-sift-c package. Autopano-sift-C is a C rewrite of the original C# Autopano-SIFT utility; the autopano-sift-c package advertises that it replaces autopano-sift, but installing it does not update Hugin's preferences to point to the updated binaries. You must open "File -> Preferences -> Control Point Detectors" and select the new package, or else Hugin's automated panorama assistant will fail at run time.
Hugin presents a tabbed interface to the user, with separate tabs for the individual steps of a typical panorama-creation workflow: rearranging component images, assigning control points, calculating the "optimal" settings for remapping the images, and stitching the result into the desired format, whether that is a single combined image, a set of individual TIFFs, or any intermediate step. There is an assistant tab that automates the basic panorama-creation process, but for fine adjustments, you will have to delve into the individual tabs. The same is true when using Hugin for other purposes, such as perspective correction.
The most noticeable change for most Hugin users will be the improvements to the fast panorama preview window. This window uses OpenGL to render a small preview of the current panorama project. In addition to its value as a visualization tool, though, it can now be used to adjust the position, centering, rotation, and cropping of the final image. Left-clicking and dragging allows the user to reposition the panorama, and right-clicking allows the user to rotate it around the origin. It can even be used to make rough adjustments to individual images by de-selecting all but the desired images from a list in the toolbar.
The preview window also includes a "Layout" tab that displays thumbnails of the images in a graph, with colored edges connecting images that overlap. Gray lines denote overlapping images without control points assigned, while green, yellow, and red lines denote images with good, fair, and poor control point matching, respectively. Toolbar buttons provide one-click access to center, fit-to-window, and straighten the panorama.
Collectively, all of these changes combine to make the fast preview window a useful tool for large-scale correction to a panorama project. Without them, the user is at the mercy of the raw numbers generated by Hugin's control point and optimization routines. You can still examine the raw numbers, but it takes considerable experience to draw real meaning from them when Hugin's final output appears wildly distorted or otherwise unexpected.
Furthermore, if you make the basic image alignment in the fast preview window first (before running the control point generator), you will save time, because Hugin will only attempt to find control points between images that overlap in the preview. This behavior is configurable through Hugin's preferences.
Under the hood, Hugin also supports a wider range of camera lenses in its perspective- and distortion-correction routines. In addition to the normal and fisheye lens support of previous releases, it can now handle orthographic, stereographic, and equisolid lenses.
Hugin developers have added entirely new, non-panoramic features to the
application in previous releases, such as the ability to remap a photograph
into an architectural projection, correct perspective distortion in normal
photos, remove chromatic aberration, and calibrate lenses. Two new use
cases debut in 2010.2.0: linked bracketing and mosaic stitching.
Linked bracketing builds on Hugin's exposure fusion functionality, with
which the program can combine bracketed exposures into a single
high-dynamic-range (HDR) image (much like Luminance HDR can). In previous releases, Hugin needed to use control points and align the images before attempting the exposure fusion. With linked bracketing, the user instead simply selects the images in Hugin's "Images" tab, clicks "New stack," and moves to the final output step. Obviously, the selected images need to be aligned in-camera (such as taken from a tripod), but for those photographers who use Hugin primarily for exposure fusion, this saves considerable time.
While linked bracketing can be used in panoramic workflows, mosaic stitching represents an entirely new technique. In a panorama, the camera remains in virtually the same spot, and rotates to capture different views of the 360-degree scene. Mosaic stitching tackles the opposite situation, when the subject of the photo remains still, but you must move the camera around to photograph it.
The canonical example is photographing a large floor or wall; the subject is flat, but too large to be captured in one frame. To stitch such a mosaic in Hugin, the photographer imports the individual images, but adjusts them using the "Mosaic" mode in the Fast preview window's "Move/Drag" tab. This permits shifting the image without recalculating its position in 3-D, as is required with panoramic shots.
A supporting function introduced with 2010.2.0 is masking support. In Hugin's "Masks" tab, you can draw a polygonal mask around objects in any image that you wish to be excluded from the stitched final output. When stitching, Hugin uses samples from the other overlapping images. This can be used to cut out passersby walking through the frame, but as the site's tutorial explains, it can also be used to remove stationary objects from mosaic stitching scenes.
Weighing the changes
This release incorporates work started in several Google Summer of Code projects, and represents a good mix of new features, improvements of existing functionality, and user interface refinements. For example, I have used Hugin for several years, but this is the first release where I was happy with the control points automatically selected by the panorama "assistant" (a much friendlier alternative to a "wizard").
Similarly, the new visualization and image arrangement tools in the OpenGL-based fast panorama preview window actually make the application significantly easier to use. In fact, the fast preview window arguably includes enough tools now that it probably deserves a promotion in name. Yet it remains in the toolbar, next to the non-OpenGL panorama preview window (which I suppose should be called the "slow" preview by comparison).
Hugin's arrangement of tools is probably its main weak point. As listed in the beginning of the previous section, there are around a half-dozen image correction tasks that the application can perform, but panorama stitching is the only one that has earned a step-by-step "assistant." The existence of mosaic stitching would probably go undiscovered by anyone who did not read the project's tutorial site regularly, and the individual tools needed for lens calibration are similarly hidden, scattered among the application tabs and windows. The setting that controls Hugin's ability to skip control point generation for non-overlapping images is buried three preference windows deep, and must be set for every individual control point generator.
A side effect of the multi-tab approach taken in the Hugin UI is that even for straightforward tasks, it is often necessary to jump back and forth between the tabs several times, repeating optimization on some parameters in one run and on others in another. To the inexperienced user it is difficult to see that changes made in one tab affect the contents of other tabs. For example, panoramic photographer Yuval Levy has a detailed tutorial on his site about using the new Mosaic stitching workflow. By my count, it involves at least six visits to the "Optimizer" tab; perhaps more, depending on the number of images.
Maybe Hugin is restricted somewhat in its user interface because it builds on a set of several discrete tools, but the improvements seen in the panorama assistant show that they can be linked together in a manner that is accessible even to newcomers. I hope that in the future, the project will expose more of its non-panorama functionality in a similar manner.
The other area in which Hugin could still use improvement is helping the user diagnose problems. It is fairly common to attempt to "optimize" a panorama project and be presented with a warning dialog alerting you that "very high" distortion coefficients have been found. The only options at that point are to continue, or to revert the optimization entirely. If the logic exists that allows Hugin to "know" that the coefficients are bad, assisting the user in finding and fixing the source of the trouble should not be far behind. To put it another way, although the "assistant" approach does a good job of walking the user through a successful project, it is just as important to walk the user through troubleshooting a project.
Still, no one who needs any of Hugin's image-manipulation magic has any reason to not install the 2010.2.0 update. The visualization tools in the fast panorama preview allow drastically faster adjustments than can be performed in the "Optimizer", "Exposure", and "Stitcher" tabs. Recent builds have enabled the use of GPUs for some calculations, which is a tantalizing prospect to consider while waiting for a long optimization or stitching routine to complete. While I was still able to crash Hugin once or twice when working on large, multi-image panorama stitching tasks, it was significantly more stable than the 2009 release I had been using beforehand. It still takes a time investment to produce quality work — but that is always true with photography.
What do you get when you put together 80 to 100 hard-core database geeks
from ten different open source databases for a weekend? OpenSQLCamp,
which was held most recently at MIT.
Begun three years ago, OpenSQLCamp is a semi-annual unconference for
open source database hackers to meet and collaborate on ideas and theories in
the industry. It's held at various locations alternately in Europe and the
United States, and organized and run by volunteers. This year's conference
was organized by Sheeri Cabral, a well-known MySQL community leader.
This year's event included database hackers who work on MySQL, MariaDB,
PostgreSQL, VoltDB, Tokutek, and Drizzle. In contrast to the popular
perception that the various database systems are in a no-holds barred
competition for industry supremacy, most people who develop these systems
are more interested in collaborating with their peers than arguing with
them. And although it's OpenSQLCamp, programmers from "NoSQL" databases
were welcome and present, including MongoDB, Membase, Cassandra, and others.
While the attendees were mainly database engine developers, several
high-end users were present, including staff from Rackspace, GoDaddy,
VMware, and WidgetBox. The conference's location meant the participation
of a few MIT faculty, including conference co-chair Bradley Kuszmaul.
While few of the students who registered actually turned up, attendees were
able to learn informally about the software technologies which are now hot
in universities (lots of work on multi-processor scaling, apparently).
The conference started with a reception at the WorkBar, a shared
office space in downtown Boston. After a little drinking and socializing, participants slid immediately into discussing database and database
industry topics, including speculation on what Oracle is going to do with
all of its open source databases (answer: nobody knows, including the
people who work there), recent releases of PostgreSQL and MySQL, and how
VoltDB works. Whiteboard markers came out and several people shifted to
technical discussions that continued until 11pm.
Jignesh Shah of VMware brought up some interesting SSD testing results. In
high-transaction environments, it seems that batching database writes
actually reduces throughput and increases response times, completely
contrary to performance on spinning disks. For example, Jignesh had
experimented with asynchronous commit with large buffers, which means that
the database returns a success message to the client and fsyncs the data in
batches afterward. This reduced database write throughput, whereas on a
standard spinning disk RAID it would have increased it up to 30%. There
was a great deal of speculation as to why that was.
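The batching scheme in question can be sketched in a few lines (a toy model for illustration only; the class and batch size here are invented, not any database's actual mechanism): each write is acknowledged immediately, and a single fsync() later makes a whole batch durable at once.

```python
import os
import tempfile

class BatchedCommitLog:
    """Toy write-ahead log that acknowledges writes immediately and
    fsyncs them in batches, as in asynchronous/group commit."""

    def __init__(self, path, batch_size=8):
        self.f = open(path, "ab")
        self.batch_size = batch_size
        self.pending = 0      # writes buffered but not yet durable
        self.flushes = 0      # number of fsync calls issued

    def write(self, record: bytes):
        self.f.write(record + b"\n")
        self.pending += 1
        # The client gets "success" here, *before* the data is durable.
        if self.pending >= self.batch_size:
            self.flush()

    def flush(self):
        self.f.flush()
        os.fsync(self.f.fileno())   # one fsync covers the whole batch
        self.flushes += 1
        self.pending = 0

log = BatchedCommitLog(tempfile.mktemp(), batch_size=8)
for i in range(100):
    log.write(b"txn %d" % i)
log.flush()         # make the remaining tail durable
print(log.flushes)  # 13 batched fsyncs instead of 100
```

On spinning disks, amortizing the expensive fsync() across a batch is a clear win; Shah's surprising result was that on SSDs the same trick reduced throughput.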
A second topic of discussion, which shifted to a whiteboard for
comprehensibility, was how to put the "consistency" in "eventual
consistency" without increasing response time. This became a session on
Sunday. This problem, which is basic to distributed databases, is the
question of how you can ensure that any write conflict is resolved in
exactly the same way on all database nodes for a transactional database
which is replicated or partitioned across multiple servers. Historical
solutions have included attempting to synchronize timestamps (which is
impossible), using centralized transaction counter servers (which become
bottlenecks), and using vector clocks (which are insufficiently
determinative on a large number of nodes). VoltDB addresses this by a
two-phase commit approach in which the node accepting the writes checks
modification timestamps on all nodes which could conflict. As with many
approaches, this solution maintains consistency and throughput at a
substantial sacrifice in response times.
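The vector-clock limitation mentioned above is easy to demonstrate (a minimal sketch; the node names and clock values are invented for the example): vector clocks can order causally related updates, but conflicting concurrent writes compare as neither before nor after each other, so with many nodes a large fraction of conflicts still needs some other tie-breaking rule.

```python
def vc_compare(a, b):
    """Compare two vector clocks, given as dicts of node -> counter.
    Returns 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"   # causally unrelated: no winner can be chosen

# Node A saw its own second write after B's first: ordered.
print(vc_compare({"A": 2, "B": 1}, {"A": 1, "B": 1}))  # after
# A and B wrote independently: the clocks cannot pick a winner.
print(vc_compare({"A": 2, "B": 1}, {"A": 1, "B": 2}))  # concurrent
```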
The conference days were held at MIT, rather ironically in the William
H. Gates building. For those who haven't seen Frank Gehry's sculptural
architecture feat, it's as confusing on the inside as it is on the
outside, so the first day started late. As usual with unconferences, the
first task was to organize a schedule; participants proposed sessions
and spent a long time rearranging them in an effort to avoid
double-scheduling, which led to some "concurrency issues" with different
versions of the schedule. Eventually we had four tracks for the four
rooms, nicknamed "SELECT, INSERT, UPDATE and DELETE".
As much as I wanted to attend everything, it wasn't possible, so I'll just
write up a few of the talks here. Some of the talks and discussions will
also be available as videos from the conference web site later. I attended
and ran mostly discussion sessions, which I find to be the most useful
events of an unconference.
Monty Taylor of Drizzle talked about their current efforts to add
multi-tenancy support, and discussed implementations and tradeoffs with
other database developers. Multi-tenancy is another hot topic now that
several companies are going into "database as a service" (DaaS); it is the
concept that multiple businesses can share the same physical database while
having complete logical separation of data and being unaware of each other.
The primary implementation difficulty is that there is a harsh tradeoff
between security and performance, since the more isolated users are from
each other, the fewer physical resources they share. As a result, no single
multi-tenancy implementation can be perfect.
Since it was first described in the early 1980s, many databases have
implemented Multi-Version Concurrency Control (MVCC). MVCC is a set of
methods which allow multiple users to read and modify the same data
concurrently while minimizing conflicts and locks, supporting the
"Atomicity", "Consistency", and "Isolation" in ACID transactions. While
the concept is conventional wisdom at this point, implementations are
fairly variable. So, on request, I moderated a panel on MVCC in
PostgreSQL, InnoDB, Cassandra, CouchDB, and BerkeleyDB. The discussion
covered the basic differences in approach, as well as the issues each implementation faces in cleaning up old data versions.
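The core mechanism shared by those implementations can be sketched as a toy versioned store (a deliberately simplified model that ignores commit status, rollback, and locking; all names are invented): writers append new row versions tagged with a transaction ID instead of overwriting in place, and a reader sees only versions created before its snapshot.

```python
class MVCCStore:
    """Toy multi-version store: each write appends a (txid, value)
    version; a reader with snapshot S sees the newest version whose
    txid <= S. Old versions accumulate until garbage-collected."""

    def __init__(self):
        self.versions = {}   # key -> list of (txid, value), oldest first
        self.next_txid = 1

    def begin(self):
        txid = self.next_txid
        self.next_txid += 1
        return txid

    def write(self, txid, key, value):
        # Never overwrite: append a new version instead.
        self.versions.setdefault(key, []).append((txid, value))

    def read(self, snapshot, key):
        # Newest version visible to this snapshot, ignoring later writes.
        visible = [v for t, v in self.versions.get(key, []) if t <= snapshot]
        return visible[-1] if visible else None

db = MVCCStore()
t1 = db.begin()
db.write(t1, "x", "old")
reader = db.begin()        # snapshot taken here
t2 = db.begin()
db.write(t2, "x", "new")   # concurrent update: a new version, no lock
print(db.read(reader, "x"))   # old -- the reader's snapshot is stable
print(db.read(t2, "x"))       # new
```

The sketch also shows where the cleanup problem comes from: the "old" version of "x" is now dead weight that something like PostgreSQL's vacuum must eventually reclaim.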
Jignesh Shah of VMware and Tim Callaghan of VoltDB presented on current
issues in database performance in virtualized environments. The first,
mostly solved issue was figuring out degrees of overcommit for virtualized
databases sharing the same physical machine. Jignesh had tested with
PostgreSQL and found the optimal level in benchmark tests to be around 20%
overcommit, meaning five virtual machines (VMs) each entitled to 25% of the
server's CPU and RAM.
One work in progress is I/O scheduling. While VMware engineers have
optimized sharing CPU and RAM among multiple VMs running databases on
the same machine, sharing I/O without conflicts or severe overallocation
still needs work.
The other major unsolved issue is multi-socket scaling. As it turns out,
attempting to scale a single VM across multiple sockets is extremely
inefficient with current software, resulting in tremendous drops in
throughput as soon as the first thread migrates to a second socket. The
current workaround is to give the VMs socket affinity and to run one VM per
socket, but nobody is satisfied with this.
After lunch, Bradley ran a Q&A panel on indexing with developers from
VoltDB, Tokutek, Cassandra, PostgreSQL, and Percona. Panelists answered
questions about types of indexes, databases without indexes, performance
optimizations, and whether server hardware advances would cause major
changes in indexing technology in the near future. The short answer to
that one is "no".
As is often the case with "camp" events, the day ended with a hacking
session. However, only the Drizzle team really took advantage of it; for
most attendees, it was a networking session.
Elena Zannoni joined the conference in order to talk about the state of
tracing on Linux. Several database geeks were surprised to find out that
SystemTap was not going to be included in the Linux kernel, and that there
was no expected schedule for release of utrace/uprobes. Many database
engineers have been waiting for Linux to provide an alternative to Dtrace,
and it seems that we still have longer to wait.
The VoltDB folks, who are local to Boston, showed up in force and did a
thorough presentation on their architecture, use case, and goals. VoltDB
is a transactional, SQL-compliant distributed database with strong
consistency. It's aimed at large companies building new in-house
applications for which they need extremely high transaction processing
rates and very high availability. VoltDB does this by requiring users to
write their applications to address the database, including putting all
transactions into stored procedures which are then precompiled and executed
in batches on each node. It's an approach which sacrifices response times
and general application portability in return for tremendous throughput,
into the hundreds of thousands of transactions per second.
Some of the SQL geeks at the conference discussed how to make developers
more comfortable with SQL. Currently many application developers not only
don't understand SQL, but actively hate and fear it. The round-table
discussed why this is and some ideas for improvement, including: teaching
university classes, contributing to object-relational mappers (ORMs),
explaining SQL in relation to functional languages, doing fun "SQL tricks"
demos, and working on improving DBA attitudes towards developers.
In the last track of the day, I mediated a freewheeling discussion on "The
Future of Databases", in which participants tried to answer "What databases
will we be using and developing in 2020?" While nobody there had a crystal
ball, embedded databases with offline synchronization, analytical databases
which support real-time calculations, and database-as-a-service featured
heavily in the discussion.
While small, OpenSQLCamp was fascinating due to the caliber of its attendees; I
learned more about several new databases over lunch than I had in the
previous year of blog reading. If you work on open-source database
technology, are a high-end user, or are just very interested in databases,
you should consider attending next year. Watch the OpenSQLCamp web site
for videos to be posted, and for the date and location of next year's
conferences in the US and Europe.
Page editor: Jonathan Corbet