The current 2.6 prepatch remains 2.6.20-rc5. Patches have started
flowing into the mainline git repository again, however - 250 or so of
them. These patches are mostly fixes, but there is also a set of patches
from the memory technology devices tree adding an AT91 NAND driver and the
"Cafe" NAND driver (for OLPC systems).
For older kernels: 22.214.171.124 was released on
January 21; it includes fixes for several security problems.
Comments (none posted)
Kernel development news
Linux cannot be said to suffer from a shortage of virtualization
solutions. What is harder to come by, however, is a paravirtualization
system which is amenable to relatively easy understanding. A recent
entrant into the field changes that situation significantly: with just
6,000 lines of code (including the user-space portion), Rusty Russell's
lguest hypervisor provides a full, if spartan, paravirtualization
mechanism for Linux.
The core of lguest is the lg loadable module. At initialization
time, this module allocates a chunk of memory and maps it into the kernel's
address space just above the vmalloc area - at the top, in other words. A
small hypervisor is loaded into this area; it's a bit of assembly code
which mainly concerns itself with switching between the kernel and the
virtualized guest. Switching involves playing with the page tables - what
looks like virtual memory to the host kernel is physical memory to the
guest - and managing register contents.
The hypervisor will be present in the guest systems' virtual address spaces
as well. Allowing a guest to modify the hypervisor would be bad news,
however, as that would enable the guest to escape its virtual sandbox.
Since the guest kernel will run in ring 1, normal i386 page protection
won't keep it from messing with the hypervisor code. So, instead, the
venerable segmentation mechanism is used to keep that code out of reach.
The lg module also implements the basics for a virtualized I/O
subsystem. At the lowest level, there is a "DMA" mechanism which really
just copies memory between buffers. A DMA buffer can be bound to a given
address; an attempt to perform DMA to that address then copies the memory
into the buffer. The DMA areas can be in memory which is shared between
guests, in which case the data will be copied from one guest to another and
the receiving guest will get an interrupt; this is how inter-guest
networking is implemented. If no shared DMA area is found, DMA transfers
are, instead, referred to the user-space hypervisor (described below) for
execution. Simple disk and console drivers exist as well.
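The buffer-copy "DMA" idea can be sketched in a few lines of C. The structure and function names here are purely illustrative (the real implementation lives in the lg module and works through hypercalls); the sketch only shows the core notion that a "transfer" is nothing but a memory copy into a bound buffer:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative only: a "DMA" buffer bound to a key (a guest address in
 * the real implementation). */
struct pseudo_dma {
	unsigned long key;   /* address this buffer is bound to */
	void *buf;           /* where incoming data lands */
	size_t len;          /* capacity of the buffer */
	size_t used;         /* bytes received in the last transfer */
};

/* A "DMA" transfer to a bound key is just a copy into the receiver's
 * buffer; the receiving guest would then be sent an interrupt. */
static size_t pseudo_dma_send(struct pseudo_dma *dst, const void *src,
			      size_t len)
{
	size_t n = len < dst->len ? len : dst->len;

	memcpy(dst->buf, src, n);
	dst->used = n;
	return n;
}
```

If no buffer is bound to the target address, the real code hands the transfer to the user-space hypervisor instead.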
Finally, the lg module implements a controlling interface accessed
via /proc/lguest - a feature which might just have to be changed
before lguest goes into the mainline. The user-space hypervisor creates a
guest by writing an "initialize" command to this file, specifying the
memory range to use, where to find the kernel, etc. This interface can
also be used to receive and execute DMA operations and send interrupts to
the guest system. Interestingly, the way to actually cause the guest to
run is to read from the control file; execution will continue until the
guest blocks on something requiring user-space attention.
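The shape of that control interface can be sketched as follows. The exact command syntax is not reproduced here; the string layout below is hypothetical, meant only to show the pattern of one write to create the guest followed by a blocking read to run it:

```c
#include <stdio.h>
#include <string.h>

/* Build an "initialize" command for the /proc/lguest control file.
 * The argument layout in this string is hypothetical; it stands in for
 * the real command format, which specifies the memory range, the
 * kernel image location, and so on. */
static int build_init_cmd(char *cmd, size_t size, unsigned long mem_base,
			  unsigned long mem_size, const char *kernel)
{
	return snprintf(cmd, size, "initialize %lx %lx %s",
			mem_base, mem_size, kernel);
}

/* Usage sketch on the host side:
 *
 *	int fd = open("/proc/lguest", O_RDWR);
 *
 *	write(fd, cmd, strlen(cmd));	-- create the guest
 *	while (read(fd, buf, n) > 0)	-- run until user-space help
 *		service_request(buf);	--   (DMA, interrupts) is needed
 */
```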
Also on the kernel side is a paravirt_ops implementation
for working with the lguest hypervisor; it must be built into any
kernel which will be run as a guest. At system initialization time, this
code looks for a special signature left by the hypervisor at guest startup;
if the signature is present, it means the kernel is running under lguest.
In that situation, the lguest-specific paravirt_ops will be
installed, enabling the kernel to run properly as a guest.
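The detection step amounts to a signature comparison at a known location. A minimal sketch, in which the signature value and its location are made up for illustration (the real check lives in the lguest paravirt_ops code):

```c
#include <string.h>

/* Hypothetical signature; the real value is whatever the lguest
 * launcher leaves behind at guest startup. */
#define LGUEST_SIG "lguest"

static int running_under_lguest(const char *boot_sig)
{
	return memcmp(boot_sig, LGUEST_SIG, sizeof(LGUEST_SIG) - 1) == 0;
}

/* At initialization time the guest kernel would then do something like:
 *
 *	if (running_under_lguest(sig_area))
 *		paravirt_ops = lguest_paravirt_ops;
 *
 * after which privileged operations are routed to the hypervisor
 * instead of being executed directly. */
```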
The last component of the system is the user-mode hypervisor client. Its job is
to allocate a range of memory which will become the guest's "physical"
memory; the guest's kernel image is then mapped into that memory range.
The client code itself has been specially linked to sit high in the virtual
address space, leaving room for the guest system below. Once that guest
system is in place, the user-mode client performs its read on the control
file, causing the guest to boot.
A file on the host system can become a disk image for the guest, with the
user-mode client handling the "DMA" requests to move blocks back and forth.
Network devices can be set up to perform communication between guests. The
lg network driver can also work in a loopback mode, connecting an
internal network device to a TAP device configured on the host; in this
way, guests can bind to ports and run servers.
With sufficient imagination, how all of this comes together can be seen in
the diagram to the right. The lguest client starts the process, running in
user space on the host. It allocates the memory indicated by the blue box,
which is to become the guest's virtualized physical memory, then maps in
the guest kernel. Once the user-mode client reads from
/proc/lguest, the page tables and segment descriptors are tweaked
to make the blue box seem like the entire system, and control is passed to
the guest kernel. The guest can request some services via the kernel-space
hypervisor code; for everything else, control is returned to the user-mode
client.
That is a fairly complete description of what lguest can do. There is no
Xen-style live migration, no UML-style copy-on-write disk devices, no
resource usage management beyond what the kernel already provides, etc. As
Rusty put it at linux.conf.au, lguest eschews fancy features in favor of
cute pictures of puppies. The simplicity of this code is certainly one of
its most attractive qualities; it is easy to understand and to play with.
It should have a rather easier path into the kernel than some of the other
hypervisor implementations out there. Whether it can stay simple once
people start trying to do real work with it remains to be seen.
Comments (7 posted)
This is the fifth article in the irregular LWN series on writing video
drivers for Linux. Those who have not yet read the introductory article
may want to start there.
Before any application can work with a video device, it must come to an
understanding with the driver about how video data will be formatted. This
negotiation can be a rather complex process, resulting from the facts that
(1) video hardware varies widely in the formats it can handle, and
(2) performing format transformations in the kernel is frowned upon.
So the application must be able to find out what formats are supported by
the hardware and set up a configuration which is workable for everybody
involved. This article will cover the basics of how formats are described;
the next installment will get into the API implemented by V4L2 drivers to
negotiate formats with applications.
A colorspace is, in broad terms, the coordinate system used to
describe colors. There are several of them defined by the V4L2
specification, but only two are used in any broad way. They are:
- V4L2_COLORSPACE_SRGB. The [red, green, blue] tuples familiar
to many developers are covered under this colorspace. They provide a
simple intensity value for each of the primary colors which, when
mixed together, create the illusion of a wide range of colors. There
are a number of ways of representing RGB values, as we will see below.
This colorspace also covers the set of YUV and YCbCr representations.
This representation derives from the need for early color
television signals to be displayable on monochrome TV sets. So the
Y (or "luminance") value is a simple brightness value; when
displayed alone, it yields a grayscale image. The U and V (or Cb and
Cr) "chrominance" values describe the blue and red components of the
color; green can be derived by subtracting those components from the
luminance. Conversion between YUV and RGB is not entirely
straightforward, however; there are several formulas to choose from.
Note that YUV and YCbCr are not exactly the same thing, though the
terms are often used interchangeably.
- V4L2_COLORSPACE_SMPTE170M is for analog color representations
used in NTSC or PAL television signals. TV tuners will often produce
data in this colorspace.
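The YUV/RGB conversion mentioned above can be made concrete. Here is one common formula set, the ITU-R BT.601 coefficients in a fixed-point form; other standards use different coefficients, which is part of why the conversion is not straightforward:

```c
/* RGB -> YUV conversion using ITU-R BT.601 coefficients, scaled by 256
 * for integer arithmetic. All values are in the 0-255 range; U and V
 * are offset by 128 so they fit in a byte. */
static unsigned char clamp_byte(int v)
{
	return v < 0 ? 0 : v > 255 ? 255 : v;
}

static void rgb_to_yuv(int r, int g, int b,
		       unsigned char *y, unsigned char *u, unsigned char *v)
{
	/* Y = 0.299R + 0.587G + 0.114B */
	*y = clamp_byte((  77*r + 150*g +  29*b) >> 8);
	/* U (Cb) and V (Cr) carry the blue and red differences */
	*u = clamp_byte((( -43*r -  85*g + 128*b) >> 8) + 128);
	*v = clamp_byte((( 128*r - 107*g -  21*b) >> 8) + 128);
}
```

Note how a pure gray input (r == g == b) yields U and V values of 128, the "no color" midpoint.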
Quite a few other colorspaces exist; most of them are variants of
television-related standards. See this page from the V4L2
specification for the full list.
Packed and planar
As we have seen, pixel values are expressed as tuples, usually consisting
of RGB or YUV values. There are two commonly-used ways of organizing those
tuples into an image:
- Packed formats store all of the values for one pixel together
- Planar formats separate each component out into a separate
array. Thus a planar YUV format will have all of the Y values stored
contiguously in one array, the U values in another, and the V values
in a third. The planes are usually stored contiguously in a single
buffer, but it does not have to be that way.
Packed formats might be more commonly used, especially with RGB formats,
but both types can be generated
by hardware and requested by applications. If the video device
supports both packed and planar formats, the driver should make them both
available to user space.
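The difference shows up directly in how a pixel is located in memory. A sketch, using a packed 24-bit RGB buffer and, for simplicity, a planar format with three full-resolution planes stored contiguously (the subsampled variants described below would shrink the U and V planes):

```c
#include <stddef.h>

/* Packed RGB24: all three bytes of a pixel are adjacent, so a single
 * offset locates the whole pixel. */
static size_t packed_rgb24_offset(size_t x, size_t y, size_t bytesperline)
{
	return y * bytesperline + x * 3;
}

/* Planar: each component lives in its own array, so one pixel needs
 * three offsets, one per plane. */
static void planar_offsets(size_t x, size_t y, size_t width, size_t height,
			   size_t *yoff, size_t *uoff, size_t *voff)
{
	size_t plane = width * height;

	*yoff = y * width + x;
	*uoff = plane + *yoff;
	*voff = 2 * plane + *yoff;
}
```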
Color formats are described within the V4L2 API using the venerable
"fourcc" code mechanism. These codes are 32-bit values, generated from
four ASCII characters. As such, they have the advantages of being easily
passed around and being human-readable. When a color format code reads,
for example, 'RGB4', there is no need to go look it up in a table.
Note that fourcc codes are used in a lot of different settings, some of
which predate Linux. The MPlayer application uses them internally. fourcc
refers only to the coding mechanism, however, and says nothing about which
codes are actually used - MPlayer has a translation function for converting
between its fourcc codes and those used by V4L2.
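The encoding itself is simple: each character lands in one byte of the 32-bit value, first character in the least significant byte. This is the v4l2_fourcc() macro from <linux/videodev2.h>:

```c
/* Pack four ASCII characters into a 32-bit format code; the first
 * character occupies the least significant byte. */
#define v4l2_fourcc(a, b, c, d) \
	((unsigned int)(a) | ((unsigned int)(b) << 8) | \
	 ((unsigned int)(c) << 16) | ((unsigned int)(d) << 24))

/* The 'RGB4' code mentioned above is how V4L2_PIX_FMT_RGB32 is
 * defined: */
#define EXAMPLE_RGB32 v4l2_fourcc('R', 'G', 'B', '4')
```

Because the code is an ordinary integer, drivers can compare formats with ==; because it is four ASCII bytes, a developer staring at a hex dump can still read it.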
In the format descriptions shown below, bytes are always listed in memory
order - least significant bytes first on a little-endian machine. The
least significant bit of each byte is on the right; for each color field,
the lighter-shaded bit is the most significant.
(Table omitted: RGB format names, fourcc codes, and per-byte bit layouts.)
When formats with empty space (shown in gray, above) are used, applications
may use that space for an alpha (transparency) value.
The final format above is the "Bayer" format, which is generally something
very close to the real data from the sensor found in most cameras. There
are green values for every pixel, but blue and red only for every other
pixel. Essentially, green carries the more important intensity
information, with red and blue being interpolated across the pixels where
they are missing. This is a pattern we will see again with the YUV formats.
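The simplest possible sketch of that interpolation: at a pixel site where only red (or blue) was sampled, green can be estimated by averaging the four green neighbors. Real demosaicing algorithms are considerably more sophisticated; this is only meant to show the idea:

```c
/* Estimate the green value at a red or blue site of a Bayer image by
 * averaging the four green neighbors (above, below, left, right).
 * Edge handling is omitted for brevity. */
static unsigned char green_at(const unsigned char *img, int width,
			      int x, int y)
{
	int sum = img[(y - 1) * width + x] + img[(y + 1) * width + x] +
		  img[y * width + (x - 1)] + img[y * width + (x + 1)];

	return sum / 4;
}
```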
The packed YUV formats will be shown first. In the table, each byte
holds one of three values:
- Y (intensity)
- U (Cb)
- V (Cr)
(Table omitted: packed YUV format names, fourcc codes, and byte layouts.)
There are several planar YUV formats in use as well. Drawing them all out
does not help much, so we'll go with one example. The commonly-used
"YUV 4:2:2" format (V4L2_PIX_FMT_YUV422, fourcc
422P) uses three separate arrays. A 4x4 image would be
represented like this:
As with the Bayer format, YUV 4:2:2 has one U and one V value for every
other Y value; displaying the image requires interpolating across the
missing values. The other planar YUV formats are:
- V4L2_PIX_FMT_YUV420: the YUV 4:2:0 format, with one U and one
V value for every four Y values. U and V must be interpolated in both
the horizontal and vertical directions. The planes are stored in
Y-U-V order, as with the example above.
- V4L2_PIX_FMT_YVU420: like YUV 4:2:0, except that the
positions of the U and V arrays are swapped.
- V4L2_PIX_FMT_YUV410: A single U and V value for each sixteen
Y values. The arrays are in the order Y-U-V.
- V4L2_PIX_FMT_YVU410: A single U and V value for each sixteen
Y values. The arrays are in the order Y-V-U.
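The subsampling ratios translate directly into plane sizes. A sketch of the arithmetic (assuming no padding, and dimensions divisible as each format requires):

```c
#include <stddef.h>

/* YUV 4:2:0: one U and one V value for every 2x2 block of Y values, so
 * the chroma planes are each one quarter the size of the Y plane. */
static size_t yuv420_size(size_t width, size_t height)
{
	size_t y = width * height;

	return y + y / 4 + y / 4;	/* 1.5 bytes per pixel overall */
}

/* YUV 4:1:0: one U and one V value for every 4x4 block of Y values. */
static size_t yuv410_size(size_t width, size_t height)
{
	size_t y = width * height;

	return y + y / 16 + y / 16;	/* 1.125 bytes per pixel */
}
```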
A few other YUV formats exist, but they are rarely used; see this page for the full list.
A couple of formats which might be useful for some drivers are:
- V4L2_PIX_FMT_JPEG: a vaguely-defined JPEG stream; a little
more information can be found here.
- V4L2_PIX_FMT_MPEG: an MPEG stream. There are a few variants
on the MPEG stream format; controlling these streams will be discussed in a future article.
There are a number of other, miscellaneous formats, some of them quite
obscure; this page has a list of them.
Now that we have an understanding of color formats, we can take a look at
how the V4L2 API describes image formats in general. The key structure
here is struct v4l2_pix_format (defined in
<linux/videodev2.h>), which contains these fields:
- __u32 width: the width of the image in pixels.
- __u32 height: the height of the image in pixels.
- __u32 pixelformat: the fourcc code describing the image format.
- enum v4l2_field field: many image sources will interlace the
data - transferring all of the even scan lines first, followed by the
odd lines. Real camera devices normally do not do interlacing. The
V4L2 API allows the application to work with interlaced fields in a surprising
number of ways. Common values include V4L2_FIELD_NONE
(fields are not interlaced), V4L2_FIELD_TOP (top field only),
or V4L2_FIELD_ANY (don't care). See this page for a complete list.
- __u32 bytesperline: the number of bytes between two adjacent
scan lines. It includes any padding the device may require. For
planar formats, this value describes the largest (Y) plane.
- __u32 sizeimage: the size of the buffer required to hold the full image.
- enum v4l2_colorspace colorspace: the colorspace being used.
All together, these parameters describe a buffer of video data in a
reasonably complete manner. An application can fill out a
v4l2_pix_format structure asking for just about any sort of format
that a user-space developer can imagine. On the driver side, however,
things have to be restrained to the formats the hardware can work with. So
every V4L2 application must go through a negotiation process with the
driver in an attempt to arrive at an image format that is both supported by
the hardware and adequate for the application's needs. The next
installment in this series will describe how this
negotiation works from the device driver's point of view.
Comments (10 posted)
Patches and updates
Core kernel code
Filesystems and block I/O
Virtualization and containers
Page editor: Jonathan Corbet
Next page: Distributions>>