Brief items
The current development kernel is 3.1-rc3 (code-named "Divemaster
Edition"),
released on August 22.
Linus says:
And a few thank-yous are in order: things are
looking good. The diffstat looks reasonable (the one big addition is in
Documentation), and while I could have wished for even less churn, I'm
pretty happy. The rc2 to rc3 shortlog is appended, and I think it mostly
looks pretty reasonable and short. Which is not to say that I'm not hoping
that things will calm down even further in the later rc's, but at least so
far I don't think I've had much reason to complain.
See the
full changelog for all the details.
Stable updates: no stable updates have been released in the last
week, and none are in the review process as of this writing.
WhoMeNope. That would imply that I understood it, but my brain is
far too small to understand rcutree.c - that's what we have
paulmcks for.
--
Andrew Morton
Has anybody ever looked at a real computer? Doesn't anybody know
how computer math works any more? Doing that as a (slow) double
division on 32-bit is so stupid that it's past even just
"wrong". It's way off in la-la-land, sitting in a corner, all
hopped up on drugs and painting its nails purple.
--
Linus Torvalds
Some organisations disagree with this and say the license has to
explicitly be reinstated according to GPLv2 section 4.
I've talked to quite a few lawyers worldwide and they all think
that downloading the software will give you a new license, so I
wouldn't be too worried about these organisations.
--
Armijn Hemel on the "GPL death penalty"
Videos of three talks at the recently concluded Linux wireless summit have
been posted. These talks cover the implementation of
dynamic frequency selection, 802.11s mesh
networking, and mesh network testing with
wmediumd.
Kernel development news
By Jonathan Corbet
August 23, 2011
The "native Linux KVM tool" (which we'll call "NLKT") is a hardware
emulation system designed to support virtualized guests running under the
KVM hypervisor. It offers a number of nice features, but an attempt to get
this code merged into the 3.1 kernel was deferred by Linus, who did not
want to deal with another controversial development at that time. This
tool's developers have let it be known that it will be back for the 3.2
merge window; controversy is sure to follow. The core question raised by
this project is: what code is appropriate for the kernel tree, and which
projects should live in their own repositories elsewhere?
NLKT was started in response to unhappiness
about QEMU, the state of its code, and the pace of its development. It
was designed with simplicity in mind; NLKT is meant to be able to boot a
basic Linux kernel without the need for a BIOS image or much in the way of
legacy hardware emulation. Despite its simplicity, NLKT offers "just
works" networking, SMP support, basic graphics support, copy-on-write block
device access, host filesystem access with 9P or overlayfs, and more. It
has developed quickly and is, arguably, the easiest way to get a Linux
kernel running on a virtualized system.
Everybody seems to think that NLKT is a useful tool; nobody objects to its
existence. The controversy comes for other reasons, one of which is the
name: the tool simply calls itself "kvm." The potential for confusion with
the kernel's KVM subsystem is clear - that is why this article made up a
different acronym to refer to the tool. "KVM" is already seen as an
unfortunate name - searches for the term bring in a lot of information
about keyboard-video-mouse switches - so adding more ambiguity seems like a
bad move. It is also seemingly viewed by some as a move to be the
"official" hardware emulator for KVM. The NLKT developers have, thus far,
resisted a name change, though.
The bigger fight is over whether NLKT belongs in the kernel at all. It is
not kernel code; it is a program that runs in user space. The
question of whether such code should be in the kernel's repository is
certainly the one that will decide whether it is merged for 3.2 or not.
NLKT would not be the first user-space tool to go into the mainline kernel;
several others can be found in the tools/ directory. Many of
them are testing tools used by kernel developers, but not all. The
"cpupower" tool was merged for 3.1; it allows an administrator to tweak
various CPU power management features. The most actively developed tool in
that directory, though, is perf, which has grown by leaps and bounds since
being merged into the mainline. The developers working on perf have been
very outspoken in their belief that putting the tool into the mainline
kernel repository has helped it to advance quickly.
Proponents say that, like perf, NLKT is closely tied to the kernel and
heavily used by kernel developers; like perf, it would benefit from being
put into the same code repository. KVM, they say, is also under heavy
development; having NLKT and KVM in the same tree would help both to
improve more quickly. It would bring more review of any future KVM
ABI changes, since a user of that ABI would be merged into the kernel as
well. Keeping the hardware emulation code near the drivers that code has
to work with is said to be beneficial to both sides. All told, they say,
perf would not have been nearly as successful outside of the mainline tree
as it has been inside it; merging NLKT can be expected to encourage the
same sort of success.
That success seems to be one of the things that opponents are worried
about; some have worried that the main
purpose is to increase the project's visibility so that it succeeds at the
expense of competing projects. The ABI development benefits are
challenged; any changes would clearly still have to work with tools like
QEMU regardless of whether NLKT is in the kernel, so QEMU developers would
have to remain in the loop. It is even better, some
say, to separate the implementation of an ABI from its users; that forces
the implementers to put more effort into documenting how the ABI should be
used.
There is also concern that, once we start seeing more user-space tools
going into the kernel tree, there will be an unstoppable flood of them.
Where does it stop, they ask - should we pull in the C library, the GNU
tools, or, maybe, LibreOffice? Linux is not BSD, they say; trying to put
everything into a single repository is not the right direction to take.
The answer to that complaint is that there is no interest in merging
arbitrary tools; only those that are truly tied to the kernel would
qualify. By this reasoning, NLKT is an easy decision. A C library is
something that could be considered; perhaps even graphics if the relevant
developers wanted to do that. But office suites are not really
eligible; there are limits to what should go into the mainline.
That was where the discussion stood at the beginning of the 3.1 merge
window; Linus decided not to pull NLKT at
that time. Instead, he clearly wanted the discussion to continue; he told
the NLKT developers that they would have to convince him in the 3.2 merge
window. It looks like that process is about to begin; the NLKT
repository is about to be added to linux-next in anticipation of a pull
request once the merge window opens. This time, with luck, we'll have a
resolution of the issue that gives some guidance for those who would merge
other user-space tools in the future.
By Jonathan Corbet
August 24, 2011
It is not unheard of for kernel developers to refuse to support a
particular user-space interface that, they think, is poorly designed or
hard to maintain into the future. A user-space project refusing to use a
kernel-provided interface in the hope of forcing the creation of something
better is a rather less common event. That is exactly what is happening
with the udev project's approach to device tree information, though; the
result could be a rethinking of how that information gets to applications.
OLPC laptops have, among their other quirks, a special keyboard which
requires the loading of a specific keymap to operate properly. For the
older generations of laptops, loading this keymap has been easily handled
with a udev rule:
ENV{DMI_VENDOR}=="OLPC", ATTR{[dmi/id]product_name}=="XO", \
RUN+="keymap $name olpc-xo"
This rule simply extracts the name of the machine from the desktop
management interface (DMI) data that has been made available in sysfs. If
that data indicates that the system is running on an XO laptop, the
appropriate keymap file is loaded.
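The test that this rule performs can be sketched in a few lines of Python. This is only an illustration, not how udev evaluates rules: it assumes the DMI attributes are readable under /sys/class/dmi/id, and that the vendor string behind DMI_VENDOR comes from the sys_vendor attribute there. The function names are made up for the example.

```python
from pathlib import Path

# Assumed sysfs location of the DMI identification attributes.
DMI_ID_DIR = Path("/sys/class/dmi/id")


def read_dmi_field(name, base=DMI_ID_DIR):
    """Return one DMI attribute as a stripped string, or None if absent."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        # No such attribute (for example, a platform without DMI).
        return None


def is_olpc_xo(base=DMI_ID_DIR):
    """Mirror the rule's two tests: vendor is OLPC, product name is XO."""
    return (read_dmi_field("sys_vendor", base) == "OLPC"
            and read_dmi_field("product_name", base) == "XO")
```

On a machine with no DMI directory at all, both reads return None and the match simply fails, which is the situation described next.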
DMI is an x86-specific interface, though, and the upcoming (1.75)
generation of the XO laptop is not an x86-based machine. There is no DMI
information available on that laptop, so this rule fails; some other
solution will be needed.
These days, the source for hardware description information -
especially on non-discoverable platforms - is supposed to be the device
tree structure. So Paul Fox's solution
would seem to make sense: he created a new rule with a helper script to
extract the machine identification from the device tree, which happens to be
available in /proc/device-tree. It almost certainly came as a
surprise when this solution was rejected by
udev maintainer Kay Sievers, who said:
Reading such things from /proc is kinda taboo from code we maintain
in udev. All things not related to processes really do not belong
into /proc and udev code should never get into the way of possibly
deprecating these things in the long run, even when they might
never happen. I know, there is sometimes no other simple option,
but we generally prefer the inconvenience it causes, over adding
hacks to upstream code, to make a move to a generally useful
solution (which isn't /proc/*) more attractive.
Of course, Paul wasn't adding the /proc/device-tree interface;
criticism of such a move would not have been surprising. That directory has a
long history; it has been supported, under some architectures, since the
2.2 kernel. So one might think that it is a bit late to be complaining
about it; there are a number of /proc files added in those days
which would not be allowed into /proc now. In general, those
files are considered to be part of the user-space ABI at this point; like
it or not, we are stuck with them. The device tree directory has been around
for long enough that it almost certainly falls into that category; it's
hard to imagine that it would have been maintained for so long if there
were no programs making use of it. Whether or not the udev developers like
it, /proc/device-tree is not likely to go anywhere anytime
soon.
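For readers who have not looked at this interface, a lookup of the kind under discussion might look like the following Python sketch. It is not Paul's helper script, which is not reproduced here; it assumes only that the device tree's properties appear as files under /proc/device-tree and that string properties are NUL-terminated. The use of a root-level "model" property for machine identification is likewise an assumption.

```python
from pathlib import Path

# Path named in the discussion above.
DEVICE_TREE_DIR = Path("/proc/device-tree")


def read_dt_string(prop, base=DEVICE_TREE_DIR):
    """Read a device tree string property; return None if it is missing.

    Properties are exposed as raw bytes, so the value is cut at the
    first NUL byte before being decoded.
    """
    try:
        raw = (Path(base) / prop).read_bytes()
    except OSError:
        return None
    return raw.split(b"\0", 1)[0].decode("ascii", errors="replace")


def machine_model(base=DEVICE_TREE_DIR):
    """Return the root 'model' property (assumed to identify the machine)."""
    return read_dt_string("model", base)
```

Whatever replaces this under /sys would presumably carry the same string; the disagreement is over where it lives, not what it says.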
That still doesn't mean that the udev developers have to make use of it,
though, and they seem determined to hold out for something better. Quoting Kay again:
No, sorry, the time for dirty hacks in userspace, and work-arounds
for architectures and platforms that don't provide what is commonly
used elsewhere is over. There is no rush, it's new functionality,
and no need to start with 'transitions periods' that in reality
will never end. Stuff just needs to be fixed properly these days,
and papering over just hurts us more in the end.
Kay would like to see the machine identification information exposed
separately somewhere under /sys; it has even been suggested that
platforms using device trees could emulate the DMI directory found on x86
systems. That, to the udev developers, looks like a longer-term solution that doesn't put
udev in the position of blocking an ABI cleanup sometime in the future.
In essence, what we have is a user-space utility telling the kernel that an
interface it has supported for well over a decade is unacceptable and needs
to be replaced. To force that replacement, udev is refusing to accept
changes that make use of the existing interface. Whether that is a proper
course of action depends on one's perspective. To some, it will look like
a petty attempt to force kernel developers to maintain two interfaces with
duplicate information in the hope that a long-lived /proc file will
eventually go away, despite its long history. To others, it will seem like
a straightforward attempt to help the kernel move toward interfaces that
are more supportable in the long term.
In this particular case, it looks like udev probably wins. Adding the
machine identification somewhere in sysfs will be easy enough that it is
probably not worth the effort to fight the battle. In a more general
sense, this episode shows that the kernel ABI is not just something handed
down to user space from On High. User-space developers will have their
say, even a dozen years after the interface has been established; that is a
good thing. Having more developers look at these
issues from both sides of the boundary can only help in the creation of
better interfaces.
By Jake Edge
August 24, 2011
With his characteristically dry British humor, Matthew Garrett outlined the
current situation with x86 platform drivers at LinuxCon. These
drivers are needed to handle various "extra" hardware devices, like special
keys, backlight control, extended battery information, fans, and so on.
There is a wide range of control mechanisms that hardware vendors use for
these devices, and, even when the controller hardware is the same, different
vendors will choose different mechanisms to talk to the devices. It is a
complicated situation that seems to require humor—and perhaps
alcohol—to master.
Garrett does a variety of things for Red Hat, including hardware support
and firmware interfaces (e.g. for EFI).
Mostly he does "stuff that nobody else is really enthusiastic about
doing", he said. Platform drivers are "bits of hardware
support code" that are required to make all of the different pieces
of modern hardware function with Linux. Today's hardware is not the PC of
old and it requires code to make things work, especially for mobile devices.
He started by looking at keys, those used to type with, but also those
that alter display brightness or turn hardware (e.g. wireless) on and off.
The "normal" way that keys have been handled is that a key press causes an
interrupt, the kernel reads a value from the keyboard controller, and the
keycode gets sent on to user space. The same thing happens for a key up
event. This is cutting edge technology from "1843 or
something", which is very difficult to get wrong, though some
manufacturers still manage to do so. The first thing anyone writes when
creating a "toy OS" is the keyboard driver because it is so
straightforward.
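To show just how little is involved, here is a Python sketch of the decoding step. It assumes the classic PC "set 1" scancodes, where a key release is reported as the press code with the top bit set; the talk did not go into that level of detail, and prefix bytes for extended keys are ignored here.

```python
# Scancode set 1 (an assumption): release code = press code | 0x80.
RELEASE_BIT = 0x80


def decode_scancode(byte):
    """Split one scancode byte into (key_code, pressed)."""
    return byte & 0x7F, not (byte & RELEASE_BIT)


def decode_stream(data):
    """Decode a sequence of scancode bytes into (key_code, pressed) events."""
    return [decode_scancode(b) for b in data]
```

For example, decode_stream(bytes([0x1E, 0x9E])) yields a press followed by a release of key 0x1E. A real driver does this in an interrupt handler and hands the result to the input layer, but the logic is no more complicated than this.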
In contrast to that simple picture, Garrett then described what goes on for
getting key event information on a Sony laptop. The description was rather
baroque and spanned
three separate slides. Essentially, the key causes an ACPI interrupt, which
requires the kernel to do a multi-step process executing "general purpose
event" (GPE) code in the ACPI firmware, and calling ACPI methods to
eventually get a key code that ends up being sent to user
space. "This is called value add", he said.
Manufacturers are convinced that you don't want to manage WiFi the same way
on multiple devices. Instead, they believe you want to use the "Lenovo
wireless manager" (for example) to configure the wireless device.
"Some would call them insane", and Garrett is definitely in
that camp. The motivation seems to be an opportunity for the device maker
to splash their logo onto the screen when the manager program is run. As
might be guessed, there is no
documentation available because that would allow others to copy the
implementation, which obviates the supposed value add.
It is not just keyboards that require platform drivers, Garrett said.
Controlling radios, ambient light sensors ("everyone wants the
brightness to change when someone walks behind them"), extended
battery information (using identical battery controller chips, with the
interface implemented differently on each one), hard drive protection
(which always uses the same accelerometer device), backlight control,
CPU temperature, fan control, LEDs (e.g. a "you have mail" indicator, that
is "not really useful" but is exposed "for people who
don't have anything better to do with their lives"), and more, all
need these drivers.
Multiple control mechanisms
There are half-a-dozen different interfaces that these drivers will use to
control the hardware, starting with plain ACPI calls. That is generally
one of the easiest methods to use, because it is relatively straightforward
to read the ACPI tables and generate a driver from that information.
Events are sent to the driver, along with an event type, and some reverse
engineering is required to work out what the types are and what they do.
There are specific ACPI calls to get more information about the event as
well. Garrett's example showed two acpi_evaluate_object() calls
for the AUSB ("attach USB") and BTPO ("Bluetooth power on") ACPI methods,
which is all that is needed to turn on Bluetooth for a Toshiba
device. "Wonderful", he said.
A small micro-controller with closed-source firmware—the embedded
controller—is another means to control hardware. Ideally, you
shouldn't have to touch the embedded controller because ACPI methods are
often provided to do so. But, sometimes you need to access the registers of the
controller to fiddle with GPIO lines or read sensor data stored there. The
problem is that these register locations can and do change between BIOS
versions. While it is "considered bad form to write a driver for a
specific BIOS version", sometimes you have to do so. It is a fairly
fragile interface, he said.
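One way to find out which registers matter, and to check whether their offsets hold across BIOS versions, is to compare snapshots of the controller's register space taken before and after some action such as pressing a key. The Python sketch below assumes the debugfs file /sys/kernel/debug/ec/ec0/io mentioned later in this article, which requires root access and a mounted debugfs; the function names are invented for illustration.

```python
from pathlib import Path

# debugfs path given later in the article; root and debugfs are required.
EC_IO = Path("/sys/kernel/debug/ec/ec0/io")


def read_ec(path=EC_IO):
    """Return the embedded controller's register space as bytes."""
    return Path(path).read_bytes()


def changed_registers(before, after):
    """List (offset, old, new) for every register whose value differs."""
    return [(offset, old, new)
            for offset, (old, new) in enumerate(zip(before, after))
            if old != new]
```

Taking one snapshot with read_ec(), flipping a switch, taking another, and passing both to changed_registers() gives a short list of candidate offsets. Repeating that on a second BIOS version shows whether the offsets moved.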
Windows Management Instrumentation (WMI) is a part of the Windows driver
model that Microsoft decided would be nice to glue into ACPI. It has
methods that are based on globally unique IDs (GUIDs) corresponding to
events. A notify handler is registered for a GUID and it gets called when
that event happens. The Managed Object Format (MOF) code that comes with a
given WMI implementation is supposed to be self-documenting, but there is a
problem: it is compressed inside the BIOS using a Microsoft proprietary
compression tool "that we don't know how to decompress". As an
example of a WMI-based driver, Garrett showed a Dell laptop keyboard handling
driver that reports the exact same keycode that would have come from a
normal keyboard controller, but was routed through WMI instead, "because this is the future", he said.
Drivers might also be required to make direct BIOS calls, which
necessitates the use of a real mode int instruction. This is
"amazingly fragile" and incompatible with 64-bit processors.
Currently, the only time BIOS interrupts are invoked from user space is for
X servers and Garrett suggests that drivers should "never do this".
In fact, he went further than that: "If you ever find hardware that
does this, tell me and I will send you money for new hardware". If
you decide to write code that implements this instead, he said that he would pay
someone else money to "set fire to your house".
System Management Mode (SMM) traps are yet another way to control hardware,
but there seems to be a lot of magic involved. There are "magic
addresses" that refer to memory that is hidden from the kernel. In
order to use them, a buffer is set up and the address is poked, at which
point the "buffer contents magically change". There have been
various problems with the SMM implementations from hardware vendors
including some HP hardware that would get confused if SMM was invoked from
anything other than CPU 0. Garrett did not seem particularly enamored of
this technique, likening it to the business plan of the "Underpants
Gnomes".
The last control mechanism Garrett mentioned is to use a native driver to
access the hardware resources directly. Typically these drivers use ACPI
to identify that the hardware exists. The hardware is accessed using the
port IO calls (i.e. inb(), outb()), and will use native
interrupts to signal events. Various models of Apple hardware use these
kinds of drivers, Garrett said.
Consistent interfaces
While there are many ways to access the hardware, kernel hackers want to
provide a consistent interface to these devices. We don't want "to
have to install the Sony program to deal with WiFi". So, "hotkeys"
are sent through the input system, "keys are keys". Backlight
control is done via the backlight class. Radio control is handled with
rfkill, thermal and fan state via hwmon, and the LED control using the led
class. That way, users are insulated from the underlying details of how
their particular hardware implements these functions.
There are two areas that still have inconsistent interfaces, Garrett said.
The hard drive protection feature that is meant to park the disk heads when
an untoward acceleration is detected (e.g. the laptop is dropped) does not
have a consistent kernel interface. Also, the ambient light sensors are
lacking an interface. The latter has become something of a running joke
in the kernel community, he said, because Linus Torvalds thinks it should
be done one way, but the maintainer disagrees, so, as yet, there is no
consistent interface.
How do I work this?
Garrett also had some suggestions on figuring out how new/unsupported
hardware is wired up. There is a fair amount of
reverse engineering that must be done, but the starting point is to use
the acpidump and acpixtract utilities to find out what is in the ACPI
code in the hardware.
If the device is WMI-based, wmidump may
also be useful. Extracting the event GUIDs and registering a handler for
each will allow one to observe which ones fire for various external
events. Then it is a matter of flipping switches to see what happens,
parsing the data that is provided with the event, and figuring out how to do
something useful. This may require alcohol, he said.
For embedded controllers or direct hardware access, there are sysfs files
that can be useful. The embedded controller can be accessed via
/sys/kernel/debug/ec/ec0/io (at least for those who have debugfs mounted), or by using the ec_access
utility. Once again, you need to hit buttons, throw various switches, and
listen for fan changes. In addition, you should test that the register
offsets are stable for various machine and BIOS version combinations, he
said. You can find the IDs of devices to access them directly via
the /sys/bus/pnp/devices/*/id files, register as a PNP bus driver
for devices of interest, and then
"work out how to drive the hardware".
The overall picture that Garrett painted is one of needless complexity and
inconsistency that is promulgated by the hardware vendors. But, it is
something that needs to be handled so that all of the "extras" baked into
today's hardware work reliably—consistently—with Linux. While
it would be nice if all of these components were documented in ways that
Linux driver writers could use, that doesn't seem likely to change anytime
soon. Until then, Garrett and the rest of the kernel community will be
wrestling with these devices so that we don't get locked into
manufacturer-specific control programs.
[ I would like to thank the Linux Foundation for travel assistance to
attend LinuxCon. ]
Page editor: Jonathan Corbet