Brief items
The current development kernel is 2.6.36-rc7, released on October 6.
"This should be the last -rc, I'm not seeing any reason to keep
delaying a real release. There was still more changes to drivers/gpu/drm
than I really would have hoped for, but they all look harmless and
good. Famous last words." The short-form changelog is in the
announcement; kernel.org has the full changelog.
As of this writing, the final 2.6.36 release is not out, but it can be
expected at almost any time.
Stable updates: there have been no stable updates in the last week.
As the situation exists today, implementations for many
'high-level' programming languages operate in competition with the
kernel. Garbage Collection without Paging demonstrated a 218x
improvement in latencies and a 41x improvement in throughput by
integration of GC and the kernel's paging system (requiring a patch
to the Linux kernel). Intelligent, rational people often balk at
the notion of a high-level language-as-an-OS because GC, in their
experience, introduces so many performance issues - never mind that
GC has never had a fair, portable opportunity to leverage the
hardware features because the OS kernel is hogging them!
--
"dmbarbour"
No matter how hard I try, I always read this as "DAMAGED". Which I
can't help but imagine subliminally influences the reader's opinion
of the patches.
Of course I am excellent at naming things, see "chunkfs" and
"relatime". But some ideas for naming various concepts in this patch:
D_MIGHT_MOUNT
D_CHILL_OUT
D_ITS_COMPLICATED
--
Valerie Aurora
Quite frankly, if somebody has something in "next" (and really
meant for the _next_ merge window, not the current one) that is
marked for stable, I think that shows uncommonly bad taste. And
that, in turn, means that the "stable" tag is also very
debatable. It clearly cannot be important enough to really be for
stable if it's not even being aggressively pushed into the current
-rc.
--
Linus Torvalds
So changing kernel interfaces that get exported to user space is
always a disaster. Anybody who _designs_ for that kind of disaster
shouldn't be participating in kernel development, because they've
shown themselves to be unable to understand the pain and suffering.
--
Linus Torvalds
Hello NSA. I'm the first person ever banned from linux-kernel. I
was banned for spewing hackish off-topic stuff like a working stack
machine interpreter daemon, "Why the Plan 9 C compiler doesn't have
asm("")", and a packages-friendly internationalization of the file
names tree. Appended below is a trivial shell function that gets
rid of make.
--
Rick Hohensee is back (thanks to Valerie Aurora)
--
Matt Mackall
Linsched is a user-space simulator which runs the Linux scheduling
subsystem; it is intended to help developers working on improving (or
understanding) the scheduler. A new version, based on 2.6.35, has been
released.
"Since Linsched allows arbitrary hardware topologies to be modeled,
it enables testing of scheduler changes on hardware that may not be
easily accessible to the developer. For example, most developers don't
have access to a quad-core quad-socket box, but they can use LinSched
to see how their changes affect the scheduler on such boxes."
Google (the source of this release, but not the original developer) uses
Linsched to validate its scheduler work.
By Jonathan Corbet
October 12, 2010
The fanotify subsystem (originally "TALPA") was designed as a hook allowing
anti-malware applications to intercept - and possibly block - file-oriented
system calls. Getting this code into the mainline has been a long process,
involving redefined requirements, reworking the low-level VFS notification
code, and redesigning the user-space interface. After all that work,
fanotify developer Eric Paris was able to get the code merged during the
2.6.36 merge window. Developers started using the interface to do
interesting things; Lennart Poettering has mentioned, for example, using it
to monitor file accesses to improve system bootstrap times.
This long story, it seemed, was near an end.
Along came Tvrtko Ursulin, who pointed out
a problem with the fanotify system calls; he then followed up with a second issue. It seems that the results of
permission decisions were not always being handled correctly, and that the
fanotify_init() system call had, somewhere along the way, lost the
intended priority argument. The second issue, in particular, is serious
because it affects the user-space ABI, which must be maintained
indefinitely.
Eric acknowledged the problems and started to
ponder ways to get around them before the 2.6.36 release, but Alan Cox advised a more cautious approach:
Given two chunks of "oh dear" last minute stuff would it be safer
to simply punt and just pull the syscall/prototype itself (leaving
the rest) for the release. That can go into the first pass of the
next kernel tree, and if it the fixes and priority bits all work
out may well then be tiny enough for -stable.
Eric, not entirely pleased with the idea, carried on the discussion for a while.
Eventually, though, he sent in a patch
disabling the fanotify system calls:
This feature can be added in an ABI compatible way in the next
release (by using a number of bits in the flags field to carry the
info) but it was suggested by Alan that maybe we should just hold
off and do it in the next cycle, likely with an (new) explicit
argument to the syscall. I don't like this approach best as I know
people are already starting to use the current interface, but Alan
is all wise and noone on list backed me up with just using what we
have. I feel this is needlessly ripping the rug out from under
people at the last minute, but if others think it needs to be a new
argument it might be the best way forward.
Linus took the patch, so, while the fanotify code will be present in the
2.6.36 release, it will not be accessible from user space. Whether the
problems can be fixed in a way which is suitable for a 2.6.36.y stable
release remains to be seen.
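The ABI-compatible option Eric describes, carrying the priority in spare bits of an existing flags argument, can be illustrated with a small sketch. The flag names, bit positions, and helper functions below are invented for illustration and are not the fanotify interface; the point is only that a caller which never sets the new bits gets priority 0 and sees no change in behavior.

```python
# Hypothetical layout: the low bits are existing flags, and two spare
# bits carry a priority class. None of these values come from fanotify.
FLAG_CLOEXEC = 0x1
FLAG_NONBLOCK = 0x2
PRIORITY_SHIFT = 2
PRIORITY_MASK = 0x3 << PRIORITY_SHIFT   # bits 2-3


def pack_flags(flags: int, priority: int) -> int:
    """Fold a priority (0-3) into the spare bits of a flags word."""
    if not 0 <= priority <= 3:
        raise ValueError("priority must fit in two bits")
    if flags & PRIORITY_MASK:
        raise ValueError("flags already use the priority bits")
    return flags | (priority << PRIORITY_SHIFT)


def unpack_flags(word: int) -> tuple:
    """Split a flags word back into (flags, priority)."""
    priority = (word & PRIORITY_MASK) >> PRIORITY_SHIFT
    return word & ~PRIORITY_MASK, priority
```

An explicit new argument, the alternative mentioned in the patch description, avoids overloading the flags word but changes the call's signature, which is only painless before the interface has shipped in a release.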
Kernel development news
By Jonathan Corbet
October 12, 2010
As a general rule, kernel changes which break drivers at run time are not
seen as a good thing. Silent data corruption is also seen as the sort of
outcome that the development community would rather avoid. What happens
when it becomes necessary to choose one or the other? A long-running
debate in the ARM community provides at least one answer.
First, some background. Contemporary processors do not normally address
memory directly; instead, memory accesses are mediated through mappings
created in the hardware's memory management unit. Depending on the processor, those
mappings may be controlled through segment registers, page table entries,
or some other means. The mapping will translate a virtual address into a
physical address, but it also controls how the processor will access that
memory and, perhaps, cache its contents.
As explained by ARM maintainer Russell
King, ARM processors
have a number of attributes which affect how memory mappings work. There
is the concept of a memory type; "normal memory" is subject to reordering
of reads and writes, while "device memory" is not, for example. There is
also a bit indicating whether memory can be shared between processors;
unshared memory is faster because there is no need to worry about
cross-processor cache coherency. Then, like many CPUs, ARM processors can
specify different caching behavior in the mapping; RAM might be mapped with
writeback caching enabled, while device memory is uncached.
The ARM kernel maps RAM as normal memory with writeback caching; it's also
marked non-shared on uniprocessor systems. The ioremap() function,
used to map I/O memory for CPU use, is different: that memory is mapped as
device memory, uncached, and, maybe, shared. These different mappings give
the expected behavior for both types of memory. Where things get tricky is
when somebody calls ioremap() to create a new mapping for system
RAM.
The problem with these multiple mappings is that they will have differing
attributes. As of version 6 of the ARM architecture, the specified
behavior in that situation is "unpredictable." Users, as a rule, are not
enamored of "unpredictable" behavior, especially when their data is
involved. So it would make sense to avoid multiple memory mappings with
differing attributes. The ARM architecture has traditionally allowed this
kind of mapping, though, and a number of drivers, as a result, rely on
being able to remap RAM in this way.
Back in April, Russell raised an alarm about this issue, and posted
a patch causing ioremap() to fail when the target is system
RAM. This change avoids the potential data corruption issue, but at the
cost of breaking every driver using ioremap() in this way. There
were complaints at the time, so the patch sat out the 2.6.35 development
cycle, but Russell merged it for 2.6.36. There it sat until, with the
release imminent, Felipe Contreras posted a
patch backing out the change, saying:
Many drivers are broken, and there's no alternative in sight. Such
a big change should stay as a warning for now, and only later
should it actually fail.
Russell was not impressed. In his view, remapping RAM in this way is a
dangerous technique which will lead to data corruption sooner or later.
Despite being warned six months ago, driver developers have not fixed the
problem - there are as many broken drivers now as there were before. So,
he says, there is no benefit to waiting any longer; the dangerous behavior
should be stopped before somebody gets burned.
On the other side, driver developers point out that everything "seems to
work" as it is, so there is no urgent need for change. Furthermore,
Russell's patch looks to them like an API change, and the normal rule of
kernel development is that anybody making internal API changes is charged
with cleaning up any resulting messes. Fixing the drivers is not a trivial
task, and it's Russell's contention that they have always been broken, so
he is not willing (or necessarily able) to make them all work again.
The situation looked stalled, with a reversion of the patch looking like
the only way forward. But, in fact, it looks like there is a two-part way
out. The first part is to allow those mappings for one more cycle, but to
put in a user-visible warning when they happen. As Andrew Morton put it:
We *do* have a plan: as of 2.6.36, the kernel will emit a WARN_ON
trace when a driver does this. Offending code will be discovered,
developers will get bug reports from worried users, etc. This is
usually pretty effective.
It is the "worried users" who have been missing from the equation so far;
they can provide a type of pressure which, seemingly, is unavailable to
worried subsystem maintainers.
The other piece of the solution is to give driver developers a way to
obtain a chunk of physically-contiguous RAM which can be remapped in this
way. Such memory cannot be mapped simultaneously as system RAM. One nice
idea would be to simply unmap system memory when it is put to a device's
use, but that proves to be difficult to implement. The alternative is to
just set aside some memory at boot time and never let the kernel use it for
any other purpose; drivers can then allocate from that pool when they need
to. Russell has posted a patch which makes
this kind of memory set-aside possible.
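As a rough illustration of the two-part plan, the toy model below treats physical memory as address ranges. It is plain Python rather than kernel code, every name in it is invented, and it says nothing about how Russell's patch is actually implemented. A region carved out at "boot" is removed from the system RAM map, so a remap of that region passes the overlap check, while a remap of ordinary RAM is refused.

```python
class MemoryMap:
    """Toy model of physical memory; not a kernel interface."""

    def __init__(self, ram_start: int, ram_size: int):
        # System RAM as a list of half-open (start, end) ranges.
        self.system_ram = [(ram_start, ram_start + ram_size)]
        self.next_free = {}   # bump pointer for each carved-out region

    def reserve_at_boot(self, size: int) -> tuple:
        """Carve `size` bytes off the top of the last RAM range."""
        start, end = self.system_ram[-1]
        if size <= 0 or size >= end - start:
            raise ValueError("bad reservation size")
        region = (end - size, end)
        self.system_ram[-1] = (start, end - size)
        self.next_free[region] = region[0]
        return region

    def overlaps_system_ram(self, addr: int, size: int) -> bool:
        """True if [addr, addr + size) touches any system RAM range."""
        return any(addr < e and addr + size > s for s, e in self.system_ram)

    def remap(self, addr: int, size: int) -> str:
        """Stand-in for a device-style remap: refuse system RAM."""
        if self.overlaps_system_ram(addr, size):
            raise PermissionError("refusing to remap system RAM")
        return "device mapping at %#x (+%#x)" % (addr, size)

    def alloc_from_pool(self, region: tuple, size: int) -> int:
        """Hand out a contiguous chunk from a reserved region."""
        addr = self.next_free[region]
        if addr + size > region[1]:
            raise MemoryError("reserved pool exhausted")
        self.next_free[region] = addr + size
        return addr
```

In this model a driver would call `reserve_at_boot()` once, then `alloc_from_pool()` and `remap()` as needed; the cost of the approach is that the reserved memory is unavailable for any other use, as the article notes.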
So this particular situation will probably have a happy outcome, presuming
that the above plan is carried out and that no users are burned by
unpredictable mappings with the 2.6.36 kernel. But it highlights some
ongoing problems. It can be very hard to get developers to fix things,
especially if the current code "seems to work." Those developers also
became aware of the change at a very late date - if, indeed, they are even
aware of it now. It seems that testing of -rc kernels by developers is not
happening as much as we would like. Still, the development process seems
to work, and problems like this are overcome eventually.
By Jonathan Corbet
October 11, 2010
Single-pointer, mouse-based interfaces may be with us for a while yet, but
much of the interesting user-interface development activity these days
involves multitouch interfaces - those which can track multiple input
position events simultaneously. Multitouch can be found on a number of
touchscreen-based smartphones and on some trackpad-based laptops. There's
also a lot of interesting potential for multitouch in other places - think
about multiple people working on a table-size (or wall-size) display. At
the bottom level, though, detecting multitouch events requires support from
the hardware, and, for a number of touchpad-based systems, that hardware
comes from Synaptics.
The Linux driver for Synaptics touchpads does not currently support multitouch
mode for the usual reason: Synaptics has not deigned to tell the world how
to actually use its hardware. There is hope for change, though: Chase
Douglas has recently posted a set of Synaptics driver patches which add
multitouch support based on information obtained through reverse
engineering. There are evidently still a few glitches to work out, but the
mode appears to work. Alas, that does not necessarily mean we'll have
Synaptics multitouch support in the near future; the real difficulties may
be outside of the technical realm.
When Chase posted his patches, Takashi Iwai - better known for his ALSA
sound driver work - responded that he had
also worked on multitouch support:
Great! Finally someone found it out! I found this and made a
series of patches in 4 months ago. Since then, Novell legal
prohibited me to send the patches to the upstream due to "possible
patent infringing". Now you cracked out. Yay.
In the ensuing discussion, it became clear that Chase's patch was, in fact,
Takashi's patch with some improvements added. Takashi had apparently
posted the patch set once before Novell legal called a halt to the
exercise; that work had been stashed in a Launchpad page
until Chase stumbled across it, made some improvements, and resubmitted the
code. (Just to be clear: it does not appear that Chase was trying to take
credit for somebody else's work; he just hadn't understood the original
source of the code.)
Evidently, Takashi sees the independent posting of the code as being
sufficient to get around Novell legal's objections to its merging; he
responded with enthusiasm despite being allegedly on vacation. Chase's
kernel patches have been topped up with a
series of X.org patches taking advantage of the new kernel support and
adding user-level support for nice things like three-finger and "pinch"
gestures. All of this code has seemingly been waiting for its chance to
escape into the wild; all it took was for somebody else to start pushing
the kernel-side code.
So it seems that the floodgates are open and multitouch support on
Synaptics devices will be available to all. But there could yet be a
catch. As Chase noted: "If you're
the originator of the work, and my patch is accepted, I think we'll need
your SOB on it." Without a signoff from Takashi, this code may not
be accepted into the mainline. Takashi has suggested that his signoff from
the initial posting could be used, but he appears to be unwilling to repost
the code with a signoff now.
And that's where the trouble could come: your editor has had no contact with
Novell's legal department and has no special knowledge, but it would not be
surprising to learn that the concerns that department had about this code
might not be swept aside quite so easily. It's possible that the new signoff
from Takashi might not be forthcoming. Or, possibly, distributors will get
cold feet for the same reason that Novell legal did and decide not to
enable the feature. This code has been in the wild for some months now,
but it has not found its way into users' systems; the increasing attention
being paid to it now may not be enough to change that fact.
The inability to use Takashi's code - if, indeed, it comes to that - would
not be a huge problem. The important thing is the information on how the
hardware works; given that, some energetic hacker would undoubtedly
reimplement the changes in short order. Patent concerns could be harder to
work around, though. Without knowledge of which patents were at issue,
it's hard to say how big an obstacle they could be. By some accounts,
multitouch interfaces in general are patented, though that does not seem to
have stopped some companies from incorporating such interfaces into their
products. If it stops nervous Linux distributors, though, Linux users as a
whole will be the losers.
That, of course, is the nature of the software patent system. But the
scenario described above is highly speculative at this point. The
important thing is that the code (along with the associated hardware
information) is out there and available for those who
would incorporate it into their systems. Hopefully it will be more widely
distributed soon. Unfortunately, the wait for those nice wall-size
displays may be just a little bit longer.
By Jonathan Corbet
October 13, 2010
As this is being written, the last 2.6.36 prepatch has (with luck) been
released and the final release can be expected within a few days. So it is
time to have a look at how this development cycle has gone. There are a
couple of things which distinguish 2.6.36 from its predecessors in
interesting ways.
The 2.6.36 kernel will incorporate about 9400 changesets contributed by
1159 developers. It thus continues a recent trend toward less active
development cycles; here is what we have seen over the course of the last
year or so:
| Release | Changes |
| 2.6.31 | 10,883 |
| 2.6.32 | 10,998 |
| 2.6.33 | 10,871 |
| 2.6.34 | 9,443 |
| 2.6.35 | 9,801 |
| 2.6.36 | ~9,400 |
The work which pushed up the changeset numbers in previous development
cycles (shoveling out-of-tree code into the staging directory being at the
top of the list) continues to wind down, as does work in other areas (like
new filesystems). As a result, the kernel is going through a period of
relatively low flux - but only relative to the last couple of years - and
stabilization. That said, it's worth noting that, unless something
unexpected happens, the 2.6.36 development cycle will be one of the
shortest in recent memory; as a result, the number of changesets merged
per day is the highest since 2.6.30.
Perhaps more interesting is this set of numbers: in 2.6.36, the development
community added 604,000 lines of code and deleted 651,000 - for a total
loss of almost 47,000 lines of code. This is the first time since the
beginning of the git era that the size of the kernel source has gone down.
Given that, perhaps it is appropriate to start with a look at who has been
so busily removing code from the kernel:
Most lines removed - 2.6.36:
| Developer | Lines removed | Share of total |
| Sam Ravnborg | 205813 | 31.6% |
| Benjamin Herrenschmidt | 133666 | 20.5% |
| Amerigo Wang | 19145 | 2.9% |
| Tony Luck | 8418 | 1.3% |
| Greg Kroah-Hartman | 7094 | 1.1% |
| Kiran Divekar | 4487 | 0.7% |
| Palash Bandyopadhyay | 4457 | 0.7% |
| Vincent Sanders | 3467 | 0.5% |
| Dave Jones | 2600 | 0.4% |
| Christoph Hellwig | 2163 | 0.3% |
Sam Ravnborg and Ben Herrenschmidt both got to the top of the list through
the removal of a bunch of defconfig files, part of a general cleanup
inspired by some grumpiness from
Linus back in June; Sam also finished up some SPARC unification work.
Amerigo Wang removed a number of old and unused drivers. Between the three
of them, they got rid of almost 360,000 lines of code - a laudable bit of
work.
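The figures in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above (the rounded totals and the per-developer counts from the table); the variable names are arbitrary.

```python
# Figures quoted in the article: rounded totals, exact per-developer counts.
LINES_ADDED = 604_000
LINES_REMOVED = 651_000
REMOVED_BY = {
    "Sam Ravnborg": 205813,
    "Benjamin Herrenschmidt": 133666,
    "Amerigo Wang": 19145,
}

# Net growth of the source tree (negative means the kernel shrank).
net_change = LINES_ADDED - LINES_REMOVED

# Lines removed by the top three developers together.
top_three = sum(REMOVED_BY.values())

# Each developer's share of all removed lines, as a percentage.
shares = {
    name: round(100 * count / LINES_REMOVED, 1)
    for name, count in REMOVED_BY.items()
}

print(net_change)
print(top_three)
print(shares)
```

The shares come out at 31.6%, 20.5%, and 2.9%, matching the table, and the top three together account for 358,624 lines. The rounded totals give a net change of exactly -47,000; the "almost 47,000" in the text presumably reflects the unrounded counts.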
Looking at code changes in general for the 2.6.36 development cycle yields
this picture:
Most active 2.6.36 developers, by changesets:
| Developer | Changesets | Share |
| Vasiliy Kulikov | 160 | 1.7% |
| Eric Paris | 124 | 1.3% |
| Dan Carpenter | 122 | 1.3% |
| Chris Wilson | 117 | 1.3% |
| Eric Dumazet | 108 | 1.2% |
| Uwe Kleine-König | 103 | 1.1% |
| Axel Lin | 98 | 1.0% |
| Johannes Berg | 97 | 1.0% |
| Al Viro | 96 | 1.0% |
| Julia Lawall | 89 | 1.0% |
| Tejun Heo | 88 | 0.9% |
| Joe Perches | 83 | 0.9% |
| Christoph Hellwig | 73 | 0.8% |
| Alex Deucher | 71 | 0.8% |
| Ben Skeggs | 69 | 0.7% |
| John W. Linville | 68 | 0.7% |
| Stefan Richter | 64 | 0.7% |
| Stephen M. Cameron | 62 | 0.7% |
| Felix Fietkau | 60 | 0.6% |
| Randy Dunlap | 59 | 0.6% |
Most active 2.6.36 developers, by changed lines:
| Developer | Lines changed | Share |
| Sam Ravnborg | 208270 | 19.4% |
| Benjamin Herrenschmidt | 134811 | 12.5% |
| Chris Metcalf | 53204 | 4.9% |
| Omar Ramirez Luna | 51087 | 4.8% |
| Amerigo Wang | 19191 | 1.8% |
| Jarod Wilson | 16020 | 1.5% |
| Felix Fietkau | 11898 | 1.1% |
| Alan Olsen | 11650 | 1.1% |
| Mike Thomas | 11087 | 1.0% |
| Lars-Peter Clausen | 10795 | 1.0% |
| Tony Luck | 9351 | 0.9% |
| Tetsuo Handa | 7955 | 0.7% |
| Casey Leedom | 7888 | 0.7% |
| John Johansen | 7591 | 0.7% |
| Greg Kroah-Hartman | 7195 | 0.7% |
| Charles Clément | 6864 | 0.6% |
| Dmitry Kravkov | 6754 | 0.6% |
| Kiran Divekar | 6753 | 0.6% |
| Ben Collins | 6540 | 0.6% |
| Christoph Hellwig | 6045 | 0.6% |
On the changesets side, Vasiliy Kulikov leads with a long list of mostly
small fixes, mostly in device driver code. The bulk of Eric Paris's work
is the addition of the fanotify subsystem - work which, as of this writing,
will not be enabled for the 2.6.36 release due to user-space ABI
concerns. Dan Carpenter is another master of small fixes, usually for
problems identified by static analysis tools. Chris Wilson had a large
number of changes to the Intel i915 driver - and seemingly an even larger
number fixing the resulting problems. Eric Dumazet's changes were a large
number of fixes and improvements to the networking subsystem.
Three of the top five in the "lines changed" column have already been
mentioned above. The other two are Chris Metcalf, who added the new "Tile"
architecture, and Omar Ramirez Luna, who added the TI dspbridge driver to
the staging tree.
Only one top-five developer (Dan Carpenter) was also in the top five for
2.6.35; there are a lot of new faces on the list this time around.
There were 184 employers (that we could identify) who contributed code to
the 2.6.36 kernel. The top corporate supporters were:
Most active 2.6.36 employers, by changesets:
| Employer | Changesets | Share |
| (None) | 1524 | 16.3% |
| Red Hat | 1129 | 12.1% |
| (Unknown) | 865 | 9.2% |
| Intel | 691 | 7.4% |
| Novell | 404 | 4.3% |
| IBM | 374 | 4.0% |
| Nokia | 212 | 2.3% |
| Texas Instruments | 189 | 2.0% |
| (Academia) | 178 | 1.9% |
| Samsung | 178 | 1.9% |
| Fujitsu | 160 | 1.7% |
| NTT | 151 | 1.6% |
| Pengutronix | 145 | 1.6% |
| AMD | 131 | 1.4% |
| Google | 125 | 1.3% |
| (Consultant) | 109 | 1.2% |
| Societe Francaise de Radiotelephone | 108 | 1.2% |
| QLogic | 107 | 1.1% |
| Atheros Communications | 99 | 1.1% |
| MiTAC | 98 | 1.0% |
Most active 2.6.36 employers, by lines changed:
| Employer | Lines changed | Share |
| (None) | 299115 | 27.8% |
| IBM | 151386 | 14.1% |
| Red Hat | 76455 | 7.1% |
| (Unknown) | 71662 | 6.7% |
| Tilera | 64825 | 6.0% |
| Texas Instruments | 63521 | 5.9% |
| Intel | 55167 | 5.1% |
| Novell | 25798 | 2.4% |
| Samsung | 14619 | 1.4% |
| NTT | 12187 | 1.1% |
| Marvell | 10769 | 1.0% |
| Chelsio | 10560 | 1.0% |
| (Academia) | 10345 | 1.0% |
| QLogic | 9873 | 0.9% |
| Google | 9503 | 0.9% |
| Broadcom | 8391 | 0.8% |
| ST Ericsson | 8390 | 0.8% |
| Canonical | 8354 | 0.8% |
| Nokia | 8060 | 0.7% |
| Atheros Communications | 7762 | 0.7% |
For the most part, this list looks the way it has for most development
cycles, but there are a couple of new names here. One is Tilera, the
company behind the Tile architecture, which got its support merged for
2.6.36. The other name appearing here for the first time is Canonical,
which got the AppArmor security module code merged at last. Meanwhile, one
should not forget the other 164 companies which do not appear on the above
list; the commercial ecosystem around the Linux kernel remains strong and
diverse.
Finally, your editor decided to rerun an old experiment to look at the
longevity of code in the kernel. Every line in the kernel source was
mapped back to the kernel release where it was last changed, then the
totals for each release were plotted. The resulting picture looks like
this:
At 1.6% of the total, 2.6.36 represents a relatively small piece of the
total code base - the smallest for a long time. Almost 29% of the kernel
code still dates back to the beginning of the git era, down from 31% last
February. While much of our kernel code is quite new - 31% of the code
comes from 2.6.30 or newer - much of it has also hung around for a long
time.
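The article does not say what tooling was used for this experiment. One way to approximate it, sketched below under that caveat, is to run `git blame --line-porcelain` on each file, pick out the commit that last touched every line, and tally lines per release. The mapping from a commit to the release that first contained it is left to the caller as `release_of`, a hypothetical callback (something like `git describe --contains` could supply it).

```python
import re
from collections import Counter

# A per-line header in `git blame --line-porcelain` output looks like:
#   "<40-hex commit id> <orig line> <final line> [<group size>]"
# Source text lines start with a TAB, so they never match this pattern.
HEADER = re.compile(r"^([0-9a-f]{40}) \d+ \d+")


def commits_per_line(porcelain: str) -> list:
    """Return the commit id that last touched each source line, in order."""
    matches = (HEADER.match(line) for line in porcelain.split("\n"))
    return [m.group(1) for m in matches if m]


def tally_by_release(porcelain: str, release_of) -> Counter:
    """Count lines per release; `release_of` maps commit id -> release."""
    return Counter(release_of(c) for c in commits_per_line(porcelain))


def percentages(tally: Counter) -> dict:
    """Convert per-release line counts into percentages of the total."""
    total = sum(tally.values())
    return {rel: round(100 * n / total, 1) for rel, n in tally.items()}
```

Summing the per-file tallies over the whole tree and passing the result to `percentages()` would yield figures of the same kind as those quoted above, though not necessarily the same numbers, since details such as how whitespace-only changes and moved code are attributed depend on the blame options used.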
All told, 2.6.36 was a relaxed development cycle with relatively few big
new features and a fair amount of cleanup. That is certainly part of how
it was able to be stabilized in a shorter-than-usual period, and with fewer
than the usual number of regressions (56 reported as of October 10,
as opposed to 100 for 2.6.35-rc6). Whether 2.6.36 represents a new norm
for a slightly slower kernel development process remains to be seen. As of
this writing, the linux-next tree contains 5850 changesets, most of which
are presumably intended for 2.6.37. Quite a few changes typically do not
appear in linux-next prior to the opening of the merge window, so we should
see more changes than that merged for 2.6.37. Still, current linux-next
does not look like a huge wave of pent-up changes waiting to fly into the
mainline; 2.6.37 may or may not exceed 2.6.36 in the number of changes, but
it does not look like it will be breaking any records.
Page editor: Jonathan Corbet