By Jake Edge
April 13, 2011
The recently held Linux
Foundation Collaboration Summit (LFCS) had its traditional kernel panel
on April 6 at which
Andrew Morton, Arnd Bergmann, James Bottomley, and Thomas
Gleixner sat down to discuss the kernel with moderator Jonathan Corbet.
Several topics were covered, but the current struggles in the ARM community were clearly at
the forefront of the minds of participants and audience members alike.
Each of the kernel hackers introduced themselves, some with tongue planted
firmly in cheek, such as Bottomley with a declaration that he was on the
panel "to meet famous kernel developers", and Morton, who said
he spent most of his time trying to figure out what the other kernel
hackers are doing to the memory management subsystem. Bergmann was a bit
modest about his contributions, so Gleixner pointed out that Bergmann had
done the last chunk of work required to remove the big kernel lock, which
was greeted with a big round of applause. For his part, Gleixner was a bit
surprised to find out that he manages bug reports for NANA flash (based
on a typo on the giant slides on either side of the stage), but noted that he
specialized in
"impossible tasks" like getting the realtime preemption
patches into the mainline piecewise.
There is a "high-level architectural issue" that Corbet wanted
the panel to tackle first, and that was the current problems in the ARM
world. It is "one of our more important architectures", he
said, without which we wouldn't have all these different Android phones to
play with. So, it is "discouraging to see that there is a
mess" in the ARM kernel community right now. What's the situation,
he asked, and how can we improve things?
For a long time, the problem in the ARM community was convincing
system-on-chip (SoC) and
board vendors to get their code upstream, Bergmann said, but now there is a
new problem in that they all "have their own subtrees that don't work
very well together". Each of those trees is going its own way,
which means that core and driver code gets copied "five times or twenty
times" into different SoC trees.
Corbet asked how the kernel community can do better with respect to
ARM. Gleixner noted that ARM maintainer Russell King tries to push back on
bad code coming in, "but he simply doesn't scale". There are
70 different sub-architectures and 500 different SoCs in the ARM tree, he
said. In addition, "people have been pushing sub-arch trees directly
to Linus", Bergmann said, so King does not have any control over
those. It is a consequence of the "tension between cleanliness and
time-to-market", Bottomley said.
Gleixner thinks that the larger kernel community should be providing the
ARM vendors with "proper abstractions" and that because of a
lack of a big picture view, those vendors cannot be expected to come up
with those themselves. By and large the ARM vendor community has a
different mindset that comes from other operating systems where changes to
the core code were impossible, so 500-line workarounds in drivers were the
norm. Bergmann suggested that the vendors get the code reviewed and
upstream before shipping products with that code. Morton said that
as the "price of admission" vendors need to be asked to
maintain various pieces horizontally across the ARM trees. Actually
motivating them to do that is difficult, he said.
From the audience, Wolfram Sang asked whether more code review for the ARM
patches would help. All agreed that more code review is good, but
Bottomley expressed some reservations because there are generally only a
few reviewers that a subsystem maintainer can trust to spot important
issues, so all code review is not created equal. Morton suggested a
"review economy" where one patch submitter needs to review the
code of another and vice versa. That would allow developers to justify the
time spent reviewing code to their managers. But, Bottomley said,
"collaborating with competitors" is a hard concept for
organizations that are new to open source development.
If a driver looks like one that is already in the tree, it should not be
merged, and instead someone needs to
get the developers to work with the existing driver, Bergmann said. There
is a lot of reuse of IP blocks in SoCs, but the developers aren't aware of
it because different teams are working on the different SoCs, Gleixner said.
The kernel
community needs people that can figure that out, he said. Bottomley
observed that "the first question should be: did anyone do it before and
can I 'steal' it?".
In response to an audience question about the panel's thoughts on Linaro,
Bergmann, who works with Linaro, said "I think it's great" with
a smile. He went on to say that Linaro is doing work that is closely related
to the ARM problems that had been discussed. Getting different SoC vendors
to work together is a big part of what Linaro is doing, he said, and
"everyone suffers" if that collaboration doesn't happen.
"ARM is one of the places where it [collaboration] is needed
most", he said.
Control groups
The discussion soon shifted to control groups, with Corbet noting that they
are becoming more pervasive in the kernel, but that lots of kernel hackers
hate them. It will soon be difficult to get a distribution to boot and run
without control groups, he said, and wondered if adding them to the kernel
was the right move: "did we make a
mistake?" Gleixner said that there is nothing wrong with control
groups conceptually, "just that the code is a
horror". Bottomley lamented the code that is getting "grafted
onto the side of control groups" as each resource in the kernel that is
getting controlled requires reaching into multiple subsystems in rather
intrusive ways.
As with "everything that sucks" in the kernel, control groups
needs to be
cleaned up by someone who looks at it from a global perspective; that
person will have to
"reimplement it and radically modify it", Gleixner said. That
is difficult to do because it is both a technical and a political problem,
Bottomley said. The technical part is to get the interaction right,
while the political part is that it is difficult to make changes across
subsystem boundaries in the kernel.
But Morton said that he hadn't seen much in the way of specific complaints
about control groups cross his desk. Conceptually, it extends what an
operating system should do in terms of limiting resources. "If it's
messy, it's because of how it was developed" on top of a production
kernel that gets updated every three months. Bottomley said that the
problem with doing cross-subsystem work is often just a matter of
communication, but it also requires someone to take ownership and talk to
all of the affected subsystems rather than just picking the "weakest
subsystem" and getting changes in through there.
Corbet wondered if the independence of subsystems in the kernel, something
that was very helpful in allowing its development to scale, was changing.
The panel seemed to think there wasn't much of an issue there, that while
control groups crossed a lot of boundaries, naming five things like that in
the kernel would be hard to do, as Bottomley pointed out.
Twenty years ahead
With the 20-year anniversary of Linux being celebrated this year, Jon
Masters asked from the audience what things would be like 20 years from
now. Bottomley promptly replied that four-fifths of the panel would be
retired, but Gleixner expected that the 2038 bug would have brought them
all back out of retirement. Morton said that unless some kind of quantum
computer came along to make Linux obsolete, it would still be there in 20 years.
He also expected that the first thing to be done with any new quantum
computer would be to add an x86 emulation layer.
When Corbet posited that perhaps the realtime preempt code would be merged
by then, Gleixner made one of his strongest predictions yet for merging that
code: "I am planning to be done with it before I retire".
More seriously, he said that it is on a good track, he has talked to the
relevant subsystem maintainers, and is optimistic about getting it all
merged—eventually.
In 20 years, the kernel will still be supporting the existing user-space
interfaces, Corbet said. He quoted Morton from a recent kernel mailing list post: "Our hammer is kernel patches and all problems
look like nails", and wondered whether there was a problem with how
the kernel hackers developed user-space interfaces. Morton noted that the
quote was about doing more pretty printing inside the kernel, which he is
generally opposed to. It has been done in the past because it was
difficult for the kernel hackers to ship user-space code, so that it would
stay in sync with kernel changes. But perf has demonstrated that the
kernel can ship user-space code, which could be a way forward.
Gleixner noted that there was quite a bit of resistance to shipping perf,
but that it worked out pretty well as a way to "keep the strict
connection between the kernel and user space". Perf is meant to be
a simple tool to allow users to try out perf events gathering, he said, and
people are building more full-blown tools on top of perf. Having
tools shipped with the kernel allows more freedom to experiment with the
ABI, Bottomley said. Morton said that there needs to be a middle ground,
noting that Google had a patch that exported a procfs file that
contained a shell script inside.
Ingo Molnar recently pointed out that FreeBSD is getting Linux-like quality
with a much smaller development community and suggested that it was because
the user space and kernel are developed together. Corbet asked whether
Linux was holding itself back by not taking that route. Bottomley thought
that Molnar was "both right and wrong", and that FreeBSD has
an entire distribution in its kernel tree. "I hope Linux never gets
to that", he said.
From perf to control groups, FreeBSD to ARM, as usual, the panel ranged
over a number
of topics in the hour allotted. The format and participants vary from year
to year, but it is always interesting to hear what kernel developers are
thinking about issues that Linux is facing.