
LCE2011: Kernel developer panel

By Jake Edge
October 26, 2011

In what is becoming a tradition at Linux Foundation conferences, a handful of kernel developers participated in an hour-long discussion of various topics at the first LinuxCon Europe (LCE), which is being held in Prague on October 26-28. The format typically doesn't change, with four kernel hackers and a moderator, but the participants do regularly change. The LCE edition featured Linus Torvalds, Paul McKenney, Alan Cox, and Thomas Gleixner, with Lennart Poettering as the moderator. A number of timely topics were discussed (security, ARM, control groups), as well as a few timeless ones (user-space interfaces, aging kernel hackers, future challenges).

[Kernel panel]

While Torvalds really doesn't need to introduce himself, he did so by saying that he writes very little code anymore and just acts as a central point to gather up the work of the other kernel hackers. He is "sadly, mainly just a manager these days", he said. McKenney maintains the read-copy-update (RCU) code in the kernel, Cox has done many things over the years but mostly works on System-on-Chip (SoC) support for Intel these days, while Gleixner works on "parts of the kernel nobody wants to touch" and maintains the realtime patch set.

User-space interfaces

Poettering said that he wanted to start things off with something a bit controversial, so he asked about user-space compatibility. Clearly trying to evoke a response, he asked if it was hypocritical that certain user-space interfaces have been broken in the past without being reverted. One of his examples was the recent version number change (from 2.6.x to 3.y), which broke a number of user-space tools yet wasn't reverted.

The version number change is a perfect example, Torvalds said. Because it broke some (badly written) user-space programs, code was added to the kernel to optionally report the version as 2.6.x, rather than 3.y (where x = 40 + y). It is "stupid of us to add extra code to lie about our version", but it is important to do so because breaking the user experience is the biggest mistake that a program can make, he said.
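The mapping itself is simple enough to sketch. What follows is purely an illustration of the arithmetic, not the kernel's actual implementation (the real compatibility behavior is selected per process, via the UNAME26 personality):

    /*
     * Illustration only: report a 3.y kernel version as 2.6.x,
     * with x = 40 + y, so that 3.1 shows up as 2.6.41 to old,
     * badly written version parsers.
     */
    #include <stdio.h>

    static void print_compat_version(int major, int minor)
    {
        if (major == 3)
            printf("2.6.%d\n", 40 + minor);   /* lie to 2.6-era tools */
        else
            printf("%d.%d\n", major, minor);
    }

    int main(void)
    {
        print_compat_version(3, 1);   /* prints 2.6.41 */
        return 0;
    }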

From the very beginning, Torvalds's mantra has been that "breaking the user experience is a big no-no". The most important aspect of a program is how useful it is to users and "no project is more important than the user". Sometimes user space does get broken by mistake because someone writes code they think is making an improvement but breaks something. It is unavoidable and he feels bad about it when it happens.

Not only does Torvalds feel bad, but he makes sure that the other kernel developers feel bad about it as well, Gleixner said. Torvalds readily agreed that he is quite vocal about telling other kernel developers that they messed up when there is breakage of user space. "There is a saying that 'on the internet, no one can hear you being subtle'", so he doesn't even try, he said. Cox noted that the current unhappiness with GNOME 3 is a demonstration of why you don't suddenly change how things work for users.

There is pressure from hardware and software makers to constantly upgrade, but Cox said that if you ask users, they just want what worked before to continue working. Torvalds noted that he used to have some old a.out binaries that he would periodically test on new kernels to ensure they worked. One of them was an old shell that used deprecated system calls, but he wanted to ensure that even though he had moved on to newer shells, the old stuff would still work. It's fine to add new things, he said, as long as you "make sure that the old ways of working still work".

The sysfs interface has been changed many times in ways that could have broken user-space programs, according to Poettering, and he wondered why that was allowed to happen. Cox said that it comes down to a question of whether users care. If no one reports the problem, then no one really sees it as one.

McKenney noted that his approach is to "work on things so far from user space that I can't even see it", but that he was concerned about adding tracepoints to the RCU code. Maintaining the exact same tracepoints could make it difficult to change RCU down the road.

An audience member asked if it was worth it to continue maintaining backward compatibility "forever", and wondered if it doesn't lead to more kernel complexity. Torvalds was unconcerned about that and said that the complexity problems in the kernel rarely arise because of backward compatibility. Open source means that the code can be fixed, he said. Cox echoed that, saying that if the kernel developers give user space better interfaces, the programs will eventually switch, and any compatibility code can be deprecated and eventually removed.

Aging hackers

After noting that many of the Kernel Summit (KS) participants were in bed by 9 or 10, Poettering asked whether the kernel community is getting too old or, as he put it, whether it has become an old man's club. Cox said that he saw a lot of fresh blood in the community, more than enough to sustain it. Torvalds noted, though, that the average age at the summit has risen by one year every year. It's not quite that bad, he said, but he also believes that the KS attendees are not an accurate reflection of the community as a whole. It tends to be maintainers that attend the summit, while many of the younger developers have not yet become maintainers.

Part of the problem is one of perception, according to Torvalds. The Linux kernel crowd used to be notably young, because the older people ignored what those crazy Linux folks were doing, he said. The kernel hackers had a reputation of being ridiculously young, but many of those same people are still around, and are just older now. Cox noted that the kernel is now a stable project and that it may be that some younger folks are gravitating to other projects that are more exciting. Those projects will eventually suffer the same fate, he said.

ARM

In response to an audience question, Torvalds wanted to clear something up: "everyone seems to think I hate ARM", but he doesn't, he said. He likes the architecture and instruction set of the ARM processor, but he doesn't like to see the fragmentation that happens because there is no standard platform for ARM like there is for x86. There are no standard interrupt controllers or timers, which is a mistake, he said. Because of the lack of a standard platform, it takes ten times the code to support ARM vs. x86.

ARM is clearly the most important architecture other than x86, he said, and some would argue that the order should be reversed. The good news is that ARM Linux is getting better, and the ARM community seems to be making progress, so he is much happier with ARM today than he was six months ago. It's not perfect, and he would like to see more standardization, but things are much better. Torvalds said that he doesn't necessarily think that the PC platform is wonderful, but "supporting only a few ways to handle timers rather than hundreds is wonderful".

Security

Poettering asked if the panel would feel comfortable putting their private key on a public Linux machine that had other users. In general, the clear consensus was that it would be "stupid" (in Torvalds's words) to put private keys on a system like that. What you want, Torvalds said, is layers of security, and the kernel does a reasonable job there. "But there will always be problems", he said, so it is prudent to have multiple layers and realize that some of the layers will fail. He noted that there are three firewalls between his development machine and the internet, none of which accept incoming ssh.

Cox agreed, saying that in security, depth is important. You might have really good locks on the doors of your house, but you still keep your money in the bank, he said. His primary worry is other local users on the system, rather than remote attackers.

Challenges

Torvalds and Cox disagreed a bit about the biggest challenge facing the kernel in the coming years; Torvalds worries about complexity, while Cox is concerned about code quality. The kernel has always been complicated, Torvalds said, but it is not getting less complicated over time. There is a balancing act between the right level of complexity to do the job right and over-complicating things.

Cox believes that the complexity problem is self-correcting for the most part. Once a subsystem becomes too complicated, it eventually gets replaced with something simpler. But code quality is a problem our industry has faced for sixty years. It is not just an open source problem; the whole industry struggles with it. When better solutions are found, they may well require language changes, which would have an enormous impact on Linux.

Control groups

Poettering next asked about control groups: are they irretrievably broken or can they be fixed? Cox thought that the concept behind cgroups was good, but that some of the controller implementations are terrible. The question in his mind is whether the controllers can be fixed while still managing the resources they are meant to manage. To Torvalds, controllers are a necessary evil because they provide a needed way to manage resources like memory, CPU, and networking across processes. But it is a "fundamentally hard problem to dole out resources when everyone wants everything", he said.

Cox agreed, noting that managing resources at a large scale will introduce inefficiencies, and that's the overhead that some are complaining about. Those who don't use the controllers hate the overhead, but those who need them don't care about the impact, Torvalds said. Gleixner agreed that some kind of resource management is needed.

According to Torvalds, part of the problem is that the more heavily used a feature is, the more its impact is noticed. At one point, people said that caches were bad because they impose some costs, but we got past that, he said. Cox pointed to SMP as another area where the needed changes added costs that some were not interested in paying. Torvalds noted that when Cox started working on SMP for Linux, he thought it was an interesting project but had no personal interest, as he couldn't even buy the machines that Cox was using to develop the feature.
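For those who have never poked at a controller directly, the sketch below shows roughly what using one looked like at the time. It is a hedged example rather than anyone's production code: it assumes the cgroup (v1) memory controller is mounted at /sys/fs/cgroup/memory, that the program runs with sufficient privileges, and the "demo" group name is invented for the occasion:

    /*
     * Sketch: create a memory cgroup, cap it at 64MB, and move the
     * current process into it, using the circa-2011 v1 interface.
     */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    static int write_value(const char *path, long value)
    {
        FILE *f = fopen(path, "w");

        if (!f)
            return -1;
        fprintf(f, "%ld", value);
        return fclose(f);
    }

    int main(void)
    {
        /* create the group; this fails harmlessly if it already exists */
        mkdir("/sys/fs/cgroup/memory/demo", 0755);

        /* cap the group's memory usage at 64MB */
        if (write_value("/sys/fs/cgroup/memory/demo/memory.limit_in_bytes",
                        64L * 1024 * 1024))
            return 1;

        /* move the current process into the group */
        if (write_value("/sys/fs/cgroup/memory/demo/tasks", (long)getpid()))
            return 1;

        return 0;
    }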

Conclusion

It is always nice to get a glimpse into the thinking of Torvalds and some of his lieutenants, and these kernel developer panels give people just that opportunity. It is one of the few forums where folks outside of the kernel community can interact with Torvalds and the others in a more or less face-to-face way.

[ I would like to thank the Linux Foundation for supporting my travel to Prague. ]



LCE2011: Kernel developer panel

Posted Oct 27, 2011 9:25 UTC (Thu) by dgm (subscriber, #49227) [Link]

"The kernel has always been complicated, Torvalds said, but it is not getting less complicated over time. There is a balancing act between the right level of complexity to do the job right and over-complicating things.

Cox believes that the complexity problem is self-correcting for the most part. Once a subsystem becomes too complicated, it eventually gets replaced with something simpler. But code quality is a problem our industry has faced for sixty years."

I was going to write about how both problems are similar, but I just realized that simplicity is just a subset of the fuzzy group of properties we call "quality". It also has the nice property of being well defined (mathematically) and directly measurable. Quality, on the other hand, is a broad and probably subjective concept. The trouble with complexity is that we actually need some of it if we want to do anything non-trivial, the trick being finding just how much is enough.

Alan's comment on complexity easily applies to quality in general. Any part of the code that is bad enough will eventually be rewritten.

LCE2011: Kernel developer panel

Posted Oct 27, 2011 19:05 UTC (Thu) by iabervon (subscriber, #722) [Link]

I think there are some aspects of quality which are not covered by simplicity. For example, how large a typo do you have to make to change correct code to incorrect code, and how hard is it to notice that typo? (Is your style for an assignment and an equality test in a conditional very close?) How hard is it to prepare a patch which does not affect any lines that it doesn't need to affect? (Do you have to add a comma to the line containing a list element you are not changing? Do you have to reproduce hidden whitespace in lines you changed during development but put back?) Do your function names suggest the right effects and not the wrong ones? Are your function behaviors systematic? How closely are different stylistic conventions mixed?

I think a lot of the quality question comes down to how predictable the code is from its purpose and the rest of the code in the system. Complexity of the design limits this, and you can think of it as less entropy and therefore less complexity, but the term "complexity" doesn't usually make people wonder how many bits of entropy there are in the whitespace in the codebase.
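As a concrete illustration of the first example above, here is a hypothetical C snippet in which a single missing character silently changes the meaning (gcc only complains if warnings such as -Wparentheses are enabled):

    #include <stdio.h>

    int main(void)
    {
        int x = 5;

        if (x = 0)        /* typo: assigns 0 to x, so the branch is never taken */
            printf("never printed\n");

        if (x == 0)       /* the intended test; true, since x is now 0 */
            printf("x is zero\n");

        return 0;
    }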

Three firewalls

Posted Oct 27, 2011 10:21 UTC (Thu) by man_ls (guest, #15091) [Link]

So, Linus is not even behind seven proxies? What poor security!

Memes aside, I have to ask professionally: why are three firewalls better than one? Or, should I worry that I only have one firewall?

Three firewalls

Posted Oct 27, 2011 12:23 UTC (Thu) by erwbgy (subscriber, #4104) [Link]

Where I work there is a policy that the external Internet-facing firewalls be from a different vendor than the internal firewalls. That way if an exploit is found in the external firewall then hopefully the same exploit can't be used on the internal ones.

It is hard to tell whether this is actually worthwhile or an unnecessary expense.

Three firewalls

Posted Oct 27, 2011 12:59 UTC (Thu) by Stephen_Beynon (✭ supporter ✭, #4090) [Link]

I don't know about the setup Linus uses, but I have multiple firewalls protecting different classes of device.

I have a firewall in my adsl gateway protecting my "insecure" network. The insecure network has wifi/games consoles/set top box network/guest access.

I have a firewall between this insecure network and a wired only network with the machines I care about.

Most of my machines have a software firewall as standard, making for a third level of firewall.

Three firewalls

Posted Oct 28, 2011 10:50 UTC (Fri) by josh (subscriber, #17465) [Link]

"wired only network with the machines I care about" doesn't work so well when laptops constitute more than half the machines you care about. :)

Three firewalls

Posted Oct 28, 2011 16:04 UTC (Fri) by jmalcolm (guest, #8876) [Link]

Well, he did say that the WIFI stuff was all on the outer network. "Wired" machines can be reached without trouble once you have breached the network, as normal networking is not encrypted or secured. So, you need to protect the network (and the hosts) with things like firewalls.

You cannot put a firewall around wireless which is why wireless networking requires encryption and authentication. It is also why you do not let your wireless network inside the firewall of wired machines "you care about".

Three firewalls

Posted Oct 31, 2011 7:48 UTC (Mon) by ekj (subscriber, #1524) [Link]

You can have an encrypted wireless network, and tunnel all your traffic to/from laptops you care about through a VPN to the more secure cabled internal network.

Yeah, it gets complicated.

Three firewalls

Posted Oct 28, 2011 15:53 UTC (Fri) by jmalcolm (guest, #8876) [Link]

Three firewalls is a pretty common configuration really.

You use two firewalls to create a DMZ, which is of course a pretty typical setup.

http://en.wikipedia.org/wiki/DMZ_%28computing%29

Add a personal firewall on your own machine (again, a standard security recommendation) and presto--you have three firewalls.
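Sketched out (a rough illustration of the layering, not a recommendation for any particular products), it looks something like:

    Internet -- [outer firewall] -- DMZ -- [inner firewall] -- LAN -- [personal firewall] -- your machine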

LCE2011: Kernel developer panel - backward compatibility vs complexity

Posted Oct 27, 2011 23:15 UTC (Thu) by giraffedata (subscriber, #1954) [Link]

An audience member asked if it was worth it to continue maintaining backward compatibility "forever", and wondered if it doesn't lead to more kernel complexity. Torvalds was unconcerned about that ... Cox echoed that, saying that if the kernel developers give user space better interfaces, the programs will eventually switch, and any compatibility code can be deprecated and eventually removed.

How is Cox's statement an echo? Linus said we should be backward compatible forever -- the added complexity is not a problem; Cox gives a recipe for avoiding backward compatibility, apparently to reduce complexity.
