LFCS 2012: The kernel panel

By Jake Edge
April 11, 2012

On the first day of this year's Linux Foundation Collaboration Summit, several kernel developers sat down with moderator Greg Kroah-Hartman for another edition of the kernel panel. The developers covered a wide range of kernel subsystems, from graphics and memory management, to storage and networking. As is usual, a lively discussion ensued, covering a number of topical and longtime kernel concerns.

Audience questions for the panel are eagerly sought, Kroah-Hartman said, noting that a similar panel at LinuxCon Japan had turned into an entertaining argument between the kernel hackers onstage and those in the audience. He then had the panel introduce themselves.

Mel Gorman said that he works for SUSE Labs on memory management along with fixing bugs in various SLES and openSUSE kernels. John Linville of Red Hat is the maintainer for the wireless LAN subsystem, which is, he said, not about writing "cool code" unfortunately, but is more of an administrative role shepherding others' patches and features. James Bottomley is the CTO for server virtualization at Parallels as well as maintaining the SCSI subsystem for the kernel. In addition, he "mucks about at the edges" trying to make kernel development better, which is often as much a social problem as anything else, he said. Keith Packard works for the Intel Open Source Technology Center on graphics and window systems, as well as doing kernel DRM (direct rendering manager) work.

Summit report

Kroah-Hartman then queried Bottomley and Gorman about what went on at the recently completed Linux Storage, Filesystem, and Memory Management Summit (LWN coverage: day one and day two). Bottomley rattled off a few different topics that came up in the storage and filesystems tracks including new "weird and wild" SCSI commands that are coming down the pipe. He joked that it was necessary to keep Christoph Hellwig gagged while that talk was going on so that everyone could actually hear about the commands. The summit is becoming one of the more important kernel development meetings, he said, and it is one that, unlike some kernel summits, actually has arguments; it's "definitely not boring".

Gorman also mentioned a number of different topics that were discussed in the memory management track including the two NUMA migration schemes that are currently floating around (Peter Zijlstra's sched/numa and Andrea Arcangeli's AutoNUMA), as well as containers and control groups. He said that kernel hackers are now concerned about how quickly the containers and control groups code runs, rather than whether it will run, which was the concern in the past. The discussions were "quite civil" at the summit, which contrasts with how they sometimes go on the mailing list. The meetings were definitely a success, he said, as even if a decision went against a developer's idea or plan, they got a good idea of why the others objected to it.

Wireless and graphics

Things are clearly getting much better in the wireless area, Kroah-Hartman noted; it "used to be a nightmare", but, he asked Linville, is it a solved problem now? Wireless in Linux has matured quite a bit over the last few years, Linville said, and there are a number of companies that are now participants in developing free Linux drivers, including Broadcom and Qualcomm/Atheros. It really helps to have people available "who know how the hardware works", he said.

But, wireless technology continues to evolve with things like 802.11ac coming along (Linville called it "[802.]11n on steroids") that require support and drivers for Linux. Bottomley asked about 802.11n compliance, which Linville said is going well, though there are still "things to be ironed out". The code is in place and drivers are using it, but there is still some development to be done. All of that is helped by better support from the bigger players, but some of the second-tier wireless hardware providers are working on free kernel drivers as well.

Moving on to Packard and graphics, Kroah-Hartman asked about X and mobile phones. In the past, phones shipped using X, but that really is no longer the case, he said, and wondered why. Packard said that the last six years had been spent "radically restructuring" graphics on Linux. The idea was to have kernel drivers that could support more window systems, beyond just X, because X is "not what people want anymore", he said. Today, most are interested in compositing-based models.

Compositing is a totally different windowing system model, he said, which is simpler. After all the work that was done, the existing kernel DRM layer is capable of supporting all of the different options, including Wayland, Android, and X. Android usually uses different drivers, but the idea is similar. We are, Packard said, moving away from X as the fundamental graphics layer for Linux; instead, the Linux kernel now serves that purpose.

From the audience, Bdale Garbee asked Linville about the state of Broadcom drivers, noting that performance and stability of those drivers on laptops recently was "not so great". Linville said that he assumes it will get better, that the Broadcom drivers have not been in the kernel that long, and that those drivers getting exposure in distributions should help. That will lead to more bug reports, which will be beneficial. The Broadcom developers have been working well with Linville and are being diligent about looking at bug reports and fixing the problems reported there.

There is always going to be a certain amount of lag, Linville said, because some distributions are faster or slower about updating to the latest kernel. But it is the "same old story", he said: if you find bugs, report them, and respond to the questions that are asked.

That led to a discussion of the stable kernel tree, with Bottomley noting that some distributions are more attuned to stable than others. Kroah-Hartman said that he tries to do a stable release each week, but that it is rare for Broadcom bug fixes to be sent to the stable tree. Linville said that he should remind developers to CC stable on the bug fix patches, but that there is a somewhat tricky balance there, which requires judgment calls on which fixes are appropriate.

Gorman said that it is common in memory management development to get "slapped" if things aren't marked for stable that should be. But, he said, each subsystem deals with things differently. Bottomley said that nobody likes getting the email reminding them to send fixes to the stable maintainers but that it is important to get those patches into the stable trees.

Pet peeves

Next up was a question for all about their "pet peeves" in Linux kernel development. We often see the same problems over and over, Kroah-Hartman said; which ones are particularly irksome? Packard said that his biggest pet peeve was outside the graphics area. He uses Bluetooth a lot and is annoyed that every time an -rc1 kernel is released, Bluetooth breaks. It is good, in some ways, he guesses, because now he can debug and bisect Bluetooth problems in the kernel. But the basic problem seems to be that it is common for Bluetooth kernel development to break all of user space. Every time he has ever suggested that for graphics work, he got "flamed to a crisp". If your patch breaks the user-space interface, he said, please don't bother submitting the patch.

Bottomley is unhappy with changelogs that don't say why the change is being made. Changelogs often say what is being done, but they don't say what the user-visible effects are. Well-written changelogs should not describe the change itself, he said, because that's what the C code is for. "Almost all kernel developers can read C", he said, to a hearty laugh from the audience. When Linville suggested that he didn't really have a pet peeve, Bottomley immediately asked if he would be willing to swap subsystems with him.

On further reflection, Linville echoed some of Bottomley's complaints, noting that it is sometimes difficult to determine where a particular patch should be sent. Because the changelogs don't clearly indicate whether the patch is a fix or a new feature, he is left guessing whether it belongs in the next tree, or needs to be applied more urgently. It is particularly problematic during a merge window, he said, so the changelogs need to say where the patch is bound.

Since he doesn't "have to maintain anything", Gorman is in a different position. He joked that he gets to "rag on maintainers and make their life miserable". More seriously, he pointed to mistakes that are made again and again as a pet peeve of his. He mentioned writeback causing long delays when writing to USB sticks as an example. That has been fixed "at least eight times", he said, only to be broken again in the next release. We need to do a better job checking to see that those kinds of bugs stay fixed, he said.

More audience participation

A member of the audience asked about the future of proprietary loadable kernel modules, and Kroah-Hartman immediately said that he really didn't see a future for them. The kernel developers have provided ways to operate hardware from user space that can be used for proprietary drivers. As an example, he pointed to "laser welding robots" that are driven from user space with a 3D application that uses lots of floating point math.

If companies look at the business case, Bottomley said, it is rare that closed drivers make sense. If a company produces standard hardware that lots of people will require a driver for, there is no real business value in a closed driver. For more specialized devices, user-space drivers may make sense, he said.

Structured logging was the topic of another audience question. The idea has been around since at least 2004, the audience member said, and some solutions have started to appear. The problem is that users are now supporting larger numbers of systems and "cannot manage a datacenter by hand". Where is structured logging in the kernel headed?

Kroah-Hartman mentioned that a patch had just been merged that builds atop dev_printk() and brings some structure to logging. There have been recent proposals, including one at last year's kernel summit that got derailed by a "spat over UUIDs", Bottomley said. Packard said that there is a fear of top-down proposals for structured logging, especially if driver writers have to specify their messages ahead of time. Some proposals remind him of the VMS error message documentation that came in a large binder. What's needed is a way for driver writers and others to get the benefits of structured logging without all the problems, he said.
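One way to picture the bottom-up approach Packard is asking for is a record that stays a free-form message but can carry optional key/value fields, with no catalog of messages declared ahead of time. The sketch below is purely illustrative: the record layout, the field names, and both function names are assumptions made for this example, not the format used by the merged patch or by any of the proposals mentioned. It also assumes a single-line message.

```python
def format_record(message, **fields):
    """Render a free-form message followed by optional KEY=value lines."""
    lines = [message]
    for key in sorted(fields):
        # Field lines are indented so they cannot be mistaken for a message.
        lines.append(f" {key.upper()}={fields[key]}")
    return "\n".join(lines)


def parse_record(text):
    """Split a record back into its message and a dict of fields."""
    message, *field_lines = text.split("\n")
    fields = {}
    for line in field_lines:
        key, _, value = line.strip().partition("=")
        fields[key] = value
    return message, fields


# Hypothetical driver message with two machine-readable fields attached.
record = format_record("link is down", subsystem="net", device="eth0")
print(record)
print(parse_record(record))
```

A human reading the log still sees the plain message first, while a tool managing many systems can filter on the fields without parsing English text.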

SCSI and bufferbloat

The new SCSI commands that Bottomley mentioned at the start formed the basis of the next question. Kroah-Hartman noted that there are more high-speed storage devices these days that are avoiding SCSI because they can't get the I/O operations/second (IOPS) rates that they need, so, he asked, is SCSI dead for high speed? Bottomley said that it is really two different questions. There is a need for storage that acts like memory, and some of these efforts are intermediate steps that are being taken when "what we really need is more memory".

As far as whether SCSI will survive, Bottomley was willing to bet that it would be around for quite some time. There is a need for standards-based storage devices so that users can purchase storage at Fry's or other stores and know that they will work with their systems. Whether it will be SATA, SAS, SCSI, or something else is not clear, but he believes that SCSI will be in the mix.

Throughput used to be an important measure for storage, but that is moving to IOPS. SCSI used to sacrifice latency for throughput, but the reverse is happening because of the focus on IOPS. Linville spoke up at that point to note the parallels with bufferbloat in the networking world. Latency was de-emphasized in networking devices for throughput and then people started wondering why their interactivity had gone down.
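The two measures are related by simple arithmetic: throughput is IOPS multiplied by the size of each I/O, and the IOPS a device can sustain is bounded by its per-request latency. A minimal Python sketch makes the trade-off concrete; the device figures and function names are made up for illustration and are not numbers from the panel.

```python
def throughput_mb_per_s(iops, io_size_bytes):
    """Throughput implied by an IOPS rate at a fixed I/O size."""
    return iops * io_size_bytes / 1_000_000


def iops_from_latency(latency_s, queue_depth=1):
    """Upper bound on IOPS when each request takes latency_s to complete
    and queue_depth requests are in flight at once (Little's law)."""
    return queue_depth / latency_s


# Hypothetical workloads, chosen only to illustrate the arithmetic.
large_sequential = throughput_mb_per_s(iops=200, io_size_bytes=1_048_576)
small_random = throughput_mb_per_s(iops=20_000, io_size_bytes=4096)

print(f"200 IOPS at 1 MiB   : {large_sequential:.1f} MB/s")
print(f"20000 IOPS at 4 KiB : {small_random:.1f} MB/s")
print(f"5 ms latency, depth 1 : {iops_from_latency(0.005):.0f} IOPS")
```

In this example the large sequential workload moves more data per second with a hundred times fewer operations, which is why a design tuned for throughput can tolerate latency that would cripple an IOPS-oriented workload.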

That led to a question to Linville about bufferbloat and the patches that had recently gone into the kernel to try to address some of those problems. It is a "complicated topic", Linville said, and wireless is part of the problem, though it is seen on both wired and wireless networks. Some of the problems are inherent in wireless technologies because the available bandwidth changes over short periods of time, which can lead to high latency. There are also problems with wireless that aren't bufferbloat-related but sometimes look like bufferbloat.

Unfortunately, no "magic solution" has presented itself to fix the general bufferbloat problem. There needs to be an adaptive queue management algorithm, but none is known that solves the problem. Something that works well for wired networks is "random early discard" (RED), but that requires lots of tuning. A recent change to measure queues in bytes, rather than packets, helps, because queue length limits are set by the aggregate size of the packets being sent. But there are still questions of what the length of the byte queues should be, whether they should change, and, if so, how often. The problem is not specific to Linux, and there are some political issues surrounding it because not everyone believes it is a problem—or that it is their problem.
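A rough sketch shows why counting bytes rather than packets matters, and why RED needs tuning. The first function computes how long a full queue takes to drain onto a link; the second is a simplified form of the RED drop curve. The link speed, the limits, and the function names are assumptions chosen for illustration, and none of this is the kernel's implementation.

```python
def drain_time_ms(queued_bytes, link_bits_per_s):
    """Milliseconds needed to transmit a backlog of queued_bytes."""
    return queued_bytes * 8 / link_bits_per_s * 1000


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Simplified RED: no drops below min_th, certain drop at or above
    max_th, and a linear ramp up to max_p in between."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


LINK_BPS = 10_000_000   # hypothetical 10 Mbit/s link
PACKET_LIMIT = 1000     # hypothetical limit counted in packets
BYTE_LIMIT = 64_000     # hypothetical limit counted in bytes

for packet_size in (64, 1500):
    packet_limited = drain_time_ms(PACKET_LIMIT * packet_size, LINK_BPS)
    # A byte limit caps the backlog whatever the packet size is
    # (rounded down here to a whole number of packets).
    whole_packets = BYTE_LIMIT // packet_size
    byte_limited = drain_time_ms(whole_packets * packet_size, LINK_BPS)
    print(f"{packet_size:>4}-byte packets: "
          f"{packet_limited:7.1f} ms packet-limited, "
          f"{byte_limited:5.1f} ms byte-limited")

print(red_drop_probability(avg_queue=10, min_th=5, max_th=15, max_p=0.1))
```

With these made-up numbers, the packet-counted queue can hold anywhere from about 51 ms to 1.2 s of traffic depending on packet size, while the byte-counted queue stays near 50 ms in both cases. The RED function already takes three parameters, and a full implementation also needs an averaging weight to compute avg_queue, which hints at why it "requires lots of tuning".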

A grab bag of questions and answers

As the panel time slot wound down, there were a number of other audience questions and kernel hacker discussions. An audience question about participation by Linux developers in standards committees noted that it takes money and some amount of insanity to participate in those. Kroah-Hartman pointed out that the Linux Foundation has worked on standards participation in the past, while Bottomley asked why there was a perception that Linux developers and companies are not involved. He noted that in the storage area there is a "tireless Dane" (Martin Petersen) who works on standards. It isn't the money or doing what's needed to get on the committees that is the problem, Bottomley said, but instead it is finding the right people to do so. The T10, T13, and UEFI committees all have Linux representatives, he said. If there are standards committees where Linux is not represented, we want to know about that, Kroah-Hartman said.

Grant Likely asked about the progress of the Android patches into the mainline; when is that job "done"? Once Android is using mainline kernels was the answer Bottomley gave. Kroah-Hartman noted that the real problem is on the user-space side. Kernel hackers can't do anything about changing the Android user space, but companies like Linaro and Samsung are making some progress in doing so. The 3.3 kernel can boot an Android user space, but it will "eat your battery alive", he said. We are making progress, but it will require teamwork to get there.

What are kernel hackers doing to measure power consumption was the next audience question. Packard said that there is a lot of focus on that in the graphics arena. They are using wall power meters to measure the power consumption of various devices over time and trying to correlate those measurements with active functional units in graphics devices and system-on-chips (SoCs). They measure things like joules-per-movie, which is a critical measure for users. There is an effort to balance the "race to idle" with voltage and frequency scaling, he said, especially for latency-sensitive applications like displaying movies. In addition to just the graphics hardware, they are trying to measure memory power utilization and bus power utilization, he said.
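A figure like joules-per-movie falls out of wall-meter readings by integrating power over time. The following Python sketch does that with the trapezoidal rule; the sample values and the function name are hypothetical, and a real measurement setup would need far more samples and careful calibration.

```python
def energy_joules(samples):
    """Integrate (time_s, power_w) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        # Average power over the interval times the interval length.
        total += (p0 + p1) / 2 * (t1 - t0)
    return total


# Hypothetical wall-meter readings: (seconds since start, watts).
readings = [(0, 12.0), (60, 14.0), (120, 13.0), (180, 12.5)]

joules = energy_joules(readings)
print(f"{joules:.1f} J over {readings[-1][0]} s")
```

Comparing the result for the same movie under different scaling policies is one way to judge whether racing to idle or running slower at lower voltage uses less energy overall.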

The problem is bigger than graphics, Bottomley said. In the storage world, the speed of the buses has been "jacked up", which increases the power usage. On a netbook SATA link, a half-watt of power can be used just to power the bus. Power saving for SATA buses is coming, he said.

The last topic covered was that of "hardware bypass", which refers to devices that take on some of the tasks that are normally handled by the kernel in the interests of performance. Gorman pointed out that bypass is often done to drive some "artificial metric" and that kernel developers need to know what the metric is in order to make proper decisions. The question for those proposing bypass should be "Why are you trying to do that?", he said. The problem is not just for SCSI (or storage), but for all of the various bypass (or offload) proposals.

The CPU isolation feature, which allows an administrator to remove a CPU from those managed by the kernel to run a particular workload unperturbed by the rest of the system, is one that Gorman mentioned. One of the reasons that people give for wanting the feature is to avoid the inter-processor interrupt (IPI) "storms" that can occur. But a better way to approach that problem is to figure out why those storms are happening and to address that instead, he said. That's "ultimately the right thing to do" whenever bypass is suggested.

Linville noted that TCP offload engines provided a boost for some users, but that CPU improvements have "largely erased the gains" that were made. Bottomley said that the question should not be how to avoid the kernel code, but instead should be how to take advantage of the work that the kernel developers do. Essentially, the consensus was that bypass or offload technologies are not only bypassing the kernel, but are also ignoring the collective knowledge and abilities of the Linux kernel community.

Once again, the kernel panel gave a nice glimpse inside the heads of kernel developers. It provided some insight into how they approach problems, and where they think solutions generally lie. It was nice to have a mix of some "fresh blood" as well as "old hands" on the panel, which definitely led to an interesting discussion.


Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds