Leading items
LinuxCon Japan: Advice for new kernel hackers
The final two sessions at LinuxCon Japan were targeted at helping Asian developers become more involved in kernel development. James Bottomley talked about how the social aspects of kernel engineering were as important as its technical aspects. He then moderated a panel of kernel developers that focused on the problems faced by international developers, adding advice to assist new or aspiring kernel hackers in clearing those hurdles.
Some history
Bottomley began with a bit of history, noting that the kernel started off as a tiny "few thousand line" program. In the early days, features were needed, so there were scant reviews and very little justification required for new features. It was "almost the wild west of patches", and he is somewhat embarrassed to think back on some of the patches he got into the kernel. The only requirement was that a developer could write the code to implement a needed feature.
As examples, he cited the error handling in both the SCSI and IDE layers, which were done at interrupt level and could sometimes lock up the machine. The block layer had a single lock for all devices, while the TTY layer had a static array of TTYs.
The era that Bottomley and other kernel developers grew up in was an era of feature development, but "that era is over", which is a good thing. That "anything goes" style produced a full-featured kernel very quickly, but left behind lots of problems. In particular, robustness and scalability suffered under that scheme.
Ten years later, around 2002 or so, things started changing. Kernel development was moving from "hot coders" trying to make a functional Unix system to a community that cared more about robustness. Various subsystems were rewritten, including SCSI, the block layer, and USB (several times). There was also a "five year plan that took ten years" to move the kernel to more fine-grained locking (and eliminate the big kernel lock).
The attitude today is all about code quality, Bottomley said, and that is something that new kernel developers run into immediately. The kernel development process is more about ensuring that new code doesn't affect older code and that there aren't regressions in kernel functionality. Patches get much more review before being merged, and can often get held up waiting for reviewers. Today's kernel is "incredibly feature rich", but that also means that it is complex and thus more fragile. Any new feature that adds complexity gets a lot of scrutiny to see if it really is needed.
There is an elaborate process as well as tools to help enforce the process. It is "exactly like ISO 9001, but worse", Bottomley said to a round of laughter. He pointed out that the Linux development process is compliant with the much-maligned quality management "standardization" effort from the early 2000s. The "Signed-off-by" line in patches, as well as the coding style that is enforced kernel-wide, are elements of a well-established process that seeks to ensure the quality of kernel submissions.
Longtime kernel developers have grown up with these processes, so they make sense to those people, but may be less clear to new contributors. For the most part, they came about because of clear problems that were seen and then addressed. For example, the "Signed-off-by" requirement came about because of the SCO case, but it is useful in its own right, allowing people to trace the origin of various parts of the kernel.
Getting patches into the kernel
The easiest way to get your patches into the kernel is by making a bug fix, he said. The processes are much easier for a bug fix than they are for a new feature. But don't label a new feature as a bug fix; that has been done before, and it won't be successful. For a bug fix, the patch just needs to describe the bug and how the changes fix that bug.
Trying to get a new feature in is much harder; there is a need to "socialize" it first. That means getting a group of users and others interested in the feature, so that they can help advocate its inclusion. Conferences are a great way to socialize a new feature by talking to other kernel developers about it. Meeting face to face to discuss the feature, perhaps over "wine or whisky", can be beneficial as kernel developers are more inclined to listen in those settings, Bottomley said.
When posting patches for review, there is a natural human desire to show your best work, but that is actually somewhat counter-productive. The patch does not need to be perfect, and the discussion about the patch may help build a community of people interested in the feature. In addition, an imperfect patch makes people think they can still give input.
One thing to remember is that a patch posting will get input of various sorts that will require the poster to be the advocate for their patch. That means that one needs to be prepared to argue on the mailing list, which can be difficult, especially for Asian developers. The key is to only respond to the technical complaints and to ignore any personal attacks that might get posted.
It is also important to identify which of the comments to focus on. There are some people who "lurk" on the mailing lists, trying to make life harder for contributors. Ignoring those comments, while responding to and working with the more influential developers, is the right strategy. Developers who get lots of patches accepted (of a technical, rather than janitorial, nature) are likely to be influential, as are those whose feedback is listened to and quoted by others in the discussion. Typically, those people will come up with sensible and constructive suggestions, but they won't always be right, Bottomley said.
For that reason, argument is an essential part of getting any patch accepted. For those who are uncomfortable doing so, he suggested practicing defending the patch with friends and colleagues, ideally in English. Starting out by posting to the proper subsystem mailing list, rather than linux-kernel, will also help, as the signal-to-noise ratio tends to be much better on those lists. In addition, becoming better known on the mailing lists will help with subsequent postings, as people will recognize your name, which will make them more favorably disposed toward new postings. It can also help to meet some of the more influential developers at conferences, he said.
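For finding that subsystem list, the kernel tree itself can help; here is a minimal sketch using the tree's get_maintainer.pl script (the file and patch names below are only examples):

    # Run from the top of a kernel source tree; the script prints the
    # maintainers and mailing lists responsible for a given file or patch.
    $ ./scripts/get_maintainer.pl drivers/scsi/sd.c
    $ ./scripts/get_maintainer.pl 0001-my-bug-fix.patch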
If you are getting arguments from influential developers, it may be because they don't understand what the feature is—or why it is needed. It is important to be able to explain the feature to those folks, but don't waste time arguing with the wrong people. One should definitely accept feedback from the right people and make the appropriate changes; there are often multiple iterations to a patch set before it gets accepted.
Bottomley rounded out his talk with some quick hit suggestions that will help smooth the path for a patch. A good changelog message is essential, one that doesn't say what the code does ("almost all maintainers can read C") but why it is needed. Splitting up patches into manageable pieces is important to simplify the review process. One way to do that is to describe what the patch does to colleagues and split up the patch in a way that corresponds to that description. It is also important to follow the established rules, as they have evolved over a long time and there is a reason for each of them.
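As a purely hypothetical illustration of that advice, a changelog along the following lines describes the bug and the reason for the change rather than narrating the code; the subsystem prefix, bug, and author are all invented:

    subsys: fix use-after-free on device removal

    If a device is removed while I/O is still in flight, the completion
    handler can run after the per-device data has been freed, crashing
    the machine.  Hold a reference to that data until the last
    outstanding request completes, and drop it in the completion path.

    Signed-off-by: Some Developer <sdev@example.com>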
Kernel panel
After that, several other kernel developers joined him on the stage: Arnd Bergmann of IBM/Linaro, Chris Mason of Oracle (at that time), Herbert Xu of Red Hat, and Masami Hiramatsu of Hitachi. After they introduced themselves, Bottomley asked each to describe what obstacles they had faced and overcome in becoming a kernel developer. Hiramatsu mentioned "English, of course" as one of the hurdles, but also noted that the time difference between Japan and most of the other kernel developers can be a barrier. Sometimes he has stayed up late to take part in discussions that would otherwise start without him.
Xu talked about some of the problems he faced as a Debian kernel maintainer. There were numerous patches carried by distributions that he and others tried to push upstream, but they were rejected. The distribution developers had a very different view of what was needed in the kernel than the subsystem maintainers and other mainline developers had. That problem has been largely overcome now, he said, by knowing what the kernel as a whole needs and adapting the distribution patches appropriately.
Mason pointed out that the time zone problem can actually be an advantage. One way to tamp down the flames in a mailing list discussion is to reply slowly, which gives all of the participants some time to reconsider their words and thoughts. He "did everything wrong" in his first kernel project by choosing an enormous feature that he had no idea how to implement, then working on it by himself for months without asking any questions. His suggestion was that new contributors not follow his lead and instead start with smaller bug fixes and get to know the community before moving on to bigger projects.
Bergmann said that his first patch was to support some new hardware; he sent it to a maintainer who disappeared shortly thereafter. His next effort displayed his lack of C knowledge and had three bugs in a single-line patch. Those problems were pointed out by a good maintainer, which led him to go back and study C and programming some more before his next submission. Bottomley pointed out that Bergmann's experience is an example that, normally, the criticism is aimed at the code and not at the person.
Problems specific to developers in Asia (beyond the time zone and language issues) were next up. Hiramatsu noted that his teachers always wanted him to make his code perfect, which is kind of the opposite of the approach that works best for the Linux kernel. While we are ultimately aiming for the perfect kernel, we get there by having an open discussion on patches that are imperfect, he said.
Don't think, just do it
Bottomley asked what, specifically, Hiramatsu would recommend to Japanese developers to make them willing to submit imperfect patches. "Don't think, just do it", said Hiramatsu, who may have a future in sports-gear marketing. The key is not to worry about it too much before sending the patch, he said.
An area where Asian developers could improve, Xu said, is in participating in the kernel as more than just a work duty. In the west, many developers see their efforts in the kernel community as independent of their work, so that even if they started working in an unrelated field, "they would still be part of the community". Engaging the community takes time, and people are very busy with work and family, so they don't have time to make a long-term commitment to one area of the kernel. But it is very useful to do so, he said, and something that developers in Asia could do better.
Both IBM and Linaro have done private reviews for new developers, Bergmann said. The idea is to help them describe and defend the patch by discussing it on an internal mailing list. Essentially it helps train developers in how to argue on mailing lists, so that it will be easier when it needs to be done on linux-kernel.
Mason said that you can only create the perfect patch indirectly. The most important thing to do is to "prove that you understand the problem". That is done with a description of the feature or bug that shows that understanding. What code actually ends up in the mainline is dependent on the maintainers, so the patch should be seen as a "sample solution" that may or may not end up in the mainline. Bottomley pointed to the "Andrew Morton bug fix solution", which is to post a bad patch as a challenge to the rest of the kernel developers, as a similar idea.
Attracting and keeping new kernel developers was another topic covered. Mason said that it is important to encourage people who offer to help, by suggesting that they study the problem they are having. If they pick a feature or bug fix that they need or really care about, they will be more successful. Bergmann suggested that people be encouraged to become maintainers of smaller pieces of the kernel, rather than taking on a bigger piece. He has done that and said it has "worked surprisingly well". It makes for "better maintainers than I am", he said.
As LinuxCon Japan wound to a close, Bottomley asked for "inspirational statements" from the panel. Mason said "Linux is yours, treat it like it's yours", while Xu extended that thought: "Treat it like a hobby", he said. Hiramatsu had the final word by reprising his earlier "Don't think, just do it" message. Suitably inspired, the assembled masses then made their way to the post-conference party.
Introducing the Defensive Patent License
Two professors from the University of California, Berkeley School of Law have launched the Defensive Patent License (DPL), a legal tool that is designed to do for patents what the GPL did for software licenses. It creates a copyleft-style method for patent holders to automatically share their patents with others who agree to share theirs in return. The goal is to "de-weaponize" patents and thus reduce the gridlock that slows down innovation in the technology sector, but the DPL is likely to face an uphill battle.
The DPL's creators are Jason Schultz and Jennifer Urban, both of whom have a background in online legal activism — Schultz with the Electronic Frontier Foundation's (EFF) Patent Busting project, and Urban with ChillingEffects. We first covered the effort in 2010, and the duo have been developing the specifics of the license ever since (a thorough examination can be found in the May 2011 video lecture they link to from the project site). But the DPL itself is now in a "public beta" phase, with feedback solicited from the Internet at large. The current text is available as a PDF, as a Google Docs document, and as Markdown-formatted text on GitHub. There is also a paper describing the rationale for the DPL's specific terms.
The best offense is a good defense
The idea at the heart of the DPL is defensive patents — those that a company files or purchases solely to deter its competitors from bringing lawsuits against it. Defensive patents are not offered for licensing under revenue-generating commercial terms, nor are they used to initiate litigation against others. The result is that large companies amass giant patent portfolios and enjoy the same relative stability as the Cold War's mutually-assured destruction. An unfortunate side effect of this popular strategy is patent proliferation, which leaves open source projects and small companies living in fear of being shut down by astronomically expensive infringement lawsuits, because they cannot stockpile defensive patents of their own.
The DPL is a tool that companies could use to defuse this defensive-patent standoff. Under its terms, a participating company offers a non-exclusive, royalty-free, perpetual, worldwide license for all of the patents in its portfolio to every other patent holder that also participates in the DPL. The license can be revoked for a particular licensee under only two circumstances: if the licensee offensively sues another DPL participant for patent infringement, or if the licensee withdraws its own patent portfolio from the DPL. The revocation is not automatic, however; each DPL licensor has the option to revoke a licensee. The result is that the DPL creates a mutual cross-licensing network whose members have full access to each other's patents. Consequently, they should have no reason to pursue infringement litigation against each other, and defensive patents (both current and future) are devalued.
Outside of the DPL family, however, licensors are permitted to license any patents in their portfolios to any party, and to litigate to their heart's content. In theory that allows them to continue making money from their patents, and to respond to patent threats from outsiders. Two additional terms are important. First, a licensor may withdraw its portfolio from the DPL, but it must give advance notice before doing so (six months in the current wording), and all existing DPL licenses will remain intact. Second, the DPL stipulates that a licensor must ensure that its patents continue to be DPLed even if they are sold or acquired (by making that condition a term of the sale or transfer).
The latter condition is an attempt to prevent players from "gaming" the system by gaining access to the DPL patent pool and then selling themselves, and it is believed to ensure the DPL's persistence after a bankruptcy declaration (although the authors solicit feedback on these points, since preventing such gaming is vital to making the DPL work).
Under ideal circumstances, then, all DPL participants have free and perpetual access to each other's patents, but can still do whatever they want against players outside of the DPL community. That provides an incentive for new parties to join, and no member has the power to refuse membership in the community to another licensor. The requirement that a licensor must place its entire patent portfolio under the DPL is there to keep unscrupulous companies from donating junk patents while keeping valuable ones private, which would prevent the pool from becoming valuable in the first place.
Is it that simple?
In their talk, Schultz and Urban enumerated several concerns about the DPL raised in their conversations with outsiders. One is that access to the pool of DPL patents is not sufficient incentive to join. Another is that the full-portfolio requirement is too off-putting and that a smaller commitment ought to be required. There are also potential anti-trust issues in some jurisdictions, the possibility of loopholes not yet discovered, and the general criticism that the DPL simply adds another entanglement to the already hard-to-navigate thicket of patent problems.
They also admit that many of the technology sector's problematic patents are not defensive, so the DPL will not end all patent litigation. In particular, patent trolls would be essentially unaffected by the existence of even a large DPL. Trolls litigate with offensive patents, and they do so without fear of retaliation because they make no products or services of their own (i.e., you cannot counter-sue a patent troll for infringing on your own portfolio, because the troll has no products; the mutually-assured-destruction strategy does not work against them, DPL or not).
Since the DPL's public launch, there have been several responses that offered additional concerns. David Hayes and Eric Schulman argue that joining the DPL disproportionately favors small players with fewer patents (who thus get access to more patents than they contribute back). Stephan Kinsella notes that small players do not get much for free because you can only join the DPL community if you have patents, and patents remain expensive to get and to retain. Kinsella also observes that it may be difficult to get the DPL pool started given the unpredictability of the US federal government.
Given the state of mutually-assured destruction, it is inherently risky to be the first one to lay down one's weapons, but that concern may be overstated by DPL critics. After all, for the first patent holder to join the pool, nothing changes: the company looks benevolent, but still has free rein to litigate against non-members (i.e., everyone) at will. Still, the "all-in" portfolio requirement has another problem: it is only appealing to companies whose entire portfolio consists of defensive patents, and it leaves no room for other kinds. It does not take much speculation to see that there are companies working in both software and hardware (e.g., Intel or IBM) with hardware-related patents that even the staunchest software-patent critic might concede are valid original inventions.
For open source software projects, though, the primary concern is likely to be Kinsella's point about requiring patents to buy a seat at the table. Schultz and Urban concede that open source projects typically do not file for patents — for many reasons, including cultural opposition and mistrust of the patent system. But the high cost of acquiring a patent is not something the DPL can change.
There have been other approaches to fixing the patent problem from open source projects' perspective, including the Open Invention Network and Twitter's recent Innovator's Patent Agreement. On June 19, the EFF launched its own patent reform campaign with a seven-point list of fixes. Compared to the other efforts, the DPL is not so much an attempt to fix the patent system as it is a way for interested patent holders to remove themselves from the defensive-patent game.
That option certainly won't appeal to everyone — certainly not to patent trolls or others who profit directly from gaming the system — but then again, the GPL permits developers to escape from the typical software licensing hijinks, and it has proven remarkably successful, as has the Creative Commons license suite for authors and artists. Not every such attempt to craft a standardized license is a success; Canonical's Project Harmony attempted to draft a standardized set of contributor agreements, but so far it does not seem to have caught on widely. The DPL project says it is open to public feedback, however, so if there is a consensus to be reached on anything resembling a "GPL for patents," this is probably how we will find it.
Liberation fonts and the tricky task of internationalization
Fedora is debating dropping the storied Liberation font family from its distribution in favor of a fork. Liberation was one of the highest-profile open fonts, but it has languished since its initial release. Licensing issues were part of the problem, but so was the subtler disconnect of Liberation's origin as a work commissioned from a proprietary company with no interest in working with the community. The pressures of internationalization mean that the community has long sought a replacement, one that it can continue to develop.
Liberation through the ages
Liberation was released in 2007 by Red Hat, which had commissioned the designs from the commercial foundry Ascender Corporation. The initial set consisted of three fonts — Liberation Sans, Liberation Serif, and Liberation Mono — specifically designed to have the same metrics as the proprietary Monotype fonts Arial, Times New Roman, and Courier New, respectively. That meant that every character in Liberation would be exactly the same height and width as its counterpart in the proprietary font, so Liberation could serve as a "drop-in" replacement without disturbing line breaks or pagination. In 2010, Oracle donated a fourth typeface to Liberation: Liberation Sans Narrow, which was designed to be metric-compatible with Arial Narrow.
The Liberation family was regarded as high-quality, but it covered only the Latin, Greek, and Cyrillic alphabets, which left a lot of writing systems unaddressed. That alone is not a problem; fonts can be, and frequently are, extended to new writing systems. But Liberation was licensed under unique terms, which inadvertently prevented such expansion.
Originally, the license was the GPLv2 with the Free Software Foundation's standard font embedding exception (which specifies that embedding the font in a PDF or similar document does not make the document itself a "combined work" triggering the GPL). However, Red Hat subsequently appended additional clauses to the license covering trademark and intellectual property concerns, and included a custom anti-Tivoization provision. After an examination of the extra clauses, Debian decided that they constituted additional restrictions on the GPLv2, which made the license self-contradictory and the fonts impossible to redistribute. The FSF reportedly concluded that the Liberation license was not a self-contradicting paradox, but said that it was incompatible with the GPL. Furthermore, in recent times the GPL-with-font-embedding-exception approach has fallen out of favor as an open font licensing choice, largely in favor of the SIL Open Font License (OFL). Fedora is aware of this shift, and now recommends the OFL for font projects.
Regardless of the exact details, however, the general consensus was that Liberation's peculiar license was, at best, problematic. More importantly, the practical upshot was that few people were interested in contributing new character sets. The fonts have remained essentially unchanged since 2007; minor fixes and isolated characters have been added, but no entirely new scripts.
Replacement plan
The metric-compatibility feature of Liberation was its main selling point: it enabled Linux users to share documents with colleagues who had the popular Monotype fonts installed (e.g., on all Windows systems) without disturbing the layout.
On May 17, Parag Nemade emailed the Fedora-devel list to request packaging the Croscore family as a default, to serve as an alternative to Liberation. The Croscore family covers all of the same language blocks as Liberation, plus several new ones (such as Hebrew, Pinyin, and many African alphabets). It consists of three fonts — Arimo, Tinos, and Cousine — which offer metric compatibility with Monotype's Arial, Times New Roman, and Courier New, respectively.
They were commissioned by Google for use in ChromeOS, and not only are they also the work of Steve Matteson, the same designer who created Liberation, but they are in fact a more recent version of the exact same designs. In an amusing bit of irony, however, Ascender Corporation (Matteson's company) was acquired by Monotype in 2010, so the new font family is copyrighted by Monotype, but designed to replace other Monotype fonts.
More to the point, however, Google made Croscore available under the OFL, which makes it simpler for outside contributors to extend the fonts to new character sets. Following the discussion in Nemade's thread, Fedora font packager Pravin Satpute proposed importing the Croscore sources into the Liberation package, replacing the problematically licensed content rather than starting a separate package.
The fontconfig package handles automatic font substitution on Linux, so once a change is pushed through with rules that replace (for example) Courier New with Croscore's Cousine instead of Liberation Mono, the only remaining hurdle will be for users to get used to the new names in the "Font" menu. On the other hand, growing the fonts to extend coverage to new writing systems is not trivial. The OFL makes that easier, since it enables developers to import and reshape glyphs from the large assortment of other OFL-licensed fonts.
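For the curious, a substitution rule of that sort might look something like the following sketch, written in fontconfig's XML configuration format (the actual rules Fedora ships may well differ):

    <?xml version="1.0"?>
    <!DOCTYPE fontconfig SYSTEM "fonts.dtd">
    <fontconfig>
      <!-- When an application asks for Courier New, offer the
           metric-compatible Cousine as the preferred substitute. -->
      <alias binding="same">
        <family>Courier New</family>
        <accept>
          <family>Cousine</family>
        </accept>
      </alias>
    </fontconfig>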
What makes all this so difficult, anyway?
The Fedora plan calls for the community to continue development on the "Liberation 2.0" series in the open, as the original Liberation never was. It would probably be a minor story were it not for the fact that the same stalemate has developed around other open font commissions.
Much the same sequence of events befell the Bitstream Vera font family, which was designed by Bitstream (another commercial foundry which has since been acquired by Monotype) for the GNOME Foundation, and released in 2003. It, too, was under a license unique to the project, and has not seen any significant updates since its original release. Google has commissioned two fonts for distribution with Android: the familiar Droid family and the newer Roboto; both licensed under the Apache License (as is most of Android itself). Both offer wide language coverage in at least one of the faces (the sans serif), but have not otherwise seen significant expansion.
About the only open font commissioned from a commercial foundry that has grown to include more languages and alphabets is the Ubuntu font family designed by the Dalton Maag foundry. Although the details are of course private, Dalton Maag has an ongoing arrangement with Canonical to add more character sets over time. But the project does use a public issue tracker and accepts input and feedback from the user community, which none of the other commercial font commissions do.
Those differences are revealing. Commissioned open font projects such as Liberation and Bitstream Vera invariably attract significant attention — as do large "donations" of other types to open source. But when they are delivered in a self-contained bundle and not developed further, they have far less impact. It is easy for those of us who natively read European languages to forget just how many writing systems are not covered by basic Latin, Greek, and Cyrillic. Meanwhile, there are purely community-driven font projects that do cover far more of the globe's writing systems, such as Linux Libertine or DejaVu (the latter of which extends Bitstream Vera, side-stepping the peculiar Bitstream license by releasing its changes into the public domain).
The perception among the public is that the commercial fonts are of higher quality than the community-built ones, a charge likely to rankle anyone who works on free software professionally. But by choosing non-standard licenses and not establishing the fonts as software packages that can be studied and patched, the early commercial commissions made that charge difficult to disprove. The problem is exacerbated when the foundry is uninterested in continuing to participate, as Bitstream was when it said it would only extend Vera if it were paid to do so.
But stagnation is detrimental to a font package just as it is to any software. Not only can every font be extended to cover more languages, but there are newer technologies like OpenType features and the Web Open Font Format (WOFF) to consider. Adding new character sets to a font is clearly a challenging task, demanding familiarity with multiple alphabets and often requiring patches to be integrated by hand in a tool like FontForge. Hopefully the restarted Liberation 2.0 effort can draw on the lessons learned by Dalton Maag and DejaVu, and grow a sustainable project around the family. The original Liberation fonts filled a vital gap on Linux desktops, and watching them languish has been disconcerting. Liberation now has the opportunity to re-import an entire codebase under a better license than the one that has hampered it for five years; few projects get a chance to start over at the same level, and this one deserves its second shot.