LWN.net Weekly Edition for November 25, 2010
On breaking things
Our systems run a complex mix of software which is the product of many different development projects. It is inevitable that, occasionally, a change to one part of the system will cause things to break elsewhere, at least for some users. How we respond to these incidents has a significant effect on the perceived quality of the platform as a whole and on its usability. Two recent events demonstrate two different responses - but not, necessarily, a clear correct path. The two events in question are these:
- An optimization applied to glibc changed the implementation of memcpy(), breaking a number of programs in the process. In particular, the proprietary Flash plugin, which, contrary to the specification, uses memcpy() to copy overlapping regions (see the short illustration after this list), is no longer able to play clear audio for some kinds of media.
- A change in the default protections for /proc/kallsyms, merged for the 2.6.37 kernel, was found to cause certain older distributions to fail to boot. The root cause is apparently a bug in klogd, which does not properly handle a failure to open the symbol file.
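To make the first of those items concrete, here is a minimal illustration of the memcpy()/memmove() distinction at issue; it is a made-up sketch, not Adobe's or glibc's actual code. Copying between overlapping regions with memcpy() is undefined behavior that merely happened to work with the old implementation; memmove() is what the standard provides for that case.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[] = "abcdef";

        /* Overlapping copy done correctly: memmove() is defined for this. */
        memmove(buf + 2, buf, 4);
        printf("%s\n", buf);    /* prints "ababcd" */

        /* The equivalent memcpy(buf + 2, buf, 4) is undefined behavior; it
         * happened to produce the same result with the old glibc code and
         * can produce corrupted data with the new one. */
        return 0;
    }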
In summary, we have two changes, both of which were intended to improve the behavior of the system - better performance, in the glibc case, and better security for /proc/kallsyms. In each case, the change caused other code which was buggy - but which had been working - to break. What came thereafter differed considerably, though.
In the glibc case, the problem has been experienced by users of Fedora 14, which is one of the first distributions to ship the new memcpy() implementation. Given that code using glibc has been rendered non-working by this change, one might reasonably wonder if the glibc developers have considered reverting it. As far as your editor can tell, though, nobody has even asked them; the developers of that project have built a reputation for a lack of sympathy in such situations. They would almost certainly answer that the bug is in the users of memcpy() who, for whatever reason, ignored the longstanding rule that the source and destination arrays cannot overlap. It is those users who should be fixed, not the C library.
The Fedora project, too, is in a position to revert the change. The idea was discussed at length on the fedora-devel mailing list, but the project has, so far, taken no such action. At this level, there is a clear tension between those who want to provide the best possible user experience (which includes a working Flash player) in the short term, and those who feel that allowing this kind of regression to hold back a performance improvement is bad for the user experience in the longer term. According to the latter group, reverting the change would slow things down for working programs and relieve the pressure on Adobe to fix its bug. It is better, they say, for affected users to apply a workaround and complain to Adobe. That view appears to have carried the day.
In the /proc/kallsyms case, the change was reverted; an explicit choice was made to forgo a potential security improvement to avoid breaking older distributions. This decision has been somewhat controversial, both on the kernel mailing list and here on LWN. The affected distribution (Ubuntu 9.04) is relatively old; its remaining users are unlikely to put current kernels on it. So a number of voices were heard to say that, in this case, it is better to have the security improvement than compatibility with older distributions.
Linus was clear about his policy, though: the kernel must not break things that used to work, regardless of where the real bug lies.
The kernel's record with regard to this rule is, needless to say, not perfect, but that record as a whole is quite good; that has served the kernel well. It is usually possible to run current kernels on very old distributions, allowing users to gain new hardware support and features, or simply to help with testing. It forms a sort of contract with the kernel's users which gives them some assurance that new releases will not cause their systems to break. And, importantly, it helps the kernel developers to keep overall kernel quality high; if you do not allow once-working things to break, you can be at least somewhat sure that the quality of the kernel is not declining over time. Once you start allowing some cases to break, you can never be sure.
There is probably little chance of a kernel-style "no regressions" rule being universally adopted. Even in current kernels, the interface to the rest of the system is relatively narrow; the system as a whole has a much larger range of things that can break. It is a challenge to keep new kernel releases from causing problems with existing applications; for a full distribution, it's perhaps an insurmountable challenge. That is part of why companies pay a lot of money for distributions which almost never make new releases.
Some kinds of regressions are also seen as being tolerable, if not actively desirable. There has never been any real sympathy for broken proprietary graphics drivers, for example. The proprietary nature of the Flash plugin will not have helped in this case either; it is irritating to know exactly how to fix a problem, but to be unable to actually apply that fix. Any free program affected by this bug would, if anybody cared about it at all, have been fixed long ago. Flash users, meanwhile, are still waiting for Adobe to change a memcpy() call to memmove(). One could certainly argue that holding Adobe responsible for its bug - and, at the same time, demonstrating the problems that come with proprietary programs - is the right thing to do.
On the other hand, one could argue that breaking Flash is a good way to demonstrate to users that they should be using a different distribution - or another operating system entirely. Your editor would suggest that perfection with regard to regressions is not achievable, but it still behooves us to try for it when we can. There is a lot to be said for creating a sense of confidence that software updates are a safe thing to apply. It will make it easier to run newer, better software, inspire users to test new code, and, maybe, even bring some vendors closer to upstream. We should make a point of keeping things from breaking, even when the bugs are not our fault.
Reports of procmail's death are not terribly exaggerated
The mail delivery agent (MDA) procmail is a Linux and Unix mainstay; for years it has been the recommended solution for sorting large volume email and filtering out spam. The trouble is that it is dead, and it has been for close to a decade. Or at least that may be the problem, depending on how you look at it. The question of when (or if) to declare an open source project dead does not have a clear answer, and many people still use procmail to process email on high-capacity systems.
For those unfamiliar with it, MDAs like procmail receive incoming mail from mail transport agents (MTAs) like Sendmail or Postfix, then process the received messages according to user-defined "recipes." Recipes examine the headers and body of messages, and are usually used to sort email to different mailboxes, forward messages to different addresses, and perhaps most importantly, to recognize and dispose of spam — often by triggering an external spam filtering tool like SpamAssassin. Recipes can also modify messages themselves, such as to truncate dangerously long message bodies or abbreviate irritatingly-long recipient lists.
Officially, the last stable procmail release was version 3.22, made in September of 2001. As one might expect, there has never been an official "the project is dead" announcement. Instead, only circumstantial evidence exists. Although several of the FTP mirrors include what appear to be development "snapshot" packages as recent as November of 2001, there does not appear to have been any substantial work since that time. The developers' mailing list has hardly seen a non-spam blip since 2003.
One side effect of a project abandoned that long ago is that there is no web-based source code repository to examine; such repositories are a fixture today, but were not at the time, so only the tarballed releases uploaded to the FTP or HTTP download sites exist for FOSS archaeologists to study. Similarly, a great many of the links on the official project page, including mailing list archives, external FAQ pages, and download mirrors, have succumbed to link-rot over the years and no longer provide access to useful information for those just getting started.
I'm not dead yet
Despite all this, procmail still has a loyal following. The procmail users' mailing list is actually quite active, with most of the traffic focusing on helping administrators maintain procmail installations and write or debug recipes. Reportedly, many of today's procmail users are Internet service providers (ISPs), who naturally have an interest in maintaining their existing mail delivery tool set.
procmail's defenders usually cite its small size and its steady reliability as reasons not to abandon the package. A discussion popped up on the openSUSE mailing list in mid-November about whether or not the distribution should stop packaging procmail; Stefan Seyfried replied by saying that rather than dying ten years ago, the program was "finished" ten years ago.
In a similar vein, when Robert Holtzman asked on the procmail users' list whether or not the project was abandoned, Christopher L. Barnard replied "It works, so why mess with it? It does what it needs, no more development is needed..."
But there are risks inherent in running abandonware, even if it was of stellar quality at the last major release. First and foremost are unfixed security flaws. Mitre.org lists two vulnerabilities affecting procmail since 2001: CVE-2002-2034, which allows remote attackers to bypass the filter and execute arbitrary code by way of specially-crafted MIME attachments, and CVE-2006-5449, which uses a procmail exploit to gain access to the Horde application framework. In addition, of course, there are other bugs that remain unfixed. Matthew G. Saroff pointed out one long-standing bug, and the procmail site itself lists a dozen or so known bugs as of 2001.
Just as importantly, the email landscape and the system administration marketplace have not stood still since 2001, either. Ed Blackman noted that procmail cannot correctly handle MIME headers adhering to RFC 2047 (which include non-ASCII text), despite the fact that RFC 2047 dates back to 1996. RFC 2047-formatted headers are far from mandatory, but they do continue to rise in frequency.
Bart Schaefer notes that every now and then, someone floats the possibility of a new maintainer stepping up — but no one ever actually does so. Regardless of the theoretical questions about unfixed bugs, that practical reality settles the matter better than any other logic could: if no one works on the code, and no one is willing to work on the code, then surely it can be called abandoned.
What's a simple procmail veteran to do?
The most often-recommended replacement for procmail is Maildrop, an application developed by the Courier MTA project. Like procmail, Maildrop reads incoming mail on standard input and is intended to be called by the MTA, not run directly. It also requires the user to write message filters in a regular-expression-like language, but it reportedly uses an easier-to-read (and thus, easier-to-write) syntax.
The project also advertises several feature and security improvements over procmail, such as copying large messages to a temporary file before filtering them, as opposed to loading them into memory. Maildrop can also deliver messages to maildir mailboxes as well as to mbox mailboxes; procmail natively supports just mbox, although it can be patched (as distributions seem to have done) or use an external program to deliver to maildir mailboxes.
The merits of the competing filter-writing syntaxes are a bit subjective, but it is easy to see that procmail's recipe syntax is more terse, using non-alphabetic characters and absolute positioning in place of keywords like "if" and "to." For example, the Maildrop documentation provides some simple filter rules, such as this filter that is triggered by the sender address boss@domain.com and includes the string "project status" somewhere in the Subject line:
    if (/^From: *boss@domain\.com/ \
        && /^Subject:.*[:wbreak:]project status[:wbreak:]/)
    {
        cc "!john"
        to Mail/project
    }
The action enclosed in curly braces routes the message to the Mail/project folder, and forwards a copy of the message to the user "john." An equivalent in procmail's recipe language might look like this instead:
    :0:
    * ^From.*boss@domain\.com
    * ^Subject:.*(project status)
    ${DEFAULT}/project
    ! john@domain.com
The first line specifies that this is a new recipe; the trailing colon tells procmail to lock the mail file, which is necessary when saving the message to disk. The asterisks and exclamation point that begin lines are operators indicating new "conditions" and the forwarding action, respectively — neither is part of a regular expression. As you can see, the Maildrop syntax is not noticeably longer, but it could be easier to mentally parse late at night — particularly if reading filters written by someone else. Regrettably there does not seem to be an active project to automatically convert procmail recipes to Maildrop filters, which means switching between the packages requires revisiting and rewriting the rules.
Maildrop is not the only actively maintained MDA capable of filling in for procmail, although it is the easiest to switch to, by virtue of reading incoming messages on standard input just as procmail does. Dovecot's Local Delivery Agent (LDA) module, for instance, has a plugin that allows administrators to write filtering rules in the Sieve language (RFC 5228). Maildrop has an advantage over LDA, though, in that in addition to Courier, it is also designed to work with the Qmail and Postfix MTAs.
If you are currently running procmail without any trouble, then there is certainly no great need to abandon it and switch to Maildrop or any other competitor. OpenSUSE, for its part, eventually concluded that there was no reason to stop packaging procmail, for the very reasons outlined above: it works, and people are still using it. However, ten years is a worryingly long time to go without an update. The simple fact that there are only two CVEs related to procmail since its last release is in no way a guarantee that it is exploit- or remote-exploit-free. At the very least, if your mail server relies on the continued availability of procmail, now is a good time to start examining the alternatives. Lumbering undead projects can do a lot of damage when they trip and fall.
Impressions from the 12th Realtime Linux Workshop in Nairobi
A rather small crowd of researchers, kernel developers and industry experts found their way to the 12th Real-Time Linux Workshop (RTLWS) hosted at Strathmore University in Nairobi, Kenya. The small showing was not a big surprise, but it also did not make the workshop any less interesting. After eleven workshops in Europe (Vienna, Milano, Valencia, Lille, Linz, Dresden), America (Orlando, Boston, Guadalajara) and Asia (Singapore, Lanzhou), the organizing committee of the Realtime Linux Workshop decided that it was time to go to Africa. The main reason for this was the numerous authors who had handed in their papers in the previous years but were not able to attend the workshop due to visa problems. Others simply were not able to attend such events due to financial constraints. So, in order to give these interested folks the opportunity to attend and to push the African FLOSS community, and of course especially the FLOSS realtime community, Nairobi was chosen to be the first African city to host the Realtime Linux Workshop.
Kenya falls into the category of countries which seem to be completely disorganized, but very effective on the spontaneous side at the same time. As a realtime person you need to deal with very relaxed deadlines, gratuitous resource reservations and less-than-strict overall constraints, but it's always a good experience for folks from the milestone- and roadmap-driven hemisphere to be reminded that life actually goes on very well if you sit back, relax, take your time and just wait to see how things unfold.
Some of the workshop organizers arrived a few days before the conference and had adjusted enough to the local way of life so they were not taken by surprise that many of the people registered for the conference did not show up but, at the same time, unregistered attendees filled in.
Day 1
The opening session, scheduled at 9AM on Monday, started on time at 9:40, which met the already-adjusted deadline constraints perfectly well. Dr. Joseph Sevilla and deputy vice-chancellor Dr. Izael Pereira from Strathmore University and Nicholas McGuire from OSADL's Realtime Linux working group welcomed the participants. Peter Okech, the leader of the Nairobi organization team, did the introduction to the logistics.
Without further ado, Paul McKenney introduced us to the question of whether realtime applications require multicore systems. In Paul's unmistakable way he led us through a maze of questions; only the expected quiz was missing. According to Paul, realtime systems face the same challenges as any other parallel programming problem. Parallelizing a given computation does not necessarily guarantee that things will go faster. Depending on the size of the work set, the way you split up the data set and the overhead caused by synchronization and interprocess communication, this might actually leave you very frustrated as the outcome can be significantly slower than the original, serialized approach. Paul gave the unsurprising advice that you definitely should avoid the pain and suffering of parallelizing your application if your existing serialized approach does the job already.
If you are in the unlucky position that you need to speed up your computation by parallelization, you have to be prepared to analyze the ways to split up your data set, choose one of those ways, split up your code accordingly, and figure out what happens. Your mileage may vary and you might have to lather, rinse and repeat more than once.
So that leaves you on your own, but at least there is one aspect of the problem which can be quantified. The required speedup and the number of cores available allow you to calculate the ratio between the work to be done and the communications overhead. A basic result is that you need at least N+1 cores to achieve a speedup of N, but as the number of cores increases, the ratio of communications overhead to work goes up nonlinearly, which means you have less time for work due to synchronization and communication. Larger jobs are more suitable than small ones, but, even then, it depends on the type of computation and on the ability to split up the data set in the first place. Parallelization, both within and outside of the realtime space, still seems to be an unlimited source of unsolved problems and headaches.
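As a rough back-of-the-envelope model (this simplification is mine, not something taken from Paul's slides): if each of C cores loses a fraction f(C) of its time to synchronization and communication, the achievable speedup is only about

    S(C) \approx C \left( 1 - f(C) \right)

so reaching a speedup of N requires at least C >= N / (1 - f(C)) cores - always more than N, and increasingly more as f(C) grows with the core count.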
Paul left it to me to confuse the audience further with an introduction to the realtime preemption patch. Now admittedly the realtime preemption patch is a complex piece of software and not likely to fall into the category of realtime systems whose correctness can be verified with mathematical proof. Carsten Emde's followup talk looked at the alternative solution of monitoring such systems over a long period of time to reach a high level of confidence of correctness. There are various methods available in the kernel tracer to monitor wakeup latencies. Some of those have low-enough impact to allow long-term monitoring even on production systems. Carsten explained in depth OSADL's efforts in the realtime QA Farm. The long-term testing effort in the QA farm has improved the quality of the preempt-rt patches significantly and gives us a good insight into their behaviour across different hardware platforms and architectures.
On the more academic side, the realtime researchers from the ReTiS Lab at the Scuola Superiore Sant'Anna, Pisa, Italy looked at even more complex systems in their talk titled "Effective Realtime computing on Linux". Their main focus is on non-priority-based scheduling algorithms and their possible applications. One of the interesting aspects they looked at is resource and bandwidth guarantees for virtual machines. This is not really a realtime issue, but the base technology and scheduling theory behind it emerges from the realtime camp and might prove the usefulness of non-priority-based scheduling algorithms beyond the obvious application fields in the realtime computing space.
One of the most impressive talks on day one was the presentation of a "Distributed embedded platform" by Arnold Bett from the University of Nairobi. Arnold described an effort driven by physicists and engineers to build an extremely low-cost platform applicable to a broad range of essential needs in Kenya's households and industry. Based on a $1 Z80 microcontroller, configurable and controllable by the simplest PC running Linux, they built appliances for solar electricity, LED-based room lights and simple automation tasks in buildings and shop floors. All tools and technology around the basic control platform are based on open source technology, and both the hardware and the firmware of the platform are going to be available under a non-restrictive license. The hardware platform itself is designed to be manufactured in a very cost-effective way not requiring huge investments for the local people.
Day 2
The second day was spent with hands-on seminars about git, tracing, powerlink, rt-preempt and deadline scheduling. All sessions were attended by conference attendees and students from the local universities. In addition to the official RTLWS seminars, Nicholas McGuire gave seminars with the topics "filesystem from scratch", "application software management", "kernel build", and "packaging and customizing Debian" before and after the workshop at the University of Nairobi.
Such hands-on seminars have been held alongside most of the RTLWS workshops. From experience we know that it is often the initial resistance that stops the introduction of technologies. Proprietary solutions are presented as "easy to use", as solving problems without the need to manage the complexity of technology and without investing in the engineering capabilities of the people providing these solutions. This is and always has been an illusion or worse, a way of continued creation of dependency. People can only profit from technology when they take control of it in all aspects and when they gain the ability to express their problems and their solutions in terms of these technological capabilities. For this to happen it's not sufficient to know how to use technology. Instead it's necessary that they understand the technology and are able to manage the complexity involved. That includes mastering the task of learning and teaching technology and not "product usage". That's the intention of these hands-on seminars, and, while we have been using GNU/Linux as our vehicle to introduce core technologies, the principles go far beyond.
Day 3
The last day featured a follow-up talk by Peter Okech to his surprising topic of last year, inherent randomness. It was fun to see new interesting ways of exploiting the non-deterministic behavior of today's CPUs. Maybe we can get at least a seed generator for the entropy pool out of this work in the not-so-distant future.
The afternoon session was filled with an interesting panel discussion about "Open Innovation in Africa". Open Innovation is, according to Carsten Emde, a term summing up initiatives from open source to open standards with the goal of sharing non-differentiating know-how to develop common base technologies. He believes that open innovation - not only in the software area - is the best answer to the technological challenges of today and the future. Spending the collective brain power on collaborative efforts is far more worthwhile than reinventing the wheel in different and incompatible shapes and sizes all over the place.
Kamau Gachigi, Director of FabLab at the University of Nairobi, introduced the collaborative innovation efforts of FabLab. FabLabs provide access to modern technology for innovation. They began as an outreach project from MIT's Center for Bits and Atoms (CBA). While CBA works on multi-million dollar projects for next-generation fabrication technologies, FabLabs aim to provide equipment and materials in the low-digit-dollars range to gain access to state-of-the-art and innovative next-generation technologies. FabLabs have spread out from MIT all over the world, including to India and Africa, and provide a broad range of benefits from technological empowerment, technical training, localized problem solving, and high-tech business incubation to grass-roots research. Kamau showed the impressive technology work at FabLabs which is done with a very restricted budget based on collaborative efforts. FabLabs are open innovation at its best.
Alex Gakuru, Chair of ICT Consumers Association of Kenya, provided deep insight into the challenges of promoting open source solutions in Kenya. One of the examples he provided was the Kenya state program to provide access to affordable laptops to students, on whose committee he served. Alex found that it was impossible to get reasonable quotes for Linux-based machines for various reasons, ranging from the uninformed nature of committee members, through the still not-entirely-resolved corruption problem, to the massive bullying by the usual-suspect international technology corporations which want to secure their influence and grab hold of these new emerging markets. He resigned in frustration from the committee after unfruitful attempts to make progress on this matter. He is convinced that Kenya could have saved a huge amount of money if there had been a serious will to fight the mostly lobbying-driven choice of going with the "established" (best marketed solution). His resignation from this particular project did not break his enthusiasm and deep concern about consumer rights, equal opportunities and open and fair access to new technologies for all citizens.
Evans Ikua, FOSS Certification Manager at FOSSFA (Free and Open Source Software Foundation for Africa, Kenya) reported on his efforts to provide capacity building for FOSS small and medium enterprises in Africa. His main concern is to enable fair competition based on technical competence to prevent Africa being overtaken by companies which use their huge financial backings to buy themselves into the local markets.
Evans's concerns were pretty much confirmed by Joseph Sevilla, Senior Lecturer at Strathmore University, who complained about the lack of "The Open Source/Linux" company which competes with the commercial offerings of the big players. His resolution of the problem - to just give up - raised more than a few eyebrows among the panelists and the audience, though.
After the introductory talks, a lively discussion about how to apply and promote the idea of open innovation in Africa emerged, but, of course, we did not find the philosopher's stone that would bring us to a conclusive resolution. The panelists agreed that many of the technologies which are available in Africa have come in from the outside; sometimes they fit the needs, and in other cases they simply don't. Enabling local people to not only use but to design, develop, maintain and spread their own creative solutions to their specific problems is a key issue in developing countries. To facilitate this, they need not only access to technical solutions, but full and unrestricted control of the technological resources with which to build those solutions. Taking full control of technology is the prerequisite to effectively deploy it in the specific context - and, as the presentations showed us, Africa has its own set of challenges, many of which we simply would never have thought of. Open innovation is a key to unleash this creative potential.
Conclusions
Right after the closing session a young Kenyan researcher pulled me aside to show me a project he has been working on for quite some time. Coincidentally, this project falls into the open innovation space as well. Arthur Siro, a physicist with a strong computer science background, got tired of the fact that there is not enough material and equipment for students to get hands-on experience with interesting technology. Academic budgets are limited all over the world, but especially in a place like Kenya. At some point he noticed that an off-the-shelf PC contains hardware which could be used for both learning and conducting research experiments. The most interesting component is the sound card. So he started working on feeding signals into the sound card, sampling them, and feeding the samples through analytic computations like fast Fourier transforms. The results can be fed to a graphic application or made available, via a simple parallel port, to external hardware. The framework is purely based on existing FOSS components and allows students to dive into this interesting technology with the cheapest PC hardware they can get their hands on. His plans go further, but he'll explain them himself soon when his project goes public.
My personal conclusion from this interesting time in Nairobi is that we really need to look out for the people who are doing the grunt work in those countries and give them any possible help we can. One thing is sure: part of this help will be to just go back there in the near future and show them that we really care. In hindsight we should have made more efforts upfront to reach out to the various groups and individuals interested in open source and open innovation, but hindsight is always easier than foresight. At least we know how to do better the next time.
On behalf of the participants and the OSADL RTLWS working group I want to say thanks again to the Nairobi organization team led by Peter Okech for setting up the conference and taking care of transportation, tours to the Nairobi national park, and guiding us safely around. Lastly, we would like to encourage the readers of LWN.net who are involved in organizing workshops and conferences to think about bringing their events to Africa as well, in order to give the developers and students there the chance to participate in the community as they deserve.
(The proceedings of the 12th RTLWS are available as a tarball of PDF files).
Novell acquired by Attachmate
The big news in the Linux world this week is Novell's agreement to be acquired by Attachmate. While the financial terms of that agreement seem—at first blush anyway—to be a fairly reasonable deal for Novell shareholders, there is something of an odd addition: a concurrent sale of "intellectual property assets" to a newly formed holding company. That CPTN Holdings LLC was organized by Microsoft makes the acquisition more than a little worrisome to many in the Linux and free software communities.
Novell has been trying to find the right buyout offer since at least March, when Elliott Associates made an unsolicited offer to buy the company for $5.75/share. Attachmate offered $6.10/share, but it also gets an influx of $450 million from the asset sale to CPTN, so it is, in effect, putting up less money than Elliott Associates would have. In any case, the Novell board, and presumably its stockholders, are likely pleased with the extra $0.35/share they will receive.
In the 8K filing that Novell made about the acquisition, the assets that are being sold to CPTN were specified as 882 patents. Which patents those are is an open question. While the idea of more patents in the hands of Microsoft and a "consortium of technology companies" is somewhat depressing, it's too early to say whether they are aimed squarely at Linux. Novell has been in a lot of different businesses over the years, so it's possible—though perhaps unlikely—that these patents cover other areas.
While Attachmate is not a well-known company in the Linux and free software world—or even outside of it—it has made all the right noises about what it plans to do with Novell once the acquisition is completed. The press release says that Attachmate "plans to operate Novell as two business units: Novell and SUSE", which may imply that there isn't a plan to break up the company and sell off the pieces—it certainly makes logical sense to split those, basically unrelated, parts into separate business units. Mono project lead Miguel de Icaza has said that Mono development will continue as is. Attachmate also put out a brief statement to try to reassure the openSUSE community: "Attachmate Corporation anticipates no change to the relationship between the SUSE business and the openSUSE project as a result of this transaction".
The 8K mentions some interesting escape clauses for Novell, including the ability to void the asset sale if a better offer for the company and those patents comes along. In addition, if the acquisition by Attachmate does fall through for some other reason, CPTN can continue with the patent purchase, but it must license the patents back to Novell. That license will be a "royalty-free, fully paid-up patent cross license" of all patents that both Novell and CPTN hold (including the 882 in question) on terms that are "no less favorable" than those offered to others outside of CPTN. Essentially, Novell wants to ensure that it can still use those patents if it doesn't get acquired by Attachmate.
Though the 8K is silent about what rights Attachmate will get to the patents, one plausible scenario is that Attachmate is already a member of CPTN. If that's the case, it may be exempt from any patent lawsuits using the 882 Novell patents. That could set up a situation where an attack on various other distributions—but not SUSE—is made. Given the cross-licensing language that is in the 8K, it's a bit hard to believe that Attachmate wouldn't have some kind of agreement in place. That, in turn, could imply that some of those patents are potentially applicable to Linux and free software.
It is tempting to speculate about what this means for our communities—we have done a bit of that here and many are going much further—but it is rather premature. The escape clause certainly raises the possibility that there are other Novell suitors out there, so this acquisition and asset sale may not even take place. If they do, we will find out which of Novell's patents are affected and be able to see what impact, if any, they might have on Linux and free software.
Taken at face value, Attachmate's statements about its plans seem to pose no threat to our communities or to the many members who are employed by Novell. CPTN, on the other hand, may be a potent threat if the patents are used offensively against Linux and free software. While it always makes sense to be prepared for the worst, one can always hope that this particular transaction (or set of transactions) will be fairly neutral. With luck, it may actually increase the income and profits for SUSE and lead to more investment in free software. We will just have to wait and see.
Security
The MeeGo security framework
Several kernel security solutions that haven't been used very widely—at least visibly—are making an appearance in the Mobile Simplified Security Framework (MSSF). Elena Reshetova and Casey Schaufler of Nokia presented MSSF at the recently held MeeGo conference, and it will be the basis of the MeeGo security architecture. While MSSF is targeted at MeeGo, it is not necessarily specific to that platform and, since it is an open project that has expanded from its smartphone roots, it could be adopted by other platforms.
Reshetova opened the talk with a description of the components of MSSF, starting with the chipset security, which provides secure cryptographic and key management services that can be used by the higher levels. Integrity protection will ensure that the system software, applications, and data, are protected against both on-line and off-line attacks. The access control layer will limit the resources that applications can access at runtime, while the privacy protection layer protects both data integrity and confidentiality for applications.
MSSF relies on a secure software distribution model, where packages can be authenticated before being installed. A smartphone does not have an administrator, she said, so MSSF relies on secure software distribution for managing the security policy remotely. The security policies from the device maker and user are what determine which parts of MSSF are used and how they are utilized.
Version 1 of MSSF was presented at this year's Ottawa Linux Symposium, but it was designed for Maemo—one of the two precursors to MeeGo, Moblin being the other—and smartphones. It was designed to use Debian packages, but MeeGo is RPM-based. In addition, MSSF v1 needed more features to support the various MeeGo targets (netbook, connected TV, tablet, etc.), which led to MSSF v2 that will be delivered in MeeGo 1.2.
The chipset security layer "abstracts away the hardware" and provides a "trusted execution environment", Reshetova said. It provides services to the other layers in addition to verifying the integrity of the bootloader and kernel image prior to boot. There are two main keys used: a symmetric root device specific key that is used for local cryptographic operations and a root public key that is used to verify the software chain.
For access control, MSSF and applications define "protected resources", which are things like cellular functionality, location information, or calendar data that require access limitations. A new credential type has been created for MSSF called a "resource token", which is a "string that names the protected resource". There are both global tokens for system-provided resources (UserData, Cellular, Location) and package specific tokens (e.g. calendar data). Applications must declare which resource tokens they need or provide in the package manifest.
Smack for access control
Access control will be enforced using Smack (Simplified Mandatory Access Control Kernel), which is a mainline Linux security module (LSM) that was developed by Schaufler. Smack was added to Linux in April 2008, but hasn't been seen much outside of presentations and some secret embedded projects, so MeeGo will be the first highly visible user of this technology.
Schaufler described his reasons for developing Smack as a "reaction to SELinux", because of the complexity of that solution. Like SELinux, Smack provides mandatory access control (MAC), but does it in a much simpler way. "Mostly what it [Smack] does is to stay out of the way", he said. Smack is a complete MAC model, Schaufler said, and there is an implementation of resource tokens being added for MSSF v2.
Smack is based on the idea of labels on files and processes; file labels are stored in the file's extended attributes (xattrs). In order for a program to access a file, those labels must match; if they are different then access is denied. That "doesn't work well for things like the root of the filesystem", Schaufler said, so there are two special Smack labels, "_" (floor) and "*" (star), that allow for wider access.
In addition, simple rules can be added to allow specific kinds of access between labels that don't match. Rules are specified with subject (access requester) and object labels along with the access allowed (based on the traditional read, write, execute permission bits). In order to write to an object, a subject must have both read and write permission; read permission is required to read the inode. Many operations require directory access (governed by the execute bit) as well.
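As a purely illustrative example (the labels here are invented, not part of any actual MeeGo policy), such a rule is a single line naming the subject label, the object label, and the access modes granted:

    Browser UserMusic r

With a rule like that loaded, a process running with the "Browser" label could read, but not modify, files labeled "UserMusic".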
Networking is handled differently from file access, he said. Sockets are not elements in the security model, and it is the labels on the two communicating tasks that govern access. A sender must have write access to the receiver's label in order to send it a packet. Writing without being able to read a response is "very useful in an environment where applications don't want to trust each other", he said. Packets get labeled by Smack, and processes can query the packet label to do different things based on that label.
The MeeGo package manager will be responsible for setting up the Smack configuration based on the package manifest. It will attach the Smack labels to the files, modify the stored Smack rules, and update those rules in the kernel. There is a "whole lot of information" that will need to go into the manifest, so it is something to be aware of when creating MeeGo packages, but it won't be too difficult to do, he said.
IMA and EVM
Two other lightly used kernel facilities are also part of MSSF: the integrity measurement architecture (IMA) and the extended verification module (EVM). Reshetova came back to the microphone to explain how those two facilities will be used.
IMA is being used, rather than the "validator" that was part of MSSF v1, because it is in the mainline and uses xattrs to store its reference hashes. Essentially, IMA keeps track of the contents of files by storing a hash value in their security.ima xattr. When files are opened, the integrity of their contents is verified, and IMA ensures that when the contents have changed, the reference hash is updated. In order to do that update, the access control framework (i.e. Smack) must allow that operation.
Binary program files are the main target for IMA checking, though libraries and data files can be protected as well. The MeeGo package manager will be responsible for setting the initial reference hash value based on information in the package. IMA only guards against attacks that change those files while the system is running, as off-line attackers can change both the file contents and hash value.
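Because the reference value is just an extended attribute, it can be examined from user space with the ordinary xattr interfaces. Here is a small, hedged sketch (reading security.* attributes typically requires privilege, and the attribute will simply be absent on files IMA is not protecting):

    #include <stdio.h>
    #include <sys/xattr.h>

    int main(int argc, char **argv)
    {
        unsigned char value[256];
        ssize_t len, i;

        if (argc != 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 1;
        }

        /* Fetch the IMA reference hash that the kernel stored for this file. */
        len = getxattr(argv[1], "security.ima", value, sizeof(value));
        if (len < 0) {
            perror("getxattr security.ima");
            return 1;
        }

        printf("security.ima (%zd bytes):", len);
        for (i = 0; i < len; i++)
            printf(" %02x", value[i]);
        printf("\n");
        return 0;
    }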
EVM is what provides protection against these off-line attacks. It stores a "keyed hash across the security attributes" of files. Those attributes include the IMA and Smack xattrs, as well as the owner, group, and permissions of the file. The key that is used comes from the chipset security layer, so that an off-line attacker cannot change any of those attributes (which effectively includes the file contents because the IMA xattr is included) without it being detected by EVM.
Reshetova also described the cryptographic services being offered to applications by MSSF. Applications that wish to protect the integrity or confidentiality of user data can call into libaegis-crypto (Aegis is a now-deprecated name for MSSF) to request certain security services like encryption or hashing. The library handles the interaction with the chipset security layer to create application-specific or shared keys that are never exported to user space. Applications can then request various operations by using a key identifier, rather than the key itself. By using EVM, user data can be protected even from off-line attacks.
Schaufler and Reshetova then fielded a few questions, the first of which was about the effect of IMA/EVM on boot speed. Schaufler said that they don't, yet, have measurements they are comfortable with, and Reshetova said that their earlier work with the validator found that there weren't major effects from integrity checking.
Another attendee asked who else was using Smack, and Schaufler was a bit cagey in his response, noting that there is one TV that you can buy which is using Smack, but that there are "some people that don't like to share" that kind of information. He also noted that "MeeGo adopting Smack is going to take it someplace special", but the adoption makes sense because Smack is "considerably more lightweight and easier to deal with than other leading brands".
Not specifically addressed in the presentation—though Ryan Ware touched on it briefly during his presentation immediately prior—is the complete system lockdown that MSSF could enable. Device makers can use the facilities to require that only their signed kernels run on the devices, and that only approved applications get installed. It remains to be seen how many MeeGo integrators make that choice. Device makers may well have chosen to do that regardless of whether MSSF supported it, but its presence there certainly makes it easier to create freedom-hostile devices. One hopes that at least some device makers choose otherwise.
Brief items
Security quotes of the week
EFF Tool Offers New Protection Against 'Firesheep'
The Electronic Frontier Foundation (EFF) has launched a new version of HTTPS Everywhere. "This new version of HTTPS Everywhere responds to growing concerns about website vulnerability in the wake of Firesheep, an attack tool that could enable an eavesdropper on a network to take over another user's web accounts -- on social networking sites or webmail systems, for example -- if the browser's connection to the web application either does not use cryptography or does not use it thoroughly enough."
Freedesktop.org shenanigans
Subscribers to the xorg-devel list will have seen Luc Verhaegen's November 23 note complaining about a prank commit added to the (moribund) radeonhd tree. As he rightly noted, this kind of trick (which required root access to carry out) can only serve to compromise the community's trust in the X.org project's repositories as a whole. After some hours, the perpetrator came forward; it was, as expected, an X.org developer. So there is no remaining concern that X.org's systems may have been compromised, but we may see a new discussion on how the organization's systems are managed in the future.
New vulnerabilities
php: double free flaw
Package(s): php
CVE #(s): CVE-2010-4150
Created: November 19, 2010
Updated: April 5, 2011
Description: From the Mandriva advisory: "A possible double free flaw was found in the imap extension for php."
suricata: TCP evasions
Package(s): suricata
CVE #(s): (none)
Created: November 22, 2010
Updated: November 28, 2010
Description: From the Red Hat bugzilla: "It was reported that a number of TCP evasions existed in versions of Suricata prior to 1.0.2. Upstream has released version 1.0.2 to address these flaws."
systemtap: denial of service
Package(s): systemtap
CVE #(s): CVE-2010-4171
Created: November 19, 2010
Updated: November 23, 2010
Description: From the Red Hat bugzilla: "A security flaw was found in the way systemtap runtime tool (staprun) removed unused modules. A local attacker could use this flaw to conduct various denial of service attacks."
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel is 2.6.37-rc3, released on November 21. Linus said:
One notable change is that the attempt to make /proc/kallsyms unreadable by default has been reverted because it broke an older distribution (Ubuntu Jaunty).
The short-form changelog is in the announcement; see the full changelog for all the details.
Stable updates: the 2.6.27.56, 2.6.32.26, 2.6.35.9, and 2.6.36.1 updates were released on November 22; each contains a long list of important fixes. Note that 2.6.35.9 is the last update for the 2.6.35 series.
Quotes of the week
So yes, we're the Tom Jones of the engineering world.
So I can see how architecture designers could get some complexes. I understand. But even if you're a total failure in life, and you got your degree in EE rather than CompSci, stand up for yourself, man!
Repeat after me: "Yes, I too can make a difference! I'm not just a useless lump of meat! I can design hardware that is wondrous and that I don't need to be ashamed of! I can help those beautiful software people run their code better! My life has meaning!"
Doesn't that feel good? Now, look down at your keyboard, and look back at me. Look down. Look back. You may never be as beautiful and smart as a software engineer, but with Old Spice, you can at least smell like one.
Hardware and software should work together. And that does not mean that hardware should just lay there like a dead fish, while software does all the work. It should be actively participating in the action, getting all excited about its own body and about its own capabilities.
The big chunk memory allocator
Device drivers - especially those dealing with low-end hardware - sometimes need to allocate large, physically-contiguous memory buffers. As the system runs and memory fragments, those allocations are increasingly likely to fail. That has led to a lot of schemes based around techniques like setting aside memory at boot time; the contiguous memory allocator (CMA) patch set covered here in July is one example. There is an alternative approach out there, though, in the form of Hiroyuki Kamezawa's big chunk memory allocator. The big chunk allocator provides a new allocation function for large contiguous chunks:
    struct page *alloc_contig_pages(unsigned long base, unsigned long end,
                                    unsigned long nr_pages, int align_order);
Unlike CMA, the big chunk allocator does not rely on setting aside memory at boot time. Instead, it will attempt to organize a suitable chunk of memory at allocation time by moving other pages around. Over time, the memory compaction and page migration mechanisms in the kernel have gotten better and memory sizes have grown, so this kind of large allocation is more feasible than it once was.
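As a hedged sketch of how a driver might use the proposed call (the interpretation of a zero base and end as "anywhere in memory", and the free_contig_pages() counterpart, are assumptions made for illustration rather than details taken from the posted patch):

    #include <linux/errno.h>
    #include <linux/gfp.h>
    #include <linux/mm.h>

    #define CAPTURE_PAGES 256    /* a 1MB buffer with 4KB pages */

    static struct page *capture_pages;

    static int capture_alloc_buffer(void)
    {
        /* base = 0, end = 0: assumed here to mean "search all of memory";
         * align_order = 0: no alignment needed beyond the page size. */
        capture_pages = alloc_contig_pages(0, 0, CAPTURE_PAGES, 0);
        if (!capture_pages)
            return -ENOMEM;      /* fragmentation or memory pressure won */
        return 0;
    }

    static void capture_free_buffer(void)
    {
        if (capture_pages)
            free_contig_pages(capture_pages, CAPTURE_PAGES);  /* assumed API */
    }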
There are some advantages to the big chunk approach. Since it does not require that memory be set aside, there is no impact on the system when there is no need for large buffers. There is also more runtime flexibility and no need for the system administrator to figure out how much memory to reserve at boot time. The down sides will be that memory allocation becomes more expensive and the chances of failure will be higher. Which system will work better in practice is entirely unknown; answering that question will require some significant testing by the people who need the large allocation capability.
Kernel development news
A collection of tracing topics
For a long time, tracing was seen as one of the weaker points of the Linux system. Things have changed dramatically over the last few years, to the point that Linux has a number of interesting tracing interfaces. The job is far from done, though, and there is not always agreement on how this work should proceed. There have been a number of conversations related to tracing recently; this article will survey some in an attempt to highlight where the remaining challenges are.
The tracing ABI
Once upon a time, Linux had no tracing-oriented interfaces at all. Now, instead, we have two: ftrace and perf events. Some types of information are only available via the ftrace interface, others are only available from perf, and some sorts of events can be obtained in either way. From the discussions that have been happening for some time it's clear that neither interface satisfies everybody's needs. In addition, there are other subsystems waiting in the wings - LTTng and a recently proposed system health subsystem, for example - which bring requirements of their own. The last thing that the system needs is an even wider variety of tracing interfaces; it would be nice, instead, to pull everything together into a single, unified interface.
Almost everybody involved agrees on that point, but that is about where the agreement stops. Your editor, unfortunately, missed the tempestuous session at the Linux Plumbers Conference where a number of tracing developers came to an agreement of sorts: a new ABI would be developed with the explicit goal of being a unified tracing and event interface for the system as a whole. This ABI would be kept out of the mainline until a number of tools had been written to use it; only when it became clear that everybody's needs are met would it be merged. Your editor talked to a number of the people involved in that discussion; all seemed pleased with the outcome.
Ftrace developer Steven Rostedt interpreted the discussion as a mandate to develop an entirely new ABI for tracing purposes.
LTTng developer Mathieu Desnoyers took things even further, posting a "tracing ABI work plan" for discussion. That posting was poorly received, being seen as a document better suited to managerial conference rooms - a perception which was not helped by Mathieu's subsequent posting of a massive common trace format document which would make a standards committee proud. Kernel developers, as always, would rather see code than extensive design documents.
When the code comes, though, it seems that there will be resistance to the idea of creating an entirely new tracing ABI. Thomas Gleixner has expressed his dislike for the current state of affairs and attempts to create complex replacements; he is calling for a gradual move toward a better interface. Ingo Molnar has said similar things:
We'll need to embark on this incremental path instead of a rewrite-the-world thing. As a maintainer my task is to say 'no' to rewrite-the-world approaches - and we can and will do better here.
The existing ABI that Ingo likes, of course, is the perf interface. He would clearly like to see all tracing and event reporting move to the perf side of the house. The perf ABI, he says, is sufficiently extendable to accommodate everybody's needs; there does not seem to be a lot of room for negotiation on this point.
Stable tracepoints
One of the conclusions reached at the 2010 Kernel Summit was that a small set of system tracepoints would be designated "stable" and moved to a separate location in the filesystem hierarchy. Tools using these tracepoints would have a high level of assurance that things would not change in future kernel releases; meanwhile, kernel developers could feel free to add and use tracepoints elsewhere without worrying that they could end up maintaining them forever. It seemed like an outcome that everybody could live with.
Steven recently posted an implementation of stable tracepoints to implement that decision. His patch adds another tricky macro (STABLE_EVENT()) which creates a stable tracepoint; all such tracepoints are essentially a second, restricted view of an existing "raw" tracepoint. That allows development-oriented tracepoints to provide more information than is deemed suitable for a stable interface and does not require cluttering the code with multiple tracepoint invocations. There is also a new "eventfs" filesystem to host stable tracepoints which is expected to be mounted on /sys/kernel/events. A small number of core tracepoints have been marked as stable - just enough to show how it's done.
There were a number of complaints about eventfs, not the least of which being Greg Kroah-Hartman's gripe that he had already written tracefs for just this purpose. Ingo had a different complaint, though: he is pushing an effort to distribute tracepoints throughout the sysfs hierarchy. The current /sys/kernel/debug/tracing/events directory would not go away (there are tools which depend on it), but future users of, say, ext4-related tracepoints would be expected to look for them in /sys/fs/ext4. It is an interesting idea which possibly makes good sense, but it is somewhat orthogonal to Steven's stable tracepoint posting; it doesn't address the stable/development distinction at all.
It eventually became clear that Ingo is opposed to the concept of marking some tracepoints as stable. He is, instead, taking the position that anything which is used by tools becomes part of the ABI, and that an excess of tools using too many tracepoints is a problem we wish we had. This opposition, needless to say, could make it hard to get the stable tracepoint concept into the kernel.
Here we see one of the hazards of skipping important developer meetings. The stable tracepoint discussion was expected to be one of the more contentious sessions at the kernel summit; in the end, though, everybody present seemed happy with the conclusion that was reached. But Ingo was not present. His point of view was not heard there, and the community believes it has reached consensus on something he apparently disagrees with. If Ingo succeeds in overriding that consensus, then Steven might not be the only one left frustrated.
That conversation has quieted for now, but it will almost certainly
return. If nothing else, some developers are determined to change tracepoints when the need
arises, so this issue can be expected to come up again at some point.
One possible source of conflict is the recently-announced trace utility
which, according to Ingo, has "no conceptual restrictions" and will use
tracepoints without regard for any sort of "stable" designation.
trace_printk()
One useful, but little-used, tracing-related tool is trace_printk(). It can be called like printk() (though without a logging level), but its output does not go to the system log; instead, everything printed via this path goes into the tracing stream as seen by ftrace. When tracing is off, trace_printk() calls have no effect. When tracing is enabled, though, trace_printk() data can be made available to a developer with far less overhead than normal printk() output. That overhead can matter - the slowdown caused by printk() calls is often enough to change timing-related behavior, leading to "heisenbugs" which are difficult to track down.
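As a rough illustration (the function and message here are invented for the example, not taken from any actual driver), a developer chasing a timing-sensitive bug might instrument a hot path like this:

    #include <linux/kernel.h>   /* trace_printk() */
    #include <linux/skbuff.h>

    /* Hypothetical hot-path function instrumented for ftrace.  Unlike a
     * printk() call, this writes to the tracing ring buffer and costs
     * almost nothing while tracing is disabled. */
    static void process_packet(struct sk_buff *skb)
    {
        trace_printk("processing skb %p, len %u\n", skb, skb->len);
        /* ... actual packet processing ... */
    }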
Output from trace_printk() does not look like a normal kernel event, though, so it is not available to the perf interface. Steven has posted a patch to rectify that, at the cost of potentially creating large numbers of new trace events. With this patch, every trace_printk() call will create a new event under ...events/printk/ based on the file name. So, to use Steven's example, a trace_printk() on line 2180 in kernel/sched.c would show up in the events hierarchy as .../events/printk/kernel/sched.c/2180. Each call could then be enabled and disabled independently, just like ordinary tracepoints. It's a convenient and understandable interface, but, if use of trace_printk() ever takes off, it could lead to the creation of large numbers of events.
That idea drew a grumble from Peter Zijlstra, who said that it would be painful to use in perf. One of the reasons for that has to do with how the perf API works: every event must be opened separately with a perf_event_open() call and managed as a separate file descriptor. If the number of events gets large, so does the number of open files which must be juggled.
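To see why, consider how a tool attaches to a single tracepoint today. The sketch below is illustrative rather than taken from any particular tool, but the syscall and attribute fields are the real perf interface; one such call, and one file descriptor, is needed for every event of interest:

    #include <linux/perf_event.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Open one tracepoint for one process; the tracepoint id comes from
     * .../events/<subsys>/<event>/id in the tracing directory.  There is
     * no glibc wrapper for perf_event_open(), hence the raw syscall. */
    static int open_tracepoint(int tracepoint_id, pid_t pid)
    {
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_TRACEPOINT;
        attr.size = sizeof(attr);
        attr.config = tracepoint_id;
        attr.sample_type = PERF_SAMPLE_RAW | PERF_SAMPLE_TIME;
        attr.sample_period = 1;

        return syscall(__NR_perf_event_open, &attr, pid,
                       -1 /* any CPU */, -1 /* no group */, 0);
    }

A tool watching hundreds of trace_printk()-generated events would need hundreds of these descriptors.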
A potential solution also came from Peter, in the form of a new
"tracepoint collection" event for perf. This special event will, when
opened, collect no data at all, but it supports an ioctl() call
allowing tracepoints to be added to it. All tracepoints associated with the
collection event will report through the same file descriptor, allowing
tools to deal with multiple tracepoints in a single stream of data. Peter
says that the patch "is lightly tested and wants some serious
testing/review before merging", but we may see this ABI addition become
ready in time for 2.6.38.
Unprivileged tracepoints
Finally: access to tracepoints is currently limited to privileged users.
Tracepoints provide a great deal of information about what is going on
inside the kernel, so allowing anybody to watch them does not seem secure.
There is a desire, though, to make some tracepoints generally available so
that tools like trace can work in a non-privileged mode. Frederic
Weisbecker has posted a patch which makes
that possible.
Frederic's patch adds an optional TRACE_EVENT_FLAGS() declaration
for tracepoints; currently, the only defined flag is
TRACE_EVENT_FL_CAP_ANY, which grants access to unprivileged
users. This flag has been applied to the system call tracepoints, allowing
anybody to trace system calls - at least, when tracing is focused on a
process they own.
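The declaration involved is compact; opening a tracepoint up to unprivileged users looks roughly like the following (the event name here is illustrative, and the exact invocation in Frederic's patch may differ):

    /* Mark the syscall-entry tracepoint as accessible to any user; the
     * usual perf permission checks (ownership of the traced task) still
     * apply. */
    TRACE_EVENT_FLAGS(sys_enter, TRACE_EVENT_FL_CAP_ANY);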
In conclusion...
An obvious conclusion from all of the above is that there are still a lot
of problems to be solved in the tracing area. The nature of the task is
shifting, though. We now have significant tracing capabilities in place,
and the developers involved have learned a lot about how the problem should
(and should not) be solved. So we're no longer in the position of
wondering how tracing can be done at all, and there no longer seems to be
any trouble selling the concept of kernel visibility to developers. What
needs to be done now is to develop the existing capability into something
which is truly useful for the development community and beyond; that looks
like a task which will keep developers busy for some time.
An alternative to suspend blockers
If you have been following Linux kernel development over the past few
months, it has been hard to overlook the massive thread on the Linux Kernel
Mailing List (LKML) resulting from an attempt to merge Google's Android
suspend blockers framework into the main kernel tree. Arguably, the
presentation of the patches might have been better and the explanation of
the problems they addressed might have been more straightforward [PDF], but
in the end it appears that merging them wouldn't be the smartest thing to
do from a technical point of view. Unfortunately, though, it is difficult
to explain that without diving into the technical issues behind the suspend
blockers patchset, so I wrote a paper, Technical Background of the Android
Suspend Blockers Controversy [PDF], discussing them in a detailed way; it
is summarized in this article.
Wakelocks
Suspend blockers, or wakelocks in the original Android terminology, are
part of a specific approach to power management, which
is based on aggressive utilization of full system suspend to save as much
energy as reasonably possible. In this approach the natural state of the
system is a sleep
state [PDF], in which energy is only used for refreshing memory and providing
power to a few devices that can generate wakeup signals. The working
state, in which the CPUs are executing instructions and the system is
generally doing some useful work, is only entered in response to a wakeup
signal from one of the selected devices. The system stays in that state
only as long as necessary to do certain work requested by the user. When
the work has been completed, the system automatically goes back to the
sleep state. This approach can be referred to as opportunistic suspend to
emphasize the fact that it causes the system to suspend every time there is
an opportunity to do so. To implement it effectively one has to address a
number of issues, including possible race conditions between system suspend
and wakeup events (i.e. events that cause the system to wake up from sleep
states). Namely, one of the first things done during system suspend is to
freeze user space processes (except for the suspend process itself) and
after that's been completed user space cannot react to any events signaled
by the kernel. In consequence, if a wakeup event occurs exactly at the
time the suspend process is started, user space may be frozen before it has
a chance to consume the event, which will then only be delivered after the
system is woken up from the sleep state by another wakeup event.
Unfortunately, on a cell phone the "deferred" wakeup event may be a very
important incoming call, so the above scenario is hardly acceptable for
this type of device.
On Android this issue has been addressed with the help of wakelocks.
Essentially, a wakelock is an object that can be in one of two states,
active or inactive, and the system cannot be suspended if at least one
wakelock is active. Thus, if the kernel subsystem handling a wakeup
event activates a wakelock right after the event has been signaled and
deactivates it after the event has been passed to user space, the race
condition described in the previous paragraph can be avoided. Moreover, on
Android, the suspend process is started from kernel space whenever there are
no active wakelocks, which addresses the problem of deciding when to
suspend, and user space is allowed to manipulate wakelocks. Unfortunately, that requires every
user space process doing important work to use wakelocks, which creates
unusual and cumbersome issues for application developers to
deal with. Of course, processes using wakelocks can impact the system's battery
life quite significantly, so the ability to use them has to be regarded as
a privilege that should not be given unwittingly to all applications.
Unfortunately, however, there is no general principle the system designer
can rely on to figure out what applications will be important enough to the
system user to allow them to use wakelocks by default. Therefore, the
decision is ultimately left to the user, which, naturally, only works if
the user is qualified to make it. Moreover, a user expected to make such a
decision should be told exactly what its consequences may be, and should be
able to revoke an application's right to use wakelocks at any time. On
Android, though, at least up to and including version 2.2, that simply
doesn't happen.
Apart from this, some advertised features of applications don't really
work on Android because of its use of opportunistic suspend. Namely, some
applications are supposed to periodically check things on remote Internet
servers. For this purpose they need to run when it is time to make those
checks, but they obviously aren't running while the system is in a sleep
state, so the periodic checks simply aren't made then. In fact, they are
only made when the system happens to be in the working state for some other
reason at the right time. This is most likely not what the users of the
affected applications would have expected.
Timekeeping issues
There is one more problem with full system suspend that is related to
time measurements, although it is not limited to the opportunistic suspend
initiated from kernel space. Namely, every suspend-resume cycle,
regardless of the way it is initiated, introduces inaccuracies into the
kernel's timekeeping subsystem. Usually, when the system goes into a sleep
state, the hardware that the kernel's timekeeping subsystem relies on is powered
off, so it has to be reinitialized during a subsequent system resume. Then,
among other things, the global kernel variables representing the current
time need to be readjusted to keep track of the time spent in the sleep
state. This involves reading the current time value from a persistent
clock which typically is much less accurate than the clock sources used by
the kernel in the system's working state. So that introduces a random shift
of the kernel's representation of current time, depending on the resolution
of the persistent clock, during every suspend-resume cycle. Moreover,
kernel timers used for scheduling the future execution of work inside the
kernel are affected by this issue in a similar way. In consequence, the
timing of some events in a system that suspends and resumes differs from
what it would be without the suspend-resume cycle.
If system suspend is initiated by user space, the kernel may assume that
user space is ready for it and is somehow prepared to cope with the
consequences. For example, user space may use settimeofday() to
set the system clock from an NTP server right after the subsequent resume.
On the other hand, if system suspend is started by the kernel in an
opportunistic fashion, user space doesn't really have a chance to do
anything like that.
For this reason, one may think that it's better not to suspend the
system at all and use the cpuidle framework for the entire system
power management. This approach appears to allow some systems to be put
into a low-power state
resembling a sleep state. However, it may not guarantee that the
system will be put into that state sufficiently often because of
applications using busy loops to excess and kernel timers. PM quality of
service (QoS) requests [PDF] may also prevent cpuidle from using
deep low-power states of the CPUs. Moreover, while only a few selected
devices are enabled to signal wakeup during system suspend, the runtime
power management routines that may be used by cpuidle for
suspending I/O devices tend to enable all of them to signal wakeup. Thus
the system wakes up from low-power states entered as a result of
cpuidle transitions more often than from "real" sleep states, so
its ability to save energy is limited. This basically means that
cpuidle-based system power management may not be sufficient to
save as much energy as opportunistic suspend on the same system.
Even if opportunistic suspend is not going to be used on a
given system, it generally makes sense to suspend the system sometimes, for
example when its user knows in advance that it will not need to be in the
working state in the near future. However, the problem of possible races
between the suspend process and wakeup events, addressed on Android with
the help of the wakelocks framework, affects all forms of system suspend,
not only the opportunistic one. Thus this problem should be addressed in
general, and it is not really convenient to simply use Android's wakelocks
for this purpose, because that would require all of user space to be
modified to use them. While that may be good for Android, whose user space
already is designed this way at least to some extent, it wouldn't be very
practical for other Linux-based systems, whose user space is not aware of
the wakelocks interface.
The alternative implementation
This observation led to the kernel patch that introduced the wakeup events
framework, which was shipped in the 2.6.36 kernel. This patch introduced a
running counter of signaled wakeup events,
event_count, and a counter of wakeup events whose data is being
processed by the kernel at the moment, events_in_progress. Two
interfaces have been added to allow kernel subsystems to modify these
counters in a consistent way. pm_stay_awake() is meant to keep
the system from suspending, while pm_wakeup_event() ensures that
the system stays awake during the processing of a wakeup event.
In order to do that, pm_stay_awake()
increments events_in_progress and the complementary function
pm_relax() decrements it and increments event_count at
the same time. pm_wakeup_event() increments
events_in_progress and sets up a timer to decrement it and
increment event_count in the future.
The current value of event_count can be read from the new
sysfs file /sys/power/wakeup_count. In turn, writing to
it causes the current value of event_count to be stored in the
auxiliary variable saved_count, so that it can be compared with
event_count in the future. However, the write operation will only
succeed if the written number is already equal to event_count. If
that happens, another auxiliary variable events_check_enabled is
set, which tells the PM core to check whether event_count has
changed or events_in_progress is different from zero while
suspending the system.
This relatively simple mechanism allows the PM core to react to wakeup
events signaled during system suspend if it is asked to do so by user space
and if the kernel subsystems detecting wakeup events use either
pm_stay_awake() or pm_wakeup_event(). Still, its support
for collecting per-device statistics on wakeup events is not comparable to
that provided by the wakelocks framework. Moreover, it assumes that wakeup
events will always be associated with devices, or at least with entities
represented by device objects, which need not be the case in all
situations.
The need to address these shortcomings led to a kernel patch introducing wakeup
source objects and adding some flexibility to the existing
framework.
Most importantly, the new patch introduces objects of type struct
wakeup_source to represent entities that can generate wakeup events.
Those objects are created automatically for devices enabled to signal
wakeup and are used internally by pm_wakeup_event(),
pm_stay_awake(), and pm_relax(). Although the
highest-level interfaces are still designed to report wakeup events
relative to devices, which is particularly convenient to device drivers and
subsystems that generally deal with device objects, the new framework makes
it possible to use wakeup source objects directly.
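As a sketch of how a driver might use the device-relative interface (the handler below is invented for illustration, and the exact prototypes have varied between kernel versions), an interrupt handler for a wakeup-capable device could report its events like this:

    #include <linux/interrupt.h>
    #include <linux/pm_wakeup.h>

    /* Hypothetical interrupt handler for a wakeup-capable device.  The
     * pm_wakeup_event() call tells the PM core that a wakeup event is
     * being processed, so a suspend attempted within the next 500 ms will
     * be aborted, giving user space time to consume the event. */
    static irqreturn_t example_wakeup_irq(int irq, void *data)
    {
        struct device *dev = data;

        pm_wakeup_event(dev, 500);
        /* ... queue the event for delivery to user space ... */
        return IRQ_HANDLED;
    }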
A "standalone" wakeup source object is created by
wakeup_source_create() and added to the kernel's list of wakeup
sources by wakeup_source_add(). Afterward one can use three new
interfaces, __pm_wakeup_event(), __pm_stay_awake() and
__pm_relax(), to manipulate it and, when it is not necessary any
more, it may be removed from the global list of wakeup sources by calling
wakeup_source_remove(). It can then be deleted with the help of
wakeup_source_destroy(). Thus reported wakeup events need not be
associated with device objects any more. Also, at the kernel level, wakeup
source objects may be used to replace Android's wakelocks on a one-for-one
basis because the above interfaces are completely analogous to the ones
introduced by the wakelocks framework.
The infrastructure described above ought to make it easier to port
device drivers from Android to the mainline kernel. It hasn't been
designed with opportunistic suspend in mind, but in theory it may be used
for implementing a very similar power management technique. Namely, in
principle, all wakelocks in the Android kernel can be replaced with wakeup
source objects. Then, if the /sys/power/wakeup_count interface is
used correctly, the resulting kernel will be able to abort suspend in
progress in reaction to wakeup events in the same circumstances in which
the original Android kernel would do that. Yet, user space cannot access
wakeup source objects, so the part of the wakelocks framework allowing user
space to manipulate them has to be replaced with a different mechanism
implemented entirely in user space, involving a power manager process and a
suitable IPC interface for the processes that would otherwise use wakelocks
on Android.
The IPC interface in question may be implemented using three components,
a shared memory location containing a counter variable, referred to as the
"suspend counter" in what follows, a mutex, and a condition variable
associated with that mutex. Then, a process wanting to prevent the system
from suspending will acquire the mutex, increment the suspend counter, and
release the mutex. In turn, a process wanting to permit the system to
suspend will acquire the mutex and decrement the suspend counter. If the
suspend counter happens to be equal to zero at that point, the processes
waiting on the condition variable will be unblocked. The mutex will be
released afterward.
With the above IPC interface in place, the power manager process can run a
simple loop: wait until the suspend counter drops to zero, read
/sys/power/wakeup_count, write the value just read back to it and, if the
write succeeds, initiate system suspend. If the write fails, a wakeup
event has occurred in the meantime, so the loop simply starts over.
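To make that loop concrete, here is a minimal user-space sketch of such a power manager. The shared_pm structure, its initialization with process-shared pthread attributes, and all of the names are purely illustrative; only /sys/power/wakeup_count and /sys/power/state are real kernel interfaces.

    #include <fcntl.h>
    #include <pthread.h>
    #include <unistd.h>

    struct shared_pm {              /* lives in shared memory */
        pthread_mutex_t lock;       /* PTHREAD_PROCESS_SHARED */
        pthread_cond_t  idle;       /* signaled when counter reaches zero */
        unsigned int    counter;    /* the "suspend counter" */
    };

    /* Called by processes that want to block or re-allow suspend;
     * calls are assumed to be balanced. */
    void block_suspend(struct shared_pm *pm)
    {
        pthread_mutex_lock(&pm->lock);
        pm->counter++;
        pthread_mutex_unlock(&pm->lock);
    }

    void allow_suspend(struct shared_pm *pm)
    {
        pthread_mutex_lock(&pm->lock);
        if (--pm->counter == 0)
            pthread_cond_broadcast(&pm->idle);
        pthread_mutex_unlock(&pm->lock);
    }

    /* Read /sys/power/wakeup_count, write the value back and, if that
     * succeeds, suspend by writing to /sys/power/state.  A failed write
     * means wakeup events are pending and the caller should retry. */
    static int try_to_suspend(void)
    {
        char buf[32];
        ssize_t n;
        int fd, ret = -1;

        fd = open("/sys/power/wakeup_count", O_RDWR);
        if (fd < 0)
            return -1;
        n = read(fd, buf, sizeof(buf));
        if (n > 0 && write(fd, buf, n) >= 0)
            ret = 0;
        close(fd);
        if (ret)
            return ret;         /* wakeup events pending: do not suspend */

        fd = open("/sys/power/state", O_WRONLY);
        if (fd < 0)
            return -1;
        ret = (write(fd, "mem", 3) == 3) ? 0 : -1;
        close(fd);
        return ret;             /* returns after resume */
    }

    void power_manager_loop(struct shared_pm *pm)
    {
        for (;;) {
            pthread_mutex_lock(&pm->lock);
            while (pm->counter > 0)
                pthread_cond_wait(&pm->idle, &pm->lock);
            /* Hold the mutex across the attempt so user space cannot raise
             * the counter between the check and the suspend; kernel-side
             * races are handled by wakeup_count. */
            try_to_suspend();
            pthread_mutex_unlock(&pm->lock);
        }
    }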
Of course, this design will cause the system to be suspended very
aggressively. Although it is not entirely equivalent to Android's
opportunistic suspend, it appears to be close enough to yield the same
level of energy savings. However, it also suffers from a number of the
problems affecting Android's approach. Some of them may be addressed
by adding complexity to the power manager and the IPC interface between it
and the processes permitted to block and unblock suspend, but the others
are not really avoidable. Thus it may be better to use system suspend less
aggressively, but in combination with some other techniques described
above.
Overall, while the idea of suspending the system extremely aggressively may be controversial, it doesn't seem reasonable to dismiss automatic system suspend entirely as a valid power management measure. Many other operating systems do it and achieve good battery life [PDF] with its help. There don't seem to be any valid reasons why Linux-based systems shouldn't do the same, especially if they are battery-powered. As far as desktop and similar (e.g. laptop or netbook) systems are concerned, it makes sense to configure them to suspend automatically in specific situations, so long as system suspend is known to work reliably on the given hardware configuration. The new interfaces and ideas presented above may be used to this end.
Ghosts of Unix past, part 4: High-maintenance designs
The bible portrays the road to destruction as wide, while the road to life is narrow and hard to find. This illustration has many applications in the more temporal sphere in which we make many of our decisions. It is often the case that there are many ways to approach a problem that are unproductive and comparatively few which lead to success. So it should be no surprise that, as we have been looking for patterns in the design of Unix and their development in both Unix and Linux, we find fewer patterns of success than we do of failure.
Our final pattern in this series continues the theme of different ways to go wrong, and turns out to have a lot in common with the previous pattern of trying to "fix the unfixable". However it has a crucial difference which very much changes the way the pattern might be recognized and, so, the ways we must be on the look-out for it. This pattern we will refer to as a "high maintenance" design. Alternatively: "It seemed like a good idea at the time, but was it worth the cost?".
While "unfixable" designs were soon discovered to be insufficient and attempts were made (arguably wrongly) to fix them, "high maintenance" designs work perfectly well and do exactly what is required. However they do not fit seamlessly into their surroundings and, while they may not actually leave disaster in their wake, they do impose a high cost on other parts of the system as a whole. The effort of fixing things is expended not on the center-piece of the problem, but on all that surrounds it.
Setuid
The first of two examples we will use to illuminate this pattern is the "setuid" and "setgid" permission bits and the related functionality. In itself, the setuid bit works quite well, allowing non-privileged users to perform privileged operations in a very controlled way. In fact this is such a clever and original idea that its inventor, Dennis Ritchie, was granted a patent for it. The patent has since been placed in the public domain. Though ultimately pointless, it is amusing to speculate what might have happened had the patent rights been asserted, leading to that aspect of Unix being invented around. Could a whole host of setuid vulnerabilities have been avoided?
The problem with this design is that programs which are running setuid exist in two realms at once and must attempt to be both a privileged service provider and a tool available to users - much like the confused deputy recently pointed out by LWN reader "cmccabe." This creates a number of conflicts which require special handling in various places.
The most obvious problem comes from the inherited environment. Like any tool, the programs inherit an environment of name=value assignments which are often used by library routines to allow fine control of certain behaviors. This is great for tools but potentially quite dangerous for privileged service providers as there is a risk that the environment will change the behavior of the library and so give away some sort of access that was not intended. All libraries and all setuid programs need to be particularly suspicious of anything in the environment, and often need to explicitly ignore the environment when running setuid. The recent glibc vulnerabilities are a perfect example of the difficulty of guarding against this sort of problem.
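As a sketch of the kind of defensive check this forces on library code (the variable name is invented for the example; this is not the actual glibc logic), a library that honors an environment variable for debugging output has to do something like:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Open a debug log named by an environment variable, but only when the
     * program is not running with elevated privileges; otherwise an
     * attacker could point LIBEXAMPLE_LOG at a file the invoking user
     * cannot normally write to. */
    static FILE *open_debug_log(void)
    {
        const char *path;

        if (getuid() != geteuid() || getgid() != getegid())
            return NULL;    /* running setuid/setgid: ignore environment */

        path = getenv("LIBEXAMPLE_LOG");
        return path ? fopen(path, "a") : NULL;
    }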
An example of a more general conflict comes from the combination of setuid with executable shell scripts. This did not apply at the time that setuid was first invented, but once Unix gained the #!/bin/interpreter (or "shebang") method of running scripts it became possible for scripts to run setuid. This is almost always insecure, though various interpreters have attempted to make it secure, such as the "-b" option to csh and the "taint mode" in perl. Whether they succeed or not, it is clear that the setuid mechanism has imposed a real burden on these interpreters.
Permission checking for signal delivery is normally a fairly straightforward matching of the UID of the sending process with the UID of the receiving process, with special exceptions for UID==0 (root) as the sender. However, the existence of setuid adds a further complication. As a setuid program runs just like a regular tool, it must respond to job-control signals and, in particular, must stop when the controlling terminal sends it a SIGTSTP. This requires that the owner of the controlling terminal must be able to request that the process continues by sending SIGCONT. So the signal delivery mechanism needs special handling for SIGCONT, simply because of the existence of setuid.
When writing to a file, Linux (like various flavors of Unix) checks if the file is setuid and, if so, clears the setuid flag. This is not absolutely essential for security, but has been found to be a valuable extra barrier to prevent exploits and is a good example of the wide-ranging intrusion of setuid.
Each of these issues can be addressed and largely have been. However they are issues that must be fixed not in the setuid mechanism itself, but in surrounding code. Because of that it is quite possible for new problems to arise as new code is developed, and only eternal vigilance can protect us from these new problems. Either that, or removing setuid functionality and replacing it with something different and less intrusive.
It was recently announced that Fedora 15 would be released with a substantially reduced set of setuid programs. Superficially this seems like it might be "removing setuid functionality" as suggested, but a closer look shows that this isn't the case. The plan for Fedora is to use filesystem capabilities instead of full setuid. This isn't really a different mechanism, just a slightly reworked form of the original. Setuid stores just one bit per file which (together with the UID) determines the capabilities that the program will have. In the case of setuid to root, this is an all or nothing approach. Filesystem capabilities store more bits per file and allow different capabilities to be individually selected, so a program that does not need all of the capabilities of root will not be given them.
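For a concrete, if hedged, picture of the difference, the capability set attached to a file can be inspected with libcap; the path below is just an example of a binary that might carry cap_net_raw instead of the setuid bit:

    #include <stdio.h>
    #include <sys/capability.h>     /* libcap; link with -lcap */

    int main(void)
    {
        /* cap_get_file() returns NULL if no capabilities are set (or on
         * error); a ping installed with "cap_net_raw+ep" would print that
         * instead of needing to be setuid root. */
        cap_t caps = cap_get_file("/bin/ping");
        char *text = caps ? cap_to_text(caps, NULL) : NULL;

        printf("/bin/ping: %s\n", text ? text : "(no file capabilities)");
        if (text)
            cap_free(text);
        if (caps)
            cap_free(caps);
        return 0;
    }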
This certainly goes some way to increasing security by decreasing the attack surface. However it doesn't address the main problem: that setuid programs exist in an uncertain world between being tools and being service providers. It is unclear whether libraries which make use of environment variables after checking that setuid is not in force will also correctly check that capabilities are not in force. Only a comprehensive audit would be able to tell for sure.
Meanwhile, by placing extra capabilities in the filesystem we impose extra requirements on filesystem implementations, on copy and backup tools, and on tools for examining and manipulating filesystems. Thus we achieve an uncertain increase in security at the price of imposing a further maintenance burden on surrounding subsystems. It is not clear to this author that forward progress is being achieved.
Filesystem links
Our second example, completing the story of high maintenance designs, is the idea of "hard links", known simply as links before symbolic links were invented. In the design of the Unix filesystem, the name of a file is an entity separate from the file itself. Each name is treated as a link to the file, and a file can have multiple links, or even none - though of course when the last link is removed the file will soon be deleted.
This separation does have a certain elegance and there are certainly uses that it can be put to with real value. However the vast majority of files still only have one link, and there are plenty of cases where the use of links is a tempting but ultimately sub-optimal option, and where symbolic links or other mechanisms turn out to be much more effective. In some ways this is reminiscent of the Unix permission model where most of the time the subtlety it provides isn't needed, and much of the rest of the time it isn't sufficient.
Against this uncertain value, we find that:
- Archiving programs such as tar need extra complexity to look out for hard links, and to archive the file the first time it is seen, but not any subsequent time; a sketch of the necessary bookkeeping follows this list.
- Similar care is needed in du, which calculates disk usage, and in other programs which walk the filesystem hierarchy.
- Anyone who can read a file can create a link to that file which the owner of the file may not be able to remove. This can lead to users having charges against their storage quota that they cannot do anything about.
- Editors need to take special care of linked files. It is generally safer to create a new file and rename it over the original rather than to update the file in place. When a file has multiple hard links it is not possible to do this without breaking that linkage, which may not always be desired.
- The Linux kernel's internals have an awkward distinction between the "dentry" which refers to the name of a file, and the "inode", which refers to the file itself. In many cases we find that a dentry is needed even when you would think that only the file is being accessed. This distinction would be irrelevant if hard links were not possible, and may well relate to the choice made by the developers of Plan 9 to not support hard links at all.
- Hard links would also make it awkward to reason about any name-based access control approach (as discussed in part 3) as a given file can have many names and so multiple access permissions.
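The bookkeeping burden on archivers and disk-usage tools mentioned above boils down to remembering (device, inode) pairs for files with a link count above one. A minimal sketch (real tools use a hash table rather than a fixed linear array) might look like this:

    #include <stdbool.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    struct seen { dev_t dev; ino_t ino; };
    static struct seen seen[1024];
    static int nseen;

    /* Returns true if this path refers to a hard-linked file that has
     * already been visited during this walk of the tree. */
    static bool already_seen(const char *path)
    {
        struct stat st;
        int i;

        if (lstat(path, &st) != 0 || st.st_nlink < 2)
            return false;       /* only one name; nothing to track */

        for (i = 0; i < nseen; i++)
            if (seen[i].dev == st.st_dev && seen[i].ino == st.st_ino)
                return true;

        if (nseen < 1024)
            seen[nseen++] = (struct seen){ st.st_dev, st.st_ino };
        return false;
    }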
Avoiding high maintenance designs
The concept described here as "high maintenance" is certainly not unique to software engineering. It is simply a specific manifestation of the so-called law of unintended consequences which can appear in many disciplines.
As with any consequences, determining the root cause can be a real challenge, and finding an alternate approach which does not result in worse consequences is even harder. There are no magical solutions on offer by which we can avoid high maintenance designs and their associated unintended consequences. Rather, here are three thoughts that might go some small way to reining in the worst such designs.
- Studying history is the best way to avoid repeating it, and so taking a broad and critical look at our past has some hope of directing us well for the future. It is partly for this reason that "patterns" were devised, to help encapsulate history.
- Building on known successes is likely to have fewer unintended
consequences than devising new ideas. So following the pattern that
started this series of "full exploitation" is, where possible, most
likely to yield valuable results.
- An effective way to understand the consequences of a design is
to document it thoroughly, particularly explaining how it should be
used to someone with little background knowledge. Often writing
such documentation will highlight irregularities which make it
easier to fix the design than to document all the corner cases of
it. This is certainly the experience of Michael Kerrisk who maintains the man pages for Linux, and,
apparently, of our Grumpy Editor who found that fixing the cdev
interface made him less grumpy than trying to document it, unchanged,
for LDD3.
When documenting the behavior of the Unix filesystem, it is desirable to describe it as a hierarchical structure, as that was the overall intent. However, honesty requires us to call it a directed acyclic graph (DAG), because that is what the presence of hard links turns it into. It is possible that having to write DAG instead of hierarchy several times might have been enough to raise the question of whether hard links are such a good idea after all.
Harken to the ghosts
In his classic novella "A Christmas Carol", Charles Dickens uses three "ghosts" to challenge Ebenezer Scrooge about his ideology and ethics. They reminded him of his past, presented him with a clear picture of the present, warned him about future consequences, but ultimately left the decision of how to respond to him. We, as designers and engineers, can similarly be challenged as we reflect on these "Ghosts of Unix Past" that we have been exploring. And again, the response is up to us.
It can be tempting to throw our hands up in disgust and build something new and better. Unfortunately, mere technical excellence is no guarantee of success. As Paul McKenney astutely observed, at the 2010 Kernel Summit, economic opportunity is at least an equal reason for success, and is much harder to come by. Plan 9 from Bell Labs attempted to learn from the mistakes of Unix and build something better; many of the mistakes explored in this series are addressed quite effectively in Plan 9. However while Plan 9 is an important research operating system, it does not come close to the user or developer base that Linux has, despite all the faults of the latter. So, while starting from scratch can be tempting, it is rare that it has a long-term successful outcome.
The alternative is to live with our mistakes and attempt to minimize their ongoing impact, deprecating that which cannot be discarded. The x86 CPU architecture seems to be a good example of this. Modern 64-bit processors still support the original 8086 16-bit instruction set and addressing modes. They do this with minimal optimization and using only a small fraction of the total transistor count. But they continue to support it as there has been no economic opportunity to break with the past. Similarly Linux must live with its past mistakes.
Our hope for the future is to avoid making the same sort of mistakes again, and to create such compelling new designs that the mistakes, while still being supported, can go largely unnoticed. It is to this end that it is important to study our past mistakes, collect them into patterns, and be always alert against the repetition of these patterns, or at least to learn how best to respond when the patterns inevitably recur.
So, to conclude, we have a succinct restatement of the patterns discovered on this journey, certainly not a complete set of patterns to be alert for, but a useful collection nonetheless.
Firstly there was "Full exploitation": a pattern hinted at in that early paper on Unix and which continues to provide strength today. It involves taking one idea and applying it again and again to diverse aspects of a system to bring unity and cohesiveness. As we saw with signal handlers, not all designs benefit from full exploitation, but those that do can bring significant value. It is usually best to try to further exploit an existing design before creating something new and untried.
"Conflated" designs happen when two related but distinct ideas are combined in a way that they cannot easily be separated. It can often be appropriate to combine related functionality, whether for convenience or efficiency, but it is rarely appropriate to tie aspects of functionality together in such a way that they cannot be separated. This is an error which can be recognized as the design is being created, though a bit of perspective often makes it a lot clearer.
"Unfixable" designs are particularly hard to recognize until the investment of time in them makes replacing them unpalatable. They are not clearly seen until repeated attempts to fix the original have resulted in repeated failures to produce something good. Their inertia can further be exacerbated by a stubbornness to "fix it if it kills me", or an aversion to replacement because "it is better the devil you know". It can take substantial maturity to know when it is time to learn from past mistakes, give up on failure, and build something new and better. The earlier we can make that determination, the easier it will be in the long run.
Finally "high maintenance" designs can be the hardest for early detection as the costs are usually someone else's problem. To some extent these are the antithesis of "fully exploitable" designs as, rather than serving as a unifying force to bring multiple aspects of a system together, they serve as an irritant which keeps other parts unsettled yet doesn't even produce a pearl. Possibly the best way to avoid high maintenance designs is to place more emphasis on full exploitation and to be very wary of including anything new and different.
If identifying, describing, and naming these patterns makes it easier to detect defective designs early and serves to guide and encourage effective design then they will certainly have filled their purpose.
Exercises for the interested reader
- Identify a design element in the IP protocol suite which could be described as "high maintenance" or as having "unintended consequences".
- Choose a recent extension to Linux and write some comprehensive documentation, complete with justification and examples. See if that suggests any possible improvements in the design which would simplify the documentation.
- Research and enumerate uses of "hard links" which are not
adequately served by using symbolic links instead. Suggest
technologies that might effectively replace these other uses.
- Describe your "favorite" failings in Unix or Linux and describe a pattern which would help with early detection and correction of similar failings.
Patches and updates
Kernel trees
Architecture-specific
Core kernel code
Development tools
Device drivers
Filesystems and block I/O
Memory management
Networking
Security-related
Virtualization and containers
Benchmarks and bugs
Miscellaneous
Page editor: Jonathan Corbet
Distributions
State of the Debian-Ubuntu relationship
The relationship between Debian and Ubuntu has been the subject of many vigorous debates over the years, ever since Ubuntu's launch in 2004. Six years later, the situation has improved and both projects are communicating better. The Natty Narwhal Ubuntu Developer Summit (UDS) featured, as every UDS has for more than two years, a Debian Health Check session where current cooperation issues and projects are discussed. A few days after that session, Lucas Nussbaum gave a talk during the mini-Debconf Paris detailing the relationship between the two projects, at both the technical and the social level. He also shared some concerns for Debian's future and gave his point of view on how Debian should address them. Both events give valuable insights into the current state of the relationship.
Lucas Nussbaum's Debian-Ubuntu talk
Lucas started by introducing himself. He has been an Ubuntu developer since 2006 and a Debian developer since 2007. He has worked to improve the collaboration between the two projects, notably by extending the Debian infrastructure to show Ubuntu-related information. He has attended conferences for both projects (Debconf, UDS) and has friends in both communities. For all of these reasons, he believes himself to be qualified to speak on this topic.
Collaboration at the technical level
He then quickly explained the task of a distribution: taking upstream software, integrating it in standardized ways, doing quality assurance on the whole, delivering the result to users, and assuring some support afterward. He pointed out that in the case of Ubuntu, the distribution has one special upstream: Debian.
Indeed Ubuntu gets most of its software from Debian (89%); only 7% are new packages coming from other upstream projects (the remaining 4% are unknown: they are newer upstream releases of software available in Debian, but he was not able to find out whether the Debian packaging had been reused or not). Of all the packages imported from Debian, 17% have Ubuntu-specific changes. The reasons for those changes are varied: bugfixes, integration with Launchpad/Ubuntu One/etc., or toolchain changes. The above figures are based on Ubuntu Lucid (10.04) and exclude many Ubuntu-specific packages (language-pack-*, language-support-*, kde-l10n-*, *ubuntu*, *launchpad*).
The different agendas and the differences in philosophy (Debian often seeking perfect solutions to problems; Ubuntu accepting temporary suboptimal workarounds) also explain why so many packages are modified on the Ubuntu side. It's simply not possible to always do the work in Debian first. But keeping changes in Ubuntu requires a lot of work since they merge with Debian unstable every 6 months. That's why they have a strong incentive to push changes to upstream and/or to Debian.
There are three channels that Ubuntu uses to push changes to Debian: they file bug reports (between 250 and 400 during each Ubuntu release cycle), they interact directly with Debian maintainers (often the case when there's a maintenance team), or they do nothing and hope that the Debian maintainer will pick up the patch directly from the Debian Package Tracking System (which relays information provided by patches.ubuntu.com).
Lucas pointed out that those changes are not the only thing that Debian should take back. Ubuntu has a huge user base resulting in lots of bug reports sitting in Launchpad, often without anyone taking care of them. Debian maintainers who already have enough bugs on their packages are obviously not interested in even more bugs, but those who are maintaining niche packages, with few reports, might be interested in the user feedback available in Launchpad. Even if some of the reports are Ubuntu-specific, many of them are advance warnings of problems that will affect Debian later on, when the toolchain catches up with Ubuntu's aggressive updates. To make this easier for Debian maintainers, Lucas improved the Debian Package Tracking System so that they can easily get Ubuntu bug reports for their packages even without interacting with Launchpad.
Human feelings on both sides
Lucas witnessed a big evolution in the perception of Ubuntu on the Debian side. The initial climate was rather negative: there were feelings of its work being stolen, claims of giving back that did not match the observations of the Debian maintainers, and problems with specific Canonical employees that reflected badly on Ubuntu as a whole. These days most Debian developers find something positive in Ubuntu: it brings a lot of new users to Linux, it provides something that works for their friends and family, it brings new developers to Debian, and it serves as a technological playground for Debian.
On the Ubuntu side, the culture has changed as well. Debian is no longer so scary for Ubuntu contributors and contributing to Debian is The Right Thing to do. More and more Ubuntu developers are getting involved in Debian as well. But at the package level there's not always much to contribute, as many bugfixes are only temporary workarounds. And while Ubuntu's community follows this philosophy, Canonical is a for-profit company that contributes back mainly when it has compelling reasons to do so.
Consequences for Debian
In Lucas's eyes, the success of Ubuntu creates new problems. For many
new users Linux is a synonym for Ubuntu, and since much innovation happens
in Ubuntu first, Debian is overshadowed by its most popular derivative. He
goes as far as saying that, because of this, "Debian becomes less
relevant".
He went on to say that Debian needs to be relevant because the project defends important values that Ubuntu does not. And it needs to stay as an independent partner that filters what comes out of Ubuntu, ensuring that quality prevails in the long term.
Fixing this problem is difficult, and the answer should not be to undermine Ubuntu. On the contrary, more cooperation is needed. If Debian developers are involved sooner in Ubuntu's projects, Debian will automatically get more credit. And if Ubuntu does more work in Debian, their work can be showcased sooner in the Debian context as well.
The other solution that Lucas proposed is that Debian needs to
communicate about why it's better than Ubuntu. Debian might not be better
for everybody, but there are many reasons why one could prefer Debian over
Ubuntu. He listed some of them: "Debian has better values" since it's a
volunteer-based project where decisions are made publicly and it has
advocated the free software philosophy since 1993. On the other hand,
Ubuntu is under the control of Canonical, where some decisions are imposed,
it advocates some proprietary web services (Ubuntu One), the installer
recommends adding proprietary software, and copyright assignments are
required to contribute to Canonical projects.
Debian is also better in terms of quality because every package has a maintainer who is often an expert in the field of the package. As a derivative, Ubuntu does not have the resources to do the same and instead most packages are maintained on a best effort basis by a limited set of developers who can't know everything about all packages.
In conclusion, Lucas explained that Debian can neither ignore Ubuntu nor
fight it. Instead it should consider Ubuntu as "a chance" and should
"leverage it to get back in the center of the FLOSS ecosystem".
The Debian health check UDS session
While this session has existed for some time, it's only the second time that a Debian Project Leader has been present at UDS to discuss collaboration issues. During UDS-M (the previous summit), this increased involvement from Debian was a nice surprise to many. Stefano Zacchiroli—the Debian leader—collected and shared the feedback of Debian developers and the session ended up being very productive. Six months later is a good time to look back and verify whether the decisions made during UDS-M (see blueprint) have been followed through.
Progress has been made
On the Debian side, Stefano set up a Derivatives Front Desk so that derivative distributions (not just Ubuntu) have a clear point of contact when they are trying to cooperate but don't know where to start. It's also a good place to share experiences among the various derivatives. In parallel, a #debian-ubuntu channel has been started on OFTC (the IRC network used by Debian). With more than 50 regulars coming from both distributions, it's a good place for quick queries when you need advice on how to interact with the distribution that you're not familiar with.
Ubuntu has updated its documentation to prominently feature how to cooperate with Debian. For example, the sponsorship process documentation explains how to forward patches both to the upstream developers and to Debian. It also recommends ensuring that the patch is not Ubuntu-specific and gives some explanation on how to do it (which includes checking against a list of common packaging changes made by Ubuntu). The Debian Derivative Front Desk is mentioned as a fallback when the Debian maintainer is unresponsive.
While organizing Ubuntu Developer Week, Ubuntu now reaches out to Debian developers and tries to have sessions on "working with Debian". Launchpad has also been extended to provide a list of bugs with attached patches and that information has been integrated in the Debian Package Tracking system by Lucas Nussbaum.
Still some work to do
Some of the work items have not been completed yet: many Debian maintainers would like a simpler way to issue a sync request (a process used to inject a package from Debian into Ubuntu). There's a requestsync command line tool provided by the ubuntu-dev-tools package (which is available in Debian) but it's not yet usable because Launchpad doesn't know the GPG keys of Debian maintainers.
Another issue concerns packages which are first introduced in Ubuntu. Most of them have no reason to be Ubuntu-specific and should also end up in Debian. It has thus been suggested that people packaging new software for Ubuntu also upload it to Debian. They could, however, immediately file a request for adoption (RFA) to find another Debian maintainer if they don't plan to maintain it in the long term. If Ubuntu doesn't make this effort, it can take a long time until someone decides to reintegrate the Ubuntu package into Debian just because nobody knows about it. This represents an important shift in the Ubuntu process and it's not certain that it's going to work out. As with any important policy change, it can take several years until people are used to it.
Both issues have been rescheduled for this release cycle, so they're still on the agenda.
This time the UDS session was probably less interesting than the previous one. Stefano explained once more what Debian considers good collaboration practices: teams with members from both distributions, and forwarding of bugs if they have been well triaged and are known to apply to Debian. He also invited Ubuntu to discuss big changes with Debian before implementing them.
An interesting suggestion that came up was that some Ubuntu developers could participate in Debcamp (one week hack-together before Debconf) to work with some Debian developers, go through Ubuntu patches, and merge the interesting bits. This would nicely complement Ubuntu's increased presence at Debconf: for the first time, community management team member Jorge Castro was at DebConf 10 giving a talk on collaboration between Debian and Ubuntu.
There was also some brainstorming on how to identify packages where the collaboration is failing. A growing number of Ubuntu revisions (identified for example by a version like 1.0-1ubuntu62) could indicate that no synchronization was made with Debian, but it would also identify packages which are badly maintained on the Debian side. If Ubuntu consistently has a newer upstream version compared to Debian, it can also indicate a problem: maybe the person maintaining the package for Ubuntu would be better off doing the same work in Debian directly since the maintainer is lagging or not doing their work. Unfortunately this doesn't hold true for all packages since many Gnome packages are newer in Ubuntu but are actively maintained on both sides.
Few of those discussions led to concrete decisions. It seems most participants are reasonably satisfied with the current situation. Of course, one can always do better and Jono Bacon is going to ensure that all Canonical teams working on Ubuntu are aware of how to properly cooperate with Debian. The goal is to avoid heavy package modifications without coordination.
Conclusion
The Debian-Ubuntu relationships used to be a hot topic, but that's no longer the case thanks to regular efforts made on both sides. Conflicts between individuals still happen, but there are multiple places where they can be reported and discussed (#debian-ubuntu channel, Derivatives Front Desk at derivatives@debian.org on the Debian side or debian@ubuntu.com on the Ubuntu side). Documentation and infrastructure are in place to make it easier for volunteers to do the right thing.
Despite all those process improvements, the best results still come out when people build personal relationships by discussing what they are doing. It often leads to tight cooperation, up to commit rights to the source repositories. Regular contacts help build a real sense of cooperation that no automated process can ever hope to achieve.
Brief items
Distribution quotes of the week
Liberté Linux 2010.1
The first release of Liberté Linux is available. "Liberté Linux is a secure, reliable, lightweight, and easy to use Gentoo-based LiveUSB Linux distribution intended as a communication aid in hostile environments. Liberté installs as a regular directory on a USB/SD key, and after a single-click setup, boots on any desktop computer or laptop. Available internet connection is then used to set up a Tor circuit which handles all network communication."
NetBSD 5.1
NetBSD 5.1 has been released. "NetBSD 5.1 is the first feature update of the NetBSD 5.0 release branch. It includes security and bug fixes, as well as improved hardware support and new features." More information can be found in the release notes.
SimplyMEPIS 8th Anniversary Release
MEPIS founder Warren Woodford has announced the release of SimplyMEPIS 11.0 Alpha2 to celebrate the project's 8th anniversary. "11.0 continues to track with Debian Squeeze but with a 2.6.36 kernel. In this release, MEPIS has backported the Galbraith latency patch, which improves desktop performance."
Distribution News
Fedora
Appointment to the Fedora Board
Fedora Project Leader Jared K. Smith has announced that Toshio Kuratomi has accepted a seat on the Fedora board. "Toshio is a great contributor to open source in general, and has been actively collaborating with people throughout the Fedora Project for many years. I have no doubt that he'll work tirelessly to increase the level of trust, transparency, communication, and innovation within the Fedora community." Elections are open until November 28 to fill two open seats, after which another person will be appointed to the board.
Newsletters and articles of interest
Distribution newsletters
- Debian Project News (November 22)
- DistroWatch Weekly, Issue 381 (November 22)
- Fedora Weekly News Issue 252 (November 17)
- openSUSE Weekly News, Issue 150 (November 20)
Hertzog: How Ubuntu builds up on Debian
Raphaël Hertzog takes a look at how packages flow from Debian to Ubuntu. "From all the source packages coming from Debian, 17% have additional changes made by Ubuntu. Many of them are part of the "main" repository, which is actively maintained by Canonical and Ubuntu core developers. The "universe" repository is usually closer to the official Debian packages."
Page editor: Rebecca Sobol
Development
A look at LyX 2.0
The LyX project has been quietly, but effectively, hammering away at a major update to its document processing program for about two years. On November 10, the LyX project unveiled the first beta for LyX 2.0. With better revision control, document previews, and new support for many LaTeX commands, LyX 2.0 is shaping up very nicely.
For the uninitiated, LyX is a multi-platform "document processor," essentially a front-end for editing TeX/LaTeX documents without having to muck with the actual TeX/LaTeX markup. At least, not unless one wants to fiddle with the markup. LyX does make it possible to insert TeX/LaTeX markup, but you don't generally need to.
[LyX navigation menu: https://static.lwn.net/images/2010/lyx-navigation-sm.png]
The difference between LyX and, say, LibreOffice Writer or Microsoft Word goes beyond the document format on the backend. LyX doesn't attempt to render documents in a "What You See Is What You Get" (WYSIWYG) style. Instead, the LyX motto is "What You See Is What You Mean" (WYSIWYM). LyX presents the structure of the document instead of the exact presentation. One reason for that is that the LyX folks want users to focus on the writing rather than bit-twiddling the presentation. Another is that LyX can export to a number of formats (PDF, HTML, plain text, etc.) and the presentation is going to change depending on the stylesheet and target format.
Though LyX works differently than Word or Writer, it has many of the features that users would like to have. For instance, LyX has spell checking, version tracking, and even revision control. It can produce simple documents, or entire books with beautifully (thanks to TeX/LaTeX) rendered equations.
Looking at 2.0
Binaries haven't shown up for LyX 2.0 yet, but there's very little difficulty in compiling from source. For users who want to test LyX 2.0 alongside an existing 1.6.x release, there are two options. One is to run LyX from the src directory after compiling, the other is to use the --with-version-suffix option when running ./configure.
After compiling LyX 2.0 beta I set about creating a few documents and testing some of the new features listed for 2.0. From the limited testing I've done this week, the first LyX 2.0 beta seems stable enough to use for day-to-day work. One word of caution, though, for those who wish to test out 2.0: the document format seems to be backward-incompatible with 1.6.x. When trying to open a document created in 2.0, LyX 1.6.7 complained about being unable to convert the format.
Major New Features
At first glance, there's not an enormous difference between LyX 1.6.x and 2.0. Open LyX 2.0 and 1.6.7 side by side and it's difficult to tell them apart. But the two years of LyX development have generated quite a few major and minor new features.
The idea is that the formatting of LyX documents is set by the document type. For instance, the layout of a presentation is set by the Beamer class. If you want to change those things, you make changes in the class, not the document itself. It's been possible to embed layout information in a LyX document previously, but the 2.0 release adds a way to do this via the "Local Layout" tab under "Document Settings". Now users can easily define specific tweaks to a layout without having to muck with the class itself. Even so, the LyX folks caution that it's "not a good idea to mess with a layout when you are actually working on a document".
Most books or other works include only one index — but some works call for more than one index. For example, in a book discussing the history of the Linux kernel one might want to have an index of names, and an index for features. LyX 2.0 introduces multiple indexes so users can do just that. If more than one index is defined, LyX adds menu entries for each under the "Insert" menu so that users can choose between the relevant indexes, and separate entries to place the target indexes in the document.
LyX has had thesaurus support for some time, but that support has been limited to English, which is obviously not terribly useful for much of the world. LyX 2.0 adds support for the MyThes library, the same library used by OpenOffice.org, which makes all of the OpenOffice.org thesauri available to LyX 2.0 users. Likewise, LyX has had spell-checking support for some time, but it required running the spell checker manually; 2.0 adds continuous spell checking, though it can be turned off for users who prefer the manual way.
Several of the changes seem to be moving LyX toward giving users more direct control over formatting. LyX 2.0 adds support for a number of LaTeX features that were not previously supported within LyX. For example, by adding the "Initials" module (under "Document Settings"), users can define initial caps, where the first letter of a paragraph or section of text is set larger than the surrounding text. It does not automatically apply an initial cap to any structure in the document; instead, it adds an "Initial" menu entry that lets users define initial caps manually. There's also support for several new types of underline (via the LaTeX ulem package). One addition that seems like it should have been available earlier is support for the LaTeX \rule command, which inserts a horizontal line into the document.
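For those curious about what this looks like at the LaTeX level, the snippet below is a rough, hand-written equivalent of the new underline and \rule support; the specific ulem commands and the 0.4pt thickness are illustrative, not necessarily what LyX itself emits.

```latex
\documentclass{article}
% ulem provides the extra underline styles; "normalem" keeps \emph as italics.
\usepackage[normalem]{ulem}
\begin{document}

Some text with \uline{a plain underline} and \uwave{a wavy underline}.

% A horizontal rule spanning the text width, 0.4pt thick.
\noindent\rule{\linewidth}{0.4pt}

\end{document}
```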
For those who wish to typeset songbooks, or just add a bit of musical notation to a document, LyX now supports the Lilypond LaTeX module and can import from Lilypond.
![[LyX file comparison]](https://static.lwn.net/images/2010/lyx-differences-sm.png)
One of the most interesting additions is the "compare documents" feature. LyX already had version-control support and change tracking, but this new feature compares two documents and produces a third that incorporates the changes between them, from which output can then be generated. At least in theory. To test the theory, I tried comparing the User Guide from 1.6.7 with the one from the 2.0 beta. It took about a minute to produce the differences document, which can be viewed in LyX; trying to produce output (like a PDF) from that document, however, failed with a spectacular number of errors.
Beyond that, LyX 2.0 includes dozens of small enhancements for better control over output and document presentation, along with support for new types of layout. See the "what's new" document for the full list of new features. Though most are incremental updates, taken together they add up to quite a major revision.
LyX users who aren't itching to upgrade will be happy to know that many, but not all, of the features new to LyX 2.0 have been backported to the 1.6.8 release that came out on November 15th. The 1.6.8 release is the recommended upgrade for users who want a "solid and polished" version. The LyX project hasn't yet released Linux binaries for 1.6.8, but installers for Windows and Mac OS X are available for those who want or need to use LyX on those platforms.
The release schedule for 2.0 calls for a final release in December. Since the project has hit its targets so far, it seems likely that, barring any major show-stoppers, users will have a stable LyX 2.0 in time for the holidays. Though on the surface LyX 2.0 doesn't look that different, it carries a fairly significant set of improvements. What's coming in later versions? There's no roadmap for LyX releases beyond the 2.0.x series, but users can add to the wishlists, which have gotten long enough that they've been split into two lists, roughly divided between internal features (tools, citing, inline editing) and external ones (saving, exporting, installing).
For anyone who wants to produce PDF, HTML, or other output formats, rather than Word or ODF documents, LyX is a top-notch document processor. The 2.0 release is well worth the two years it has taken to produce.
Brief items
Report from the Buildroot Developer Day, 29th October 2010
Buildroot is "a set of Makefiles and patches that makes it easy to generate a complete embedded Linux system." The Buildroot developers held a meeting at the end of October; Thomas Petazzoni has posted a report from that gathering. "This Developer Day has been very productive in terms of discussion, and for some complicated topics, we now have a better understanding on what should be implemented and how it should be implemented. The participants were all very satisfied of the day spent discussing Buildroot future." (Thanks to Sam Ravnborg).
Claws Mail 3.7.7 Unleashed
Version 3.7.7 of the Claws Mail email client is out. Enhancements include command-line searching and an option to add a margin to the compose window so that it's possible to actually read what you've written.
Coccinelle 0.2.4 released
Version 0.2.4 of the Coccinelle code transformation tool is out. Improvements include better scripting support, a number of new metavariable types, and more. See this LWN article for an introduction to what can be done with Coccinelle.
The state of Mozilla
Mitchell Baker has posted an annual report on the state of the Mozilla project. "The Internet is in a period of dramatic change. We've built the traits we care about -- innovation, opportunity, interoperability, individual control -- into one layer of Internet life through the browser. We also need to build these traits into the new ways people use the browser and the Internet. Three of our largest areas of focus are mobile, 'Open Web Apps' and the social and data sharing aspects of the Web. We're also increasing participation and collaboration with the Mozilla Drumbeat project."
Wayland switching to LGPLv2
The main Wayland libraries will be changing to version 2 of the Lesser GPL (from the MIT license) in the near future. The clients are changing too: "The demo compositor and clients are currently under GPLv2, but I'm changing them to LGPLv2 as well. This is a bit odd on the face of it, but the point of these applications is to prototype new functionality that will eventually migrate into either the client or server wayland libraries or one of the above toolkits. As we move forward and start adding developers, I just want to make sure that that won't be a problem."
Newsletters and articles
Development newsletters from the last week
- Caml Weekly News (November 23)
- LibreOffice development summary (November 22)
- PostgreSQL Weekly News (November 21)
Comparing MySQL and Postgres 9.0 Replication (TheServerSide)
TheServerSide.com has a comparison of the replication features offered by MySQL and PostgreSQL. "As demonstrated above, there are both feature and functional differences between how MySQL and PostgreSQL implement replication. However, for many general application use cases, either MySQL or PostgreSQL replication will serve just fine; technically speaking, from a functional and performance perspective, it won't matter which solution is chosen. That said, there still are some considerations to keep in mind in deciding between the different offerings."
Image Processing with OpenGL and Shaders (LinuxJournal)
The Linux Journal has an article on computing with the GPU. "This article discusses using OpenGL shaders to perform image processing. The images are obtained from a device using the Video4Linux 2 (V4L2) interface. Using horsepower from the graphics card to do some of the image processing reduces the load on the CPU and may result in better throughput. The article describes the Glutcam program, which I developed, and the pieces behind it."
systemd for administrators - killing services
The fourth installment of Lennart Poettering's "systemd for administrators" series has been posted. This installment looks at killing system daemons. "So again, what is so new and fancy about killing services in systemd? Well, for the first time on Linux we can actually properly do that. Previous solutions were always depending on the daemons to actually cooperate to bring down everything they spawned if they themselves terminate. However, usually if you want to use SIGTERM or SIGKILL you are doing that because they actually do not cooperate properly with you."
Page editor: Jonathan Corbet
Announcements
Commercial announcements
Microsoft helping OpenStreetMap
Microsoft has announced that it will be contributing to the OpenStreetMap project. "As a Principal Architect for Bing Mobile, Steve will help develop better mapping experiences for our customers and partners, and lead efforts to engage with OpenStreetMap and other open source and open data projects. As a first step in this engagement, we plan to enable access to Bing's global orthorectified aerial imagery, as a backdrop of OSM editors. Also, Microsoft is working on new tools to better enable contributions to OSM." Here, "Steve" is Steve Coast, the founder of OpenStreetMap.
Novell sold to Attachmate
Novell has announced the company's sale to Attachmate for $6.10/share. More ominously: "Novell also announced it has entered into a definitive agreement for the concurrent sale of certain intellectual property assets to CPTN Holdings LLC, a consortium of technology companies organized by Microsoft Corporation, for $450 million in cash, which cash payment is reflected in the merger consideration to be paid by Attachmate Corporation." Information on what the "certain intellectual property assets" are is scarce at the moment. (Thanks to Jeff Schroeder).
Update: Novell's 8K filing is available with a bit more information. The "certain intellectual property" is 882 patents. There is also an escape clause for Novell should somebody come along with an offer for the company that includes buying the patents.
Attachmate Corporation Statement on openSUSE project
Attachmate has released a brief statement about its plans for SUSE and openSUSE. "'The openSUSE project is an important part of the SUSE business,' commented Jeff Hawn, chairman and CEO of Attachmate Corporation. 'As noted in the agreement announced today, Attachmate plans to operate SUSE as a stand-alone business unit after the transaction closes. If this transaction closes, then after closing, Attachmate Corporation anticipates no change to the relationship between the SUSE business and the openSUSE project as a result of this transaction.'"
OSADL Launches Real-Time Linux Quality Assurance Farm
The Open Source Automation Development Lab launched a quality assurance farm that contains a number of test racks with various desktop and embedded computer systems. "They are equipped with a candidate of the "Latest Stable" real-time Linux kernel and undergo continuous testing. All test results and configuration data of all systems are available online at osadl.org/QA. As far as we know, this is the first QA farm of its kind; we are convinced that it represents an important step towards a generally accepted and validated real-time operating system for the automation industry."
Articles of interest
Did Google Arm Its Own Enemies With Android? (HBR)
The Harvard Business Review thinks that Google is in trouble because handset vendors can change the default search engine on Android phones. "What's the endgame here? Well, with both handset manufacturers and networks increasingly becoming commoditized, each are desperate to find new sources of revenue. Between them, the most valuable thing they have is control over what goes on the phone right before it reaches the customer: what apps, and what search. This is exactly what Google needs to control as the future shifts to the mobile web."
Surveys
Markham: MPL2: GPL Compatibility?
Gervase Markham is running an unofficial survey on whether or not the Mozilla Public License v2 should be compatible with the GPL. "I personally believe that most or all groups who are currently licensing their software under the MPL (only) would not mind, or actively desire, GPL compatibility, and the new MPL should give them the opportunity to choose it. I think most free software developers see licensing as a pain, and license incompatibility as a double pain, and would much rather everything were upwardly compatible with everything else. To test this belief, and so we can appropriately publicize the new version of the MPL when it comes out, I am creating a list of MPLed projects. If you know of an MPLed project please add it to the list. And I want to hear from those projects as to whether they are opposed to, in favour of, or indifferent to, GPL compatibility for existing projects being put into MPL 2. Have a discussion on your mailing list and let me know the outcome."
Calls for Presentations
Call for Papers: Python for High Performance and Scientific Computing
There will be a workshop on using Python for High Performance and Scientific Computing as part of the International Conference on Computational Science (ICCS 2011). The workshop will be held in Tsukuba, Japan, June 1-3, 2011. The call for papers is open until January 10, 2011.
Upcoming Events
Events: December 2, 2010 to January 31, 2011
The following event listing is taken from the LWN.net Calendar.
Date(s) | Event | Location |
---|---|---|
December 4 | London Perl Workshop 2010 | London, United Kingdom |
December 6 - December 8 | PGDay Europe 2010 | Stuttgart, Germany |
December 11 | Open Source Conference Fukuoka 2010 | Fukuoka, Japan |
December 13 - December 18 | SciPy.in 2010 | Hyderabad, India |
December 15 - December 17 | FOSS.IN/2010 | Bangalore, India |
January 16 - January 22 | PyPy Leysin Winter Sprint | Leysin, Switzerland |
January 22 | OrgCamp 2011 | Paris, France |
January 24 - January 29 | linux.conf.au 2011 | Brisbane, Australia |
January 27 - January 28 | Southwest Drupal Summit 2011 | Houston, Texas, USA |
January 27 | Ubuntu Developer Day | Bangalore, India |
January 29 - January 31 | FUDCon Tempe 2011 | Tempe, Arizona, USA |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol