LWN.net Weekly Edition for February 20, 2014
A Notmuch mail update
Your editor has, for many years, been in search of a better email client. It is continually surprising — and frustrating — to see how hard it seems to be for our community to do a good job with one of the tools it relies upon most heavily. As part of this search, your editor reviewed Notmuch nearly four years ago. Since the email-client problem has not been solved in the meantime, your editor decided that perhaps Notmuch merited another look.

Shortly after the previous review, Notmuch development appeared to come to something approximating a complete stop after Carl Worth, the creator of the project, got busy with other work. In the last couple of years, though, David Bremner has accepted the challenge of pushing Notmuch forward; in response, a small but dedicated community of developers has come together. Notmuch is, once again, making semi-regular releases and adding new features; the current release is Notmuch 0.17, which came out at the end of December.
Notmuch remains more of a kit from which an email setup can be built than a complete solution in its own right. In particular, it doesn't handle email sending and receiving at all, and its approach to the user interface, while improving, may still feel a bit desultory to anybody who is used to a more polished client. So Notmuch will not be an appropriate solution for anybody who wants a ready-made point-and-click environment. But it may well be the right solution for users who deal with large amounts of mail and are willing to put some time into creating an optimal setup.
At its core, Notmuch is a command-line tool that manages a searchable index for a local mail repository. The mail itself must be stored in the filesystem, one message per file; the Maildir and MH formats both work. The local storage requirement has been a show-stopper for your editor in the past; there are real advantages to having one's email behind an IMAP server in a central location. It's nice to be able to access email from multiple machines, including tablets and handsets. For users who live their lives out of their laptops, perhaps a local mail store is an optimal solution, but, for many of us, mail locked into a laptop is inaccessible much of the time.
That's the bad news. The good news is that the Notmuch developers have been working to improve interoperability with mail stored on an IMAP server. Notmuch still needs a local copy of the mail; the tool of choice to manage that copy appears to be Offlineimap. Recent versions of Notmuch are able to manipulate the "has been read" flags stored in Maildir filenames and to perform folder-based searches, despite the fact that "folders" are a foreign concept in the Notmuch world. The afew tool can be used to move messages between folders based on Notmuch tags; that means that local tagging can be reflected in IMAP folders on the server. These changes make it possible to use Notmuch in some settings while still falling back on an IMAP-based client in others.
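For those who have not set it up before, a minimal Offlineimap configuration for mirroring an IMAP account into a local maildir might look something like the following sketch; the server name, user, and paths here are placeholders to be adapted:

    [general]
    accounts = main

    [Account main]
    localrepository = local
    remoterepository = remote

    [Repository local]
    type = Maildir
    localfolders = ~/mail

    [Repository remote]
    type = IMAP
    remotehost = imap.example.com
    remoteuser = user

With something like that in place, running offlineimap keeps ~/mail synchronized with the server, and "notmuch new" can index it.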
The initial task of copying the mail directory to the local machine where Notmuch is to be run (assuming it's not already there) will still be painful, of course. But, then, anybody who has sat through Thunderbird's indexing routine will already be used to that kind of suffering and may not even notice. Then it's a simple matter of running:
notmuch setup
and answering some questions to create the initial Notmuch configuration, then:
notmuch new
to read through and index all of the mail in the spool. The process can take a while for a very large mail spool, but it's surprisingly fast when one looks at how many messages are being processed.
The indexing of mail is done with the Xapian search engine. Xapian provides high-speed and flexible searching; it also performs "stemming," so that a search for "email" will also turn up "emails" and "emailed," for example. Stemming only works with English; extending it to other languages is not likely to be a simple task. Basic logical operators are supported; a search for, say:
from:torvalds AND crap
turns up a long list of messages on a linux-kernel archive. Searches can be date-constrained; it is also possible to search for words or phrases that are near each other in the text.
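For example, the following queries give the flavor of the syntax, though the notmuch-search-terms man page is the authority on the details:

    notmuch search date:2014-01-01..2014-02-01 AND from:torvalds
    notmuch search 'entropy NEAR pool'

The first restricts the search to a date range; the second finds messages where the two words appear near each other.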
The other fundamental Notmuch operation is tagging. The "notmuch new" command will apply some initial tags ("unread", for example); it can be configured to do more complex tagging in response to search queries as well. Searching itself can be limited to specific tags. As is the way of the modern world, Notmuch has pretty much done away with the concept of folders; rather than existing in a single folder, a message can carry any number of tags.
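Tags are applied with "notmuch tag", which takes a list of tags to add or remove followed by a search expression; for example (the tag name here is arbitrary):

    notmuch tag +lkml -- to:linux-kernel@vger.kernel.org

After that, "tag:lkml" can be used to narrow any subsequent search.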
Low-level access to the Notmuch search feature is via the "notmuch search" command; it outputs a list of matching messages (or threads) in a line-oriented format. Most users, however, will want to actually do things with the result of a search — reading the messages, say, or replying to them. That is where core Notmuch throws up its hands and declares that the problem is to be solved by some higher layer of software.
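The search output gives one line per thread, along these lines (a fabricated example; the exact format has varied between versions):

    thread:0000000000021f3e   2014-02-16 [3/3] Linus Torvalds; Linux 3.14-rc3 (inbox unread)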
Several projects are working on producing a useful user interface for people who actually want to work with mail. There are a few alternatives for those who want to integrate Notmuch with the Mutt email client, including the notmuch-mutt helper by Stefano Zacchiroli. Another terminal-based client is Alot. There is notmuch-vim for those who want to manage their mail through the vim editor, and notmuch-web to create a web interface.
The bulk of Notmuch users, though, would appear to be using the Emacs-based interface packaged with Notmuch itself. Starting the Emacs Notmuch mode is somewhat reminiscent of going to the main Google page; there is a search box and not a whole lot more. Once one picks a search ("unread" is a good starting place), the result is a dense screen showing the results in a one-line-per-thread format. Clicking on a line (or moving there and hitting "return") switches to a buffer with all messages in the thread, one after another. This view clearly works for many Notmuch users, but it's not entirely helpful for those of us who like to know what's in a thread before wandering into it.
Fortunately, the Notmuch 0.17 release added a new mode called notmuch-tree; it shows a tree-structured threaded view that is closer to what most other thread-oriented mail clients provide. In this view, one can move through threads and messages in a fairly straightforward manner.
There is a full range of functionality in either mode, including the ability to pipe a message to a command — a feature that is surprisingly hard to find in most graphical clients. And, of course, it's Emacs, so the interface is infinitely flexible.
If Emacs is running in the graphical mode, the experience is mostly like that of a graphical mail reader. The illusion falls apart a bit, though, when it comes to dealing with attachments. Emacs can deal with various types of attachments internally; it can do inline image display, for example. But Notmuch makes little use of those capabilities, relying instead on external helpers to deal with attachments. Having external image viewers and web browser windows pop up for attachments feels a bit less integrated than a contemporary graphical mail client, but it works well enough most of the time.
Mail composition is handed off entirely to the Emacs "Message" mode, which, for the most part, handles the job nicely. There is nothing like trying to edit a complex message in a graphical mail client's composition window to make one appreciate the virtues of having a real text editor available. That said, handling of attachments is, once again, a bit primitive for those who are used to a native, graphical, point-and-click interface, but it gets the job done. Naturally, Notmuch doesn't get into the business of actually sending mail at all; it assumes there is a sendmail command around somewhere that can send a message toward its destination.
So, one might ask, will Notmuch become your editor's new email client? The jury is still out on that one, but it might just happen. To a great extent, it comes down to whether a practical way of accessing email from multiple hosts, including some that use IMAP, can be found. Offlineimap seems like one potential option; there is also a way of using a remote Notmuch installation over SSH from an Emacs client, though the instructions warn that "some things probably won't work perfectly". In either case, part of the proof will be usability over a painful conference network connection.
Even without a full-scale switchover, Notmuch can be useful for tasks like searching through an extensive linux-kernel archive, of course.
In general, it often seems like progress in the area of email clients came to a halt some years ago. The clients that actually are evolving are, for the most part, tied to cloud services; offloading one's email to a cloud provider may be a viable solution for some users, but many of us disliked that idea even before the extent of national surveillance efforts became widely known. So it is good to see communities like Notmuch trying to get back to the basics of what makes an email client work well, and it is good to see that Notmuch, in particular, appears to be picking up speed. Notmuch remains a project worth watching.
Fedora and the Cherokee logo
Offensive imagery is, at least partly, in the eye of the beholder. But there are classes of imagery that have come under increasing fire over the years for depicting groups of people (races, genders, religions, etc.) using stereotypes, which is often seen as demeaning and insulting to those groups. A long-simmering Fedora bug regarding images used by the Cherokee web server package flared up recently—and was escalated to the Fedora Engineering Steering Committee (FESCo) for a verdict.
At first blush, the image in question (a running, smiling "Cherokee" boy) seems fairly tame and not particularly offensive. To some, clearly it isn't offensive. But the US (in particular) has seen a dramatic shift in acceptable depictions of American Indians (or Native Americans, depending on who you ask). The stereotypical look and actions of Indians from the "Westerns" movie genre are typically frowned upon. Caricatures based on those stereotypes can be offensive as well. There has also been substantial controversy over the use of Indians and related imagery for sports teams. It is a touchy subject in many quarters, but one that is difficult for some people to grasp.
Back in 2011, Richard Fontana filed a bug contending that the Cherokee imagery violated the Fedora packaging guidelines. Specifically, the "Code Vs Content" section says: "Content should not be offensive, discriminatory, or derogatory. If you're not sure if a piece of content is one of these things, it probably is." He asked that the content be replaced in the Fedora package with content that did meet the guidelines.
Lead Cherokee developer Alvaro Lopez Ortega replied, noting that the project had discussed the issue on its mailing list:
[...] Do not take me wrong though. I'd be all for removing an offensive logo. If the logo were demeaning or tried to make fun of a collective it should be [removed]. However, quite honestly, Cherokee's logo is not the case. It is a completely respectful logo of a happy kid without any remote sign of negativeness at all.
But Fontana (along with Zachary Krebs, who started the mailing list thread and was involved in the bug report as well) is not convinced that a "happy smiling kid" is as benign as Lopez has described:
The Cherokee project developer community seems quite international in scope and I have no doubt that the choice of this logo was not intended or understood to offend. Nevertheless, the Fedora standard (as I read it) does not rest on the intent of the upstream project.
Another Indian-named web server project, Hiawatha, has a somewhat similar logo (seen on the About page) that is likely even more offensive, according to Krebs and others. That project was mentioned in both threads, though it is not packaged for Fedora. In an effort to defuse the problem, Máirín Duffy offered to create new logos for both projects.
There was more back and forth, where Fontana, Krebs, and Duffy tried to help Lopez understand the problem, while he and Fedora Cherokee package maintainer Pavel Lisý were, seemingly, unable to see any problem. Lopez declared that the logo would not be changing, though he did add some text (it has apparently since disappeared) to the Cherokee web site pointing out "that we don't intend to be disrespectful *by any [means]*".
That's where things stood until the end of 2013, when Fontana noted the lack of action on the bug and asked Lisý about his plan for addressing it. There was no immediate response to the query, but it did rekindle the discussion about the logo. Current Cherokee maintainer Stefan de Konink joined in and expressed concern that the existing logo was being censored by Fedora.
But censorship is not the issue. No one is calling for laws to restrict the distribution of the logo or to have it removed from the project (though several have suggested that it would be a good idea for the project to do so). Fontana made exactly that point in his reply to De Konink.
Once again, the conversation went back and forth over the existing points, without any minds being changed. Hiawatha author Hugo Leisink added a comment that, unsurprisingly, agreed with the Cherokee folks. Meanwhile Lisý responded, but had little interest in addressing Fontana's complaint as he disagreed with its basis.
Soon thereafter, Fontana filed another bug, this time in his capacity as a member of the Red Hat legal team (his earlier efforts were simply as a member of the Fedora community). The bug also requested the removal of specific images from the Cherokee package. It led to a rather bizarre thread where De Konink asked for personnel action against Fontana. That supposed "threat" led Fontana to request removal of the entire Cherokee package from Fedora. Things were clearly spinning out of control.
Emotions on both sides were running high, but the question remained: is the logo in violation of Fedora packaging guidelines or not? That's the kind of decision that FESCo makes, and the committee took it up at its February 12 meeting. The FESCo ticket contains many of the same arguments that had been made three years earlier (and then rehashed earlier this year), but Lisý (pali) is participating more than he did in other forums.
In the end, the FESCo vote wasn't even close, with all eight members voting to ask Lisý to remove or replace the offensive images within two weeks. If that doesn't happen, the Cherokee package will be forcibly retired from Fedora. As of this writing, the logos have yet to be changed.
It is clear that there is a disconnect between the Cherokee/Hiawatha developers (and maintainers) and some in the Fedora project (including, unanimously, FESCo). It is tempting to attribute that largely to a cultural divide (as several in the thread did)—that European and US attitudes toward the depictions of Indians differ. That may be too simplistic in this case, however. There also seem to be some heels that were dug in quite early, and no amount of discourse (or information, for example, about what some people of Cherokee descent think) was going to change those minds. Given that there is an offer on the table to design new logos (at least if the Cherokee developers stop "harassing" Fontana), it would seem prudent for the projects to at least explore that option.
There is also, undoubtedly, a feeling of "political correctness run amok" about this incident. It is not a fair assessment, but these types of conflicts tend to bring out that feeling in some. People get attached to things like logos, and "attacks" on them feel personal, particularly when the offense given is not obvious (and it isn't, at least in the Cherokee case). We may well see this play out again someday; there aren't easy answers about where the lines are, which will make it all that much more difficult.
Mozilla directory tiles, ads, and you
On February 11, Mozilla announced that it was pursuing a program to place ad content into Firefox—in one specific location: the currently empty slots seen in Firefox's "New Tab" page the first few times the browser is run. But neither the limitations of the plan nor the assurances of putting users first stopped critics from decrying the decision. In subsequent days, a number of Mozilla representatives sought to quell user dissatisfaction and assure the public that Firefox was not headed down a dark path toward revenue-driven decision making.
The announcement was made in a blog post written by Mozilla's VP of Content Services, Darren Herman. In the post, Herman noted that, for users who have just recently installed Firefox, the "New Tab" page—which normally shows nine shortcuts to frequently-visited pages—instead shows nine uninteresting blank squares. Thus, Mozilla was planning to populate those nine slots with something more useful: a mix of Mozilla-ecosystem content and sponsored sites.
The program was called Directory Tiles, he said, and more details would be forthcoming. The reaction was swift, with most people zeroing in on the "sponsored tiles" portion. After all, despite the somewhat business-speak style of wording used in the announcement, the plan clearly does mean that Firefox would show advertisements in some of the New Tab squares. As anyone who has witnessed a flamewar on the Internet would expect, commenters on the post were rather vocal and explicit when expressing their displeasure.
Several accused Herman of using vague language in an effort to hide the real meaning of the announcement, such as a commenter named Matt who deemed it "corporate biz speak nonsense". Further fueling the negative reaction was coverage from blogs and online news outlets; Silicon Angle's headline accused Mozilla of "selling its soul," while Tom's Guide labeled it "Firefox flipping its policy."
By and large, the negative comments broke into three general camps. Some immediately declared Mozilla to be unworthy of further trust and support, some argued that although the specific program was not anathema, the move was a slippery slope that would lead to a degraded, ad-driven experience in the future, while others raised specific concerns about the ads. The specific concerns included whether or not the Directory Tiles feature could be deactivated, who and what sort of ad partners would be "hand selected" (namely whether the ads would actually be of interest to the user), and whether or not the ads would involve user tracking. There also seemed to be confusion as to whether the sponsored tiles would be links to base site URLs (e.g., Amazon.com) or would be vendor-created ad units pitching specific products or pages (for example, "latest headline" articles).
Herman clarified several of those implementation details in a February 13 update post. The sponsored tiles will not incorporate user tracking, although they will return different tiles (for sponsored and non-sponsored slots) depending on IP-based geolocation information if it is available. Users will be able to switch off the feature (as they can the New Tab feature today); users will also be able to remove specific tiles (just as they can currently remove frequently-visited suggestions on the New Tab screen by clicking on the "x" button). He also explained that Directory Tiles would only be visible for 30 days or so, until Firefox has accumulated enough browser history for its "frecency" algorithm (a perhaps-questionably-cute word coined at Mozilla that combines visit frequency and most-recent-visit data) to start picking frequently-visited pages, at which point those pages will replace the Directory Tiles.
Other facets of the program are not yet clear, he said. For example, it has not been determined whether advertisers will be allowed to push specific articles/pages, or just generic site URLs. Precisely which metrics and statistics will be made available to advertisers has not been decided either, he said, but Mozilla's expectation is that it will report the total number of impressions and the number of click-throughs back to sponsors.
As for the accusations of betrayal and the slippery-slope argument, neither of those concerns is as easily mollified by providing additional detail. Nevertheless, Mozilla Chairperson Mitchell Baker published a response of her own to the outcry, also on February 13. She emphasized that Mozilla has a strong track record of maintaining the user experience over chasing revenue streams, in part due to the firsthand experience many on the Firefox team had with AOL's attempts to monetize Netscape in the years before Mozilla's founding. The project, she said, is far more cautious than most of its contemporaries: "Sometimes my commercial colleagues laugh at me for the amount of real estate we leave unmonetized or the revenue opportunities we decline."
Baker went on to explain (and defend) Mozilla's process for evaluating a potential revenue source, which starts with asking whether the idea benefits users or will simply be an irritant.
On February 18, Security and Privacy Program Manager Curtis Koenig wrote that, in comparison to other, more widely-accepted forms of advertisement, it is the user-tracking and privacy-invading aspects of online ads that raise the most alarm with users. On that point, all of the Mozilla spokespeople have been in clear agreement—the sponsored tiles will not involve user tracking.
Revenue generation has long been a thorny subject for Mozilla. Since the organization's largest income source is the large fee it collects from Google for making Google the default search engine in Firefox, there are regularly critics who either bemoan the lack of income diversification or accuse Mozilla of being little more than Google's puppet. But as recent history has shown, each new revenue source the organization puts together attracts its own share of complaints: 2011's Bing-powered builds of Firefox and the Firefox OS mobile operating system initiative both drew complaints, for example—the former for "selling out" and the latter for being a foolhardy distraction.
But such criticism is par for the course when working in the open. It could certainly be argued that the primary root of the dust-up over Directory Tiles was the unclear presentation in Herman's announcement; it was several paragraphs in before the main topic came up, and even then it was after a lengthy description of meetings with the IAB (which, while not explained in the post, is the Interactive Advertising Bureau). A clearer explanation that anticipated some of the frequently asked questions might have been better received.
As for whether the program itself will prove useful or annoying, only time will tell. Certainly there is an opportunity for adding value in the non-sponsored tiles on the Directory Tiles screen—what is included in the "Mozilla ecosystem" tile category is still unclear, but support forums and add-on links (for example) could be helpful. If the sponsored tiles end up going to online casinos and shady certificate authorities, then users will certainly have valid complaints. But, as Mozilla General Counsel Denelle Dixon-Thayer pointed out in her own response, the tiles will only successfully bring in revenue for Mozilla if the users who see them find them useful. A corollary to that observation is that because the sponsored tiles are only seen by new users, they only bring in revenue when Firefox grows in popularity. So there is, to some extent, a built-in mechanism for self-correction if the concept proves to be a nuisance in the real world.
Security
Adding CPU randomness to the entropy pool
Kernel developers, or at least the maintainers of the random subsystem, are always on the lookout for sources of unpredictability for use in the random number pool. In particular, there are a number of scenarios where good random numbers are needed and the available pool is lacking in quality—embedded systems, early in the boot process, virtual machines, etc.—so new sources that can alleviate the problem are generally welcome. However, there is always the question of how much entropy is truly provided by these sources, which is a difficult problem to solve. Two recent patches would contribute unpredictable sources, but they take different approaches with regard to adding to the store of entropy.
The kernel has two separate interfaces for random number generation: /dev/random and /dev/urandom. They are supposed to be used for different purposes, with /dev/random only providing as many bits of randomness as bits of entropy have been added to the pool—blocking if insufficient entropy is available. It is meant to be used for long-lived keys (e.g. the SSH key for the system), while /dev/urandom provides cryptographic-strength pseudo-random numbers without the entropy requirement (and, thus, no blocking). Data read from either device comes from the same pool, but the entropy requirement is only applied for reads from /dev/random.
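For illustration, a process reads from either device like any other file; a minimal sketch that pulls 16 bytes from the non-blocking interface:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        unsigned char buf[16];
        int i, fd;

        fd = open("/dev/urandom", O_RDONLY);    /* never blocks */
        if (fd < 0 || read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
            return 1;
        for (i = 0; i < (int)sizeof(buf); i++)
            printf("%02x", buf[i]);
        printf("\n");
        close(fd);
        return 0;
    }

The same read() against /dev/random could block until enough entropy has been credited to the pool.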
Unpredictable events measured by the kernel (that cannot be observed by an adversary) make up the input to the entropy pool from which the random numbers are generated. Various kinds of interrupts are used (e.g. intra-key timing from the keyboard, sometimes disk or network device intra-interrupt timing, and so on) and their values are mixed into the pool. As that is done, an estimate of how many bits of entropy are being contributed is added to the entropy count. That estimate is hopefully conservative enough that it underestimates the amount of true entropy in the pool, while trying not to make it impossible to generate a reasonable number of random bits in a reasonable time.
An even more conservative approach would be to add unpredictable data to the pool without crediting any entropy. That is already done with some sources in today's kernel, such as when adding unpredictable device-specific data using add_device_randomness(). There is value in adding "zero credit" (i.e. no entropy contributed) unpredictability to the pool. Any data that is added perturbs the state of the pool, which will change the values produced by /dev/urandom. In some situations, the same random numbers would be produced boot after boot if there were no unpredictable data added.
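The in-kernel interface for that is simple; the sketch below shows how a driver might feed device-specific data (a MAC address, say) into the pool. The surrounding function is hypothetical, but add_device_randomness() is the interface today's kernel provides:

    #include <linux/types.h>
    #include <linux/random.h>

    /* Hypothetical driver helper: mix a NIC's MAC address into the
     * entropy pool. The bytes perturb the pool state, but no entropy
     * is credited. */
    static void example_seed_from_mac(const u8 *mac, unsigned int len)
    {
        add_device_randomness(mac, len);
    }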
CPU randomness
"Zero credit" is the approach Jörn Engel took with his CPU randomness patch. It mixes uninitialized stack values with unpredictable values like jiffies into its own pool, then mixes that pool into the normal entropy pool periodically. It clearly adds unpredictability into the pool, but how much entropy it provides is hard or impossible to determine, so Engel gives it no entropy credit.
The patch gathers its info from two kernel locations: the scheduler and the slab allocator. It uses an uninitialized four-element array (input) on the stack and XORs various values into it to produce the input to the private pool. The values used are jiffies, the value of the time stamp counter (TSC), the address where the scheduler and allocator functions will return, a value specific to that invocation of the scheduler or allocator, and the address of the input array itself. It is similar in some ways to the gathering that is done for interrupts for the global pool. This collection and mixing is done quite frequently (whenever need_resched() or __do_kmalloc() are called), then the private pool is combined with the normal pool at most once per second.
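In rough outline, the gathering step might look like the sketch below. This is an illustration of the idea only, not Engel's actual code; the function and variable names are invented:

    /* Illustration: XOR cheap, hard-to-predict values over an
     * uninitialized stack array, then fold the result into a
     * private pool. */
    static u32 private_pool[128];
    static unsigned int pool_pos;

    static void gather_randomness(u32 cookie)  /* invocation-specific value */
    {
        u32 input[4];    /* deliberately left uninitialized */

        input[0] ^= jiffies;
        input[1] ^= (u32)get_cycles();        /* time stamp counter */
        input[2] ^= (u32)(unsigned long)__builtin_return_address(0);
        input[3] ^= cookie ^ (u32)(unsigned long)&input;

        /* a real implementation would use a proper mixing function */
        private_pool[pool_pos++ % 128] ^=
            input[0] ^ input[1] ^ input[2] ^ input[3];
    }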
Perhaps surprisingly, Engel's tests showed no measurable impact on kernel performance. For the private pool, he is using a custom mixing algorithm that is faster than the fast_mix() function used on the global pool, but provides worse mixing. The normal mixing path is used when the private pool is combined into the global pool.
Engel's focus is on "generating high-quality randomness as soon as possible and with low cost to the system"; in particular, he is targeting embedded systems.
While the values being used seem unpredictable, Ted Ts'o questioned whether an "attacker with deep knowledge of how the kernel was compiled and what memory allocations get done during the boot sequence" would be able to successfully predict some of the values. For many kernel deployments (e.g. distribution kernels), an attacker will be able to get that deep knowledge fairly easily. Ts'o thought Engel's approach had promise for improving the /dev/urandom output, but agreed with the approach of not crediting entropy (thus not affecting how much data is available from /dev/random).
CPU jitter
Another approach was suggested by Stephan Müller in his CPU Jitter random number generator (RNG) patch set. It was met with more skepticism, at least partly because it does add to the entropy count. Ts'o and others are not convinced that sufficiently knowledgeable attackers couldn't predict the output. Müller's reliance on statistical techniques in his paper to measure the entropy pool and RNG output is also a cause for some skepticism. But, according to Müller, the statistical measures are just a "necessary baseline" before he gets into "measuring the actual noise coming out of the noise sources".
Müller's method is to measure the jitter in the amount of time it takes the CPU to perform a set of operations. When entropy is needed, the driver repeatedly runs two "noise sources": a memory accessing routine that "will add to the timing variations due to an unknown amount of CPU wait states added when accessing memory" and a timestamp folding operation that is "deliberately inefficient", which requires the function to be built with no optimization (-O0). The folding operation turns a 64-bit timestamp into one bit that is XORed into the driver's entropy pool. The jitter in the timing of those two operations is also mixed into that entropy pool one bit at a time. Once the required entropy is available, random numbers derived from that are returned.
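The folding step itself is conceptually simple; a sketch of the arithmetic (the real version is compiled with -O0 precisely so that its execution time can serve as a noise source):

    /* Fold a 64-bit timestamp down to a single bit by XORing all of
     * its bits together. Illustration of the concept only. */
    static unsigned int fold_timestamp(unsigned long long ts)
    {
        unsigned int bit = 0;
        int i;

        for (i = 0; i < 64; i++)
            bit ^= (unsigned int)(ts >> i) & 1;
        return bit;
    }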
While the timing is unpredictable, due to a number of the factors Müller cites in his paper and patch set, it still amounts to a pseudo-random number generator (PRNG), according to H. Peter Anvin.
He goes on to say that independent clocks in a system would provide a source of quantum noise that could potentially be used to increase the entropy count, but that such clocks are rare on today's systems as clocks are typically slaved from the same source using phase-locked loops to keep them synchronized. Thus, using jitter (or Engel's CPU randomness) for mixing into the pool is reasonable, Anvin continued, but not for entropy credit.
It would be nice to assume that since there is no discernible pattern to the output, there must be an underlying entropy-adding event at play. But that is not enough for Ts'o, Anvin, and others to be convinced. Back in October, when the CPU Jitter RNG was first introduced, Ts'o replied at length to the patch to explain the problem he saw.
He also went on to describe ways that Müller could convince him that there is real random noise being generated:
If you think it is from DRAM timing, first try accessing the same memory location in kernel code with the interrupts off, over and over again, so that the memory is pinned into L1 cache. You should be able to get consistent results. If you can, then if you then try to read from DRAM with the L1 and L2 caches disabled, and with interrupts turned off, etc, and see if you get consistent results or inconsistent results. If you get consistent results in both cases, then your hypothesis is disproven. If you get consistent results with the memory pinned in L1 cache, and inconsistent results when the L1 and L2 cache are disabled, then maybe the timing of DRAM reads really are introducing entropy. But the point is you need to test each part of the system in isolation, so you can point at a specific part of the system and say, *that*'s where at least some uncertainty which an adversary can not reverse engineer, and here is the physical process from which the [chaotic] air patterns, or quantum effects, etc., which is hypothesized to cause the uncertainty.
Müller has done most or all of the testing Ts'o suggested, as reported in his paper. The results seem to bear out some kind of random noise in both the memory access and folding operations. But Anvin's opinion that the jitter in modern CPUs just represents a complicated PRNG seems to have held the day. Perhaps a further look at the testing results is in order.
The reliance of the jitter RNG on a high-resolution timer makes it unsuitable for Engel's embedded use case (as some of those systems lack such a timer), so it's not at all clear where things go from here. Ts'o is not opposed to adding something as a zero-entropy source to try to get better /dev/urandom numbers earlier in the boot. Since Engel's solution is both simpler and does not rely on a high-resolution timer, it may well get the nod.
Brief items
Security quotes of the week
Introducing ClamAV community signatures
ClamAV has announced a new "community signatures" program to gather signatures of new malware for use in the ClamAV anti-virus scanner. Malware samples can be submitted at http://www.clamav.net/lang/en/sendvirus/submit-malware/, while signatures can be emailed; a submitted signature must:

- not be a hash-based signature
- be accompanied by an MD5/SHA1/SHA256 hash for a sample the signature is meant to detect
- come with a brief description of the threat the signature is trying to detect and what the signature is looking for
New vulnerabilities
ffmpeg: multiple unspecified vulnerabilities
Package(s): ffmpeg    CVE #(s): (none specified)
Created: February 14, 2014    Updated: February 19, 2014
Description: From the Mageia advisory:
This update provides ffmpeg version 1.1.8, which fixes several unspecified security vulnerabilities and other bugs which were corrected upstream. The Mageia bug report gives a long list of CVEs, some of which *may* be fixed by this update, but the discussion makes it clear that no one knows which.
file: denial of service
Package(s): file    CVE #(s): CVE-2014-1943
Created: February 17, 2014    Updated: April 7, 2014
Description: From the Debian advisory:
It was discovered that file, a file type classification tool, contains a flaw in the handling of "indirect" magic rules in the libmagic library, which leads to an infinite recursion when trying to determine the file type of certain files. The Common Vulnerabilities and Exposures project ID CVE-2014-1943 has been assigned to identify this flaw. Additionally, other well-crafted files might result in long computation times (while using 100% CPU) and overlong results.
gnutls: certificate verification error
Package(s): gnutls    CVE #(s): CVE-2014-1959
Created: February 17, 2014    Updated: February 26, 2014
Description: From the Mageia advisory:
Suman Jana reported a vulnerability that affects the certificate verification functions of gnutls 3.1.x and gnutls 3.2.x. A version 1 intermediate certificate will be considered as a CA certificate by default (something that deviates from the documented behavior).
imapsync: TLS botch
Package(s): imapsync    CVE #(s): CVE-2014-2014
Created: February 14, 2014    Updated: October 28, 2014
Description: A bit of info from the Fedora advisory:
Bug fix: Check if going to tls is ok, exit otherwise with explicit error message. Thanks to Dennis Schridde for reporting this ugly bug that deserves a CVE. See this oss-security list email for additional information.
kernel: multiple vulnerabilities
Package(s): kernel    CVE #(s): CVE-2014-0069 CVE-2014-1874
Created: February 18, 2014    Updated: March 28, 2014
Description: From the Red Hat bugzilla [1, 2]:
A Linux kernel built with NSA SELinux support (CONFIG_SECURITY_SELINUX) is vulnerable to a crash caused by an empty SELinux security context value. If a file has an empty security context, listing it via 'ls(1)' could trigger this crash. Only users/processes with CAP_MAC_ADMIN privileges are allowed to set the SELinux security context of a file. A user/process with CAP_MAC_ADMIN privileges could use this flaw to crash the kernel, resulting in a DoS. (CVE-2014-1874)

A flaw was found in the way cifs handled iovecs with bogus pointers userland passed down via writev() during uncached writes. An unprivileged local user with access to a cifs share could use this flaw to crash the system or leak kernel memory. Privilege escalation cannot be ruled out (since memory corruption is involved), but is unlikely. (CVE-2014-0069)
libtar: directory traversal
Package(s): libtar    CVE #(s): CVE-2013-4420
Created: February 19, 2014    Updated: February 26, 2014
Description: From the Debian advisory:
A directory traversal attack was reported against libtar, a C library for manipulating tar archives. The application does not validate the filenames inside the tar archive, allowing to extract files in arbitrary path. An attacker can craft a tar file to override files beyond the tar_extract_glob and tar_extract_all prefix parameter.
lxc: privilege escalation
Package(s): lxc    CVE #(s): CVE-2013-6441
Created: February 13, 2014    Updated: February 27, 2014
Description: From the Ubuntu advisory:
Florian Sagar discovered that the LXC sshd template set incorrect mount permissions. An attacker could possibly use this flaw to cause privilege escalation on the host.
maas: two vulnerabilities
Package(s): maas    CVE #(s): CVE-2013-1070 CVE-2013-1069
Created: February 14, 2014    Updated: February 19, 2014
Description: From the Ubuntu advisory:
James Troup discovered that MAAS stored RabbitMQ authentication credentials in a world-readable file. A local authenticated user could read this password and potentially gain privileges of other user accounts. This update restricts the file permissions to prevent unintended access. (CVE-2013-1070)

Chris Glass discovered that the MAAS API was vulnerable to cross-site scripting vulnerabilities. With cross-site scripting vulnerabilities, if a user were tricked into viewing a specially crafted page, a remote attacker could exploit this to modify the contents, or steal confidential data, within the same domain. (CVE-2013-1069)
maradns: denial of service
Package(s): maradns    CVE #(s): CVE-2014-2031 CVE-2014-2032
Created: February 14, 2014    Updated: April 3, 2014
Description: From the Fedora advisory:
There has been a long-standing bug in Deadwood (ever since 2007) where bounds checking for strings was not correctly done under some circumstances. Because of this, it has been possible to send Deadwood a "packet of death" which will crash Deadwood. Since the attack causes out-of-bounds memory to be read, but not written to, the impact of the bug is denial of service. It appears this attack can only be exploited by an IP with permission to perform recursive queries against Deadwood. Note that this bug only affects users of the Deadwood recursive resolver.
mongodb: denial of service
Package(s): mongodb    CVE #(s): CVE-2012-6619
Created: February 18, 2014    Updated: April 29, 2014
Description: From the Mageia advisory:
A possible DoS issue was discovered in MongoDB. See the MongoDB advisory for more information.
mpg123: denial of service
Package(s): mpg123    CVE #(s): CVE-2014-9497
Created: February 14, 2014    Updated: October 17, 2016
Description: From the Mageia advisory:
mpg123 1.14.1 and later are vulnerable to a buffer overflow that could allow a maliciously crafted audio file to crash applications that use the libmpg123 library.
mysql: code execution
Package(s): mysql    CVE #(s): CVE-2014-0001
Created: February 13, 2014    Updated: June 9, 2014
Description: From the Red Hat advisory:
A buffer overflow flaw was found in the way the MySQL command line client tool (mysql) processed excessively long version strings. If a user connected to a malicious MySQL server via the mysql client, the server could use this flaw to crash the mysql client or, potentially, execute arbitrary code as the user running the mysql client. (CVE-2014-0001)
numpy: insecure temp files
Package(s): numpy    CVE #(s): CVE-2014-1858 CVE-2014-1859
Created: February 17, 2014    Updated: March 29, 2015
Description: From the Red Hat bugzilla:
Jakub Wilk found that f2py insecurely used a temporary file. A local attacker could use this flaw to perform a symbolic link attack to modify an arbitrary file accessible to the user running f2py.
openswan: denial of service
Package(s): openswan    CVE #(s): CVE-2013-6466
Created: February 19, 2014    Updated: November 24, 2014
Description: From the Red Hat advisory:
A NULL pointer dereference flaw was discovered in the way Openswan's IKE daemon processed IKEv2 payloads. A remote attacker could send specially crafted IKEv2 payloads that, when processed, would lead to a denial of service (daemon crash), possibly causing existing VPN connections to be dropped.
perl-Capture-Tiny: insecure tmpfile use
Package(s): perl-Capture-Tiny    CVE #(s): CVE-2014-1875
Created: February 14, 2014    Updated: February 24, 2014
Description: From the Mageia advisory:
perl-Capture-Tiny before 0.24 used files in /tmp in an insecure manner (CVE-2014-1875).
php: denial of service
Package(s): php    CVE #(s): CVE-2013-7226
Created: February 13, 2014    Updated: March 4, 2014
Description: Inadequate checking of the arguments to the imagecrop() function can lead to a heap overflow; see the PHP bug entry for lots more info.
piranha: access restriction bypass
Package(s): piranha    CVE #(s): CVE-2013-6492
Created: February 14, 2014    Updated: February 19, 2014
Description: From the Red Hat advisory:
It was discovered that the Piranha Configuration Tool did not properly restrict access to its web pages. A remote attacker able to connect to the Piranha Configuration Tool web server port could use this flaw to read or modify the LVS configuration without providing valid administrative credentials. (CVE-2013-6492)
python: code execution
Package(s): python    CVE #(s): CVE-2014-1912
Created: February 14, 2014    Updated: April 14, 2014
Description: From the Red Hat bugzilla entry:
A vulnerability was reported in Python's socket module, due to a boundary error within the sock_recvfrom_into() function, which could be exploited to cause a buffer overflow. This could be used to crash a Python application that uses the socket.recvfrom_into() function or, possibly, execute arbitrary code with the permissions of the user running vulnerable Python code.
xen: information leak
Package(s): xen    CVE #(s): CVE-2014-1895
Created: February 17, 2014    Updated: February 25, 2014
Description: From the Red Hat bugzilla:
The FLASK_AVC_CACHESTAT hypercall, which provides access to per-cpu statistics on the Flask security policy, incorrectly validates the CPU for which statistics are being requested. An attacker can cause the hypervisor to read past the end of an array. This may result in either a host crash, leading to a denial of service, or access to a small and static region of hypervisor memory, leading to an information leak.
zarafa: denial of service
Package(s): zarafa    CVE #(s): CVE-2014-0037 CVE-2014-0079
Created: February 17, 2014    Updated: March 3, 2014
Description: From the Red Hat bugzilla [1, 2]:
Robert Scheck discovered a flaw in Zarafa that could allow a remote unauthenticated attacker to crash the zarafa-server daemon, preventing access to any other legitimate Zarafa users. (CVE-2014-0037)

Robert Scheck discovered another flaw in Zarafa that could allow a remote unauthenticated attacker to crash the zarafa-server daemon, preventing access to any other legitimate Zarafa users. The issue affects all Zarafa versions from at least Zarafa 6.20.0 (maybe even from at least Zarafa 5.00) up to (including) Zarafa 7.1.8 final. Please note that I was not able to crash the zarafa-server daemon using official upstream Zarafa binary packages, just all community builds from the source code such as shipped in Fedora. This different behaviour might be caused by upstream using different build-time flags or other system libraries and headers. (CVE-2014-0079)
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel is 3.14-rc3, released on February 16. Linus is starting to get grumpy: "When I made the rc2 announcement, I mentioned how nice and small it was. I also mentioned that I mistrusted you guys, and that I suspected that some people were giggling to themselves and holding back their pull requests, evil little creatures like you are. And I hate being right." One assumes that the subsystem maintainers, having been warned, will be careful about what they send for the rest of the development cycle.
Stable updates: 3.13.3, 3.12.11, 3.10.30, and 3.4.80 were all released on February 13. The 3.13.4, 3.12.12, 3.10.31, and 3.4.81 updates are in the review process as of this writing; they can be expected on or after February 20.
Quotes of the week
/me digs in git...
Ah, it looks like I might be...
Ok, let me go review this.
Kernel development news
File-private POSIX locks
File locks are heavily used by applications such as databases and file servers. Whenever you have multiple programs accessing files at the same time there is always the potential for data corruption or other bugs unless that access is carefully synchronized. File locks solve that problem, but the existing implementation can be difficult to use, especially for multi-threaded programs. File-private POSIX locks are an attempt to take elements of both BSD-style and POSIX locks and combine them into a more threading-friendly file locking API.
Multiple writers attempting to change a file at the same time can clobber each other's changes. In addition, an update to a file may need to be done in more than one place. If another thread of execution sees only part of the update, it may trigger bugs in the program.
File locks are generally available in two flavors: read (also known as shared) locks and write (also known as exclusive) locks. Multiple read locks can be given out for a portion of a file, but only one write lock can be handed out at any given time, and only if no other read or write lock for that region has been set. While file locks on some operating systems are mandatory, on Unix-like systems locking is generally advisory. Advisory locks are like stoplights — they only work if everyone pays attention to them.
One of the primary mechanisms for file locking is the one specified by the POSIX standard, which allows a program to lock arbitrary byte ranges in a file for read or write. Unfortunately, these locks have a couple of serious problems that make them unsuitable for use by modern applications.
The problems with POSIX locking
Whenever a program attempts to acquire a lock, that lock is either granted or denied based on whether there is already a conflicting lock set over the given range. If no conflicting lock is present, the lock will be granted; if there is one, the request will be denied.
Classic POSIX lock requests from the same process never conflict with one another. When a request for a lock comes in that would conflict with an existing lock that the process set previously, the kernel treats it as a request to modify the existing lock. Thus, classic POSIX locks are useless for synchronizing threads within the same process; given the prevalence of threaded applications in modern computing, that is a severe limitation.
More troublingly, the standard states that all locks held by a process are dropped any time the process closes any file descriptor that corresponds to the locked file, even if those locks were made using a still-open file descriptor. It is this detail that catches most programmers by surprise, as it requires that a program take extra care not to close a file descriptor until it is certain that the locks held on that file can safely be dropped.
That's not always a simple question to answer. If a program opens two different links of a hardlinked file, takes a lock on one file descriptor and then closes the other, that lock is implicitly dropped even though the file descriptor on which the lock was originally acquired remains open.
This is a particular problem for applications that use complex libraries that do file access. It's common to have a library routine that opens a file, reads or writes to it, and then closes it again, without the calling application ever being aware that has occurred. If the application happens to be holding a lock on the file when that occurs, it can lose that lock without ever being aware of it. That sort of behavior can lead to silent data corruption, and loss of developer sanity. Jeremy Allison has an excellent writeup of this problem and of how such a broken standard came into being (see the section entitled "First Implementation Past the Post").
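The pitfall is easy to demonstrate. The sketch below uses a hypothetical file name and stands in for the library-call scenario just described, but the semantics are exactly as the standard specifies: closing the second descriptor silently drops the lock taken on the first.

    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        struct flock lck = {
            .l_type = F_WRLCK,
            .l_whence = SEEK_SET,
            .l_start = 0,
            .l_len = 0,        /* zero length means "to EOF" */
        };
        int fd = open("/tmp/data", O_RDWR|O_CREAT, 0600);
        int fd2;

        fcntl(fd, F_SETLKW, &lck);    /* lock acquired on fd */

        /* what a library routine might do behind the caller's back... */
        fd2 = open("/tmp/data", O_RDONLY);
        close(fd2);                   /* ...and the lock on fd is now gone */

        close(fd);
        return 0;
    }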
There is, however, another competing (or complementary) file locking standard that has its roots in BSD Unix. These locks (which are manipulated via the flock() system call) have saner semantics. Whereas POSIX locks are owned by the process, BSD locks are owned by the open file. If a process opens a file twice and tries to set exclusive locks on both, the second one will be denied. Thus, BSD locks are usable as a synchronization mechanism between threads as long as each thread has its own separately opened file. Note that cloning a file descriptor with dup() is not sufficient, since that simply takes a reference to the same opened file.
Also, BSD locks are only released when the last reference to the open file on which they were acquired is closed. Thus if a program opens a file, takes a lock on it and then uses dup() to duplicate the file descriptor, the lock will only be released automatically when both file descriptors are closed.
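A brief sketch of those semantics (the file name is arbitrary):

    #include <sys/file.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/data", O_RDWR|O_CREAT, 0600);
        int dupfd;

        flock(fd, LOCK_EX);     /* exclusive whole-file lock */
        dupfd = dup(fd);        /* second reference to the same open file */
        close(fd);              /* lock is still held via dupfd */
        flock(dupfd, LOCK_UN);  /* explicit unlock... */
        close(dupfd);           /* ...or it would be dropped here */
        return 0;
    }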
The only real problem with BSD locks is that they are whole-file locks. POSIX locks, on the other hand, can operate on arbitrary byte ranges within a file. While whole-file locks are useful (and indeed, many applications just lock entire files even with POSIX locks), they are not sufficient for many cases. Applications such as databases need granular locking in order to allow for better parallelism.
File-private POSIX locks
I will assert that what is needed is a new type of lock that is a hybrid of the two — a byte-range lock that has BSD-like semantics for inheritance across fork() and on close(). Furthermore, since there is a large legacy codebase of programs that use "classic" POSIX locks, these new locks need to be aware of the classic locks so that programs using the new locks will interoperate correctly with those applications.
Classic POSIX locks are manipulated using a set of command values passed to the fcntl() system call:
- F_GETLK - test whether a lock is able to be applied
- F_SETLK - attempt to set a lock. Return error if unable to do so
- F_SETLKW - attempt to set a lock and block until able to do so
These commands are accompanied by a pointer to a binary struct flock argument that looks something like this:
struct flock {
short int l_type; /* Type of lock: F_RDLCK, F_WRLCK, or F_UNLCK. */
short int l_whence; /* Where `l_start' is relative to (like `lseek'). */
off_t l_start; /* Offset where the lock begins. */
off_t l_len; /* Size of the locked area; zero means until EOF. */
pid_t l_pid; /* Process holding the lock. (F_GETLK only) */
};
File-private POSIX locks are manipulated with a similar set of commands, this time suffixed with 'P':
- F_GETLKP - test whether a lock is able to be applied
- F_SETLKP - attempt to set a file-private lock
- F_SETLKPW - attempt to set a file-private lock and block until able to do so
The new commands should look very familiar to those used to working with classic POSIX locks and they take the same struct flock argument. The only real difference between file-private and classic POSIX locks is their "ownership". Classic POSIX locks are owned by the process whereas file-private POSIX locks are owned by the opened file.
Using file-private POSIX locks
It is currently necessary to define the _GNU_SOURCE preprocessor macro in order to get the new command definitions, as file-private locks are not yet part of POSIX. Using file-private locks is very similar to using classic POSIX locks. In many cases, one can simply replace the command value with the file-private equivalent. There are subtle differences, however.
Since one of the most troublesome aspects of classic POSIX locks is their behavior on close, there should be no surprise that file-private locks behave differently. File-private locks are only released automatically when the last reference to the open file is released.
It's tempting to then consider file-private locks to be "owned" by the file descriptor, but that's not technically true. If a file descriptor is cloned via dup(), the kernel will simply take an extra reference to the open file and assign it to a new slot in the open file descriptor table. File-private locks set on a cloned file descriptor will not conflict with locks set on the original file descriptor. The kernel will treat such a lock request as a request to modify the existing lock. Furthermore, file-private locks set using either file descriptor would only be released automatically once both file descriptors are closed, though one can always release a lock manually with an F_UNLCK request.
Interaction across fork() is very similar. When fork() is called, the kernel takes an extra reference to each open file and assigns it to the same slot in the new process's file descriptor table. Locks set by either process on the same open file would not conflict with one another, and would only be automatically released once both processes have closed it.
Classic and file-private locks will always conflict with one another, even when used in the same process and/or on the same file descriptor. I don't expect that many programs will mix the two, but given the pain that undefined behaviors can cause I think it's prudent to declare that explicitly.
Whither F_GETLK?
F_GETLK would probably have been better named F_TESTLK. While it does technically fetch the status of an existing lock on a range, its real purpose is to allow one to test whether a given lock request could be set without actually setting it. If there happens to be a conflicting lock already set within that range, the kernel will overwrite the struct flock with information about that lock and set l_pid to the PID of the process that owns it.
The l_pid field is a bit of a dilemma for file-private locks. File-private locks are not owned by processes. A file descriptor could have been inherited across a fork(), so the l_pid value is somewhat meaningless if the conflicting lock is a file-private one. Still, when a program using classic POSIX locks calls F_GETLK, we do need to put something in the l_pid field. That something is -1.
This precedent comes from BSD. On Linux, POSIX and BSD locks operate in completely different namespaces. On BSD, however, they operate in the same namespace and thus conflict with each other. If a program holds a BSD lock on a file and another makes an F_GETLK request against it, the BSD kernel will set l_pid to -1. Since portable programs already need to contend with that behavior, using it for file-private locks as well seems like a reasonable choice.
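In practice, a classic-lock application might probe a range for conflicts with something like the following sketch (error handling omitted), treating -1 as "a conflicting lock with no single owning process":
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int
main(void)
{
    struct flock lck = {
        .l_type   = F_WRLCK,        /* the lock we would like to set */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 1,
    };
    int fd = open("/tmp/foo", O_RDWR|O_CREAT, 0666);

    fcntl(fd, F_GETLK, &lck);       /* test only; nothing is locked */
    if (lck.l_type == F_UNLCK)
        printf("no conflict; the lock could be applied\n");
    else if (lck.l_pid == -1)
        printf("conflicting lock with no single owning process\n");
    else
        printf("conflicting lock held by pid %ld\n", (long)lck.l_pid);
    close(fd);
    return 0;
}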
Using file-private locks with threads
It's common for modern applications to use threads, rather than forked processes, to create new threads of execution. This is problematic with classic POSIX locks: they are associated with a process, so locks acquired by threads within the same process cannot conflict with one another.
With file-private locks, however, we can circumvent that restriction by giving each thread its own open file. Here's an example (note that proper error handling has been left out for the sake of brevity):
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>

#define FILENAME "/tmp/foo"
#define NUM_THREADS 3
#define ITERATIONS 5

void *
thread_start(void *arg)
{
    int i, fd, len;
    long tid = (long)arg;
    char buf[256];
    struct flock lck = {
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 1,
    };

    /* each thread gets its own open file, and thus its own locks */
    fd = open(FILENAME, O_RDWR|O_CREAT, 0666);
    for (i = 0; i < ITERATIONS; i++) {
        lck.l_type = F_WRLCK;
        fcntl(fd, F_SETLKPW, &lck);     /* block until we own the lock */
        len = sprintf(buf, "%d: tid=%ld fd=%d\n", i, tid, fd);
        lseek(fd, 0, SEEK_END);
        write(fd, buf, len);
        fsync(fd);
        lck.l_type = F_UNLCK;
        fcntl(fd, F_SETLKP, &lck);      /* release the lock */
        usleep(1);
    }
    pthread_exit(NULL);
}

int
main(int argc, char **argv)
{
    long i;
    pthread_t threads[NUM_THREADS];

    truncate(FILENAME, 0);              /* start with an empty file */
    for (i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, thread_start, (void *)i);
    pthread_exit(NULL);                 /* let the threads finish */
    return 0;
}
This example spawns three threads, each of which does five iterations of appending to a file; access to that file is serialized via file-private locks. If we compile and run the program, we end up with a /tmp/foo containing 15 lines.
If, however, we were to replace the F_SETLKP and F_SETLKPW commands with their classic POSIX equivalents, the locking becomes essentially a no-op, since it is all done within the context of the same process. The result is data corruption (missing lines) as threads race in and overwrite each other's data.
Conclusion
File-private locks can solve many of the problems experienced with classic POSIX locks, but programmers intending to use them should take heed of the differences.
Developers from several projects, including Samba, NFS Ganesha, SQLite, and OpenJDK, have expressed interest in using file-private locks, since they simplify the code for many of their use cases and help eliminate the data-corruption issues that can occur when files are closed.
The kernel patchset is available in the linux-kernel mailing list posting, or via the linux-next branch of my git tree; I plan to keep that branch updated with the latest version until it gets merged into mainline. The patches are also being pulled into the linux-next tree, so anyone running linux-next kernels can use these locks now. There is also a (fairly trivial) GNU C library (glibc) patchset that implements the definitions needed to access these locks.
I'm currently aiming to have the kernel patches merged into mainline in v3.15, and the glibc patches adding the new command definitions, along with an update to the glibc manual, should hopefully be merged soon afterward. Assuming those patches are merged, I also intend to submit an update to the POSIX specification to make these locks a formal part of POSIX; I have opened a request to have it considered. There have already been some helpful suggestions, and the Austin Group (which oversees POSIX) seems receptive to the general idea.
Hopefully other operating systems will follow suit and implement these as well so that programmers dealing with those platforms can reap the same benefits.
C11 atomic variables and the kernel
The C11 standard added a number of new features for the C and C++ languages. One of those features — built-in atomic types — seems like it would naturally be of interest to the kernel development community; for the first time, the language standard tries to address concurrent access to data on contemporary hardware. But, as recent discussions show, it may be a while before C11 atomics are ready for use with the kernel — if they ever are — and the kernel community may not feel any great need to switch.
The kernel provides a small set of atomic types now, along with a set of operations to manipulate those types. Kernel atomics are a useful way of dealing with simple quantities in an atomic manner without the need for explicit locking in the code. C11 atomics should be useful for the implementation of the kernel's atomic types, but their scope goes beyond that application.
In particular, each access to a C11 atomic variable has an explicit "memory model" associated with it. Memory models describe how accesses to memory can be optimized by the processor or the compiler; the more relaxed models can allow operations to be reordered or combined for improved performance. The default model ("sequentially consistent") is the strictest; it does not allow any combining or reordering of operations in any way that would be visible anywhere else in the program. The problem with this model is that it is quite expensive, and, most of the time, that expense does not need to be incurred for correct operation. The more relaxed models exist to allow for optimizations to be performed in a controlled manner while ensuring correct ordering when needed.
Thus, C11 atomic variables include features that, in the kernel, are usually implemented with memory barriers. So, for example, in current kernel code, one could see something like:
smp_store_release(&x, new_value);
The smp_store_release() barrier (described in more detail in this article) tells the processor to ensure that any reads or writes executed before this assignment are visible on all processors before the assignment to x becomes visible. Reordering of operations that all occur before this barrier is still possible, as is the reordering of operations that all occur afterward. In most code, quite a bit of reordering can take place without affecting the correctness of the result. The use of explicit barriers in places where ordering does matter enables most accesses to be performed without barriers, enabling optimization and improving performance significantly.
If, instead, x were a C11 atomic type, one might write:
atomic_store(&x, new_value, memory_order_release);
where memory_order_release specifies the same ordering requirements as smp_store_release() (see this page for a description of the C11 memory models).
If the memory_order_relaxed model (which imposes no ordering requirements on the access) is used for surrounding accesses to other atomic variables where ordering is not important, the end result should be similar to that achieved with smp_store_release(). But the former version is implemented with tricky, architecture-specific code within the kernel; the latter version, instead, causes the desired code to be emitted directly by the compiler.
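For comparison, here is a minimal user-space sketch of the same release/acquire pairing using the <stdatomic.h> interface, where the three-argument forms are spelled atomic_store_explicit() and atomic_load_explicit(); the variable names are invented for illustration:
#include <stdatomic.h>
#include <stdbool.h>

int data;                           /* ordinary shared data */
atomic_bool ready = false;          /* publication flag */

void
producer(void)
{
    data = 42;                      /* plain store */
    /* release: the store to data must be visible to any thread
       that sees ready == true */
    atomic_store_explicit(&ready, true, memory_order_release);
}

int
consumer(void)
{
    /* acquire: pairs with the release store in producer() */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;                           /* spin until published */
    return data;                    /* guaranteed to read 42 */
}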
When the kernel first gained support for multiprocessor systems, the C language had no concept of atomic types or memory barriers, so the kernel developers naturally had to create their own. Now that the language standard has caught up, one might think that changing the kernel to make use of the standard atomic types would make sense. And, someday, it might, but that transition is likely to be slow and fitful at best.
Optimization worries
The problem is that compilers tend to be judged on the speed of the code they generate, so compiler developers have a strong incentive to optimize code to the greatest extent possible. Sometimes those optimizations can break code that is not written with an attentive eye toward the standard; from the kernel developers' perspective, compiler writers often rely on a legalistic reading of the standard to justify "optimizations" that make no sense and break code needlessly. Highly concurrent code, as is found in the kernel, tends to be more susceptible to optimization-caused problems than just about anything else. So kernel developers have learned to be careful.
One of the scariest potential problems is "speculative stores," where an incorrect value becomes visible on a temporary basis. A classic example would be code like this:
if (x)
    y = 1;
else
    y = 2;
It would not be uncommon for a compiler to optimize this code by turning it into something like this:
y = 2;
if (x)
    y = 1;
For sequential code operating in its own address space, the end result is the same, and the latter version avoids one jump. But if y is visible elsewhere, the value stored speculatively before the test may be seen by code that will proceed to do the wrong thing, causing things to go off the rails. Clearly, optimizations that cause incorrect values to become visible to any running thread must be avoided if the system is to run correctly.
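To see why, consider a hypothetical second thread observing y while the transformed code runs (a sketch; the function name is invented for illustration):
/* Thread 1, as transformed by the compiler, running with x nonzero: */
y = 2;          /* speculative store: a value the original program
                   would never produce here is briefly visible */
if (x)
    y = 1;

/* Thread 2, running concurrently: */
if (y == 2)
    do_the_wrong_thing();   /* can fire, even though the original
                               code never leaves y == 2 when x is set */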
When David Howells recently suggested that C11 atomic variables could be used in the kernel, speculative stores were one of the first concerns to be raised. The behavior of atomic variables as described by the standard is complex, to put it lightly, and there were real worries that the standard could allow compilers to generate speculative writes. An extensive and sometimes colorful discussion put most of those concerns to rest, but Paul McKenney, who has been representing the kernel's interests within the standard committee, is still not completely sure.
Another area of concern is control dependencies: situations where atomic variables and control flow interact. Consider a simple bit of code:
x = atomic_load(&a, memory_order_relaxed);
if (x)
    atomic_store(&y, 42, memory_order_relaxed);
The setting of y has a control dependency on the value of x. But the C11 standard does not currently address control dependencies at all, meaning that the compiler or processor could play with the order of the two atomic operations, or even try to optimize the branch out altogether; see this explanation from GCC developer Torvald Riegel for details. Again, the results of this kind of optimization in the kernel context could be disastrous.
For cases like this, Paul suggested that some additional source-code markup and a new memory_order_control memory model could be used in the kernel to make the control dependency explicit:
x = atomic_load(&a, memory_order_control);
if (control_dependency(x))
    atomic_store(&b, 42, memory_order_relaxed);
But this approach is unlikely to be taken, given just how unhappy Linus was with the idea. From his point of view, the control dependency should be obvious — the code is testing the value of x, after all. Any compiler that would move the atomic_store() operation in an externally visible way, he said, is simply broken.
There has also been some concern about "value speculation," wherein the compiler guesses that a variable will have a specific value and inserts a branch to fix things up if the guess is wrong. The processor's branch prediction hardware will then, hopefully, speed things up in cases where the guess is correct. See this note from Paul for an example of how value speculation might work — and how it might get things wrong. The good news on this front is that it seems that this kind of speculation will not be allowed. But it is not 100% clear that the current standard forbids it in all cases.
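Roughly, and loosely following Paul's example (guessed_p and the surrounding details are invented for illustration, and much is simplified), a value-speculating transformation could look like:
/* What the programmer wrote: a load that must follow the load
   of the pointer, by way of the data dependency: */
p = atomic_load(&head, memory_order_relaxed);
x = p->data;

/* What a value-speculating compiler might emit: */
tmp = guessed_p;    /* the compiler's guess at head's value */
x = tmp->data;      /* dependent load issued early */
p = atomic_load(&head, memory_order_relaxed);
if (p != tmp)       /* fix things up if the guess was wrong */
    x = p->data;
/* When the guess is right, the load of x can effectively precede
   the load of head, defeating the ordering the programmer
   expected from the data dependency. */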
Non-local optimizations considered harmful
Yet another concern is global optimization. Compiler developers are increasingly trying to optimize programs at the level of entire source files, or even larger groups of files. This kind of optimization can work well as long as the compiler truly understands how variables are used. But the compiler is not required to understand the real hardware that the program is running on; it is, instead, required to prove its decisions against a virtual machine defined by the standard. If the real computer behaves in ways that differ from the virtual machine, things can go wrong.
Consider this example raised by Linus: the compiler might look at how the kernel accesses page table entries and notice that no code ever sets the "page dirty" bit. It might then conclude that any tests against that bit could simply be optimized out. But that bit can change; it's just that the hardware makes the change, not the kernel code. So any optimizations made based on the notion that the compiler can "prove" that bit will never be set will lead to bad things. Linus concluded: "Any optimization that tries to prove anything from more than local state is by definition broken, because it assumes that everything is described by the program."
Paul sent out a list of other situations where the compiler's virtual machine model might not match what is really happening. His examples included assembly code, kernel modules (which can access exported symbols, but which might not even exist when the compiler is making its decisions), kernel-space memory mapped into user space, JIT-compiled BPF code, and "probably other stuff as well". In short, there is a lot going on inside a kernel that the compiler cannot be expected to know about.
One solution to many of these non-local problems is to use volatile with the affected variables. Simply identifying such variables would be an error-prone exercise, of course, but there is a worse problem: using volatile turns off all optimization for the affected variable, defeating the purpose of using atomic variables in the first place. If volatile must be used, the kernel is better off staying with its current memory barrier scheme, which is designed to allow as much compiler- and processor-level optimization as possible, but no more than that.
Will it come to that? Despite his worries, Linus has actually expressed some confidence that real-world compilers will not break things badly.
But he has also been clear that his trust of compiler developers only goes so far and that, if necessary, the kernel community is more than prepared to stick with its current approach, which, he said, is "generally *fine*".
Does the kernel need C11 atomics?
Linus went on to make it clear that he is serious about this; if atomic variables as found in the C11 standard and its implementations do not provide what the kernel wants, the kernel will simply not use that feature. The kernel project, he said, is in a fairly strong bargaining position when it comes to atomic variables:
If we don't buy it, they have no serious user. Sure, they'll have lots of random other one-off users for their atomics, where each user wants one particular thing, but I suspect that we'll have the only really unified portable code base that handles pretty much *all* the serious odd cases that the C11 atomics can actually talk about to each other.
On the other hand, he said, the solutions found in the kernel now work just fine; there is no real need to move away from them if the kernel community does not want to.
In truth, there may well be other serious users; the GNU C library is using C11 atomics now for a few architectures, for example. And, while Torvald agreed that the kernel could continue to use its own solution, he also pointed out that there would be some advantages to using the standard mechanism. The widespread testing that this mechanism will receive was at the top of his list. One could also note that the kernel's tricky, architecture-specific barrier code could conceivably go away, replaced by more widely used code maintained by the compiler developers. That code would also, hopefully, be less likely to break when new releases of the compiler come out.
Beyond that, Torvald pointed out, C11 atomics can benefit from a fair amount of academic work that has been done. Some researchers at the University of Cambridge have come up with a formal description [PDF] of how C11 concurrency should work. Associated with that description is an interactive memory model simulator that can test code snippets for race conditions. And, in the end, if a large number of programs make use of C11 atomics, that should result in the quality of compiler implementations improving quickly.
Finally, if C11 atomic variables can be made to work in real-world programs, they could go a long way toward the establishment of reliable patterns for how C (and C++) can be used in concurrent environments. At the moment, there is no way for developers to know what is safe to do — now, and in the future. As Peter Sewell (one of the above-mentioned Cambridge researchers) put it, what developers need is a well-defined "envelope" of allowed compiler behavior within which they can work.
The C11 standard is meant to be that envelope, though, as Peter admitted, it is "not yet fully up to that task". But if the remaining uncertainties and problems can be addressed, C11 atomics could become a common language with which developers can reason about concurrency and allowable optimizations. Developers might come to understand the issues better, and kernel code might become a bit more widely accessible to developers who understand the standard.
So it might well benefit the kernel to make use of this relatively new language feature. Nobody has closed the door on that possibility, but any transition in that direction will require a lot of time, testing, and confidence building. Bugs resulting from low-level concurrency management problems can be among the hardest to find, reproduce, or diagnose; nobody will be in a hurry to replace the kernel's atomics and memory barriers without a high level of assurance that the change will not result in the introduction of that kind of issue.
Patches and updates
Kernel trees
Architecture-specific
Build system
Core kernel code
Device drivers
Documentation
Filesystems and block I/O
Security-related
Virtualization and containers
Page editor: Jonathan Corbet
Distributions
Fedora.next working groups outline their products
The Fedora project's Fedora.next initiative is a new approach to the construction of the distribution. In late 2013, the Fedora Board proposed assembling three "working groups," each of which would craft its own set of requirements for a distinct Fedora final product: workstation, server, and cloud. The three products would share a common core, but could more easily differentiate themselves in packaging and configuration decisions so that—hopefully—the three product classes would be better served than they are by trying to build a single, monolithic product that meets all needs simultaneously. The working groups have now submitted their Product Requirements Documents (PRDs), so a clearer picture has begun to emerge about what the three products will provide.
Bill Nottingham, writing on behalf of the Fedora Engineering Steering Committee (FESCo), noted on February 4 that all three PRDs had been submitted and accepted by FESCo. They are available on the Fedora wiki (Workstation, Server, and Cloud), as is a PRD from the "Environment and Stacks" working group that lays out an intermediate platform layer on which the three top-level Fedora products will build. The plan also involves a "Base" layer beneath Environment and Stacks, but so far the contents of Base do not seem to be the subject of much debate; presumably they are the well-established core services already used in Fedora.
The three product PRDs share a few common factors, such as specifying target users (and/or use cases) and outlining what further work remains in order to build the final product. But they differ significantly in other respects. For example, the Workstation and Server PRDs do not address specific components or packages to be included, while the Cloud PRD goes into considerable detail about supporting specific application frameworks and orchestration projects (such as Rails, Sinatra, Django, OpenStack, and Eucalyptus).
Workstation
The Workstation PRD is the shortest document of the three, perhaps reflecting Fedora's long history of developing a workstation-friendly distribution. The "Target Audience" section is interesting, if only because the target is explicitly software developers. Four use cases are spelled out, including computer science students and developers working at different sizes of companies. The PRD says: "While the developer workstation is the main target of this system and what we try to design this for, we do of course also welcome other users to the Fedora Workstation."
Targeting software developers as the primary user segment is a controversial choice to some project members; there was a lengthy debate on that subject in late November 2013. Some felt, for example, that explicitly restricting the target audience to developers was a major change, and that Fedora had historically sought to be a desktop platform for a broad assortment of users. Others were not convinced that "developers" was a suitable target audience at all.
The PRD also spells out a set of technical goals. Many, like "robust upgrades," require little explanation or justification. But there are some interesting choices in the list. For instance, the list includes support for container-based application installation—meaning self-contained applications that can be installed (perhaps on a per-user basis), updated, and removed much like mobile apps are on smartphone and tablet systems. That concept was the subject of considerable discussion at GUADEC 2013, so it is interesting to see Fedora adopt it as a goal outright.
The goals also include the ability to upgrade the OS's own packages without disrupting software development—that is, development environments should be isolatable from the base system, so that a program developed on a Fedora Workstation can target something other than the latest Fedora Workstation itself. If it were impossible to upgrade Fedora's cURL package while still working on code that needs an older cURL release, developers would not be happy. The goals do not specify particular components, such as whether or not GNOME would be the default desktop environment (a choice which was also the subject of considerable debate on the mailing list).
Otherwise, however, the PRD mostly notes that there are few changes anticipated from the way Fedora is currently developed: the same package formats, repositories, and ISO images that serve Fedora today should work fine on the Workstation product in the future.
Server
Like the Workstation document, the Server PRD does not describe a drastic departure from the way Fedora is currently designed, although there are some points worth noting. The Server PRD's Mission Statement (which, for unexplained reasons, is distinct from the Server Vision Statement) explains that the project wants to define a base server platform as well as a set of "featured server roles" to address important server scenarios. That is different from the Workstation product plan, which does not involve defining a suite of use cases requiring separate treatment.
There is not yet a list of what these featured server roles will be, but the document enumerates some possibilities: FreeIPA Domain Controller, DNS server, DHCP server, file server, OpenStack Hypervisor, and so on. But the PRD does indicate that featured server roles are intended to be well-defined entities, with each one packaged so that it can be installed as a unit, and with each one providing a stable external API.
Getting to that goal will obviously require a lot of additional discussion, not to mention further work. The featured server roles need to be defined, the appropriate packages and configurations chosen for each, and the resulting meta-package implementation built and tested. The PRD notes that each featured server role will need a maintainer to coordinate its development and serve as the point of contact.
The cloud
The Cloud PRD is far and away the longest of the three product PRDs. This is probably not surprising to most, since cloud image deployment is an area where Fedora has traditionally lagged behind other distributions (statistics for this sort of metric are notoriously hard to verify, but Ubuntu, Debian, and CentOS are usually cited as the most common Linux cloud instance options).
The Cloud product will entail some differences from traditional Fedora releases, including making releases available as a machine image (the PRD lists Amazon EC2's Amazon Machine Image (AMI) as the first target format) and distributing it through different channels (in the case of AMIs, through the Amazon Web Service marketplace).
Updates and customizations are another difference; while the document says that Fedora will likely release periodic updates to the Cloud AMIs, it will also need to maintain a new set of tools for users to customize and build their own cloud images—either for use on cloud services other than EC2, or simply to customize the image's content. As mentioned earlier, the Cloud PRD lists a number of application frameworks it expects to support. Though not exhaustive, the list includes frameworks using Ruby, Python (2 and 3), PHP, Node.js, Java, Perl, and Go. Several other applications, such as Hadoop and OpenShift, are also mentioned along the way.
The PRD also lists eleven cloud services and environments it would like to support (OpenStack, Eucalyptus, Apache Cloudstack, OpenNebula, oVirt, Amazon EC2, Google Compute Engine, HP Cloud, Rackspace, Digital Ocean, and Linode), and its interest in supporting private clouds, public clouds, and hybrid setups. That is already a lengthy list of deployment scenarios on which the Fedora Cloud project will need testing, but the PRD also outlines an interest in serving as a base for cloud orchestration, distributed database management, and other higher-level functions.
Fortunately, the working group does seem to be aware of the rather large scope of these requirements. The initial target will be more limited, with EC2 images for Intel-based machines designed to run as single servers.
In addition, the PRD sets out a list of intermediate goals that need addressing: first, reducing the size of the image footprint, followed by support for the Docker application container format, tools for building and deploying cloud images, software stacks for the various frameworks to be supported, and more.
The Cloud PRD is quite ambitious, especially when compared to the Workstation and Server equivalents. Now that FESCo has accepted all three, it will be interesting to see how the concrete plans for each of these products take shape. The Workstation product will likely demand the least, while the Cloud product could take shape over the course of several Fedora releases. But, as Fedora attempts to redefine itself in these more specific roles, having a clearer game plan is certainly the right first step.
Brief items
Software Collections for CentOS-6
The CentOS Project has announced the release of Software Collections (SCL) for CentOS-6 x86_64. SCLs are a Red Hat concept that provide "a set of dynamic programming languages, database servers, and various related packages that are either more recent than their equivalent versions included in the base Red Hat Enterprise Linux system, or are available for this system for the first time". The project has also provided a list of updates that are included in the CentOS SCL.
Debian 6.0.9 released
The Debian Project has released the ninth update to its old stable distribution, Debian 6.0 (squeeze). "This update mainly adds corrections for security problems to the oldstable release, along with a few adjustments for serious problems. Security advisories were already published separately and are referenced where available."
Distribution News
Ubuntu family
Ubuntu Community Council statement on Canonical package licensing
The Ubuntu Community Council has issued a statement regarding Canonical's requirement that binary redistributors (such as Linux Mint) obtain a license from Canonical. "We believe there is no ill-will against Linux Mint, from either the Ubuntu community or Canonical and that Canonical does not intend to prevent them from continuing their work, and that this license is to help ensure that. What Linux Mint does is appreciated, and we want to see them succeed." There is no real discussion on what is being licensed; it would appear to be a fairly mundane trademark issue stemming from the fact that Linux Mint distributes binary packages taken directly from the Ubuntu repository.
Newsletters and articles of interest
Distribution newsletters
- DistroWatch Weekly, Issue 546 (February 17)
- Ubuntu Weekly Newsletter, Issue 355 (February 16)
Shuttleworth: Losing graciously
Mark Shuttleworth responds to Debian's decision to go with systemd. "Nevertheless, the decision is for systemd, and given that Ubuntu is quite centrally a member of the Debian family, that’s a decision we support. I will ask members of the Ubuntu community to help to implement this decision efficiently, bringing systemd into both Debian and Ubuntu safely and expeditiously."
Chinese software pioneer Red Flag bites the dust (South China Morning Post)
The South China Morning Post is reporting the demise of Red Flag, which is a government-backed Linux distribution by and for the Chinese people. "China’s best hope for a home-grown computer operating system to take on global giants like Microsoft lay in tatters after state-backed Red Flag Software was forced to close its doors for business. Founded in 2000 during the dot-com boom, Red Flag was once the world’s second-largest Linux distributor, providing desktop and server software built on top of the free and open-source Linux program. Despite its lofty goals and early success, Beijing-based Red Flag has gone out of business and terminated all its employment contracts on Monday, according to a report on the Sina news portal on Thursday."
Distro Astro Is a Stunning Star Voyager (LinuxInsider)
LinuxInsider reviews Distro Astro, a distribution for astronomers. "Distro Astro 2.0 is an excellent Linux OS to learn about the basics of a simple desktop environment as well as explore the marvels of the universe. It is also an excellent all-in-one Linux platform for astronomy enthusiasts and professional astronomers alike with some of the best celestial-studying software included."
Page editor: Rebecca Sobol
Development
OpenDaylight emits "Hydrogen"
OpenDaylight is a Linux Foundation–hosted project (launched in early 2013) to develop a common platform for Software Defined Networking (SDN) and Network Functions Virtualization (NFV). On February 4, the project made its first release, "Hydrogen," which consists of an SDN framework and a set of tools for testing and deploying an SDN network. Despite the fanfare surrounding its launch, OpenDaylight has so far struggled to find recognition and attention from developers. Now, perhaps, its first code release will clear up some of the latent confusion, even if SDN itself remains at least a few years away from revolutionizing the computing industry.
Software what?
Understandably enough, a significant portion of the confusion about OpenDaylight comes from the unfamiliarity of SDN itself—although the project's choice of names is certainly not free from blame either. SDN is an effort to abstract away the physical details of a network: the topology and the underlying hardware. The resulting virtual network layer can then be centrally monitored and managed, and can adapt to best serve the needs of whatever applications are running at the moment. For example, reassigning newly spun-up database servers to the IP subnet already occupied by the existing database servers would make routing traffic between them simpler (and hopefully faster), but is hardly possible if the various subnets in the datacenter are statically configured.
An SDN controller platform should allow administrators to alter facets of the network configuration like switching and routing to adjust to changing demand, without having to manually reconfigure the actual switches and routers. And, of course, applications that place a lot of demand on the network (such as video streaming) should ideally be able to trigger the necessary adaptive behavior from the SDN controller on their own, rather than requiring administrators to do it.
In essence, SDN moves the "where do I forward this packet" decision from the router that encounters the packet up to a higher-level plane, where the SDN controller can make the decision based on multiple factors at multiple places in the network. The SDN community's terminology for this separation is the "control plane" and the "data plane." With that abstraction in place, of course, a host of other assumptions about networking suddenly become worth re-examining, such as which nodes are connected to which. In cloud environments, where virtual servers can be spun up or powered down at will, having a network topology that can adapt to changing circumstances and demand is of immense benefit, so there is significant work being done on SDN by industry—as well as by academics, where SDN originated and for whom the concept raises plenty of other interesting questions about security, routing, and so on.
OpenDaylight was announced in April 2013, spearheaded initially by Cisco and IBM, but with several other big names soon joining, including Juniper Networks, Citrix, Ericsson and (strange as it may seem for a project hosted by the Linux Foundation) Microsoft. Officially, the project is committed to "advancing" SDN in an open, community-driven manner—rather than, say, producing a monolithic SDN stack. Most of the platinum-level member companies donated pieces of existing code, and the project has staked out support for some external standards as part of its strategy as well, most notably the OpenFlow protocol for communicating between the SDN controller layer and the hardware layer beneath it. OpenFlow is a standard maintained by the Open Networking Foundation, which has several member companies in common with OpenDaylight but is otherwise distinct.
The component parts of Hydrogen
OpenDaylight's software includes a collection of modular components. The central component is an SDN controller layer that provides interfaces for other pluggable components (such as applications above the controller and the data plane beneath the controller). The Hydrogen release packages up these components into three different editions: Base, Virtualization, and Service Provider. All of the code is published under the Eclipse Public License (EPL), and it is primarily written in Java (so that it will, at least in theory, run on any platform). Most of the plugin modules available so far implement controller support for different brands and varieties of routers and switches, but there are also some components aimed at getting some work done higher in the stack.
The Base Edition includes the controller, an OpenFlow implementation (for communicating between the controller and the data plane), support for the Open vSwitch virtual switch, and tools for managing network equipment with YANG. The Virtualization Edition adds the Affinity metadata service (which is described as a set of APIs to express workload relationships and service levels), a DOVE (distributed overlay virtual Ethernet) plugin for building virtual networks as overlays, a security framework called Defense4All that is designed to detect and respond to distributed denial-of-service attacks, and Virtual Tenant Network (a network virtualization application using OpenFlow).
Finally, the Service Provider Edition includes the Base Edition plus Affinity, Defense4All, a traffic engineering plugin supporting Border Gateway Protocol with link-state extensions (BGP-LS) and Path Computation Element Communication Protocol (PCEP), a Simple Network Management Protocol (SNMP) plugin for managing legacy Ethernet switches, and a different network virtualization plugin called LISP Flow Mapping (which is no relation to the Lisp programming language; rather, it is a tool that maps virtual networks with the Locator/identifier Separation Protocol).
All three editions of the Hydrogen release are experimental toolkits at this stage. The release announcement advertises the Base Edition as being geared toward academia or those exploring SDN concepts. It similarly says that the Virtualization Edition is targeted at data centers managing "tenant networks" (that is, networks leased to customers) and the Service Provider Edition is targeted at carriers and other providers hoping to migrate their existing networks to SDN.
Given the large-scale networks that OpenDaylight is aimed at, it may not be easy for a novice to jump in and start experimenting with Hydrogen. But the new release does paint a clearer picture of what OpenDaylight project members have in mind. The Virtualization and Service Provider editions incorporate tools designed to manage networks of application servers being hosted for customers. That is more of interest if one owns and runs a cloud computing service than if one simply rents time on such a service. However, all of these tools occupy what OpenDaylight calls the "southbound" interface—between the SDN controller and the underlying data plane. OpenDaylight's architecture also covers the "northbound" interface, between the SDN controller and the applications running on the virtualized network.
At present there is less to see on the northbound interface side—Defense4All is a northbound application, and the Virtualization Edition documentation mentions support for OpenStack's Neutron as another. OpenDaylight supports two different APIs for northbound applications: the Java OSGi API for applications running in the same IP address space as the controller (as opposed to tenant networks), and a REST API for outside applications. The first OSGi-speaking applications are likely to be tools for monitoring and reconfiguring the network.
That said, it could be quite some time before deployable applications begin to surface for OpenDaylight. Judging by the code contributions made by the various member companies, right now the majority of the work is focused on supporting a wide range of vendors' networking hardware in the data plane through a wide range of management and configuration protocols—not a small task, to be sure. But the premise of SDN is a compelling one for anyone who manages large-scale network applications. To everyone who simply makes use of large-scale network applications, the important aspect of OpenDaylight is that it is such a collaborative effort—hopefully insulating customers from the woes of vendor lock-in, which is unpleasant whether it is software-defined or not.
Brief items
Quotes of the week
grep-2.17 released
Version 2.17 of the GNU grep utility is out. "This release is notable for its performance improvements: we don't often see a 10x speed-up in a tool like grep." Other changes include the removal of the long-deprecated --mmap option.
SyncEvolution 1.4 released
Version 1.4 of the SyncEvolution PIM-synchronization framework has been released. This update is the first to support in-vehicle infotainment (IVI) systems, including GENIVI's Diagnostic Log and Trace (DLT) system. Google CardDAV support has also been added, but the release notes say: "The biggest change for normal Linux users is Google CalDAV/CardDAV authentication with OAuth2. These are the open protocol that Google currently supports and thus the recommended way of syncing with Google, replacing ActiveSync and SyncML (both no longer available to all Google customers)."
Brython 2.0 available
Version 2.0 of Brython, an implementation of Python 3 in the browser, has been released. "Its goal is to be able to write client-side programs in Python instead of Javascript, with code inside tags <script type="text/python">...</script>. As opposed to solutions such as Pyjamas or Py2JS, the translation from Python to Javascript is done in the browser, it doesn't require a precompilation by a Python program."
Git v1.9.0 available
Git 1.9.0 has been released. This update incorporates numerous changes to various subsystems and features, as well as performance improvements, clean-ups, and documentation fixes. Backward-compatibility notes are also included.
virt-manager 1.0.0 available
Version 1.0.0 of virt-manager, the desktop application for managing KVM, Xen, and LXC virtual machines, has been released. The new build adds support for snapshots, new defaults, and many other new features.
Newsletters and articles
Development newsletters from the past week
- What's cooking in git.git (February 12)
- What's cooking in git.git (February 14)
- What's cooking in git.git (February 19)
- LLVM Weekly (February 17)
- OCaml Weekly News (February 18)
- OpenStack Community Weekly Newsletter (February 14)
- Perl Weekly (February 17)
- PostgreSQL Weekly News (February 17)
- Python Weekly (February 13)
- Ruby Weekly (February 13)
- This Week in Rust (February 15)
- Tor Weekly News (February 19)
How OpenStack parallels the adoption of Linux (opensource.com)
Over at opensource.com, Red Hat's cloud evangelist Gordon Haff looks at the adoption of OpenStack through the lens of the adoption of Linux (and surrounding projects). "Early Linux success didn’t come about because it was better technology than Unix. For the most part it wasn’t. Rather it often won because it was less expensive than proprietary Unix running on proprietary hardware. It also gave users a choice of both distributions and hardware vendors as well as the ability to customize the code should they so choose. However, what has truly distinguished Linux and open source broadly over time is the power of the open source development models and the innovation that comes from communities around projects."
Page editor: Nathan Willis
Announcements
Brief items
2014 Linux Jobs Report
The Linux Foundation and Dice have announced the release of the 2014 Linux Jobs Report (registration required). "Hiring managers are increasing the number of Linux professionals they are searching for. Forty six percent of hiring managers are beefing up their plans for recruiting Linux talent over the next six months, representing a three-point increase from hiring managers’ plans in 2013."
Calls for Presentations
Announcing GNU Radio Conference 2014 and Call for Presentations
The GNU Radio Conference (GRCon14) will take place September 15-19 in Washington, DC. This announcement says the CfP deadline is April 4; the conference website, however, says it's March 31.
CFP Deadlines: February 20, 2014 to April 21, 2014
The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.
| Deadline | Event Dates | Event | Location |
|---|---|---|---|
| February 27 | August 20–22 | USENIX Security '14 | San Diego, CA, USA |
| March 10 | June 9–10 | Erlang User Conference 2014 | Stockholm, Sweden |
| March 14 | May 20–22 | LinuxCon Japan | Tokyo, Japan |
| March 14 | July 1–2 | Automotive Linux Summit | Tokyo, Japan |
| March 14 | May 23–25 | FUDCon APAC 2014 | Beijing, China |
| March 16 | May 20–21 | PyCon Sweden | Stockholm, Sweden |
| March 17 | June 13–15 | State of the Map EU 2014 | Karlsruhe, Germany |
| March 21 | April 26–27 | LinuxFest Northwest 2014 | Bellingham, WA, USA |
| March 31 | July 18–20 | GNU Tools Cauldron 2014 | Cambridge, England, UK |
| March 31 | September 15–19 | GNU Radio Conference | Washington, DC, USA |
| March 31 | June 2–4 | Tizen Developer Conference 2014 | San Francisco, CA, USA |
| March 31 | April 25–28 | openSUSE Conference 2014 | Dubrovnik, Croatia |
| April 3 | August 6–9 | Flock | Prague, Czech Republic |
| April 4 | June 24–27 | Open Source Bridge | Portland, OR, USA |
| April 5 | June 13–14 | Texas Linux Fest 2014 | Austin, TX, USA |
| April 7 | June 9–10 | DockerCon | San Francisco, CA, USA |
| April 14 | May 24 | MojoConf 2014 | Oslo, Norway |
| April 17 | July 9 | PGDay UK | near Milton Keynes, UK |
| April 17 | July 8 | CHAR(14) | near Milton Keynes, UK |
| April 18 | November 9–14 | Large Installation System Administration | Seattle, WA, USA |
| April 18 | June 23–24 | LF Enterprise End User Summit | New York, NY, USA |
If the CFP deadline for your event does not appear here, please tell us about it.
Upcoming Events
MiniDebConf in Barcelona
Debian Women is hosting a MiniDebConf March 15-16 in Barcelona, Spain. "The idea behind the conference is not to talk about women in free software, or women in Debian, but rather to make discussion about Debian subjects more inclusive for women."
Seminar on GPL Enforcement and Legal Ethics
The Free Software Foundation is holding a seminar on GPL Enforcement and Legal Ethics. It takes place March 24 in Boston, MA. "The event sessions will be lead by Karen Sandler of the GNOME Foundation and former general counsel of the Software Freedom Law Center; Bradley Kuhn, President of the Software Freedom Conservancy and a member of the FSF's Board of Directors, and Donald R. Robertson, III, J.D., the FSF's Copyright and Licensing Associate." The event is aimed at legal professionals and law students.
Events: February 20, 2014 to April 21, 2014
The following event listing is taken from the LWN.net Calendar.
| Date(s) | Event | Location |
|---|---|---|
| February 21–23 | conf.kde.in 2014 | Gandhinagar, India |
| February 21–23 | Southern California Linux Expo | Los Angeles, CA, USA |
| February 25 | Open Source Software and Government | McLean, VA, USA |
| February 28–March 2 | FOSSASIA 2014 | Phnom Penh, Cambodia |
| March 3–7 | Linaro Connect Asia | Macao, China |
| March 6–7 | Erlang SF Factory Bay Area 2014 | San Francisco, CA, USA |
| March 15–16 | Chemnitz Linux Days 2014 | Chemnitz, Germany |
| March 15–16 | Women MiniDebConf Barcelona 2014 | Barcelona, Spain |
| March 18–20 | FLOSS UK 'DEVOPS' | Brighton, England, UK |
| March 20 | Nordic PostgreSQL Day 2014 | Stockholm, Sweden |
| March 21 | Bacula Users & Partners Conference | Berlin, Germany |
| March 22–23 | LibrePlanet 2014 | Cambridge, MA, USA |
| March 22 | Linux Info Tag | Augsburg, Germany |
| March 24–25 | Linux Storage Filesystem & MM Summit | Napa Valley, CA, USA |
| March 24 | Free Software Foundation's seminar on GPL Enforcement and Legal Ethics | Boston, MA, USA |
| March 26–28 | Collaboration Summit | Napa Valley, CA, USA |
| March 26–28 | 16. Deutscher Perl-Workshop 2014 | Hannover, Germany |
| March 29 | Hong Kong Open Source Conference 2014 | Hong Kong, Hong Kong |
| March 31–April 4 | FreeDesktop Summit | Nuremberg, Germany |
| April 2–4 | Networked Systems Design and Implementation | Seattle, WA, USA |
| April 2–5 | Libre Graphics Meeting 2014 | Leipzig, Germany |
| April 3 | Open Source, Open Standards | London, UK |
| April 7–8 | 4th European LLVM Conference 2014 | Edinburgh, Scotland, UK |
| April 7–9 | ApacheCon 2014 | Denver, CO, USA |
| April 8–10 | Open Source Data Center Conference | Berlin, Germany |
| April 8–10 | Lustre User Group Conference | Miami, FL, USA |
| April 11–13 | PyCon 2014 | Montreal, Canada |
| April 11 | Puppet Camp Berlin | Berlin, Germany |
| April 12–13 | State of the Map US 2014 | Washington, DC, USA |
| April 14–17 | Red Hat Summit | San Francisco, CA, USA |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol
