LWN.net Weekly Edition for August 2, 2012
GUADEC: open source and open "stuff"
Developer conferences like GUADEC tend to be dominated by technical content, so talks with a different tenor stand out. Alex "Skud" Bayley's July 28 keynote "What's Next? From Open Source to Open Everything" was one such talk. Bayley has spent time in both the open source and open data movements, and offered a number of comparisons between those and other online, grassroots-community movements — including what open source can teach newer communities from its experience, and what it can learn from them.
Bayley's talk was based loosely on a blog post she wrote in January
2011 while working at the Freebase "open data" project. After roughly
a decade of working full-time on open source software, she decided
that it was no longer the "fringe", cutting-edge movement it was in
the early years, and that it had consequently stopped being an
interesting challenge. That assessment is not a criticism, however;
as Bayley put it, the fact was that open source software had "won."
"No one seriously says 'let's use ColdFusion running on Oracle' for
their Web site anymore". She subsequently encountered the open data
movement and found that it "freaked out enough people" that she was
certain she was onto something interesting again.
![Bayley at GUADEC](https://static.lwn.net/images/2012/guadec2012-bayley-sm.jpg)
But the real insight was her discovery that the nascent open data movement was grappling with the same set of challenges that open source software had tackled roughly a decade earlier. For example, it was grappling with licensing (as open source had), and was still in the process of distilling out its core principles and how best to enshrine them in appropriate licenses. Similarly, she said, in the early days open source struggled to build its software support tools (such as build systems and version control), find working business models, and discover how to interact with governments and other entities that found the movement odd or suspicious.
Open data was repeating the same process, ten years later. Bayley
relayed an anecdote about a New Zealand government project that
attempted an open data release as a Zip archive. Nat Torkington
reacted with a number of questions illustrating how a Zip release
failed to make the grade: what if there is a bug, or an update, or a
patch? Open source and open data are not the only movements, however:
Creative Commons and Wikipedia have dealt with similar issues, as have
open education, open healthcare, open government, open access (e.g.,
to academic research), and open hardware — and Bayley found the
parallels interesting. In short, when asked what her current interest
is, she now replies "open ... stuff".
An even broader circle encompasses not only the open technology movements, but other recent grassroots and peer-to-peer online communities, including "scanlation" groups that crowdsource translations, music or video remixing communities, unconferences, and even fan-fiction communities. Some of these groups might not seem to have any connection to open source, Bayley admitted, but the parallels are there: they are self-organizing, decentralized, non-hierarchical, and are based around the tenet of making things free. That makes them kindred spirits that open source could assist based on its own experiences, and it makes them worth learning from.
What open source can teach
The first area in which open source can offer assistance to other open
movements is licensing, Bayley said. Licensing is "really, really
important", but most online communities don't think
about it in their early days. Open source has generally settled on a small
set of licenses that cover most participants' needs, and it has done
so largely because it started from the FSF's four freedoms. Newer
communities could benefit from open source's work articulating its
core principles, writing definitions, and figuring out the boundaries
that determine "
who's in and who's out
".
She cited several examples in the open data movement where lack of licensing standards confuses the issue. One was a genealogy site advertising its "open data" at OSCON 2010 — data that was licensed CC-Attribution-NonCommercial-NoDerivatives-ShareAlike, a choice that breaks three of the FSF's four freedoms. Another was a "community mapping" project run by Google, which used participatory and community language in its marketing, but in which all contributions became the sole property of Google.
The second area where open source can assist other movements is tools. Open source has a complete toolchain all the way down the stack, but many other communities do not. Many creative- or publishing-centric communities have no concept of version control, she said. But telling ebook authors to "just use GitHub" is not the answer; they would balk at the suggestion, and rightfully so. Rather, the open source community needs to ask "what would DocHub look like?" and help the community build the tools it requires.
Finally, open source can teach other communities about the value of
working and communicating completely in the open: open mailing lists,
open documentation, and "release early, release often" workflows. The
benefits may seem obvious to open source developers, but it is a scary
prospect to those not "soaking in it" already, like the
open government movement. But transparency has benefits for all open
source communities, she said. It allows outsiders to see what the
community is like and how it operates, so that they can put themselves
into the situation with fewer surprises. It also means more
accountability, which is particularly important in movements like open
government.
What open source can learn
Open source software's relative maturity puts it in a position to offer experience-based advice to other online communities, Bayley said, but that fact does not mean the other communities have nothing to teach of their own. After all, she said, no one recruits the thousands of teenagers who write and share their own Harry Potter fan-fiction — they build and organize their own communities online. There are several potential lessons from these other groups, which she described in random order.
The first is the value of hands-on events. Hackerspaces often hold
short, practical events where "people can walk in, learn something,
and walk out." Open source rarely does this,
expecting newcomers instead to sign up for online courses or figure
out what to do on their own. But "no one says 'I spent all weekend
reading documentation; it was awesome!'" Many minority
or marginalized groups in particular require a slight push to get
involved; a physical event after which they can walk away having
learned something will provide this push in ways an online event
cannot. It is easy — but fundamentally selfish — to tell
others that they must learn it the hard way because you did, she said.
Many other open communities also have much more age diversity than
open source. Environmental groups and music communities tend to be
all-age, she said, but open source is not. Developer conferences like
GUADEC tend to be dominated by attendees in their 20s and 30s, while
system administration conferences are much older. But as an example, she asked,
why are there no children at GUADEC? Some might expect them to find the
talk sessions dull, or to be disruptive, which is a valid concern, but there
could still be other program content designed to reach them. She told
a story about a punk music venue in Berkeley, California that held
only all-age concerts, and how she witnessed adults helping kids
enjoy the mosh pit by lifting them onto their shoulders. "If you can
run a public mosh pit for kids, you can probably solve the problem
for linux.conf.au".
Finally, many other open communities operate with a strong "nothing
about us without us" ethic. The phrase comes from the disability
rights community, and it means that the community tries not to embark
on projects ostensibly for disabled people unless there are people
with disabilities involved. Otherwise, the results can easily fail to
meet the needs of the target community.
An example of failing to exercise this approach happened after the 2010 Haiti earthquake. After the quake, a number of open source developers volunteered to write software to support the relief effort, but did so without partnering with the relief workers on the ground. The developers felt good about themselves — at least at first — but were ultimately disappointed because their efforts were not of much practical help. In addition to producing better outcomes, she said, the "nothing about us without us" approach has the added benefit of empowering people to build things for themselves, rather than building things for them.
Bayley's talk encompassed such a wide view of online and "open something" communities that at first it was hard to see much that connected them. But in the end, she is right: even if the reason that the other community congregates has nothing to do with the motives that drive open source software, these days we have a lot in common with anyone who uses the Internet to collaborate and to build. In her first few years of open source involvement, Bayley said, she frequently told people to switch over to Linux and open source software in tactless ways that had little impact. She hopes that she is more tactful today than she was at 18, she said, because open source has lessons to teach about freedom and community. Those lessons are valuable even for communities that have no interest in technology.
[The author would like to thank the GNOME Foundation for travel assistance to A Coruña for GUADEC.]
GUADEC: Motion tracking with Skeltrack
At the beginning of his GUADEC 2012 talk, developer Joaquim Rocha showed an image from Steven Spielberg's oft-cited 2002 film Minority Report. When the movie came out, it attracted considerable attention for its gesture-driven computing. But, Rocha said, we have already surpassed the film's technology, because the special gloves it depicted are no longer needed. Rocha's Skeltrack library can leverage Microsoft Kinect or similar depth-mapping hardware to find users and recognize their positions and movements. Skeltrack is not an all-in-one hands-free user interface, but it solves a critical problem in such an application stack.
![Rocha at GUADEC](https://static.lwn.net/images/2012/guadec2012-rocha-sm.jpg)
Rocha's presentation fell on Sunday, July 29, the last "talk" day of the week-long event. Although GUADEC is a GNOME project event, the Skeltrack library's primary dependency is GLib, so it should be useful on non-GNOME platforms as well. Rocha launched Skeltrack in March, and has released a few updates since. The current version is 0.1.4 from June, and is available on GitHub. For those who don't follow Microsoft hardware, the Kinect uses an infrared illuminator to project a dot pattern onto the scene in front of it, and an infrared sensor reads the distortion in the pattern to map out a "depth buffer" of the objects or people in the field of view.
How it works
As the name suggests, Skeltrack is a library for "skeleton tracking."
It is built to take data from a depth buffer like the one provided by
the Kinect device, locate the image of a (single) human being in the buffer,
and identify the "joints." Currently Skeltrack picks out seven: one
head, two shoulders, two elbows, and two hands. Those joints can then
be used directly by the application (to let the user manipulate
objects, for example) or handed off for further processing (such as
gesture recognition). The Kinect is the primary
hardware device used with Skeltrack, Rocha said (because of its low
price point and simple, hackable USB interface), but the library is hardware
independent. Skeltrack builds on the existing libfreenect library for device
control, and includes GFreenect, a GObject
wrapper library around libfreenect (because, as Rocha quipped, "we
really like our APIs in GNOME").
One might be tempted to think that acquiring the 3D depth information is the tricky part of the process, and that picking a human being out of the image is not that complicated. But such is not the case. Libfreenect, Rocha said, cannot tell you whether the depth information depicts a human being, or a cow, or a monkey, much less identify joints and poses. There are three proprietary ways to get skeleton information out of libfreenect depth buffers: the commercial OpenNI framework, Microsoft's Kinect SDK, and Microsoft's Kinect for Windows. Despite its name, OpenNI includes many non-free components, the skeleton-tracking module among them. The Kinect SDK is licensed for non-commercial use only, while Kinect for Windows is a commercial offering, and only works with the desktop version of the Kinect.
Moreover, the proprietary solutions generally rely on a database of "poses" against which the depth buffer is compared, in an attempt to match the image against known patterns. That approach is slow and has difficulty picking out people of different body shapes, so Rocha looked for another approach. He found Andreas Baak's paper A Data-Driven Approach for Real-Time Full Body Pose Reconstruction from a Depth Camera [PDF]. Baak's algorithm uses pattern matching, too, but it provided a valuable starting point: locating the mathematical extrema in the body shape detected, then proceeding to deduce the skeleton.
Heuristics are used to determine which three extrema are most likely to be the head and shoulders (with the head being in the middle), and which are hands. Subsequently, a graph is built connecting the points found, and analyzed to determine which shoulder each hand belongs to (based on proximity). Elbows are inferred as being roughly halfway along the path connecting each hand to its shoulder. The result is a skeleton detected without any "computer vision" techniques, and without any prior calibration steps. The down side of this approach is that, for the moment, it only works for upper-body recognition; Rocha said full-body detection is yet to come.
How to use it
Skeltrack's SkeltrackSkeleton object has tweakable parameters for expected shoulder and hand distances, plus other measurements to modify the algorithm. One of the more important parameters is smoothing, which helps cope with the jitter often found in skeleton detection. For starters, Kinect depth data can be quite noisy, and on top of that, the heuristics used to find joints in the library result in rapid, tiny changes. Rocha showed a live demo of Skeltrack on stage, and with the smoothing function deactivated, the result is entertaining to watch, but would not be pleasant to use when interacting with one's computer. The down side is that running the smoothing formula costs CPU cycles; one can maximize smoothing, but the result is higher latency, which might hamper interactive applications.
Rocha also demonstrated a few poses that can confuse Skeltrack's algorithm. For example, when standing hands-on-hips, there are no "hand" extrema to be found, leading the algorithm to conclude that the elbows are hands. With one hand raised head-height and the corresponding elbow held at shoulder height (as one might do while waving), the algorithm cannot find the shoulder, and thus cannot figure out which of the extrema is the head and which is the hand. Nevertheless, Skeltrack is quite good at interpreting common motions. Rocha demonstrated it with a sample program that simply drew the skeleton on screen, and also with a GNOME 3 desktop control application. The desktop application is hardcoded to a handful (pun semi-intended) of actions, rather than a general gesture input framework. There was also a demo set up at the Igalia (Rocha's employer) expo booth.
Skeltrack provides both an asynchronous and a synchronous API, and it reports the locations of joints in both "real world" and screen coordinates — measured in millimeters in the original scene and pixels in the webcam image. Currently the code is limited to identifying one person in the buffer, but there are evidently ways to work around the limitation. Rocha said that a company in Greece was using OpenCV to recognize multiple people in the depth buffer, then running Skeltrack separately on each part of the frame that contained a person. However, the project in question was not doing the skeleton recognition in real-time.
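To give a feel for the library, here is a minimal sketch of the synchronous path, with names based on the 0.1.x API (the depth buffer is assumed to have already been captured, via GFreenect or otherwise, and scaled to the dimensions Skeltrack expects; error handling is trimmed):

    #include <skeltrack.h>

    static void
    process_frame (guint16 *depth, guint width, guint height)
    {
      SkeltrackSkeleton *skeleton = skeltrack_skeleton_new ();
      SkeltrackJointList joints;

      /* Smoothing is exposed as GObject properties. */
      g_object_set (skeleton,
                    "enable-smoothing", TRUE,
                    "smoothing-factor", 0.5,
                    NULL);

      joints = skeltrack_skeleton_track_joints_sync (skeleton, depth,
                                                     width, height,
                                                     NULL, NULL);
      if (joints != NULL)
        {
          SkeltrackJoint *head =
            skeltrack_joint_list_get_joint (joints, SKELTRACK_JOINT_ID_HEAD);

          /* Joints carry both screen coordinates (pixels) and
             real-world depth (millimeters). */
          if (head != NULL)
            g_print ("head at (%d, %d) on screen, %d mm away\n",
                     head->screen_x, head->screen_y, head->z);
          skeltrack_joint_list_free (joints);
        }
      g_object_unref (skeleton);
    }

The asynchronous variant follows the usual GIO pattern: a track_joints() call that takes a callback, completed by a corresponding _finish() call.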
Libfreenect (and thus Skeltrack) is not tied into the XInput input system, nor is Skeltrack itself bound to a multi-touch application framework. That is one possible direction for the code to head in the future; hooking Skeltrack into the same touch event and gesture recognition libraries as multi-touch pads and touch-screens would make Kinect-style hardware more accessible to application developers. But that cannot be the endpoint — depth buffers offer richer information than 2D touch devices; developers can and will find more (and more unusual) things to do with this new interface method. Skeltrack is ahead of the competition (libfreenect lacks skeleton tracking, but its developers recognize the need for it), and that is a win not just for GNOME, but for open source software in general.
[The author would like to thank the GNOME Foundation for travel assistance to A Coruña for GUADEC.]
The Nexus 7: Google ships a tablet
When life presents challenges, one can always try to cope by buying a new toy. In this case, said new toy is the Nexus 7 tablet, the first "pure Android" tablet offered directly by Google; it is meant to showcase what Android can be on this type of device. The initial indications are that it is selling well, suggesting that the frantic efforts to prepare Android for tablets are finally beginning to bear some fruit. What follows are your editor's impressions of this device and the associated "Jelly Bean" Android release.

The Nexus 7 (N7) is an intermediate-size tablet — larger than even the biggest phones, but smaller than, say, a Xoom or iPad device. It features a 7" 1280x800 display and weighs in at 340 grams. There's 1GB of RAM, and up to 16GB of storage; the CPU is a quad-core Tegra 3 processor. The notion of a quad-core system that fits easily into a back pocket is amusing to us old-timers, but that's the age we live in now. The N7 features WiFi and Bluetooth connectivity, but no cellular connectivity; it has 802.11n support, but cannot access the 5GHz band where 802.11n networks often live. The only camera is a front-facing 1.2 megapixel device; the N7 does not even have the camera application installed by default.
The N7 runs Android 4.1.1, the "Jelly Bean" release. 4.1.1 offers a lot of enhancements over 4.0, but is, for the most part, similar in appearance and functionality. The first impression, once the setup formalities are done, can be a little disconcerting: the home screen is dominated by a large ad for Google's "Play Magazines" service. It makes one think that "pure Android" devices might be going the crapware route, but the ad widget is easily disposed of and never appears again.
As of this writing, there is no CyanogenMod build available for the N7. That is unsurprising, given the newness of the hardware and the fact that CyanogenMod has not yet moved to the Jelly Bean release. But the N7 is an unlocked (or, at least, easily unlockable) device, so one can expect that alternative distributions will become available for it in due time.
Using the N7
Android on tablets has matured considerably since the initial "Honeycomb" release featured on the Xoom. For the most part, things work nicely, at least as far as the standard Google applications are concerned. The ability of third-party applications to work well on larger screens is still highly variable. One bit of remaining confusion is the "menu" button, which appears in different places in different applications, or is absent altogether. Playing the "find the menu" game is a common part of learning any new application. One gets the sense that the Android developers would like to do away with menus altogether, but there are many practical difficulties in doing so.
Perhaps the most jarring change is the switch to Chrome as the built-in web browser. The standard Android browser wasn't perfect, but it had accumulated some nice features over the years. Chrome is capable and fully-featured, and it arguably makes sense for Google to focus on supporting a single browser. But your editor misses the "auto-fit pages" option and the "quick controls" provided by the Android browser. Getting around with Chrome just seems to be a slower process requiring more taps and gestures. Undoubtedly there is a way to get the Android browser onto the N7, but, so far, time has been short and a quick search came up empty.
The N7's front-facing camera is clearly not meant for any sort of photographic use, unless one is especially interested in self portraits. It is useful for the "face unlock" feature, naturally. It is also clearly meant for use with applications like Skype; the N7 should make a very nice video network phone. Unfortunately, video calls in Skype fail to work on your editor's device. Some searching indicates that it works for some people and fails for others; sometimes installing the camera application helps, but not in this case. At this time, the N7 does not appear to be ready for this kind of use.
One need not have an especially conspiracy-theoretical mindset to surmise that Skype's owner (a small company called "Microsoft") might just have an incentive to ensure that Skype works better on its own operating system than on Android. But the truth of the matter is probably more prosaic: by all accounts, the Skype application is just not an example of stellar software engineering. Unfortunately, it is an example of proprietary software, so there is no way for anybody but Skype to fix it. There should really be a place for a free-software video calling application that (1) actually works, and (2) can be verified to lack backdoors for government agencies and anybody else interested in listening in on conversations. But that application does not seem to exist at this time, alas.
Electronic books
Another obvious use case for a 7" tablet is as an electronic book reader. The N7 has some obvious disadvantages relative to the current crop of electronic-ink readers, though: it weighs about twice as much, has a fraction of the battery life, and has a backlit screen that is harder to stare at for hours. Still, it is worth considering for this role; its presence in the travel bag is more easily justified if it can displace another device.
The N7 hardware, in the end, puts in a credible, though not stellar, performance as a book reader. The extra weight is noticeable, but the tablet still weighs less than most books. The rated battery life for reading is about nine hours, possibly extendable by turning off the wireless interface. Nine hours will get one through an international travel experience of moderate length, but one misses the battery life of a proper reader device that can go for weeks at a time without a recharge. The lack of dedicated buttons for page-turning and the like (which are commonly present on dedicated readers) is not a huge problem. The backlit display can actually be advantageous in situations where turning on the lights is frowned upon — when the spouse is sleeping, or on some airplanes, for example.
On the software side, there are a number of reading applications available,
ranging from the ultra-proprietary Google Books and Kindle applications to
the (nice) GPL-licensed FBReader
program. Experience shows that the rendering of text does not always work
as well in applications like FBReader or Aldiko, though; white space used
to separate
sections within chapters can disappear, for example, and block quotes can
be smashed into the surrounding paragraphs. Readers like Kindle do better
in this regard. Another annoyance is that the tablet uses the MTP
protocol over the USB connection, meaning that it does not work easily with
Calibre. One can, of course, move book files manually or use Calibre's
built-in web server to get books onto the device, but it would be a lot
nicer if Calibre could just manage the on-device library directly.
In summary, while the experience for users of walled-garden book services is probably pretty good, it remains a bit rough for those wanting to take charge of the books that they so foolishly think they, by virtue of having paid for them, actually own. Beyond that, for content that goes beyond pure text — anything with pictures, for example — a tablet can provide a nicer experience. And, of course, the tablet offers the full Internet and all the other Android applications; whether that is considered to be an advantage in a book reader is almost certainly in the eye of the user.
In the long term, it seems clear that general-purpose tablets will displace dedicated reader devices, but the N7, arguably, is not quite there yet.
In general, though, the N7 works nicely as a media consumption device. It plays videos nicely and is a pleasant device for wandering around on the web. For people who are fully hooked into the Google machine it naturally provides a nicely integrated interface into all of the related services. For the rest of us the experience is a bit more uneven; your editor still yearns for a better email client, for example. But, even with its limitations, the N7 fills in nicely where one does not want to deal with a laptop, but where a phone screen is simply too limiting. This new tablet from Google is a nice device overall; it is likely to remain in active use for some time.
Security
The leap second of doom
Since the last leap second caused a certain amount of havoc on Linux systems, it was probably only a matter of time before someone came up with the idea of "testing" for vulnerable systems again. Leap seconds are only supposed to occur at the end of June and December, with six months' notice, so administrators might well have been waiting to update their servers for the problem until another was nigh. But "rogue" (or buggy) network time protocol (NTP) servers can effectively cause a leap second at the end of any month—which seems to be what happened on July 31.

It is not uncommon for "black hats" to keep exploiting vulnerabilities well after updates to fix them have been released. This situation is a bit different, though. While updating systems to avoid known vulnerabilities is clearly a "best practice", sometimes system administrators choose to delay updates, especially those that require a reboot, based on their sense of the likelihood of an attack. Given that no real leap seconds were scheduled, and the subversion of NTP servers (or traffic) may have seemed relatively unlikely, some (perhaps large) percentage of Linux systems have not been updated. But not all "attacks" are caused by black hats; the original problem was caused by a bug, and this one may turn out the same way.
Marco Marongiu appears to have been the first to notice the problem:
If you didn't take action before the leapocalypse last month, you better hurry now.
Given that the notice (to the NTP questions mailing list) came less than four hours before the second "leapocalypse", it's hard to imagine that many administrators saw it in time to take action.
The most interesting question, of course, is how this could have happened. It is tempting to see it as some kind of worldwide denial of service attack, but that is probably not the most likely cause. Further discussion in the thread with Marongiu's warning points to another possible cause.
It seems that the NTP protocol has a "leap" flag (aka LI, or leap indicator), which is a two-bit field indicating whether a second should be inserted or deleted at the end of the current month. Adding a leap second at the end of any month does not correspond with current practice (June and December leap seconds only), but some standards consider it reasonable. RFC 5905, which governs NTP, definitely allows leap seconds at the end of any month, so compliant implementations should handle them.
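For reference, the leap indicator occupies the top two bits of the first byte of an NTP packet header. Here is a small sketch of how one might pull it out of a raw packet (field layout and values per RFC 5905; the helper function itself is made up for illustration):

    #include <stdint.h>

    /* Leap indicator values per RFC 5905: 0 = no warning, 1 = insert a
     * second at the end of the month, 2 = delete a second, 3 = clock
     * unsynchronized. */
    static int
    ntp_leap_indicator(const uint8_t *packet)
    {
        /* First header byte: LI (2 bits), version (3 bits), mode (3 bits). */
        return (packet[0] >> 6) & 0x3;
    }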
But that still leaves the question of why the LI flag was set to 1 (i.e. add a second at the end of the month). In the thread, "demonccc" noted a server with the flag set. Furthermore, Martin Burnicki described a problem his customers saw after June's leap second in which certain older NTP servers did not reset the leap flag after the event. That could cause leap seconds at the end of every month until it gets fixed.
While there aren't widespread reports of Linux systems going into infinite loops and burning up excess power (unlike June), it does appear to have affected some systems out there. The MythTV users mailing list has a thread about the problem, for example. If it is an actual attack, it is a clever one, but there are enough signs pointing to NTP server bugs that it's pretty unlikely.
Even if it is "just" caused by a bug (or bugs), it is still a bit worrisome. NTP has not generally been seen as a vector for attacks, but this situation shows that it could be. Unpatched systems could be targeted by man-in-the-middle attacks toward the end of every month for example. Both leap-second occurrences (real and fake) point to the problems that can lurk in code that only truly gets tested once in a great while. One wonders what might happen to systems (patched or not) that receive a "subtract a second" NTP message, since there has never been a real negative leap second.
Brief items
Security quotes of the week
I don't know if it's by design, but I thought I'd mention it here in case someone else wants to look into it (I'm not really interested in video game security, I air-gap the machine I use to play games).
Privilege escalation vulnerability in the NVidia binary driver
People running the proprietary NVidia graphics driver on systems with untrusted users may want to have a look at this exploit posted by Dave Airlie. "I was given this anonymously, it has been sent to nvidia over a month ago with no reply or advisory and the original author wishes to remain anonymous but would like to have the exploit published at this time."
This Cute Chat Site Could Save Your Life And Help Overthrow Your Government (Wired)
Wired writes about crypto.cat (or Cryptocat), which is an AGPL3-licensed browser-based AES-256-encrypted chat program. It was created by 21-year-old Nadim Kobeissi, who is originally from Beirut, Lebanon and now goes to college in Montréal, Canada. "But Kobeissi also knows that it’s equally important that Cryptocat be usable and pretty. Kobeissi wants Cryptocat to be something you want to use, not just need to. Encrypted chat tools have existed for years — but have largely stayed in the hands of geeks, who usually aren’t the ones most likely to need strong crypto. 'Security is not just good crypto. It’s very important to have good crypto, and audit it. Security is not possible without (that), but security is equally impossible without making it accessible.'"
Martin: Off the Record Messaging: A Tutorial
Ben Martin has a lengthy tutorial on Off the Record (OTR) messaging on his blog. OTR is useful for realtime encrypted communication (e.g. instant messaging, IRC) and Martin's post looks at both the protocol and using libotr to add OTR support to C++ programs. "In order to operate without a web of trust, libotr implements the Socialist Millionaires' Protocol (SMP). The SMP allows two parties to verify that they both know the same secret. The secret might be a passphrase or answer to a private joke that two people will easily know. The SMP operates fine in the presence of eaves droppers (who don't get to learn the secret). Active communications tampering is not a problem, though of course it might cause the protocol not to complete successfully."
New vulnerabilities
apache-mod_auth_openid: local session ID disclosure
Package(s): apache-mod_auth_openid
CVE #(s): CVE-2012-2760
Created: July 26, 2012
Updated: August 1, 2012
Description: From the Mandriva advisory: mod_auth_openid before 0.7 for Apache uses world-readable permissions for /tmp/mod_auth_openid.db, which allows local users to obtain session ids (CVE-2012-2760).
bacula: symlink attack
Package(s): bacula
CVE #(s): CVE-2008-5373
Created: July 30, 2012
Updated: August 27, 2012
Description: From the CVE entry: mtx-changer.Adic-Scalar-24 in bacula-common 2.4.2 allows local users to overwrite arbitrary files via a symlink attack on a /tmp/mtx.##### temporary file, probably a related issue to CVE-2005-2995.
bind9: denial of service
Package(s): bind9
CVE #(s): CVE-2012-3817
Created: July 26, 2012
Updated: September 10, 2012
Description: From the Ubuntu advisory: Einar Lonn discovered that Bind incorrectly initialized the failing-query cache. A remote attacker could use this flaw to cause Bind to crash, resulting in a denial of service.
ganglia: code execution
Package(s): ganglia
CVE #(s): (none)
Created: July 26, 2012
Updated: April 9, 2013
Description: From the Ganglia advisory: There is a security issue in Ganglia Web going back to at least 3.1.7 which can lead to arbitrary script being executed with web user privileges, possibly leading to a machine compromise.
icedtea-web: code execution
Package(s): icedtea-web
CVE #(s): CVE-2012-3422 CVE-2012-3423
Created: August 1, 2012
Updated: September 24, 2012
Description: From the Red Hat advisory: An uninitialized pointer use flaw was found in the IcedTea-Web plug-in. Visiting a malicious web page could possibly cause a web browser using the IcedTea-Web plug-in to crash, disclose a portion of its memory, or execute arbitrary code. (CVE-2012-3422) It was discovered that the IcedTea-Web plug-in incorrectly assumed all strings received from the browser were NUL terminated. When using the plug-in with a web browser that does not NUL terminate strings, visiting a web page containing a Java applet could possibly cause the browser to crash, disclose a portion of its memory, or execute arbitrary code. (CVE-2012-3423)
isc-dhcp: multiple vulnerabilities
Package(s): isc-dhcp
CVE #(s): CVE-2012-3571 CVE-2012-3954
Created: July 26, 2012
Updated: August 6, 2012
Description: From the Debian advisory: CVE-2012-3571: Markus Hietava of the Codenomicon CROSS project discovered that it is possible to force the server to enter an infinite loop via messages with malformed client identifiers. CVE-2012-3954: Glen Eustace discovered that DHCP servers running in DHCPv6 mode, and possibly DHCPv4 mode, suffer from memory leaks while processing messages. An attacker can use this flaw to exhaust resources and perform denial of service attacks.
krb5: denial of service
Package(s): krb5
CVE #(s): CVE-2012-1015
Created: August 1, 2012
Updated: August 6, 2012
Description: From the Red Hat advisory: An uninitialized pointer use flaw was found in the way the MIT Kerberos KDC handled initial authentication requests (AS-REQ). A remote, unauthenticated attacker could use this flaw to crash the KDC via a specially-crafted AS-REQ request.
krb5: code execution
Package(s): krb5
CVE #(s): CVE-2012-1014
Created: August 1, 2012
Updated: March 18, 2013
Description: From the Debian advisory: By sending a specially crafted AS-REQ (Authentication Service Request) to a KDC (Key Distribution Center), an attacker could make it free an uninitialized pointer, corrupting the heap. This can lead to a process crash or even arbitrary code execution.
krb5: information disclosure
Package(s): krb5
CVE #(s): CVE-2012-1012
Created: August 1, 2012
Updated: August 1, 2012
Description: From the Ubuntu advisory: It was discovered that the kadmin protocol implementation in MIT krb5 did not properly restrict access to the SET_STRING and GET_STRINGS operations. A remote authenticated attacker could use this to expose or modify sensitive information. This issue only affected Ubuntu 12.04 LTS.
libjpeg-turbo: code execution
Package(s): libjpeg-turbo
CVE #(s): CVE-2012-2806
Created: August 1, 2012
Updated: April 8, 2013
Description: From the Novell bugzilla: A heap-based buffer overflow was found in the way libjpeg-turbo decompressed certain corrupt JPEG images in which the component count was erroneously set to a large value. An attacker could create a specially-crafted JPEG image that, when opened, could cause an application using libjpeg-turbo to crash or, possibly, execute arbitrary code with the privileges of the user running the application.
libpng14: denial of service
Package(s): libpng14
CVE #(s): CVE-2012-3425
Created: August 1, 2012
Updated: August 1, 2012
Description: libpng crashes when loading a corrupted image.
puppet: IP address impersonation
Package(s): puppet
CVE #(s): CVE-2012-3408
Created: July 30, 2012
Updated: August 1, 2012
Description: From the Red Hat bugzilla, quoting Puppet Labs: Puppet agents with certnames of IP addresses can be impersonated. This affects Puppet 2.6.16 and 2.7.17. If an authenticated host with a certname of an IP address changes IP addresses, and a second host assumes the first host's former IP address, the second host will be treated by the puppet master as the first one, giving the second host access to the first host's catalog. Note: this will not be fixed in Puppet versions prior to the forthcoming 3.x. Instead, with this announcement, IP-based authentication in Puppet < 3.x is deprecated. Resolved in Puppet 2.6.17 and 2.7.18.
wireshark: remote denial of service
Package(s): wireshark
CVE #(s): CVE-2012-4048 CVE-2012-4049
Created: August 1, 2012
Updated: December 26, 2012
Description: From the CVE entries: The PPP dissector in Wireshark 1.4.x before 1.4.14, 1.6.x before 1.6.9, and 1.8.x before 1.8.1 allows remote attackers to cause a denial of service (invalid pointer dereference and application crash) via a crafted packet, as demonstrated by a usbmon dump. (CVE-2012-4048) epan/dissectors/packet-nfs.c in the NFS dissector in Wireshark 1.4.x before 1.4.14, 1.6.x before 1.6.9, and 1.8.x before 1.8.1 allows remote attackers to cause a denial of service (loop and CPU consumption) via a crafted packet. (CVE-2012-4049)
xen: denial of service
Package(s): xen
CVE #(s): CVE-2012-2625
Created: August 1, 2012
Updated: September 14, 2012
Description: From the Red Hat advisory: A flaw was found in the way the pyGrub boot loader handled compressed kernel images. A privileged guest user in a para-virtualized guest (a DomU) could use this flaw to create a crafted kernel image that, when attempting to boot it, could result in an out-of-memory condition in the privileged domain (the Dom0).
xrdp: weak encryption
Package(s): xrdp
CVE #(s): (none)
Created: July 31, 2012
Updated: August 1, 2012
Description: From the SUSE advisory: The XRDP service was changed so that the default crypto level in XRDP was changed from "low" to "high". This switches from using 40-bit encryption to 128-bit two-way encryption.
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The 3.6 merge window remains open, so there is no current development kernel release. Changes continue to move into the mainline; see the separate article below for details.

Stable updates: 3.2.24 was released on July 26, 3.4.7 came out on July 30, and 3.0.39 was released on August 1. In addition to the usual fixes, 3.0.39 includes a significant set of backported memory management performance patches.
The 3.2.25 update is in the review process as of this writing; it can be expected on or after August 2.
Quotes of the week
RIP Andre Hedrick (Register)
The Register has an article on the life and death of Andre Hedrick, the former kernel IDE maintainer who passed away on July 13. "Today, millions of people use digital restriction management systems that lock down books, songs and music - the Amazon Kindle, the BBC iPlayer and Spotify are examples - but consumers enter into the private commercial agreement knowingly. It isn't set by default in the factory, as it might have been. The PC remains open rather than becoming an appliance. Andre was never comfortable taking the credit he really deserved for this achievement." See also this weblog page where memories are being collected.
Garzik: An Andre To Remember
Jeff Garzik has shared his memories of Andre Hedrick on the linux-kernel mailing list; worth a read. "This is a time for grief and a time for celebration of Andre's accomplishments, but also it is a time to look around at our fellow geeks and offer our support, if similar behavioral signs appear."
Kernel development news
3.6 merge window part 2
As of this writing, just over 8,200 non-merge changesets have been pulled into Linus's repository; that's nearly 4,000 since last week's summary. It seems that any hopes that 3.6 might be a relatively low-volume cycle are not meant to be fulfilled. That said, things seem to be going relatively smoothly, with only a small number of problems being reported so far.

User-visible changes merged since last week include:
- The btrfs send/receive feature has
been merged. Send/receive can calculate the differences between two
btrfs subvolumes or snapshots and serialize the result; it can be used
for, among other things, easy mirroring of volumes and incremental
backups.
- Btrfs has also gained the ability to apply disk quotas to subvolumes.
According to btrfs maintainer Chris Mason, "This enables full tracking
of how many blocks are allocated to each subvolume (and all snapshots)
and you can set limits on a per-subvolume basis. You can also create
quota groups and toss multiple subvolumes into a big group. It's
everything you need to be a web hosting company and give each user
their own subvolume."
- The kernel has gained better EFI booting support. This should allow
the removal of a lot of EFI setup code from various bootloaders, which
now need only load the kernel and jump into it.
- The new "coupled cpuidle" code enables better CPU power management on
systems where CPUs cannot be powered down individually. See this
commit for more information on how this feature works.
- The LED code supports a new "oneshot" mode where applications can
request a single LED blink via sysfs. See Documentation/leds/ledtrig-oneshot.txt
for details.
- A number of random number generator
changes have been merged, hopefully leading to more secure random
numbers, especially on embedded devices.
- The VFIO subsystem, intended to be a
safe mechanism for the creation of user-space device drivers, has been
merged; see Documentation/vfio.txt for
more information.
- The swap-over-NFS patch set has been
merged, making the placement of swap files on NFS-mounted filesystems
a not entirely insane thing to do.
- New hardware support includes:
- Processors and systems:
Loongson 1B CPUs.
- Audio:
Wolfson Micro "Arizona" audio controllers (WM5102 and WM5110 in
particular).
- Input:
NXP LPC32XX key scanners,
MELFAS MMS114 touchscreen controllers, and
EDT ft5x06 based polytouch devices.
- Miscellaneous:
National Semiconductor/TI LM3533 ambient light sensors,
Analog Devices AD9523 clock generators,
Analog Devices ADF4350/ADF4351 wideband synthesizers,
Analog Devices AD7265/AD7266 analog to digital converters,
Analog Devices AD-FMCOMMS1-EBZ SPI-I2C-bridges,
Microchip MCP4725 digital-to-analog converters,
Maxim DS28E04-100 1-Wire EEPROMs,
Vishay VCNL4000 ambient light/proximity sensors,
Texas Instruments OMAP4+ temperature sensors,
EXYNOS HW random number generators,
Atmel AES, SHA1/SHA256, and TDES crypto accelerators,
Blackfin CRC accelerators,
AMD 8111 GPIO controllers,
TI LM3556 and LP8788 LED controllers,
BlinkM I2C RGB LED controllers,
Calxeda Highbank memory controllers,
Maxim Semiconductor MAX77686 PMICs,
Marvell 88PM800 and 88PM805 PMICs,
Lantiq Falcon SPI controllers, and
Broadcom BCM63xx random number generators.
- Networking:
Cambridge Silicon Radio wireless controllers.
- USB:
Freescale i.MX ci13xxx USB controllers,
Marvell PXA2128 USB 3.0 controllers, and
Maxim MAX77693 MUIC USB port accessory detectors.
- Video4Linux:
Realtek RTL2832 DVB-T demodulators,
Analog Devices ADV7393 encoders,
Griffin radioSHARK and radioSHARK2 USB radio receivers, and
IguanaWorks USB IR transceivers.
- Staging graduations: IIO digital-to-analog converter drivers.
Changes visible to kernel developers include:
- The pstore persistent storage
mechanism has improved handling of
console log messages. The Android RAM buffer console mechanism has
been removed, since pstore is now able to provide all of the same
functionality. Pstore has also gained function tracer support,
allowing the recording of function calls prior to a panic.
- The new PWM framework eases the writing of drivers for pulse-width
modulation devices, including LEDs, fans, and more. See Documentation/pwm.txt for details.
- There is a new utility function:
size_t memweight(const void *ptr, size_t bytes);
It returns the number of bits set in the given memory region (see the sketch after this list).
- The fault injection subsystem has a new module which can inject errors
into notifier call chains.
- There is a new "flexible proportions" library allowing the calculation
of proportions over a variable period. See
<linux/flex_proportions.h> for the interface.
- The new __GFP_MEMALLOC flag allows memory allocations to dip
into the emergency reserves.
- The IRQF_SAMPLE_RANDOM interrupt flag no longer does anything; it has been removed from the kernel.
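Returning to the memweight() helper mentioned above, here is a quick sketch of how a driver might use it (the bitmap and the helper around it are invented for the example):

    #include <linux/string.h>
    #include <linux/types.h>

    /* Hypothetical: a driver tracks its in-flight buffers in a
     * byte-array bitmap; memweight() counts the bits currently set
     * across the whole array. */
    static u8 busy_map[32];

    static size_t busy_buffers(void)
    {
            return memweight(busy_map, sizeof(busy_map));
    }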
Andrew Morton's big pile of patches was merged on August 1; that is usually a sign that the merge window is nearing its end. Expect a brief update after the 3.6 merge window closes, but, at this point, the feature set for this release can be expected to be nearly complete.
ACCESS_ONCE()
Even a casual reader of the kernel source code is likely to run into invocations of the ACCESS_ONCE() macro eventually; there are well over 200 of them in the current source tree. Many such readers probably do not stop to understand just what that macro means; a recent discussion on the mailing list made it clear that even core kernel developers may not have a firm idea of what it does. Your editor was equally ignorant but decided to fix that; the result, hopefully, is a reasonable explanation of why ACCESS_ONCE() exists and when it must be used.

The functionality of this macro is actually well described by its name; its purpose is to ensure that the value passed as a parameter is accessed exactly once by the generated code. One might well wonder why that matters. It comes down to the fact that the C compiler will, if not given reasons to the contrary, assume that there is only one thread of execution in the address space of the program it is compiling. Concurrency is not built into the C language itself, so mechanisms for dealing with concurrent access must be built on top of the language; ACCESS_ONCE() is one such mechanism.
Consider, for example, the following code snippet from kernel/mutex.c:
    for (;;) {
        struct task_struct *owner;

        owner = ACCESS_ONCE(lock->owner);
        if (owner && !mutex_spin_on_owner(lock, owner))
            break;
        /* ... */
This is a small piece of the adaptive spinning code that hopes to quickly grab a mutex once the current owner drops it, without going to sleep. There is much more to this for loop than has been shown here, but this code is sufficient to show why ACCESS_ONCE() can be necessary.
Imagine for a second that the compiler in use is developed by fanatical
developers who will optimize things in every way they can. This is not a
purely hypothetical scenario; as Paul McKenney recently attested: "I have seen the glint in
their eyes when they discuss optimization techniques that you would not
want your children to know about!" These developers might create a
compiler that concludes that, since the code in question does not actually
modify lock->owner, it is not necessary to actually fetch its
value each time through the loop. The compiler might then rearrange the
code into something like:
    owner = ACCESS_ONCE(lock->owner);
    for (;;) {
        if (owner && !mutex_spin_on_owner(lock, owner))
            break;
What the compiler has missed is the fact that lock->owner is being changed by another thread of execution entirely. The result is code that will fail to notice any such changes as it executes the loop multiple times, leading to unpleasant results. The ACCESS_ONCE() call prevents this optimization from happening, with the result that the code (hopefully) executes as intended.
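The same hazard is easy to demonstrate in ordinary user-space code; here is a minimal sketch (whether the compiler actually hoists the load depends on the compiler and optimization level, so treat it as illustrative):

    /* The kernel's definition (shown below) also works in user
     * space with GCC. */
    #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

    static int stop;    /* set to 1 by another thread */

    void wait_for_stop(void)
    {
        /* Without ACCESS_ONCE(), an optimizer that sees no writes to
         * 'stop' in this function may load it once and spin forever. */
        while (!ACCESS_ONCE(stop))
            ;
    }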
As it happens, an optimized-out access is not the only peril that this code could encounter. Some processor architectures (x86, for example) are not richly endowed with registers; on such systems, the compiler must make careful choices regarding which values to keep in registers if it is to generate the highest-performing code. Specific values may be pushed out of the register set, then pulled back in later. Should that happen to the mutex code above, the result could be multiple references to lock->owner. And that could cause trouble; if the value of lock->owner changed in the middle of the loop, the code, which is expecting the value of its local owner variable to remain constant, could become fatally confused. Once again, the ACCESS_ONCE() invocation tells the compiler not to do that, avoiding potential problems.
The actual implementation of ACCESS_ONCE(), found in <linux/compiler.h>, is fairly straightforward:
#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
In other words, it works by turning the relevant variable, temporarily, into a volatile type.
Given the kinds of hazards presented by optimizing compilers, one might well wonder why this kind of situation does not come up more often. The answer is that most concurrent access to data is (or certainly should be) protected by locks. Spinlocks and mutexes both function as optimization barriers, meaning that they prevent optimizations on one side of the barrier from carrying over to the other. If code only accesses a shared variable with the relevant lock held, and if that variable can only change when the lock is released (and held by a different thread), the compiler will not create subtle problems. It is only in places where shared data is accessed without locks (or explicit barriers) that a construct like ACCESS_ONCE() is required. Scalability pressures are causing the creation of more of this type of code, but most kernel developers still should not need to worry about ACCESS_ONCE() most of the time.
TCP Fast Open: expediting web services
Much of today's Internet traffic takes the form of short TCP data flows that consist of just a few round trips exchanging data segments before the connection is terminated. The prototypical example of this kind of short TCP conversation is the transfer of web pages over the Hypertext Transfer Protocol (HTTP).
The speed of TCP data flows is dependent on two factors: transmission delay (the width of the data pipe) and propagation delay (the time that the data takes to travel from one end of the pipe to the other). Transmission delay is dependent on network bandwidth, which has increased steadily and substantially over the life of the Internet. On the other hand, propagation delay is a function of router latencies, which have not improved to the same extent as network bandwidth, and the speed of light, which has remained stubbornly constant. (At intercontinental distances, this physical limitation means that—leaving aside router latencies—transmission through the medium alone requires several milliseconds.) The relative change in the weighting of these two factors means that over time the propagation delay has become a steadily larger component in the overall latency of web services. (This is especially so for many web pages, where a browser often opens several connections to fetch multiple small objects that compose the page.)
Reducing the number of round trips required in a TCP conversation has thus become a subject of keen interest for companies that provide web services. It is therefore unsurprising that Google should be the originator of a series of patches to the Linux networking stack to implement the TCP Fast Open (TFO) feature, which allows the elimination of one round-trip time (RTT) from certain kinds of TCP conversations. According to the implementers (in "TCP Fast Open", CoNEXT 2011 [PDF]), TFO could result in speed improvements of between 4% and 41% in the page load times on popular web sites.
We first wrote about TFO back in September 2011, when the idea was still in the development stage. Now that the TFO implementation is starting to make its way into the kernel, it's time to visit it in more detail.
The TCP three-way handshake
To understand the optimization performed by TFO, we first need to note that each TCP conversation begins with a round trip in the form of the so-called three-way handshake. The three-way handshake is initiated when a client makes a connection request to a server. At the application level, this corresponds to a client performing a connect() system call to establish a connection with a server that has previously bound a socket to a well-known address and then called accept() to receive incoming connections. Figure 1 shows the details of the three-way handshake in diagrammatic form.
Figure 1: TCP three-way handshake between a client and a server
During the three-way handshake, the two TCP end-points exchange SYN (synchronize) segments containing options that govern the subsequent TCP conversation—for example, the maximum segment size (MSS), which specifies the maximum number of data bytes that a TCP end-point can receive in a TCP segment. The SYN segments also contain the initial sequence numbers (ISNs) that each end-point selects for the conversation (labeled M and N in Figure 1).
The three-way handshake serves another purpose with respect to connection establishment: in the (unlikely) event that the initial SYN is duplicated (this may occur, for example, because underlying network protocols duplicate network packets), then the three-way handshake allows the duplication to be detected, so that only a single connection is created. If a connection was established before completion of the three-way handshake, then a duplicate SYN could cause a second connection to be created.
The problem with current TCP implementations is that data can only be exchanged on the connection after the initiator of the connection has received an ACK (acknowledge) segment from the peer TCP. In other words, data can be sent from the client to the server only in the third step of the three-way handshake (the ACK segment sent by the initiator). Thus, one full round trip time is lost before data is even exchanged between the peers. This lost RTT is a significant component of the latency of short web conversations.
Applications such as web browsers try to mitigate this problem using HTTP persistent connections, whereby the browser holds a connection open to the web server and reuses that connection for later HTTP requests. However, the effectiveness of this technique is decreased because idle connections may be closed before they are reused. For example, in order to limit resource usage, busy web servers often aggressively close idle HTTP connections. The result is that a high proportion of HTTP requests are cold, requiring a new TCP connection to be established to the web server.
Eliminating a round trip
Theoretically, the initial SYN segment could contain data sent by the initiator of the connection: RFC 793, the specification for TCP, does permit data to be included in a SYN segment. However, TCP is prohibited from delivering that data to the application until the three-way handshake completes. This is a necessary security measure to prevent various kinds of malicious attacks. For example, if a malicious client sent a SYN segment containing data and a spoofed source address, and the server TCP passed that segment to the server application before completion of the three-way handshake, then the segment would both cause resources to be consumed on the server and cause (possibly multiple) responses to be sent to the victim host whose address was spoofed.
The aim of TFO is to eliminate one round trip time from a TCP conversation by allowing data to be included as part of the SYN segment that initiates the connection. TFO is designed to do this in such a way that the security concerns described above are addressed. (T/TCP, a mechanism designed in the early 1990s, also tried to provide a way of short circuiting the three-way handshake, but fundamental security flaws in its design meant that it never gained wide use.)
On the other hand, the TFO mechanism does not detect duplicate SYN segments. (This was a deliberate choice made to simplify design of the protocol.) Consequently, servers employing TFO must be idempotent—they must tolerate the possibility of receiving duplicate initial SYN segments containing the same data and produce the same result regardless of whether one or multiple such SYN segments arrive. Many web services are idempotent, for example, web servers that serve static web pages in response to URL requests from browsers, or web services that manipulate internal state but have internal application logic to detect (and ignore) duplicate requests from the same client.
In order to prevent the aforementioned malicious attacks, TFO employs security cookies (TFO cookies). The TFO cookie is generated once by the server TCP and returned to the client TCP for later reuse. The cookie is constructed by encrypting the client IP address in a fashion that is reproducible (by the server TCP) but is difficult for an attacker to guess. Request, generation, and exchange of the TFO cookie happens entirely transparently to the application layer.
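To make the "reproducible but hard to guess" property concrete, here is a toy sketch of the idea. The mixing function, key, and address below are inventions for illustration only; the real algorithm is implementation-dependent (and must be cryptographically sound), as discussed further below.

    /* Toy illustration: the server derives a cookie deterministically from
       the client IP address and a secret key, so it can re-derive (rather
       than store) the cookie when the client presents it later. */
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t toy_cookie(uint32_t client_ip, uint64_t server_key)
    {
        uint64_t v = client_ip ^ server_key;

        for (int i = 0; i < 4; i++) {   /* a few xor/shift/multiply rounds */
            v ^= v >> 33;
            v *= UINT64_C(0xff51afd7ed558ccd);
            v ^= v >> 29;
        }
        return v;
    }

    int main(void)
    {
        uint64_t key = UINT64_C(0x0123456789abcdef);  /* periodically rotated */
        uint32_t ip  = 0xc0000201;                    /* 192.0.2.1 */

        /* Same inputs always yield the same cookie, so validation is
           simply recomputation and comparison. */
        printf("cookie = %016llx\n",
               (unsigned long long) toy_cookie(ip, key));
        return 0;
    }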
At the protocol layer, the client requests a TFO cookie by sending a SYN segment to the server that includes a special TCP option asking for a TFO cookie. The SYN segment is otherwise "normal"; that is, there is no data in the segment and establishment of the connection still requires the normal three-way handshake. In response, the server generates a TFO cookie that is returned in the SYN-ACK segment that the server sends to the client. The client caches the TFO cookie for later use. The steps in the generation and caching of the TFO cookie are shown in Figure 2.
Figure 2: Generating the TFO cookie
At this point, the client TCP now has a token that it can use to prove to the server TCP that an earlier three-way handshake to the client's IP address completed successfully.
For subsequent conversations with the server, the client can short circuit the three-way handshake as shown in Figure 3.
Figure 3: Employing the TFO cookie
The steps shown in Figure 3 are as follows:
- The client TCP sends a SYN that contains both the TFO cookie (specified as a TCP option) and data from the client application.
- The server TCP validates the TFO cookie by duplicating the encryption process based on the source IP address of the new SYN. If the cookie proves to be valid, then the server TCP can be confident that this SYN comes from the address it claims to come from. This means that the server TCP can immediately pass the application data to the server application.
- From here on, the TCP conversation proceeds as normal: the server TCP sends a SYN-ACK segment to the client, which the client TCP then acknowledges, thus completing the three-way handshake. The server TCP can also send response data segments to the client TCP before it receives the client's ACK.
In the above steps, if the TFO cookie proves not to be valid, then the server TCP discards the data and sends a segment to the client TCP that acknowledges just the SYN. At this point, the TCP conversation falls back to the normal three-way handshake. If the client TCP is authentic (not malicious), then it will (transparently to the application) retransmit the data that it sent in the SYN segment.
Comparing Figure 1 and Figure 3, we can see that a complete RTT has been saved in the conversation between the client and server. (This assumes that the client's initial request is small enough to fit inside a single TCP segment. This is true for most requests, but not all. Whether it might be technically possible to handle larger requests—for example, by transmitting multiple segments from the client before receiving the server's ACK—remains an open question.)
There are various details of TFO cookie generation that we don't cover here. For example, the algorithm for generating a suitably secure TFO cookie is implementation-dependent, and should (and can) be designed to be computable with low processor effort, so as not to slow the processing of connection requests. Furthermore, the server should periodically change the encryption key used to generate the TFO cookies, so as to prevent attackers harvesting many cookies over time to use in a coordinated attack against the server.
There is one detail of the use of TFO cookies that we will revisit below. Because the TFO mechanism allows a client that submits a valid TFO cookie to trigger resource usage on the server before completion of the three-way handshake, the server can be the target of resource-exhaustion attacks. To prevent this possibility, the server imposes a limit on the number of pending TFO connections that have not yet completed the three-way handshake. When this limit is exceeded, the server ignores TFO cookies and falls back to the normal three-way handshake for subsequent client requests until the number of pending TFO connections falls below the limit; this allows the server to employ traditional measures against SYN-flood attacks.
The user-space API
As noted above, the generation and use of TFO cookies is transparent to the application level: the TFO cookie is automatically generated during the first TCP conversation between the client and server, and then automatically reused in subsequent conversations. Nevertheless, applications that wish to use TFO must notify the system using suitable API calls. Furthermore, certain system configuration knobs need to be turned in order to enable TFO.
The changes required to a server in order to support TFO are minimal, and are highlighted in the code template below.
    sfd = socket(AF_INET, SOCK_STREAM, 0);   // Create socket

    bind(sfd, ...);                          // Bind to well known address

    int qlen = 5;                            // Value to be chosen by application
    setsockopt(sfd, SOL_TCP, TCP_FASTOPEN, &qlen, sizeof(qlen));

    listen(sfd, ...);                        // Mark socket to receive connections

    cfd = accept(sfd, NULL, 0);              // Accept connection on new socket

    // read and write data on connected socket cfd

    close(cfd);
Setting the TCP_FASTOPEN socket option requests the kernel to use TFO for the server's socket. By implication, this is also a statement that the server can handle duplicated SYN segments in an idempotent fashion. The option value, qlen, specifies this server's limit on the size of the queue of TFO requests that have not yet completed the three-way handshake (see the remarks on prevention of resource-exhaustion attacks above).
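As a rough illustration of how those pieces fit together, here is a minimal compilable version of the template. The port number and trivial echo service are arbitrary choices for this sketch, most error checking is omitted, and the fallback definition covers older C library headers that lack the TCP_FASTOPEN constant (the server side needs a kernel with server TFO support):

    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    #ifndef TCP_FASTOPEN
    #define TCP_FASTOPEN 23                /* value from the Linux kernel headers */
    #endif

    int main(void)
    {
        struct sockaddr_in addr;
        int qlen = 5;                      /* limit on pending TFO requests */
        int sfd, cfd;

        sfd = socket(AF_INET, SOCK_STREAM, 0);

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(7777);       /* arbitrary example port */
        bind(sfd, (struct sockaddr *) &addr, sizeof(addr));

        /* IPPROTO_TCP is equivalent to the SOL_TCP used above */
        setsockopt(sfd, IPPROTO_TCP, TCP_FASTOPEN, &qlen, sizeof(qlen));
        listen(sfd, 5);

        for (;;) {
            char buf[1024];
            ssize_t n;

            cfd = accept(sfd, NULL, NULL);

            /* With TFO, data carried in the client's SYN may already be
               waiting to be read here, one RTT earlier than usual. */
            while ((n = read(cfd, buf, sizeof(buf))) > 0)
                write(cfd, buf, n);        /* echoing is trivially idempotent */

            close(cfd);
        }
    }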
The changes required to a client in order to support TFO are also minor, but a little more substantial than for a TFO server. A normal TCP client uses separate system calls to initiate a connection and transmit data: connect() to initiate the connection to a specified server address and (typically) write() or send() to transmit data. Since a TFO client combines connection initiation and data transmission in a single step, it needs to employ an API that allows both the server address and the data to be specified in a single operation. For this purpose, the client can use either of two repurposed system calls: sendto() and sendmsg().
The sendto() and sendmsg() system calls are normally used with datagram (e.g., UDP) sockets: since datagram sockets are connectionless, each outgoing datagram must include both the transmitted data and the destination address. Since this is the same information that is required to initiate a TFO connection, these system calls are recycled for the purpose, with the requirement that the new MSG_FASTOPEN flag must be specified in the flags argument of the system call. A TFO client thus has the following general form:
    sfd = socket(AF_INET, SOCK_STREAM, 0);

    // Replaces connect() + send()/write()
    sendto(sfd, data, data_len, MSG_FASTOPEN,
           (struct sockaddr *) &server_addr, addr_len);

    // read and write further data on connected socket sfd

    close(sfd);
If this is the first TCP conversation between the client and server, then the above code will result in the scenario shown in Figure 2, with the result that a TFO cookie is returned to the client TCP, which then caches the cookie. If the client TCP has already obtained a TFO cookie from a previous TCP conversation, then the scenario is as shown in Figure 3, with client data being passed in the initial SYN segment and a round trip being saved.
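To make the client side concrete as well, here is a minimal compilable sketch along the lines of the template above. The address (192.0.2.1, a documentation address), port, and request string are invented for illustration; client-side TFO must be enabled via the sysctl described below; and the fallback definition covers C library headers that predate the MSG_FASTOPEN flag:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    #ifndef MSG_FASTOPEN
    #define MSG_FASTOPEN 0x20000000        /* value from the Linux kernel headers */
    #endif

    int main(void)
    {
        const char *request = "GET / HTTP/1.0\r\n\r\n";
        struct sockaddr_in server_addr;
        char buf[4096];
        ssize_t n;
        int sfd;

        sfd = socket(AF_INET, SOCK_STREAM, 0);
        if (sfd == -1) {
            perror("socket");
            exit(EXIT_FAILURE);
        }

        memset(&server_addr, 0, sizeof(server_addr));
        server_addr.sin_family = AF_INET;
        server_addr.sin_port = htons(80);
        inet_pton(AF_INET, "192.0.2.1", &server_addr.sin_addr);

        /* Connection setup and data transmission in one step; on first
           contact this also requests a TFO cookie for later reuse. */
        if (sendto(sfd, request, strlen(request), MSG_FASTOPEN,
                   (struct sockaddr *) &server_addr,
                   sizeof(server_addr)) == -1) {
            perror("sendto");
            exit(EXIT_FAILURE);
        }

        while ((n = read(sfd, buf, sizeof(buf))) > 0)
            fwrite(buf, 1, n, stdout);

        close(sfd);
        exit(EXIT_SUCCESS);
    }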
In addition to the above APIs, there are various knobs—in the form of files in the /proc/sys/net/ipv4 directory—that control TFO on a system-wide basis:
- The tcp_fastopen file can be used to view or set a value that enables the operation of different parts of the TFO functionality. Setting bit 0 (i.e., the value 1) in this value enables client TFO functionality, so that applications can request TFO cookies. Setting bit 1 (i.e., the value 2) enables server TFO functionality, so that server TCPs can generate TFO cookies in response to requests from clients. (Thus, the value 3 would enable both client and server TFO functionality on the host.)
- The tcp_fastopen_cookies file can be used to view or set a system-wide limit on the number of pending TFO connections that have not yet completed the three-way handshake. While this limit is exceeded, all incoming TFO connection attempts fall back to the normal three-way handshake.
Current state of TCP fast open
Currently, TFO is an Internet Draft with the IETF. Linux is the first operating system that is adding support for TFO, but that support remains incomplete in the mainline kernel: the client-side support has been merged for Linux 3.6, while the server-side support has not, and from conversations with the developers it appears that it won't be added in the current merge window. Thus, an operational TFO implementation is likely to become available only in Linux 3.7.
Once operating system support is fully available, a few further steps need to be completed to achieve wider deployment of TFO on the Internet. Among these is assignment by IANA of a dedicated TCP Option Number for TFO. (The current implementation employs the TCP Experimental Option Number facility as a placeholder for a real TCP Option Number.)
Then, of course, suitable changes must be made to both clients and servers along the lines described above. Although each client-server pair requires modification to employ TFO, it's worth noting that changes to just a small subset of applications—most notably, web servers and browsers—will likely yield most of the benefit visible to end users. During the deployment process, TFO-enabled clients may attempt connections with servers that don't understand TFO. This case is handled gracefully by the protocol: transparently to the application, the client and server will fall back to a normal three-way handshake.
There are other deployment hurdles that may be encountered. In their CoNEXT 2011 paper, the TFO developers note that a minority of middle-boxes and hosts drop TCP SYN segments containing unknown (i.e., new) TCP options or data. Such problems are likely to diminish as TFO is more widely deployed, but in the meantime a client TCP can (transparently) handle such problems by falling back to the normal three-way handshake on individual connections, or generally falling back for all connections to specific server IP addresses that show repeated failures for TFO.
Conclusion
TFO is a promising technology that has the potential to make significant reductions in the latency of the billions of web service transactions that take place each day. Barring any unforeseen security flaws (and the developers seem to have considered the matter quite carefully), TFO is likely to see rapid deployment in web browsers and servers, as well as in a number of other commonly used web applications.
Patches and updates
Page editor: Jonathan Corbet
Distributions
CeroWrt: Bufferbloat, IPv6, and more
The CeroWrt project is an effort aimed at helping to solve a number of different problems in current home router distributions, but its primary focus is on bufferbloat. The problem of excessive buffering of network packets is endemic on the Internet as a whole, but it is much easier to start addressing the problem at the home router end, especially considering the easy availability of Linux-based firmware distributions. Beyond bufferbloat, though, CeroWrt also enables experiments with two "next generation" Internet features, IPv6 and DNSSEC.
CeroWrt is built atop the OpenWrt project's router firmware. It uses the OpenWrt development version ("Attitude Adjustment") with extras added by the CeroWrt team. Unlike OpenWrt's extensive list of supported hardware, CeroWrt focuses on supporting just two router devices: the Netgear WNDR3700v2 and WNDR3800. Both are capable devices with free driver support for all of the hardware and, importantly, the wireless networking hardware.
The most recent release is 3.3.8-10 from July 9. There is a 3.3.8-11 version available, but project lead Dave Täht suggested that people steer clear until a problem with the 5GHz wireless AP is resolved. Installing CeroWrt is fairly straightforward, either through the web-based GUI by uploading the "sysupgrade" image, or via tftp using the "factory" image.
Once the device has been flashed, one can connect to it on its default address, 172.30.42.1. CeroWrt specifically chose to avoid the other blocks of non-routable IP addresses (10.0.0.0/8 and 192.168.0.0/16) so that it can be experimented with in existing networks. Most home networks live in 192.168.x.y space and the 10.x.y.z addresses are often used by Internet backbones. The web UI is hosted on port 81 (and only available on the inside of the network, not via the WAN) so that users can use port 80 for their own router-based web site if they wish.
![[CeroWrt status]](https://static.lwn.net/images/2012/cero-status-sm.png)
The web UI is very similar to that of the current OpenWrt "Backfire" (10.03.1) release that I run on my venerable Linksys WRT54GL. The UI is built using LuCI, a Lua-based tool for building web interfaces for embedded devices. LuCI is noticeably snappier on the WNDR3700v2 that I used for CeroWrt testing than it is on the WRT54GL—presumably due to a faster CPU. The interface provides a great deal of status information, as well as allowing users to change various configuration settings. Everything from updating the firmware and checking firewall rules to changing DNS settings and examining system logs is available through the interface. In addition, there are various realtime graphs of system load, network connections, bandwidth usage, and so on.
The first steps after connecting to the router are some predictable things like setting the root password and adding wireless passwords, but there is another important step: enabling and configuring Active Queue Management (AQM). Essentially, one must determine the download and upload speeds of the Internet link (using something like SpeedTest.org), plug those values into the web form, and enable AQM. Testing bandwidth that way is a one-time, static measurement, so dynamic changes in link speed are not reflected; that is sub-optimal, and the project is looking at better tests and at ways to set those values automatically. It should also be noted that in limited testing, no real difference was apparent (even when copying large files while doing something interactive) with AQM enabled or disabled; more study is clearly required.
![[CeroWrt traffic graph]](https://static.lwn.net/images/2012/cero-traffic-sm.png)
The wireless networking setup is rather different than what OpenWrt (at least for Backfire) provides. There are four separate SSIDs for various kinds of WiFi access. CEROwrt and CEROwrt5 provide normal access for 2.4 and 5GHz respectively, while CEROwrt-guest and CEROwrt-guest5 are for guest access. By default, they all act as open access points and do not require a password, but enabling WPA2 for the non-guest SSIDs (at least) is suggested. There are also two babel SSIDs which are there to support mesh networking.
The guest SSIDs correspond to the guest zone in the firewall configuration. By default, guest traffic can only go to the Internet, so it does not have access to other devices on the local network. That allows one to give access to visitors (and neighbors) without risking unauthorized access to systems behind the firewall. The 172.30.42.x address space is broken up into separate sub-networks such that each SSID gets its own set of 30 IP addresses, as does each set of wired, mesh, and DMZ devices.
But the main focus of CeroWrt is to experiment with solutions to the bufferbloat problem. To that end, it uses the 3.3.8 kernel (the CeroWrt release numbering follows that of the underlying kernel) with the addition of the controlled delay (CoDel) AQM algorithm. CoDel requires the byte queue limits feature that was added in the 3.3 kernel.
But there are additional goals for the project, and IPv6 support ("make IPv6 networking in the home as simple as IPv4") is near the top of the list. While it isn't as "simple" as IPv4 (yet), the instructions are pretty easy to follow to have the router use a 6in4 tunnel, as well as to provide IPv6 on the local net. That makes CeroWrt a nice choice for experimenting with IPv6 as well, though some UI support to configure it would be welcome.
There are other features to experiment with as well, including DNSSEC and the mesh networking, though I didn't try those out.
Overall, switching over to the CeroWrt-powered router went smoothly, with very few hitches other than a balky router "authentication" web application at my ISP. The addition of 5GHz WiFi is welcome (though my ISP is typically the bottleneck anyway), as is the availability of a guest zone. In fact, I haven't moved back to the old router, though I probably will at some point so that the WNDR3700v2 can be used for experiments without upending "Words with Friends" in the other room. The router is cheap enough that getting a second (or more likely a WNDR3800 at less than $150) to replace the WRT54GL is certainly a possibility. Though messing around with mesh networking between them might still result in spousal complaints.
Täht's 3.3.8-10 release announcement outlined the way forward (or a way forward) for CeroWrt. There is lots of work to be done, but the bufferbloat projects, including CeroWrt, are currently unfunded. That is clearly making it difficult for Täht to continue working on CeroWrt, at least to the level he would like. While it appears that there are lots of volunteers and companies helping out, the overall project-maintainer role is languishing to some extent.
But, as he points out, all of the CeroWrt work is being pushed upstream to OpenWrt (and CeroWrt frequently merges back as well). The two projects are focused in different areas, but there is clearly some synergy between them, which is likely to help both. It is a bit unclear when a "stable" CeroWrt release might be forthcoming, but it is pretty usable in its current form. What it most needs, perhaps, is some developer time and, possibly, some funding.
Brief items
Distribution quote of the week
* The Debian package maintainer is dead, but nobody noticed it yet, and nobody has wanted an update badly enough to do an NMU or to adopt the package.
* The upstream release is actually a fake. It's a trojan, which was put there by the NSA in order to infiltrate the CIA mainframe. The Debian package maintainer noticed this and uploaded that version of the package to non-free instead of main, since the trojan code does not come with proper source.
* Upstream has moved the RSS feed for new releases without notifying the old feed of the move, so the Debian package maintainer missed that, and doesn't actually know about the new release. Due to a complicated series of happenstance involving rainbows, midget unicorns, and the ongoing rewrite of the Netsurf web browser, the Debian package maintainer is not able to find the new feed because it would require doing a web search and their browser doesn't have working form support now. No other browser is available on the Amiga they're using as their only computer, either.
* The new release is requested by insistent Hurd porters, and the Debian package maintainer absolutely loathes the Hurd, and will refuse to upload any packages that work on the Hurd.
* The Debian package maintainer suffers from mental problems cause by reading debian-devel too much, and now has a nervous breakdown every time they recognize a name as someone whom they've seen on the list.
* The Debian development process is being sabotaged by Microsoft sending people to the developers' houses pretending to be TV license checkers or Jehova's witnesses every time they detect, using the hardware wireless keylogger embedded in every PC, that the developer is trying to run any Debian packaging command.
* Apple is also sabotaging Debian by paying me to write snarky e-mails on Debian mailing lists to distract everyone from working on the actual release, so that we can get past the freeze and start uploading things again without having to worry that it breaks things in ways that makes the freeze longer.
Distribution News
Debian GNU/Linux
Debian's new draft trademark policy
The Debian project is attempting to rewrite its trademark policy to be "as free as possible" while still protecting the project's identity; project leader Stefano Zacchiroli has just announced a new draft for consideration. "The objective of this trademark policy is to encourage widespread use and adoption of the DEBIAN trademarks, styles, and logos (hereinafter ``trademarks'') while ensuring consistent usage which avoids any confusion in the mind of the users. The goal of this policy is to encourage use of the DEBIAN mark in commercial or non-commercial activity based around DEBIAN."
Bits from the nippy Release Team
The Debian release team reports that wheezy still has far too many RC bugs; "As mentioned in the freeze announcement [RT:FRZ], the number of RC bugs in wheezy is still significantly larger than would normally be expected at the start of a freeze. Please feel encouraged to fix a bug (or three) from the list [BTS:RC] to help get issues resolved in testing." They also tell us that Debian 8.0 (wheezy+1) will be known as "Jessie".
Fedora
New features for Fedora 18
The minutes from the July 30 meeting of the Fedora Engineering Steering Committee show that an impressive list of new features has been approved for the Fedora 18 release. New goodies in F18 will include Samba4, the GNOME2-based MATE desktop, the Linux Trace Toolkit next generation (LTTng), OwnCloud, the Federated Filesystem, and more.
New Sponsor of Fedora Infrastructure
Colocation America is now a sponsor of Fedora Infrastructure with the donation of a server in their Los Angeles data center. "We have put this server to use as a proxy and application server, so if you are going to any fedoraproject.org sites and you are in North America you will likely be accessing us from there."
Newsletters and articles of interest
Distribution newsletters
- Debian Project News (July 30)
- DistroWatch Weekly, Issue 467 (July 30)
- Maemo Weekly News (July 30)
- Ubuntu Weekly Newsletter, Issue 276 (July 29)
SUSE Linux powers 147,456-core German supercomputer (ars technica)
Ars technica has a brief look at the world's fastest x86-based supercomputer and Europe's fastest supercomputer—not to mention the 4th most powerful in the world. The SuperMUC, which runs SUSE Linux, is located at the Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences. "A statement issued by SUSE says that the supercomputer has a unique cooling system inspired by human blood circulation that significantly reduces energy consumption. The supercomputer is reportedly designed so that some of the energy can be recaptured and used to heat buildings at the LRZ campus. The statement also says that the SuperMUC has 155,000 processor cores capable of delivering a total of 3 petaflops of processing power. A report on Slashdot indicates that the computer has 324 terabytes of memory."
Page editor: Rebecca Sobol
Development
Toward generic atomic operations
Modern Linux distributions support a number of different computer architectures. Each of these architectures has its own quirks and implementation differences that are largely abstracted by a clever collaboration between the kernel and system libraries, such as the GNU C Library (glibc). However, there are still some ways in which core architecture differences are exposed to higher-level software. One example of this is in the implementation of atomic memory operations.
Atomic operations are necessary to ensure programming correctness in those situations where there are multiple threads of simultaneous execution. (Atomic operations are even necessary on uniprocessor systems, where interrupts and asynchronous scheduling of other threads provide the illusion of multithreading.) Atomicity means that a given operation (such as incrementing a counter variable) takes place in an indivisible fashion; its result is either visible to all CPUs in the system instantaneously, or does not take place at all (similar in concept to, but less fashionable than, transactional memory).
Atomic operations are typically fairly small, hand-optimized assembly functions that provide for atomic increment and decrement of counters, acquisition and release of locks, and so on. Since these operations differ from one architecture to another, typically few developers on any given project understand the different implementations in their entirety, and even fewer care to vouch for the code being correct across all supported architectures. Although there are generic implementations available in libraries such as pthreads, not all projects can make use of them, for a variety of reasons, including a desire to be portable to non-Linux platforms; thus a number of projects within the average Linux distribution still contain their own custom implementations of atomic operations.
Atomic operations are particularly useful on modern systems with many CPUs running multi-threaded applications, but even a system with a single CPU (core) has a need for them. After all, the Linux kernel may interrupt a running task thread "A" to service an interrupt routine, and may then schedule a different task thread "B" before returning to the one that was originally interrupted. Without a means to ensure certain operations have taken place atomically, there would be no way to cope with potential interference between task thread B and task thread A (e.g., if both threads race for the same lock or operate on the same variable).
How do atomic operations work?
Atomic operations fundamentally require underlying hardware support. There are, broadly speaking, two popular mechanisms used by CPUs to implement support for atomic operations in modern computer architectures. The older, more traditional approach involves directly manipulating memory locations, for example with a compare-and-swap (or compare-and-exchange) instruction such as CMPXCHG on Intel's x86 and Itanium architectures. These instructions compare the value of a given memory location with a value supplied as part of the instruction. If the two values are identical, then yet another supplied value is written back to the memory location, while the overall result is signaled in the form of a returned value (almost universally in a register). This whole sequence takes place as a single processor instruction, for example by locking the processor's local bus and disabling external interrupts for the duration.
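As a concrete (if simplified) sketch of how software typically builds on such an instruction, an atomic add can be constructed from compare-and-swap in a retry loop. This example is not taken from any particular project; it uses the GCC __atomic built-ins discussed later in this article rather than raw assembly:

    /* Build atomic add from compare-and-swap; requires GCC 4.7 or later. */
    #include <stdbool.h>

    static int atomic_add_cas(int *addr, int delta)
    {
        int old = __atomic_load_n(addr, __ATOMIC_RELAXED);
        int new;

        do {
            new = old + delta;
            /* On failure, 'old' is updated with the value actually found
               in memory, so the next iteration retries from there. */
        } while (!__atomic_compare_exchange_n(addr, &old, new, false,
                                              __ATOMIC_SEQ_CST,
                                              __ATOMIC_RELAXED));
        return new;
    }

If another CPU modifies the location between the load and the compare-and-swap, the exchange fails and the loop simply recomputes and retries, which is the same retry structure seen in the reservation-engine example below.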
A more modern alternative to directly acting upon memory locations is to implement a reservation engine within the processor. A reservation engine (as used by modern RISC architectures such as ARM, POWER, etc.) is typically implemented under the control of two special processor instructions. The first instruction, often called load-with-reservation, load-exclusive, or load-link, atomically loads the value of a given memory location into a register and marks that memory location as reserved.
The loaded value can then be manipulated arbitrarily before a second instruction, often called store-exclusive, or store-conditional, atomically stores an updated value from a register back to a given memory location, provided that no modification has been made to that memory location in the interim. The store-exclusive operation returns a value indicating whether the store operation completed successfully or not, which is important because there is an opportunity for external interference between the load, modification, and subsequent store. This means that higher-level atomic operations built using these instructions typically involve a loop, retrying the entire operation until the store is successful.
Reservation engines are slightly more complex to work with in software (requiring two instructions and a comparison in a loop block), but they come with multiple benefits. Although compare-and-exchange appears simpler because it is implemented in a single instruction, it in fact causes poor performance in the CPU pipeline because multiple additional sub-stages are required for the implied memory operations. By contrast, the reservation engine approach explicitly separates memory reads and stores into multiple operations. A reservation engine can be implemented separately from the bulk of the core CPU logic, and can be as complex as desired (including necessary logic to synchronize with other reservation engines).
Some reservation engine implementations handle only a single memory location at a time on a given processor, while others are more complex. In every case, outstanding reservations are invalidated upon a context switch between running tasks (often as a result of a specific invalidation in the context-switch code). The reservation approach can also handle the "ABA problem"—that is, it can detect any changes to the target memory location after the atomic load, even if the original value is written back prior to the store, because the reservation engine is aware of all memory modifications.
The story doesn't quite end there. Some architectures lack full support for certain atomic operations that are required by higher-level software, such as atomic 64-bit (multiple word) load and store operations. In this case, there are workarounds (e.g. the "kuser" helper, a VDSO-like helper on older ARM processors), but that is a topic best saved for another article.
How are atomic operations used?
Atomic-operations libraries typically provide a set of functions that include incrementing and decrementing a memory location, compare-and-swap of a memory location, and higher-level operations built using these functions, such as lock acquisition and release. All of these various operations are built using the fundamental architecture-specific processor instructions of the kind described above. As an example, the OpenMPI message-passing library includes the following inline assembly code to implement an atomic 32-bit addition operation on version 7 of the ARM Architecture:
    START_FUNC(opal_atomic_add_32)
    LSYM(13)
        ldrex   r2, [r0]      @ exclusively load address at r0 into r2
        add     r2, r2, r1    @ increment the value of r2 with value in r1
        strex   r3, r2, [r0]  @ attempt to store the value of r2 at the address in r0
        cmp     r3, #0        @ r3 contains result from store exclusive, test if successful
        bne     REFLSYM(13)   @ repeat entire operation if it was interrupted
        mov     r0, r2        @ return value that was written
        bx      lr
    END_FUNC(opal_atomic_add_32)
This atomic increment function works by using the special ldrex and strex instructions, which control the CPU's reservation engine, to gain exclusive access to a desired memory location. The example code first loads the contents of a given memory location into a general-purpose register, adds a value to the register, and then tests the result of attempting to exclusively store this change back to memory. If it is successful, the function returns. If it is not successful, the operation is repeated until it completes without interference.
OpenMPI includes a custom atomic-operations library that implements support for 13 different base architectures. Some of those architectures have multiple ways to achieve the same thing, depending on which version is in use. For example, ARM processors have moved away from the deprecated SWP (swap) instruction in favor of a reservation-engine-based approach. Both approaches need to be supported if code is to run on newer and older ARM processors. It is unfortunate that projects such as OpenMPI have needed to implement their own atomic-operations libraries, which must be periodically updated for new processors and are hard to maintain because they require special knowledge of multiple underlying architectures.
The C11 memory model
The main culprit for this state of affairs is the venerable C programming language. Traditionally, C had no explicit internal notion of multi-threaded applications, and only a very weakly ordered memory model. That is, it was hard to guarantee that the compiler would not reorder memory operations on shared variables because the language lacked the built-in constructs that are necessary to inform the compiler of such hidden data dependencies. Over the years, independent platform-specific libraries have provided support for general threading abstractions, including atomic operations performed on their own defined types. This is all well and good, but not all projects can rely upon such platform libraries for atomic operations, especially those that want to remain highly portable to non-Linux systems.
This is where C11 comes in. C11 introduces a new memory model explicitly designed with support for threading and atomic operations. It introduces the new standard header stdatomic.h, atomic integer types such as atomic_int (constructed using the _Atomic type qualifier), and a new memory_order enumerated type that defines various levels of memory ordering, from the weakest, memory_order_relaxed (no specific ordering requirement), through to the strongest, memory_order_seq_cst (sequentially consistent). Using the C11 memory model, the previous inline assembly can be reduced to defining an _Atomic typed variable and using one of the atomic fetch-and-modify generic functions, such as atomic_fetch_add().
Here is an example of using the C11 defined atomics:
    #include <stdatomic.h>

    _Atomic int magic_number = ATOMIC_VAR_INIT(42); // can also use _Atomic(int)

    atomic_fetch_add(&magic_number, 5);             // make Star Trek fans happy
This defines and initializes a new variable called magic_number to the value 42 (using ATOMIC_VAR_INIT()) before correcting that value for the true answer to the ultimate question of life, the universe, and everything, which, as everyone knows, Star Trek correctly defined to be 47. Using the new C11 extensions, projects such as OpenMPI do not need to implement their own atomic-operations library, because the underlying language now provides the necessary support, already optimized for each target architecture.
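The memory_order values described above are passed via the _explicit variants of the generic functions. A brief sketch, once compiler support is in place (the flag-based publish pattern and the function names here are chosen for illustration):

    #include <stdatomic.h>

    static _Atomic int ready = ATOMIC_VAR_INIT(0);

    void publish(void)
    {
        /* Release ordering: writes made before this store become visible
           to any thread that observes ready == 1 via an acquire load. */
        atomic_store_explicit(&ready, 1, memory_order_release);
    }

    int is_ready(void)
    {
        return atomic_load_explicit(&ready, memory_order_acquire);
    }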
There is, however, at least one little problem with rushing to embrace C11. As of this writing, GCC and glibc do not yet have full support for the new atomic types. This is slated to be added in the GCC 4.8 time frame. (The glibc maintainers are aware of the topic, and plan to incorporate support once it is available in GCC.) Meanwhile, GCC 4.7 gained support for a new set of built-in functions to provide memory-model-aware atomic operations that were designed specifically to meet the requirements of the C11 memory model. The idea is that the higher-level C11 atomic primitives can be easily built using these built-ins in time for GCC 4.8. In the meantime, there are several alternative options. One of these is to use a third-party macro-based implementation of the C11 types (which already use the GCC built-ins), of which several exist.
Another option prior to broader C11 support being available is to use the new GCC built-ins directly. For common operations, such as atomic increment, the GCC 4.7 built-in atomic functions look very similar to those that form part of the broader C11 standard:
    __atomic_store_n(&v, 42, __ATOMIC_SEQ_CST);
    __atomic_add_fetch(&v, 5, __ATOMIC_SEQ_CST);
The OpenMPI atomic-add example code could thus be replaced with a single call to __atomic_add_fetch(), which will atomically fetch a value from a memory location, add a supplied value, and return the result, doing the right thing for every supported architecture. Compiling the example and disassembling it will (unsurprisingly) produce a sequence of operations that appears very similar to the inline assembly it replaces. Of course, one does need to be careful in using the GCC built-ins directly because they do not require the use of variables with an _Atomic type qualifier. This means that it is possible to mix the use of atomic functions with manipulations of regular variables without triggering any compiler warnings. Still, this is no different from existing code that manipulates a shared variable directly instead of calling the special inline assembly function, which is equally incorrect.
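To see the built-ins in action, here is a self-contained demonstration written for this article (assuming GCC 4.7 or later and pthreads): two threads hammer a shared counter, which stays consistent only because the increment is atomic:

    #include <pthread.h>
    #include <stdio.h>

    static int counter;   /* note: no _Atomic qualifier is required (or checked) */

    static void *worker(void *arg)
    {
        (void) arg;
        for (int i = 0; i < 1000000; i++)
            __atomic_add_fetch(&counter, 1, __ATOMIC_SEQ_CST);
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        printf("counter = %d\n", counter);   /* prints 2000000 */
        return 0;
    }

Built with, say, gcc -std=gnu99 -pthread, the program prints 2000000 every time; replace the built-in with a plain counter++ and the result routinely comes up short, which is exactly the class of bug that atomic operations exist to prevent.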
In time, it is the hope of this author that most projects implementing custom inline assembly for atomic operations can move to a standard C11-based implementation using stdatomic.h. That would be both portable to many different platforms and easier for distributions and upstream projects to maintain, because specialist knowledge of architecture specifics is abstracted away by the compiler. (Note, however, that projects may need to continue to support legacy approaches to atomic operations if they want to continue supporting old compilers.) You can read more about the new C11 atomic operations (including the details of the memory model and its orderings not covered in depth here) in section 7.17.7 of the final draft version of the C11 specification [PDF].
Brief items
New Cygwin Package: python-3.2.3-1
Cygwin Python 3.2.3-1 is now available. This is the first official release of the package supporting Python 3.
Edda log visualizer released
The first release of Edda, a log visualizer for MongoDB, has been made available. "MongoDB servers generate some pretty substantial log files. These lengthy logs are one of the more important tools we have for diagnosing issues with MongoDB servers. However, correlating logs from multiple servers can be time-consuming. Enter Edda, a log visualizer for MongoDB." This release focuses on visualizations of replica sets, with more features planned for the future.
Newsletters and articles
Development newsletters from the last week
- Caml Weekly News (July 24)
- What's cooking in git.git (July 30)
- Haskell Weekly News (July 25)
- Perl Weekly (July 30)
- PostgreSQL Weekly News (July 29)
- Ruby Weekly (July 26)
KDE Release 4.9 – in memory of Claire Lotion
KDE has released version 4.9, providing major updates to KDE Plasma Workspaces, KDE Applications, and the KDE Platform. "This release is dedicated to the memory of KDE contributor Claire Lotion. Claire's vibrant personality and enthusiasm were an inspiration to many in our community, and her pioneering work on the format, scope and frequency of our developer meetings changed the way we go about implementing our mission today. Through these and other activities she left a notable mark on the software we are able to release to you today, and we are grateful for and humbled by her efforts."
OpenStreetMap bot removes waypoints after licensing change (The H)
The H writes about changes in OpenStreetMap (OSM) data. The title is a little misleading as the licensing change hasn't actually happened yet, but OpenStreetMap is preparing for it by removing data from people that did not consent to the change. "The reason for the licensing change is that the current Creative Commons licence is largely inapplicable to collections of data such as the OpenStreetMap mapping database. The Open Database licence has been developed to resolve this problem. Like the Creative Commons licence, it is a share-alike licence, meaning users must return any improvements or changes to the data to the community." The removal is said to be "barely noticeable in many places" but there have been some complaints in the OSM community.
Otte: staring into the abyss
On his blog, Benjamin Otte has some observations and criticisms of the GNOME project. He outlines a number of problem areas including understaffing, a loss of market and mind-share, and a lack of clear goals. "In fact, these days GNOME describes itself as a “community that makes great software”, which is as nondescript as you can get for software development. The biggest problem with having no goals is that you can’t measure yourself. Nobody can say if GNOME 3 is better or worse than GNOME 2. There is no recognized metric anywhere. This also leads to frustration in lots of places."
Page editor: Nathan Willis
Announcements
Brief items
FSFE wants to better protect free software licenses from bankruptcy
The Free Software Foundation Europe is working on an interesting problem: what happens to free software licenses when the rights holder goes bankrupt? The organization currently is pushing a change to German bankruptcy law in particular: "The clause ensures that Free Software licensing model would not be negatively affected by a bankruptcy of a licensing rights holder. It makes it clear that any offer to grant Free Software license made before the licensor's bankruptcy can be accepted by anyone even after the bankruptcy proceedings started."
Articles of interest
Free Software Supporter -- Issue 52, July 2012
The July edition of the Free Software Foundation's monthly newsletter covers the winner of the Restricted Boot webcomic contest, an update to the Guide to DRM-free Living, Compliance Lab, the solution to Posner's patent problem, a 5-part interview with Richard Stallman on Restricted Boot, and several other topics.
New Books
Think Like a Programmer--New from No Starch Press
No Starch Press has released "Think Like a Programmer" by V. Anton Spraul.
Learning Rails 3 -- New from O'Reilly Media
O'Reilly Media has released "Learning Rails 3" by Simon St. Laurent, Edd Dumbill and Eric J Gruber.
Calls for Presentations
PyCon UK Call for Papers
PyCon UK will take place September 28 - October 1, 2012 in Coventry, West Midlands, UK. The call for papers is open until August 14. "If you would like to share your expertise, tell us your horror stories or pimp your project, please consider giving a talk at PyConUK."
Call for Papers: PyHPC 2012
PyHPC will take place November 16 in Salt Lake City, Utah. The call for papers closes September 14. It will be held in conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis (SC12).
Upcoming Events
LPC microconference topics announced
The Linux Plumbers Conference (August 29-31, San Diego) has posted a detailed agenda showing the topics to be covered this year. "As you can see we have a wide range of issues to tackle and this year’s Linux Plumbers Conference is shaping up to be a great event." The early registration period is also about to come to an end.
LPI Forum in Warsaw, Poland
The Linux Professional Institute (LPI) and its affiliate, LPI-Central Europe, will host a forum for Linux professionals on September 28, 2012, in Warsaw, Poland. "The forum will feature speakers on a number of technology subjects including Free and Open Source Software solutions, professional skills development, IT innovation and entrepreneurship, lifelong learning, Linux certification and the workforce development of Linux and Open Source professionals."
Events: August 2, 2012 to October 1, 2012
The following event listing is taken from the LWN.net Calendar.
| Date(s) | Event | Location |
|---|---|---|
| August 3–4 | Texas Linux Fest | San Antonio, TX, USA |
| August 8–10 | 21st USENIX Security Symposium | Bellevue, WA, USA |
| August 18–19 | PyCon Australia 2012 | Hobart, Tasmania |
| August 20–22 | YAPC::Europe 2012 in Frankfurt am Main | Frankfurt/Main, Germany |
| August 20–21 | Conference for Open Source Coders, Users and Promoters | Taipei, Taiwan |
| August 25 | Debian Day 2012 Costa Rica | San José, Costa Rica |
| August 27–28 | XenSummit North America 2012 | San Diego, CA, USA |
| August 27–28 | GStreamer conference | San Diego, CA, USA |
| August 27–29 | Kernel Summit | San Diego, CA, USA |
| August 28–30 | Ubuntu Developer Week | IRC |
| August 29–31 | 2012 Linux Plumbers Conference | San Diego, CA, USA |
| August 29–31 | LinuxCon North America | San Diego, CA, USA |
| August 30–31 | Linux Security Summit | San Diego, CA, USA |
| August 31–September 2 | Electromagnetic Field | Milton Keynes, UK |
| September 1–2 | Kiwi PyCon 2012 | Dunedin, New Zealand |
| September 1–2 | VideoLAN Dev Days 2012 | Paris, France |
| September 1 | Panel Discussion Indonesia Linux Conference 2012 | Malang, Indonesia |
| September 3–8 | DjangoCon US | Washington, DC, USA |
| September 3–4 | Foundations of Open Media Standards and Software | Paris, France |
| September 4–5 | Magnolia Conference 2012 | Basel, Switzerland |
| September 8–9 | Hardening Server Indonesia Linux Conference 2012 | Malang, Indonesia |
| September 10–13 | International Conference on Open Source Systems | Hammamet, Tunisia |
| September 14–16 | Debian Bug Squashing Party | Berlin, Germany |
| September 14–21 | Debian FTPMaster sprint | Fulda, Germany |
| September 14–16 | KPLI Meeting Indonesia Linux Conference 2012 | Malang, Indonesia |
| September 15–16 | Bitcoin Conference | London, UK |
| September 15–16 | PyTexas 2012 | College Station, TX, USA |
| September 17–19 | Postgres Open | Chicago, IL, USA |
| September 17–20 | SNIA Storage Developers' Conference | Santa Clara, CA, USA |
| September 18–21 | SUSECon | Orlando, Florida, US |
| September 19–20 | Automotive Linux Summit 2012 | Gaydon/Warwickshire, UK |
| September 19–21 | 2012 X.Org Developer Conference | Nürnberg, Germany |
| September 21 | Kernel Recipes | Paris, France |
| September 21–23 | openSUSE Summit | Orlando, FL, USA |
| September 24–25 | OpenCms Days | Cologne, Germany |
| September 24–27 | GNU Radio Conference | Atlanta, USA |
| September 27–29 | YAPC::Asia | Tokyo, Japan |
| September 27–28 | PuppetConf | San Francisco, US |
| September 28–30 | Ohio LinuxFest 2012 | Columbus, OH, USA |
| September 28–30 | PyCon India 2012 | Bengaluru, India |
| September 28–October 1 | PyCon UK 2012 | Coventry, West Midlands, UK |
| September 28 | LPI Forum | Warsaw, Poland |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol