
LWN.net Weekly Edition for January 24, 2013

Fifteen years of LWN

By Jonathan Corbet
January 23, 2013
Sometime around mid-1997, your editor and Liz Coolbaugh were discussing ways in which one might go about starting a business around the Linux operating system. As we watched the Unix wars and awaited the arrival of the universally-proclaimed operating system of the future (a thing called "Windows NT"), we realized that Linux was the best hope there was for those who wished to avoid a future dominated by proprietary, closed systems. Linux seemed to be going somewhere in a time when Unix-based systems as a whole were beginning to struggle, and we wanted to help it get there. And besides, working with Linux was a lot of fun.

The idea we went with was to form a support company joined into Red Hat's ill-fated support partner network. But how were we going to attract customers — and keep busy while waiting for those customers to show up? The idea we came up with was simple enough: start a web-based newsletter to help the world keep up with what was happening in the insanely fast-moving Linux world (the linux-kernel list sometimes carried a shocking 100 messages in a single day back then) and, at the same time, inform that world of just how clever and on top of the situation we were.

So that is what we set out to do. The first LWN.net server went on the net in January, 1998, though we would not acquire that domain until much later that year. It ran on an old machine in your editor's basement and served its content over a single ISDN line. We published the January 22 issue when we had something that looked somewhat reasonable, thus establishing the Thursday publication cycle without any conscious thought on the matter. One week later, with a second issue written (headlined by the timely announcement that the Netscape browser would be open-sourced), we sent a message to the comp.os.linux.announce newsgroup telling the world of our existence, and life was never the same thereafter.

Like many business plans, ours failed to survive contact with the real world; a number of its successors fared no better. But, through it all, we kept LWN going. It didn't take long for the ISDN line to prove inadequate, even on a site with almost no image content at all. Linux began to take off for real as it led the final wave of the dotcom boom; LWN's readership rose with it. Eventually we realized that, while our various business schemes never seemed to get far, people were always interested in LWN. Far too late, we figured out that, perhaps, LWN was the business we'd been trying to build all along.

So, toward the end of 1999, we set ourselves to that task in earnest. Our long-suffering readers have heard much about our ups and downs over the years, but, by one obvious metric, LWN is a success: fifteen years after that first issue, LWN.net is still here. There is no shortage of work to do or things to improve, but somehow we seem to have found a way to do enough right to stick around.

We have watched Linux grow from a "hobbyist" system that few took seriously into the platform on which much of the world's computing is based. When we started, the number of people paid to work on Linux could perhaps have been tracked efficiently with an eight-bit variable; now it would be hard to even begin to guess how big the Linux employment market is. We have seen companies try to FUD Linux out of existence; others have tried to claim ownership of it. And we've seen Linux survive these challenges and more; Linux, too, is still here.

When LWN started, the community had no real idea of how to run a free software project involving hundreds or thousands of people. Those that tried often ran into trouble; the kernel process choked several times while others, like the project to make a workable browser out of the Netscape code, often seemed on the verge of collapsing under their own weight. The evolution of our software over the last fifteen years has been impressive, but the evolution of our community is doubly so. We can now take on projects that seemed unattainable even in the middle of dotcom boom optimism.

Fifteen years ago, we were a small, youthful band that thought it could change the world and have fun in the process. It is fair to say that both objectives were achieved nicely. Now we are numerous, older, professional, and tightly tied into the market economy; the wild-west days are mostly behind us. There will be plenty of work to do on Linux for a long time, but one might well ask: are our days of changing the world done?

The answer to that question is almost certainly "no." We have, at this point, succeeded in the creation of a large body of software that is not under the control of any one person or company. That software now forms the platform used for the growing swarm of ubiquitous devices; as these devices get smaller and cheaper, they will only become more prevalent. We have established the expectation that the code for these devices should be available and free, and we have promoted the idea that the devices themselves should be open and hackable. But we have not yet fully created the basis for free computing and, with it, a more free society. There is a lot of work to be done yet in that area.

When LWN got its start, our community's objective was simple: create a freer, better Unix. We have long since crossed that one off the list; now we need a better operating system for the devices — and the challenges — of the future. The problem is that we don't yet know what that operating system needs to look like. Unix embodies a great many solid design principles, but a system that was designed for slow terminals on minicomputers cannot be expected to be ideal for a phone handset, much less for hardware that we cannot yet envision. The system must evolve, perhaps in ways that cause it to diverge considerably from its Unix roots. Guiding that evolution without fragmenting our community or losing our focus on freedom will be one of our biggest challenges in the coming years.

The next fifteen years, in other words, promise to be just as interesting as the last fifteen were; here at LWN, we plan to continue to be a part of our community as those years play out. LWN, too, will need to evolve to best meet the community's needs, but, like Linux, we will evolve while keeping that community's values at heart. Thousands of you, the best readers one could possibly ask for, have sustained us for these years and helped to keep us honest. It is our hope to serve all of you even better in the coming years. It has been quite a ride; thank you all for letting us be a part of it. We are looking forward to seeing where it takes us next.

Comments (43 posted)

LightZone reborn as free software

By Nathan Willis
January 23, 2013

One of the first high-quality raw photo editors available for Linux desktops was LightZone, but although it was (initially) free of charge, it was a proprietary product. Unfortunately the small company behind it eventually folded, and both the free and paid versions went away, as did the updates required to support newer cameras. The company shut its doors for good in 2011, but the software has made a sudden—and unexpected—comeback as an open source project. Fans of the original will be pleased, but the nascent effort still has considerable work ahead before it grows into a self-sustaining community project.

Flashback

LightZone was launched in mid-2005, first for Mac OS X, followed a few months later by Windows. But the application was written in Java, and in 2006 a developer at parent company Light Crafts began building it for Linux as well, posting the bundles (with permission) on his personal web site. The response was positive enough that Light Crafts soon began providing LightZone for Linux as an official release—one which, unlike the offerings for proprietary operating systems, was free. Perhaps that situation was bound to change (after all, there was evidently money to be made), and Light Crafts did eventually start charging for licenses on Linux, too.

But 2006 was also the year that resident 800-pound gorilla Adobe dove into the raw photo editor space with Lightroom, and Apple's Aperture (which had been around in less-feature-filled, 1.0 form since 2005) really took off. Before Apple and Adobe entered the market, many small companies offered raw photo converters, but the heavyweights captured market share quickly. New point releases of LightZone continued to arrive, but with few major additions to the tool set. The last new version was LightZone 3.9, released in early 2010. Light Crafts shut down in 2011.

[LightZone regions]

But the application's fans were still there; users seemed especially fond of LightZone's unique tools, which offered editing options not found in competing applications. These included an exposure tool designed around Ansel Adams's zone system and the ability to apply adjustments to one part of an image only by outlining regions directly on the canvas—plus a general reputation for ease-of-use. A user community emerged at the site LightZombie.org, providing updated versions of the dcraw library (on which LightZone's raw file decoding functionality depends), support files for new camera models, and (after the Light Crafts site went offline) Internet Archive links to the installer packages. Customers who had purchased a license key could still install and activate the archived packages, or use the built-in 30-day trial period.

Reboot

After Light Crafts closed up shop, visitors to the LightZombie site began lobbying to have the source code released. The site's administrators discussed the idea privately with former Light Crafts executives, but never made any progress—until December of 2012, when LightZombie's Doug Pardee posted a cryptic announcement that "In a few days, the LightZombie Project will be replaced by something much grander." There were other hints that the code might be released after all, such as the announcement that Anton Kast, the developer who had made the initial Linux port while at Light Crafts, had joined the project.

On December 22, Kast announced that he had convinced the rights holders to release the source code, and made it available at GitHub. Initially Kast made a direct import of the 3.9.x codebase, complete with the license-key-activation modules, without any documentation, and designed for the proprietary build system in use at the company. The LightZombie site was renamed LightZoneProject.org, and maintainers put out a call for volunteers in January, to which several Linux and Windows developers responded.

In the weeks since the initial release, the focus has been on getting the application to build and install successfully with free tools. The commercial product was packaged for distribution with Install4J, although, as Kast pointed out on the developers' discussion forum (which at the moment seems to require membership in order to view messages ... ), there may be little real need for an extra packaging layer, since wrapper scripts were used to launch the application on all three platforms. The IzPack tool was suggested as a plausible open source replacement, although so far it remains an open topic of discussion.

[LightZone image browser]

A bigger issue is the version of Java required. The commercial 3.9 release bundled its own version of Sun's Java 1.6, which was already out of date when Light Crafts ceased operations. It also relied on several Sun image processing classes that are no longer available, and some classes imported from Java Advanced Imaging (JAI) that were not part of the official JAI release at the time of 3.9's development. In addition, some Linux developers expressed an interest in getting the application to run on OpenJDK since it is the default on several major Linux distributions.

Over the following two weeks, though, the developers managed to successfully replace the Sun classes with modern equivalents, and used Apache Ivy to automatically pull in a current version of JAI at build time—a strategy employed by other open source projects. For now, Pavel Benak's branch is the focus of development, and the Linux port currently builds on Ubuntu and Arch Linux, either with OpenJDK 6 or 7 or with Oracle's Java 6, 7, or 8. The Windows build is reported to be working as well, albeit only with Oracle's Java 6. The Mac OS X platform, however, has seen no development so far due to a lack of volunteers.

Let there be zones

As advertised, the codebase on GitHub is essentially unchanged since the last commercial release of LightZone. Pardee has updated the dcraw library and support files, so newer cameras are supported, but the application still asks for a license key at start-up. However, the 30-day trial period is still enabled as well—a time period that can be reset at will.

The majority of the tools will feel familiar to anyone who has used another raw photo editor; just like the competition, LightZone allows the user to stack together a string of image adjustments by adding them to a list on the right hand side of the window. But LightZone does offer some tools not found in other open source photo editors. One example is ZoneMapper, the Ansel Adams–inspired tool mentioned earlier. Adams's "zones" are essentially ten stops between absolute black and absolute white. ZoneMapper presents a rectangle with ten handles corresponding to the zones; the user can drag the cut-off points up or down, and the zones on either side are compressed or expanded as a result. The same effects could be performed with the traditional Levels or Curves tools, but ZoneMapper is much easier to use.

[LightZone ZoneMapper]
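The basic idea can be sketched in a few lines of code. The snippet below is purely illustrative (it is not LightZone's implementation): it treats the zone boundaries as a piecewise-linear transfer curve over the luminance range, so that moving one cut-off point stretches the zone on one side of it and compresses the zone on the other.

    #include <stdio.h>

    /* Illustrative zone remapping, not LightZone's code: orig[] and
     * edited[] hold the 11 boundaries of the ten zones (including the
     * black and white endpoints) as luminance values from 0.0 to 1.0,
     * in strictly increasing order. */
    static double remap_zone(double lum, const double orig[11],
                             const double edited[11])
    {
        for (int z = 0; z < 10; z++) {
            if (lum <= orig[z + 1]) {
                /* Linear interpolation within zone z. */
                double t = (lum - orig[z]) / (orig[z + 1] - orig[z]);
                return edited[z] + t * (edited[z + 1] - edited[z]);
            }
        }
        return edited[10];          /* above the white point */
    }

    int main(void)
    {
        double orig[11], edited[11];

        /* Start with evenly spaced zones, then drag the middle boundary
         * upward to brighten the shadows and compress the highlights. */
        for (int i = 0; i <= 10; i++)
            orig[i] = edited[i] = i / 10.0;
        edited[5] = 0.6;

        printf("0.45 -> %.3f\n", remap_zone(0.45, orig, edited));
        return 0;
    }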

I stopped using the proprietary version of LightZone when it was no longer free (there were, after all, free software alternatives). As a result, several features were new to me, although commercial LightZone customers will find them familiar. One of these is the Relight tool, which automatically brightens underexposed portions of the image. This, too, is an operation that can be done by hand with other tools, but what makes it worth mentioning is that it works quite well without manual intervention.

Not everything in LightZone is perfect; for example, the noise reduction tool crashed the application every time I tried it. Still, it is encouraging to see how well LightZone works considering that the open source project is barely a month old. End users may wish to wait until stable packages are available, but LightZone will hold its own against Rawstudio, UFRaw, Darktable, and RawTherapee. Developers from those competing projects may find the source to be interesting reading as well; in addition to the unusual tools, LightZone enjoyed a reputation for high-quality grayscale conversion and for good tool presets.

The road ahead

Obviously working out the bugs is important, but LightZone as a self-sustaining project has a more difficult task ahead of it in other areas. For starters, the project needs to formally decide on a license. The copyright document in the source tree was imported with the rest of the code; it bears a short, BSD-like copyright statement from Light Crafts' founder Fabio Riccardi and a 2011 date, but the project will need to make the licensing terms explicit. Moving forward, as Tex Andrews posted on the new project site, the group will have to start work on documentation and translations, and discuss "certain organizational issues that now confront us."

Andrews and Pardee, who managed the LightZombie project for more than a year, have cultivated an enthusiastic user base. That will supply the new project with momentum, but it does not guarantee that it will thrive. Keeping a community project alive takes considerable effort, as just about anyone in free software will attest. So far the volunteers have updated dependencies and repaired the build system in short order, but the real work of refactoring and improving the code has yet to start (case in point being the license-key activation, which should be straightforward enough to remove).

Nevertheless, the new project is a rare gift to software users. Many a small commercial application has vanished for financial reasons and had its disappointed users lobby for the release of the source code. Most of these appeals are unsuccessful. But these lobbying efforts have the potential to turn into successes—consider Blender, for instance. LightZone will at least get a second chance to win over users, which is an opportunity few applications see.

Comments (10 posted)

A brief survey of Linux audio session managers

January 23, 2013

This article was contributed by Dave Phillips

Most of my music and sound work requires only a few applications. If I'm recording a song, I use a digital audio workstation (DAW) with a few plugins launched from within the sequencer, and if I make something in Csound, it's usually done completely in the AVSynthesis environment. The DAW and AVSynthesis programs are self-contained; they need no external software to store and recall their internal and external connections, plugin states, MIDI controller settings, and so forth. When I work in those environments, I know that I can open a previously saved session and have all former connections, states, and settings restored to their last conditions.

Recently I've been experimenting with a more heterogeneous workspace with several applications connected in a modular system. The JACK server allows flexible inter-application data exchange, and some powerful setups can be configured with the many JACK-compliant sequencers, synthesizers, and plugins. Unfortunately, the reconnection and reconfiguration of a complex modular system becomes a real problem when you've opened a DAW — with plugins, of course — and half a dozen standalone synthesizers are active with an equal number of standalone audio processors, each with its own internal settings and configuration details. The resulting maze of connections and configurations takes time to re-establish, valuable time that could be better spent on making music. Fortunately, help is here, thanks to Linux audio session managers.

A session manager (or "session handler") is a utility that stores and recalls every aspect of your audio production session. Typically each session has a unique identifier, so a click or two can reload, reconfigure, and reconnect all the pieces of your session exactly as you saved it. For smaller sessions, a manager may be a neat convenience, but for large-scale sessions with dozens of components the session manager is indispensable.

Technicalities

Session managers work in a few different ways, though all do the same things. Some can only manage applications that are JACK-compliant or that subscribe to a particular session protocol; others are more accommodating. Most managers include a graphical interface, but, since this is Linux, there are non-graphical solutions too. In fact, even the bash shell makes it easy to write a script that will relaunch an entire session. Such scripts are powerful and effective, but interaction is limited and it may be difficult to monitor and store state changes. Changes to your basic setup configuration may require additions to the script outside the session.

Session management protocols are designed to handle these matters. A basic session handler should be able to restore a session exactly as the user left it, but the handler must deal with a mash-up of harsh realities, including the internal and external states of host and clients (with display positions and sizes), the instantiation of all active plugins with their selected presets and internal edits, the loading of all necessary files used by the project, and the connections between the components in the system, including audio and MIDI I/O, track and bus assignments, and any other routing considerations. The session manager must be toolkit- and distribution-independent and it must handle the full variety of plugin and sound file formats. It should also be indifferent to the audio backend, but it should be aware of system services such as D-Bus. Finally, the manager should provide robust error reporting when things go wrong.

Session managers typically provide a library of services and a daemon to monitor and report process changes. An application that is linked to the manager's library can be monitored by the session daemon and update its state data for saving and recall. Obviously, the efficiency of any session management software depends on the services provided by the library and the client's implementation of the library services (if any).

In this article we'll look at the current crop of session managers for Linux audio production. We begin with a little history to understand how we've come to the present state of the art.

From LADCCA to LASH

The first attempt at Linux audio session management appeared as LADCCA, the Linux Audio Developer's Configuration and Connection API. Developer Bob Ham's LADCCA Reference Manual indicates that, even at the earliest stage, developers were aware of the basic problems and issues. LADCCA's development group expanded and the project eventually became LASH, the Linux Audio Session Handler. Focus remained on managing applications written for JACK and/or plain ALSA, with expanded services via D-Bus, improved signal handling, backward compatibility with previous APIs (including LADCCA), and other amenities. LASH developer Dave Robillard's LASH Manual provides more details for programmers. LASH became a popular solution for complex sessions and is still found in many software repositories.

On to LADISH

LADISH — the Linux Audio Developers (Integrated? Interesting? Indefatigable?) Session Handler — is the current successor to the LADCCA/LASH lineage. The project goals remain JACK-centric, with sophisticated client identification according to "levels" that define the detail of state storage and recall. D-Bus and jack-dbus are required, and the LASH and JACK Session APIs are supported.

The concept of levels is neatly summarized in the LADISH documentation, quoted below:

There are four levels of interaction between an application and the session handler.
  • Level 0 — JACK application is not linked to a session handling library (liblash, libladish). User has to save application projects manually or rely on autosave support from application.

  • Level 1 — JACK application is not linked to a session handling library (liblash, libladish). Application saves when a UNIX signal is received.

  • Level 2, LASH — JACK application is linked to liblash. Limited interaction with session handler because of liblash API limitations.

  • Level 2, JACK Session — JACK application uses JACK Session. Limited interaction with session handler because of jack-session API limitations.

  • Level 3 — JACK application is linked to libladish. Full interaction with session handler. Query room virtual ports.
L0 and L1 are implemented since ladish-0.2. L2 is implemented since ladish-1, both LASH and JACK Session variants. L3 is still not implemented.

The use of these levels is an attempt to sort and regulate the various possible conditions for any Linux audio application. Those conditions include the degree of JACK compliance, any WINE or DOS requirements, network operation, the multiplicity of existing APIs, and so forth.

Like the original LASH project, LADISH includes a GUI (called gladish) for configuring LADISH and your session management preferences:

[gladish interface]

Gladish works perfectly, providing a neat concept of virtual Studios, Rooms, and Projects for organizing your session components. As an alternative, users of the KXStudio system can also choose to use its excellent Claudia utility, a custom-made frontend for the LADISH System:

[Claudia]

A "getting started with LADISH" tutorial is on-line at the LADISH Wiki. Further excellent tutorial material can be found at Lampros Liontos's wonderful Penguin Producer web site.

The Non Session Manager

The Non Session Manager is a component in the Non* suite of audio/MIDI software by Jonathan Liles. It can be used as a standalone program without the rest of the suite, but you will need to install the core NonToolKit library first. Check your distribution's software repositories for an installable Non suite. If it hasn't been packaged for your system, you'll have to visit the Non Web site and follow the instructions there regarding access to the project's Git repositories. The suite is easy to compile, with only a few dependencies, all commonly found in any modern Linux distribution's repositories.

After installing the program it's ready to run. The command-line invocation takes two possible options, one to specify the URL of the Non Session Manager daemon (nsmd) and one to indicate a particular root path for your sessions. When the session manager starts, it requires no further configuration.

[Non session manager]

The Non Session Manager is easy to use. The top button bar exposes most of its functions, though there are a few hidden features. The "New" button starts a new session, "Add Client" adds an application to the session, "Save" saves the session, and "Close" closes it. The "Open" button opens an existing session, or you can click on the session name in the list box, where we encounter hidden feature number one: clicking on the session name will immediately close the running session and load your selection, so be sure to save your work first. The "Duplicate" button copies a running session as a template for later similar sessions.

When you add a client to a session you might discover another hidden feature, the nsm-proxy client. This convenient feature lets you add clients built without Non Session Manager or other session management support, though the addition is limited to opening and closing the particular client. If a client has LADISH level-1 support (or another method of saving its internal state), you can also select a "Save State" signal, though the Non Session Manager may not directly handle the state data.

Yet another hidden feature: When adding clients you can add a special client named JACKPatch that will store your JACK audio and MIDI connections. You can save a JACK connection map as a Patchbay in QJackCtl, but of course you need to be running QJackCtl. The Non Session Manager JACKPatch client provides a handy means of storing and restoring your connections without an external application.

The Non Session Manager also has features of interest to users of networked audio systems. Its documentation states that the Non Session Manager is the only session manager capable of saving and restoring session configurations of networked machines. Alas, I was unable to test a networked session, so I must refer the reader to the Non Session Manager Web site for more information regarding the handler's implementation over a network.

Of the session handlers I tested, the Non Session Manager gets top honors for features, flexibility, ease of use, and transparency. It does its job and otherwise stays out of the way. It doesn't interfere with other processes, and it can be reconfigured interactively during a session. The Non Session Manager isn't perfect — I'd like to have a more direct method of entering nsm-proxy clients, and I might like to be able to rename sessions, but these are minor issues. Otherwise the Non Session Manager is highly recommended Linux audio software.

QJackCtl

The image below shows QJackCtl's JACK Session manager at work.

[QJackCtl session manager]

If your modular configuration includes only JACK-compliant applications and you use QJackCtl to manage connections, you might consider using it to manage your sessions as well. Operation is simple enough — just click on the Session button and follow the prompts for saving your work. The manager saves all aspects of its clients and restores them exactly. Incidentally, Ardour subscribes to the JACK session management protocol, making it a simple matter to include its considerable powers into a modular setup.

Programmer's documentation for the JACK session API can be perused at jackaudio.org. For normal users, the Tango Studio project has provided some excellent documentation on its helpful JACK Session tutorial page.

Chino

In my research for this article, I discovered Chino, a scripted session manager with no ties to previous management protocols. According to its author, Chino is "... a framework and toolset to build and manage a meta-application consisting of the user's favorite modular JACK audio and MIDI tools, each started and interconnected in predefined ways." Chino's developer is rather modest about his project, but I found it to be a well-designed tool that may be just the item you need if the "heavier" solutions don't suit your project requirements and you're comfortable with the Linux command line.

Installation and configuration are covered in Chino's documentation. No unusual dependencies are required, but you'll need the jack_snapshot utility to store and recall your JACK audio and MIDI connections. You can also save an SVG graphic display of your session's status. You'll need the Xsvg viewer to view it from within a Chino session, though any SVG-capable viewer can be used to view the image outside the session.

Chino defines a session in a method/application hierarchy. According to the documentation, methods are "categories for applications that are to be handled in similar ways. Every application belongs to one method." Following that definition, the documentation informs us that applications can be "(almost) anything... one or more (audio)-programs, audio/MIDI hardware or netjack ports — many things can be crammed into an application." In practice it's all simple enough. Run this command to start a new session:

	chino -n newsession

Follow the prompts, and be sure to use the h (Help) and w (Write) options liberally. You'll find your way around Chino quickly enough.

If your saved session is named newsession.sdef you'll reopen it with this command:

	chino -o newsession.sdef

Chino is easy to use and does its job quickly and efficiently. However, it is still in development, with a few rough edges yet to polish. I created a test session from its default settings for the amSynth and Yoshimi synthesizers, with normal audio/MIDI connections made with QJackCtl. I saved the session, but, on closing, amSynth had to be stopped manually. On reopening the session, the synthesizers appeared as expected, but only amSynth's connections were restored. I must emphasize that this is a single test case and in no way stands as a general assessment. I'll need to spend some more time with Chino — there's much more to it than what I've presented, and it is definitely worth watching.

Summing the signals

When I started writing this article I had little experience with session managers. I'm still no expert, and I still have a lot to learn about these projects, but I'll hazard an opinion or two. Currently I prefer the Non Session Manager for the greatest number of my projects. The LADISH-based handlers are advised for sessions with only JACK-aware applications, though QJackCtl's session handler may be preferred when using that utility. Chino is available for command-line users, as is the humble and powerful bash scripting language.

Modular music systems are difficult to configure and restore, but by definition what is difficult is not impossible. Apparently the developers of these programs and utilities have a similar attitude towards difficult things, and I'm happy to report the ever-improving states and conditions of their projects.

Comments (2 posted)

Page editor: Jonathan Corbet

Security

HTTPS interception in Nokia's mobile browser

By Jake Edge
January 23, 2013

When using encrypted communication, users are at the mercy of the software that implements the cryptography. That generally works out reasonably well; users are only exposed to inadvertent bugs present in the code. But a recent report shows that sometimes using encryption may not actually result in more secure communication—such security depends on having tools that are actually trying to do what is expected of them.

When a user visits an HTTPS site, they expect their browser to use an encrypted connection between it and the web site. Truthfully, many users are not technically sophisticated enough to understand that, but they have been (hopefully) trained to trust in the "lock" icon or other user interface elements that indicate a secure connection. Whether the user knows that means "encryption" or not depends on their level of technical savvy, but they almost certainly don't expect their secure data to be sent to a third-party server. But that's evidently what Nokia's Xpress mobile browser has been doing.

HTTPS traffic is encrypted using keys that get exchanged between the destination server and client browser. A public key is contained in a server certificate that is signed by someone—typically a certificate authority (CA). The signature asserts that the key belongs to that server name. The public key is then used to encrypt and exchange session keys that are subsequently used to encrypt the session. The CA is integral to the web browser trust model; keys that don't validate under that model (e.g. keys signed by unknown or untrusted CAs, server names that do not match, etc.) are expected to cause some kind of alert from the browser.

So it came as something of a surprise to security researcher Gaurang Pandya that both regular HTTP and encrypted HTTPS traffic were being re-routed when using the Xpress browser. Worse yet, the certificate presented for any site visited was not that of the site in question; it was, instead, an ovi.com certificate. Ovi is Nokia's "brand" for its internet services.

From some angles, this looks like a classic "man-in-the-middle" attack, but because the browser is complicit, Steve Schultze of the "Freedom to Tinker" blog calls it a "man-in-the-client". The man in the client is accepting a certificate for a Nokia proxy server instead of the site the user wanted to connect to, without notifying the user. Meanwhile, the man in the middle lives at the Nokia proxy server, which is making a connection to the desired destination.

The proxy is used to speed up mobile browsing by using compression. It is similar to what is done by the Opera Mini browser, which Pandya also noted in his first report. But, Nokia was also using the proxy for HTTPS traffic, which meant that it was decrypting the incoming stream at the proxy and re-encrypting it, using the real destination's key, before sending it onward.

Decrypting the HTTPS traffic from the mobile browser was not necessarily required, depending on how Nokia implemented things. It could have simply relayed the traffic between the two endpoints by tunneling it inside a client-to-proxy session. That would not have required decrypting the traffic, but it also would not have allowed the proxy to compress the data, which would have largely defeated the purpose of having a proxy at all.
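For a conventional web proxy, that kind of blind relay is what the HTTP CONNECT method provides: the browser asks the proxy to open a raw tunnel to the destination, and the TLS handshake then takes place end-to-end through that tunnel, so the proxy never sees plaintext. A schematic exchange (the host name is illustrative) looks like this:

    CONNECT www.example.com:443 HTTP/1.1
    Host: www.example.com:443

    HTTP/1.1 200 Connection established

    [TLS handshake and encrypted application data now flow end-to-end
     between the browser and www.example.com; the proxy only relays bytes]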

Nokia, however, admitted that it decrypted the traffic in a comment by Mark Durrant on Pandya's post:

Importantly, the proxy servers do not store the content of web pages visited by our users or any information they enter into them. When temporary decryption of HTTPS connections is required on our proxy servers, to transform and deliver users' content, it is done in a secure manner.

The "secure manner" phrase does not completely reassure, but this does not really look like an attempt to (knowingly) invade users' privacy. Durrant noted that Nokia has "implemented appropriate organizational and technical measures to prevent access to private information". It seems quite likely that this was simply a misstep by the company—one that could lead to a loss of privacy for Xpress users.

That interpretation seems to be borne out by changes that Nokia made to the Xpress browser after Pandya's report. After a browser update, Pandya noted that HTTPS sessions were not being handled in the same way. The HTTPS traffic is now tunneled over an HTTP connection to Nokia's servers, and the certificate being used (at least as reported by the browser) is the proper one for the destination. So, only the destination endpoint should be able to decrypt the data. Given that, though, it's not clear why the proxy is not just bypassed for HTTPS traffic.

The "welcome" notice that comes when installing the Xpress browser does make note of HTTPS decryption, though Schultze wonders how long that's been true, but certainly doesn't fully describe what's going on. Many users are likely to gloss over that statement—or not understand it at all. While web compression is a helpful feature for some users, it shouldn't come at the expense of reasonable security and privacy expectations.

As more of our traffic moves into "the cloud", we will be seeing more of these kinds of problems. Investigations like Pandya's will be needed to ensure that we at least know this type of network manipulation is occurring. Open source mobile operating systems (or even just open source browsers on proprietary systems) make it easier to find and eliminate this kind of mistake, but vigilance is needed there as well. Reviewing the code and ensuring that the "app" corresponds to the code reviewed are still required. With open source, though, we can peek inside the black box, which should make things easier—though not foolproof.

Comments (36 posted)

Brief items

Security quotes of the week

Achieving any real security requires that password verification take on the order of hundreds of milliseconds or even whole seconds. Unfortunately this hasn't been the experience of the past 20 years. MD5 was launched over 20 years ago and is still the most common implementation I see in the wild, though it's gone from being relatively expensive to evaluate to extremely cheap. Moore's Law has indeed broken MD5 as a password hash and no serious application should still use it. Human memory isn't more of a problem today than it used to be though. The problem is that we've chosen to let password verification become too cheap.
-- Joseph Bonneau

Beyond that, there's the fact that Facebook "likes" and profile settings aren't necessarily accurate reflections of reality. A search for "Married people who like Prostitutes" seems more likely to turn up people who thought it would be funny to hit "like" on a page called "Prostitutes" than actual johns. And note that those "Islamic men interested in men who live in Tehran, Iran" all say they're interested in both males and females, which probably just means that they interpreted "interested in" in a non-sexual way and decided not to discriminate by gender. Still, I wouldn't envy the hypothetical position of a Chinese citizen trying to convince Communist Party agents that he hit "like" on the "Falun Gong" page ironically or by accident.
-- Will Oremus on Facebook's new search in Slate

Comments (3 posted)

New vulnerabilities

freeradius2: authentication bypass

Package(s):freeradius2 CVE #(s):CVE-2011-4966
Created:January 17, 2013 Updated:February 7, 2013
Description:

From the Red Hat advisory:

It was found that the "unix" module ignored the password expiration setting in "/etc/shadow". If FreeRADIUS was configured to use this module for user authentication, this flaw could allow users with an expired password to successfully authenticate, even though their access should have been denied. (CVE-2011-4966)

Alerts:
Scientific Linux SL-free-20130116 2013-01-16
CentOS CESA-2013:0134 2013-01-09
openSUSE openSUSE-SU-2013:0137-1 2013-01-23
openSUSE openSUSE-SU-2013:0191-1 2013-01-23
Mageia MGASA-2013-0026 2013-02-06

Comments (none posted)

ganglia: PHP script execution

Package(s):ganglia CVE #(s):CVE-2012-3448
Created:January 22, 2013 Updated:January 23, 2013
Description: From the Debian advisory:

Insufficient input sanitization in Ganglia, a web based monitoring system, could lead to remote PHP script execution with permissions of the user running the web server.

Alerts:
Debian DSA-2610-1 2013-01-21

Comments (none posted)

httpd: multiple vulnerabilities

Package(s):httpd CVE #(s):CVE-2008-0455 CVE-2008-0456
Created:January 17, 2013 Updated:February 12, 2013
Description:

From the Scientific Linux advisory:

Input sanitization flaws were found in the mod_negotiation module. A remote attacker able to upload or create files with arbitrary names in a directory that has the MultiViews options enabled, could use these flaws to conduct cross-site scripting and HTTP response splitting attacks against users visiting the site. (CVE-2008-0455, CVE-2008-0456)

Alerts:
Scientific Linux SL-http-20130116 2013-01-16
Fedora FEDORA-2013-1661 2013-02-12
Red Hat RHSA-2013:0512-02 2013-02-21
Oracle ELSA-2013-0512 2013-02-25
Scientific Linux SL-http-20130228 2013-02-28
CentOS CESA-2013:0512 2013-03-09

Comments (none posted)

kernel: denial of service

Package(s):linux CVE #(s):CVE-2012-5532
Created:January 18, 2013 Updated:January 23, 2013
Description:

From the Ubuntu advisory:

Florian Weimer discovered that hypervkvpd, which is distributed in the Linux kernel, was not correctly validating source addresses of netlink packets. An untrusted local user can cause a denial of service by causing hypervkvpd to exit. (CVE-2012-5532)

Alerts:
Ubuntu USN-1696-1 2013-01-17
Ubuntu USN-1699-1 2013-01-17
Ubuntu USN-1698-1 2013-01-17
Ubuntu USN-1700-1 2013-01-17
Ubuntu USN-1704-1 2013-01-22
Ubuntu USN-1699-2 2013-02-01
Ubuntu USN-1700-2 2013-02-01
Ubuntu USN-1696-2 2013-02-01
Ubuntu USN-1698-2 2013-02-01
Ubuntu USN-1704-2 2013-02-01
Ubuntu USN-1720-1 2013-02-12
Ubuntu USN-1726-1 2013-02-14

Comments (none posted)

kernel: denial of service

Package(s):kernel CVE #(s):CVE-2013-0190
Created:January 21, 2013 Updated:March 15, 2013
Description: From the Red Hat bugzilla:

A flaw was found in the way xen_failsafe_callback() handled failed iret, which causes the stack pointer to be wrong when entering the iret_exc error path. An unprivileged local guest user in the 32-bit PV Xen domain could use this flaw to crash the guest.

Alerts:
Fedora FEDORA-2013-0952 2013-01-18
Fedora FEDORA-2013-1025 2013-01-24
Oracle ELSA-2013-2503 2013-02-07
Oracle ELSA-2013-2504 2013-02-07
Ubuntu USN-1719-1 2013-02-12
Ubuntu USN-1720-1 2013-02-12
Ubuntu USN-1725-1 2013-02-14
Ubuntu USN-1728-1 2013-02-18
Red Hat RHSA-2013:0496-02 2013-02-21
Mageia MGASA-2013-0066 2013-02-22
Mageia MGASA-2013-0067 2013-02-22
Mageia MGASA-2013-0068 2013-02-22
Mageia MGASA-2013-0069 2013-02-22
Mageia MGASA-2013-0070 2013-02-22
Oracle ELSA-2013-0496 2013-02-28
Oracle ELSA-2013-2507 2013-02-28
CentOS CESA-2013:0496 2013-03-09
Scientific Linux SL-kern-20130314 2013-03-14
Ubuntu USN-1767-1 2013-03-18
Ubuntu USN-1769-1 2013-03-18
Ubuntu USN-1768-1 2013-03-18
Ubuntu USN-1774-1 2013-03-21

Comments (none posted)

kernel: information disclosure

Package(s):kernel CVE #(s):CVE-2012-4467
Created:January 18, 2013 Updated:January 23, 2013
Description:

From the Mageia advisory:

Commit 644595f89620 ("compat: Handle COMPAT_USE_64BIT_TIME in net/socket.c") introduced a bug where the helper functions to take either a 64-bit or compat time[spec|val] got the arguments in the wrong order, passing the kernel stack pointer off as a user pointer (and vice versa).

Because of the user address range check, that in turn causes an EFAULT, since the user pointer range check fails for the kernel address, incorrectly resulting in a failed system call for 32-bit processes on a 64-bit kernel. On odder architectures like HP-PA (with separate user/kernel address spaces), it can be used to read kernel memory.

Alerts:
Mageia MGASA-2013-0010 2013-01-18
Mageia MGASA-2013-0009 2013-01-18
Mageia MGASA-2013-0011 2013-01-18
Mageia MGASA-2013-0012 2013-01-18
Mageia MGASA-2013-0016 2013-01-24

Comments (none posted)

movabletype-opensource: command/SQL injection

Package(s):movabletype-opensource CVE #(s):CVE-2013-0209
Created:January 22, 2013 Updated:January 23, 2013
Description: From the Debian advisory:

An input sanitation problem has been found in upgrade functions of movabletype-opensource, a web-based publishing platform. Using carefully crafted requests to the mt-upgrade.cgi file, it would be possible to inject OS command and SQL queries.

Alerts:
Debian DSA-2611-1 2013-01-22

Comments (none posted)

mysql: multiple vulnerabilities

Package(s):mysql CVE #(s):CVE-2012-0572 CVE-2012-0574 CVE-2012-0578 CVE-2012-1702 CVE-2012-1705 CVE-2012-5060 CVE-2012-5096 CVE-2012-5612 CVE-2013-0367 CVE-2013-0368 CVE-2013-0371 CVE-2013-0375 CVE-2013-0383 CVE-2013-0384 CVE-2013-0385 CVE-2013-0386 CVE-2013-0389
Created:January 22, 2013 Updated:February 5, 2013
Description: MySQL 5.1.67 and 5.5.29 fix multiple security issues.

See the 5.1.67 release notes, the 5.5.29 release notes and the Oracle advisory for details.

Alerts:
Ubuntu USN-1703-1 2013-01-22
Slackware SSA:2013-022-01 2013-01-22
Mageia MGASA-2013-0019 2013-01-25
Red Hat RHSA-2013:0219-01 2013-01-31
CentOS CESA-2013:0219 2013-02-01
Oracle ELSA-2013-0219 2013-02-01
Scientific Linux SL-mysq-20130201 2013-02-01
Mandriva MDVSA-2013:007 2013-02-05
SUSE SUSE-SU-2013:0262-1 2013-02-09

Comments (none posted)

mysql: SQL command execution

Package(s):mysql-community-server CVE #(s):CVE-2012-4414
Created:January 23, 2013 Updated:January 23, 2013
Description: From the CVE entry:

Multiple SQL injection vulnerabilities in the replication code in Oracle MySQL possibly before 5.5.29, and MariaDB 5.1.x through 5.1.62, 5.2.x through 5.2.12, 5.3.x through 5.3.7, and 5.5.x through 5.5.25, allow remote authenticated users to execute arbitrary SQL commands via vectors related to the binary log. NOTE: as of 20130116, Oracle has not commented on claims from a downstream vendor that the fix in MySQL 5.5.29 is incomplete.

Alerts:
openSUSE openSUSE-SU-2013:0135-1 2013-01-23
openSUSE openSUSE-SU-2013:0156-1 2013-01-23

Comments (none posted)

nagios: code execution

Package(s):nagios CVE #(s):CVE-2012-6096
Created:January 23, 2013 Updated:March 27, 2013
Description: From the CVE entry:

Multiple stack-based buffer overflows in the get_history function in history.cgi in Nagios Core before 3.4.4, and Icinga 1.6.x before 1.6.2, 1.7.x before 1.7.4, and 1.8.x before 1.8.4, might allow remote attackers to execute arbitrary code via a long (1) host_name variable (host parameter) or (2) svc_description variable.

Alerts:
Fedora FEDORA-2013-0732 2013-01-23
Fedora FEDORA-2013-0753 2013-01-23
Fedora FEDORA-2013-0752 2013-01-23
openSUSE openSUSE-SU-2013:0140-1 2013-01-23
openSUSE openSUSE-SU-2013:0169-1 2013-01-23
openSUSE openSUSE-SU-2013:0188-1 2013-01-23
openSUSE openSUSE-SU-2013:0206-1 2013-01-29
Debian DSA-2616-1 2013-02-03
Mageia MGASA-2013-0039 2013-02-08
Mandriva MDVSA-2013:028 2013-03-18
Debian DSA-2653-1 2013-03-26

Comments (none posted)

php5: information disclosure

Package(s):php5 CVE #(s):CVE-2012-6113
Created:January 22, 2013 Updated:January 23, 2013
Description: From the CVE entry:

The openssl_encrypt function in ext/openssl/openssl.c in PHP 5.3.9 through 5.3.13 does not initialize a certain variable, which allows remote attackers to obtain sensitive information from process memory by providing zero bytes of input data.

Alerts:
Ubuntu USN-1702-1 2013-01-22

Comments (none posted)

rails: privilege escalation

Package(s):rails CVE #(s):CVE-2013-0155
Created:January 17, 2013 Updated:January 23, 2013
Description:

From the Debian advisory:

An interpretation conflict can cause the Active Record component of Rails, a web framework for the Ruby programming language, to truncate queries in unexpected ways. This may allow attackers to elevate their privileges.

Alerts:
Debian DSA-2609-1 2013-01-16
Fedora FEDORA-2013-0568 2013-01-20
Fedora FEDORA-2013-0568 2013-01-20
Fedora FEDORA-2013-0568 2013-01-20
Fedora FEDORA-2013-0635 2013-01-23
Fedora FEDORA-2013-0686 2013-01-23
Fedora FEDORA-2013-0635 2013-01-23
Fedora FEDORA-2013-0686 2013-01-23
Fedora FEDORA-2013-0635 2013-01-23
Fedora FEDORA-2013-0686 2013-01-23
Fedora FEDORA-2013-0635 2013-01-23
Fedora FEDORA-2013-0686 2013-01-23
openSUSE openSUSE-SU-2013:0278-1 2013-02-12
openSUSE openSUSE-SU-2013:0280-1 2013-02-12
Red Hat RHSA-2013:0582-01 2013-02-28
SUSE SUSE-SU-2013:0486-1 2013-03-19
SUSE SUSE-SU-2013:0508-1 2013-03-20

Comments (none posted)

rpm: incorrect signature checking

Package(s):rpm CVE #(s):CVE-2012-6088
Created:January 17, 2013 Updated:January 23, 2013
Description:

From the Ubuntu advisory:

It was discovered that RPM incorrectly handled signature checking. An attacker could create a specially-crafted rpm with an invalid signature which could pass the signature validation check.

Alerts:
Ubuntu USN-1694-1 2013-01-17

Comments (none posted)

sleuthkit: evade detection by forensic analysis

Package(s):sleuthkit CVE #(s):CVE-2012-5619
Created:January 23, 2013 Updated:February 7, 2013
Description: From the Red Hat bugzilla:

A security flaw was found in the way the Sleuth Kit (TSK), a collection of UNIX-based command line tools for investigating a computer, managed the '.' (dotfile) file system entry. An attacker could use this flaw to evade detection by forensic analysis (hiding certain files from being scanned) by renaming the file in question to be the '.' file system entry.

The original report speaks of this attack vector being present when scanning a FAT (File Allocation Table) file system. It is possible, though, that the flaw is also present on other file systems that do not reserve the '.' entry for a special purpose.

Alerts:
Fedora FEDORA-2013-0320 2013-01-23
Fedora FEDORA-2013-0336 2013-01-23
Mageia MGASA-2013-0031 2013-02-06

Comments (none posted)

squirrelmail: denial of service

Package(s):squirrelmail CVE #(s):CVE-2012-2124
Created:January 17, 2013 Updated:January 23, 2013
Description:

From the Red Hat advisory:

The SquirrelMail security update RHSA-2012:0103 did not, unlike the erratum text stated, correct the CVE-2010-2813 issue, a flaw in the way SquirrelMail handled failed log in attempts. A user preference file was created when attempting to log in with a password containing an 8-bit character, even if the username was not valid. A remote attacker could use this flaw to eventually consume all hard disk space on the target SquirrelMail server. (CVE-2012-2124)

Alerts:
Scientific Linux SL-squi-20130116 2013-01-16
CentOS CESA-2013:0130 2013-01-09

Comments (none posted)

vino: multiple vulnerabilities

Package(s):vino CVE #(s):CVE-2011-1164 CVE-2011-1165 CVE-2012-4429
Created:January 22, 2013 Updated:February 7, 2013
Description: From the Red Hat advisory:

It was found that Vino transmitted all clipboard activity on the system running Vino to all clients connected to port 5900, even those who had not authenticated. A remote attacker who is able to access port 5900 on a system running Vino could use this flaw to read clipboard data without authenticating. (CVE-2012-4429)

In certain circumstances, the vino-preferences dialog box incorrectly indicated that Vino was only accessible from the local network. This could confuse a user into believing connections from external networks are not allowed (even when they are allowed). With this update, vino-preferences no longer displays connectivity and reachable information. (CVE-2011-1164)

There was no warning that Universal Plug and Play (UPnP) was used to open ports on a user's network router when the "Configure network automatically to accept connections" option was enabled (it is disabled by default) in the Vino preferences. This update changes the option's description to avoid the risk of a UPnP router configuration change without the user's consent. (CVE-2011-1165)

Alerts:
Red Hat RHSA-2013:0169-01 2013-01-21
CentOS CESA-2013:0169 2013-01-22
Ubuntu USN-1701-1 2013-01-22
Scientific Linux SL-vino-20130122 2013-01-22
Oracle ELSA-2013-0169 2013-01-22
Mageia MGASA-2013-0028 2013-02-06

Comments (1 posted)

WebYaST: information disclosure

Package(s):WebYaST CVE #(s):CVE-2012-0435
Created:January 23, 2013 Updated:January 23, 2013
Description: From the SUSE advisory:

The hosts list used by WebYaST for connecting to its back-end part was modifiable, allowing it to point to a malicious web site, which could then access all values sent by WebYaST.

The /host configuration path was removed to fix this issue.

Alerts:
SUSE SUSE-SU-2013:0053-1 2013-01-23

Comments (none posted)

xen: denial of service

Package(s):xen CVE #(s):CVE-2012-5634 CVE-2013-0154
Created:January 23, 2013 Updated:February 4, 2013
Description: From the Red Hat bugzilla:

When passing a device which is behind a legacy PCI Bridge through to a guest Xen incorrectly configures the VT-d hardware. This could allow incorrect interrupts to be injected to other guests which also have passthrough devices.

In a typical Xen system many devices are owned by domain 0 or driver domains, leaving them vulnerable to such an attack. Such a DoS is likely to have an impact on other guests running in the system.

On systems using Intel VT-d for PCI passthrough a malicious domain, given access to a device which is behind a legacy PCI bridge, can mount a denial of service attack affecting the whole system.

Alerts:
Fedora FEDORA-2013-0627 2013-01-23
Fedora FEDORA-2013-0608 2013-01-23
Fedora FEDORA-2013-1274 2013-02-02
Debian DSA-2636-1 2013-03-01
Debian DSA-2636-2 2013-03-03

Comments (none posted)

xorg-x11-apps: code execution

Package(s):xorg-x11-apps CVE #(s):CVE-2011-2504
Created:January 17, 2013 Updated:March 15, 2013
Description: From the Red Hat advisory:

It was found that the x11perfcomp utility included the current working directory in its PATH environment variable. Running x11perfcomp in an attacker-controlled directory would cause arbitrary code execution with the privileges of the user running x11perfcomp.

Alerts:
Fedora FEDORA-2013-0124 2013-01-16
Red Hat RHSA-2013:0502-02 2013-02-21
Oracle ELSA-2013-0502 2013-02-25
CentOS CESA-2013:0502 2013-03-09
CentOS CESA-2013:0502 2013-03-09
CentOS CESA-2013:0502 2013-03-09
Scientific Linux SL-NotF-20130314 2013-03-14

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 3.8-rc4, released on January 17. Linus was "late" a day in releasing it, which sent him on a mission to figure out which day was the most common for releases (Sunday). "Anyway, with that digression, I can happily report that -rc4 is smaller than -rc3 despite the extra day, although not by much. There's not really a whole lot that stands out: apart from one new wireless driver (the Atheros Wilocity driver) and some OMAP drm changes, the diffstat looks pretty flat and spread out. Which just means lots of small changes all over."

Stable updates were not in short supply this week. 3.7.3, 3.4.26, 3.0.59, and 2.6.34.14 were all released on January 17; the 2.6.34.14 announcement carried a warning that updates for this kernel will cease in the near future. 3.7.4, 3.4.27 and 3.0.60 were released on January 21.

Comments (none posted)

Quotes of the week

I'm leaving the Linux world and Intel for a bit for family reasons. I'm aware that "family reasons" is usually management speak for "I think the boss is an asshole" but I'd like to assure everyone that while I frequently think Linus is an asshole (and therefore very good as kernel dictator) I am departing quite genuinely for family reasons and not because I've fallen out with Linus or Intel or anyone else.
— Best wishes, Alan Cox, we'll miss you

Yes, it's very unlikely, but we are in the business of dealing with the very unlikely. That's because in our business, the very unlikely is very likely. Damn, I need to buy a lotto ticket!
Steven Rostedt

About the only thing Kernel developers agree on is they use C and don't comment their code.
Tom St Denis

Documentation is generally considered a good thing, but few people can be bothered to write it, and few of the other people that should read it actually do.
Arnd Bergmann

Comments (none posted)

Long-term support initiative 3.4 kernel released

The Long-Term Support Initiative helps to provide support for selected kernels for a two-year period. But the project has also intended to release additional kernels aimed at the needs of the consumer electronics industry. That has come about with the announcement of the release of the LTSI 3.4 kernel. It is based on 3.4.25, but with an improved CMA memory allocator, the out-of-tree AF_BUS protocol implementation, and a backport of the CoDel queue management algorithm, along with various hardware enablement patches and other useful bits of code.

Comments (14 posted)

Kernel development news

Supporting variable-sized huge pages

By Michael Kerrisk
January 23, 2013

Huge pages are an optimization technique designed to increase virtual memory performance. The idea is that instead of a traditional small virtual memory page size (4 kB on most architectures), an application can employ (much) larger pages (e.g., 2 MB or 1 GB on x86-64). For applications that can make full use of larger pages, huge pages provide a number of performance benefits. First, a single page fault can fault in a large block of memory. Second, larger page sizes equate to shallower page tables, since fewer page-table levels are required to span the same range of virtual addresses; consequently, less time is required to traverse page table entries when translating virtual addresses to physical addresses. Finally, and most significantly, since entries for huge pages in the translation lookaside buffer (TLB) span much greater address ranges, there is an increased chance that a virtual address already has a match in one of the limited set of entries currently cached in the TLB, thus obviating the need to traverse page tables.

Applications can explicitly request the use of huge pages when making allocations, using either shmget() with the SHM_HUGETLB flag (since Linux 2.6.0) or mmap() with the MAP_HUGETLB flag (since Linux 2.6.32). It's worth noting that explicit application requests are not needed to employ huge pages: the transparent huge pages feature merged in Linux 2.6.38 allows applications to gain much of the performance benefit of huge pages without making any changes to application code. There is, however, a limitation to these APIs: they provide no way to specify the size of the huge pages to be used for an allocation. Instead, the kernel employs the "default" huge page size.

Some architectures only permit one huge page size; on those architectures, the default is in fact the only choice. However, some modern architectures permit multiple huge page sizes, and where the system administrator has configured the system to provide huge page pools of different sizes, applications may want to choose the page size used for their allocation. For example, this may be useful in a NUMA environment, where a smaller huge page size may be suitable for mappings that are shared across CPUs, while a larger page size is used for mappings local to a single CPU.

A patch by Andi Kleen that was accepted during the 3.8 merge window extends the shmget() and mmap() system calls to allow the caller to select the size used for huge page allocations. These system calls have the following prototypes:

    void *mmap(void *addr, size_t length, int prot, int flags,
               int fd, off_t offset);
    int shmget(key_t key, size_t size, int shmflg);

Neither of those calls provides an argument that can be directly used to specify the desired page size. Therefore, Andi's patch shoehorns the value into some bits that are currently unused in one of the arguments of each call—in the flags argument for mmap() and in the shmflg argument for shmget().

In both system calls, the huge page size is encoded in the six bits from 26 through to 31 (i.e., the bit mask 0xfc000000). The value in those six bits is the base-two log of the desired page size. As a special case, if the value encoded in the bits is zero, then the kernel selects the default huge page size. This provides binary backward compatibility for the interfaces. If the specified page size is not supported by the architecture, then shmget() and mmap() fail with the error ENOMEM.

An application can manually perform the required base-two log calculation and bit shift to generate the required bit-mask value, but this is clumsy. Instead, an architecture can define suitable constants for the huge page sizes that it supports. Andi's patch defines two such constants corresponding to the available page sizes on x86-64:

    #define SHM_HUGE_SHIFT  26
    #define SHM_HUGE_MASK   0x3f
    /* Flags are encoded in bits (SHM_HUGE_MASK << SHM_HUGE_SHIFT) */

    #define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)   /* 2 MB huge pages */
    #define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)   /* 1 GB huge pages */

Corresponding MAP_* constants are defined for use in the mmap() system call.

Thus, to employ a 2 MB huge page size when calling shmget(), one would write:

    shmget(key, size, flags | SHM_HUGETLB | SHM_HUGE_2MB);

That is, of course, the same as this manually calculated version:

    shmget(key, size, flags | SHM_HUGETLB | (21 << SHM_HUGE_SHIFT));
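
The corresponding mmap() call looks much the same; here is a minimal sketch, assuming an anonymous private mapping and that MAP_HUGE_2MB is defined analogously to SHM_HUGE_2MB:

    mmap(NULL, length, PROT_READ | PROT_WRITE,
         MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_2MB, -1, 0);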

In passing, it's worth noting that an application can determine the default page size by looking at the Hugepagesize entry in /proc/meminfo and can, if the kernel was configured with CONFIG_HUGETLBFS, discover the available page sizes on the system by scanning the directory entries under /sys/kernel/mm/hugepages.
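
For instance, a program could enumerate the available sizes with a scan along these lines (a sketch, assuming the usual "hugepages-<size>kB" naming of the sysfs directories):

    #include <dirent.h>
    #include <stdio.h>

    static void list_huge_page_sizes(void)
    {
        /* Each supported size appears as a directory named "hugepages-<size>kB" */
        DIR *dir = opendir("/sys/kernel/mm/hugepages");
        struct dirent *de;
        unsigned long kb;

        if (dir == NULL)
            return;             /* hugetlbfs not configured */

        while ((de = readdir(dir)) != NULL)
            if (sscanf(de->d_name, "hugepages-%lukB", &kb) == 1)
                printf("huge page size: %lu kB\n", kb);

        closedir(dir);
    }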

One concern raised by your editor when reviewing an earlier version of Andi's patch was whether the bit space in the mmap() flags argument is becoming exhausted. Exactly how many bits are still unused in that argument turns out to be a little difficult to determine, because different architectures define the same flags with different values. For example, the MAP_HUGETLB flag has the values 0x4000, 0x40000, 0x80000, or 0x100000, depending on the architecture. It turns out that before Andi's patch was applied, there were only around 11 bits in flags that were unused across all architectures; now that the patch has been applied, just six are left.

The day when the mmap() flags bit space is exhausted seems to be slowly but steadily approaching. When that happens, either a new mmap()-style API with a 64-bit flags argument will be required, or, as Andi suggested, unused bits in the prot argument could be used; the latter option would be easier to implement, but would also further muddy the interface of an already complex system call. In any case, concerns about the API design didn't stop Andrew Morton from accepting the patch, although he was prompted to remark "I can't say the userspace interface is a thing of beauty, but I guess we'll live."

The new API features will roll out in a few weeks' time with the 3.8 release. At that point, application writers will be able to select different huge page sizes for different memory allocations. However, it will take a little longer before the MAP_* and SHM_* page size constants percolate through to the GNU C library. In the meantime, programmers who are in a hurry will have to define their own versions of these constants.

Comments (4 posted)

GPIO in the kernel: future directions

By Jonathan Corbet
January 23, 2013
Last week's article covered the kernel's current internal API for general-purpose I/O (GPIO) lines. The GPIO API has seen relatively little change in recent years, but that situation may be about to change as the result of a couple of significant patch sets that seek to rework how the GPIO API works in the interest of greater robustness and better performance.

No more numbers

The current GPIO API relies on simple integers to identify specific GPIO lines. It works, but there are some shortcomings to this approach. Kernel code is rarely interested in "GPIO #37"; instead, it wants "the GPIO connected to the monitor's DDC line" or something to that effect. For well-defined systems where the use of GPIO lines never changes, preprocessor definitions can be used to identify lines, but that approach falls apart when the same GPIO can be put to different uses in different systems. As hardware gets more dynamic, with GPIOs possibly showing up at any time, there is no easy way to know which GPIO goes where. It can be easy to get the wrong one by mistake.

As a result, platform and driver developers have come up with various ways to locate GPIOs of interest. Even your editor once submitted a patch adding a gpio_lookup() function to the GPIO API, but that patch didn't pass muster and was eventually dropped in favor of a driver-specific solution. So the number-based API has remained — until now.

Alexandre Courbot's descriptor-based GPIO interface seeks to change the situation by introducing a new struct gpio_desc * pointer type. GPIO lines would be represented by one of these pointers; what lives behind the pointer would be hidden from GPIO users, though. Internally, gpiolib (the implementation of the GPIO API used by most architectures) is refactored to use descriptors rather than numbers, and a new set of functions is presented to users. These functions will look familiar to users of the current GPIO API:

    #include <linux/gpio/consumer.h>

    int gpiod_direction_input(struct gpio_desc *desc);
    int gpiod_direction_output(struct gpio_desc *desc, int value);
    int gpiod_get_value(struct gpio_desc *desc);
    void gpiod_set_value(struct gpio_desc *desc, int value);
    int gpiod_to_irq(struct gpio_desc *desc);
    int gpiod_export(struct gpio_desc *desc, bool direction_may_change);
    int gpiod_export_link(struct device *dev, const char *name,
			  struct gpio_desc *desc);
    void gpiod_unexport(struct gpio_desc *desc);

In short: the gpio_ prefix on the existing GPIO functions has been changed to gpiod_ and the integer GPIO number argument is now a struct gpio_desc *. There is also a new include file for the new functions; otherwise the interfaces are identical. The existing, integer-based API still exists, but it has been reimplemented as a layer on top of the descriptor-based API shown here.

What is missing from the above list, though, is any way of obtaining a descriptor for a GPIO line in the first place. One way to do that is to get the descriptor from the traditional GPIO number:

    struct gpio_desc *gpio_to_desc(unsigned gpio);

There is also a desc_to_gpio() for going in the opposite direction. Using this function makes it easy to transition existing code over to the new API. Obtaining a descriptor in this manner will ensure that no code accesses a GPIO without having first properly obtained a descriptor for it, but it would be better to do away with the numbers altogether in favor of a more robust way of looking up GPIOs. The patch set adds this functionality in this form:

    struct gpio_desc *gpiod_get(struct device *dev, const char *name);

Here, dev should be the device providing the GPIO line, and "name" describes that line. The dev pointer is needed to disambiguate the name, and because code accessing a GPIO line should know which device it is working through in any case. So, for example, a video acquisition bridge device may need access to GPIO lines with names like "sensor-power", "sensor-reset", "sensor-i2c-clock" and "sensor-i2c-data". The driver could then request those lines by name with gpiod_get() without ever having to be concerned with numbers.

Needless to say, there is a gpiod_put() for releasing access to a GPIO line.
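
Putting the pieces together, the bridge driver sketched above might do something like the following (an illustration only; error handling is omitted):

    #include <linux/gpio/consumer.h>

    struct gpio_desc *power;

    /* Look up the line this board calls "sensor-power" and drive it high */
    power = gpiod_get(dev, "sensor-power");
    gpiod_direction_output(power, 1);

    /* ... talk to the sensor ... */

    gpiod_set_value(power, 0);          /* power the sensor back down */
    gpiod_put(power);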

The actual association of names with GPIO lines can be done by the driver that implements those lines, if the names are static and known. In many cases, though, the routing of GPIO lines will have been done by whoever designed a specific system-on-chip or board; there is no way for the driver author to know ahead of time how a specific system may be wired. In this case, the names of the GPIO lines will most likely be specified in the device tree, or, if all else fails, in a platform data structure.

The response to this interface is generally positive; it seems almost certain that it will be merged in the near future. The biggest remaining concern, perhaps, is that the descriptor interface is implemented entirely within the gpiolib layer. Most architectures use gpiolib to implement the GPIO interface, but it is not mandatory; in some cases, the gpio_* functions are implemented as macros that access the device registers directly. Such an implementation is probably more efficient, but GPIO is not usually a performance-critical part of the system. So there may be pressure for all architectures to move to gpiolib; that, in turn, would facilitate the eventual removal of the number-based API entirely.

Block GPIO

The GPIO interface as described so far is focused on the management of individual GPIO lines. But GPIOs are often used together as a group. As a simple example, consider a pair of GPIOs used as an I2C bus; one line handles data, the other the clock. A bit-banging driver can manage those two lines together to communicate with connected I2C devices; the kernel contains a driver in drivers/i2c/busses/i2c-gpio.c for just this purpose.

Most of the time, managing GPIOs individually works fine, even when they are used as a group. Computers are quite fast relative to the timing requirements of most of the serial communications protocols that tend to be implemented with GPIO. But there are exceptions, especially when the hardware implementing the GPIO lines itself is slow; that can make it hard to change multiple lines simultaneously. Sometimes, though, the hardware can change lines simultaneously if properly asked; often the lines are represented by bits in the same device register and can all be changed together with a single I/O memory write operation.

Roland Stigge's block GPIO patch set is an attempt to make that functionality available in the kernel. Code that needs to manipulate multiple GPIOs as a group would start by associating them in a single block with:

    struct gpio_block *gpio_block_create(unsigned int *gpios, size_t size,
				     	 const char *name);

gpios points to an array of size GPIO numbers which are to be grouped into a block; the given name can be used to work with the block from user space. The GPIOs should have already been requested with gpio_request(); they also need to have their direction set individually. It's worth noting that the GPIOs need not be located on the same hardware; if they are spread out, or if the underlying driver does not implement the internal block API, the block GPIO interface will just access those lines individually as is done now.

Manipulation of GPIO blocks is done with:

    unsigned long gpio_block_get(struct gpio_block *block, unsigned long mask);
    void gpio_block_set(struct gpio_block *block, unsigned long mask,
		    	unsigned long values);

For both functions, block is a GPIO block created as described above, and mask is a bitmask specifying which GPIOs in the block are to be acted upon; each bit in mask enables the corresponding GPIO in the array passed to gpio_block_create(). This API implies that the number of bits in a long forces an upper bound on the number of lines grouped into a GPIO block; that seems unlikely to be a problem in real-world use. gpio_block_get() will read the specified lines, simultaneously if possible, and return a bitmask with the result. The lines in a GPIO block can be set as a unit with gpio_block_set().

A GPIO block is released with:

    void gpio_block_free(struct gpio_block *block);
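
In rough outline, a driver wanting to toggle two lines in lockstep might use the API like this (a sketch; the GPIO numbers are arbitrary and error handling is omitted):

    unsigned int gpios[2] = { 42, 43 };     /* two already-requested GPIOs */
    struct gpio_block *block;

    /* Directions must have been set individually beforehand */
    block = gpio_block_create(gpios, 2, "example-block");

    gpio_block_set(block, 0x3, 0x3);        /* drive both lines high together */
    /* ... */
    gpio_block_set(block, 0x3, 0x0);        /* and low again */
    gpio_block_free(block);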

There is also a pair of registration functions:

    int gpio_block_register(struct gpio_block *block);
    void gpio_block_unregister(struct gpio_block *block);

Registering a GPIO block makes it available to user space. There is a sysfs interface that can be used to query and set the GPIOs in a block. Interestingly, registration also creates a device node (using the name provided to gpio_block_create()); reading from that device returns the current state of the GPIOs in the block, while writing to it will set the GPIOs accordingly. There is an ioctl() operation (which, strangely, uses zero as the command number) to set the mask to be used with read and write operations.

This patch set has not generated as much discussion as the descriptor-based API patches (it is also obviously not yet integrated with the descriptor API). Most likely, relatively few developers have felt the need for a block-based API. That said, there are cases when it is likely to be useful, and there appears to be no opposition, so this API can eventually be expected to be merged as well.

Comments (7 posted)

Making EPERM friendlier

By Michael Kerrisk
January 19, 2013

Error reporting from the kernel (and low-level system libraries such as the C library) has been a primitive affair since the earliest UNIX systems. One of the consequences of this is that end users and system administrators often encounter error messages that provide quite limited information about the cause of the error, making it difficult to diagnose the underlying problem. Some recent discussions on the libc-alpha and Linux kernel mailing lists were started by developers who would like to improve this state of affairs by having the kernel provide more detailed error information to user space.

The traditional UNIX (and Linux) method of error reporting is via the (per-thread) global errno variable. The C library wrapper functions that invoke system calls indicate an error by returning -1 as the function result and setting errno to a positive integer value that identifies the cause of the error.

The fact that errno is a global variable is a source of complications for user-space programs. Because each system call may overwrite the global value, it is sometimes necessary to save a copy of the value if it needs to be preserved while making another system call. The fact that errno is global also means that signal handlers that make system calls must save a copy of errno on entry to the handler and restore it on exit, to prevent the possibility of overwriting an errno value that had previously been set in the main program.
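
A typical handler therefore looks something like this:

    #include <errno.h>
    #include <unistd.h>

    static void handler(int sig)
    {
        int saved_errno = errno;    /* preserve the interrupted code's errno */

        /* ... system calls made here may overwrite errno ... */
        write(STDERR_FILENO, "caught signal\n", 14);

        errno = saved_errno;        /* restore it before returning */
    }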

Another problem with errno is that the information it reports is rather minimal: one of somewhat more than one hundred integer codes. Given that the kernel provides hundreds of system calls, many of which have multiple error cases, the mapping of errors to errno values inevitably means a loss of information.

That loss of information can be particularly acute when it comes to certain commonly used errno values. In a message to the libc-alpha mailing list, Dan Walsh explained the problem for two errors that are frequently encountered by end users:

Traditionally, if a process attempts a forbidden operation, errno for that thread is set to EACCES or EPERM, and a call to strerror() returns a localized version of "Permission Denied" or "Operation not permitted". This string appears throughout textual uis and syslogs. For example, it will show up in command-line tools, in exceptions within scripting languages, etc.

Those two errors have been defined on UNIX systems since early times. POSIX defines EACCES as "an attempt was made to access a file in a way forbidden by its file access permissions" and EPERM as "an attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resource." These definitions were fairly comprehensible on early UNIX systems, where the kernel was much less complex, the only method of controlling file access was via classical rwx file permissions, and the only kind of privilege separation was via user and group IDs and superuser versus non-superuser. However, life is rather more complex on modern UNIX systems.

In all, EPERM and EACCES are returned by more than 3000 locations across the Linux 3.7 kernel source code. However, it is not so much the number of return paths yielding these errors that is the problem. Rather, the problem for end users is determining the underlying cause of the errors. The possible causes are many, including denial of file access because of insufficient (classical) file permissions or because of permissions in an ACL, lack of the right capability, denial of an operation by a Linux Security Module or by the seccomp mechanism, and any of a number of other reasons. Dan summarized the problem faced by the end user:

As we continue to add mechanisms for the Kernel to deny permissions, the Administrator/User is faced with just a message that says "Permission Denied" Then if the administrator is lucky enough or skilled enough to know where to look, he might be able to understand why the process was denied access.

Dan's mail linked to a wiki page ("Friendly EPERM") with a proposal on how to deal with the problem. That proposal involves changes to both the kernel and the GNU C library (glibc). The kernel changes would add a mechanism for exposing a "failure cookie" to user space that would provide more detailed information about the error delivered in errno. On the glibc side, strerror() and related calls (e.g., perror()) would access the failure cookie in order to obtain information that could be used to provide a more detailed error message to the user.

Roland McGrath was quick to point out that the solution is not so simple. The problem is that it is quite common for applications to call strerror() only some time after a failed system call, or to do things such as saving errno in a temporary location and then restoring it later. In the meantime, the application is likely to have performed further system calls that may have changed the value of the failure cookie.

Roland went on to identify some of the problems inherent in trying to extend existing standardized interfaces in order to provide useful error information to end users:

It is indeed an unfortunate limitation of POSIX-like interfaces that error reporting is limited to a single integer. But it's very deeply ingrained in the fundamental structure of all Unix-like interfaces.

Frankly, I don't see any practical way to achieve what you're after. In most cases, you can't even add new different errno codes for different kinds of permission errors, because POSIX specifies the standard code for certain errors and you'd break both standards compliance and all applications that test for standard errno codes to treat known classes of errors in particular ways.

In response, Eric Paris, one of the other proponents of the failure-cookie idea, acknowledged Roland's points, noting that, since the standard APIs can't be extended, changes would be required to each application that wanted to take advantage of any additional error information provided by the kernel.

Eric subsequently posted a note to the kernel mailing list with a proposal on the kernel changes required to support improved error reporting. In essence, he proposes exposing some form of binary structure to user space that describes the cause of the last EPERM or EACCES error returned to the process by the kernel. That structure might, for example, be exposed via a thread-specific file in the /proc filesystem.

The structure would take the form of an initial field that indicates the subsystem that triggered the error—for example, capabilities, SELinux, or file permissions—followed by a union of substructures that provide subsystem-specific detail on the circumstances that triggered the error. Thus, for a file permissions error, the substructure might return the effective user and group ID of the process, the file user ID and group ID, and the file permission bits. At the user-space level, the binary structure could be read and translated to human-readable strings, perhaps via a glibc function that Eric suggested might be named something like get_extended_error_info().
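
No structure layout has been agreed on, but, going by Eric's description, it might look roughly like the following; the field and type names here are entirely hypothetical:

    #include <sys/types.h>

    /* Hypothetical sketch only; not taken from an actual kernel patch */
    struct extended_error_info {
        int subsystem;              /* capabilities, SELinux, file permissions, ... */
        union {
            struct {
                uid_t proc_euid;    /* effective IDs of the failing process */
                gid_t proc_egid;
                uid_t file_uid;     /* owner and mode of the file involved */
                gid_t file_gid;
                mode_t file_mode;
            } file_perm;
            struct {
                int capability;     /* the capability that was found lacking */
            } cap;
            /* ... further subsystem-specific variants ... */
        } u;
    };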

Each of the kernel call sites that returned an EPERM or EACCES error would then need to be patched to update this information. But, patching all of those call sites would not be necessary to make the feature useful. As Eric noted:

But just getting extended denial information in a couple of the hot spots would be a huge win. Put it in capable(), LSM hooks, the open() syscall and path walk code.

There were various comments on Eric's proposal. In response to concerns from Stephen Smalley that this feature might leak information (such as file attributes) that could be considered sensitive in systems with a strict security policy (enforced by an LSM), Eric responded that the system could provide a sysctl to disable the feature:

I know many people are worried about information leaks, so I'll right up front say lets add the sysctl to disable the interface for those who are concerned about the metadata information leak. But for most of us I want that data right when it happens, where it happens, so It can be exposed, used, and acted upon by the admin trying to troubleshoot why the shit just hit the fan.

Reasoning that it's best to use an existing format and its tools rather than inventing a new format for error reporting, Casey Schaufler suggested that audit records should be used instead:

the string returned by get_extended_error_info() ought to be the audit record the system call would generate, regardless of whether the audit system would emit it or not. If the audit record doesn't have the information you need we should fix the audit system to provide it. Any bit of the information in the audit record might be relevant, and your admin or developer might need to see it.

Eric expressed concerns that copying an audit record to the process's task_struct would carry more of a performance hit than copying a few integers to that structure, concluding:

I don't see a problem storing the last audit record if it exists, but I don't like making audit part of the normal workflow. I'd do it if others like that though.

Jakub Jelinek wondered which system call Eric's mechanism should return information about, and whether its state would be reset if a subsequent system call succeeded. In many cases, there is no one-to-one mapping between C library calls and system calls, so that some library functions may make one system call, save errno, then make some other system call (that may or may not also fail), and then restore the first system call's errno before returning to the caller. Other C library functions themselves set errno. "So, when would it be safe to call this new get_extended_error_info function and how to determine to which syscall it was relevant?"

Eric's opinion was that the mechanism should return information about the last kernel system call. "It would be really neat for libc to have a way to save and restore the extended errno information, maybe even supply its own if it made the choice in userspace, but that sounds really hard for the first pass."

However, there are problems with such a bare-bones approach. If the value returned by get_extended_error_info() corresponds to the last system call, rather than the errno value actually returned to user space, this risks confusing user-space applications (and users). Carlos O'Donell, who had earlier raised some of the same questions as Jakub and pointed out the need to properly handle the extended error information when a signal handler interrupts the main program, agreed with Casey's assessment that get_extended_error_info() should always return a value that corresponds to the current content of errno. That implies the need for a user-space function that can save and restore the extended error information.

Finally, David Gilbert suggested that it would be useful to broaden Eric's proposal to handle errors beyond EPERM and EACCES. "I've wasted way too much time trying to figure out why mmap (for example) has given me an EINVAL; there are just too many holes you can fall into."

In the last few days, discussion in the thread has gone quiet. However, it's clear that Dan and Eric have identified a very real and practical problem (and one that has been identified by others in the past). The solution would probably need to address the concerns raised in the discussion—most notably the need to have get_extended_error_info() always correspond to the current value of errno—and might possibly also be generalized beyond EPERM and EACCES. However, that should all be feasible, assuming someone takes on the (not insignificant) work of fleshing out the design and implementing it. If they do, the lives of system administrators and end users should become considerably easier when it comes to diagnosing the causes of software error reports.

Comments (90 posted)

Patches and updates

Kernel trees

Build system

Core kernel code

Development tools

Device drivers

Filesystems and block I/O

Memory management

Networking

Architecture-specific

Security-related

Virtualization and containers

Miscellaneous

Page editor: Jonathan Corbet

Distributions

Jolla and the Hildon Foundation struggle to cooperate

By Nathan Willis
January 23, 2013

The fallout from the dissolution of the MeeGo project continues to impact many in the Linux-based mobile device world. Former employees from Nokia's Maemo/MeeGo division formed a company called Jolla, pledging to pick up development and deliver a platform to mobile phone companies. But although Jolla has announced partnerships with device vendors, the company has been less successful at engaging with the still-active Maemo/MeeGo community. Add in the overlapping open source projects claiming ties to MeeGo, and it can be confusing to see how the pieces fit together. Shaking out the relationships between players is challenging for those on the inside, too, as Jolla and a major community group have recently discovered.

Million little pieces

When Nokia terminated its participation in MeeGo there were—to put it mildly—hurt feelings, among both the developer community and Nokia's MeeGo engineers. Some of the veterans in both camps followed MeeGo's other underwriter Intel to the launch of MeeGo's official successor Tizen, but others hedged their bets. Most notable was the Mer project, which forked the existing MeeGo code and vowed to continue it as a community-run project. Subsequently, Mer project participants launched the Nemo Mobile distribution, which added MeeGo Handset UX components to the Mer base system. Still, the project continued to advertise Mer itself as a general-purpose system equally suited to be the base for MeeGo, Tizen, or other platforms, including KDE's Plasma Active.

Jolla was formed privately sometime in 2011, but it announced itself in public in July 2012. Started by a group of former Nokia employees, Jolla advertised that it would "design, develop and sell new MeeGo based smartphones." Since then the company reportedly signed a sales agreement with Chinese retailer D.Phone and hired several Mer developers to help develop the platform.

But details about the company's work have been hard to come by, with almost all information arriving through second-hand sources like interviews with executives published at blogs, or social media outlets like the company's Twitter account or Facebook page. The jolla.com site is an un-navigable matrix of Twitter and Facebook links that are dynamically loaded one screenful at a time via JavaScript. The site does offer a press release (PDF) from November 2012, though, highlighting Jolla's appearance at a start-up event in Helsinki. At the event, Jolla previewed Sailfish OS, its Mer-based software platform.

Yet there is precious little information in print about Sailfish OS; the site is a wiki that highlights Sailfish's inclusion of Mer and Qt5, but not much else. The QA page explains that Sailfish will provide hardware adaptation and a user interface layer that are not found in Mer, and that components from Nemo Mobile will be included as well (although not the latter project's UI layer). Videos of the demonstration talks at the Helsinki start-up event are available on the Jolla YouTube channel, including a walkthrough of the software development kit (SDK), but the SDK has not been released.

The Mer, Nemo Mobile, and Jolla camps are largely composed of developers working on the core of the operating system. But one of Maemo and MeeGo's bigger accomplishments was its success at cultivating a large and active community of third-party application developers. As with the platform developers, some of the third-party application developers have migrated to Tizen, but a sizable contingent embarked on their own hedging strategy by forming the Hildon Foundation (HF) in September 2012. The name comes from the pre-MeeGo application framework from Maemo, but the group emphasizes that it, too, is happy to work with the other post-MeeGo projects, Mer included.

The chicken and egg salad problem

It might seem as if the HF and the various MeeGo-derived platform projects need each other in a big way; after all, a platform needs applications just as much as applications need a platform. But, so far, the disparate pieces have yet to link up and establish a strong partnership.

The HF took a step toward more cooperation on January 3, 2013 by publishing an open letter addressed to Jolla. The letter notes that one of the principal motivations behind the formation of the HF was Nokia's impending decommissioning of the maemo.org web infrastructure, which hosted discussion forums, documentation, and software repositories. The HF undertook the task of maintaining the sites, but had little success in raising the funds necessary to cover the hosting bills. The application ecosystem centered around those sites, the letter argues, would make the HF "a beneficial advocate for Jolla." To that end, it suggests the formation of a relationship between the foundation and the company:

We would welcome discussion about a relationship between our organizations… If anything, we would like to form a friendship. If Jolla would be willing to become a sponsor and aid us in promoting the open source software aspects of Jolla and Sailfish — or, possibly even foot the bill — we would offer valuable community support to Jolla. In fact, we see a huge possibility of the Hildon Foundation becoming a part of the Jolla ecosystem or Sailfish Alliance itself.

A Jolla representative responded to the letter with a comment on the post—one which was not overwhelmingly positive. "We can talk on a regular basis and we can form a friendship," it said. "But talking and claiming friendship is easy." Jolla is already providing financial support to Mer and Nemo Mobile, the comment said, projects with "very small budgets" and which:

...have based their operations around being able to do big things, with small resources, like a startup or community would – and not only with contributions from just Jolla.

For what we want to create and the world to come with Jolla and Sailfish – these projects are where you would want to be participating, to be in the front seat of what’s to come. Places to be true pioneers.

The reactions from other commenters to Jolla's reply varied; user Mark Z asserted that it amounted to the rejection of the HF's large pool of potential customers (by way, presumably, of the HF application developers' existing customers). Others, like joerg_rw, thought that a formal relationship was too much to expect from a new startup, but that establishing contact was good for the moment.

The HF board posted a longer response on January 10. The response seemed to take slight umbrage at the "true pioneers" comment in Jolla's reply, saying that "in all honesty, the Maemo community has been a true pioneer for many years now." It also laid out a case that Jolla, Mer, and Nemo Mobile lack the specific infrastructure elements that comprise the HF's mission:

Currently, neither Jolla nor any Jolla affiliates (Mer, Nemo, Sailfish OS, etc.) have any sort of "community," user base, or public space for communication to speak of. These assets can be extended if, for example, you are interested in a community repository, cross-platform development, or helping developers migrate to Sailfish. We had already intended to provide a public space for Jolla/Nemo/Sailfish and will continue to do so regardless. Would Jolla like to help with any of these efforts that will benefit Jolla?

So far, Jolla has not followed up with a second reply.

Whether time and the benefits of launching Sailfish OS on handsets in China will change minds at Jolla is anybody's guess. But at the moment, some members of the HF remain optimistic, particularly if Sailfish OS's Qt-based API is substantially similar to Nemo Mobile and MeeGo's. Logic would suggest that, at some point, Jolla will set out to cultivate an application developer community, and the Sailfish OS SDK previewed in November indicates that the plan is already in the works. One would simply hope that the company does not alienate the existing developer community between now and the SDK's public release.

On the other hand, Mer would seem to be a more natural fit for a partnership with the HF, since it too is a community-driven project. But despite its secrecy in other areas, it is evident that Jolla has financing, which cannot be said about the other, community-driven projects.

Whatever happens between Mer, Jolla, Nemo Mobile, and the HF, it is remarkable to see how durable the Maemo/MeeGo community has remained years after its corporate founder washed its hands of the project. Strangely absent from the discussion about how the above camps can work together is Tizen, which is supposed to be the official successor to MeeGo. The platform and the API differ considerably, but if anyone at Tizen is watching carefully, they may see an opportunity to step in and win over new fans and developers, since the other parties cannot seem to all get together on the same page.

Comments (2 posted)

Brief items

Distribution quotes of the week

The irregularly shaped bovine just hit my office laptop.
-- Heherson Pagcaliwagan

It's part of the nefarious master plan to turn all of our computers into giant cell phones/tablets.
-- Ian Pilcher

Comments (none posted)

CentOS 5.9 released

The CentOS project has announced the availability of CentOS 5.9. See the release notes for details.

Comments (none posted)

Fedora 18 for IBM System z 64bit official release

Fedora 18 for IBM System z is available. More information can be found in the architecture specific notes.

Full Story (comments: none)

SolusOS: The Consort Desktop Environment

SolusOS is a relatively new desktop distribution, based on Debian. The project has announced the Consort Desktop Environment, a fork of GNOME Classic. The pieces include consort-panel (gnome-panel), Athena (nautilus), consort-session (gnome-session-fallback), and Consortium (Metacity). "With Consortium, we forked Metacity 2.34. Basically its been a dead project for a while and needs some new life. Work is now underway to bring it up to GTK3/Cairo standards so that we can improve it. Fully antialiased window corners and plugins will be introduced, as well as extended theming. With this new DE, we still maintain total compatibility with the GNOME suite itself. We’re not touching core-libraries so you’ll be able to install our desktop on most distros with GTK3, and have no issues." The recent development release of SolusOS 2 Alpha 7 includes the young Consort DE.

Comments (10 posted)

Distribution News

Debian GNU/Linux

Bits from the Release Team: New members, Help needed and Goals

Neil McGovern reports on behalf of the Debian release team. Jonathan Wiltshire has joined the team as a Release Assistant. Other topics include the bug count, Bug Squashing Parties, a call for a release notes editor, and an update on the freeze policy.

Full Story (comments: none)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

Ubuntu considers “huge” change that would end traditional release cycle (ars technica)

Ars technica reports that Canonical is considering switching to a rolling release model for the Ubuntu distribution. "But 14.04 in April 2014 could be the last version released after just a six-month development period. 14.04 is also the next 'Long Term Support' or LTS edition. Every two years, Ubuntu is sort of frozen in place with a more stable edition that is guaranteed support for five years. If the change Canonical is considering is adopted, every future edition starting with 14.04 will be an LTS, so the next version after 14.04 would be 16.04 in April 2016."

Comments (51 posted)

Linux distro spotlight: Mageia (ComputerWorld)

ComputerWorld shines a spotlight on Mageia, the two-year-old fork of Mandriva. "When it comes to the community around Mageia, [Trish] Fraser believes its inclusiveness has been a key strength. "Teams are approachable, and support is friendly," she says. "This might be more difficult to sustain in a much larger community, but so far it's been a very strong part of our working modus. We don't have flame-fests, there's a better proportion of women (visible and active!) in the community than many other distros, we aren't particularly country- or language-centric — and people concentrate on supporting each other and getting the work done as well as we can.""

Comments (none posted)

Meet PicUntu, a lightweight Linux designed for tiny PCs (PCWorld)

PCWorld introduces PicUntu, an Ubuntu based distribution that targets the RK3066 chipset. PicUntu has also been tested on the MK808 and UG802 devices. "Potential applications for the resulting device include a company Web server, corporate mail server, central database server, content manager, “developer's paradise,” or power GUI desktop, it says, with optional extras available such as Flash, graphics programs, and Office suite clones."

Comments (3 posted)

Page editor: Rebecca Sobol

Development

Namespaces in operation, part 4: more on PID namespaces

By Michael Kerrisk
January 23, 2013

In this article, we continue last week's discussion of PID namespaces (and extend our ongoing series on namespaces). One use of PID namespaces is to implement a package of processes (a container) that behaves like a self-contained Linux system. A key part of a traditional system—and likewise a PID namespace container—is the init process. Thus, we'll look at the special role of the init process and note one or two areas where it differs from the traditional init process. In addition, we'll look at some other details of the namespaces API as it applies to PID namespaces.

The PID namespace init process

The first process created inside a PID namespace gets a process ID of 1 within the namespace. This process has a similar role to the init process on traditional Linux systems. In particular, the init process can perform initializations required for the PID namespace as a whole (e.g., perhaps starting other processes that should be a standard part of the namespace) and becomes the parent for processes in the namespace that become orphaned.

In order to explain the operation of PID namespaces, we'll make use of a few purpose-built example programs. The first of these programs, ns_child_exec.c, has the following command-line syntax:

    ns_child_exec [options] command [arguments]

The ns_child_exec program uses the clone() system call to create a child process; the child then executes the given command with the optional arguments. The main purpose of the options is to specify new namespaces that should be created as part of the clone() call. For example, the -p option causes the child to be created in a new PID namespace, as in the following example:

    $ su                  # Need privilege to create a PID namespace
    Password:
    # ./ns_child_exec -p sh -c 'echo $$'
    1

That command line creates a child in a new PID namespace to execute a shell echo command that displays the shell's PID. With a PID of 1, the shell was the init process for the PID namespace that (briefly) existed while the shell was running.
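
At its core, ns_child_exec presumably amounts to little more than a clone() call with the requested namespace flags; a minimal sketch (not the program's actual source) looks like this:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <unistd.h>
    #include <sys/wait.h>

    static char child_stack[1024 * 1024];

    static int child_func(void *arg)
    {
        char **args = arg;
        execvp(args[0], args);      /* run the command inside the new namespace */
        return 1;                   /* reached only if execvp() fails */
    }

    int main(int argc, char *argv[])
    {
        /* CLONE_NEWPID requires privilege (CAP_SYS_ADMIN) */
        pid_t child = clone(child_func, child_stack + sizeof(child_stack),
                            CLONE_NEWPID | SIGCHLD, &argv[1]);

        waitpid(child, NULL, 0);
        return 0;
    }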

Our next example program, simple_init.c, is a program that we'll execute as the init process of a PID namespace. This program is designed to allow us to demonstrate some features of PID namespaces and the init process.

The simple_init program performs the two main functions of init. One of these functions is "system initialization". Most init systems are more complex programs that take a table-driven approach to system initialization. Our (much simpler) simple_init program provides a simple shell facility that allows the user to manually execute any shell commands that might be needed to initialize the namespace; this approach also allows us to freely execute shell commands in order to conduct experiments in the namespace. The other function performed by simple_init is to reap the status of its terminated children using waitpid().
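
The reaping half of simple_init can be quite small; a sketch of such a SIGCHLD handler (not the program's actual code) might be:

    #include <stdio.h>
    #include <sys/wait.h>

    /* Reap every child that has terminated; called on SIGCHLD.
       (printf() is not async-signal-safe, but suffices for a demonstration.) */
    static void child_handler(int sig)
    {
        pid_t pid;

        while ((pid = waitpid(-1, NULL, WNOHANG)) > 0)
            printf("\tinit: SIGCHLD handler: PID %ld terminated\n", (long) pid);
    }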

Thus, for example, we can use the ns_child_exec program in conjunction with simple_init to fire up an init process that runs in a new PID namespace:

    # ./ns_child_exec -p ./simple_init
    init$

The init$ prompt indicates that the simple_init program is ready to read and execute a shell command.

We'll now use the two programs we've presented so far in conjunction with another small program, orphan.c, to demonstrate that processes that become orphaned inside a PID namespace are adopted by the PID namespace init process, rather than the system-wide init process.

The orphan program performs a fork() to create a child process. The parent process then exits while the child continues to run; when the parent exits, the child becomes an orphan. The child executes a loop that continues until it becomes an orphan (i.e., getppid() returns 1); once the child becomes an orphan, it terminates. The parent and the child print messages so that we can see when the two processes terminate and when the child becomes an orphan.
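
From that description, the heart of orphan.c is something like the following (a sketch reconstructed from the description above, not the program's actual source):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t child = fork();

        if (child == -1)
            exit(EXIT_FAILURE);

        if (child > 0) {            /* parent: announce the child, then exit */
            printf("Parent (PID=%ld) created child with PID %ld\n",
                   (long) getpid(), (long) child);
            printf("Parent (PID=%ld; PPID=%ld) terminating\n",
                   (long) getpid(), (long) getppid());
            exit(EXIT_SUCCESS);
        }

        /* Child: loop until adopted by init, then terminate */
        while (getppid() != 1)
            usleep(100000);

        printf("Child  (PID=%ld) now an orphan (parent PID=%ld)\n",
               (long) getpid(), (long) getppid());
        printf("Child  (PID=%ld) terminating\n", (long) getpid());
        exit(EXIT_SUCCESS);
    }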

In order to see that our simple_init program reaps the orphaned child process, we'll employ that program's -v option, which causes it to produce verbose messages about the children that it creates and the terminated children whose status it reaps:

    # ./ns_child_exec -p ./simple_init -v
            init: my PID is 1
    init$ ./orphan
            init: created child 2
    Parent (PID=2) created child with PID 3
    Parent (PID=2; PPID=1) terminating
            init: SIGCHLD handler: PID 2 terminated
    init$                   # simple_init prompt interleaved with output from child
    Child  (PID=3) now an orphan (parent PID=1)
    Child  (PID=3) terminating
            init: SIGCHLD handler: PID 3 terminated

In the above output, the indented messages prefixed with init: are printed by the simple_init program's verbose mode. All of the other messages (other than the init$ prompts) are produced by the orphan program. From the output, we can see that the child process (PID 3) becomes an orphan when its parent (PID 2) terminates. At that point, the child is adopted by the PID namespace init process (PID 1), which reaps the child when it terminates.

Signals and the init process

The traditional Linux init process is treated specially with respect to signals. The only signals that can be delivered to init are those for which the process has established a signal handler; all other signals are ignored. This prevents the init process—whose presence is essential for the stable operation of the system—from being accidentally killed, even by the superuser.

PID namespaces implement some analogous behavior for the namespace-specific init process. Other processes in the namespace (even privileged processes) can send only those signals for which the init process has established a handler. This prevents members of the namespace from inadvertently killing a process that has an essential role in the namespace. Note, however, that (as for the traditional init process) the kernel can still generate signals for the PID namespace init process in all of the usual circumstances (e.g., hardware exceptions, terminal-generated signals such as SIGTTOU, and expiration of a timer).

Signals can also (subject to the usual permission checks) be sent to the PID namespace init process by processes in ancestor PID namespaces. Again, only the signals for which the init process has established a handler can be sent, with two exceptions: SIGKILL and SIGSTOP. When a process in an ancestor PID namespace sends these two signals to the init process, they are forcibly delivered (and can't be caught). The SIGSTOP signal stops the init process; SIGKILL terminates it. Since the init process is essential to the functioning of the PID namespace, if the init process is terminated by SIGKILL (or it terminates for any other reason), the kernel terminates all other processes in the namespace by sending them a SIGKILL signal.

Normally, a PID namespace will also be destroyed when its init process terminates. However, there is an unusual corner case: the namespace won't be destroyed as long as a /proc/PID/ns/pid file for one of the processes in that namespace is bind mounted or held open. However, it is not possible to create new processes in the namespace (via setns() plus fork()): the lack of an init process is detected during the fork() call, which fails with an ENOMEM error (the traditional error indicating that a PID cannot be allocated). In other words, the PID namespace continues to exist, but is no longer usable.

Mounting a procfs filesystem (revisited)

In the previous article in this series, the /proc filesystems (procfs) for the PID namespaces were mounted at various locations other than the traditional /proc mount point. This allowed us to use shell commands to look at the contents of the /proc/PID directories that corresponded to each of the new PID namespaces while at the same time using the ps command to look at the processes visible in the root PID namespace.

However, tools such as ps rely on the contents of the procfs mounted at /proc to obtain the information that they require. Therefore, if we want ps to operate correctly inside a PID namespace, we need to mount a procfs for that namespace. Since the simple_init program permits us to execute shell commands, we can perform this task from the command line, using the mount command:

    # ./ns_child_exec -p -m ./simple_init
    init$ mount -t proc proc /proc
    init$ ps a
      PID TTY      STAT   TIME COMMAND
        1 pts/8    S      0:00 ./simple_init
        3 pts/8    R+     0:00 ps a

The ps a command lists all processes accessible via /proc. In this case, we see only two processes, reflecting the fact that there are only two processes running in the namespace.

When running the ns_child_exec command above, we employed that program's -m option, which places the child that it creates (i.e., the process running simple_init) inside a separate mount namespace. As a consequence, the mount command does not affect the /proc mount seen by processes outside the namespace.

unshare() and setns()

In the second article in this series, we described two system calls that are part of the namespaces API: unshare() and setns(). Since Linux 3.8, these system calls can be employed with PID namespaces, but they have some idiosyncrasies when used with those namespaces.

Specifying the CLONE_NEWPID flag in a call to unshare() creates a new PID namespace, but does not place the caller in the new namespace. Rather, any children created by the caller will be placed in the new namespace; the first such child will become the init process for the namespace.
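
A minimal sketch of that behavior (run with privilege, on a kernel with Linux 3.8's PID namespace support in unshare()):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        unshare(CLONE_NEWPID);      /* the caller's own PID namespace is unchanged */

        if (fork() == 0) {          /* the first child becomes the namespace's init */
            printf("child's PID in the new namespace: %ld\n", (long) getpid());
            _exit(0);               /* prints 1 */
        }

        wait(NULL);
        return 0;
    }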

The setns() system call now supports PID namespaces:

    setns(fd, 0);   /* Second argument can be CLONE_NEWPID to force a
                       check that 'fd' refers to a PID namespace */

The fd argument is a file descriptor that identifies a PID namespace that is a descendant of the PID namespace of the caller; that file descriptor is obtained by opening the /proc/PID/ns/pid file for one of the processes in the target namespace. As with unshare(), setns() does not move the caller to the PID namespace; instead, children that are subsequently created by the caller will be placed in the namespace.
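
In outline, then, joining a PID namespace and running something inside it looks like this (a sketch along the lines of the ns_run program described next; error checking is omitted):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <sched.h>
    #include <unistd.h>
    #include <sys/wait.h>

    /* 'path' names a /proc/PID/ns/pid file for the target namespace */
    static void run_in_pid_ns(const char *path, char **args)
    {
        int fd = open(path, O_RDONLY);

        setns(fd, 0);               /* the caller itself stays where it is */
        close(fd);

        if (fork() == 0) {          /* but this child lands in the target namespace */
            execvp(args[0], args);
            _exit(1);
        }
        wait(NULL);
    }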

We can use an enhanced version of the ns_exec.c program that we presented in the second article in this series to demonstrate some aspects of using setns() with PID namespaces that appear surprising until we understand what is going on. The new program, ns_run.c, has the following syntax:

    ns_run [-f] [-n /proc/PID/ns/FILE]... command [arguments]

The program uses setns() to join the namespaces specified by the /proc/PID/ns files contained within -n options. It then goes on to execute the given command with optional arguments. If the -f option is specified, it uses fork() to create a child process that is used to execute the command.

Suppose that, in one terminal window, we fire up our simple_init program in a new PID namespace in the usual manner, with verbose logging so that we are informed when it reaps child processes:

    # ./ns_child_exec -p ./simple_init -v
            init: my PID is 1
    init$ 

Then we switch to a second terminal window where we use the ns_run program to execute our orphan program. This will have the effect of creating two processes in the PID namespace governed by simple_init:

    # ps -C sleep -C simple_init
      PID TTY          TIME CMD
     9147 pts/8    00:00:00 simple_init
     # ./ns_run -f -n /proc/9147/ns/pid ./orphan
     Parent (PID=2) created child with PID 3
     Parent (PID=2; PPID=0) terminating
     # 
     Child  (PID=3) now an orphan (parent PID=1)
     Child  (PID=3) terminating

Looking at the output from the "Parent" process (PID 2) created when the orphan program is executed, we see that its parent process ID is 0. This reflects the fact that the process that started the orphan process (ns_run) is in a different namespace—one whose members are invisible to the "Parent" process. As already noted in the previous article, getppid() returns 0 in this case.

The following diagram shows the relationships of the various processes before the orphan "Parent" process terminates. The arrows indicate parent-child relationships between processes.

[Relationship of processes inside PID namespaces]

Returning to the window running the simple_init program, we see the following output:

    init: SIGCHLD handler: PID 3 terminated

The "Child" process (PID 3) created by the orphan program was reaped by simple_init, but the "Parent" process (PID 2) was not. This is because the "Parent" process was reaped by its parent (ns_run) in a different namespace. The following diagram shows the processes and their relationships after the orphan "Parent" process has terminated and before the "Child" terminates.

[Relationship of processes inside PID namespaces]

It's worth emphasizing that setns() and unshare() treat PID namespaces specially. For other types of namespaces, these system calls do change the namespace of the caller. The reason that these system calls do not change the PID namespace of the calling process is that becoming a member of another PID namespace would cause the process's idea of its own PID to change, since getpid() reports the process's PID with respect to the PID namespace in which the process resides. Many user-space programs and libraries rely on the assumption that a process's PID (as reported by getpid()) is constant (in fact, the GNU C library getpid() wrapper function caches the PID); those programs would break if a process's PID changed. To put things another way: a process's PID namespace membership is determined when the process is created, and (unlike other types of namespace membership) cannot be changed thereafter.

Concluding remarks

In this article we've looked at the special role of the PID namespace init process, shown how to mount a procfs for a PID namespace so that it can be used by tools such as ps, and looked at some of the peculiarities of unshare() and setns() when employed with PID namespaces. This completes our discussion of PID namespaces; in the next article, we'll turn to look at user namespaces.

Comments (17 posted)

Brief items

Quotes of the week

at this point in time, i personally can see absolutely no reason why a regular user should not have access to RT scheduling or memlock if the kernel and PAM (or equivalent) are normally and appropriately configured. give the user the ability to memlock 75% of the system RAM, make sure that the RT scheduling parameters reserve 5% of the CPU for non-RT tasks. done.
Paul Davis (thanks to David Nielson)

For example, I really find it appalling that Linux had proper threads support (and even in the libc!) so early, at a time where OpenBSD didn't. I think Linux really hurt the open source ecosystem with that, as people could write threaded up for Linux that then wouldn't work on OpenBSD.
Lennart Poettering

Comments (5 posted)

GNU Nettle 2.6 released

Version 2.6 of the GNU Nettle cryptographic library has been released. New features include support for SHA3, the GOST R 34.11-94 hash algorithm (RFC 4357), and the PKCS #5 PBKDF2 key derivation function (RFC 2898), used to generate a key from a password or passphrase. Notably, several SHA2 functions have also been renamed for consistency—so read extra carefully.

Full Story (comments: none)

notmuch 0.15 available

David Bremner has released version 0.15 of the notmuch email indexing system and client platform. New in this release is date-range searching, a new tagging interface, and numerous changes to the command line and Emacs front-ends.

Full Story (comments: none)

Trinity 1.1 released

Dave Jones has released version 1.1 of his Trinity fuzz-testing tool just in time for his talk on Trinity at linux.conf.au 2013 next week. The release announcement contains a long list of the changes since the 1.0 release six months ago.

Comments (none posted)

Parrot 5.0.0 available

Version 5.0.0 of the Parrot virtual machine has been released. Among other changes from the 4.x series, this is the first stable version of Parrot to support threads.

Full Story (comments: 2)

Newsletters and articles

Development newsletters from the past week

Comments (none posted)

Humphrey: On Code Review

At his blog, Mozilla's David Humphrey discusses the project's use of mandatory code reviews. "I’ve had people tell me that so-and-so is spending too much time on reviews, implying that if they were writing code instead, they’d be more productive. The thing this misses is how one developer can only write so many patches, but their influence and ability to affect many more is increased through reviews." It is easy to work alone and keep the project small, he says, but reviews are critical for growth. "Code review is humbling, and it’s good to be humbled. Code review is educational, and it’s good to be taught. Code review is public, and it’s good for open source to happen where people can see what’s going on."

Comments (8 posted)

Weir: Merging Lotus Symphony: Allegro moderato

Apache OpenOffice developer Rob Weir has posted a summary of the Lotus Symphony code contribution and what is being done with it. "Of course, a long list of bug fixes is impressive, but what about new features? Symphony brings those as well. A large one is a user interface addition of a 'Task Pane'. Since a Task Pane is continuously available, it can greatly improve productivity with some repetitive operations. Symphony had one version of this feature, but we're not just doing a literal copy of that UI, since grafting a UI of one application onto another is rarely successful. So we're reviewing several variations on a Task Pane design, which you can see on our wiki. Coding for this feature is ongoing in a branch."

Comments (10 posted)

Peters: Announcing the Firefox OS Developer Preview Phone!

On the hacks.mozilla.org blog, Stormy Peters announced developer preview phones for Firefox OS that should be available in early February. "Developer preview phones will help make the mobile web more accessible to more people. Developers are critical to the web and to Mozilla’s mission to make the web accessible to everyone. Hundreds of millions of people worldwide use Firefox to discover, experience and connect to the Web. A web based on open standards and open technologies." Specs on the two models are available in the post and comments, but details on ordering the phones are still in the works.

Comments (3 posted)

Page editor: Nathan Willis

Announcements

Brief items

FSF: End Software Patents

The Free Software Foundation is asking people to join its "End Software Patents" campaign. "We campaign to get rid of software patents altogether. Your donations won't be used on slow, costly processes to invalidate a tiny number of 'stupid patents.' We want video formats to be free, we want software to be compatible, we want companies to compete by developing software not by buying patents, and we want everyone, including individuals and small businesses, to be allowed to write and distribute software, without having to follow rules imposed by patent holders."

Full Story (comments: none)

Articles of interest

Second set of FOSDEM 2013 speaker interviews available

Koen Vervloesem has posted the second round of interviews with this year's Free and Open Source Software Developers' European Meeting (FOSDEM) main track speakers. The first batch was posted on January 10; this batch includes talks with eight more speakers. FOSDEM 2013 is scheduled for February 2 and 3 in Brussels.

Comments (none posted)

Third set of FOSDEM speaker interviews available

The third set of interviews with the main track speakers at FOSDEM (Free and Open Source Software Developers' European Meeting) is available. The first set and the second set are still available as well. FOSDEM 2013 takes place February 2-3 in Brussels.

Comments (none posted)

New Books

'The Book of GIMP' from No Starch Press

No Starch Press has released "The Book of GIMP" by Olivier Lecarme and Karine Delvare.

Full Story (comments: none)

Education and Certification

LPI Hosts Exam Labs at SCALE 11x

The Linux Professional Institute (LPI) will be hosting exams at the Southern California Linux Expo (SCALE) on February 24, 2013.

Full Story (comments: none)

Calls for Presentations

OSCON Call for Proposals

The O'Reilly Open Source Convention (OSCON) will take place July 22-26, 2013 in Portland, Oregon. The call for papers deadline is February 4. They are looking for tutorials and shorter presentations.

Full Story (comments: none)

Linux storage, filesystem and memory management summit CFP

The 2013 Linux storage, filesystem, and memory management summit will be held April 18-19 in San Francisco. Developers interested in attending this event are invited to submit proposals for discussion topics; the deadline is February 8.

Full Story (comments: none)

Upcoming Events

SCALE 11X: Kyle Rankin to keynote

The Southern California Linux Expo (SCALE 11X) has announced that Kyle Rankin will give the second keynote at the expo in Los Angeles, CA on February 24. "Rankin is the author of a variety of books, including “DevOps Troubleshooting,” “The Official Ubuntu Server Book,” “Knoppix Hacks,” “Knoppix Pocket Reference,” “Linux Multimedia Hacks,” and “Ubuntu Hacks.” He is an award-winning columnist for Linux Journal, and has written for PC Magazine, TechTarget websites and other publications. He speaks frequently on Open Source software including at SCALE, OSCON, Linux World Expo, Penguicon, and a number of Linux Users Groups."

Full Story (comments: 1)

2013 Linux Foundation Events, 100 Linux Tutorials

The Linux Foundation has announced its conference schedule for 2013 and introduced the 100 Linux Tutorials campaign. "This video campaign is aimed at increasing access to Linux knowledge, removing barriers to learning Linux, and transferring expertise around the globe. The Linux Foundation invites Linux enthusiasts, developers and systems administrators to be part of this worldwide campaign to collect 100 Linux video tutorials on the Linux.com video forum, which anyone can access and learn with the click of a mouse."

Full Story (comments: none)

Events: January 24, 2013 to March 25, 2013

The following event listing is taken from the LWN.net Calendar.

Date(s) | Event | Location
January 28 - February 2 | Linux.conf.au 2013 | Canberra, Australia
February 2 - February 3 | Free and Open Source Software Developers' European Meeting | Brussels, Belgium
February 15 - February 17 | Linux Vacation / Eastern Europe 2013 Winter Edition | Minsk, Belarus
February 18 - February 19 | Android Builders Summit | San Francisco, CA, USA
February 20 - February 22 | Embedded Linux Conference | San Francisco, CA, USA
February 22 - February 24 | Mini DebConf at FOSSMeet 2013 | Calicut, India
February 22 - February 24 | FOSSMeet 2013 | Calicut, India
February 22 - February 24 | Southern California Linux Expo | Los Angeles, CA, USA
February 23 - February 24 | DevConf.cz 2013 | Brno, Czech Republic
February 25 - March 1 | ConFoo | Montreal, Canada
February 26 - February 28 | ApacheCon NA 2013 | Portland, Oregon, USA
February 26 - February 28 | O’Reilly Strata Conference | Santa Clara, CA, USA
February 26 - March 1 | GUUG Spring Conference 2013 | Frankfurt, Germany
March 4 - March 8 | LCA13: Linaro Connect Asia | Hong Kong, China
March 6 - March 8 | Magnolia Amplify 2013 | Miami, FL, USA
March 9 - March 10 | Open Source Days 2013 | Copenhagen, DK
March 13 - March 21 | PyCon 2013 | Santa Clara, CA, US
March 15 - March 16 | Open Source Conference | Szczecin, Poland
March 15 - March 17 | German Perl Workshop | Berlin, Germany
March 16 - March 17 | Chemnitzer Linux-Tage 2013 | Chemnitz, Germany
March 19 - March 21 | FLOSS UK Large Installation Systems Administration | Newcastle-upon-Tyne, UK
March 20 - March 22 | Open Source Think Tank | Calistoga, CA, USA
March 23 | Augsburger Linux-Infotag 2013 | Augsburg, Germany
March 23 - March 24 | LibrePlanet 2013: Commit Change | Cambridge, MA, USA

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds