
LWN.net Weekly Edition for July 10, 2014

GENIVI assesses driver distraction and builds on location data

By Nathan Willis
July 9, 2014
ALS 2014

At the 2014 Automotive Linux Summit (ALS) in Tokyo, the GENIVI Alliance showcased several new open-source software projects that are slated to make their way into future in-vehicle systems. They included a framework for tracking driver attention (and, consequently, distraction level) and several new location-based services. For those who do not pay close attention to the automotive software field, these projects represent some of the first efforts to push open-source software past the existing, relatively predictable confines of navigation or entertainment—and into more experimental territory.

Driver workload management

[Yusuke Nakamura at ALS]

Yusuke Nakamura from Denso Corporation presented a session about Driver Workload Assessor (DWA), GENIVI's new open-source framework to track the attention of a driver and adjust the behavior of the in-vehicle infotainment (IVI) system accordingly. The need for such a system is well-known, he said; the vehicle is an increasingly complex environment, and society is more and more concerned that driver distraction will result in accidents. He pointed to several studies about the increase in distraction-related crashes, noting that there is a rising trend of distractions from integrated devices—which, as opposed to accidents involving cell phones and other portable devices, is something GENIVI can address directly.

On the flip side, he pointed out, drivers expect and even demand continual access to their information systems; consequently, GENIVI's challenge is not simply to keep information away from the driver, but to design a human-machine interface (HMI) that lets drivers focus on driving when high attention is required, yet adapts so as not to frustrate them on straight, low-traffic stretches of road where the required attention level drops.

The "smart" solution is to monitor and manage the driver's "workload"—roughly defined as the number and intensity of physical, visual, and cognitive tasks the driver is engaged in. This is a broader definition than "driver distraction," he said; "distraction" is what happens when the workload exceeds the driver's capacity. Even so, some distractions are unhelpful (such as text messages), while other are beneficial (such as alerts and warnings).

The naive approach to managing driver workload, Nakamura said, is to consider only two states: stopped and in-motion. Such an IVI system might simply disable all user input and notifications while in motion, and allow everything when stopped. But this ignores the fact that driver workload goes up and down according to the driving task. DWA defines some middle states in between the naive "all" and "nothing" options; the current version essentially has three in-between states, for "low," "medium," and "high" levels of driver workload.

The plan is that the IVI system would respond to the current workload level by allowing or suppressing input and output. Either individual applications could monitor the current workload level, or a management process could broker API requests, restricting or delaying them when the driver is overly occupied. GENIVI's current approach is to have a "workload manager" process handle the brokering of other applications.

The trick, in either case, is that "driver workload" is fundamentally a cognitive concept. As a result, Nakamura said, software cannot measure it directly. But it can be at least partially inferred from car and environmental conditions. DWA tracks a number of vehicle system states to approximate how busy the driver is: whether the speed is constant, accelerating, or braking, whether the steering wheel is turned, whether the windshield wipers are engaged, and so on. Changes in each of these conditions increment or decrement the current driver workload level—if the driver brakes suddenly and turns the steering wheel, then clearly driving requires more attention at the moment. If a notification comes in at just such a time—for example, an incoming call on the phone paired via Bluetooth—then the workload manager might suppress the phone ringer until the steering wheel straightens out and speed returns to a constant.

Such basic vehicle states are already measured by most modern cars' diagnostic buses. Nakamura demonstrated DWA with a dummy app, in which he could change the simulated vehicle speed and steering angle, and DWA would suppress output messages from the dummy app in response. But other factors could also contribute to the driver-workload estimate in the future, he said, including rain sensors, other environmental readings, and even messages from nearby vehicles or infrastructure. There is clearly a lot more to be done, but the payoff is an IVI system that is considerably more responsive to changing conditions than the simplistic all-on/all-off designs in use today.
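
Based on Nakamura's description, that estimation logic can be sketched in a few lines of self-contained C. The signal names, weights, and thresholds below are illustrative assumptions rather than GENIVI's actual DWA interfaces: vehicle-state changes raise a workload score, the score maps to a level, and the workload manager delivers or defers notifications accordingly.

```c
#include <stdbool.h>
#include <stdio.h>

/* Workload levels: the two extremes of the naive design plus the
 * three in-between levels described in the article. */
enum workload_level { WL_MIN, WL_LOW, WL_MEDIUM, WL_HIGH, WL_MAX };

struct vehicle_state {
    bool braking_hard;      /* sudden speed drop */
    bool steering_turned;   /* wheel off center */
    bool wipers_on;         /* likely rain */
    bool speed_constant;    /* cruising steadily */
};

/* Hypothetical scoring: each demanding condition bumps the estimate up. */
static enum workload_level estimate_workload(const struct vehicle_state *v)
{
    int score = 0;
    if (v->braking_hard)    score += 2;
    if (v->steering_turned) score += 1;
    if (v->wipers_on)       score += 1;
    if (!v->speed_constant) score += 1;

    if (score >= 4) return WL_MAX;
    if (score == 3) return WL_HIGH;
    if (score == 2) return WL_MEDIUM;
    if (score == 1) return WL_LOW;
    return WL_MIN;
}

/* The workload manager brokers output: deliver a notification now only
 * if the estimated workload permits it; otherwise defer it. */
static bool allow_notification(enum workload_level level)
{
    return level <= WL_MEDIUM;
}

int main(void)
{
    /* Sudden braking while turning: the article's phone-ringer example. */
    struct vehicle_state v = { .braking_hard = true, .steering_turned = true,
                               .wipers_on = false, .speed_constant = false };
    enum workload_level wl = estimate_workload(&v);
    printf("workload level %d, ring phone now: %s\n", wl,
           allow_notification(wl) ? "yes" : "no (deferred)");
    return 0;
}
```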

Location-based services

[Philippe Colliot at ALS]

Philippe Colliot of Peugeot Citroën presented the recent work of GENIVI's Location-based services (LBS) expert group, which includes developing several API standards and a demonstration app for GENIVI-compliant IVI systems. The APIs represent the next level up from generic geolocation information, and are intended to let application developers create more complex services. The demo app is called Fuel Stop Advisor, and it represents one example use case: it builds on geolocation, point-of-interest (POI) data, and vehicle status to recommend the best times to stop and refuel.

The LBS group is working on a set of APIs that work in conjunction with the W3C Geolocation API. At present there are four. The Navigation Core API [PDF] (currently at version 3.0) provides a way to request routing between destinations, including multiple transportation types, breaking the route into segments, and getting "guidance" instructions that can be used as turn-by-turn directions. The Positioning API [PDF] provides dead reckoning, taking gyroscope and compass sensor readings and establishing the vehicle's orientation and motion—so that its position can be tracked on a map even when GPS lock is lost. It is currently at version 2.0.
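
To make the dead-reckoning idea concrete, here is a minimal C sketch of position updates computed from heading and speed alone. It is a toy model under simplifying assumptions (a flat local plane, with heading and speed constant over each time step), not the Positioning API itself.

```c
#include <math.h>
#include <stdio.h>

#define PI 3.14159265358979323846

struct position { double east_m, north_m; };

/* Advance the position estimate by dt seconds of travel at the given
 * speed and compass heading (0 = north, measured clockwise). */
static void dead_reckon_step(struct position *p, double heading_rad,
                             double speed_mps, double dt_s)
{
    p->east_m  += speed_mps * dt_s * sin(heading_rad);
    p->north_m += speed_mps * dt_s * cos(heading_rad);
}

int main(void)
{
    struct position p = { 0.0, 0.0 };

    /* Ten one-second steps heading northeast (45 degrees) at 20 m/s,
     * as might happen while GPS lock is lost in a tunnel. */
    for (int i = 0; i < 10; i++)
        dead_reckon_step(&p, 45.0 * PI / 180.0, 20.0, 1.0);

    printf("estimated offset: %.1f m east, %.1f m north\n",
           p.east_m, p.north_m);
    return 0;
}
```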

The Point-of-Interest (POI) Service API [PDF] is designed to serve as a bridge between a POI database and any of several applications that might request POI information. For example, a map application might simply need to display all of the POIs in a rectangular region, while a search application might request all of the POIs in some category (e.g., restaurants) within a given radius of the current position or some other specified location. The POI Service API was recently declared 1.0.
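
A category-within-radius query of the kind described might look like the following self-contained sketch. The in-memory POI list and the flat-earth distance approximation are illustrative assumptions; the real API brokers such requests to a POI database.

```c
#include <math.h>
#include <stdio.h>
#include <string.h>

#define PI 3.14159265358979323846

struct poi { const char *name; const char *category; double lat, lon; };

/* Equirectangular approximation of great-circle distance; adequate at
 * the city scale an IVI search typically covers. */
static double distance_km(double lat1, double lon1, double lat2, double lon2)
{
    const double R = 6371.0, rad = PI / 180.0;
    double x = (lon2 - lon1) * rad * cos((lat1 + lat2) * rad / 2.0);
    double y = (lat2 - lat1) * rad;
    return R * sqrt(x * x + y * y);
}

int main(void)
{
    const struct poi pois[] = {
        { "Ramen Ichiban", "restaurant", 35.6762, 139.6503 },
        { "Ginza Grill",   "restaurant", 35.6717, 139.7650 },
        { "Filling Stn",   "fuel",       35.6800, 139.6600 },
    };
    const double here_lat = 35.6750, here_lon = 139.6500, radius_km = 2.0;

    /* "All restaurants within 2 km of the current position." */
    for (size_t i = 0; i < sizeof(pois) / sizeof(pois[0]); i++)
        if (strcmp(pois[i].category, "restaurant") == 0 &&
            distance_km(here_lat, here_lon,
                        pois[i].lat, pois[i].lon) <= radius_km)
            printf("%s\n", pois[i].name);
    return 0;
}
```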

The fourth API is a traffic information API. Colliot explained that GENIVI was attempting to "not reinvent the wheel" where possible, which led to the Traffic API being developed jointly with the European Transport Protocol Experts Group (TPEG), an existing standardization project. Colliot said that the Traffic API had also recently been declared 1.0, although it does not seem to be published on the LBS Git repository.

There are other areas where the LBS group is working on additional specifications, Colliot said, including a Log Replayer API that will allow for easier application testing by playing back position and sensor data. But the group is also working on submitting its APIs to the W3C in the hopes of getting them approved as standards. The Navigation Core API has been submitted, he said, and there are already pending changes in the works based on feedback from navigation services.

Apart from writing specifications, the LBS group has also developed its first open-source app, Fuel Stop Advisor (FSA), "to show people that it is fun to write GENIVI apps." FSA uses the Navit routing engine and Open Street Map (OSM) data. It requires that the car have an active navigation route, and calculates whether or not the current fuel level is enough to get to the destination without stopping. If there is not enough, it recommends alternate routes to stop and refuel along the way.
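
The core of that check is simple arithmetic. Here is a minimal sketch with hypothetical inputs standing in for the vehicle-status and route data that FSA actually obtains through the GENIVI APIs:

```c
#include <stdbool.h>
#include <stdio.h>

/* Does the remaining range cover the rest of the active route, with a
 * safety reserve left over?  If not, a fuel stop should be advised. */
static bool needs_fuel_stop(double fuel_liters, double km_per_liter,
                            double route_remaining_km, double reserve_km)
{
    double range_km = fuel_liters * km_per_liter;
    return range_km < route_remaining_km + reserve_km;
}

int main(void)
{
    /* Hypothetical readings: 12 l in the tank, 14 km/l economy,
     * 200 km left on the route, keep a 20 km reserve. */
    if (needs_fuel_stop(12.0, 14.0, 200.0, 20.0))
        printf("Recommend a refueling stop along the route.\n");
    else
        printf("Enough fuel to reach the destination.\n");
    return 0;
}
```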

Colliot showed a demonstration of FSA on his laptop. The user interface is "proof of concept"-level, he said, so it does not look like a finished product. But work continues; the next steps are to port the interface to Qt 5, port the graphics to use GENIVI's Layer Manager (which allows it to be composited with other running applications), and to add the ability to search for refueling stations from additional POI providers.

FSA represents new ground for GENIVI in the sense that it is an end-user application, rather than a base layer. As Colliot indicated in his talk, GENIVI is not changing its mandate—it still targets a middleware layer of software that carmakers do not want to individually reimplement. But it is progress to see that the middleware has gotten to a point where usable applications can be developed.

[Jeremiah Foster at ALS]

GENIVI's community manager Jeremiah Foster also gave a talk, in which he pointed to other projects that are reaching the point where application developers can use them. There is an IVI radio service, for example, that can handle AM, FM, and a variety of digital broadcast standards, and a speech output framework that can be used for anything from turn-by-turn directions to reading text alerts out loud.

The Media Manager project, on which GENIVI is collaborating with Automotive Grade Linux, should have a release ready by October. The goal is an API for connecting to consumer electronics devices for media playback; Foster noted that the team started with the Media Player Remote Interfacing Specification (MPRIS) and has worked with developers from several existing open source projects (like VLC) to make sure that Media Manager meets their needs as well.

Foster ended his session by asking the audience to get involved in the effort; GENIVI wants to know "what is currently missing." As the other GENIVI talks suggested, the project is reaching the point where several of the low-level tasks it has been focusing on are essentially complete, and attention now turns to more user-visible software.

[The author would like to thank The Linux Foundation for travel assistance to attend ALS 2014.]

Comments (42 posted)

Yorba, the IRS, and tax-exemption

By Nathan Willis
July 9, 2014

On June 30, Yorba Foundation director Jim Nelson posted a blog entry reporting that the US Internal Revenue Service (IRS) had denied Yorba's application to be registered as a tax-exempt 501(c)(3) charity. Nelson and others contend that this denial is cause for concern to other players in the FOSS arena, but there are voices who disagree about the implications.

As Nelson's blog post explains, Yorba had filed its 501(c)(3) application in 2009. In the US, 501(c)(3) organizations are one of several types of tax-exempt nonprofit, but there are additional benefits to being a 501(c)(3) rather than, say, a 501(c)(6) trade association. Most notably, Nelson said, donations made to 501(c)(3)s are tax deductible for the donor, and that makes fundraising easier. Many high-profile organizations in the free and open-source software realm are 501(c)(3)s, including the GNOME Foundation, the Mozilla Foundation, the Apache Software Foundation, and the Linux kernel project.

Last year, news broke that the IRS had flagged "open source" tax-exemption applications for increased scrutiny, reportedly out of concern that for-profit companies might seek to run their operations out of a nonprofit organization to evade taxes. That "be on the lookout" (or "BOLO") issue, as it was known, had allegedly started in 2010, and Nelson reported that Yorba received two requests for further information from the IRS that year.

Yorba's application was as a "charitable, scientific, and educational organization." Nelson reported that the IRS's rejection notice was dated May 22, 2014, and gave several reasons for the decision—reasons he called "hair-raising" statements that "could have a direct impact on the free software movement, at least here in the United States."

Nelson quoted five snippets from the IRS's justification for its decision, including the fact that Yorba's software could be used "by any person for any purpose, including nonexempt purposes such as commercial, recreational, or personal purposes," that Yorba does not own all of the copyrights on its software, and that releasing the source code to software does not constitute an educational function since "anything learned by people studying the source code is incidental." The IRS also contended that developing and distributing software is not a "public work" because software is not something ordinarily provided at public expense, and that open-source software is available worldwide and therefore does not "serve a community" as 501(c)(3) rules require.

Nelson pointed out that several of these rationales seem to conflict with the IRS's recognition of other open-source software foundations as 501(c)(3)s, and that several of them seem to suggest that Yorba should impose restrictions on its projects, such as limiting their usage or requiring copyright assignment from all contributors. "In other words," he surmised in one place, "we (and, presumably, everyone else) cannot license our software with a GNU license and meet the IRS’ requirements of a charitable organization."

He added that the potential impact of these statements by the IRS would be chilling to FOSS as a whole:

I doubt they’re going to start enforcing this in the future for organizations that already enjoy exemption. If they do, it will be a royal mess for those projects having to contact every author of every non-trivial contribution and get them to sign over their rights. This is all a big if, of course.

and concluded by saying that Yorba does not intend to appeal the rejection, but will continue developing its application software nonetheless.

The story was picked up by the general tech press in short order, many of whom paired it with the news that the OpenStack Foundation had received a rejection from the IRS for its 501(c)(6) application in March (a decision that OpenStack has already appealed). According to that blog post, the IRS listed three issues with the OpenStack application:

  1. That the foundation is producing software and thus is “carrying on a normal line of business.”
  2. That the foundation is not improving conditions for the entire industry.
  3. That the foundation is performing services for its members.

The rules for 501(c)(3)s and 501(c)(6)s differ, of course, but both rejections share some common themes, like the assertion that the projects are essentially engaging in normal software development practices as many for-profit companies do. To a lot of commenters, that amounted to a rejection of the core principles of FOSS. Simon Phipps, for example, in a story titled "Are open source foundations nonprofits? The IRS says no," said "it seems that the IRS no longer thinks collaborating on open source is a public good." Other news outlets took their interpretations to even greater extremes, applying them to FOSS as a whole (headlines such as "IRS says free software projects can't be nonprofits" and "The IRS wages war on open source nonprofits" are easy-to-find examples).

But others have pointed out that the decision in Yorba's case does not set precedent for any other FOSS project's application. Bradley Kuhn, in a comment on Nelson's post, said that the decision is the opinion of one IRS examiner, and should not be treated as broader in scope. Furthermore, it "doesn’t change the status of orgs that are already operating properly under 501(c)(3) status."

Karen Sandler noted in a blog post of her own that the IRS has said more than once in the past that a decision about one non-profit application has no effect on existing non-profits. Concern that existing organizations will lose their tax-exempt status would seem to be overblown.

Nevertheless, Yorba's multi-year wait for a decision from the IRS does seem to be the norm. Perhaps that is a good thing in and of itself; although no organization seems to be happy about the lengthy wait, vetting an organization is probably a process that ought to require some in-depth investigation, lest gaming the system be too easy. But the lengthy wait clearly has an impact on the projects and foundations in question, consuming time and resources.

It is also possible that the IRS (or some portion of its reviewers) is developing an attitude toward FOSS that is fundamentally at odds with the common practice of developing and releasing free software while finding other means to fund operations. In a 2013 WIRED article, Luis Villa commented that he had heard from several projects that the IRS wanted them to put non-commercial usage restrictions into their licenses.

No doubt there are unscrupulous individuals out there who would love to be paid to write software but not have to pay taxes (and if there were none before, the idea has surely occurred to them in the wake of the Yorba story). It is a tricky problem for the IRS to sort out, determining whose work is truly in the public interest and who might be developing a standard-issue software product but putting an open-source license on it for tax purposes.

For those who are genuine in their commitment to the ideals of software freedom, though, it is just one more uphill battle among many. Hopefully others will not take the Yorba rejection as a discouragement, and hopefully Yorba will not be discouraged either. Many commenters, both on Nelson's blog post and elsewhere, spoke up to offer their encouragement in general, and their encouragement that Yorba should appeal this initial rejection.

Comments (8 posted)

A speech framework and a GUI for automotive systems

By Nathan Willis
July 9, 2014
ALS 2014

At the 2014 Automotive Linux Summit (ALS) in Tokyo, several sessions highlighted new work from the Automotive Grade Linux (AGL) and Tizen IVI projects, including a flexible speech recognition and generation framework and a graphical user interface (GUI) for in-dash head units. In addition, AGL offered teasers of several upcoming new releases and put out a call for application developers interested in open-source automotive software.

Formally speaking, AGL is a working group of the Linux Foundation focused on the task of increasing Linux adoption in vehicles. But as a practical matter, this has meant group members putting resources into developing open source software. Just prior to ALS 2014, AGL announced the release of its reference Linux platform, which is built on top of the in-vehicle infotainment (IVI) version of Tizen.

The AGL release contains several components not found in the contemporaneous Tizen IVI release, but there is clearly a close working relationship between the two projects. Some of the AGL release's additions may not make it upstream into Tizen IVI in the foreseeable future, either because they are contributed by member companies who have not yet shown an interest in Tizen, because there are licensing issues, or because they are evidently intended only as proof-of-concept code with less general appeal. The exact reasons, though, are not always clear.

For example, the AGL release includes support for controlling a MOST-connected audio amplifier. MOST (Media Oriented Systems Transport) is an automotive industry standard data bus that runs over fiber-optic cable; it provides a number of benefits compared to other vehicle buses (such as high throughput and resistance to electrical noise), but the standard is proprietary and there is reportedly no interest from MOST's governing organization in opening up the specification or loosening its licensing restrictions. There is, therefore, little chance that general-purpose MOST support will come to Tizen IVI, but AGL has an interest in demonstrating that MOST integration is possible.

The Modello user interface

[Geoffrey van Cutsem at ALS]

On the other hand, Intel's Geoffrey van Cutsem gave a talk about Tizen IVI's new GUI project, Modello, which actually started off as an AGL add-on project but is now developed within Tizen IVI. Modello is a suite of free-software HTML5 applications that cover basic GUI functionality. There is a "home" screen, a dashboard that shows vehicle statistics and sensor readings, a media player, a heater/air-conditioner controller, a phone-tethering application for hands-free usage, and a navigation tool that connects to Google Maps.

The Modello system is completely modular, Van Cutsem said; the home screen launcher can launch any application, not just those already mentioned. But the official Modello applications are all designed to look the same; they pick up the same UI elements from a central theme. As of right now, there are just two themes to choose from—and they differ only in color—but the theming engine is a flexible one. Someone could create a "nighttime" theme, he said, and have it activated automatically when the car's light sensors indicate that it is getting dark.

The Modello applications are also designed to run on 720p portrait-orientation screens, which are not the norm in today's vehicles. Van Cutsem explained the rationale: Modello is targeting the IVI systems of the future, when larger screens are expected to be commonplace. Most center consoles are "portrait-shaped," he said, and if the screen replaces many of the physical controls in use today (including climate-control knobs), users are likely to expect the biggest screen that will fit. The Tesla Model S, he said, is a good example: it sports a 17-inch portrait display.

The Modello project has also been filling in some miscellaneous missing pieces in Tizen IVI; it implements a GUI system settings utility, which has been prominently missing from prior Tizen releases. Perhaps most importantly, it allows GUI configuration of Bluetooth and WiFi networking, which, up until now, had only been configurable with command-line tools.

There is still more to come, he said. The navigation application is still quite rough; as of today it only supports pre-set destinations. Although Van Cutsem did not discuss it, navigation is in a state of flux in both Tizen and AGL at present. Tizen IVI dropped the navigation application Navit from its builds in 2013. The word around the project is that either Navit or some other free-software routing application will return in due course; the Google Maps tool may not last due to its reliance on a single, proprietary data provider.

Also still to come in Modello is a port from Tizen IVI's older web runtime to the newer Crosswalk, support for localization, and integration with the Wayland-based Layer Manager. A new release is expected within the week.

[Matt Jones at ALS]

Van Cutsem also noted that the Modello project would be working to add support for "twenty plus" new applications written by AGL. Jaguar Land Rover's Matt Jones provided a preview of that application collection in part of his ALS keynote talk. The new additions being developed include Modello-compatible versions of older software, such as the SmartDeviceLink mobile device tethering and screen-sharing tool. But they also include several entirely new applications, such as fingerprint recognition and voiceprint recognition utilities, a weather application, and a news carousel.

Jones pointed out that Jaguar Land Rover was interested in funding open-source projects like these AGL reference applications, and told anyone interested in contributing to get in touch. The company has found working with open-source developers to be in its best interests, he said. The average time from concept to deployment in a car is 39 months, but the average software startup only has a lifespan of 18 months. So pairing with startups is not a strategic option.

In contrast, he said, for every dollar that the company puts into Tizen and open source, it estimates that it generates at least 20. He now hopes that the company can start working on more interesting new applications, such as the biometric systems mentioned above. "I hope we're done with implementing Bluetooth profiles and FM radio, and can start doing the unique stuff."

At last: the talking car

Intel's Jaska Uimonen provided a look at one of those possible new developments in his presentation on Winthorpe, an open-source framework for adding speech support to Tizen IVI applications. Winthorpe supports both speech recognition for input and speech synthesis for output, and it provides both as a system-wide service.

This design is distinct from most of the other speech recognition systems on the market, he said. The others tend to either be a standalone, "assistant" application like Apple's Siri, or else each individual app (search, navigation, etc.) is its own "silo"—linked internally to a third-party provider's speech recognition module.

The assistant model can be linked to other apps (such as voice dialing and web searching), but adding new features to those apps requires making changes to the assistant. The silo approach may mean that multiple apps have speech support, but it has serious drawbacks: the apps are not aware of each other, so they cannot cooperate, and their fate depends entirely on the continued support of the third-party speech-engine supplier. In addition, he said, most of the popular speech recognition services (including Google's and Apple's) rely on an active network connection to a remote cloud service.

[Jaska Uimonen at ALS]

Winthorpe attempts to improve on these shortcomings. It provides a platform-level API service with multiple back-ends, so that speech-enabling an application is a one-time process—you do not need to rewrite your code to start using a different speech engine. The API also lets applications stay simple, offloading the speech processing to the service.

The process of speech-enabling an application is straightforward, he said. The program registers itself with the Winthorpe process and declares a set of commands that it wants to listen for. Winthorpe listens for speech input, then notifies its registered client if it recognizes a command—delivering the notification event and, if requested, passing the speech input buffer to the application.
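
That registration-and-dispatch flow can be modeled in a short, self-contained C sketch. The types and function names below are illustrative assumptions, not Winthorpe's actual client API:

```c
#include <stdio.h>
#include <string.h>

#define MAX_COMMANDS 8

typedef void (*command_cb)(const char *utterance);

struct registration {
    const char *command;    /* phrase the client wants to listen for */
    command_cb  handler;    /* callback invoked on recognition */
};

static struct registration registry[MAX_COMMANDS];
static int registered;

/* A client declares the commands it wants to be notified about. */
static void register_command(const char *command, command_cb handler)
{
    if (registered < MAX_COMMANDS)
        registry[registered++] =
            (struct registration){ command, handler };
}

/* Stand-in for the recognition engine: match recognized speech against
 * the registered commands and notify the owning client. */
static void on_speech_input(const char *utterance)
{
    for (int i = 0; i < registered; i++) {
        if (strncmp(utterance, registry[i].command,
                    strlen(registry[i].command)) == 0) {
            registry[i].handler(utterance);
            return;
        }
    }
    printf("no registered command matched: \"%s\"\n", utterance);
}

static void play_handler(const char *utterance)
{
    printf("media player received: \"%s\"\n", utterance);
}

int main(void)
{
    register_command("play", play_handler);
    on_speech_input("play something by Miles Davis");
    return 0;
}
```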

For deciding which registered application gets "voice focus" for a recognized command when there are multiple options, Winthorpe delegates the decision to Tizen IVI's Murphy policy manager—though how Murphy makes that decision is up to the system implementor. Winthorpe is context-aware, he said. When the user makes or answers a phone call, all audio is sent to the phone application and speech recognition is switched off.

The Winthorpe architecture is modular; there can be multiple speech-recognition plugins installed, and there are plugins for disambiguation and for speech synthesis. Currently the plugins include only one open-source recognition engine, Carnegie Mellon University's Pocketsphinx. There are two open-source speech synthesis plugins, one based on Emacspeak and one based on Festival. The Winthorpe team has written demo extensions for media players and for simple web searching.

In addition to registering for callbacks to specific commands, applications can also make use of some special Winthorpe tokens, Uimonen said. One is the wildcard operator * for free-form input. An application can use it to have Winthorpe pass along the raw audio input rather than processing it as speech, which might be useful for recording notes or calls. Another is the "dictionary switch" command, which tells Winthorpe to match speech input against a special dictionary rather than the general-purpose one. This can dramatically improve recognition quality, he said: for instance, if one knows that the speech input will be numeric, switching to a "digits" dictionary will reduce the error rate.

Speech output is considerably simpler than speech recognition, Uimonen said. Winthorpe supports selecting from among multiple installed "voices" and multiple languages, and it includes commands to adjust the voice's rate and pitch.

One of the weaknesses of the system is how few open-source speech projects there are, he said. Pocketsphinx is currently the only open-source recognition engine supported, although he said the project is working with the Julius engine, which is designed for Japanese. Between the two synthesis engines, Festival is noticeably weaker than Emacspeak. He added that most existing IVI systems use a proprietary speech recognition back-end.

Future work for the project includes Julius integration, improvements to the Murphy integration, the ability to reconfigure the speech-decoding pipeline on the fly, and tools to better pronounce unrecognized words.

Together, the AGL and Tizen IVI projects appear to be making progress on multiple fronts. While some of the work (such as Winthorpe) is of interest primarily to developers, the details of the project indicate that the team is trying to improve on the status quo available in other IVI systems. And other new pieces, such as Modello, indicate that polished, end-user code is within reach for the first time, which is good news for those who are interested in seeing an open-source IVI platform reach the market.

[The author would like to thank The Linux Foundation for travel assistance to attend ALS 2014.]

Comments (2 posted)

Page editor: Nathan Willis

Security

Evaluating the LZO integer-overflow bug

By Nathan Willis
July 9, 2014

In June, a security researcher disclosed an integer-overflow bug in the Lempel–Ziv–Oberhumer (LZO) compression algorithm—a bug that has persisted in the wild for roughly two decades, and is reproduced in multiple LZO implementations as well as in the related LZ4 algorithm. LZ4's author then accused the researcher of irresponsible disclosure and of over-hyping the issue for the sake of publicity. The two have subsequently argued back and forth about the proper assessment of the bug's severity, but wherever history eventually comes down on that particular question, the case holds lessons on a number of fronts for software developers.

Don A. Bailey published a June 26 blog post explaining the bug, which he had discovered during a code audit. In essence, Markus Oberhumer's original LZO code included a simple integer overflow in the code block that handles uncompressed "Literal Runs" in a compressed LZO file. The overflowed variable is later used as a size parameter, which an attacker can use to overflow a pointer and potentially gain access to a protected area of system memory. Importantly, as Bailey sees it, the original LZO reference code has essentially been copied verbatim into a wide variety of later LZO implementations, including OpenVPN, MPlayer2, Libav, FFmpeg, Btrfs, squashfs, Android systems, and the Linux kernel. Furthermore, the LZ4 algorithm developed by Yann Collet also reuses Oberhumer's reference code (including the bug), and LZ4 is also used in a variety of places, including the ZFS filesystem.
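
The overflow pattern Bailey describes can be illustrated with a short C sketch. This is not the actual liblzo2 source, but it has the same shape: a 32-bit length accumulator, fed 255 at a time by attacker-supplied zero bytes, eventually wraps around, and the wrapped value is later used as a copy size.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode a "literal run" length: each leading 0x00 byte extends the
 * run by 255, and a final non-zero byte completes it.  With a 32-bit
 * accumulator, roughly 16.8 million zero bytes (about 16 MB of input)
 * wrap the counter past UINT32_MAX: the integer overflow at issue. */
static uint32_t literal_run_length(const uint8_t *in, size_t in_len,
                                   size_t *consumed)
{
    uint32_t t = 0;
    size_t i = 0;

    while (i < in_len && in[i] == 0) {
        t += 255;               /* unsigned wraparound: the core bug */
        i++;
    }
    if (i < in_len)
        t += in[i++];

    *consumed = i;
    return t;                   /* caller later uses t as a copy size */
}

int main(void)
{
    const uint8_t sample[] = { 0x00, 0x00, 0x05 };
    size_t used;

    printf("normal run length: %u\n",
           literal_run_length(sample, sizeof(sample), &used));

    /* Simulate the ~16 MB zero stream without allocating it: after
     * 16,843,010 additions of 255, the accumulator has wrapped to a
     * tiny value that no longer matches the data actually present. */
    uint32_t t = 0;
    for (uint64_t n = 0; n < 16843010ULL; n++)
        t += 255;
    printf("length after wraparound: %u\n", t);
    return 0;
}
```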

Bailey's original post not only described the bug in detail, but it went on to offer an assessment of the severity of the bug for real-world attacks. The Libav and FFmpeg versions of LZO are susceptible to remote code execution, he said, as are LZ4 implementations (although in the LZ4 case, such exploits are only practical on 32-bit architectures). On the other hand, denial-of-service attacks—while arguably less serious—are plausible on all LZO and LZ4 implementations. All of the possible attacks rely on specially crafted data payloads.

Collet fired back with a blog post of his own, the same day, calling Bailey's post "totally irresponsible" and an attempt "to create a flashy headline" by claiming that the bug is far more serious than it actually is. In reality, Collet said, the conditions required to exploit the bug in LZ4 are so peculiar that there is "no real-world risk," and none of the known LZ4 implementations can be targeted. On June 28, Collet retracted some of his criticisms of Bailey's disclosure methodology but continued to argue that no known program met the conditions required to exploit the bug.

What followed was an at-times-heated back-and-forth between the two, both about the severity of the bug and about how it was disclosed. Collet noted that the underlying issue had been reported by someone else more than a year earlier and was deemed low-priority, mostly because the hypothetical attack would require that LZ4 be called with blocks of extremely large size (8MB or larger) and because a 64-bit system would require an implausibly large amount of memory to overflow the buffer. Bailey contended that there are plenty of 32-bit systems in the wild today (including most ARM devices) and that the lack of interest in future-proofing 64-bit implementations was short-sighted.

On July 1, Bailey posted a follow-up showing that LZ4 could be exploited with 2.47 MB of data. Collet again accused Bailey of irresponsible disclosure for publishing this follow-up without notifying the LZ4 project privately ahead of time.

As a practical matter, an update for LZ4 that fixes all of the issues cited by Bailey is already available, r119. The LZO reference implementation has also been updated with a fix, in version 2.08. Fixes have also been published for the affected downstream projects.

Assessing the real-world severity of the bug is, to a large degree, a matter on which reasonable people may never fully agree. It certainly requires separating the issue from Collet and Bailey's argument over disclosure practices—an argument that is not technical in nature. Bailey has written two posts that describe real-world attacks against LZ4 in the wild; one hinges on the fact that, while "real" users might never use LZ4 with exceptionally large block sizes, higher-level libraries often pass data down to algorithms like LZ4 without doing sanity checks. The second shows Bailey exploiting Firefox 30.0's video-playback code.

Lost in all of the debate about how plausible an attack is against LZ4, though, is a separate point raised in Bailey's original blog post. Oberhumer's reference code for LZO is the original source of the integer overflow, and because that reference code is believed to be highly optimized for decompression speed (which is, after all, one of LZO's key selling points), many developers copied it—flaws and all—into their own projects. Algorithms, Bailey said, become treated like "blessed" code, with other developers assuming their correctness and not giving them the same level of scrutiny that they might to other third-party work.

The potential harm of a long-standing bug or even a back-door in reference code is therefore magnified. Where the subject matter is regarded as highly specialized, things become even trickier. One can see echoes of this concern in the recent "too few independent implementations" issue that was cited as an objection to including Daniel J. Bernstein's Curve25519 cryptographic function in the W3C WebCrypto API. The odds may not be particularly high that Bernstein's code contains an exploitable bug, but the fact that so many developers implicitly trust its correctness is a cause for caution.

And cryptography is far from the only subject matter where widespread code reuse is commonplace. It is frequently found where low-level and highly-optimized functions are required. For example, virtually all—if not literally all—free-software raw photo software is built on top of Dave Coffin's dcraw decoder, which is released as ANSI C code typically copied into downstream projects.

However difficult it may be to craft a real-world exploit for LZO or LZ4, a key lesson is that the bug was replicated to a variety of downstream projects in part because the original reference code was not subjected to sufficient scrutiny sooner. A code audit did eventually uncover the flaw, but had that audit taken place years earlier, there would likely be far less outcry over the issue today.

Comments (8 posted)

Brief items

Security quotes of the week

Imagine getting a call from your doctor if you let your gym membership lapse, make a habit of buying candy bars at the checkout counter, or begin shopping at plus-size clothing stores. For patients of Carolinas HealthCare System, which operates the largest group of medical centers in North and South Carolina, such a day could be sooner than they think. Carolinas HealthCare, which runs more than 900 care centers, including hospitals, nursing homes, doctors’ offices, and surgical centers, has begun plugging consumer data on 2 million people into algorithms designed to identify high-risk patients so that doctors can intervene before they get sick. The company purchases the data from brokers who cull public records, store loyalty program transactions, and credit card purchases.
Shannon Pettypiece and Jordan Robertson in Bloomberg Businessweek

You may say that Ozymandias is dead – or rather fictional but, even in the fiction, dead – so couldn't apply to have his virtual trunkless legs buried in the unsearchable sand (I will retain control of this metaphor). The internet can still be accurate about the deceased, you might think. I don't. They're the very people you can say anything about, true or false, because they cannot be libelled. Only the living have legal recourse to ensure accuracy, but why would anyone bother to get things corrected if they can effectively just delete anything written about them that they're not keen on?

People's right to suppress unpleasant lies which are publicly told is being extended to unpleasant truths – until they die when it's suddenly open season on slander. The internet will become constructed entirely of two different sorts of untruth: contemporaneous unalloyed praise and posthumous defamatory hearsay.

David Mitchell in The Guardian

Comments (13 posted)

Schneier: NSA Targets Privacy Conscious for Surveillance

Bruce Schneier has a good summary of recently reported information about the US National Security Agency (NSA) targeting of users searching for or reading information about Tor and The Amnesic Incognito Live System (Tails), which certainly could include readers of this site. "Jake Appelbaum et. al, are reporting on XKEYSCORE selection rules that target users -- and people who just visit the websites of -- Tor, Tails, and other sites. This isn't just metadata; this is "full take" content that's stored forever. [...] It's hard to tell how extensive this is. It's possible that anyone who clicked on this link -- with the embedded torproject.org URL above -- is currently being monitored by the NSA. It's possible that this only will happen to people who receive the link in e-mail, which will mean every Crypto-Gram subscriber in a couple of weeks. And I don't know what else the NSA harvests about people who it selects in this manner. Whatever the case, this is very disturbing." Also see reports in Linux Journal (which was specifically noted in the XKeyscore rules) and Boing Boing.

Comments (11 posted)

OpenSSL speeds up development to avoid being “slow-moving and insular” (Ars Technica)

Ars Technica reports on the OpenSSL project's new roadmap that describes a number of problems with the project and its code along with plans to address them. "The project has numerous problems, the roadmap says. These include a backlog of bug reports, incomplete and incorrect documentation, code complexity that causes maintenance problems, inconsistent coding style, a lack of code review, and having no clear release plan, platform strategy, or security strategy. The plan is to fix all these problems. For example, bug reports should receive 'an initial response within four working days.' That goal can be met now, the roadmap says, but others will take longer. Defining a clear coding standard for the project is expected to take about three months. 'Review[ing] and revis[ing] the public API with a view to reducing complexity' will take about a year."

Comments (48 posted)

The CHERI capability model: Revisiting RISC in an age of risk (Light Blue Touchpaper)

Over at the Light Blue Touchpaper blog, there is a summary of a paper [PDF] presented in late June at the 2014 International Symposium on Computer Architecture about Capability Hardware Enhanced RISC Instructions (CHERI). "CHERI is an instruction-set extension, prototyped via an FPGA-based soft processor core named BERI, that integrates a capability-system model with a conventional memory-management unit (MMU)-based pipeline. Unlike conventional OS-facing MMU-based protection, the CHERI protection and security models are aimed at compilers and applications. CHERI provides efficient, robust, compiler-driven, hardware-supported, and fine-grained memory protection and software compartmentalisation (sandboxing) within, rather than between, address spaces. We run a version of FreeBSD that has been adapted to support the hardware capability model (CheriBSD) compiled with a CHERI-aware Clang/LLVM that supports C pointer integrity, bounds checking, and capability-based protection and delegation. CheriBSD also supports a higher-level hardware-software security model permitting sandboxing of application components within an address space based on capabilities and a Call/Return mechanism supporting mutual distrust."

Comments (31 posted)

Garrett: Self-signing custom Android ROMs

Matthew Garrett explains how to get an Android device to refuse to boot an operating system that has not been signed by the device's owner. "It's annoying and involves a bunch of manual processes and you'll need to re-sign every update yourself. But it is possible to configure Nexus devices in such a way that you retain the same level of security you had when you were using the Google keys without losing the freedom to run whatever you want."

Comments (2 posted)

New vulnerabilities

apt-cacher-ng: cross-site scripting

Package(s): apt-cacher-ng  CVE #(s): CVE-2014-4510
Created: July 4, 2014  Updated: July 9, 2014
Description: From the Red Hat bugzilla entry:

As noted in this report to oss-security, a flaw exists in the apt-cacher-ng server, and an inside attacker (on the LAN with knowledge of the server's address) could trick a user into visiting, or redirect them to, a manipulated URL that would cause the cross-site scripting attack.

Alerts:
Fedora FEDORA-2014-7751 apt-cacher-ng 2014-07-04

Comments (none posted)

cacti: cross-site scripting

Package(s): cacti  CVE #(s): CVE-2014-4002
Created: July 8, 2014  Updated: July 9, 2014
Description: From the CVE entry:

Multiple cross-site scripting (XSS) vulnerabilities in Cacti 0.8.8b allow remote attackers to inject arbitrary web script or HTML via the (1) drp_action parameter to cdef.php, (2) data_input.php, (3) data_queries.php, (4) data_sources.php, (5) data_templates.php, (6) graph_templates.php, (7) graphs.php, (8) host.php, or (9) host_templates.php or the (10) graph_template_input_id or (11) graph_template_id parameter to graph_templates_inputs.php.

Alerts:
Gentoo 201509-03 cacti 2015-09-24
openSUSE openSUSE-SU-2015:0479-1 cacti 2015-03-11
Mageia MGASA-2014-0302 cacti 2014-07-26
Fedora FEDORA-2014-7836 cacti 2014-07-08
Fedora FEDORA-2014-7849 cacti 2014-07-08

Comments (none posted)

cumin: two vulnerabilities

Package(s): cumin  CVE #(s): CVE-2012-2682 CVE-2014-0174
Created: July 9, 2014  Updated: July 9, 2014
Description: From the Red Hat advisory:

It was found that if Cumin were asked to display a link name containing non-ASCII characters, the request would terminate with an error. If data containing non-ASCII characters were added to the database (such as via Cumin or Wallaby), requests to load said data would terminate and the requested page would not be displayed until an administrator cleans the database. (CVE-2012-2682)

It was found that Cumin did not set the HttpOnly flag on session cookies. This could allow a malicious script to access the session cookie. (CVE-2014-0174)

Alerts:
Red Hat RHSA-2014:0859-01 cumin 2014-07-09
Red Hat RHSA-2014:0858-01 cumin 2014-07-09

Comments (none posted)

dbus: two denial of service flaws

Package(s): dbus  CVE #(s): CVE-2014-3532 CVE-2014-3533
Created: July 3, 2014  Updated: December 22, 2014
Description: From the Debian advisory:

CVE-2014-3532: Alban Crequy at Collabora Ltd. discovered a bug in dbus-daemon's support for file descriptor passing. A malicious process could force system services or user applications to be disconnected from the D-Bus system by sending them a message containing a file descriptor, leading to a denial of service.

CVE-2014-3533: Alban Crequy at Collabora Ltd. and Alejandro Martinez Suarez discovered that a malicious process could force services to be disconnected from the D-Bus system by causing dbus-daemon to attempt to forward invalid file descriptors to a victim process, leading to a denial of service.

Alerts:
Mandriva MDVSA-2015:176 dbus 2015-03-30
Fedora FEDORA-2014-17595 mingw-dbus 2015-01-02
Fedora FEDORA-2014-17570 mingw-dbus 2015-01-02
Fedora FEDORA-2014-16227 dbus 2014-12-19
Gentoo 201412-12 dbus 2014-12-13
openSUSE openSUSE-SU-2014:1239-1 dbus-1 2014-09-28
openSUSE openSUSE-SU-2014:1228-1 dbus-1 2014-09-28
Mandriva MDVSA-2014:148 dbus 2014-07-31
Mageia MGASA-2014-0294 dbus 2014-07-26
openSUSE openSUSE-SU-2014:0921-1 dbus-1 2014-07-21
openSUSE openSUSE-SU-2014:0926-1 dbus-1 2014-07-21
Ubuntu USN-2275-1 dbus 2014-07-08
Fedora FEDORA-2014-8059 dbus 2014-07-08
Debian DSA-2971-1 dbus 2014-07-02

Comments (none posted)

ffmpeg: multiple vulnerabilities

Package(s): ffmpeg  CVE #(s): CVE-2014-2097 CVE-2014-2098 CVE-2014-2099 CVE-2014-2263 CVE-2014-4610
Created: July 7, 2014  Updated: July 9, 2014
Description: From the Mageia advisory:

The tak_decode_frame function in libavcodec/takdec.c in FFmpeg before 2.0.4 does not properly validate a certain bits-per-sample value, which allows remote attackers to cause a denial of service (out-of-bounds array access) or possibly have unspecified other impact via crafted TAK (aka Tom's lossless Audio Kompressor) data (CVE-2014-2097).

libavcodec/wmalosslessdec.c in FFmpeg before 2.0.4 uses an incorrect data-structure size for certain coefficients, which allows remote attackers to cause a denial of service (memory corruption) or possibly have unspecified other impact via crafted WMA data (CVE-2014-2098).

The msrle_decode_frame function in libavcodec/msrle.c in FFmpeg before 2.0.4 does not properly calculate line sizes, which allows remote attackers to cause a denial of service (out-of-bounds array access) or possibly have unspecified other impact via crafted Microsoft RLE video data (CVE-2014-2099).

The mpegts_write_pmt function in the MPEG2 transport stream (aka DVB) muxer (libavformat/mpegtsenc.c) in FFmpeg before 2.0.4 allows remote attackers to have unspecified impact and vectors, which trigger an out-of-bounds write (CVE-2014-2263).

An integer overflow in LZO decompression in FFmpeg before 2.0.5 allows remote attackers to have an unspecified impact by embedding compressed data in a video file (CVE-2014-4610).

Alerts:
Mandriva MDVSA-2015:173 ffmpeg 2015-03-30
Mandriva MDVSA-2014:129 ffmpeg 2014-07-09
Mageia MGASA-2014-0281 ffmpeg 2014-07-04
Mageia MGASA-2014-0280 ffmpeg 2014-07-04
Gentoo 201603-06 ffmpeg 2016-03-12

Comments (none posted)

file: denial of service

Package(s): file  CVE #(s): CVE-2014-3538
Created: July 7, 2014  Updated: September 11, 2014
Description: From the CVE entry:

file before 5.19 does not properly restrict the amount of data read during a regex search, which allows remote attackers to cause a denial of service (CPU consumption) via a crafted file that triggers backtracking during processing of an awk rule. NOTE: this vulnerability exists because of an incomplete fix for CVE-2013-7345.

Alerts:
Oracle ELSA-2015-2155 file 2015-11-23
Red Hat RHSA-2015:2155-07 file 2015-11-19
Oracle ELSA-2015-1135 php 2015-06-23
Mandriva MDVSA-2015:080 php 2015-03-28
Red Hat RHSA-2014:1766-01 php55-php 2014-10-30
Red Hat RHSA-2014:1765-01 php54-php 2014-10-30
Oracle ELSA-2014-1327 php 2014-09-30
CentOS CESA-2014:1327 php 2014-09-30
Red Hat RHSA-2014:1327-01 php 2014-09-30
Debian DSA-3021-2 file 2014-09-10
Debian DSA-3021-1 file 2014-09-09
Slackware SSA:2014-247-01 php 2014-09-04
Mandriva MDVSA-2014:172 php 2014-09-03
Debian DSA-3008-2 php5 2014-08-21
Mageia MGASA-2014-0324 php 2014-08-08
Mandriva MDVSA-2014:149 php 2014-08-06
Mageia MGASA-2014-0307 file 2014-08-05
Mandriva MDVSA-2014:146 file 2014-07-31
Ubuntu USN-2278-1 file 2014-07-15
Fedora FEDORA-2014-7992 file 2014-07-05
Scientific Linux SLSA-2015:2155-7 file 2015-12-21
Red Hat RHSA-2016:0760-01 file 2016-05-10
Oracle ELSA-2016-0760 file 2016-05-13
Scientific Linux SLSA-2016:0760-1 file 2016-06-08

Comments (none posted)

kernel: privilege escalation

Package(s): kernel  CVE #(s): CVE-2014-4699
Created: July 7, 2014  Updated: August 11, 2014
Description: From the Debian advisory:

Andy Lutomirski discovered that the ptrace syscall was not verifying the RIP register to be valid in the ptrace API on x86_64 processors. An unprivileged user could use this flaw to crash the kernel (resulting in denial of service) or for privilege escalation.

Alerts:
Oracle ELSA-2015-0290 kernel 2015-03-12
openSUSE openSUSE-SU-2014:1246-1 kernel 2014-09-28
SUSE SUSE-SU-2014:1138-1 kernel 2014-09-16
Oracle ELSA-2014-1167 kernel 2014-09-09
Oracle ELSA-2014-1392 kernel 2014-10-21
openSUSE openSUSE-SU-2014:0985-1 kernel 2014-08-11
openSUSE openSUSE-SU-2014:0957-1 kernel 2014-08-01
Oracle ELSA-2014-0981 kernel 2014-07-29
Mandriva MDVSA-2014:155 kernel 2014-08-07
Red Hat RHSA-2014:0949-01 kernel 2014-07-28
Oracle ELSA-2014-3049 kernel 2014-07-24
CentOS CESA-2014:0923 kernel 2014-07-25
CentOS CESA-2014:0924 kernel 2014-07-25
Scientific Linux SLSA-2014:0924-1 kernel 2014-07-24
Oracle ELSA-2014-0923 kernel 2014-07-23
Oracle ELSA-2014-0924 kernel 2014-07-23
Red Hat RHSA-2014:0923-01 kernel 2014-07-23
Red Hat RHSA-2014:0924-01 kernel 2014-07-23
Red Hat RHSA-2014:0925-01 kernel 2014-07-23
Red Hat RHSA-2014:0913-01 kernel-rt 2014-07-22
SUSE SUSE-SU-2014:0908-1 Linux kernel 2014-07-17
SUSE SUSE-SU-2014:0909-1 Linux kernel 2014-07-17
SUSE SUSE-SU-2014:0910-1 Linux kernel 2014-07-17
SUSE SUSE-SU-2014:0911-1 Linux kernel 2014-07-17
SUSE SUSE-SU-2014:0912-1 Linux kernel 2014-07-17
Ubuntu USN-2272-1 linux-lts-trusty 2014-07-05
Ubuntu USN-2271-1 linux-lts-saucy 2014-07-05
Ubuntu USN-2270-1 linux-lts-raring 2014-07-05
Ubuntu USN-2269-1 linux-lts-quantal 2014-07-05
Ubuntu USN-2274-1 kernel 2014-07-05
Ubuntu USN-2268-1 kernel 2014-07-05
Ubuntu USN-2266-1 kernel 2014-07-05
Ubuntu USN-2267-1 EC2 kernel 2014-07-05
Debian DSA-2972-1 kernel 2014-07-06

Comments (none posted)

lzo: denial of service/possible code execution

Package(s): lzo  CVE #(s): CVE-2014-4607
Created: July 3, 2014  Updated: January 2, 2017
Description: From the Red Hat bugzilla entry:

An integer overflow may occur when processing any variant of a "literal run" in the lzo1x_decompress_safe function. Each of these three locations is subject to an integer overflow when processing zero bytes. This exposes the code that copies literals to memory corruption. It should be noted that if the target is 64bit liblzo2, the overflow is still possible, but impractical. An overflow would require so much input data that an attack would be infeasible even in modern computers.

Alerts:
openSUSE openSUSE-SU-2015:0932-1 LibVNCServer 2015-05-24
Mandriva MDVSA-2015:163 grub2 2015-03-29
Gentoo 201503-13 busybox 2015-03-29
Mandriva MDVSA-2015:146 libvncserver 2015-03-29
Mandriva MDVSA-2015:150 liblzo 2015-03-29
Fedora FEDORA-2015-1007 dump 2015-02-25
Fedora FEDORA-2015-1023 dump 2015-02-25
Fedora FEDORA-2014-16452 grub2 2014-12-17
Fedora FEDORA-2014-16403 grub2 2014-12-12
Fedora FEDORA-2014-16378 grub2 2014-12-12
Fedora FEDORA-2014-10366 icecream 2014-11-19
Fedora FEDORA-2014-10468 icecream 2014-11-19
Mageia MGASA-2014-0432 kde4 2014-10-29
Mandriva MDVSA-2014:181 dump 2014-09-24
Mageia MGASA-2014-0378 dump 2014-09-15
Mandriva MDVSA-2014:173 busybox 2014-09-03
Mandriva MDVSA-2014:168 libvncserver 2014-09-02
Mageia MGASA-2014-0362 distcc 2014-09-01
Mageia MGASA-2014-0363 blender 2014-09-01
Fedora FEDORA-2014-9632 distcc 2014-08-30
Fedora FEDORA-2014-9591 distcc 2014-08-30
Mageia MGASA-2014-0361 x11vnc 2014-08-28
Mageia MGASA-2014-0356 libvncserver 2014-08-27
Mageia MGASA-2014-0360 kdenetwork4 2014-08-27
Mageia MGASA-2014-0359 italc 2014-08-27
Mageia MGASA-2014-0357 icecream 2014-08-27
Mageia MGASA-2014-0355 harbour 2014-08-27
Mageia MGASA-2014-0358 grub2 2014-08-27
Mageia MGASA-2014-0352 mednafen 2014-08-25
Mageia MGASA-2014-0351 busybox 2014-08-25
Fedora FEDORA-2014-9151 krfb 2014-08-16
Fedora FEDORA-2014-9183 krfb 2014-08-16
Fedora FEDORA-2014-7939 lzo 2014-10-12
Debian DSA-2995-1 lzo2 2014-08-03
SUSE SUSE-SU-2014:0955-1 lzo 2014-07-31
Ubuntu USN-2300-1 lzo2 2014-07-24
Oracle ELSA-2014-0861 lzo 2014-07-23
openSUSE openSUSE-SU-2014:0922-1 lzo 2014-07-21
SUSE SUSE-SU-2014:0904-1 lzo 2014-07-16
Scientific Linux SLSA-2014:0861-2 lzo 2014-07-09
Mandriva MDVSA-2014:134 liblzo 2014-07-10
CentOS CESA-2014:0861 lzo 2014-07-09
Red Hat RHSA-2014:0861-01 lzo 2014-07-09
Mageia MGASA-2014-0290 liblzo 2014-07-09
Oracle ELSA-2014-0861 lzo 2014-07-09
CentOS CESA-2014:0861 lzo 2014-07-09
Fedora FEDORA-2014-7926 lzo 2014-07-03
Gentoo 201701-14 lzo 2017-01-02

Comments (none posted)

mediawiki: prevent external resources in SVG files

Package(s): mediawiki  CVE #(s):
Created: July 7, 2014  Updated: July 11, 2014
Description: From the MediaWiki announcement:

Prevent external resources in SVG files.

Alerts:
Fedora FEDORA-2014-7779 mediawiki 2014-07-05
Fedora FEDORA-2014-7805 mediawiki 2014-07-05

Comments (1 posted)

openstack-ceilometer: information leak

Package(s): openstack-ceilometer  CVE #(s): CVE-2014-4615
Created: July 8, 2014  Updated: August 13, 2014
Description: From the Fedora advisory:

Fix tokens leaking to message queue

Alerts:
Ubuntu USN-2311-2 ceilometer 2014-08-21
Ubuntu USN-2321-1 neutron 2014-08-21
Red Hat RHSA-2014:1050-01 openstack-ceilometer 2014-08-13
Ubuntu USN-2311-1 python-pycadf 2014-08-11
Fedora FEDORA-2014-7780 python-pycadf 2014-07-08
Fedora FEDORA-2014-7799 openstack-ceilometer 2014-07-08

Comments (none posted)

owncloud: undisclosed vulnerability

Package(s): owncloud  CVE #(s):
Created: July 9, 2014  Updated: July 30, 2014
Description: From the owncloud changelog:

Release "6.0.4" Fixed a security issue (Will be disclosed two weeks after this release)

Alerts:
Mandriva MDVSA-2014:140 owncloud 2014-07-29
Mageia MGASA-2014-0301 owncloud 2014-07-26
Fedora FEDORA-2014-7964 owncloud 2014-07-09

Comments (none posted)

php: information disclosure

Package(s): php5  CVE #(s): CVE-2014-4721
Created: July 9, 2014  Updated: July 31, 2014
Description: From the CVE entry:

The phpinfo implementation in ext/standard/info.c in PHP before 5.4.30 and 5.5.x before 5.5.14 does not ensure use of the string data type for the PHP_AUTH_PW, PHP_AUTH_TYPE, PHP_AUTH_USER, and PHP_SELF variables, which might allow context-dependent attackers to obtain sensitive information from process memory by using the integer data type with crafted values, related to a "type confusion" vulnerability, as demonstrated by reading a private SSL key in an Apache HTTP Server web-hosting environment with mod_ssl and a PHP 5.3.x mod_php.

Alerts:
Mandriva MDVSA-2015:080 php 2015-03-28
Red Hat RHSA-2014:1766-01 php55-php 2014-10-30
Red Hat RHSA-2014:1765-01 php54-php 2014-10-30
openSUSE openSUSE-SU-2014:1236-1 php5 2014-09-28
Scientific Linux SLSA-2014:1012-1 php53 and php 2014-08-06
CentOS CESA-2014:1013 php 2014-08-06
openSUSE openSUSE-SU-2014:0945-1 php5 2014-07-30
CentOS CESA-2014:1012 php53 2014-08-06
Oracle ELSA-2014-1013 php 2014-08-06
Oracle ELSA-2014-1012 php53 2014-08-06
Oracle ELSA-2014-1012 php53 2014-08-06
CentOS CESA-2014:1012 php53 2014-08-06
Red Hat RHSA-2014:1012-01 php53 2014-08-06
Ubuntu USN-2276-1 php5 2014-07-09
Mandriva MDVSA-2014:130 php 2014-07-09
Mageia MGASA-2014-0284 php 2014-07-09
Mageia MGASA-2014-0283 php 2014-07-09
Debian DSA-2974-1 php5 2014-07-08
Red Hat RHSA-2014:1013-01 php 2014-08-06
SUSE SUSE-SU-2016:1638-1 php53 2016-06-21

Comments (none posted)

phpmyadmin: cross-site scripting

Package(s): phpmyadmin  CVE #(s): CVE-2014-4348
Created: July 9, 2014  Updated: July 30, 2014
Description: From the CVE entry:

Multiple cross-site scripting (XSS) vulnerabilities in phpMyAdmin 4.2.x before 4.2.4 allow remote authenticated users to inject arbitrary web script or HTML via a crafted (1) database name or (2) table name that is improperly handled after presence in (a) the favorite list or (b) recent tables.

Alerts:
Fedora FEDORA-2014-8577 phpMyAdmin 2014-07-30
Fedora FEDORA-2014-8581 phpMyAdmin 2014-07-30
Mandriva MDVSA-2014:126 phpmyadmin 2014-07-08

Comments (none posted)

python: script execution

Package(s): python CVE #(s): CVE-2014-4650
Created: July 9, 2014 Updated: November 24, 2014
Description: From the Mageia advisory:

The CGIHTTPServer Python module does not properly handle URL-encoded path separators in URLs. This may enable attackers to disclose a CGI script's source code or execute arbitrary scripts in the server's document root.

Alerts:
Oracle ELSA-2015-2101 python 2015-11-23
Red Hat RHSA-2015:2101-01 python 2015-11-19
Scientific Linux SLSA-2015:1330-1 python 2015-08-03
Red Hat RHSA-2015:1330-01 python 2015-07-22
Ubuntu USN-2653-1 python2.7, python3.2, python3.4 2015-06-25
Red Hat RHSA-2015:1064-01 python27 2015-06-04
Mandriva MDVSA-2015:076 python3 2015-03-27
Mandriva MDVSA-2015:075 python 2015-03-27
Fedora FEDORA-2014-16393 python3 2014-12-12
Fedora FEDORA-2014-14266 python 2014-11-22
Fedora FEDORA-2014-14257 python3 2014-11-13
Fedora FEDORA-2014-14245 python3 2014-11-09
Fedora FEDORA-2014-14227 python 2014-11-09
openSUSE openSUSE-SU-2014:1734-1 python 2014-12-31
openSUSE openSUSE-SU-2014:1070-1 python3 2014-08-28
openSUSE openSUSE-SU-2014:1042-1 python3 2014-08-20
openSUSE openSUSE-SU-2014:1041-1 python 2014-08-20
openSUSE openSUSE-SU-2014:1046-1 python 2014-08-20
Mageia MGASA-2014-0285 python 2014-07-09
Scientific Linux SLSA-2015:2101-1 python 2015-12-21

Comments (none posted)

python-django-evolution: incompatible versions

Package(s): python-django-evolution CVE #(s):
Created: July 9, 2014 Updated: July 9, 2014
Description: From the Red Hat bugzilla:

Review Board 1.7.x is only compatible with django_evolution 0.6.x, but I accidentally pushed django_evolution 0.7.1 to stable in Fedora 19 and 20.

Alerts:
Fedora FEDORA-2014-7333 ReviewBoard 2014-07-09
Fedora FEDORA-2014-7348 ReviewBoard 2014-07-09
Fedora FEDORA-2014-7333 python-django-evolution 2014-07-09
Fedora FEDORA-2014-7348 python-django-evolution 2014-07-09

Comments (none posted)

vlc: code execution

Package(s): vlc CVE #(s): CVE-2013-1868 CVE-2013-1954 CVE-2013-4388
Created: July 8, 2014 Updated: July 28, 2014
Description: From the CVE entries:

Multiple buffer overflows in VideoLAN VLC media player 2.0.4 and earlier allow remote attackers to cause a denial of service (crash) and execute arbitrary code via vectors related to the (1) freetype renderer and (2) HTML subtitle parser. (CVE-2013-1868)

The ASF Demuxer (modules/demux/asf/asf.c) in VideoLAN VLC media player 2.0.5 and earlier allows remote attackers to cause a denial of service (crash) and possibly execute arbitrary code via a crafted ASF movie that triggers an out-of-bounds read. (CVE-2013-1954)

Buffer overflow in the mp4a packetizer (modules/packetizer/mpeg4audio.c) in VideoLAN VLC Media Player before 2.0.8 allows remote attackers to cause a denial of service (crash) and possibly execute arbitrary code via unspecified vectors. (CVE-2013-4388)

Alerts:
Gentoo 201411-01 vlc 2014-11-05
Mageia MGASA-2014-0296 live555, vlc, mplayer 2014-07-26
Debian DSA-2973-1 vlc 2014-07-07

Comments (none posted)

xen: information leak

Package(s): xen CVE #(s): CVE-2014-4021
Created: July 4, 2014 Updated: July 9, 2014
Description: From the Red Hat bugzilla entry:

While memory pages recovered from dying guests are being cleaned to avoid leaking sensitive information to other guests, memory pages that were in use by the hypervisor and are eligible to be allocated to guests weren't being properly cleaned. Such exposure of information would happen through memory pages freshly allocated to or by the guest. A malicious guest might be able to read data relating to other guests or the hypervisor itself.

Alerts:
openSUSE openSUSE-SU-2014:1281-1 xen 2014-10-09
openSUSE openSUSE-SU-2014:1279-1 xen 2014-10-09
Debian DSA-3006-1 xen 2014-08-18
Oracle ELSA-2014-0926 kernel 2014-07-25
CentOS CESA-2014:0926 kernel 2014-07-25
Scientific Linux SLSA-2014:0926-1 kernel 2014-07-24
Red Hat RHSA-2014:0926-01 kernel 2014-07-23
Gentoo 201407-03 xen 2014-07-16
Fedora FEDORA-2014-7734 xen 2014-07-04
Fedora FEDORA-2014-7722 xen 2014-07-04

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 3.16-rc4, which was released on July 6.

Stable kernel status: On July 3, Greg Kroah-Hartman announced that 3.14 would be the next "longterm stable" kernel; he will be maintaining it until August 2016 or thereabouts.

The 3.15.4, 3.14.11, 3.10.47, and 3.4.97 stable kernels were released on July 6, followed by 3.15.5, 3.14.12, 3.10.48, and 3.4.98 on July 9.

Comments (none posted)

Quotes of the week

IOW, what would an end-user's bug report look like?

It's important to think this way because a year from now some person we've never heard of may be looking at a user's bug report and wondering whether backporting this patch will fix it. Amongst other reasons.

Andrew Morton

Hey, I figure that if you weren't desperately in need of entertainment, you would not have asked me to hack a perl script!
Paul McKenney

"Magic barrier sprinkles" is a bad path to start down, IMHO.
Rusty Russell

We do not do defensive programming, we try to do logical things, and only logical things.
Eric Dumazet (Thanks to Dan Carpenter.)

Comments (none posted)

The future of realtime Linux in doubt

In a message about the release of the 3.14.10-rt7 realtime Linux kernel, Thomas Gleixner reiterated that the funding problems that have plagued realtime Linux (which he raised, again, at last year's Real Time Linux Workshop) have only gotten worse. Efforts were made to find funding for the project, but "nothing has materialized". Assuming that doesn't change, Gleixner plans to cut back on development and on plans to get the code upstream. "After my last talk about the state of preempt-RT at LinuxCon Japan, Linus told me: 'That was far more depressing than I feared'. The mainline kernel has seen a lot of benefit from the preempt-RT efforts in the past 10 years and there is a lot more stuff which needs to be done upstream in order to get preempt-RT fully integrated, which certainly would improve the general state of the Linux kernel again."

Comments (103 posted)

Kernel development news

Anatomy of a system call, part 1

July 9, 2014

This article was contributed by David Drysdale

System calls are the primary mechanism by which user-space programs interact with the Linux kernel. Given their importance, it's not surprising to discover that the kernel includes a wide variety of mechanisms to ensure that system calls can be implemented generically across architectures, and can be made available to user space in an efficient and consistent way.

I've been working on getting FreeBSD's Capsicum security framework onto Linux and, as this involves the addition of several new system calls (including the slightly unusual execveat() system call), I found myself investigating the details of their implementation. As a result, this is the first of a pair of articles that explore the details of the kernel's implementation of system calls (or syscalls). In this article we'll focus on the mainstream case: the mechanics of a normal syscall (read()), together with the machinery that allows x86_64 user programs to invoke it. The second article will move off the mainstream case to cover more unusual syscalls, and other syscall invocation mechanisms.

System calls differ from regular function calls because the code being called is in the kernel. Special instructions are needed to make the processor perform a transition to ring 0 (privileged mode). In addition, the kernel code being invoked is identified by a syscall number, rather than by a function address.

Defining a syscall with SYSCALL_DEFINEn()

The read() system call provides a good initial example to explore the kernel's syscall machinery. It's implemented in fs/read_write.c, as a short function that passes most of the work to vfs_read(). From an invocation standpoint, the most interesting aspect of this code is the way the function is defined using the SYSCALL_DEFINE3() macro. Indeed, from the code, it's not even immediately clear what the function is called.

    SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
    {
    	struct fd f = fdget_pos(fd);
    	ssize_t ret = -EBADF;
    	/* ... */

These SYSCALL_DEFINEn() macros are the standard way for kernel code to define a system call, where the n suffix indicates the argument count. The definition of these macros (in include/linux/syscalls.h) gives two distinct outputs for each system call.

    SYSCALL_METADATA(_read, 3, unsigned int, fd, char __user *, buf, size_t, count)
    __SYSCALL_DEFINEx(3, _read, unsigned int, fd, char __user *, buf, size_t, count)
    {
    	struct fd f = fdget_pos(fd);
    	ssize_t ret = -EBADF;
    	/* ... */

The first of these, SYSCALL_METADATA(), builds a collection of metadata about the system call for tracing purposes. It's only expanded when CONFIG_FTRACE_SYSCALLS is defined for the kernel build, and its expansion gives boilerplate definitions of data that describes the syscall and its parameters. (A separate page describes these definitions in more detail.)

The __SYSCALL_DEFINEx() part is more interesting, as it holds the system call implementation. Once the various layers of macros and GCC type extensions are expanded, the resulting code includes some interesting features:

    asmlinkage long sys_read(unsigned int fd, char __user * buf, size_t count)
    	__attribute__((alias(__stringify(SyS_read))));

    static inline long SYSC_read(unsigned int fd, char __user * buf, size_t count);
    asmlinkage long SyS_read(long int fd, long int buf, long int count);

    asmlinkage long SyS_read(long int fd, long int buf, long int count)
    {
    	long ret = SYSC_read((unsigned int) fd, (char __user *) buf, (size_t) count);
    	asmlinkage_protect(3, ret, fd, buf, count);
    	return ret;
    }

    static inline long SYSC_read(unsigned int fd, char __user * buf, size_t count)
    {
    	struct fd f = fdget_pos(fd);
    	ssize_t ret = -EBADF;
    	/* ... */

First, we notice that the system call implementation actually has the name SYSC_read(), but is static and so is inaccessible outside this module. Instead, a wrapper function, called SyS_read() and aliased as sys_read(), is visible externally. Looking closely at those aliases, we notice a difference in their parameter types — sys_read() expects the explicitly declared types (e.g. char __user * for the second argument), whereas SyS_read() just expects a bunch of (long) integers. Digging into the history of this, it turns out that the long version ensures that 32-bit values are correctly sign-extended for some 64-bit kernel platforms, preventing a historical vulnerability.

The last things we notice with the SyS_read() wrapper are the asmlinkage directive and asmlinkage_protect() call. The Kernel Newbies FAQ helpfully explains that asmlinkage means the function should expect its arguments on the stack rather than in registers, and the generic definition of asmlinkage_protect() explains that it's used to prevent the compiler from assuming that it can safely reuse those areas of the stack.

To accompany the definition of sys_read() (the variant with accurate types), there's also a declaration in include/linux/syscalls.h, and this allows other kernel code to call into the system call implementation directly (which happens in half a dozen places). Calling system calls directly from elsewhere in the kernel is generally discouraged and is not often seen.

Syscall table entries

Hunting for callers of sys_read() also points the way toward how user space reaches this function. For "generic" architectures that don't provide an override of their own, the include/uapi/asm-generic/unistd.h file includes an entry referencing sys_read:

    #define __NR_read 63
    __SYSCALL(__NR_read, sys_read)

This defines the generic syscall number __NR_read (63) for read(), and uses the __SYSCALL() macro to associate that number with sys_read(), in an architecture-specific way. For example, arm64 uses the asm-generic/unistd.h header file to fill out a table that maps syscall numbers to implementation function pointers.
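As a rough sketch of that pattern (hedged; the actual arm64 code differs in detail), an architecture can redefine __SYSCALL() and then include the generic header to populate its table:

    /* Hypothetical sketch: each __SYSCALL(nr, sym) expands to an
       array-designator entry, so including the generic header fills
       in the table; unassigned slots fall back to sys_ni_syscall(). */
    #undef __SYSCALL
    #define __SYSCALL(nr, sym)  [nr] = sym,

    void * const sys_call_table[__NR_syscalls] = {
        [0 ... __NR_syscalls - 1] = sys_ni_syscall,
    #include <asm/unistd.h>
    };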

However, we're going to concentrate on the x86_64 architecture, which does not use this generic table. Instead, x86_64 defines its own mappings in arch/x86/syscalls/syscall_64.tbl, which has an entry for sys_read():

    0	common	read			sys_read

This indicates that read() on x86_64 has syscall number 0 (not 63), and has a common implementation for both of the ABIs for x86_64, namely sys_read(). (The different ABIs will be discussed in the second part of this series.) The syscalltbl.sh script generates arch/x86/include/generated/asm/syscalls_64.h from the syscall_64.tbl table, specifically generating an invocation of the __SYSCALL_COMMON() macro for sys_read(). This header file is used, in turn, to populate the syscall table, sys_call_table, which is the key data structure that maps syscall numbers to sys_name() functions.
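The generated header is essentially one macro invocation per table row; for read(), the entry looks something like this (a sketch of the generated output, not a verbatim copy):

    __SYSCALL_COMMON(0, sys_read, sys_read)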

x86_64 syscall invocation

Now we will look at how user-space programs invoke the system call. This is inherently architecture-specific, so for the rest of this article we'll concentrate on the x86_64 architecture (other x86 architectures will be examined in the second article of the series). The invocation process also involves a few steps, so a diagram, seen below, may help with the navigation.

[System call diagram]

In the previous section, we discovered a table of system call function pointers; the table for x86_64 looks something like the following (using a GCC extension for array initialization that ensures any missing entries point to sys_ni_syscall()):

    asmlinkage const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = {
    	[0 ... __NR_syscall_max] = &sys_ni_syscall,
    	[0] = sys_read,
    	[1] = sys_write,
    	/*... */
    };

For 64-bit code, this table is accessed from arch/x86/kernel/entry_64.S, from the system_call assembly entry point; it uses the RAX register to pick the relevant entry in the array and then calls it. Earlier in the function, the SAVE_ARGS macro pushes various registers onto the stack, to match the asmlinkage directive we saw earlier.

Moving outwards, the system_call entry point is itself referenced in syscall_init(), a function that is called early in the kernel's startup sequence:

    void syscall_init(void)
    {
    	/*
    	 * LSTAR and STAR live in a bit strange symbiosis.
    	 * They both write to the same internal register. STAR allows to
    	 * set CS/DS but only a 32bit target. LSTAR sets the 64bit rip.
    	 */
    	wrmsrl(MSR_STAR,  ((u64)__USER32_CS)<<48  | ((u64)__KERNEL_CS)<<32);
    	wrmsrl(MSR_LSTAR, system_call);
    	wrmsrl(MSR_CSTAR, ignore_sysret);
    	/* ... */

The wrmsrl() function writes a value to a model-specific register (using the privileged WRMSR instruction); in this case, the address of the general system_call syscall-handling function is written to the MSR_LSTAR register (0xc0000082), which is the x86_64 model-specific register for handling the SYSCALL instruction.

And this gives us all we need to join the dots from user space to the kernel code. The standard ABI for how x86_64 user programs invoke a system call is to put the system call number (0 for read) into the RAX register, and the other parameters into specific registers (RDI, RSI, RDX for the first 3 parameters), then issue the SYSCALL instruction. This instruction causes the processor to transition to ring 0 and invoke the code referenced by the MSR_LSTAR model-specific register — namely system_call. The system_call code pushes the registers onto the kernel stack, and calls the function pointer at entry RAX in the sys_call_table table — namely sys_read(), which is a thin, asmlinkage wrapper for the real implementation in SYSC_read().
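To see that ABI from the user-space side, here is a minimal sketch (my own illustration, not from the kernel sources) that invokes syscall 0 through glibc's syscall() wrapper, which loads RAX and the argument registers before executing SYSCALL:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        char buf[64];

        /* read(): syscall number 0 goes in RAX; the fd, buffer, and
           count arguments travel in RDI, RSI, and RDX. */
        long n = syscall(SYS_read, 0, buf, sizeof(buf));
        if (n >= 0)
            printf("read %ld bytes from stdin\n", n);
        return 0;
    }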

Now that we've seen the standard implementation of system calls on the most common platform, we're in a better position to understand what's going on with other architectures, and with less-common cases. That will be the subject of the second article in the series.

Comments (14 posted)

Control groups, part 2: On the different sorts of hierarchies

July 9, 2014

This article was contributed by Neil Brown


Control groups

Hierarchies are everywhere. Whether this is a deep property of the universe or simply the result of the human thought process, we see hierarchies wherever we look, from the URL bar that your browser displays (or maybe doesn't) to the pecking order in the farm yard. There is a fun fact that if you click on the first link in the main text of a Wikipedia article, and then repeat that on each following article, you eventually get to Philosophy, though this is apparently only true 94.52% of the time. Nonetheless it suggests that all knowledge can be arranged hierarchically underneath the general heading of "Philosophy".

Control groups (cgroups) allow processes to be grouped hierarchically and the specific details of this hierarchy are one area where cgroups have both undergone change and received criticism. In our ongoing effort to understand cgroups enough to enjoy the debates that regularly spring up, it is essential to have an appreciation of the different ways a hierarchy can be used, so we can have some background against which to measure the hierarchy in cgroups.

I find that an example from my past raises some relevant issues, ones that we can then see play out in some more generally familiar filesystem hierarchies and can be prepared to look for in cgroup hierarchies.

Hierarchies in computer account privileges

In a previous role as a system administrator for a modest-sized computing department at a major Australian university, we had a need for a scheme to impose various access controls on, and provide resource allocations to, a wide variety of users: students, both undergraduate and post-graduate, and staff, both academic and professional. Already it is clear that a hierarchy is presenting itself, with room for further subdivisions between research and course-work students, and between technical and clerical professional staff.

Largely orthogonal to this hierarchy were divisions of the school into research groups and support groups (I worked in the Computing Support Group) together with a multitude of courses that were delivered, each loosely associated with a particular program (Computer Engineering, Software Engineering, etc.) at a particular year level. Within each of the different divisions and courses there could be staff in different roles as well as students. Some privileges best aligned with the role performed by the owner of the account, so staff received a higher printing allowance than students. Others aligned with the affiliation of the account owner — a particular printer might be reserved for employees in the School Office who had physical access and used it for printing confidential transcripts. Similarly, students in some particular course had a genuine need for a much higher budget of color printouts.

To manage all of this we ended up with two separate hierarchies that were named "Reason" (which included the various roles, since they were the reason a computer account was given) and "Organisation" (identifying that part of the school in which the role was active). From these two we formed a cross product such that for each role and for each affiliation there was, at least potentially, a group of user accounts. Each account could exist in several of these groups, as both staff and students could be involved in multiple courses, and some senior students might be tutors for junior courses. Various privileges and resources could be allocated to individual roles and affiliations or intersections thereof, and they would be inherited by any account in the combined hierarchy.

Manageable complexity

Having a pair of interconnected hierarchies was certainly more complex than the single hierarchy that I was hoping for, but it had one particular advantage: it worked. It was an arrangement that proved to be very flexible and we never had any trouble deciding where to attach any particular computer account. The complexity was a small price to pay for the utility.

Further, the price was really quite small. While creating the cross product of two hierarchies by hand would have been error prone, we didn't have to do that. A fairly straightforward tool managed all the complexity behind the scenes, creating and linking all the intermediate tree nodes as required. While working with the tree, whether assigning permissions or resources or attaching people to various roles or affiliations, we rarely needed to think about the internal details and never risked getting them wrong.

This exercise left me with a deep suspicion of simple hierarchies. They are often tempting, but just as often they are an over-simplification. So the first lesson from this tale is that a little complexity can be well worth the cost, particularly if it is well-chosen and can be supported by simple tools.

Two types of hierarchy

The second lesson from this exercise is that the two hierarchies weren't just different in detail; they had quite different characters.

The "Reason" hierarchy is what might be called a "classification" hierarchy. Every individual had their own unique role but it is useful to group similar roles into classes and related classes into super-classes. A widely known hierarchy that has this same property is the Linnaean taxonomy of Biological classification, which is a hierarchy of life forms with seven main ranks of Kingdom, Phylum, Class, Order, Family, Genus, and Species.

With this sort of hierarchy all the members belong in the leaves. In the biological example, all life forms are members of some species. We may not know (or be able to agree) which species a particular individual belongs to, but to suggest that some individual is a member of some Family, but not of any Genus or Species doesn't make sense. It would be at best an interim step leading to a final classification.

The "Organisation" hierarchy has quite a different character. The different research groups did not really represent a classification of research interests, but were a way to organize people into conveniently sized groups to distribute management. Certainly the groups aligned with people's interests where possible, but it was not unheard of for someone to be assigned to a group not because they naturally belonged, but because it was most convenient. To some extent the grouping exists for some separate purpose and members are placed in groups to meet that purpose. This contrasts with a "classification" where each "class" exists only to contain its members.

An organizational hierarchy has another important property: it is perfectly valid for internal nodes to contain individuals. The Head of School was the head of the whole school, and so belonged at the top of the hierarchy. Similarly, a program director could reasonably be associated with the program as a whole without being specifically associated with each of the courses in the program. In many organizations, the leader or head of each group is a member of the group one step up in the organizational hierarchy, which affirms this pattern.

These two different types of hierarchy are quite common and often get mingled together. Two places that we can find them that will be familiar to many readers are the "sysfs" filesystem in Linux, and the source code tree for the Linux kernel.

Devices in /sys

The "sysfs" filesystem (normally mounted at /sys) is certainly a hierarchy — as that is how filesystems work. While sysfs currently contains a range of different objects including modules, firmware information, and filesystem details, it was originally created for devices and it is only the devices that will concern us here.

There are, in fact, three separate hierarchical arrangements of devices that all fit inside sysfs, suggesting that each device should have three parents. As devices are represented as directories, this is clearly not possible, since Unix directories may have only one parent. This conundrum is resolved through the use of symbolic links (or "symlinks") with implicit, rather than explicit, links to parents. We will start exploring with the hierarchies that are held together with symlinks.

The hierarchy rooted at /sys/dev could be referred to as the "legacy hierarchy". From the early days of Unix there have been two sorts of devices: block devices and character devices. These are represented by the various device-special-files that can normally be found in /dev. Each such file identifies as either a block device or a character device and has a major device number indicating the general class of device (e.g. serial port, parallel port, disk or tape drive) and a minor number that indicates which particular device of that class is the target.

This three-level hierarchy is exactly what we find under /sys/dev, though a colon is used, rather than a slash, to separate the last two levels. So /sys/dev/block/8:0 (block device with major number 8 and minor number 0) is a symbolic link to a directory representing the device also known as "sda". If we start in that directory and want to find the path from /sys/dev, we can find the last two components ("8:0") by reading the "dev" file. Determining that it is a block device is less straightforward, though the presence of a "bdi" (block device info) directory is a strong hint.

This hierarchy is particularly useful if all you have is the name of a device file in /dev, or an open file descriptor on such a device. The stat() or fstat() system calls will report the device type and the major and minor numbers, and these can trivially be converted to a path name in /sys/dev, which can lead to other useful information about the device.
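As a minimal sketch of that conversion (an illustration, not code from sysfs itself), the device type and numbers reported by stat() map directly onto a /sys/dev pathname:

    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/sysmacros.h>

    /* Print the /sys/dev path for a device file named on the
       command line, using the device type and numbers from stat(). */
    int main(int argc, char *argv[])
    {
        struct stat st;

        if (argc < 2 || stat(argv[1], &st) == -1)
            return 1;
        if (!S_ISBLK(st.st_mode) && !S_ISCHR(st.st_mode))
            return 1;                       /* not a device file */
        printf("/sys/dev/%s/%u:%u\n",
               S_ISBLK(st.st_mode) ? "block" : "char",
               major(st.st_rdev), minor(st.st_rdev));
        return 0;
    }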

The second symlink-based hierarchy is probably the most generally useful. It is rooted at /sys/class and /sys/bus, suggesting that there really should be another level in there to hold both of these. There are plans to combine both of these into a new /sys/subsystem tree, though as those plans are at least seven years old, I'm not holding my breath. One valuable aspect of these plans that is already in place is that each device directory has a subsystem symlink that points back to either the class or bus tree, so you can easily find the parent of any device within this hierarchy.

The /sys/class hierarchy is quite simple, containing a number of device classes each of which contains a number of specific devices with links to the real device directory. As such, it is conceptually quite similar to the legacy hierarchy, just with names instead of numbers. The /sys/bus hierarchy is similar, though the devices are collected into a separate devices subdirectory allowing each bus directory to also contain drivers and other details.

The third hierarchy for organizing devices is a true directory-based hierarchy that doesn't depend on symlinks. It is found in /sys/devices and has a structure that, in all honesty, is rather hard to describe.

The overriding theme to the organization is that it follows the physical connectedness of devices, so if a hard drive is accessed via a USB port with the USB controller attached to a PCI bus, then the path through the hierarchy to that hard drive will first find the PCI bus, and then the USB port. After the hard drive will be the "block" device that provides access to the data on the drive, and then possibly subordinate devices for partitions.

This is an arrangement that seems like a good idea until you realize that some devices get control signals from one place (or maybe two if there is a separate reset line) and power supply from another place, so a simple hierarchy cannot really describe all the interconnectedness. This is an issue that was widely discussed in preparation for this year's Kernel Summit.

When examining these hierarchies from the perspective of "classification" versus "organization", some fairly clear patterns emerge. The /sys/dev hierarchy is a simple classification hierarchy, though possibly overly simple as many devices (e.g. network interfaces) don't appear there. The /sys/class part of the subsystem hierarchy is similarly a simple classification, though it is more complete.

The /sys/bus part of the subsystem hierarchy is also a simple two-level classification, though the presence of extra information for each bus type, such as the drivers directory, confuses this a little. Devices in the class hierarchy are classified by what functionality they provide (net, sound, watchdog, etc.). Devices in the bus hierarchy are classified by how they are accessed and represent different addressable units rather than different functional units. The extra entries in the /sys/bus subtree allow some control over what functionality (represented by a driver and realized as a class device) is requested of each addressable unit.

With this understood, it is hierarchically a simple two-level classification.

The /sys/devices hierarchy is indisputably an organizational hierarchy. It contains all the class devices and all the bus devices in a rough analog of the physical organization of devices. When there is no physical device, or it is not currently represented on any sort of bus, devices are organized into /sys/devices/virtual.

Here again we see that both a classification hierarchy and an organization hierarchy for the same objects can be quite useful, each in its own way. There can be some complexity to working with both, but if you follow the rules, it isn't too bad.

The Linux kernel source tree

For a significantly different perspective on hierarchies, we can look at the Linux kernel source code tree, though many evolving source code trees could provide similar examples. This hierarchy is more about organization than classification, though, as with the research groups discussed earlier, there is generally an attempt to keep related things together when convenient.

There are two aspects of the hierarchy that are worth highlighting, as they illustrate choices that must be made — consciously or unconsciously.

At the top level, there are directories for various major subsystems, such as fs for filesystems (and also file servers like nfsd), mm for memory management, sound, block, crypto, etc. These all seem like reasonable classifications. And then there is kernel. Given that all of Linux is an operating system kernel, maybe this bit is the kernel of the kernel?

In reality, it is various distinct bits and pieces that don't really belong to any particular subsystem, or they are subsystems that are small enough to only need one or two files. In some cases, like the time and sched directories, they are subsystems which were once small enough to belong in kernel and have grown large enough to need their own directory, but not bold enough to escape from the kernel umbrella.

The fs subtree has a similar set of files. Most of fs consists of the different filesystems, and there are a few support modules that get their own subdirectory, such as exportfs, which helps various file servers, and dlm, which supports locking for cluster filesystems. However, fs also contains an ad hoc collection of C files providing services to filesystems, or implementing the higher-level system call interfaces. These are exactly like the code that appears in kernel (and possibly lib) at the top level. However, in fs there is no subdirectory for miscellaneous things; it all just stays in the top level of fs.

There is not necessarily a right answer as to whether everything should be classified into its own leaf directory (following the kernel model), or whether it is acceptable to store source code in internal directories (as is done in fs). However, it is a choice that must be made, and is certainly something to hold an opinion on when debating hierarchies in cgroups.

The kernel source tree also contains a different sort of classification: scripts live in the scripts directory, firmware lives in the firmware directory, and header files live in the include directory — except when they don't. There has been a tendency in recent years to move some header files out of the include directory tree and closer to the C source code files that they are related to. To make this more concrete, let's consider the example of the NFS and the ext3 filesystems.

Each of these filesystems consists of some C language files, some C header files, and assorted other files. The question is: should the header files for NFS live with the header files for ext3 (header files together), or should the header files for NFS live with the C language files for NFS (NFS files together)? To put this another way, do we need to use the hierarchy to classify the header files as different from the other files, or are the different names sufficient?

There was a time when most, if not all, header files were in the include tree. Today, it is very common to find include files mixed with the C files. For ext3, a big change happened in Linux 3.4, when all four header files were moved from include/linux/ into a single file with the rest of the ext3 code: fs/ext3/ext3.h.

The point here is that classification is quite possible without using a hierarchy. Sometimes hierarchical classification is perfect for the task. Sometimes it is just a cumbersome inconvenience. Being willing to use hierarchy when, but only when, it is needed, makes a lot of sense.

Hierarchies for processes

Understanding cgroups, which is the real goal of this series of articles, will require some understanding of how to manage groups of processes and what role hierarchy can play in that management. None of the above is specifically about processes, but it does raise some useful questions or issues that we can consider when we start looking at the details of cgroups:

  • Does the simplicity of a single hierarchy outweigh the expressiveness of multiple hierarchies, whether they are separate (as in sysfs) or interconnected (as in the account management example)?

  • Is the overriding goal to classify processes, or simply to organize them? Or are both needs relevant, and, if so, how can we combine them?

  • Could we allow non-hierarchical mechanisms, such as symbolic links or file name suffixes, to provide some elements of classification or organization?

  • Does it ever make sense for processes to be attached to internal nodes in the hierarchy, or should they be forced into leaves, even if that leaf is simply a miscellaneous leaf?

In the hierarchy of process groups we looked at last time, we saw a single simple hierarchy that classified processes, first by login session, and then by job group. All processes that were in the hierarchy at all were in the leaves, but many processes, typically system daemons that never opened a tty at all, were completely absent from the hierarchy.

To begin to find answers to these questions in a more modern setting, we need to understand what cgroups actually does with processes and what the groups are used for. In the next article we will start answering that question by taking a close look at some of the cgroups "subsystems", which include resource controllers and various other operations that need to treat a set of processes as a group.

Comments (1 posted)

Filesystem notification, part 1: An overview of dnotify and inotify

July 9, 2014

This article was contributed by Michael Kerrisk.


Filesystem notification

Filesystem notification APIs provide a mechanism by which applications can be informed when events happen within a filesystem—for example, when a file is opened, modified, deleted, or renamed. Over time, Linux has acquired three different filesystem notification APIs, and it is instructive to look at them to understand what the differences between the APIs are. It's also worthwhile to consider what lessons have been learned during the design of the APIs—and what lessons remain to be learned.

This article is thus the first in a series that looks at the Linux filesystem notification APIs: dnotify, inotify, and fanotify. To begin with, we briefly describe the original API, dnotify, and look at its limitations. We'll then look at the inotify API, and consider the ways in which it improves on dnotify. In a subsequent article, we'll take a look at the fanotify API.

Filesystem notification use cases

In order to compare filesystem notification APIs, it's useful to consider some of the use cases for those APIs. Some of the common use cases are the following:

  • Caching a model of filesystem objects: The application wants to maintain an internal representation that accurately reflects the current set of objects in a filesystem, or some subtree of that filesystem. An example of such an application is a file manager, which presents the user with a graphical representation of the objects in a filesystem.
  • Logging filesystem activity: The application wants to record all of the events (or some subset of event types) that occur for the monitored filesystem objects.
  • Gatekeeping filesystem operations: The application wants to intervene when a filesystem event occurs. The classic example of such an application is an antivirus system: when another program tries to (for example) execute a file, the antivirus system first checks the contents of the file for malware, and then either allows the execution to proceed if the file contents are benign, or prevents execution if a virus is detected.

In the beginning: dnotify

Without a kernel-supported filesystem notification API, an application must resort to techniques such as polling the state of directories and files using repeated invocations of system calls such as stat() and the readdir() library function. Such polling is, of course, slow and inefficient. Furthermore, this approach allows only a limited range of events to be detected, for example, creation of a file, deletion of a file, and changes of file metadata such as permissions and file size. By contrast, operations such as file renames are difficult to identify.
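For concreteness, here is a minimal sketch of such a polling loop (my own illustration; it watches a single file for metadata changes, and shows why this approach scales poorly):

    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        struct stat prev = {0}, cur;

        if (argc < 2)
            return 1;
        for (;;) {
            /* Poll once per second; anything that happens between
               polls (a rename, for example) is invisible or ambiguous. */
            if (stat(argv[1], &cur) == 0) {
                if (cur.st_mtime != prev.st_mtime ||
                    cur.st_size != prev.st_size)
                    printf("%s changed\n", argv[1]);
                prev = cur;
            }
            sleep(1);
        }
    }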

Those problems led to the creation of the first in-kernel implementation of a filesystem notification API, dnotify, which was implemented by Stephen Rothwell (these days, the maintainer of the linux-next tree) and which first appeared in Linux 2.4.0 (in 2001).

Because it was the first attempt at implementing a filesystem notification API, done at a time when the problem was less well understood and when some of the pitfalls of API design were less easily recognized, the dnotify API has a number of peculiarities. To begin with, the interface is multiplexed on the existing fcntl() system call. (By contrast, the later inotify and fanotify APIs were each implemented using new system calls.) To enable monitoring, one makes a call of the form:

    fcntl(fd, F_NOTIFY, mask);

Here, fd is a file descriptor that specifies a directory to be monitored, and this brings us to the second oddity of the API: dnotify can be used to monitor only whole directories; monitoring individual files is not possible. The mask specifies the set of events to be monitored in the directory. These include events for file access, modification, creation, deletion, and attribute changes (e.g., permission and ownership changes) that are fully listed in the fcntl(2) man page.

A further dnotify oddity is its method of notification. When an event occurs, the monitoring application is sent a signal (SIGIO by default, but this can be changed). The signal on its own does not identify which directory had the event, but if we use sigaction() to establish the handler using the SA_SIGINFO flag, then the handler receives a siginfo_t argument whose si_fd field contains the file descriptor associated with the directory. At that point, the application then needs to rescan the directory to determine which file has changed. (In typical usage, the application would maintain a data structure that caches a mapping of file descriptors to directory names, so that it can map si_fd back to a directory name.)

A simple example of the use of dnotify can be found here.
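For flavor, here is a minimal sketch of the sequence just described (an illustration only; it uses DN_MULTISHOT so that notification persists beyond the first event, and defers all real work out of the signal handler):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static volatile sig_atomic_t notified_fd = -1;

    static void handler(int sig, siginfo_t *si, void *ucontext)
    {
        notified_fd = si->si_fd;    /* directory that had the event */
    }

    int main(void)
    {
        struct sigaction sa;

        sa.sa_sigaction = handler;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGIO, &sa, NULL);

        int dirfd = open(".", O_RDONLY);
        fcntl(dirfd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MULTISHOT);

        for (;;) {
            pause();                /* wait for SIGIO */
            printf("event in directory fd %d; rescan it\n",
                   (int) notified_fd);
        }
    }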

Problems with dnotify

As is probably clear, the dnotify API is cumbersome, and has a number of limitations. As already noted, we can monitor only entire directories, not individual files. Furthermore, dnotify provides notification for a rather modest range of events. Most notably, by comparison to inotify, dnotify can't tell us when a file was opened or closed. However, there are also some other serious limitations of the API.

The use of signals as a notification method causes a number of difficulties. The first of these is that signals are delivered asynchronously: catching signals with a handler can be racy and error-prone. One way around that particular difficulty is to instead accept signals synchronously using sigwaitinfo(). The use of SIGIO as the default notification signal is also undesirable, because it is one of the traditional signals that does not queue. This means that if events are generated more quickly than the application can process the signals, then some notifications will be lost. (This difficulty can be circumvented by changing the notification signal to one of the so-called realtime signals, which can be queued.)

Signals are also problematic because they convey little information: at most, we get a signal number (it is possible to arrange for different directories to notify using different signals) and a file descriptor number. We get no information about which particular file in a directory triggered an event, or indeed what kind of event occurred. (One can play tricks such as opening multiple file descriptors for the same directory, each of which notifies a different set of events, but this adds complexity to the application.) One further reason that using signals as a notification method can be a problem is that an application that uses dnotify might also make use of a library that employs signals: the use of a particular signal by dnotify in the main program may conflict with the library's use of the same signal (or vice versa).

A final significant limitation of the dnotify API is the need to open a file descriptor for each directory that is monitored. This is problematic for two reasons. First, an application that monitors a large number of directories may quickly run out of file descriptors. However, a more serious problem is that holding file descriptors open on a filesystem prevents that filesystem from being unmounted.

Notwithstanding these API problems, dnotify did provide an efficiency improvement over simply polling a filesystem, and dnotify came to be employed in some widely used tools such as the Beagle desktop search tool. However, it soon became clear that a better API would make life easier for user-space applications.

Enter inotify

The inotify API was developed by John McCutchan with support from Robert Love. First released in Linux 2.6.13 (in 2005), inotify aimed to address all of the obvious problems with dnotify.

The API employs three dedicated system calls—inotify_init(), inotify_add_watch(), and inotify_rm_watch()—and makes use of the traditional read() system call as well.

[Inotify diagram]

inotify_init() creates an inotify instance—a kernel data structure that records which filesystem objects should be monitored and maintains a list of events that have been generated for those objects. The call returns a file descriptor that is employed by the rest of the API to refer to this inotify instance. The diagram above summarizes the operation of an inotify instance.

inotify_add_watch() allows us to modify the set of filesystem objects monitored by an inotify instance. We can add new objects (files and directories) to the monitoring list, specifying which events are to be notified, and change the set of events that are notified for an object that is already in the monitoring list. Unsurprisingly, inotify_rm_watch() is the converse of inotify_add_watch(): it removes an object from the monitoring list.

The three arguments to inotify_add_watch() are an inotify file descriptor, a filesystem pathname, and a bit mask:

    int inotify_add_watch(int fd, const char *pathname, uint32_t mask);

The mask argument specifies the set of events to be notified for the filesystem object referred to by pathname and can include some additional bits that affect the behavior of the call. As an example, the following code allows us to monitor file creation and deletion events inside the directory mydir, as well as monitor for deletion of the directory itself:

    int fd, wd;

    fd = inotify_init();

    wd = inotify_add_watch(fd, "mydir",
                           IN_CREATE | IN_DELETE | IN_DELETE_SELF);

A full list of the bits that can be included in the mask argument is given in the inotify(7) man page. The set of events notified by inotify is a superset of that provided by dnotify. Most notably, inotify provides notifications when filesystem objects are opened and closed, and provides much more information for file rename events, as we outline below.

The return value of inotify_add_watch() is a "watch descriptor", which is an integer value that uniquely identifies the specified filesystem object within the inotify monitoring list. An inotify_add_watch() call that specifies a filesystem object that is already being monitored (possibly via a different pathname) will return the same watch descriptor number as was returned by the inotify_add_watch() that first added the object to the monitoring list.

When events occur for objects in the monitoring list, they can be read from the inotify file descriptor using read(). (The inotify file descriptor can also be monitored for readability using select(), poll(), and epoll().) Each read() returns one or more structures of the following form to describe an event:

    struct inotify_event {
        int      wd;      /* Watch descriptor */
        uint32_t mask;    /* Bit mask describing event */
        uint32_t cookie;  /* Unique cookie associating related events */
        uint32_t len;     /* Size of name field */
        char     name[];  /* Optional null-terminated name */
    };

The wd field is a watch descriptor that was previously returned by inotify_add_watch(). By maintaining a data structure that maps watch descriptors to pathnames, the application can determine the filesystem object for which this event occurred. mask is a bit mask that describes the event that occurred. In most cases, this field will include one of the bits specified in the mask specified when the watch was established. For example, given the inotify_add_watch() call that we showed earlier, if the directory mydir was deleted, read() would return an event whose mask field has the IN_DELETE_SELF bit set. (By contrast, dnotify does not generate an event when a monitored directory is deleted.)

In addition to the various events for which an application may request notification, there are certain events for which inotify always generates automatic notifications. The most notable of these is IN_IGNORED, which is generated whenever inotify ceases to monitor an object. This can occur, for example, because the object was deleted or the filesystem on which it resides was unmounted. The IN_IGNORED event can be used by the application to adjust its internal model of what is currently being monitored. (Again, dnotify has no analog of this event.)

The name field is used (only) when an event occurs for a file inside a monitored directory: it contains the null-terminated name of the file that triggered this event. The len field indicates the total size of the name field, which may be terminated by multiple null bytes in order to pad out the inotify_event structure to a size that allows successive structures in the read buffer to be aligned at architecture-appropriate byte boundaries (typically, multiples of 16 bytes).

The cookie field exists to help applications interpret rename events. When a file is renamed inside (or between) monitored directories, two events are generated: an IN_MOVED_FROM event for the directory from which the file is moved, and an IN_MOVED_TO event for the directory to which the file is moved. The first event contains the old name of the file, and the second event contains the new name. Both events have the same unique cookie value, allowing the application to connect the two events, and thus work out the old and new name of the file (a task that is rather difficult with dnotify). We'll say rather more about rename events in the next article in this series.

Inotify does not provide recursive monitoring. In other words, if we are monitoring the directory mydir, then we will receive notifications for that directory as well as all of its immediate descendants, including subdirectories. However, we will not receive notifications for events inside the subdirectories. But, with some effort, it is possible to perform recursive monitoring by creating watches for each of the subdirectories in a directory tree. To assist with this task, when a subdirectory is created inside a monitored directory (or indeed, when any event is generated for a subdirectory), inotify generates an event that has the IN_ISDIR bit set. This provides the application with the opportunity to add watches for new subdirectories.
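A hedged sketch of that bookkeeping follows (the parent_path argument is assumed to come from the application's own watch-descriptor-to-pathname map, which is hypothetical here):

    #include <limits.h>
    #include <stdio.h>
    #include <sys/inotify.h>

    /* When a subdirectory is created inside a watched directory,
       add a watch for the new subdirectory as well. */
    static void watch_new_subdir(int inotify_fd, const char *parent_path,
                                 const struct inotify_event *event)
    {
        if ((event->mask & IN_CREATE) && (event->mask & IN_ISDIR)) {
            char path[PATH_MAX];

            snprintf(path, sizeof(path), "%s/%s",
                     parent_path, event->name);
            inotify_add_watch(inotify_fd, path, IN_ALL_EVENTS);
        }
    }

Note that this is inherently racy: files created in the new subdirectory before the watch is added will be missed, a limitation we will return to in the next article.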

Example program

The code below demonstrates the basic steps in using the inotify API. The program first creates an inotify instance and adds watches for all possible events for each of the pathnames specified in its command line. It then sits in a loop reading events from the inotify file descriptor and displaying information from those events (using our displayInotifyEvent(), shown in the full version of the code here).

    int
    main(int argc, char *argv[])
    {
        struct inotify_event *event;
        ...

        inotifyFd = inotify_init();         /* Create inotify instance */

        for (j = 1; j < argc; j++) {
            wd = inotify_add_watch(inotifyFd, argv[j], IN_ALL_EVENTS);

            printf("Watching %s using wd %d\n", argv[j], wd);
        }

        for (;;) {                          /* Read events forever */
            numRead = read(inotifyFd, buf, BUF_LEN);
            ...

            /* Process all of the events in buffer returned by read() */

            for (p = buf; p < buf + numRead; ) {
                event = (struct inotify_event *) p;
                displayInotifyEvent(event);

                p += sizeof(struct inotify_event) + event->len;
            }
        }
    }

Suppose that we use this program to monitor two subdirectories, xxx and yyy:

    $ ./inotify_demo xxx yyy
    Watching xxx using wd 1
    Watching yyy using wd 2

If we now execute the following command:

    $ mv xxx/aaa yyy/bbb

we see the following output from our program:

    Read 64 bytes from inotify fd
        wd = 1; cookie = 140040; mask = IN_MOVED_FROM
            name = aaa
        wd = 2; cookie = 140040; mask = IN_MOVED_TO
            name = bbb

The mv command generated an IN_MOVED_FROM event for the xxx directory (watch descriptor 1) and an IN_MOVED_TO event for the yyy directory (watch descriptor 2). The two events contained, respectively, the old and new name of the file. The events also had the same cookie value, thus allowing an application to connect them.

How inotify improves on dnotify

Inotify improves on dnotify in a number of respects. Among the more notable improvements are the following:

  • Both directories and individual files can be monitored.
  • Instead of signals, applications are notified of filesystem events by reading structured data from a file descriptor created using the API. This approach allows an application to deal with notifications synchronously, and also allows for richer information to be provided with notifications.
  • Inotify does not require an application to open file descriptors for each monitored object. Instead, it uses an API-specific handle (the watch descriptor). This avoids the problems of file-descriptor exhaustion and open file descriptors preventing filesystems from being unmounted.
  • Inotify provides more information when notifying events. First, it can be used to detect a wider range of events. Second, when the subject of an event is a file inside a monitored directory, inotify provides the name of that file as part of the event notification.
  • Inotify provides richer information in its notification of rename events, allowing an application to easily determine the old and new name of the renamed object.
  • IN_IGNORED events make it (relatively) easy for an inotify application to maintain an internal model of the currently monitored set of filesystem objects.

Concluding remarks

We've briefly seen how inotify improves on dnotify. In the next article in this series, we look in more detail at inotify, considering how it can be used in a robust application that monitors a filesystem tree. This will allow us to see the full capabilities of inotify, while at the same time discovering some of its limitations.

Comments (26 posted)

Patches and updates

Kernel trees

Architecture-specific

Core kernel code

Development tools

Device drivers

Device driver infrastructure

Documentation

Filesystems and block I/O

Memory management

Security-related

Miscellaneous

Page editor: Jake Edge

Distributions

Debian and the PHP license

By Jake Edge
July 9, 2014

Unclear or idiosyncratic licenses on projects can often be problematic for distributions. In particular, Debian seems to struggle with more of these license issues than most other distributions, largely because of the project's notorious attention to that kind of detail. Even so, it is a bit surprising to see the distribution wrestling with the PHP license. One might have guessed that any problems with it would have been worked out long ago, but a problem with that license, as it applies to PHP extensions, reared its head (again) at the end of June.

The problem has been present for years. The PHP License, version 3.01—the most recent as of this writing—contains statements about the software it covers that are specific to distributing PHP itself. According to Ondřej Surý, any package that uses the license but does not come from the "PHP Group" does not have a valid license:

I did have a quite long and extensive chat with FTP Masters and our conclusion was that PHP License (any version) is suitable only for software that comes directly from "PHP Group", that basically means only PHP (src:php5) itself.

In fact, the Debian FTP masters, who serve as the gatekeepers on what packages are allowed into the distribution, specifically mention PHP in a Reject FAQ that lists reasons the team may reject packages. For PHP extensions, it says:

You have a PHP add-on package (any php script/"app"/thing, not PHP itself) and it's licensed only under the standard PHP license. That license, up to the 3.x which is actually out, is not really usable for anything else than PHP itself. I've mailed our -legal list about that and got only one response, which basically supported my view on this. Basically this license talks only about PHP, the PHP Group, and includes Zend Engine, so its not applicable to anything else.

Given that the mail referenced is from 2005, this is clearly a longstanding problem, though little seems to have been done about it over the years. PHP has updated its license and removed some of the problematic wording (the "Zend Engine" wording in particular), but there is still a belief that PHP extensions shouldn't be using the PHP license. There are a number of possible solutions to that problem, which Surý outlined. Debian could get the extension upstreams to relicense under the BSD or MIT licenses (for example), show that the software does actually come from the PHP Group, or remove the affected packages from Debian entirely. He also updated a pile of bugs that were filed against various PHP add-on modules.

It's a complicated question and, unsurprisingly, there are multiple interpretations of the license. That is unfortunate, but it is something that only the PHP Group can address—something it seems unwilling to do. There are some who think that anything distributed from PEAR (PHP Extension and Application Repository) that uses the PHP license (version 3.01 or greater) should be considered to have a reasonable license, while others would add code that comes from PECL (PHP Extension Community Library) to that list as well.

But the use of the PHP license is pervasive throughout the PHP ecosystem, well beyond just PEAR and PECL. For example, Mike Gabriel wondered what the problem was for the LGPL-covered Smarty 3 template engine. As Surý pointed out, though, Smarty 3 also uses four separate PHP files that are under the PHP license.

Surý's email subject said that the extensions covered by the PHP license were "not distributable", but others took exception to that claim. The license text says that the software is being distributed by the PHP Group, which is clearly not the case when Debian (or anyone else) distributes it. Other, similar language essentially requires the distributor to lie, as Steve Langasek said:

There is nothing in these licenses that makes the software undistributable; it just requires the distributor to attach *false statements* to it as part of distribution.

I have no objection to the ftp team's decision to treat this as an automatic reject on this basis - I don't think a license that requires us to make false statements is suitable for main - but it's wrong to claim that these works are undistributable.

But Marco d'Itri thought that none of that mattered. PHP support for certain packages is critical:

Reality check #1: it is quite obvious that even if anybody else accepts this interpretation then nobody cares.
Reality check #2: Debian would not be viable without major packages like PHP support for imagemagick or memcached, if we do we may as well remove the the whole language.

Matthias Urlichs piled on to the "reality check" theme. He agreed that the problem is one that no other distribution cares about and noted that Debian has had these extensions in its repositories for years. Furthermore:

Thus, reality check #3: This license contains some strange terms that make it look like it doesn't really apply to the software it's distributed with, but QUITE OBVIOUSLY the author of the software in question thought otherwise, and there is no actual legal problem (nobody else is complaining about the license, much less threatening to revoke permissions, much less suing somebody).

Thus, while we're in a reasonably good position to convince Upstream to fix that problem, filing RC bugs and thus making PHP [unusable] in Debian is certainly going to be regarded as typical Debian principles-above-all overkill but unlikely to be helpful to anybody.

Later in the thread, Urlichs summarized the situation. It is clear, he said, that PHP doesn't care about the misuse of its license and the misusers don't understand that they are making a mistake. Any efforts by Debian to change that just make the extension authors "consider us quite strange for even mentioning" a license change. He outlined three options: removing the modules ("I'd be for this in a heartbeat if it would make people switch to a saner programming language, but that's wishful thinking"), getting all of the upstreams to change their licenses ("Fat chance"), or biting the bullet and just living with the status quo.

That last option seems to be winning the day (or else everyone ran out of steam to keep arguing). As Russ Allbery put it:

I don't see this as a matter of principle unless the principle is "we refuse to deal with even major software packages that do dumb and self-contradictory things with licenses but without any intent to actually restrict the freedom of the software covered by them." And I don't actually agree with that principle. For stuff not already in Debian, sure, let's stick to a simple policy because we can usually get people to change upstream and make the world a better place, and we don't lose much if we fail. But that doesn't really apply to PHP.

For his part, Surý plans to start closing bugs for those packages that are distributed from PEAR and PECL, which covers most of the affected packages.

While PHP can get away with an unclear license that gets wrongly applied to its extensions (at least in Debian's view) because of its popularity, lesser-known projects with oddly constructed licenses may find it much harder to make their way into distributions. It is important that projects choose their licenses carefully, which is something that many of these extension developers seem to have skipped. It is possible that Debian is being overly critical of the terms, but anyone reading the license may find it rather informal, and it certainly makes life difficult for distributors. Perhaps that's what the PHP project wants, but one gets the sense that what most project members really want is just to ignore licensing issues altogether.

Comments (3 posted)

Brief items

Distribution quote of the week

These days Gentoo is sort of a “background” distro that has been around for ages, has loads of users but new people don’t get excited about anymore, kind of like Debian.
-- Patrick McLean

Comments (3 posted)

Release for CentOS-7

The CentOS project has released CentOS 7.0-1406. This release is the first to be built with sources hosted at git.centos.org. All source rpms are signed with the same key used to sign their binary counterparts. This release also introduces a new numbering scheme. "The 0 component maps to the upstream release, whose code this release is built from. The 1406 component indicates the monthstamp of the code included in the release ( in this case, June 2014 ). By using a monthstamp we are able to respin and reissue updated media for things like container and cloud images, that are regularly refreshed, while still retaining a connection to the base distro version." The release notes also mention that this is the first release to have a supported upgrade path, from CentOS 6.5 to CentOS 7. (Thanks to Scott Dowdle)
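
To make the scheme concrete, here is a minimal sketch of how such a version string decomposes; the helper below is purely illustrative and not part of any CentOS tooling:

    def parse_centos_version(version):
        """Split a CentOS version string like '7.0-1406' into its parts."""
        release, monthstamp = version.split("-")
        major, upstream = release.split(".")
        return {
            "major": int(major),        # distribution major version: 7
            "upstream": int(upstream),  # upstream release the code is built from: 7.0
            "monthstamp": monthstamp,   # YYMM of the included code: 1406 = June 2014
        }

    # parse_centos_version("7.0-1406")
    # -> {'major': 7, 'upstream': 0, 'monthstamp': '1406'}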

Full Story (comments: none)

Distribution News

Debian GNU/Linux

Updating the list of Debian Trusted Organizations

Lucas Nussbaum presents an updated list of Debian Trusted Organizations.
Historically, SPI was the sole organization authorized to hold assets for the Debian Project. Over the years, a number of other organizations started to hold assets on behalf of Debian, but we did not enforce the process defined in our constitution to officially maintain a list of Trusted Organizations.

I would like to use this opportunity to stress the importance of the work of such supporting organizations for Debian, and for the Free Software community in general. The legal and financial framework they provide is a crucial contribution to ensure healthy and functional projects.

Full Story (comments: none)

Debian RT News - New member, freeze reminder and last Squeeze release

The Debian release team welcomes new members, talks about the Jessie release schedule and the upcoming final point release for Squeeze, and more. "For users who wish to stay with Squeeze a bit longer, we recommend that you use and support the Squeeze LTS project. Please keep in mind that Squeeze LTS is only provided for a limited set of architectures (i386 and amd64), and that you need to update your sources.list to use Squeeze LTS."
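
As a sketch of what that sources.list change might look like (the mirror URL here is the one commonly cited at the time; verify it against the Squeeze LTS announcement before relying on it):

    deb http://http.debian.net/debian squeeze-lts main contrib non-free
    deb-src http://http.debian.net/debian squeeze-lts main contrib non-free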

Full Story (comments: none)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

Schaller: Wayland in Fedora update

Christian Schaller has posted an update on Fedora's transition to the Wayland display server. "So the summary is that while we expect to have a version of Wayland in Fedora Workstation 21 that will be able to run a fully functional desktop, there are some missing pieces we now know that will not make it. Which means that since we want to ship at least one Fedora release with a feature complete Wayland as an option before making it default, that means that Fedora Workstation 23 is the earliest Wayland can be the default."

Comments (31 posted)

Tails above the Rest (Linux Journal)

Linux Journal shows how to use Tails in three articles: part 1 covers the installation, part 2 covers using Tails, and part 3 covers advanced features in Tails. From part 3: "As you might imagine, a security- and anonymity-focused distribution like Tails provides a number of encryption tools. These include more general-purpose tools like GNOME disk manager, which you can use to format new encrypted volumes and the ability to mount encrypted volumes that show up in the Places menu at the top of the desktop. In addition to general-purpose tools, Tails also includes an OpenPGP applet that sits in the notification area (that area of the panel at the top right-hand section of the desktop along with the clock, sound and network applets)."

Comments (none posted)

Page editor: Rebecca Sobol

Development

The future of Ardour

July 9, 2014

This article was contributed by Adam Saunders

Ardour is likely the most compelling open-source digital audio workstation (DAW) for music professionals. But a recent blog post by Ardour's lead developer, Paul Davis, revealed that he will likely need to shift his focus due to a lack of financial support for the project:

I really don't like writing articles about Ardour and money. I like to think that successful and worthy projects will magically fund themselves, and obviously, I like to think that Ardour is successful and worthy. This is wrong thinking, however.

Given the support from users and companies for the project, that news comes as a bit of a shock. Users have stated that Ardour can hold its own against Pro Tools, a proprietary DAW used by professionals throughout the music industry. It has also been used in the Mixbus DAW product from Harrison Audio Consoles, sales of which provide some income for the project.

Davis is not completely stopping development work on Ardour; he has "the option of working for a digital audio company that is developing new projects based on Ardour." However, this would focus development on the needs of that particular business over Ardour's end users:

If I do this, I will still be working on Ardour's codebase, but my focus will cease being what I perceive the needs and desires of Ardour users to be, and will be dominated by what another company thinks I should be doing. I don't particularly want to go down this route, but given the current "curve" of the income trend, it appears to me that I will probably have to.

Davis concluded the post with uncertainty about the prospects of the community picking up the burden, and an insistence that his message is not a request for funding. He also emphasized in a comment on the post that he is not abdicating his role as lead developer. However, the message is clear: Ardour's lead developer will likely shift gears on that work in the near future.

What happened? How did a project that received both commercial attention and high praise from users not find the means to fund one developer full-time? To help figure that out, we will need to take a look at Ardour's history.

Davis started the project in 2000, working full-time on Ardour for several years, buoyed by a windfall from his work for Amazon.com in its early days. But the issue of financial sustainability would rear its head before Ardour's initial "stable" 0.99 release in September 2005 (the developers decided to skip a 1.0 release and make major changes for a 2.0 release). In a post to the development mailing list in May 2005, Davis explained that he had earned only $6,000 over the previous five years to support his work. As a result, for a time, he would have to take an unrelated development job that would consume most of his time during the week.

In 2007, the project started a subscription-based funding model. Web site visitors could only download binaries with a paid subscription, which currently costs $1, $4, or $10 per month, with a $50/month option for institutions. Full source code remains available for free download. However, with the project only targeting Linux and Mac OS X users, this seems not to have led to a sustainable model, as popular Linux distributions packaged Ardour for their users. For example, in an October 2007 LWN review of Ardour 2, Forrest Cook reported that the multimedia-focused Ubuntu Studio came with Ardour out-of-the-box.

The community remained concerned about sustainable funding, with the topic dominating discussion on the development mailing list in January 2009. Patrick Shirkey of Boost Hardware then suggested a number of possible funding sources, ranging from seeking grants, to a music CD featuring artists who use Ardour, to celebrity endorsements.

While Davis's income from Ardour had improved by June 2009, an interview with Linux Journal around that time revealed that he was still very concerned about his personal financial situation. Positive attention toward Ardour, coupled with concern for its self-sustainability, remained during the years to come. In a 2010 episode of Jono Bacon and Stuart Langridge's "Shot of Jaq" podcast [.ogg], they noted that software projects relying on funding directly from end users to finance long-term development have struggled in comparison to the proven model of corporations financing free software projects while selling related services (e.g. Red Hat and SUSE). There have also been marketing issues with the subscription model, with one potential user complaining in 2011 about having to pay for binaries for this particular open-source project when other open-source software is available free in both source and executable forms.

In 2012, we finally saw an Ardour-based release for Windows users: a proprietary, closed-source product named Harrison Mixbus. Windows users arriving at the Ardour download page were directed to the Harrison Consoles web site to purchase the product: it currently costs $149 outright (discounted from $219), or $49 plus $9/month for a subscription.

There was also discussion in 2012 about charging $10-20 for Ardour in the Ubuntu Software Center, but the discussion was short-lived and the idea did not seem to be taken seriously by the community. Ardour's 3.0 release came in 2013, with many new features, such as complete support for MIDI.

It appears that Ardour's subscription model has one major technical flaw, which may have cost the project some money. Some formerly subscribing Ardour users, who were concerned about the future of the project, commented on Davis's recent post noting that they had various technical difficulties with the subscription mechanism. Davis replied that PayPal, which is used for subscriptions, does not provide a programmatic interface to view canceled subscriptions, leaving him to manually update the subscription database by downloading a CSV file from PayPal. Addressing that issue may pull in some extra cash, but likely not enough to cause Davis to reconsider his decision to take the new job.
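
For a sense of the manual reconciliation that workflow implies, here is a minimal sketch in Python; the column names are invented for illustration, since PayPal's actual export format would need to be inspected first:

    import csv

    def canceled_subscriptions(export_path):
        """Collect subscriber IDs whose export rows mark a cancellation.

        'Type' and 'Subscriber ID' are hypothetical column names for
        this example, not PayPal's real ones.
        """
        canceled = set()
        with open(export_path, newline="") as f:
            for row in csv.DictReader(f):
                if row.get("Type") == "Subscription Cancelled":
                    canceled.add(row.get("Subscriber ID"))
        return canceled

    # Every ID returned would then have to be marked inactive in the
    # project's subscription database by hand or by a follow-up script.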

Perhaps an opportunity was missed with the project's refusal to consider "average" Windows users (i.e. those who couldn't or wouldn't pay the high cost for Harrison Mixbus) as a potential userbase worth targeting. In 2006, Davis noted, albeit without providing examples, that several open source projects offering Windows ports have not had the capacity to manage the increased demands:

Many other *nix open source projects have been overwhelmed when they have ported to windows - a huge, sudden influx of users with zero background in software development, and no infrastructure to offer them support. we don't want to end up in that situation.

In 2009, Davis again argued that the social costs of supporting a large, non-technical Windows userbase who are not as "willing to support software developers who provide useful tools" outweigh any benefits to the project. But with the massive install base of Windows, the standard subscription option might have captured quite a few Windows users. It's plausible that, with a marketing push, Ardour could have found enough Windows users to subscribe at, say, $10/month. That might have brought in enough to enable Davis to stay on full-time. Or it may not have; that influx might have been more than eaten up by the additional support burden, which is what concerned Davis.

In 2011, Davis noted that the market for DAWs is small, and that even the mainstream proprietary DAWs have been financially struggling:

The audience for paid-for audio software [...] is at least an order of magnitude smaller than the one for paid-for games. the strategies that work to create a revenue flow for games and game engines are totally different [...] from the ones that work for pro-audio software. if that wasn't true, then companies like Steinberg would not have nearly gone bankrupt a few years ago, and Avid would be making huge sums of money from sales of ProTools. But Steinberg did nearly go bankrupt (they are now owned by Yamaha) and Avid does not make much money from the sales of ProTools.

It is sad to see such a well-respected open-source project — one that is key to attracting certain users to Linux (i.e. professional and hobbyist musicians) — be unable to support even one full-time developer. The history of Ardour serves, in part, as a warning to developers of niche open-source projects: relying on end users to fund development work may not be a sustainable model.

Comments (12 posted)

Brief items

Quotes of the week

I understand that we don’t want to self-host. IT has enough to do. I also understand that it may be that no-one is offering to host an open source solution that meets our feature requirements. And the “Mozilla using proprietary software or web services” ship hasn’t just sailed, it’s made it to New York and is half way back and holding an evening cocktail party on the poop deck.
-- Gervase Markham

Perhaps Yorba should have done a kickstarter for Geary (their mail client) *and* potato salad.
-- Garrett LeSage, reflecting on the current state of crowdfunding culture.

Comments (none posted)

First release of KDE Frameworks 5

The KDE Community has announced the release of KDE Frameworks 5.0. "Frameworks 5 is the next generation of KDE libraries, modularized and optimized for easy integration in Qt applications. The Frameworks offer a wide variety of commonly needed functionality in mature, peer reviewed and well tested libraries with friendly licensing terms. There are over 50 different Frameworks as part of this release providing solutions including hardware integration, file format support, additional widgets, plotting functions, spell checking and more. Many of the Frameworks are cross platform and have minimal or no extra dependencies making them easy to build and add to any Qt application."

Comments (19 posted)

Unifont 7.0.03 Released

Version 7.0.03 of the GNU Unifont font has been released. This update includes a glyph for every printable code point in version 7.0 of the Unicode Basic Multilingual Plane (BMP), and also supports the ConScript Unicode Registry (CSUR).

Full Story (comments: none)

GNU Source Release Collection 2014.07.06 available

Version 2014.07.06 of the GNU Source Release Collection (GSRC) is now available. The release is a snapshot of released GNU software projects, including 61 updated packages and four newly added projects since the last release. Note, though, that GSRC does not include every project under the GNU umbrella; users should check the coverage statistics to see which packages are provided.

Full Story (comments: none)

Juju 1.20 available

Version 1.20 of the Juju service-orchestration system has been released. Among other changes, this update adds a "high availability" mode, adds support for multiple network interfaces that use the same MAC address, and improves support for LXC containers. It also adds a version field to the server-side API, so users in the future may see new features available only in versioned API requests.

Comments (none posted)

Newsletters and articles

Development newsletters from the past week

Comments (none posted)

Python Foundation uncoils as membership opens up (Opensource.com)

Opensource.com has an interview with Nick Coghlan, who is a newly elected Python Software Foundation (PSF) board member. In the interview, Coghlan discusses the new open membership model for the PSF, what makes Python special, how the huge investment in OpenStack is having an impact on CPython core development, and a look at the future for both Python and the PSF. "For me, the most fascinating thing about Python is the sheer breadth of the domains it competes in. In the projects I worked on at Boeing, Python became our "go to" glue language for getting different parts of a complex system to play nicely together, as well for writing simulation tools for testing environments. Linux distributions tend to use it in a similar fashion. In the scientific space it goes head to head with the likes of MATLAB for numeric computing, and R for statistical analysis. It was the original implementation language for YouTube, and the language of choice for OpenStack components, yet still simple enough to be chosen as the preferred programming language for the Raspberry Pi and One Laptop Per Child educational programs. With the likes of Maya and Blender using it as their embedded scripting engine, animation studios love it because animators can learn to handle tasks that previously had to be handled by the studios' development teams. That diversity of use cases can make things fraught at times, especially in core development where the competing interests can often collide, but it's also a tremendous strength."

Comments (none posted)

Interview: Damian Conway (Linux Voice)

Linux Voice magazine has an interview with Damian Conway, one of the chief architects of Perl 6. In it, he talks about Perl 6 a bit (of course), but also about Perl, in general, as well as about teaching and learning programming. "Anyone who believes you can teach programming in an hour has no idea about what programming is. I think that I finally thought that I was a confident programmer maybe about four or five years ago, so after about a quarter of a century of coding. I felt that I was an ordinary good programmer by that stage. I don’t think you can even teach HTML in an hour, to be brutally honest."

Comments (32 posted)

Gräßlin: Next Generation Klipper

On his blog, Martin Gräßlin examines Klipper, the KDE clipboard manager, with an eye toward how it should work for Plasma 5.1. "A clipboard history is of course an important part of a desktop shell and thus should be a first class citizen. The user interface needs to be integrate and this means the interface needs to be provided by a Plasmoid which needs to be added to the notification area. The interface would still show a list and this is best done by providing the data in the form of a QAbstractItemModel. As there should only be one clipboard history manager, but at the same time perhaps several user interfaces for it (e.g. one panel per screen) the QAbstractItemModel holding the data needs to be provided by a DataEngine. So overall we need to separate the user interface (Plasmoid) from the data storage (DataEngine) and turn the existing Klipper in just being the data storage."
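
To illustrate the model/view separation Gräßlin describes, here is a toy sketch of the data-storage half. It is not Klipper's actual code (which is C++, with the model exposed through a Plasma DataEngine); this PyQt5 snippet only shows the idea of keeping the history in a QAbstractItemModel that any number of views could display:

    from PyQt5.QtCore import QAbstractListModel, QModelIndex, Qt

    class ClipboardHistoryModel(QAbstractListModel):
        """Holds the clipboard history so that any number of views
        (e.g. one Plasmoid per screen) can display the same data."""

        def __init__(self, entries=None, parent=None):
            super().__init__(parent)
            self._entries = list(entries or [])

        def rowCount(self, parent=QModelIndex()):
            return len(self._entries)

        def data(self, index, role=Qt.DisplayRole):
            if index.isValid() and role == Qt.DisplayRole:
                return self._entries[index.row()]
            return None

        def add_entry(self, text):
            # New clipboard contents go to the top of the history.
            self.beginInsertRows(QModelIndex(), 0, 0)
            self._entries.insert(0, text)
            self.endInsertRows()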

Comments (52 posted)

Page editor: Nathan Willis

Announcements

Brief items

Andrew Tanenbaum retires

Professor Andrew Tanenbaum, creator of MINIX, is retiring after 43 years at the Vrije Universiteit in the Netherlands. He will give a final lecture at the VU on October 23, which will be followed by a reception. (Thanks to Michael Kerrisk.)

Comments (16 posted)

Articles of interest

FSFE Newsletter – July 2014

The July edition of the Free Software Foundation Europe newsletter covers "Privacy cafés", "Email self-defense goes multilingual", "What to use instead of WhatsApp and Threema?", and more.

Full Story (comments: none)

FSFE: EC distorts market by refusing to break free from lock-in

The Free Software Foundation Europe expresses its disappointment with the European Commission for sticking with proprietary software. "The Commission recently admitted publicly for the first time that it is in "effective captivity" to Microsoft. But documents obtained by FSFE show that the Commission has made no serious effort to find solutions based on Open Standards. In consequence, a large part of Europe's IT industry is essentially locked out of doing business with the Commission."

Full Story (comments: none)

An open-minded Internet safety curriculum (Opensource.com)

Part of the curriculum for high school students in the US is a class on internet safety. This article on Opensource.com looks at what is taught and what else should be covered in these classes. "Of course, we must work to help kids understand that the technology world can be a complicated and unsafe place. Digital reputation, Internet security, and online self-defense are critical skills for every citizen. However, in a rush to reduce the discussion to popular topics such as cyberbullying, online predators, and chat rooms, many schools have missed larger and more salient issues. Net Neutrality, Snowden's NSA revelations, social data mining, vendor lock-in and control: these fundamental ideas, principles, and values will ultimately shape and direct our students' technology future, and our society."

Comments (none posted)

Calls for Presentations

Seattle GNU/Linux Conference

The Seattle GNU/Linux Conference (SeaGL) will take place October 24-25, 2014 in Seattle, Washington. The call for participation deadline is July 27.

Full Story (comments: none)

CFP Deadlines: July 10, 2014 to September 8, 2014

The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.

Deadline        Event dates          Event (location)
July 11         October 13-15        CloudOpen Europe (Düsseldorf, Germany)
July 11         October 13-15        Embedded Linux Conference Europe (Düsseldorf, Germany)
July 11         October 13-15        LinuxCon Europe (Düsseldorf, Germany)
July 11         October 15-17        Linux Plumbers Conference (Düsseldorf, Germany)
July 14         August 15-17         GNU Hackers' Meeting 2014 (Munich, Germany)
July 15         October 24-25        Firebird Conference 2014 (Prague, Czech Republic)
July 20         January 12-16        linux.conf.au 2015 (Auckland, New Zealand)
July 21         October 21-24        PostgreSQL Conference Europe 2014 (Madrid, Spain)
July 24         October 6-8          Qt Developer Days 2014 Europe (Berlin, Germany)
July 24         October 24-26        Ohio LinuxFest 2014 (Columbus, Ohio, USA)
July 25         September 22-23      Lustre Administrators and Developers workshop (Reims, France)
July 27         October 14-16        KVM Forum 2014 (Düsseldorf, Germany)
July 27         October 24-25        Seattle GNU/Linux Conference (Seattle, WA, USA)
July 30         October 16-17        GStreamer Conference (Düsseldorf, Germany)
July 31         October 23-24        Free Software and Open Source Symposium (Toronto, Canada)
August 1        August 4             CentOS Dojo (Cologne, Germany)
August 15       September 25-26      Kernel Recipes (Paris, France)
August 15       August 25            CentOS Dojo (Paris, France)
August 15       November 3-5         Qt Developer Days 2014 NA (San Francisco, CA, USA)
August 15       October 20-21        Tizen Developer Summit Shanghai (Shanghai, China)
August 18       October 18-19        openSUSE.Asia Summit (Beijing, China)
August 22       October 3-5          PyTexas 2014 (College Station, TX, USA)
August 31       October 13           Tracing Summit 2014 (Düsseldorf, Germany)
August 31       October 25-26        T-DOSE 2014 (Eindhoven, Netherlands)
September 1     October 2-3          PyCon ZA 2014 (Johannesburg, South Africa)
September 1     October 28-29        2014 LLVM Developers' Meeting (San Jose, CA, USA)
September 2     November 11          Korea Linux Forum (Seoul, South Korea)
September 7     October 4-5          Iberian minority language groups reach out to open source (Santiago de Compostela, Spain)

If the CFP deadline for your event does not appear here, please tell us about it.

Upcoming Events

Events: July 10, 2014 to September 8, 2014

The following event listing is taken from the LWN.net Calendar.

Date(s)               Event (location)
July 5-11             Libre Software Meeting (Montpellier, France)
July 6-12             SciPy 2014 (Austin, Texas, USA)
July 14-16            2014 Ottawa Linux Symposium (Ottawa, Canada)
July 18-20            GNU Tools Cauldron 2014 (Cambridge, England, UK)
July 19-20            Conference for Open Source Coders, Users and Promoters (Taipei, Taiwan)
July 20-24            OSCON 2014 (Portland, OR, USA)
July 21-27            EuroPython 2014 (Berlin, Germany)
July 26-August 1      Gnome Users and Developers Annual Conference (Strasbourg, France)
August 1-3            PyCon Australia (Brisbane, Australia)
August 4              CentOS Dojo (Cologne, Germany)
August 6-9            Flock (Prague, Czech Republic)
August 9              Fosscon 2014 (Philadelphia, PA, USA)
August 15-17          GNU Hackers' Meeting 2014 (Munich, Germany)
August 18-19          Xen Developer Summit North America (Chicago, IL, USA)
August 18             7th Workshop on Cyber Security Experimentation and Test (San Diego, CA, USA)
August 18-19          Linux Security Summit 2014 (Chicago, IL, USA)
August 18-20          Linux Kernel Summit (Chicago, IL, USA)
August 19             2014 USENIX Summit on Hot Topics in Security (San Diego, CA, USA)
August 20-22          USENIX Security '14 (San Diego, CA, USA)
August 20-22          LinuxCon North America (Chicago, IL, USA)
August 20-22          CloudOpen North America (Chicago, IL, USA)
August 22-23          BarcampGR (Grand Rapids, MI, USA)
August 23-31          Debian Conference 2014 (Portland, OR, USA)
August 23-24          Free and Open Source Software Conference (St. Augustin, near Bonn, Germany)
August 25             CentOS Dojo (Paris, France)
August 26-31          ownCloud Contributor Conference and Hackathon (Berlin, Germany)
August 27-30          LaKademy 2014 (São Paulo, Brazil)
September 2-5         LibreOffice Conference (Bern, Switzerland)
September 5           The OCaml Users and Developers Workshop (Gothenburg, Sweden)
September 5-7         BalCCon 2k14 (Novi Sad, Serbia)
September 6-12        Akademy 2014 (Brno, Czech Republic)
September 7-12        CppCon (Bellevue, WA, USA)

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol


Copyright © 2014, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds