LWN.net Weekly Edition for April 24, 2014
The next generation of Python programmers
In a keynote on day two of PyCon 2014 (April 12), Jessica McKellar made an impassioned plea for the Python community to focus on the "next generation" of Python programmers. She outlined the programming-education problem that exists in US high schools (and likely elsewhere in the world as well), but she also highlighted some concrete steps the community could take to help fix it. She expressed hope that she could report progress at the next PyCon, which will also be held in Montréal, Canada, next year.
![Jessica McKellar](https://static.lwn.net/images/2014/pycon-mckellar-sm.jpg)
Statistics
McKellar used the US state of Tennessee (known for "country music and Jack Daniels") as an example. That is where she went to high school and where her family lives. There are roughly 285,000 high school students in Tennessee, but only 251 of them took the advanced placement (AP) computer science (CS) exam. That is 0.09%. AP CS is the most common computer science class that high school students in the US can take.
She showed a slide with an Hour of Code promo that had various "important, famous, and rich people", including US President Barack Obama, who are enthusiastic about students learning to code, she said. They believe that all students should at least have the opportunity to learn how to program in school.
But the reality is quite a bit different. In terms of AP participation by students, CS is quite low. She put up a graph of AP test takers by subject for 2013, showing CS with roughly 30,000 takers nationwide, slightly ahead of 2D studio art and well below subjects like history or literature, which have 400,000+ participants each.
The problem, typically, is the availability of classes. McKellar's sister goes to high school in Nashville, Tennessee. She likes electronics and would be interested in taking a class in computer programming. The best that she is offered at her high school is a class on learning to use Microsoft Word, however. That's not just a problem for Tennessee, either, as there are roughly 25,000 high schools in the US and only around 2,300 of them teach AP CS.
She put up more statistics, including the pass rate for the AP CS exam, which was 66% in Tennessee; for the 25 African-American students who took it, however, the pass rate was only 32%. She showed maps of the US with states highlighted where no African-Americans took the test (11 states), no Hispanics (7), and no girls (3). One of the latter was Mississippi, where the lack of female test takers may be somewhat self-reinforcing; any girl in that state may well get the message that AP CS is not something that girls do. In addition, any girl who considers taking it will have to worry about her results being scrutinized on a national level: "I better pass or I will be a stat people talk about".
AP class participation by gender was next up. There are more AP classes where girls outnumber boys, but for math and science, that balance switches. CS has the worst gender imbalance of any AP class.
The Python community cares about this because it spends not just time and money, but "blood, sweat, and actual tears" trying to improve this imbalance, which starts in high school and even earlier. McKellar said she understands "how empowering programming is" and she wants students to have the opportunity to "engage in computational thinking skills". She wants the politicians that are making decisions about how the country is run to have that experience as well. Fixing this problem is important to the long-term success of the Python language as well.
What can be done
It was a "depressing introduction", she said, with lots of statistics that make it look like an "intractable problem". But there is some solace that can be taken from some of those statistics. While AP CS is the most gender-skewed AP exam, 29% of the test takers in Tennessee were girls. That is largely due to one teacher, Jill Pala in Chattanooga, who taught 30 of the 71 girls who took the exam. McKellar asked: If one teacher can make that big of a difference, what can a 200,000-member community do?
To answer that question, she asked three CS education specialists. If one is "armed with a Python community", what can be done to help educate the next generation of (Python) programmers? She got responses in four main areas: policy, student engagement, supporting teachers, and curriculum development. She said that she would be presenting the full fire hose of ideas in the hopes that everyone in the audience would find at least one that resonated.
Policy
To start with, in 35 states computer science is only an elective that doesn't give any math or science credit to a student who takes it. If a class doesn't "count for anything", students don't want to take it, schools don't want to offer it, and teachers don't want to teach it. One thing that audience members could do is to "pick up the phone" and ask legislators and school boards to change that.
There is also a lack of per-state data on who makes the decisions about high school graduation requirements, and no single place to go for the per-state credential requirements for a teacher to be certified in CS. This is a problem for policy makers because they have no way to judge their own state's policies by comparing them with those of their neighbors. It is something that could be fixed "in a weekend" with some Python, web scraping, and version control, she said.
Another problem is that AP CS is still taught in Java, which is a bad language to teach in. That is not her "hating on Java", but it is something that teachers say. Consider what goes into writing "hello world" in Java: it doesn't allow deferring certain concepts (e.g. classes, object-oriented programming), which makes it difficult to understand "if you've never written a for loop before". Java is also getting "long in the tooth" as the AP CS language. Pascal was that language for fifteen years, followed by C++ for six years, and now Java for the last eleven years.
People have been gathering information about what languages colleges use, what languages college teachers wish they were using, and what language they think they will be using ten years from now. Some of that information is compiled into Reid's List (described in this article [PDF], scroll down a ways), which shows that Python's popularity with college CS programs is clearly on the rise. But Reid's List has not been updated since 2012, as it is done manually, partly via phone calls, she said.
The 2012 list also shows Java with a clear lock on first place for the first programming language taught (Java 197, C++ 82, Python 43, ...). But, McKellar believes that Python's numbers have "skyrocketed" since then. She would like people to engage the College Board (which sets the AP standards) to switch the AP CS exam from Java to Python. The College Board bases its decision on what language teachers want to teach in, so the rise of Python could be instrumental in helping it to make that decision—especially if that rise has continued since 2012. AP CS courses in Python would make for a "more fun computing experience", she said, and by engaging with the College Board, that change could happen in four to six years.
Student engagement
Students don't really know what CS is or what it is about, so they don't have much interest in taking it. There are, however, lots of existing organizations that teach kids but don't know programming: summer camps, Boy Scouts, Girl Scouts, etc. This is an opportunity for the Python community to work with these groups to add a programming component to their existing activities.
There are also after-school programs that lack programming teachers. The idea is to take advantage of existing programs for engaging students to help those students learn a bit about computer science. That may make them more likely to participate in AP CS when they get to high school.
Supporting teachers
CS teachers are typically all alone, as there is generally only one per high school. That means they don't have anyone to talk to or to bounce ideas off of. But the Python community is huge, McKellar said, so it makes sense to bring those high school teachers into it.
Pythonistas could offer to answer lesson plan questions. Or offer to be a teaching assistant. They could also volunteer to visit the class to answer questions about programming. Inviting the teacher to a local user group meeting might be another way to bring them into the community.
Curriculum development
There is a new AP CS class called "CS Principles" that is being developed and tested right now. It covers a broader range of topics that will appeal to more students, so it is a "really exciting way to increase engagement", she said. So far, though, there is no day-to-day curriculum for the course in any language. That is a huge opportunity for Python.
If the best way to teach the CS Principles class was with a curriculum using Python, everyone would use it, she said. Teachers have a limited amount of time, so if there is an off-the-shelf curriculum that does a good job teaching the subject, most will just use it. The types of lessons that are laid out for the class look interesting (e.g. cryptography, data as art, sorting) and just require some eyes and hands to turn them into something Python-oriented that can be used in the classroom. Something like that could be used internationally, too, as there aren't many curricula available for teaching programming to high school students.
Deploying Python for high schools can be a challenge, however. She talked with one student who noted that even the teachers could not install software in the computer lab—and the USB ports had been filled with glue for good measure. That means that everything must be done in the browser. Runestone Interactive has turned her favorite book for learning Python, Think Python, into an interactive web-based textbook. The code is available on GitHub.
Perhaps the most famous browser-based Python is Skulpt, which is an implementation of the language in JavaScript (also available on GitHub). There are currently lots of open bugs for things that teachers want Skulpt to be able to do. Fixing those bugs might be something the community could do. Whether we like it or not, she said, the future of teaching programming may be in the browser.
Summing up
Since we are starting from such terrible numbers (both raw and percentage-wise), a small effort can make a big difference, McKellar said. The Python Software Foundation (PSF), where McKellar is a director, is ready to help. If you are interested in fixing Skulpt bugs, for example, the PSF will feed you pizza while you do that (in a sprint, say). It will also look to fund grant proposals for any kind of Python-education-related project.
She put forth a challenge: by next year's PyCon, she would like to see every attendee do one thing to further the cause of the next generation of Python programmers. At that point, she said, we can look at the statistics again and see what progress has been made. As she made plain, there is plenty of opportunity out there, it just remains to be seen if the community picks up the ball and runs with it.
Slides and video from McKellar's keynote are available.
Pickles are for delis
Alex Gaynor likes pickles, but not of the Python variety. He spoke at PyCon 2014 in Montréal, Canada to explain the problems he sees with the Python pickle object serialization mechanism. He demonstrated some of the things that can happen with pickles—long-lived pickles in particular—and pointed out some alternatives.
![Alex Gaynor](https://static.lwn.net/images/2014/pycon-gaynor-sm.jpg)
Pickle introduction
He began by noting that he is a fan of delis, pickles, and, sometimes, software, but that some of those things—software and the Python pickle module—were also among his least favorite things. The idea behind pickle serialization is simple enough: hand the dump() function an object, get back a byte string. That byte string can then be handed to the pickle module's load() function at a later time to recreate the object. Two of the use cases for pickles are to send objects between two Python processes or to store arbitrary Python objects in a database.
The pickle.dumps() (dump to a string) function returns "random nonsense", he said, and demonstrated that with the following code:
    >>> pickle.dumps([1, "a", None])
    "(lp0\nI1\naS'a'\np1\naNa."
By using the pickletools module, which is not well-known, he said, one can peer inside the nonsense:
    >>> pickletools.dis("(lp0\nI1\naS'a'\np1\naNa.")
        0: (    MARK
        1: l    LIST       (MARK at 0)
        2: p    PUT        0
        5: I    INT        1
        8: a    APPEND
        9: S    STRING     'a'
       14: p    PUT        1
       17: a    APPEND
       18: N    NONE
       19: a    APPEND
       20: .    STOP

The pickle format is a simple stack-based language, similar in some ways to the bytecode used by the Python interpreter. The pickle is just a list of instructions to build up the object, followed by a STOP opcode to return what it has built so far.
In principle, dumping data to the pickle format is straightforward: determine the object's type, find the dump function for that type, and call it. Each of the built-in types (like list, int, or string) would have a function that can produce the pickle format for that type.
But, what happens for user-defined objects? Pickle maintains a table of functions for the built-in types, but it can't do that for user-defined classes. It turns out that it uses the object's __reduce__() method, which returns a function and arguments used to recreate the object. That function and its arguments are put into the pickle, so that the function can be called (with those arguments) at unpickling time. Using the built-in object() type, he showed how that information is stored in the pickle (the output was edited by Gaynor for brevity):
    >>> pickletools.dis(pickle.dumps(object()))
        0: c    GLOBAL     'copy_reg _reconstructor'
       29: c    GLOBAL     '__builtin__ object'
       55: N    NONE
       56: t    TUPLE
       60: R    REDUCE
       64: .    STOP

The _reconstructor() method from the copy_reg module is used to reconstruct its argument, which is the object type from the __builtin__ module. Similarly, for a user-defined class (again, output has been simplified):
    >>> class X(object):
    ...     def __init__(self):
    ...         self.my_cool_attr = 3
    ...
    >>> x = X()
    >>> pickletools.dis(pickle.dumps(x))
        0: c    GLOBAL     'copy_reg _reconstructor'
       29: c    GLOBAL     '__main__ X'
       44: c    GLOBAL     '__builtin__ object'
       67: N    NONE
       68: t    TUPLE
       72: R    REDUCE
       77: d    DICT
       81: S    STRING     'my_cool_attr'
      100: I    INT        3
      103: s    SETITEM
      104: b    BUILD
      105: .    STOP

The pickle refers to the class X, by name, as well as the my_cool_attr attribute. By default, Python pickles all of the entries in x.__dict__, which stores the attributes of the object.
A class can define its own unique pickling behavior by defining the __reduce__() method. If it contains something that cannot be pickled (a file object, for example), some kind of custom pickling solution must be used. __reduce__() needs to return a function and arguments to be called at unpickling time, for example:
    >>> class FunkyPickle(object):
    ...     def __reduce__(self):
    ...         return (str, ('abc',),)
    ...
    >>> pickle.loads(pickle.dumps(FunkyPickle()))
    'abc'
Unpickling is "shockingly simple", Gaynor said. If we look at the first example again (i.e. [1, 'a', None]), the commands in the pickle are pretty straightforward (ignoring some of the extraneous bits). LIST creates an empty list on the stack, INT 1 puts the integer 1 on the stack, and APPEND appends it to the list. The string 'a' and None are handled similarly.
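To make the stack-machine model concrete, here is a toy interpreter written for this article rather than taken from the talk; the toy_unpickle() function and its instruction tuples are simplifications of the real opcode stream, which pickle decodes from bytes and which includes many more opcodes (plus a memo table for PUT and GET, ignored here):

    # A toy interpreter illustrating the stack-machine model described above.
    def toy_unpickle(instructions):
        stack = []
        for op, arg in instructions:
            if op == 'LIST':            # push an empty list
                stack.append([])
            elif op == 'INT':           # push an integer
                stack.append(arg)
            elif op == 'STRING':        # push a string
                stack.append(arg)
            elif op == 'NONE':          # push None
                stack.append(None)
            elif op == 'APPEND':        # pop a value, append it to the list below
                value = stack.pop()
                stack[-1].append(value)
            elif op == 'STOP':          # return whatever has been built so far
                return stack.pop()

    # Roughly the instruction sequence seen in the disassembly above:
    program = [
        ('LIST', None), ('INT', 1), ('APPEND', None),
        ('STRING', 'a'), ('APPEND', None),
        ('NONE', None), ('APPEND', None),
        ('STOP', None),
    ]
    print(toy_unpickle(program))        # [1, 'a', None]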
Pickle woes
But, as we've seen, pickles can cause calls to any function available to the program (built-ins, imported modules, or those present in the code). Using that, a crafted pickle can cause all kinds of problems—from information disclosure to a complete compromise of the user account that is unpickling the crafted data. It is not a purely theoretical problem, either, as several applications or frameworks have been compromised because they unpickled user-supplied data. "You cannot safely unpickle data that you do not trust", he said, pointing to a blog post that shows how to exploit unpickling.
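As a deliberately harmless illustration of what he was warning about (the NotReallyData class is made up for this article, not from the talk), a pickle can name any importable callable via __reduce__(), and that callable runs when the data is loaded; an attacker would simply substitute something far more damaging than os.getcwd():

    import os
    import pickle

    # __reduce__() can name any importable callable; pickle will call it,
    # with the given arguments, at load time. os.getcwd is a harmless
    # stand-in for a real payload.
    class NotReallyData(object):
        def __reduce__(self):
            return (os.getcwd, ())

    crafted = pickle.dumps(NotReallyData())
    # The "victim" only calls pickle.loads(), yet arbitrary code runs:
    print(pickle.loads(crafted))        # prints the current directory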
But, if the data is trusted, perhaps because we are storing and retrieving it from our own database, are there other problems with pickle? He put up a quote from the E programming language web site (scroll down a ways) that pointed to the problem:
- Erm, can I get back to you on that?
He then related a story that happened "many, many maintenance iterations ago, in a code base far, far away". Someone put a pickle into a database, then no one touched it for eighteen months or so. He needed to migrate the table to a different format in order to optimize the storage of some of the other fields. About halfway through the migration of this 1.6 million row table, he got an obscure exception: "module has no attribute".
As he mentioned earlier, pickle stores the name of the pickled class. What if that class no longer exists in the application? In that case, Python throws an AttributeError exception, because the "'module' object has no attribute 'X'" (where X is the name of the class). In Gaynor's case, he was able to go back into the Git repository, find the class in question, and add it back into the code.
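The failure mode he described is easy to reproduce; in this hypothetical sketch, deleting the Legacy class stands in for the maintenance change that removed or renamed it:

    import pickle

    class Legacy(object):
        pass

    stored = pickle.dumps(Legacy())     # e.g. written to a database column

    # Simulate a later release in which the class was removed or renamed:
    del Legacy

    try:
        pickle.loads(stored)
    except AttributeError as exc:
        print(exc)   # an AttributeError: the module no longer has the class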
A similar problem can occur if the name of an attribute in the class should change. The name is "hardcoded" in the pickle itself. In both of his examples, someone was doing maintenance on the code, made some seemingly innocuous changes, but didn't realize that there was still a reference to the old names in a stored pickle somewhere. In Gaynor's mind, this is the worst problem with pickles.
Alternatives
But if pickling is not a good way to serialize Python objects, what is the alternative? He said that he advocated writing your own dump() functions for objects that need them. He demonstrated a class that had a single size attribute, along with a JSON representation that was returned from dump():
    def dump(self):
        return json.dumps({
            "version": 1,
            "size": self.size,
        })

The version field is the key to making it all work as maintenance proceeds. If, at some point, size is changed to width and height, the dump() function can be changed to emit "version": 2. One can then create a load() function that deals with both versions. It can derive the new width and height attributes from size (perhaps using sqrt() if size was the area of a square table as in his example).
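A load() counterpart along those lines might look like the following sketch; the Table class and its width, height, and size attributes are illustrative stand-ins, not the talk's actual code:

    import json
    import math

    class Table(object):
        def dump(self):
            # Version 2 stores width and height instead of a single size.
            return json.dumps({
                "version": 2,
                "width": self.width,
                "height": self.height,
            })

        @classmethod
        def load(cls, data):
            fields = json.loads(data)
            obj = cls()
            if fields["version"] == 1:
                # Old records stored only the area of a square table;
                # derive the new attributes from it.
                obj.width = obj.height = math.sqrt(fields["size"])
            elif fields["version"] == 2:
                obj.width = fields["width"]
                obj.height = fields["height"]
            else:
                raise ValueError("unknown version: %r" % fields["version"])
            return obj

The point is that old serialized records keep working after the schema changes, because the loader knows how to interpret every version it has ever emitted.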
Writing your own dump() and load() functions is more testable, simpler, and more auditable, Gaynor said. It can be tested more easily because the serialization doesn't take place inside an opaque framework. The simplicity also comes from the fact that the code is completely under your control; pickle gives you all the tools needed to handle these maintenance problems (using __reduce__() and a few other special methods), but it takes a lot more code to do so. Custom serialization is more auditable because one must write dump() and load() for each class that will be dumped, rather than pickle's approach which simply serializes everything, recursively. If some attribute got pickled improperly, it won't be known until the pickle.load() operation fails somewhere down the road.
His example used JSON, but there are other mechanisms that could be used. JSON has an advantage that it is readable and supported by essentially every language out there. If speed is an issue, though, MessagePack is probably worth a look. It is a binary format and supports lots of languages, though perhaps somewhat fewer than JSON.
He concluded his talk by saying that "pickle is unsafe at any speed" due to the security issues, but, more importantly, the maintenance issues. Pickles are still great at delis, however.
An audience member wondered about using pickles for sessions, which is common in the Python web-framework world. Gaynor acknowledged pickle's attraction, saying that being able to toss any object into the session and get it back later is convenient, but it is also the biggest source of maintenance problems in his experience. The cookies that are used as session keys (or signed cookies that actually contain the pickle data) can pop up at any time, often many months later, after the code has changed. He recommends either only putting simple types that JSON can handle directly into sessions or creating custom dump() and load() functions for things JSON can't handle.
There are ways to make pickle handle code updates cleanly, but they require that lots of code be written to do so. Pickle is "unsafe by default" and it makes more sense to write your own code rather than to try to make pickle safe, he said. One thing that JSON does not handle, but pickle does, is cyclic references. Gaynor believes that YAML does handle cyclic references, though he cautioned that the safe_load() function in the most popular Python implementation must be used rather than the basic load() function (though he didn't elaborate). Cyclic references are one area that makes pickle look nice, he said.
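A quick demonstration of that difference, using only the standard json and pickle modules:

    import json
    import pickle

    cycle = []
    cycle.append(cycle)                 # a list that contains itself

    try:
        json.dumps(cycle)
    except ValueError as exc:
        print(exc)                      # "Circular reference detected"

    restored = pickle.loads(pickle.dumps(cycle))
    print(restored[0] is restored)      # True: the cycle survives the round trip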
One of the biggest lessons he has learned when looking at serialization is that there is no single serialization mechanism that is good for all use cases. Pickle may be a reasonable choice for multi-processing programs where processes are sending pickles to each other. In that case, the programmer controls both ends of the conversation and classes are not usually going to disappear during the lifetime of those processes. But the clear sense was that, even in that case, Gaynor would look at a solution other than pickle.
The video of the talk is at pyvideo.org (along with many other PyCon videos). The slides are available at Speaker Deck.
Transmageddon media transcoder reaches 1.0
Version 1.0 of the Transmageddon media transcoder was released in early April, complete with several new features in addition to the implicit stability cred that accompanies a 1.0 release. The tool itself looks unassuming: it occupies a small GTK+ window with little on display beyond a few pull-down menu selectors and a "Transcode" button, but it fills an important gap in the free software landscape. Support for compressed media formats can often turn into a minefield, with patent and licensing problems on one side and lack of free-software support on the other. Tools like Transmageddon that convert one format to another are in some sense a necessary evil as long as there are multiple audio/video formats to choose from, but a good free software tool can minimize the pain, making format conversion easy to tackle without constructing the lengthy and arcane command sequences demanded by command-line encoders.
![Transmageddon GUI](https://static.lwn.net/images/2014/04-transmageddon-gui-sm.png)
The new release was made April 1, and was followed up by a 1.1 release on April 10 that fixed a pair of important bugs. Source bundles are available for download on the project's web site, as are RPM packages. Transmageddon requires GTK+3, GLib 2, Python 3, and GStreamer 1.0. The formats that the application can handle depend on which GStreamer plugins are installed; for the fullest range of options, all of the plugin collections are recommended, including the Bad and Ugly sets.
Christian Schaller started Transmageddon as a kind of personal experiment to familiarize himself with Python and GStreamer development back in 2009. Since then, the application has grown a bit in scope, but mostly in accordance with new features supported by GStreamer. The original intent, for instance, was to support simple output profiles that targeted portable devices, so that a user could transcode a video file for playback on the Nokia 810 tablet or Apple iPod. As most people who have experimented with transcoding videos have discovered, though, there are still quite a few options to juggle for any one target device, including separate options for the video and audio codecs chosen and what to do with multiple audio tracks. In addition to that, of course, the landscape of codecs and container formats has continued to evolve over the intervening years, as it will certainly continue to do.
Transmageddon's main competition in the easy-to-use media transcoder space is arguably HandBrake, which also offers a user-friendly GUI and is available under the GPL. But HandBrake has its own oddities that inhibit its support from Linux distributions, starting with the fact that it ships linked against GPL-incompatible components like the versions of FAAC that include proprietary code.
In practice, the other main transcoding approach in use is directly calling the command-line transcoding features exposed by FFmpeg, Mencoder, and other media-playback libraries. The difficulty, of course, is that ease of use is severely lacking with this approach. One often needs a different incantation for almost every input and output option, and the command-line libraries have a notorious reputation for esoteric syntax that is difficult to memorize—a reputation that, frankly, is well-earned. Command-line usage by end users is not the main goal of any of these library projects.
Transmageddon, on the other hand, works—and it generally works without too many confusing errors that are hard to debug. The user selects the input file from a GTK+ file-chooser dialog, Transmageddon automatically detects the makeup and format of the audio and video data within, and presents drop-down lists with the output options supported by the available GStreamer plugins.
![Transmageddon and audio options](https://static.lwn.net/images/2014/04-transmageddon-audio-sm.png)
The 1.0 release introduced three major new features: support for multiple audio streams, support for directly ripping DVDs, and support for defining custom transcoding presets. The multiple audio stream support means that Transmageddon has the ability to transcode each audio stream independently—including omitting some streams, and including transcoding each stream into a different codec. DVDs commonly include multiple audio tracks these days, and for some (such as audio commentary tracks) it might make sense to use a speech-specific codec like Speex, rather than a general-purpose codec that must handle the blend of speech, music, and sound effects found in the primary audio mix.
![Transmageddon DVD ripping](https://static.lwn.net/images/2014/04-transmageddon-dvd-sm.png)
DVD-ripping, of course, is a common use case that Transmageddon had simply never supported before. The legality of ripping DVDs—even DVDs that you own—is a matter where the entertainment industry vehemently clashes with the software industry (and many consumers), so manually installing a few optional libraries like libdvdcss2 is required. Some Linux distributions make this step easier than others, but for a readable DVD, Transmageddon accurately detects the disc's media files, and transcoding them to a more portable format is not any more difficult than transcoding other content.
Custom presets for Transmageddon can be created by making an XML configuration file that follows the application's profile syntax. The syntax is straightforward, with <video> and <audio> elements encapsulating basic information like the codec to be used, frame rate, and number of audio channels. The only genuinely complex part of defining a custom profile is likely to be knowing and understanding the options for the chosen codecs; for help with that task, the Transmageddon site recommends getting familiar with GStreamer's gst-inspect-1.0 utility.
Two other low-level features also made it into the 1.0 release: support for Google's VP9 video codec and the ability to set a "language" tag for the input file (which is then preserved in the metadata of the output file). VP9 is now supported by Firefox, Chrome/Chromium, FFmpeg, and a few other applications, although it might not yet be called widespread. The language-tagging feature is just the beginning of better language support: DVDs typically tag their audio tracks with a language (which is metadata that Transmageddon preserves in the output file); in the future, Schaller said, he hopes to support multiple language tags on all input files, to simplify finding an audio track by language during playback.
Given its ease of use, Transmageddon is likely to attract quite a few users. Almost everyone runs into the need to convert a media file into another format at one time or another; some people do such transcoding on a regular basis, particularly for their portable devices. But Transmageddon is also worth exploring because it is built on GStreamer, which in recent years has grown into a library in widespread use by projects far beyond the GNOME desktop environment. GStreamer's plugin architecture makes it possible to build and use a multimedia system without any problematic codecs—and to build it without the hassle of recompiling from source. Distributors and system builders can address codec patent licenses and other such issues as they see fit, often on a per-plugin basis.
Transmageddon thus meets an important need with a simple and straightforward interface. At the same time, though, its existence also demonstrates that despite the popularity of GStreamer, FFmpeg, MPlayer, and similar big-name multimedia projects, transcoding media files remains an end-user's problem. Life for the average user would be much simpler if media applications like MythTV, XBMC, gPodder, and the various "music store" plugins of audio players automatically handled transcoding duties. But they don't. Partly that is because the never-ending codec wars make it unduly hard for a free-software application to automatically read and convert proprietary file formats, of course, but the upshot is that users have to juggle the codecs, containers, and transcoding themselves.
Security
Decentralized storage with Camlistore
Reducing reliance on proprietary web services has been a major target of free-software developers for years now. But it has taken on increased importance in the wake of Edward Snowden's disclosures about service providers cooperating with government mass-surveillance programs—not to mention the vulnerability that many providers have to surveillance techniques whether they cooperate or not. While some projects (such as Mailpile, ownCloud, or Diaspora) set out to create a full-blown service that users can be in complete control of, others, such as the Tahoe Least-Authority Filesystem, focus on more general functionality like decentralized data storage. Camlistore is a relative newcomer to the space; like Tahoe-LAFS it implements a storage system, but its creators are particularly interested in its use as a storage layer for blogs, content-management systems (CMSes), filesharing, and other web services.
Camlistore is a content-addressable storage (CAS) system with an emphasis on decentralized data storage. Specifically, the rationale for the project notes that it should be usable on a variety of storage back-ends, including Amazon's S3, local disk, Google Drive, or even mobile devices, with full replication of content between different locations.
Content addressability means that objects can be stored without assigning them explicit file names or placing them in a directory hierarchy. Instead, the "identity" of each object is a hash or digest calculated over the content of the object itself; subsequent references to the object are made by looking up the object's digest—where it is stored is irrelevant. As the rationale document notes, this property is a perfect fit for a good many objects used in web services today: photos, blog comments, bookmarks, "likes," and so on. These objects are increasingly created in large numbers, and rarely does a file name or storage location come into play. Rather, they are accessed through a search interface or a visual browsing feature.
The Camlistore project produces both an implementation of such a decentralized storage system and a schema for representing various types of content. The schema would primarily be of interest to those wishing to use Camlistore as a storage layer for other applications.
The project's most recent release is version 0.7, from February 27. The storage server (with several available back-ends) is included in the release, as are a web-based interface, a Filesystem in Userspace (FUSE) module for accessing Camlistore as a filesystem, several tools for interoperating with existing web services, and mobile clients for Android and iOS.
The architecture of a Camlistore repository includes storage nodes (referred to by the charming name "blob servers") and indexing/search nodes, which index uploaded items by their digests and provide a basic search interface. The various front-end applications (including the mobile and web interfaces) handle both connecting to a blob server for object upload and retrieval and connecting to a search server for finding objects.
There can be several blob servers that fully synchronize with one another by automatically mirroring all data; the existing implementations can use hard disk storage or any of several online storage services. At the blob-server level, the only items that are tracked are blobs: immutable byte sequences that are uploaded to the service. Each blob is indexed by its digest (also called a blobref); Camlistore supports SHA1, MD5, and SHA256 as digest functions. Blobs themselves are encrypted (currently with AES-128, although other ciphers may be added in the future).
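As a rough sketch of what content addressing means in practice, the blobref() helper below is invented for illustration; the "sha1-" prefix mirrors the style of Camlistore blobrefs, but this is not the project's client code:

    import hashlib

    def blobref(content):
        # A blob's identity is derived purely from its bytes; where the
        # blob happens to be stored is irrelevant to callers.
        return "sha1-" + hashlib.sha1(content).hexdigest()

    photo = b"...raw image bytes..."    # placeholder content
    print(blobref(photo))
    # The same bytes always produce the same blobref, so any mirror
    # holding the blob can serve a request for it.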
Semantically speaking, a blob does not contain any metadata—it is just a bunch of bytes. Metadata is attached to a blob by associating the blob with a data type from the schema, then cryptographically signing the result. Subsequently, an application can alter the attributes of a blob by creating a new signed schema blob (called a "claim"). For any blob, then, all of the claims on it are saved in the data store and can be replayed or backed up at will. That way, stored objects are mutable, but the changes to them are non-destructive. The current state of an object is the application of all of the claims associated with a blob, applied in the order of their timestamps.
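A simplified sketch of the claim-replay idea follows; the claim fields used here are invented for illustration and do not follow Camlistore's actual schema:

    # Reconstruct an object's current state by replaying its signed claims
    # in timestamp order; each claim here just sets one attribute.
    claims = [
        {"ts": 3, "attr": "title", "value": "Vacation 2014"},
        {"ts": 1, "attr": "title", "value": "Untitled"},
        {"ts": 2, "attr": "tag",   "value": "travel"},
    ]

    def current_state(claims):
        state = {}
        for claim in sorted(claims, key=lambda c: c["ts"]):
            state[claim["attr"]] = claim["value"]
        return state

    print(current_state(claims))
    # {'title': 'Vacation 2014', 'tag': 'travel'}: later claims win, but
    # the earlier ones remain stored, so changes are non-destructive.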
This storage architecture allows for, potentially, a wide variety of front-end clients. Index servers already exist that use SQLite, LevelDB, MySQL, PostgreSQL, MongoDB, and Google App Engine's data store to manage the indexed blobs. Since an index server is logically separate from the blob servers that it indexes, it is possible to run an index on a portable device that sports little built-in storage, and still be able to transparently access all of the content maintained in the remote storage locations. In addition, Camlistore has the concept of a "graph sync," in which only a subset of the total blob storage is synchronized to a particular device. While full synchronization is useful to preserve the data in the event that a web service like Amazon S3 unexpectedly becomes unreachable, there are certainly many scenarios when it makes sense to keep only some of the data on hand.
As far as using the blob storage is concerned, at present Camlistore only implements two models: the basic storage/search/retrieval approach one would use to manage the entire collection, and directly sharing a particular item with another user. By default, each Camlistore server is private to a single user; users can share an object by generating a signed assertion that another user is permitted to access the object. This signed assertion is just one more type of claim for the underlying blob in the database. Several user-authentication options are supported, but for now the recipient of the share needs to have an account on the originating Camlistore system.
It may be a while before Camlistore is capable of serving as a storage layer for a blog, photo-hosting site, or other web service, but when it is ready, it will bring some interesting security properties with it. As mentioned, all claims on items in the database are signed—using GPG keys. That not only allows for verification of important operations (like altering the metadata of a blob), but it means it would be possible to perform identity checks for common operations like leaving comments. Camlistore does have some significant competition from other decentralized storage projects, Tahoe-LAFS included, but it will be an interesting project to watch.
Brief items
Security quotes of the week
The depressing part of this is that there's no reason to believe that Panasonic are especially bad here - especially since a large number of vendors are shipping much the same Mediatek code, and so probably have similar (if not identical) issues. The future is made up of network-connected appliances that are using your electricity to mine somebody else's Dogecoin. Our nightmarish dystopia may be stranger than expected.
OpenSSL code beyond repair, claims creator of “LibreSSL” fork (Ars Technica)
Ars Technica takes a look at the LibreSSL fork of OpenSSL created by the OpenBSD project. "The decision to fork OpenSSL is bound to be controversial given that OpenSSL powers hundreds of thousands of Web servers. When asked why he wanted to start over instead of helping to make OpenSSL better, de Raadt said the existing code is too much of a mess. "Our group removed half of the OpenSSL source tree in a week. It was discarded leftovers," de Raadt told Ars in an e-mail. "The Open Source model depends [on] people being able to read the code. It depends on clarity. That is not a clear code base, because their community does not appear to care about clarity. Obviously, when such cruft builds up, there is a cultural gap. I did not make this decision... in our larger development group, it made itself.""
New vulnerabilities
cacti: multiple vulnerabilities
Package(s): cacti
CVE #(s): CVE-2014-2708 CVE-2014-2709 CVE-2014-2326 CVE-2014-2328 CVE-2014-2327
Created: April 17, 2014
Updated: June 30, 2014
Description: From the Red Hat bugzilla entries [1, 2]:

CVE-2014-2708 is for the SQL injection issues in graph_xport.php. CVE-2014-2709 is for the shell escaping issues in lib/rrd.php.

A posting to bugtraq from Deutsche Telekom noted multiple flaws in Cacti 0.8.7g:

CVE-2014-2326 (stored XSS): "The Cacti application is susceptible to stored XSS attacks. This is mainly the result of improper output encoding."

CVE-2014-2327 (missing CSRF token): "The Cacti application does not implement any CSRF tokens. More about CSRF attacks, risks and mitigations see https://www.owasp.org/index.php/Cross-Site_Request_Forgery_(CSRF). This attack has a vast impact on the security of the Cacti application, as multiple configuration parameters can be changed using a CSRF attack. One very critical attack vector is the modification of several binary files in the Cacti configuration, which may then be executed on the server. This results in full compromise of the Cacti host by just clicking a web link. A proof of concept exploit has been developed, which allows this attack, resulting in full (system level) access of the Cacti system. Further attack scenarios include the modification of the Cacti configuration and adding arbitrary (admin) users to the application."

CVE-2014-2328 (use of exec-like function calls without safety checks allows arbitrary command execution): "Cacti makes use of exec-like method PHP function calls, which execute command shell code without any safety checks in place. In combination with a CSRF weakness this can be triggered without the knowledge of the Cacti user. Also, for more elaborate attacks, this can be combined with a XSS attack. Such an attack will result in full system (Cacti host) access without any interaction or knowledge of the Cacti admin."
java: three unspecified vulnerabilities
Package(s): java-1.7.0-oracle
CVE #(s): CVE-2014-0432 CVE-2014-0448 CVE-2014-2422
Created: April 17, 2014
Updated: May 14, 2014
Description: Yet again more unspecified Java vulnerabilities.
java: multiple unspecified vulnerabilities
Package(s): java-1.6.0-sun
CVE #(s): CVE-2014-0449 CVE-2014-2401 CVE-2014-2409 CVE-2014-2420 CVE-2014-2428
Created: April 17, 2014
Updated: June 3, 2014
Description: More in a long series of unspecified Java vulnerabilities.
kernel: privilege escalation
Package(s): kernel
CVE #(s): CVE-2014-2851
Created: April 18, 2014
Updated: May 6, 2014
Description: From the CVE entry:

Integer overflow in the ping_init_sock function in net/ipv4/ping.c in the Linux kernel through 3.14.1 allows local users to cause a denial of service (use-after-free and system crash) or possibly gain privileges via a crafted application that leverages an improperly managed reference counter.
kernel: denial of service
Package(s): kernel
CVE #(s): CVE-2014-0155
Created: April 21, 2014
Updated: May 6, 2014
Description: From the CVE entry:

The ioapic_deliver function in virt/kvm/ioapic.c in the Linux kernel through 3.14.1 does not properly validate the kvm_irq_delivery_to_apic return value, which allows guest OS users to cause a denial of service (host OS crash) via a crafted entry in the redirection table of an I/O APIC. NOTE: the affected code was moved to the ioapic_service function before the vulnerability was announced.
mysql: multiple unspecified vulnerabilities
Package(s): mysql-5.5
CVE #(s): CVE-2014-0384 CVE-2014-2419 CVE-2014-2430 CVE-2014-2431 CVE-2014-2432 CVE-2014-2436 CVE-2014-2438 CVE-2014-2440
Created: April 23, 2014
Updated: July 24, 2014
Description: From the CVE entries:

Unspecified vulnerability in the MySQL Server component in Oracle MySQL 5.5.35 and earlier and 5.6.15 and earlier allows remote authenticated users to affect availability via vectors related to XML. (CVE-2014-0384)

Unspecified vulnerability in Oracle MySQL Server 5.5.35 and earlier and 5.6.15 and earlier allows remote authenticated users to affect availability via unknown vectors related to Partition. (CVE-2014-2419)

Unspecified vulnerability in Oracle MySQL Server 5.5.36 and earlier and 5.6.16 and earlier allows remote authenticated users to affect availability via unknown vectors related to Performance Schema. (CVE-2014-2430)

Unspecified vulnerability in Oracle MySQL Server 5.5.36 and earlier and 5.6.16 and earlier allows remote attackers to affect availability via unknown vectors related to Options. (CVE-2014-2431)

Unspecified vulnerability Oracle the MySQL Server component 5.5.35 and earlier and 5.6.15 and earlier allows remote authenticated users to affect availability via unknown vectors related to Federated. (CVE-2014-2432)

Unspecified vulnerability in Oracle MySQL Server 5.5.36 and earlier and 5.6.16 and earlier allows remote authenticated users to affect confidentiality, integrity, and availability via vectors related to RBR. (CVE-2014-2436)

Unspecified vulnerability in Oracle MySQL Server 5.5.35 and earlier and 5.6.15 and earlier allows remote authenticated users to affect availability via unknown vectors related to Replication. (CVE-2014-2438)

Unspecified vulnerability in the MySQL Client component in Oracle MySQL 5.5.36 and earlier and 5.6.16 and earlier allows remote attackers to affect confidentiality, integrity, and availability via unknown vectors. (CVE-2014-2440)
openshift-origin-broker: authentication bypass
Package(s): openshift-origin-broker
CVE #(s): CVE-2014-0188
Created: April 23, 2014
Updated: April 23, 2014
Description: From the Red Hat advisory:

A flaw was found in the way openshift-origin-broker handled authentication requests via the remote user authentication plug-in. A remote attacker able to submit a request to openshift-origin-broker could set the X-Remote-User header, and send the request to a passthrough trigger, resulting in a bypass of the authentication checks to gain access to any OpenShift user account on the system.
openssl: denial of service
Package(s): openssl
CVE #(s): CVE-2010-5298
Created: April 18, 2014
Updated: July 24, 2014
Description: From the Debian advisory: A read buffer can be freed even when it still contains data that is used later on, leading to a use-after-free. Given a race condition in a multi-threaded application it may permit an attacker to inject data from one connection into another or cause denial of service.
otrs: cross-site scripting
Package(s): otrs
CVE #(s): CVE-2014-2553 CVE-2014-2554
Created: April 22, 2014
Updated: June 10, 2014
Description: From the SUSE bug report:

Cross-site scripting (XSS) vulnerability in Open Ticket Request System (OTRS) 3.1.x before 3.1.21, 3.2.x before 3.2.16, and 3.3.x before 3.3.6 allows remote authenticated users to inject arbitrary web script or HTML via vectors related to dynamic fields.
python-django: multiple vulnerabilities
Package(s): python-django
CVE #(s): CVE-2014-0472 CVE-2014-0473 CVE-2014-0474
Created: April 22, 2014
Updated: May 5, 2014
Description: From the Ubuntu advisory:

Benjamin Bach discovered that Django incorrectly handled dotted Python paths when using the reverse() function. An attacker could use this issue to cause Django to import arbitrary modules from the Python path, resulting in possible code execution. (CVE-2014-0472)

Paul McMillan discovered that Django incorrectly cached certain pages that contained CSRF cookies. An attacker could possibly use this flaw to obtain a valid cookie and perform attacks which bypass the CSRF restrictions. (CVE-2014-0473)

Michael Koziarski discovered that Django did not always perform explicit conversion of certain fields when using a MySQL database. An attacker could possibly use this issue to obtain unexpected results. (CVE-2014-0474)
python-django-horizon: cross-site scripting
Package(s): python-django-horizon
CVE #(s): CVE-2014-0157
Created: April 23, 2014
Updated: May 30, 2014
Description: From the CVE entry:

Cross-site scripting (XSS) vulnerability in the Horizon Orchestration dashboard in OpenStack Dashboard (aka Horizon) 2013.2 before 2013.2.4 and icehouse before icehouse-rc2 allows remote attackers to inject arbitrary web script or HTML via the description field of a Heat template.
qemu: code execution
Package(s): qemu
CVE #(s): CVE-2014-0150
Created: April 18, 2014
Updated: December 12, 2014
Description: From the Debian advisory: Michael S. Tsirkin of Red Hat discovered a buffer overflow flaw in the way qemu processed MAC addresses table update requests from the guest. A privileged guest user could use this flaw to corrupt qemu process memory on the host, which could potentially result in arbitrary code execution on the host with the privileges of the qemu process.
qemu-kvm: multiple vulnerabilities
Package(s): qemu-kvm
CVE #(s): CVE-2014-0142 CVE-2014-0143 CVE-2014-0144 CVE-2014-0145 CVE-2014-0146 CVE-2014-0147 CVE-2014-0148
Created: April 23, 2014
Updated: April 23, 2014
Description: From the Red Hat advisory:

Multiple integer overflow, input validation, logic error, and buffer overflow flaws were discovered in various QEMU block drivers. An attacker able to modify a disk image file loaded by a guest could use these flaws to crash the guest, or corrupt QEMU process memory on the host, potentially resulting in arbitrary code execution on the host with the privileges of the QEMU process. (CVE-2014-0143, CVE-2014-0144, CVE-2014-0145, CVE-2014-0147)

A divide-by-zero flaw was found in the seek_to_sector() function of the parallels block driver in QEMU. An attacker able to modify a disk image file loaded by a guest could use this flaw to crash the guest. (CVE-2014-0142)

A NULL pointer dereference flaw was found in the QCOW2 block driver in QEMU. An attacker able to modify a disk image file loaded by a guest could use this flaw to crash the guest. (CVE-2014-0146)

It was found that the block driver for Hyper-V VHDX images did not correctly calculate BAT (Block Allocation Table) entries due to a missing bounds check. An attacker able to modify a disk image file loaded by a guest could use this flaw to crash the guest. (CVE-2014-0148)
rsync: denial of service
Package(s): rsync
CVE #(s): CVE-2014-2855
Created: April 18, 2014
Updated: March 29, 2015
Description: From the Mageia advisory: Ryan Finnie discovered that rsync 3.1.0 contains a denial of service issue when attempting to authenticate using a nonexistent username. A remote attacker could use this flaw to cause a denial of service via CPU consumption.
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel is 3.15-rc2, released on April 20. Linus said: "And on the seventh day the rc release rose again, in accordance with the Scriptures laid down at the kernel summit of the year 2004."
Stable updates: 3.13.11 came out on April 22. Greg has said that he will stop maintaining 3.13 at this point, but the Ubuntu kernel team has taken over support through April 2016.
Quotes of the week
In this repository, you will be able to compile your own kernel module, create a /dev/netcat device and pipe its output into an audio player.
ogg123 - < /dev/netcat
This repository contains the album's track data in source files, that (for complexity's sake) came from .ogg files that were encoded from .wav files that were created from .mp3 files that were encoded from the mastered .wav files which were generated from ProTools final mix .wav files that were created from 24-track analog tape.
Kernel development news
Ktap or BPF?
While the kernel's built-in tracing mechanisms have advanced considerably over the last few years, the kernel still lacks a DTrace-style dynamic tracing facility. In the last year we have seen the posting of two different approaches toward scriptable dynamic tracing: ktap and BPF tracing filters. Both work by embedding a virtual machine in the kernel to execute scripts, but the similarity ends there. Putting one virtual machine into the kernel for tracing is a hard sell; adding two of them is not really seen as an option by anybody involved. So, at some point, a decision will have to be made. A recent discussion on that topic gives some hints about the direction that decision could go.

The trigger for the discussion was the posting of a new version of the ktap patch set after a period of silence. While quite a bit of work has been done on ktap, little was done to address the concerns that kept ktap out of the 3.13 kernel. Ingo Molnar, who blocked the merging of ktap the last time around, was not pleased that progress had not been made on that front.
Virtual machines
There appear to be two specific points of argument that come up when the merits of ktap and BPF tracing filters are discussed. The first of those is, naturally, the question of introducing another virtual machine into the kernel. On this point, the discussion has shifted a bit, though, for a simple reason: while ktap needs its own virtual machine, the BPF engine is already in the mainline kernel, and it has been getting better.
BPF originally stood for "Berkeley packet filter"; it was used as a way to tell the kernel how to narrow down a stream of packets from a network interface when tools like tcpdump are in use. Over time, though, BPF has been used in other contexts, such as filtering access to system calls as part of the seccomp mechanism and in a number of packet classification subsystems. Alexei Starovoitov's tracing filters patch set simply allows this virtual engine to be used to select and process system events as well.
In 2011, BPF gained a just-in-time compiler that sped it up considerably. The 3.15 kernel takes this work further; it will feature a radically reworked (by Alexei) BPF engine that expands its functionality considerably. The new BPF offers the same virtual instruction set to user space, but those instructions are translated within the kernel into a format that is closer to what the hardware provides. The new format offers a number of advantages over the old, including ten registers instead of two, 64-bit registers, more efficient jump instructions, and a mechanism to allow kernel functions to be called from BPF programs. Needless to say, the additional capabilities have further reinforced BPF's position as the virtual machine of choice for an in-kernel dynamic tracing facility.
Thus, if ktap is to be accepted into the kernel, it almost certainly needs to be retargeted to the BPF virtual machine. Ktap author Jovi Zhangwei has expressed a willingness to consider making such a change, but he sees a number of shortcomings in BPF that would need to be resolved first. BPF as it currently exists does not support features needed by ktap, including access to global variables, timer-limited looping (or loops in general, since BPF disallows them by design), and more. Jovi also repeatedly complained about the BPF tracing filter design, which is oriented around attaching scripts to specific tracepoints; Jovi wants a more flexible mechanism that would allow attaching a single script to a range of tracepoints.
That last functionality should not be too hard to add. Most of the rest of Jovi's requests could probably be worked into BPF as well, especially if Jovi were to help to do the work. Alexei seems to be amenable to evolving BPF in ways that would enable it to better support ktap. The communication between the two developers appears to be difficult, though, with frequent misunderstandings being seen. At one point, Jovi concluded that Alexei was not interested in making the necessary changes to BPF; he responded by saying:
In truth, the situation need not be so grim, but there may be a need for an outside developer to come in and actually do the work to integrate ktap and BPF to show that it is possible. Thus far, volunteers to do this work have not made themselves known. And, in any case, there is another issue.
Scripting languages
Ktap is built on the Lua language, which offers a number of features (associative arrays, for example) that can be useful in dynamic tracing settings. Ingo, along with a few others, would rather see a language that looks more like C:
The BPF tracing filters patch uses a restricted version of the C language; Alexei has also provided backends for both GCC and LLVM to translate that language into something the BPF virtual machine can run. So, once again, the BPF approach appears to have a bit of an advantage here at the moment.
Unsurprisingly, Jovi feels differently about this issue; he sees the ktap language as being far simpler to work with. To support this claim, he provided this code from a BPF tracing filter example:
```c
void dropmon(struct bpf_context *ctx)
{
    void *loc;
    uint64_t *drop_cnt;

    loc = (void *)ctx->arg2;

    drop_cnt = bpf_table_lookup(ctx, 0, &loc);
    if (drop_cnt) {
        __sync_fetch_and_add(drop_cnt, 1);
    } else {
        uint64_t init = 0;
        bpf_table_update(ctx, 0, &loc, &init);
    }
}
```
This filter, he says, can be expressed this way in ktap:
```
var s = {}

trace skb:kfree_skb {
    s[arg2] += 1
}
```
Alexei concedes that ktap has a far less verbose source language, though he has reservations about the conciseness of the underlying bytecode. In any case, though, he (along with others) has suggested that, once there is agreement on which virtual machine is to be used, there could be any number of scripting languages supported in user space.
And that is roughly where the discussion wound down. There is a lot of interesting functionality to be found in ktap, but, the way things stand currently, it may well be that this code gets passed over in favor of an offering from a developer who is more willing to do what is needed to get the code upstream. That said, this discussion is far from resolved, and Jovi is not the only developer who is working on ktap. With the application of a bit of energy, it may yet be possible to get ktap's higher-level functionality into a condition where it could someday be merged.
Changing the default shared memory limits
The Linux kernel's System V shared-memory limit has, by default, been fixed at the same value since its inception. Although users can increase this limit, as the amount of memory expected by modern applications has risen over the years, the question has become whether or not it makes sense to simply increase the default setting—including the option of removing the limit altogether. But, as is often the case, it turns out that there are users who have come to expect the shared-memory limit to behave in a particular way, so altering it would produce unintended consequences. Thus, even though no one seems happy with the default setting as it is, how exactly to fix it is not simple.
System V–style shared memory (SHM) is commonly used as an interprocess communication resource; a set of cooperating processes (such as database sessions) can share a segment of memory up to the maximum size allowed by the operating system. That limit can be expressed in terms of bytes per shared segment (SHMMAX), and in terms of the total number of pages used for all SHM segments (SHMALL). On Linux, the default value of SHMMAX has always been set at 32MB, and the default value of SHMALL is defined as:
```c
#define SHMALL (SHMMAX/getpagesize()*(SHMMNI/16))
```
where SHMMNI is the maximum number of SHM segments—4096 by default—which in turn gives a default SHMALL of 2097152 pages. Though they have well-known defaults, both SHMMAX and SHMALL can be adjusted with sysctl. There is also a related parameter setting the minimum size of a shared segment (SHMMIN); unlike the other resource limits, it is set to one byte and cannot be changed.
While most users seem to agree that one byte is a reasonable minimum segment size, the same cannot be said for SHMMAX; 32MB does not go far for today's resource-hungry processes. In fact, it has been routine procedure for several years for users to increase SHMMAX on production systems, and it is standard practice to recommend increasing the limit for most of the popular applications that make use of SHM.
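As a concrete illustration of why raising the limit is routine, here is a minimal, hypothetical C sketch (not something taken from the discussion): on a kernel still using the 32MB default, the 64MB request below fails with EINVAL until an administrator raises kernel.shmmax.

```c
/*
 * A minimal sketch (hypothetical): request a 64MB System V segment.  With
 * the historical 32MB SHMMAX default, shmget() fails with EINVAL until the
 * limit is raised, for example via the kernel.shmmax sysctl.
 */
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    size_t size = 64UL * 1024 * 1024;            /* twice the default SHMMAX */

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0) {
        printf("shmget(%zu bytes) failed: %s\n", size, strerror(errno));
        return 1;
    }
    printf("created segment %d\n", id);
    shmctl(id, IPC_RMID, NULL);                  /* don't leave it lying around */
    return 0;
}
```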
Naturally, many in the community have speculated that it is high time to bump the limit up to some more suitable value, and on March 31, Davidlohr Bueso posted a patch that increased SHMMAX to 128MB. Bueso admitted that the size of the increase was an essentially arbitrary choice (a four-fold bump), but noted in the ensuing discussion that, in practice, users will probably prefer to make their own choice for SHMMAX as a percentage of the total system RAM; bumping up the default merely offers a more sensible starting point for contemporary hardware.
But Andrew Morton argued that increasing the size of the default parameter did not address the underlying issue—that users were frequently hitting what was, fundamentally, an artificial limit with no real reason behind it:
One way to make the problem go away forever would be to eliminate SHMMAX entirely, but as was pointed out in the discussion, administrators probably do want to be able to set some limit to ensure that no user creates a SHM segment that eats up all of the system memory. Motohiro Kosaki suggested setting the default to zero, to stand for "unlimited." Bueso then adopted that approach for the second version of his patch. Since SHMMIN is hardcoded to one, the reasoning goes, a SHMMAX of zero can never be mistaken for a valid setting: either it is the default ("unlimited") or it is the result of an overflow.
The updated patch also set the default value of SHMALL to zero—again representing "unlimited". But removing the limit on the total amount of SHM in this manner revealed a second wrinkle: as Manfred Spraul pointed out, setting SHMALL to zero is currently a move that system administrators (quite reasonably) use to disable SHM allocations entirely; the patch has the unwanted effect of completely reversing the outcome of that move—enabling unlimited SHM allocation.
Spraul subsequently wrote his own alternative patch set that attempts to avoid this issue by instead setting the defaults for SHMMAX and SHMALL to ULONG_MAX, which amounts to setting them to infinity. This solution is not without its risks, either; in particular, there are known cases where an application simply tries to increment the value of SHMMAX rather than setting it, which causes an overflow. The result would be that applications would encounter the wrong value for SHMMAX—most likely a value far smaller than they need, causing their SHM allocation attempts to fail.
Nevertheless, Bueso concurred that avoiding the reversal of behavior for manually setting SHMALL to zero was a good thing, and signed off on Spraul's approach. The latest version of Spraul's patch set attempts to avoid the overflow issue by using ULONG_MAX - 1L<<24 instead, but he admits that ultimately there is nothing preventing users from causing overflows when left to their own devices.
One final concern stemming from this change is that if a system implements no upper limits on SHM allocation, it will be possible for users to consume all of the available memory as SHM segments. If such a greedy allocation happens, however, the out-of-memory (OOM) killer will not be able to free that memory. The solution is for administrators to either enable the shm_rmid_forced option (which forces each SHM segment to be created with the IPC_RMID flag—guaranteeing that it is associated with at least one process, which in turn ensures that when the OOM killer kills the guilty process, the SHM segment vanishes with it) or to manually set SHM limits.
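For reference, the knob in question lives under /proc/sys/kernel. The hypothetical sketch below shows it being enabled from C purely for consistency with the other examples in this issue; an administrator would normally just run "sysctl kernel.shm_rmid_forced=1".

```c
/*
 * A minimal sketch (hypothetical): enable shm_rmid_forced by writing to its
 * procfs entry.  Requires root.
 */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/kernel/shm_rmid_forced", "w");

    if (f == NULL) {
        perror("shm_rmid_forced");
        return 1;
    }
    fputs("1\n", f);   /* segments now behave as if created with IPC_RMID */
    fclose(f);
    return 0;
}
```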
Since the desire to avoid manually configuring SHM limits was the original goal of the effort, it might seem as if the effort has come full circle. But, for the vast majority of users, removing the ancient defaults is a welcome improvement. Rogue users attempting to allocate all of the memory in a shared segment are at best an anomaly (and certainly something that administrators should stay on the lookout for), whereas the old default 32MB SHM size has long been problematic for a wide variety of users in need of shared memory.
Loopback NFS: theory and practice
The Linux NFS developers have long known that mounting an NFS filesystem onto the same host that is exporting it (sometimes referred to as a loopback or localhost NFS mount) can lead to deadlocks. Beyond one patch posted over ten years ago, little effort has been put into resolving the situation as no convincing use case was ever presented. Testing of the NFS implementation can certainly benefit from loopback mounts; this use probably triggered the mentioned patch. With that fix in place, the remaining deadlocks do take some effort to trigger, so the advice to testers was essentially "be careful and you should be safe".
For any other use case, it would seem that using a "bind" mount would provide a result that is indistinguishable from a loopback NFS mount. In short: if it hurts when you use a loopback NFS mount, then simply don't do that. However, a convincing use case recently came to light which motivated more thought on the issue. It led this author on an educational tour of the interaction between filesystems and memory management, and produced a recently posted patch set (replacing an earlier attempt) which removes most, and hopefully all, such deadlocks.
A simple cluster filesystem
That use case involves using NFS as the filesystem in a high-availability cluster where all hosts have shared access to the storage. For all nodes in the cluster to be able to access the storage equally, you need some sort of cluster filesystem like OCFS2, Ceph, or GlusterFS. If the cluster doesn't need particularly high levels of throughput and if the system administrator prefers to stick with known technology, NFS can provide a simple and tempting alternative.
To use NFS as a cluster filesystem, you mount the storage on an arbitrary node using a local filesystem (ext4, XFS, Btrfs, etc), export that filesystem using NFS, then mount the NFS filesystem on all other nodes. The node exporting the filesystem can make it appear in the local namespace in the desired location using bind mounts and no loopback NFS is needed — at least initially.
As this is a high-availability cluster, it must be able to survive the failure of any node, and particularly the failure of the node running the NFS server. When this happens, the cluster-management software can mount the filesystem somewhere else. The new owner of the filesystem can export it via NFS and take over the IP address of the failed host; all nodes will smoothly be able to access the shared storage again. All nodes, that is, except the node which has taken over as the NFS server.
The new NFS-serving node will still have the shared filesystem mounted via NFS and will now be accessing it as a loopback NFS mount. As such, it will be susceptible to all the deadlocks that have led us to recommend against loopback NFS mounts in the past. In this case, it is not possible to "simply use bind mounts" as the filesystem is already mounted, applications are already using it and have files open, etc. Unmounting that filesystem would require stopping those applications — an action which is clearly contrary to the high-availability goal.
This scenario is clearly useful, and clearly doesn't work. So what was previously a wishlist item, and quite far from the top of the list at that, has now become a bug that needs fixing.
Theory meets practice
The deadlocks that this scenario triggers generally involve a sequence of events like: (1) the NFS server tries to allocate memory, (2) the memory allocator then tries to free memory by writing some pages out to the filesystem via the NFS client, and (3) the NFS client waits for the NFS server to make some progress. My assumption had been that this deadlock was inevitable because the same memory manager was trying to serve two separate but competing users: the NFS client and the NFS server.
A possible fix might be to run the NFS server inside a virtual machine, and to give this VM a fixed and locked allocation of memory so there would not be any competition. This would work, but it is hardly the simple solution that our administrator was after and would likely present challenges in sizing the VM for optimal performance.
This is where I might have left things had not a colleague, Ulrich Schairer, presented me with a system which was deadlocking exactly as described and said, effectively, "It's broken, please fix". I reasoned that analyzing the deadlock would at least allow me to find a precise answer as to why it cannot work; it ended up leading to much more than that. After a sequence of patches and re-tests, it became clear that there were two classes of problem, each of which differed in important ways from the problem which was addressed 10 years ago. Trying to understand these problems led to an exploration of the nature and history of the various mechanisms already present in Linux to avoid memory-allocation deadlocks, as reported on last week.
With that context, it might seem that some manipulation of the __GFP_FS and/or PF_FSTRANS flags should allow the deadlock to be resolved. If we think of nfsd as simply being the lower levels of the NFS filesystem, then the deadlock involves a lower layer of a filesystem allocating memory and thus triggering writeout to that same filesystem. This is exactly the deadlock that __GFP_FS was designed to prevent, and, in fact, setting PF_FSTRANS in all nfsd threads did fix the deadlock that was the easiest to hit.
Further investigation revealed, as it often does, that reality is sometimes more complex than theory might suggest. Using the __GFP_FS infrastructure, either directly or through PF_FSTRANS, turns out to be neither sufficient, nor desirable, as a solution to the problems with loopback NFS mounts. The remainder of this article explores why it is not sufficient and next week we will conclude with an explanation of why it isn't even desirable.
A pivotal patch
Central to understanding both sides of this problem is a change that happened in Linux 3.2. This change was authored by my colleague Mel Gorman who fortunately sits just on the other side of the Internet from me and has greatly helped my understanding of some of these issues (and provided valuable review of early versions of this article). This patch series changed the interplay between memory reclaim and filesystem writeout in a way that, while not actually invalidating __GFP_FS, changed its importance.
Prior to 3.2, one of the several strategies that memory reclaim would follow was to initiate writeout of any dirty filesystem pages that it found. Writing a dirty page's contents to persistent storage is an obvious requirement before the page itself can be freed, so it would seem to make sense to do it while looking for pages to free. Unfortunately, it had some serious negative side effects.
One side effect was the amount of kernel stack space that could be used. The writepage() function in some filesystems can be quite complex and, as a result, can quite reasonably use a lot of stack space. If a memory allocation request was made in some unrelated code that also used a lot of stack space, then the fact that memory allocation led directly to memory reclaim and, from there, to filesystem writeout, meant that two heavy users of stack space could be joined together, significantly increasing the total amount of stack space that could be required. In some cases, the amount of space needed could exceed the size of the kernel stack.
Another side effect is that pages could be written out in an unpredictable order. Filesystems tend to be optimized to write pages out in the order they appear in the file, first page first. This allows space on the storage device to be allocated optimally and allows multiple pages to be easily grouped into fewer, larger writes. When multiple processes are each trying to reclaim memory, and each is writing out any dirty pages it finds, the result is somewhat less orderly than we might like.
Hence the change in Linux 3.2 removed writeout from direct reclaim, leaving it to be done by kswapd or the various filesystem writeback threads. In such a complex system as Linux memory management, a little change like that should be expected to have significant follow-on effects, and the patch mentioned above was really just the first of a short series which made the main change and then made some adjustments to restore balance. The particular adjustment which interests us was to add a small delay during reclaim.
Waiting for writeout
The writeout code that was removed would normally avoid submitting a write if doing so might block. This can happen if the block I/O request queue is full and the submission needs to wait for a free slot; it can be avoided by checking if the backing device is "congested". However, if the process that is allocating memory is in the middle of writing to a file on a particular device, and the memory reclaim code finds a dirty page that can be written to that same device, then it skips the congestion test and, thus, it may well block. This has the effect of slowing down any process writing to a device to match the speed of the device itself and is important for keeping balance in memory management.
With the change so that direct reclaim would no longer write out dirty file pages, this delay no longer happened (though the backing_dev_info field of the task structure which enabled the delay is still present with no useful purpose). In its place, we get an explicit small delay if all the dirty pages looked at are waiting for a congested backing device. This delay causes problems for loopback NFS mounts. In contrast to the implicit delay present before Linux 3.2, this delay is not avoided by clearing __GFP_FS. This is why using __GFP_FS or PF_FSTRANS is not sufficient.
Understanding this problem requires an understanding of the "backing device" object, an abstraction within Linux that holds some important information about the storage device underlying a filesystem. This information includes the recommended read-ahead size and the request queue length — and also whether the device is congested or not. For local filesystems struct backing_dev_info maps directly to the underlying block device (though, for Btrfs, which can have multiple block devices, there are extra challenges). For NFS, the queue in this structure is a list of requests to be sent to the NFS server rather than to a physical device. When this queue reaches a predefined size, the backing device for the NFS filesystem will be designated as "congested".
If the backing device for a loopback-mounted NFS filesystem ever gets congested while memory is tight, we have a problem. As nfsd tries to allocate pages to execute write requests, it will periodically enter reclaim and, as the NFS backing device is congested, it will be forced to sleep for 100ms. This delay will slow NFS throughput down to several kilobytes per second and so will ensure that the NFS backing device remains congested. This does not actually result in a deadlock as forward progress is achieved, but it is a livelock resulting in severely reduced throughput, which is nearly as bad.
This situation is very specific to our NFS scenario, as the problem is caused by a backing device writing into the page cache. It is not really a general filesystem recursion issue, so it is not the same sort of problem that might be addressed with suitable use of __GFP_FS.
Learning from history
This issue is, however, similar to the problem from ten years ago that was fixed by the patch mentioned in the introduction. In that case, the problem was that a process which was dirtying pages would be slowed down until a matching number of dirty pages had been written out. When this happened, nfsd could end up being blocked until nfsd had written out some pages, thus producing a deadlock. In our present case, the delay happens when reclaiming memory rather than when dirtying memory, and the delay has an upper limit of 100ms, but otherwise it is a similar problem.
The solution back then was to add a per-process flag called PF_LESS_THROTTLE, which was set only for nfsd threads. This flag increased the threshold at which the process would be slowed down (or "throttled") and so broke the deadlock.
There are two important ideas to be seen in that patch: use a per-process flag, and do not remove the throttling completely but relax it just enough to avoid the deadlock. If nfsd were not throttled at all when dirtying pages, that would just cause other problems.
With our 100ms delay, it is easy to add a test for the same per-process flag, but the sense in which the delay should only be partially removed is somewhat less obvious.
The problem occurs when nfsd is writing to a local filesystem, but the NFS queue is congested. nfsd should probably still be throttled when that local filesystem is congested, but not when the NFS queue is congested. If other queues are congested, it probably doesn't matter very much whether nfsd is throttled or not, though there is probably a small preference in favor of throttling.
As the backing_dev_info field of the task structure was (fortuitously) not removed when direct-reclaim writeback was removed in 3.2, we can easily use PF_LESS_THROTTLE to avoid the delay in cases where current->backing_dev_info (i.e. the backing device that nfsd is writing to) is not congested. This may not be completely ideal, but it is simple and meets the key requirements, so should be safe ... providing it doesn't upset other users of PF_LESS_THROTTLE.
Though PF_LESS_THROTTLE has only ever been used in nfsd, there have been various patches proposed between 2005 and 2013 adding the flag to the writeback process used by the loop block device, which makes a regular file behave like a block device. This process is in exactly the same situation as nfsd: it implements a backing device by writing into the page cache. As such, it can be expected to face exactly the same problems as described above and would equally benefit from having PF_LESS_THROTTLE set and having that flag bypass the 100ms delay. It is probably only a matter of time before some patch to add PF_LESS_THROTTLE to loop devices will be accepted.
There are three other places where direct reclaim can be throttled. The first is the function throttle_direct_reclaim(), which was added in 3.6 as part of swap-over-NFS support. This throttling is explicitly disabled for any kernel threads (i.e. processes with no user-space component). As both nfsd and the loop device thread are kernel threads, this function cannot affect users of PF_LESS_THROTTLE and so need not concern us.
The other two are in shrink_inactive_list() (the same function which holds the primary source of our present pain). The first of these repeatedly calls congestion_wait() until there aren't too many processes reclaiming memory at the same time, as this can upset some heuristics. This has previously led to a deadlock that was fixed by avoiding the delay whenever __GFP_FS or __GFP_IO was cleared. Further discussion of this will be left to next time, when we examine the use of __GFP_FS more closely.
The last delay is near the end of shrink_inactive_list(); it adds an extra delay (via congestion_wait() again) when it appears that the flusher threads are struggling to make progress. While a livelock triggered by this delay has not been seen in testing, it is conceivable that the flusher thread could block when the NFS queue is congested; that could lead to nfsd suffering this delay as well and so keeping the queue congested. Avoiding this delay in the same conditions as the other delay seems advisable.
One down, one to go
With the livelocks under control, not only for loopback NFS mounts but potentially for the loop block device as well, we only need to deal with one remaining deadlock. As we found with this first problem, the actual change required will be rather small. The effort to understand and justify that change, which will be explored next week, will be somewhat more substantial.
Patches and updates
Kernel trees
Architecture-specific
Core kernel code
Development tools
Device drivers
Documentation
Filesystems and block I/O
Memory management
Security-related
Virtualization and containers
Miscellaneous
Page editor: Jonathan Corbet
Distributions
Fedora's firewall furor
For many Fedora users, one of the first steps taken after installing a new machine is to turn off SELinux — though SELinux has indeed become less obstructive over time. In a similar vein, many users end up turning off the network firewall that Fedora sets up by default. The firewall has a discouraging tendency to get in the way as soon as one moves beyond simple web-browsing and email applications and, while it is possible to tweak the firewall to make a specific application work, few users have the knowledge or the patience to do that. So they just make the whole thing go away. That's why it has been proposed to turn off the firewall by default for the Fedora 21 Workstation product. While the proposal has been rejected, for now anyway, the reasons behind it remain. The change proposal reads this way:
As one might imagine, this proposal led to a lengthy discussion once it hit the Fedora development list. To those who oppose the proposal, it is seen as an admission of defeat: firewalling is too hard, so Fedora's developers are simply giving up and, in the process, making the system less secure and repeating mistakes made by other operating systems in the past. Others, though, question the level of real-world security provided by the firewall, especially when many users just end up turning it off anyway. But, behind all the noise, there actually appears to be a sort of consensus on how things should actually work; it's just that nobody really knows how to get to that point.
Nobody, obviously, wants a Fedora system to be insecure by default (unless perhaps they work for the NSA). But there is a desire for the applications installed by the user to simply work without the need for firewall tweaking. Beyond that, the set of services that should work should depend on the network environment that the system finds itself in. When the system is connected to a trusted network (at home, say), more services should be reachable than when it is connected to a dodgy airport wireless network. Fedora's firewalld tries to make the right thing happen in all cases, but the problem turns out to be hard.
Once the firewall is in place, any network service that is not explicitly allowed will not function properly. That is why the first attempt to set up something as simple as an SSH server on a Fedora system is usually doomed to failure. There are a couple of mechanisms that could address this problem, but they have issues of their own. The first possible solution is to provide an API to allow applications to request the opening of a hole in the firewall for a specific port. Firewalld already supports this API via D-Bus, but it is hard to see this API as a full solution to the problem for a couple of reasons:
- Firewalld is a Fedora-specific project, so its API, too, is naturally a Fedora-only affair. As a result, application developers are reluctant to include firewalld support in their code; it's an additional maintenance burden for a relatively small group of additional users.
- Users may not want an application to be able to silently perforate the firewall in all settings, especially if they are worried about malicious software running on their own systems.
A potential solution to the second problem (and, perhaps the first) is to put a mechanism in place to notice when an application is trying to listen on a firewalled port and ask the user if the port should be opened. The problem with this approach was nicely summarized by Daniel Walsh, who said: "Nothing worse than asking users security-related questions about opening firewall ports. Users will just answer yes, whether or not they are being hacked."
Even if they take the time to answer carefully, users will tend to get annoyed by security questions in short order. As a general rule, the "ask the user" approach tends not to work out well.
An alternative is to try to do the right thing depending on the network the system is connected to. On a trusted network, the firewall could allow almost all connections and services will just work. When connecting to a coffee-shop network, instead, a much tighter firewall would be in place. As it happens, firewalld was designed to facilitate this type of policy as well; it allows the placement of both networks and applications into "zones." When the system is on a network assigned to a specific zone, only applications which have been enabled for that zone will be able to open reachable ports to the world.
The current setup does not lack for zones; indeed, there are nine of them with names that vary from "trusted" to "external," "dmz," or "drop." As Matthias Clasen pointed out, this is far too many zones for most users to know what to do with, and there is no real information about what the differences between them are. Configuration is via a set of XML files; NetworkManager can put networks into zones if one digs far enough into the dialogs, but there is little help for users wanting to know what a specific zone means or how it can be changed.
There seems to be a rough consensus that, if firewalld had a more usable zones system, it could be left enabled by default. The move to disable the firewall is a clear statement that, in some minds at least, firewalld cannot be fixed in the Fedora 21 time frame. There is, however, one approach that might work: reducing the number of zones considerably. In fact, in a related discussion last February, Christian Schaller suggested that all the system needs by default is two zones: trusted and untrusted. When NetworkManager connects to a new network, it can ask the user whether that network is trusted or not and set the firewall accordingly.
This idea seemed to gain some favor in both discussions, but it is not clear that somebody will get around to actually making it work; that may need to change soon, though. On April 23, the Fedora Engineering Steering Committee discussed the proposal to disable the firewall and, with a five-to-two vote, rejected it. So the Fedora 21 workstation product will probably have a firewall by default, but how that firewall will work still needs to be figured out.
Brief items
Distribution quotes of the week
Debian 6.0 to get long-term support
The Debian project has announced that the security support period for the 6.0 ("squeeze") release has been extended by nearly two years; it now runs out in February 2016. At the end, squeeze will have received a full five years of security support. "squeeze-lts is only going to support i386 and amd64. If you're running a different architecture you need to upgrade to Debian 7 (wheezy). Also there are going to be a few packages which will not be supported in squeeze-lts (e.g. a few web-based applications which cannot be supported for five years). There will be a tool to detect such unsupported packages."
Ubuntu 14.04 LTS (Trusty Tahr) released
Ubuntu has announced the release of its latest long-term support distribution: Ubuntu 14.04 LTS (aka "Trusty Tahr"). The release notes have all the details. It comes in a multitude of configurations, for desktops, servers, the cloud, phones, and tablets; also in many flavors: Kubuntu, Edubuntu, Xubuntu, Lubuntu, Ubuntu GNOME, Ubuntu Kylin, and Ubuntu Studio. "Ubuntu 14.04 LTS is the first long-term support release with support for the new "arm64" architecture for 64-bit ARM systems, as well as the "ppc64el" architecture for little-endian 64-bit POWER systems. This release also includes several subtle but welcome improvements to Unity, AppArmor, and a host of other great software."
Newsletters and articles of interest
Distribution newsletters
- DistroWatch Weekly, Issue 555 (April 21)
- Five Things in Fedora This Week (April 22)
- Ubuntu Weekly Newsletter, Issue 364 (April 20)
Shuttleworth: U talking to me?
Ubuntu's Trusty Tahr has been released and that means it's time for a new development branch. Mark Shuttleworth has announced the name of the next Ubuntu release. "So bring your upstanding best to the table – or the forum – or the mailing list – and let’s make something amazing. Something unified and upright, something about which we can be universally proud. And since we’re getting that once-every-two-years chance to make fresh starts and dream unconstrained dreams about what the future should look like, we may as well go all out and give it a dreamlike name. Let’s get going on the utopic unicorn."
Emmabuntüs: A philanthropist’s GNU/Linux (muktware)
Muktware takes a quick look at Emmabuntüs. "Emmabuntüs is a desktop GNU/Linux distribution which originated in France with a humanitarian mission. It was designed with 4 primary objectives – refurbishing of computers given to humanitarian organizations like the Emmaüs communities, promoting GNU/Linux among beginners, extending the life of older equipments and reducing waste by over-consumption of raw materials."
Page editor: Rebecca Sobol
Development
Testing your full software stack with cwrap
Testing network applications correctly is hard. The biggest challenge is often to set up the environment to test a client/server application. One option is to set up several virtual machines or containers and run a full client/server interaction between them. But building this environment might not always be possible; for example, some build systems have no network at all and run as a non-privileged user. Also, for newcomers who want to contribute code to your project, setting up that kind of development environment is often a difficult and time-consuming task.
Reading and running the test cases is normally a good entry point toward understanding a project, because you learn how it is set up and how you need to use the API to achieve your goal. For these reasons, it would be preferable if there were a way to run the tests locally as a non-root user, while still being able to run in an environment as close to the real world as possible. Avoiding the testing of code that requires elevated privileges or networking is usually not an option, because many projects have a test-driven development model: to submit new code or to fix issues, a test case is required so that regressions are avoided.
The cwrap project
The cwrap project aims to help client/server software development teams that are trying to gain full functional test coverage. It makes it possible to run several instances of the full software stack on the same machine and perform local functional testing of complex network configurations. Daemons often require privilege separation and local user and group accounts, separate from the hosting system. The cwrap project does not require virtualization or root credentials and can be used on different operating systems.
It is basically like The Matrix, where reality is simulated and everything is a lie.
cwrap is a new project, but the ideas and the origin of the code come from the Samba codebase. cwrap exposes the internals of one of the most advanced FOSS testing systems, one that has helped Samba developers test their protocol implementations for many years. Samba is complex: it provides several server components that need to interact with each other, along with a client executable, a client library, and a testing suite called smbtorture. These need to be run against different server setups to test the correctness of the protocols and server components.
In trying to test your server, you may run into some problems. Your server might need to open privileged ports, which requires superuser access. If you need to run several instances of daemons for different tasks, then the setup becomes more complex. An example would be testing an SSH client with Kerberos: you need a KDC (key distribution center) and an SSH server. If you provide login or authentication functionality, user and group accounts on the system are required, which means each machine you run the tests on needs to have the same users. To be able to switch to a user after authentication, you have to be root in the first place. All these things make testing harder and the setup more complex.
What you actually want is to be able to run all required components on a single machine: the one a developer is working on. All tests should work as a normal non-privileged user. So what you really want is to just run make test and wait till all tests are finished.
The cwrap project enables you to set up such an environment easily by providing three libraries you can preload to any binary.
What is preloading?
Preloading is a feature of the dynamic linker (ld.so). It is available on most Unix systems and allows loading a user-specified shared library before all other shared libraries that are linked to an executable.
Library preloading is most commonly used when you need a custom version of a library function to be called. You might want to implement your own malloc(3) and free(3) functions that would perform rudimentary leak checking or memory access control for example, or you might want to extend the I/O calls to dump data when reverse engineering a binary blob. In those cases, the library to be preloaded would implement the functions you want to override. Only functions in dynamically loaded libraries can be overridden. You're not able to override a function the application implements by itself or links statically with. More details can be found in the man page of ld.so.
The wrappers use preloading to supply their own variants of several system or library calls suitable for unit testing of networked software or privilege separation. For example, the socket_wrapper includes its version of most of the standard API calls used to communicate over sockets. Its version routes the communication over local sockets.
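To make the mechanism concrete, here is a toy interposer (a hypothetical sketch with made-up file names, not socket_wrapper's actual code) that overrides connect(), logs the call, and then forwards it to the real implementation located with dlsym(RTLD_NEXT, ...):

```c
/*
 * toy.c: a hypothetical LD_PRELOAD interposer, not part of cwrap.
 * Build: gcc -shared -fPIC -o libtoy.so toy.c -ldl
 * Use:   LD_PRELOAD=./libtoy.so some_network_program
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <sys/socket.h>

int connect(int fd, const struct sockaddr *addr, socklen_t len)
{
    /* Look up the next (real) connect() in the library search order. */
    static int (*real_connect)(int, const struct sockaddr *, socklen_t);

    if (real_connect == NULL)
        real_connect = (int (*)(int, const struct sockaddr *, socklen_t))
                       dlsym(RTLD_NEXT, "connect");

    fprintf(stderr, "toy preload: connect() on fd %d\n", fd);
    return real_connect(fd, addr, len);
}
```

socket_wrapper does essentially this for the whole socket API, except that instead of simply forwarding to the real functions it reroutes the traffic over Unix sockets under SOCKET_WRAPPER_DIR.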
The wrappers
cwrap consists of three different wrappers. Each of them implements a set of functions to fulfill a testing task. There is socket_wrapper, nss_wrapper and uid_wrapper.
socket_wrapper
This library redirects all network communication to happen over Unix sockets, emulating both IPv4 and IPv6 sockets and addresses. This allows you to start several daemons of a server component on the same machine without any conflicts. You are also able to simulate binding to privileged ports below port 1024, which normally requires root privileges. If you need to understand the packet flow to see what is happening on the wire, you can also capture the network traffic in pcap format and view it later with tools such as Wireshark.
The idea and the first incarnation of socket_wrapper came from Jelmer Vernooij in 2005. It made it possible to run the Samba torture suite against smbd in make test. From that point in time, we started to write more and more automated tests, and we needed more wrappers as the test setup became increasingly complex. With Samba 4.0, we needed to test the user and group management of an Active Directory server and make it simple for developers to do that. The technology has been in use and tested for a while now, but because the code was embedded in the Samba source tree, it wasn't possible to use it outside of the Samba code base. The cwrap project now makes this possible.
There are some features in development, like support for IP_PKTINFO in auxiliary messages of sendmsg() and recvmsg(). We would also like to add support for fd-passing with auxiliary messages soon, in order to implement and test some new features for the Samba DCERPC infrastructure.
Let's take a look at how socket_wrapper works on a single machine. Here is a demo you can run yourself after you have installed it:
```
# Open a console and create a directory for the unix sockets.
$ mktemp -d
/tmp/tmp.bQRELqDrhM

# Then start nc to listen for network traffic using the temporary directory.
$ LD_PRELOAD=libsocket_wrapper.so \
  SOCKET_WRAPPER_DIR=/tmp/tmp.bQRELqDrhM \
  SOCKET_WRAPPER_DEFAULT_IFACE=10 nc -v -l 127.0.0.10 7

# nc listens on 127.0.0.10 because it is specified on the command line
# and it corresponds to the SOCKET_WRAPPER_DEFAULT_IFACE value specified.

# Now open another console and start 'nc' as a client to connect to the server:
$ LD_PRELOAD=libsocket_wrapper.so \
  SOCKET_WRAPPER_DIR=/tmp/tmp.bQRELqDrhM \
  SOCKET_WRAPPER_DEFAULT_IFACE=100 \
  SOCKET_WRAPPER_PCAP_FILE=/tmp/sw.pcap nc -v 127.0.0.10 7

# (The client will use the address 127.0.0.100 when connecting to the server.)

# Now you can type 'Hello!' which will be sent to the server and should appear
# in the console output of the server.

# When you have finished, you can examine the network packet dump with
# "wireshark /tmp/sw.pcap"
```
nss_wrapper
There are projects that provide daemons needing to be able to create, modify, and delete Unix users. Others just switch user IDs to interact with the system on behalf of another user (e.g. a user space file server). To be able to test these, you need the privilege to modify the passwd and group files. With nss_wrapper it is possible to define your own passwd and group files which will be used by the software while it is under test.
If you have a client and server under test, they normally use functions to resolve network names to addresses (DNS) or vice versa. The nss_wrapper allows you to create a hosts file to set up name resolution for the addresses you use with socket_wrapper.
The user, group, and hosts functionality are all defined as wrappers around the Name Service Switch (NSS) API. The Name Service Switch is a modular system, used by most Unix systems, that allows you to fetch information from several databases (users, groups, hosts, and more) using loadable modules. The list and order of modules is configured in the file /etc/nsswitch.conf. Usually, the nsswitch.conf file contains the "files" module shipped with glibc that looks up users in /etc/passwd, groups in /etc/group, and hosts in /etc/hosts. But it's also possible to define additional sources of information by configuring third party modules — a good example might be looking up users from LDAP using nss_ldap.
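Because the interception happens at the NSS API level, it is not only getent that is redirected: any lookup the software under test makes through the C library sees the wrapper's files. Here is a minimal, hypothetical sketch; the user "bob" is assumed to exist only in a test passwd file like the one created in the example below.

```c
/* A hypothetical client of nss_wrapper, not from the project itself. */
#include <stdio.h>
#include <pwd.h>

int main(void)
{
    /* With nss_wrapper preloaded and NSS_WRAPPER_PASSWD pointing at a test
     * file containing "bob", this succeeds without touching /etc/passwd. */
    struct passwd *pw = getpwnam("bob");

    if (pw == NULL) {
        puts("no such user");
        return 1;
    }
    printf("uid=%u home=%s\n", (unsigned)pw->pw_uid, pw->pw_dir);
    return 0;
}
```

Run it as, for example, LD_PRELOAD=libnss_wrapper.so NSS_WRAPPER_PASSWD=passwd ./a.out, using the passwd file created below.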
Here is an example of using nss_wrapper to handle users and groups:
$ echo "bob:x:1000:1000:Bob Gecos:/home/test/bob:/bin/false" > passwd $ echo "root:x:65534:65532:Root user:/home/test/root:/bin/false" >> passwd $ echo "users:x:1000:" > group $ echo "root:x:65532:" >> group $ LD_PRELOAD=libnss_wrapper.so NSS_WRAPPER_PASSWD=passwd \ NSS_WRAPPER_GROUP=group getent passwd bob bob:x:1000:1000:Bob Gecos:/home/test/bob:/bin/falseThe following shows nss_wrapper faking the host name:
```
$ LD_PRELOAD=libnss_wrapper.so NSS_WRAPPER_HOSTNAME=test.example.org hostname
test.example.org
```
Here, nss_wrapper simulates host name resolution:
$ echo "fd00::5357:5faa test.cwrap.org" > hosts $ echo "127.0.0.170 test.cwrap.org" >> hosts # Now query ahostsv6 which returns only IPv6 addresses and # calls getaddrinfo() for each the entry. $ LD_PRELOAD="libnss_wrapper.so" NSS_WRAPPER_HOSTS=hosts \ getent ahostsv6 test.cwrap.org fd00::5357:5faa DGRAM test.cwrap.org fd00::5357:5faa STREAM test.cwrap.org
uid_wrapper
Some projects, such as a file server, need privilege separation to be able to switch to the user who owns the files and do file operations on their behalf. uid_wrapper convincingly lies to the application, letting it believe it is operating as root and even switching between UIDs and GIDs as needed. You can start any application making it believe it is running as root. We will demonstrate this later. You should keep in mind that you will not gain more permissions or privileges with uid_wrapper than you currently have; remember it is The Matrix.
Maybe you know that, on Linux, user and group IDs are actually per-thread attributes, and that glibc synchronizes changes across all threads: calling setuid(1000), for example, changes every thread to the given UID. The setuid(), setgid(), etc. functions send a signal to each thread, telling it that it should change the relevant ID; the signal handler in each thread then uses syscall() with the corresponding SYS_setXid constant to change the ID of the local thread. So, under glibc, if you want to change the UID only for the local thread, you have to make the system call directly:
```c
rc = syscall(SYS_setreuid, 1000, 0);
```
uid_wrapper has support for glibc's special privilege separation with threads. It intercepts calls to syscall() to handle the remapping of UIDs and GIDs. Here is an example of uid_wrapper in action:
```
$ LD_PRELOAD=libuid_wrapper.so UID_WRAPPER=1 UID_WRAPPER_ROOT=1 id
uid=0(root) gid=0(root) groups=100(users),0(root)
```
How are the wrappers tested?
You may sense a bit of a conflict of interest with wrappers. On one hand, this article stated that unit tests with wrappers strive to simulate the real-world environment as closely as possible. On the other hand, the wrappers substitute such fundamental calls as socket() and getpwnam(). It's paramount that the wrappers be extremely well tested so that you, as a user of the wrappers, are confident that any failure in testing implemented using the wrappers is a bug in the program under test and not an unwanted side effect of the wrappers. To this end, the wrappers include a large unit test suite that makes sure the wrappers function as intended. At the time of this writing, the code coverage is pretty high: nss_wrapper 79%, socket_wrapper 77%, and uid_wrapper 85%.
As an example of a unit test, the socket_wrapper implements a very simple echo server. The unit tests that exercise the read() or write() calls then connect to the echo server instance that is seemingly running on a privileged port. In fact, the echo server is run using socket_wrapper, so all communication is redirected over a local socket. You can inspect the unit test in the Samba repository. The CMakeLists.txt file also gives a good overview of how the tests are set up.
The wrappers leverage the cmocka unit testing framework that was covered in an earlier LWN article. In short, the cmocka library provides unit test developers with the ability to use mock objects. Moreover, the cmocka library has a very low dependency footprint; in fact, it requires only the standard C library.
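For those who have not seen cmocka, a test looks roughly like the following minimal, hypothetical example (it uses the group-runner API of current cmocka releases and is not one of the wrappers' actual tests):

```c
/*
 * A hypothetical cmocka test, not taken from the cwrap test suites.
 * Build with: gcc test_add.c -o test_add -lcmocka
 */
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <setjmp.h>
#include <cmocka.h>

/* The "code under test", inlined to keep the example self-contained. */
static int add(int a, int b)
{
    return a + b;
}

static void test_add(void **state)
{
    (void)state;                       /* no fixture needed here */
    assert_int_equal(add(2, 2), 4);
}

int main(void)
{
    const struct CMUnitTest tests[] = {
        cmocka_unit_test(test_add),
    };

    return cmocka_run_group_tests(tests, NULL, NULL);
}
```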
All the wrapper libraries are built using the cmake build system. In order to provide cwrap developers with an easy-to-use dashboard that displays the results of unit tests, an instance of the cdash dashboard is running and scheduling tests on several operating systems including several Linux distributions, FreeBSD, and OpenIndiana (descended from OpenSolaris). Currently the i686 and x86_64 architectures are tested. The dashboard is a one-stop view that lets you see if any of the unit tests has trouble or if compiling the wrappers or their unit tests yields any compiler errors or warnings.
Final thoughts
Regular LWN readers may have read about namespaces in Linux. These provide functionality similar to the lightweight virtualization mechanism known as containers. But to set up namespaces, you will often need root privileges. When distributions enable user namespaces, that requirement will go away, but there is another problem: namespaces are not available on BSD or Solaris.
Currently Samba is the only user of the cwrap libraries, since cwrap was not available for external consumption until recently. Andreas is currently working on cwrap integration to test libssh against an OpenSSH sshd server. We are also planning to improve the test environment of SSSD, but we haven't had time to work on it yet. At Red Hat, Quality Engineering has started to write tests for nss_ldap using nss_wrapper, but they are not upstream yet. If you plan to use cwrap, join us on the #cwrap IRC channel on Freenode.
Brief items
Quotes of the week
QEMU 2.0.0 released
The QEMU team has announced the release of version 2.0.0 of the QEMU "open source machine emulator and virtualizer". New features in the release include support for KVM on AArch64 (64-bit ARM) systems, support for all 64-bit ARMV8 instructions (other than the optional CRC and crypto extensions), support for the Allwinner A10-based cubieboard, CPU hotplug for Q35 x86 systems, better Windows guest performance when doing many floating-point or SIMD operations, live snapshot merging, new management interfaces for CPU and virtio-rng hotplug, direct access to NFSv3 shares using libnfs, and lots more. Detailed information about all of the changes can be found in the changelog.
ISC releases BIND 10 1.2, renames it, and turns it over to community
Internet Systems Consortium, the non-profit behind the BIND DNS server, has released version 1.2 of BIND 10, which is the last release it will make of the "applications framework for Internet infrastructure, such as DNS". That completes ISC's development effort on BIND 10, so it has renamed the project to Bundy and turned it over to the community for updates and maintenance.
"'BIND 10 is an excellent software system,' said Scott Mann, ISC's Vice President of Engineering, 'and a huge step forward in open-source infrastructure software. Unfortunately, we do not have the resources to continue development on both projects, and BIND 9 is much more widely used.' 'The BIND 10 software is open-source,' Scott added, 'so we are making it available for anyone who wants to continue its development. The source will be available from GitHub under the name Bundy, to mitigate the confusion between it and ISC's BIND 9 (a completely separate system). The name 'BIND' is associated with ISC; we have changed its name as a reminder that ISC is no longer involved with the project.'"
GCC 4.9.0 released
Version 4.9.0 of the GNU Compiler Collection is out. "GCC 4.9.0 is a major release containing substantial new functionality not available in GCC 4.8.x or previous GCC releases." The list of new features is indeed long; see the 4.9.0 release page for lots more information.
Linux Test Project released for April 2014
The stable test suite from the Linux Test Project has been updated for April 2014. Notable changes include 20 new syscall test cases, fixes for out-of-tree building and cross-compilation, and the rewrite of several scripts to run in shells other than bash.
Newsletters and articles
Development newsletters from the past week
- What's cooking in git.git (April 17)
- What's cooking in git.git (April 18)
- What's cooking in git.git (April 22)
- LLVM Weekly (April 21)
- OCaml Weekly News (April 22)
- OpenStack Community Weekly Newsletter (April 18)
- Perl Weekly (April 21)
- PostgreSQL Weekly News (April 20)
- Python Weekly (April 17)
- Ruby Weekly (April 17)
- Tor Weekly News (April 23)
Ars Technica: Tor network’s ranks of relay servers cut because of Heartbleed bug
Ars Technica reports on the impact that the "Heartbleed" bug in OpenSSL has had for the Tor anonymizing network. "The Tor Project team has been moving to provide patches for all of the components, and most of the core network was quickly secured. However, a significant percentage of the relay servers, many of which serve countries with heavy Internet censorship, have remained unpatched. These systems are operated by volunteers and may run unattended."
Faure: Freedesktop Summit 2014 Report
David Faure has a report on the Freedesktop Summit, which was held recently in Nuremberg. "The meeting also produced an agreement on the future of startup notification in the Wayland world. A protocol based on broadcast of D-Bus signals will be used instead of the current approach with X client messages. This approach is expected to integrate nicely with future frameworks for sandboxed applications. Improvements were also made to the protocol to allow for tab-based applications that make dynamic choices about creating a new tab or a new window depending on the workspace in which a document was opened."
[Editor's note: apologies to Ryan Lortie who wrote this article.]
Page editor: Nathan Willis
Announcements
Brief items
The Apache Software Foundation Announces 100 Million Downloads of Apache OpenOffice
The Apache Software Foundation has announced that Apache OpenOffice has been downloaded 100 million times. "Official downloads at openoffice.org are hosted by SourceForge, where users can also find repositories for more than 750 extensions and over 2,800 templates for OpenOffice."
Articles of interest
Plant Breeders Release First 'Open Source Seeds' (NPR)
NPR has a look at the cross-pollination of open source software and agriculture, resulting in the release of the first "Open Source Seeds". The new Open Source Seed Initiative was formed to put seeds, and, more importantly, their genetic material, into a protected commons, so they will be available in perpetuity. "At an event on the campus of the University of Wisconsin, Madison, backers of the new Open Source Seed Initiative will pass out 29 new varieties of 14 different crops, including carrots, kale, broccoli and quinoa. Anyone receiving the seeds must pledge not to restrict their use by means of patents, licenses or any other kind of intellectual property. In fact, any future plant that's derived from these open source seeds also has to remain freely available as well." (Thanks to Rich Brown.)
LAC14 interview series
Gabriel Nordeborn has started a series of interviews with people involved with the Linux Audio Conference, which will be held May 1-4 in Karlsruhe, Germany. As of this writing interviews with Miller Puckette, Robin Gareus, and Albert Graef are available.
New Books
Raspberry Pi, 2nd Edition--New from Pragmatic Bookshelf
Pragmatic Bookshelf has released "Raspberry Pi, 2nd Edition" by Maik Schmidt.
Calls for Presentations
CFP Deadlines: April 24, 2014 to June 23, 2014
The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.
| Deadline | Event Dates | Event | Location |
|---|---|---|---|
| April 24 | October 6–October 8 | Operating Systems Design and Implementation | Broomfield, CO, USA |
| April 25 | August 1–August 3 | PyCon Australia | Brisbane, Australia |
| April 25 | August 18 | 7th Workshop on Cyber Security Experimentation and Test | San Diego, CA, USA |
| May 1 | July 14–July 16 | 2014 Ottawa Linux Symposium | Ottawa, Canada |
| May 1 | May 12–May 16 | Wireless Battle Mesh v7 | Leipzig, Germany |
| May 2 | August 20–August 22 | LinuxCon North America | Chicago, IL, USA |
| May 2 | August 20–August 22 | CloudOpen North America | Chicago, IL, USA |
| May 3 | May 17 | Debian/Ubuntu Community Conference - Italia | Cesena, Italy |
| May 4 | July 26–August 1 | Gnome Users and Developers Annual Conference | Strasbourg, France |
| May 9 | June 10–June 11 | Distro Recipes 2014 - canceled | Paris, France |
| May 12 | July 19–July 20 | Conference for Open Source Coders, Users and Promoters | Taipei, Taiwan |
| May 18 | September 6–September 12 | Akademy 2014 | Brno, Czech Republic |
| May 19 | September 5 | The OCaml Users and Developers Workshop | Gothenburg, Sweden |
| May 23 | August 23–August 24 | Free and Open Source Software Conference | St. Augustin (near Bonn), Germany |
| May 30 | September 17–September 19 | PostgresOpen 2014 | Chicago, IL, USA |
| June 6 | September 22–September 23 | Open Source Backup Conference | Köln, Germany |
| June 6 | June 10–June 12 | Ubuntu Online Summit 06-2014 | online, online |
| June 20 | August 18–August 19 | Linux Security Summit 2014 | Chicago, IL, USA |
If the CFP deadline for your event does not appear here, please tell us about it.
Upcoming Events
Events: April 24, 2014 to June 23, 2014
The following event listing is taken from the LWN.net Calendar.
| Date(s) | Event | Location |
|---|---|---|
| April 25–April 28 | openSUSE Conference 2014 | Dubrovnik, Croatia |
| April 26–April 27 | LinuxFest Northwest 2014 | Bellingham, WA, USA |
| April 29–May 1 | Embedded Linux Conference | San Jose, CA, USA |
| April 29–May 1 | Android Builders Summit | San Jose, CA, USA |
| May 1–May 4 | Linux Audio Conference 2014 | Karlsruhe, Germany |
| May 2–May 3 | LOPSA-EAST 2014 | New Brunswick, NJ, USA |
| May 8–May 10 | LinuxTag | Berlin, Germany |
| May 12–May 16 | Wireless Battle Mesh v7 | Leipzig, Germany |
| May 12–May 16 | OpenStack Summit | Atlanta, GA, USA |
| May 13–May 16 | Samba eXPerience | Göttingen, Germany |
| May 15–May 16 | ScilabTEC 2014 | Paris, France |
| May 17 | Debian/Ubuntu Community Conference - Italia | Cesena, Italy |
| May 20–May 24 | PGCon 2014 | Ottawa, Canada |
| May 20–May 21 | PyCon Sweden | Stockholm, Sweden |
| May 20–May 22 | LinuxCon Japan | Tokyo, Japan |
| May 21–May 22 | Solid 2014 | San Francisco, CA, USA |
| May 23–May 25 | FUDCon APAC 2014 | Beijing, China |
| May 23–May 25 | PyCon Italia | Florence, Italy |
| May 24 | MojoConf 2014 | Oslo, Norway |
| May 24–May 25 | GNOME.Asia Summit | Beijing, China |
| May 30 | SREcon14 | Santa Clara, CA, USA |
| June 2–June 3 | PyCon Russia 2014 | Ekaterinburg, Russia |
| June 2–June 4 | Tizen Developer Conference 2014 | San Francisco, CA, USA |
| June 9–June 10 | Erlang User Conference 2014 | Stockholm, Sweden |
| June 9–June 10 | DockerCon | San Francisco, CA, USA |
| June 10–June 12 | Ubuntu Online Summit 06-2014 | online, online |
| June 10–June 11 | Distro Recipes 2014 - canceled | Paris, France |
| June 13–June 14 | Texas Linux Fest 2014 | Austin, TX, USA |
| June 13–June 15 | State of the Map EU 2014 | Karlsruhe, Germany |
| June 13–June 15 | DjangoVillage | Orvieto, Italy |
| June 17–June 20 | 2014 USENIX Federated Conferences Week | Philadelphia, PA, USA |
| June 19–June 20 | USENIX Annual Technical Conference | Philadelphia, PA, USA |
| June 20–June 22 | SouthEast LinuxFest | Charlotte, NC, USA |
| June 21–June 28 | YAPC North America | Orlando, FL, USA |
| June 21–June 22 | AdaCamp Portland | Portland, OR, USA |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol