LWN.net Weekly Edition for June 13, 2013

A report from pgCon 2013

June 10, 2013

This article was contributed by Josh Berkus

This year's pgCon, which concluded May 25th, brought an unusually large number of changes to the PostgreSQL community, codebase, and development process. Contributors introduced several major new projects that will substantially change how people use PostgreSQL, including parallel query, a new binary document store type, and pluggable storage. In addition, Tom Lane switched jobs, four new committers were selected, pgCon had its highest attendance ever at 256 registrations, and the conference held its first unconference after the regular sessions. Overall, it was a mind-bending and exhausting week.

pgCon is a PostgreSQL developer and advanced user conference held in Ottawa, Canada every year in May. It usually brings together most of the committers and major contributors to the project in order to share ideas, present projects and new features, and coordinate schedules and code changes. The main conference days are preceded by various summits, including the PostgreSQL Clustering Summit, the Infrastructure Team meeting, and the Developer Meeting. The latter is a closed meeting of 18 to 25 top code contributors to PostgreSQL who coordinate feature development and plans for the next version (PostgreSQL 9.4).

Parallel query

The longest and most interesting discussion at the developer meeting was about adding parallel query capabilities to PostgreSQL. Currently, the database is restricted to using one process and one core to execute each individual query, and cannot make use of additional cores to speed up CPU-bound tasks. While it can execute dozens of separate queries simultaneously, the lack of individual-query parallelism is still very limiting for users of analytics applications, for which PostgreSQL is already popular.

Bruce Momjian announced on behalf of EnterpriseDB that its engineering team would be focusing on parallelism for the next version of PostgreSQL. Noah Misch will be leading this project. The project plans to have index building parallelized for 9.4, but most of its work will be creating a general framework for parallelism. According to Momjian, there are three things you need for any parallel operation in the database:

  • An efficient way of passing data to the parallel backends, probably using a shared memory facility.
  • A method for starting and stopping worker processes.
  • The ability for the worker processes to share the reference data and state information of the parent process.

The EnterpriseDB team had explored using threads for worker processes, but this was not seen as a productive approach, primarily because the PostgreSQL development team is used to working with processes and the backend code is structured around them. While the cost of starting up processes is high compared to threads, the additional locking required for threading looked to be just as expensive in performance terms. Momjian put it this way:

With threads, everything is shared by default and you have to take specific steps not to share. With processes, everything is unshared by default, and you have to specifically share things. The process model and explicit sharing is a shorter path from where we are currently.

The PostgreSQL developers plan to build a general framework for parallelism and then work on parallelizing one specific database task at a time. The first parallel feature is expected to be building indexes in parallel using parallel in-memory sort. This is an important feature for users because building indexes is often slower than populating the underlying table, and it is often CPU-bound. It's also seen as a good first task because index builds run for minutes rather than milliseconds, so optimizing worker startup costs can be postponed until later development.

PostgreSQL and non-volatile memory

Major contributor KaiGai Kohei of NEC brought up the recent emergence of non-volatile memory (NVRAM), or persistent memory, devices and discussed ideas on how to take advantage of them for PostgreSQL. Intel engineer Matthew Wilcox further reinforced the message that NVRAM was coming in his lightning talk on the first day of pgCon. NVRAM persists its contents after a system power cycle, but is addressed like main memory and is around 50% as fast.

Initially, Kohei is interested in using NVRAM for the PostgreSQL Write Ahead Log (WAL), an append-only set of files that is used to guarantee transactional integrity and crash safety. This will work with the small sizes and limited write cycles of the early NVRAM devices. For servers with NVRAM, WAL writes would go to a memory region on the device allocated using mmap(). In later generations of NVRAM, developers can look at using it for the main database files.

There are many unknowns about this technology, such as what method can be employed to guarantee absolute write ordering. Developers speculated about whether transactional memory could somehow be employed for this. Right now, the PostgreSQL community is waiting to get its collective hands on an NVRAM device for testing and development.

Disqus keynote

Mike Clarke, Operations Lead of Disqus, delivered the keynote for pgCon this year. Disqus is the leading comment-hosting platform, which is used extensively by blogs and news sites all over the internet. Its technology platform includes Python, Django, RabbitMQ, Cassandra and PostgreSQL.

Most of its over three terabytes of data, including comments, threads, forums, and user profiles, is stored in PostgreSQL. This adds up to over 50,000 writes per second to the database, and millions of new posts and threads a day.

Clarke extolled the virtues of SSD storage. Disqus uses a 6-node master-slave replication database cluster running on fast machines with RAIDed SSD storage. Only SSDs have allowed it to continue scaling to its current size. Prior to moving to SSD-based storage, Disqus was at 100% IO utilization and had continual problems with long IO wait times. Now utilization is down and wait times are around one millisecond.

He also complained about some of the pain points of scaling out PostgreSQL. Disqus uses Slony-I, PostgreSQL's older replication system, which the company has customized to its workload and feels it cannot afford to upgrade. For that reason, Clarke is eagerly awaiting the new logical replication system expected with PostgreSQL 9.4 next year. He also was unhappy about the lack of standard design patterns for PostgreSQL proxying and failover; everyone seems to build their stack differently. On the other hand, he praised extensions as the best feature of PostgreSQL, since they allow building applications inside the database.

Clarke ended with a request for some additional PostgreSQL features. He wants tools to enable sharded multiserver databases to be built inside PostgreSQL more easily, such as by improving PL/Proxy, the distributed table interface extension for PostgreSQL introduced by Skype. He'd also like to see a query progress indicator, something that was later presented at pgCon by Jan Urbański.

HStore and the future of JSON

During the regular talks, Oleg Bartunov and Teodor Sigaev introduced a prototype of the next version of their "hstore" extension for PostgreSQL. hstore stores a simple key-value store, a "hash" or "dictionary", in a PostgreSQL field, and allows indexing the keys. Today, many users of PostgreSQL and JSON use it to store "flattened" JSON objects so that they can be indexed on all keys. The presentation introduced a new version of hstore which can nest and can store arrays, so that it is a closer match for fully structured JSON, as well as for complex multi-level hashes and dictionaries in Perl and Python.
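
To make that flattened-key usage concrete, here is a minimal sketch using the existing, non-nested hstore that has shipped with PostgreSQL for several releases (not the prototype presented at pgCon); the table and key names are hypothetical:

    CREATE EXTENSION hstore;

    CREATE TABLE bookmarks (
        id  serial PRIMARY KEY,
        doc hstore              -- flattened key/value pairs
    );

    -- a GIN index makes key and key/value lookups fast
    CREATE INDEX bookmarks_doc_idx ON bookmarks USING gin (doc);

    INSERT INTO bookmarks (doc)
        VALUES ('url => "http://example.org/", tag => "news", author => "josh"');

    -- rows containing a given key/value pair (uses the index)
    SELECT * FROM bookmarks WHERE doc @> 'tag => "news"';

    -- fetch one key's value from rows that have an "author" key
    SELECT doc -> 'url' FROM bookmarks WHERE doc ? 'author';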

The new hstore prototype also supports indexing, which enables very fast lookup of keys, values, and even document fragments many levels deep. In their tests, they used the Del.icio.us dataset, which includes 1.2 million bookmark documents, and were able to search out all values matching a complex nesting expression in 0.1 seconds, or all instances of a common key in 0.5 seconds. The indexes are also reasonably sized, at around 75% of the size of the data to which they are attached. Earlier attempts to index tree-structured text data in PostgreSQL and other databases have resulted in indexes which are significantly larger than the base table. Individual hstore fields can be up to 512MB in size.

While many attendees were excited and impressed by the prototype, some were unhappy. Several contributors were upset that the new type wasn't JSON. They argued that the PostgreSQL project didn't need a non-standard type and interface, when what developers want is a binary, indexed JSON type. After extensive discussion, Bartunov and Sigaev agreed to work on JSON either instead of, or in addition to, a new hstore for the next version.

Hopefully, this means that users can expect a JSON type for version 9.4 that supports arbitrary nested key lookup and complex search expressions. This would make PostgreSQL more suitable for applications which currently use a JSON document database, such as MongoDB or CouchDB. With the addition of compatibility projects like Mongres, users might even be able to run such applications largely unaltered.

Pluggable storage

The final day of pgCon this year was the conference's first-ever "unconference day". An unconference is a meeting in which the attendees select sessions and compose the schedule at the event. Unconferences tend to be more discussion-oriented than regular conferences, and center more around recent events and ideas. Around 75 of the pgCon attendees stayed for the unconference.

One of the biggest topics discussed at the unconference was the idea of making PostgreSQL's storage "pluggable". Historically, companies have wanted to tailor the database for particular workloads by adding support for column store tables, clustered storage, graphs, streaming data, or other special-purpose data structures. These changes have created incompatible forks of PostgreSQL, such as Greenplum or Vertica, cutting off any development collaboration with those vendors. Other companies, such as Huawei and Salesforce, which are newly involved in PostgreSQL, would like to be able to change the storage model without forking the code.

The PostgreSQL contributors discussed methods of accomplishing this. First, they discussed the possibility of using the Foreign Data Wrapper (FDW) facility to attach new storage types. Foreign Data Wrappers allow users to attach external data, such as other databases, through a table interface. After some discussion, this was seen as unsuitable in most cases, since users want to actually manage tables, including creation, backup, and replication, through PostgreSQL, not just "have a window" into them. They also want to support creating indexes on different storage types.
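
For reference, this is roughly what the FDW facility looks like from SQL today. The sketch below uses the file_fdw wrapper from PostgreSQL's contrib collection to expose a CSV file as a read-only table; the file name and columns are hypothetical. It illustrates the "window" nature of foreign tables that the developers found limiting:

    CREATE EXTENSION file_fdw;

    -- a server is a named instance of a foreign data wrapper
    CREATE SERVER log_files FOREIGN DATA WRAPPER file_fdw;

    -- expose a CSV file on disk as a (read-only) table
    CREATE FOREIGN TABLE apache_log (
        ts     timestamptz,
        url    text,
        status integer
    ) SERVER log_files
      OPTIONS (filename '/var/log/apache.csv', format 'csv');

    SELECT count(*) FROM apache_log WHERE status = 404;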

If FDW won't work, the developers will need to create a new set of hooks and an API for "storage managers". This was actually supported by early versions of POSTGRES at the University of California, which had prototypes of both an in-memory and a write-once media (WORM) storage manager. However, that code has atrophied and doesn't support most current PostgreSQL features.

For any potential storage, the storage manager would need to support several conventions of PostgreSQL, including:

  • having tuples (rows) which are structured like PostgreSQL's tuples, including metadata
  • being transactional
  • providing a method for resolving data visibility
  • providing a physical row identifier for index building

The PostgreSQL system catalogs would stay on the current conventional native storage, regardless of what new types of storage managers were added.

If implemented, this would be a major change to the database system. It would become possible to use PostgreSQL as a query engine, transaction manager, and interface for very different types of databases, both proprietary and open source. It might even become possible for MySQL "storage engine" vendors, such as Infobright and Tokutek, to port their products. Peter van Hardenberg of Heroku suggested it might also make it possible to run PostgreSQL on top of HDFS.

Committer changes

The most frequently quoted news from pgCon this year was that Tom Lane, lead committer on PostgreSQL, was changing employers from Red Hat to Salesforce. While announced in a rather low-key way through the Developer Meeting notes and Lane's show badge, this was big enough news that Wired picked it up. Lane had worked at Red Hat for 11 years, having joined to support Red Hat Database, its distribution of PostgreSQL. While Red Hat Database was eventually canceled, Lane stayed on at Red Hat, which was very supportive of his contributions to the project.

Lane's move is more significant for what it says about Salesforce's commitment to PostgreSQL than for any real change in his expected activities as a committer. Until now, most commentators have suggested that Salesforce's mentions of PostgreSQL were merely posturing, but hiring Lane suggests that it plans to follow through on migrating away from Oracle Database. Six other Salesforce staff also attended pgCon. Its exact plans were not shared with the community, although it's reasonable to hypothesize from development discussions at the conference that Salesforce plans to contribute substantially to the open-source project, and that pluggable storage is a development target.

Lane memorialized his change of employment by putting his nine-year-old Red Hat laptop bag into the charity auction at the end of pgCon. It sold for $170.

The PostgreSQL Core Team, a six-member steering committee for the project, announced the selection of four new committers to PostgreSQL: Jeff Davis of Aster Data, author of the range types feature in version 9.2; Fujii Masao of NTT Data, main author of the synchronous replication feature; Stephen Frost of Resonate, author of several security features; and Noah Misch of EnterpriseDB, author of numerous SQL improvements. This brings the number of committers on PostgreSQL to twenty.

More PostgreSQL

Of course, there were many other interesting presentations and talks at pgCon. Keith Paskett ran a tutorial on optimizing and using PostgreSQL on ZFS atop OmniOS (an OpenSolaris fork), while other users talked about using PostgreSQL on ZFS for Linux. Jeff Davis presented strategies to use PostgreSQL's new anti-disk-corruption features. Josh McDermott ran another Schemaverse tournament, as a qualifier for the upcoming Defcon tournament. Robert Haas showed the most common failures of the PostgreSQL query planner, and sparked discussion about how to fix them.

On the second full conference day, Japanese community members presented the newly-formed PostgreSQL Enterprise Consortium of Japan, a group of 39 Japanese companies aiming to promote and improve PostgreSQL. This group is currently working on clustered PostgreSQL, benchmarking, and tools for migrating from other database systems. And just for fun, Álvaro Hernández Tortosa demonstrated creating one billion tables in a single PostgreSQL database.

Overall, it was the most exciting pgCon I've attended, and it shows the many new directions in which PostgreSQL development is going. Anyone who was there got the impression that the project will be completely reinventing the database within a few years. If you work with PostgreSQL, or are interested in contributing to it, you should consider attending next year.

[ Josh Berkus is a member of the PostgreSQL Core Team. ]

Comments (6 posted)

OSS meetups, OLPC, and OpenRelief

By Jake Edge
June 12, 2013
LinuxCon Japan 2013

As always, there were more sessions at the recently completed triumvirate of Linux Foundation conferences in Tokyo than can be written up. In fact, also as usual, there were more sessions available than people to cover them. The Automotive Linux Summit Spring, LinuxCon Japan, and CloudOpen Japan covered a lot of ground in five days. Here are reports from three presentations at LinuxCon.

OSS meetups in Japan

[Hiro Yoshioka]

Hiro Yoshioka spoke about the types of open source gatherings that go on in Japan. He is the technical managing officer for Rakuten, which is a large internet services company in Japan. Before that, he was the CTO of Miracle Linux from 2000 to 2008. The goal of his talk was to encourage other Japanese people in the audience to start up their own "meetups" and other types of technical meetings and seminars, but the message was applicable anywhere. Organizing these meetings is quite rewarding, and lots of fun, but it does take some time to do, he said.

Yoshioka used the "kernel code reading party" that he started in Yokohama in April 1999 as a case study. He wondered if he would be able to read the kernel source code, so he gathered up some members of the Yokohama Linux Users Group to create an informal technical seminar to do so. The name of the meeting has stuck, but the group no longer reads kernel source. Instead, they have presentations on kernel topics, often followed by a "pizza and beer party".

There are numerous advantages to being the organizer of such a meeting, he said. You get to choose the date, time, and location for the event, as well as choosing the speakers. When he wants to learn about something in the kernel, he asks someone who knows about it to speak. Presenters also gain from the experience because they get to share their ideas in a relaxed setting. In addition, they can talk about an "immature idea" and get "great feedback" from those attending. Attendees, of course, get to hear "rich technical information".

Being the organizer has some downsides, mostly in the amount of time it takes. The organizer will "need to do everything", Yoshioka said, but sometimes the community will help out. In order to make the meetings "sustainable", the value needs to exceed the cost, so increasing the value or decreasing the cost are both ways to help the meetings continue. Finding great speakers is the key to making the value of the meetings higher, while finding inexpensive meeting places is a good way to bring down costs.

How to find the time to organize meetings like those he mentioned was one question from the audience. It is a difficult question, Yoshioka said, but as with many things it comes down to your priorities. Another audience member noted that convincing your employer that the meeting will be useful in your job may allow you to spend some of your work time on it. "Make it part of your job".

Another example that Yoshioka gave was the Rakuten Technology Conference, which has been held yearly since 2007. It is a free meeting with content provided by volunteers. In the past, it has had keynotes from Ruby creator Matz and Dave Thomas of The Pragmatic Programmer. Proposals for talks are currently under discussion for this year's event, which will be held on October 26 near Shinagawa station in Tokyo. Unlike many other technical meetings in Japan, the conference is all in English, he said.

The language barrier was of interest to several non-Japanese audience members. Most meetings like the ones Yoshioka described are, unsurprisingly, in Japanese, but for non-speakers there are a few possibilities. The Tokyo hackerspace has half of its meetings in English, he said, and the Tokyo Linux Users Group has a web page and mailing list in English. In addition, Yoshioka has an English-language blog with occasional posts covering the kernel code reading party meetings and other, similar meetings.

One laptop per child

[Madeline Kroah-Hartman]

A Kroah-Hartman different from the usual suspect spoke in the student track. In a presentation that followed her father's, Madeline Kroah-Hartman looked at the One Laptop Per Child (OLPC) project, its history, and some of its plans for the future. She has been using the OLPC for a number of years, back to the original XO version, and she brought along the newest model, XO-4 Touch, to show.

The project began in 2005 with the goal of creating a laptop for children that could be sold for $100. It missed that goal with the initial XO, but did ship 2.5 million of the units, including 83,000 as part of the "Give 1 Get 1" program that started in 2007. The idea was to have a low-powered laptop that would last the whole school day, which the XO is capable of, partly because it "sleeps between keystrokes" while leaving the display on, she said.

Recharging the laptops has been something of a challenge for the project, particularly in developing countries where electricity may not be available. Various methods have been tried, from a hand crank to a "yo-yo charger" that was never distributed. Using the yo-yo got painful after ten minutes, she said, but it took one and a half hours to fully charge the device. Solar-powered charging is now the norm.

OLPCs were distributed in various countries, including 300,000 to Uruguay (where every child in the country got one) and 4,500 to women's schools in Afghanistan, as well as to Nicaragua, Rwanda, and others. In Madagascar, the youngest students were teaching the older ones how to use the laptops, while in India the attendance rate neared 100% for schools that had OLPCs, she said.

OLPCs generally run the Sugar environment on top of Fedora. It is a "weird" interface that sometimes doesn't work, she said, but it is designed for small children. That means it has lots of pictures as part of the interface to reduce clutter and make it more intuitive for that audience. There are lots of applications that come with the OLPC, including the Etoys authoring environment, a Python programming environment, the Scratch 2D animation tool, a physics simulation program, a local copy of Wikipedia in the native language, a word processor, and more. The Linux command line is also available in a terminal application, though children may not actually use it in practice, she said.

The first model was designed so that you could "throw it at a wall" and it wouldn't break, she said. Various other versions were created over the years, including the XO-1.5, a dual-touchscreen XO-2 that was never released, and the XO-4 Touch. The latter will be shipping later this year. There is also the Android-based XO tablet that will be selling at Walmart for $100 starting in June. It is "very different" from the Sugar-based XOs, Kroah-Hartman said, but will come pre-loaded with educational and mathematical apps.

There are lots of ways to participate in the project, she said, many of which are listed on the Participate wiki page. She noted that only 30% of the XO software is translated to Japanese, so that might be one place for attendees to start.

OpenRelief

[OpenRelief plane]

In an update to last year's presentation, Shane Coughlan talked about the progress (and setbacks) for the OpenRelief project. That project had its genesis at the 2011 LinuxCon Japan—held shortly after the earthquake, tsunami, and nuclear accident that hit Japan—as part of a developer panel discussion about what could be done to create open source technical measures to help out disaster relief efforts. That discussion led to the creation of the OpenRelief project, which seeks to build a robotic airplane (aka drone) to help relief workers "see through the fog" to get the right aid to the right place at the right time.

The test airframe he displayed at last year's event had some durability flaws: "airframes suck", he said. In particular, the airframe would regularly break in ways that would be difficult to fix in the field. Endurance is one of the key features required for a disaster relief aircraft, and the project had difficulty finding one that would both be durable and fit into its low price point ($1000 for a fully equipped plane, which left $100-200 for the airframe).

In testing the original plane, though, OpenRelief found that the navigation and flight software/hardware side was largely a solved problem, through projects like ArduPilot and CanberraUAV. Andrew Tridgell (i.e. "Tridge" of Samba and other projects) is part of the CanberraUAV team, which won the 2012 Outback Rescue Challenge; "they completely rock", Coughlan said. The "unmanned aerial vehicle" (UAV) that was used by CanberraUAV was "a bit big and expensive" for the needs of OpenRelief, but because it didn't have to focus on the flight software side of things, the project could turn to other parts of the problem.

One of those was the airframe, but that problem may now be solved. The project was approached by an "aviation specialist" who had created a regular airframe as part of a project to build a vertical takeoff and landing (VTOL) drone to be sold to the military. It is a simple design with rails to attach the wings and wheels as well as to hang payloads (e.g. cameras, radiation detectors, ...). Dual servos for the control surfaces provide redundancy. It is about the same size as the previous airframe, but can go 40km using an electric motor rather than the 20km the older version could. It can also carry 9kg of payload vs. the 0.5kg available previously. With an optional gasoline-powered engine, the range will increase to 200-300km.

OpenRelief released the design files for this new airframe on the day of Coughlan's talk. It is something that "anyone can build", he said. Test flights are coming soon, but he feels confident that the airframe piece, at least, is now under control. There is still plenty of work to do in integrating all of the different components into a working system, including adding some "mission control" software that can interface with existing disaster relief systems.

Coughlan also briefly mentioned another project he has been working on, called Data Twist. The OpenStreetMap (OSM) project is popular in Japan—where Coughlan lives—because the "maps are great", but the data in those maps isn't always easy to get at. Data Twist is a Ruby program that processes the OSM XML data to extract the information needed to build geo-directories.

A geo-directory might contain "all of the convenience stores in China"—there were 43,000 of them as of the time of his talk—for example. Data Twist uses the categories tagged in the OSM data and can extract the locations into a WordPress Geo Mashup blog post, which will place the locations on maps in the posts.

Data Twist is, as yet, just an experiment in making open data (like OSM data) more useful in other contexts. It might someday be used as part of OpenRelief, but there are other applications too. The idea was to show someone who didn't care about open source or disaster relief efforts some of the benefits of open data. It is in the early stages of development and he encourages others to take a look.

Wrap-up

All three conferences were held at the Hotel Chinzanso Tokyo and its associated conference (and wedding) center. It was a little off the beaten track—if that phrase can ever be applied to a city like Tokyo—in the Mejiro section of the city. But the enormous garden (complete with fireflies at night) was beautiful; it tended to isolate the conferences from the usual Tokyo "hustle and bustle". As always, the events were well-run and featured a wide array of interesting content.

[I would like to thank the Linux Foundation for travel assistance to Tokyo for LinuxCon Japan.]

Comments (none posted)

RSS feed reading in Firefox

By Nathan Willis
June 12, 2013

Google Reader, arguably the most widely-used feed aggregation tool, is being unceremoniously dumped and shut down at the end of June. As such, those who spend a significant chunk of their time consuming RSS or Atom content have been searching for a suitable replacement. There are a variety of options available, from third-party commercial services to self-hosted web apps to desktop applications. Trade-offs are involved simply in choosing which application type to adopt; for example, a web service provides access from anywhere, but it also relies on the availability of a remote server (whether someone else administrates it or not). But there is at least one other option worth exploring: browser extensions.

As Luis Villa pointed out in April, browsers do at best a mediocre job of making feed content discoverable, and they do nothing to support feed reading directly. But there are related features in Firefox, such as "live bookmarks," which blur together the notions of news feeds and periodic polling of a page for changes. Several Firefox add-ons attempt to build a decent feed-reading interface where none currently exists; not all of them exploit the live bookmark functionality for this, although many do. Since recent Firefox releases are capable of synchronizing both bookmarks and add-ons, users can get the same experience across multiple desktop and laptop machines (although an extension-based feed reader does not offer universal availability, as a purely web-based solution does).

Bookmarks, subscriptions; potato, potahto

The most lightweight option available for recent Firefox builds is probably MicroRSS, which offers nothing more than a list of subscribed feeds down the left hand margin, and text of the entries from the selected feed on the right. For some users that may be plenty, of course, but as a practical replacement for Google Reader it falls short, since there is no way to import an existing list of subscriptions (typically as an Outline Processor Markup Language (OPML) file). It also does not count unread items, much less offer searching, sorting, starring, or other news-management features. On the plus side, it is actively maintained, but the license is not specified.
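
For those unfamiliar with the format, an OPML subscription list is just a small XML file of outline elements, one per feed. A minimal, purely illustrative example (with made-up feed URLs) might look like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <opml version="1.0">
      <head>
        <title>My subscriptions</title>
      </head>
      <body>
        <outline type="rss" text="Example feed"
                 xmlUrl="http://example.org/feed.rss"/>
        <outline type="rss" text="Another feed"
                 xmlUrl="http://example.org/atom.xml"/>
      </body>
    </opml>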

Feed Sidebar is another lightweight option. It essentially just displays the existing "live bookmarks" content in a persistent Firefox sidebar. This mechanism requires the user to subscribe to feeds as live bookmarks, but it has the benefit of being relatively simple. The top half of the sidebar displays the list of subscriptions, with each subscription as a separate folder; its individual items are listed as entries within the folder, which the user must click to open in the browser. Notably, the Firefox "sidebar" is browser chrome and not a page element, which makes the feed sidebar visible in every tab, as opposed to confining the feed reading experience to a single spot. Feed Sidebar is licensed under GPLv2, which is a tad atypical for Firefox extensions, where the Mozilla Public License (MPL) dominates.

When it comes to the simple implementations, it is also worth mentioning that Thunderbird can subscribe to RSS and Atom feeds natively. This functionality is akin to the email client's support for NNTP news; like newsgroups, a subscription is presented to the user much as a POP or IMAP folder is. Feeds with new content appear in the sidebar, new messages are listed in the top pane, and clicking on any of them opens the content in the bottom message pane. Subscribing to news feeds does require setting up a separate "Blogs and News Feeds" account in Thunderbird, though, and users can only read one feed at a time—one cannot aggregate multiple feeds into a folder, for example.

Moving up a bit on the functionality ladder, Sage is an MPL-1.1-licensed extension that stores your subscribed feeds in a (user-selectable) bookmark folder. For reading, it provides a Firefox sidebar with two panes; the upper one presents a list of the subscriptions, and the lower one presents a list of available articles from the selected subscription. The main browser tab shows a summary of each entry in the selected feed, although opening any entry opens up the corresponding page on the original site, rather than rendering it inside the feed-reader UI. As rendering the original page in the browser might suggest, Sage does not store any content locally, so it does not offer search functionality.

The project is actively developed on GitHub, although it is also worth noting that one of the project's means of fundraising is to insert "affiliate" links, which point toward certain online merchants, into feed content.

The high end

Digest is a fork of a no-longer-developed extension called Brief. It attempts to provide a more full-featured feed-reading experience than some of the other readers; it keeps a count of unread items for each feed, allows the user to "star" individual items or mark them as unread, and offers quite a few of the other features one would expect to find in a web service like Google Reader.

[Digest]

As is the case with several other extensions, Digest stores feed subscriptions in a (selectable) bookmarks folder. However, it also downloads entries locally (thankfully allowing the user to choose how long old downloads are preserved), which enables it to offer content search. It also renders its entire interface within the browser tab in HTML, unlike some of the competition. Digest is licensed as MPL 2.0, and is actively under development by its new maintainer at GitHub. It can import (and export) OPML subscription files.

Like Digest, Newsfox replicates much of the Google Reader experience inside Firefox. The look is a bit different, since Newsfox incorporates a three-pane interface akin to an email client. This UI is implemented in browser chrome, but unlike the earlier live-bookmark–based options, it still manages to reside entirely within one tab. That said, Newsfox expects to find subscriptions in the default Live Bookmarks folder, and there does not appear to be a way to persuade it to look elsewhere. Perhaps more frustrating, it either does not understand subfolders within the Live Bookmarks folder, or it chooses to ignore them, so dozens or hundreds of feeds are presented to the user in a single, scrolling list.

[Newsfox]

On the plus side, Newsfox offers multi-tier sorting; one can tell it to first sort feeds alphabetically (increasing or decreasing), then sort by date (again, increasing or decreasing), and so on, up to four levels deep. It can also encrypt the locally-downloaded feed content, which might appeal to laptop users, and is an option none of the other extensions seems to feature. Downloaded entries can be searched, which is a plus, and on the whole the interface is fast and responsive, more so than Digest's HTML UI.

The last major option on the full-fledged feed-aggregator front is Bamboo, an MPL-1.1-licensed extension that appears to be intentionally aiming for Google Reader replacement status—right down to the UI, which mimics the dark gray "Google toolbar" currently plastered across all of the search giant's web services. The interface is rendered in HTML, and uses the decidedly Google Reader–like sidebar layout, rendering feed content within the right-hand pane. Bamboo supports all of the basic features common to the high-end aggregators already discussed: OPML import/export, folders, search, sorting, marking items as read/unread, and locally storing feed content. It also adds more, such as the ability to star "favorite" items, the ability to save items for offline reading, a toggle-able headline-or-full-item display setting, and a built-in ad blocker.

[Bamboo]

Interestingly enough, despite its comparatively rich feature set, Bamboo uses a bookmark folder to keep track of feed subscriptions, but it does not allow the user to select the folder where subscriptions are saved. Instead, like Newsfox, it only examines the default Live Bookmarks folder.

And the rest

If one goes searching for "RSS" on the Firefox Add-ons site, there are plenty more options that turn up, many of which reflect entirely different approaches to feed aggregation. For example, SRR offers a "ticker"-style scroll of headlines from subscribed feeds, which is useful for a handful of feeds at best. Dozens or hundreds, however, will overpower even the toughest attention span. Or there is Newssitter, which provides a "bookshelf"-style interface that seems visually designed for reading primarily on a mobile device. That may meet the needs of many news junkies, of course, but it bears little resemblance to the Google Reader experience; getting a quick overview of dozens of feeds is not possible, for example.

Selecting a Google Reader replacement is not a simple task; everyone uses the service in slightly different ways, and all of the options offer different (and overlapping) subsets of the original product's feature set.

The bare-bones feed reading extensions all have big limitations that probably make them less useful as a drop-in replacement; for instance, they may not check for new content in the background, and they certainly do not provide much search functionality. For a user with a lot of subscriptions, supplementary features like searching and saving items can take the application from mediocre to essential. After all, it is frequently hard to backtrack to a barely-remembered news story weeks or months after reading the original feed.

To that end, the more fleshed-out Google Reader alternatives offer a much more useful experience in the long run. Only time will tell how solid they are over the long haul, of course—it is not beyond reason to think that some of them will start to slow down or wobble with months of saved content to manage. On the other hand, none of them can offer one key feature of Google Reader: the months' (or in many cases years') worth of already read news items. Most individual feeds do not publish their site's entire history, but Google Reader could search years' worth of already read material. That is just one of the things people lose when a web service shuts down.

Based on my early experiments, Bamboo offers the most features, while Newsfox is faster, but Digest is more flexible. It is tempting to fall back on that familiar old saying: you pays your nickel and you takes your chances (though sans nickel in free software circles). But because all three options can follow and display the same set of feeds, it may be worth installing more than one and giving them a simultaneous test drive for a week or so. At the very least, Firefox can synchronize the bookmarks and add-ons, providing you with some way to get at your subscriptions when away from home—at least if there is a Firefox installation nearby.

Comments (18 posted)

Page editor: Nathan Willis

Security

Tizen content scanning and app obfuscation

By Nathan Willis
June 12, 2013
Tizen Dev Con 2013

At the 2013 Tizen Developer Conference in San Francisco, there was a range of security talks examining different facets of hardening the mobile platform. Last week, we examined the Smack framework that implements access control for system resources. There were also sessions that explored the problem of protecting the device at higher levels of the system stack. Sreenu Pilluta spoke about guarding against malware delivered via the Internet, and Roger Wang offered an unusual proposal for obfuscating JavaScript applications themselves: by compiling them.

Content, secure

Pilluta is an engineer at anti-virus software vendor McAfee. As he explained, Tizen device vendors are expected to manage their own "app stores" through which users install safe applications, but that leaves a lot of avenues for malicious content unblocked. Email, web pages, and media delivery services can all download content from untrusted sources that might contain a dangerous payload. Pilluta described Tizen's Content Security Framework (CSF), a mechanism designed to let device vendors add pluggable virus- and malware-scanning software to their Tizen-based products.

[Pilluta at Tizen Dev Con]

The CSF itself provides a set of APIs that other components can use to scan two distinct classes of content: downloaded data objects and remote URLs. The actual engines that perform the scanning are plugins to CSF, and are expected to be added to Tizen by device vendors. Security engines come in two varieties: Scan Engines (for data) and Site Engines (for URLs). Scan Engines inspect content and are designed to retrieve malware-matching patterns from the vendor, as is typical of PC virus-scanning programs today. Site Engines use a reputation system, in which the vendor categorizes URLs and creates block list policies by category (e.g., gambling, pornography, spyware, etc.).

Applications dictate when the scanning is performed, Pilluta said, which is intentionally a decision left up to the vendor. Some might choose to scan a page before loading it at all, while others might load the page but scan it before executing any JavaScript. It is also up to the application what to do when infected content is found; the rationale being that the application can provide a more context-aware response to the user, and do so within the expected bounds of the user interface, rather than popping up an imposing and unfamiliar warning notification from a component the user was unaware even existed.

The CSF scanning APIs are high-level and event-driven, which Pilluta said allowed applications to call them cooperatively. For example, an email client could call the Site Engine to scan a URL inside an email message and the Scan Engine on a file attachment. Similarly, the email client could call the Site Engine on a URL that the user clicks to open in the browser. Old-fashioned scanning methods that use "deep hooks" into the filesystem would make this sort of cooperation difficult, he said.

The APIs are also designed to provide flexibility to application authors. For example, the Site Engine API is not tied to the Tizen Web Runtime or even to the system's HTTP stack. Thus, an application that uses its own built-in HTTP proxy can still take advantage of the CSF to scan URLs without re-implementing the scanner.

Ultimately, CSF is a framework that device makers will take advantage of, each in its own way. Presumably commercial vendors will offer virus scanning engines to interested OEMs, but consumers will likely not see any of them until a Tizen product hits the market. The flexible framework also seems designed to support HTTP-driven services like downloadable media and game content, which are frequently the most-cited examples of why companies want to see Tizen in devices like smart TVs and car dashboards.

CSF is an open source contribution to the Tizen platform, although one would reasonably expect McAfee to also develop scanning engines to offer to device vendors and mobile providers. As the CSF begins to take shape in products coming to market, it will be interesting to see if there are also any open source scanning engines, either in the Tizen reference code or produced by third parties. One would hope so, since malware detection is a concern for everyone, not just commercial device makers.

JavaScript app protection

In contrast to Pilluta's talk, Wang was not presenting a component of the Tizen architecture; rather, he was showing the progress he has made on a personal effort that he hopes will appeal to independent application developers. The issue he tackled was protecting JavaScript applications against reverse-engineering. While that is not an issue for developers of open source apps, building tools to simplify the process on an open platform like Tizen could have implications further down the road. Wang is a developer for Intel working on the Tizen platform, although this particular project is a personal side-effort.

[Wang at Tizen Dev Con]

In the past, he said, JavaScript was primarily used for incidental page features and other such low-value scripts, but today JavaScript applications implement major functionality, and HTML5-driven platforms like Tizen should offer developers a way to protect their code against theft and reverse-engineering. There are a number of techniques already in use that side-step the issue, such as separating the core functionality out into a server-side component, or building the business model around the value of user-contributed data. But these approaches do not work for "pure" client-side JavaScript apps.

Most app developers rely on an obfuscation system to "minify" JavaScript that they want to obscure from prying eyes. Obfuscation removes easily-understood function and variable names, and changes the formatting to make the code delivered difficult to understand. The most popular obfuscator, he noted, was Yahoo's YUI Compressor (which has other beneficial features like removing dead code), followed by the Google Closure Compiler, and UglifyJS. But obfuscators still produce JavaScript which is delivered to the client browser or web runtime and can ultimately be reverse-engineered.

The other major approach found in practice today is encryption, in which the app is downloaded by the device and placed in encrypted storage by the installer. Typically either the initial download is conducted over a secure channel (e.g., HTTPS) or the download is done in the clear and the installation program encrypts the app when it is installed. Both have weaknesses, Wang said. Someone can dump the HTTP connection if it is unencrypted and intercept the app, but a skilled attacker could also run a man-in-the-middle attack against HTTPS. Ultimately, he concluded, there is always dumping from memory, so encryption is an approach that will always get broken one way or another.

Although there are a few esoteric approaches out there, such as writing one's app in another language and then compiling it to JavaScript (a practice Wang said was out of scope for the talk, since he was addressing the concerns of JavaScript coders), most people simply "lawyer up" and apply licensing terms that forbid examining the app. That may not work in every jurisdiction, he said, and even when it does, it is expensive.

Wang's experiment takes a different approach entirely: compiling the JavaScript app to machine code, just like one does with a native app. The technique works by exploiting the difference between a platform's web runtime (which does not allow the user to inspect or save HTML content) and the web browser. A developer can work in JavaScript, then deploy the app as a binary. The platform would have to support this approach, both in the installer and in the web runtime, however, and developers would need to rebuild their apps for each HTML5 platform.

Wang has implemented the technique as an experimental feature of node-webkit, his app runtime derived from Chromium and Node.js. It compiles a JavaScript app using the V8 JavaScript engine's "snapshot" feature. Snapshots dump the engine's heap, and thus contain all of the created objects and Just-In-Time (JIT) compiled functions. In Chromium, snapshots are used to cache contexts for performance reasons; the node-webkit compiler simply saves them. The resulting binaries can then be executed by WebKit's JSC utility.

There are, naturally, limitations. V8 snapshots are created very early in the execution process, so some DOM objects (such as window) have not yet been created when the snapshot is taken. On the wiki entry for the feature, Wang suggests a few ways to work around this issue. A second limitation is that the snapshot tool will throw an error if the JavaScript app is too large; Wang suggests splitting the app up if this poses a practical problem. A third is that the resulting binary runs significantly slower than JavaScript executed in the runtime.

He has been exploring other techniques for extending the idea, such as using the Crankshaft optimizer. Crankshaft is an alternative to the JavaScript compiler currently used in V8. At the moment, using Crankshaft's compiler can generate code that runs faster, Wang said, but it takes significantly longer to compile, and it requires "training Crankshaft on your code."

Wang has defined an additional field for the package.json file that defines Tizen HTML5 applications; "snapshot" : "snapshot.bin" can be used to point to compiled JavaScript apps and test them with node-webkit. He is still in the process of working out the API required to connect JSC to the Tizen web runtime, however. The feature is not currently slated to become part of the official Tizen platform.
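
As a rough illustration (the application name and file names here are hypothetical), a node-webkit package.json using that field might look like the following, with "main" pointing at the app's HTML entry point as usual and "snapshot" pointing at the compiled code:

    {
      "name": "example-app",
      "main": "index.html",
      "snapshot": "snapshot.bin"
    }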

Obfuscating JavaScript by any means is a controversial subject. To many in the free software world, it is seen as a technique to prevent users from studying and modifying the software on their systems. Bradley Kuhn of the Software Freedom Conservancy lambasted it at SCALE 2013, for instance. Then again, obfuscation is not required to make a JavaScript app non-free; as the Free Software Foundation notes, licensing can do that alone. Still, it is likely that compiling JavaScript apps to machine code offers a tantalizing measure of protection to quite a few proprietary software vendors, beyond the attack-resistance of traditional obfuscation techniques.

Many users, of course, are purely pragmatic about mobile apps: they use what is available, free software or otherwise. But as the FSF points out, unobfuscated JavaScript, while it may be non-free, can still be read and modified. Perhaps the longer-term concern about obfuscation or compiling to machine code is that a device vendor could automate the technique on its mobile app store. But automated or manual, the prospect of building JavaScript compilation into Tizen did appear to ruffle several feathers at Tizen Dev Con; audience members asked about the project during the Q&A sections of several later talks. Nevertheless, for the foreseeable future, Wang's effort remains a side project of an experimental nature.

[The author wishes to thank the Linux Foundation for travel assistance to Tizen Dev Con.]

Comments (3 posted)

Brief items

PRISM quotes of the week

2. If I add a phone to my account, will those calls also be monitored?

Once again, the answer is good news. If you want to add a child or any other family member to your Verizon account, their phone calls—whom they called, when, and the duration of the call—will all be monitored by the United States government, at no additional cost.

— "US President Barack Obama" in a FAQ for Verizon customers

Knowing how the government spies on us is important. Not only because so much of it is illegal -- or, to be as charitable as possible, based on novel interpretations of the law -- but because we have a right to know. Democracy requires an informed citizenry in order to function properly, and transparency and accountability are essential parts of that. That means knowing what our government is doing to us, in our name. That means knowing that the government is operating within the constraints of the law. Otherwise, we're living in a police state.

We need whistle-blowers.

Bruce Schneier

Only one explanation seems logical. The government is afraid of us -- you and me. They're terrified (no pun intended) that if we even knew the most approximate ranges of how many requests they're making, we would suspect significant abuse of their investigatory powers.

In the absence of even this basic information, conspiracy theories have flourished, which incorrectly assume that the level of data being demanded from Web services is utterly unfettered and even higher than reality -- and the government's intransigence has diverted people's anger inappropriately to those Web services. A tidy state of affairs for the spooks and their political protectors.

Lauren Weinstein

Even assuming the U.S. government never abuses this data -- and there is no reason to assume that! -- why isn't the burgeoning trove more dangerous to keep than it is to foreswear? Can anyone persuasively argue that it's virtually impossible for a foreign power to ever gain access to it? Can anyone persuasively argue that if they did gain access to years of private phone records, email, private files, and other data on millions of Americans, it wouldn't be hugely damaging?

Think of all the things the ruling class never thought we'd find out about the War on Terrorism that we now know. Why isn't the creation of this data trove just the latest shortsighted action by national security officials who constantly overestimate how much of what they do can be kept secret? Suggested rule of thumb: Don't create a dataset of choice that you can't bear to have breached.

Conor Friedersdorf

Comments (15 posted)

New vulnerabilities

bzr: denial of service

Package(s):bzr CVE #(s):CVE-2013-2099 CVE-2013-2098
Created:June 7, 2013 Updated:September 10, 2013
Description:

From the Red Hat bug report:

A denial of service flaw was found in the way SSL module implementation of Python3, version 3 of the Python programming language (aka Python 3000), performed matching of the certificate's name in the case it contained many '*' wildcard characters. A remote attacker, able to obtain valid certificate with its name containing a lot of '*' wildcard characters could use this flaw to cause denial of service (excessive CPU consumption) by issuing request to validate such a certificate for / to an application using the Python's ssl.match_hostname() functionality.

Alerts:
Fedora FEDORA-2013-9628 2013-06-07
Fedora FEDORA-2013-9620 2013-06-07
Fedora FEDORA-2013-12414 2013-07-15
Fedora FEDORA-2013-12396 2013-07-15
Fedora FEDORA-2013-12421 2013-07-15
Fedora FEDORA-2013-13216 2013-07-26
Fedora FEDORA-2013-13140 2013-07-26
Fedora FEDORA-2013-13213 2013-07-26
Mageia MGASA-2013-0252 2013-08-22
Mandriva MDVSA-2013:229 2013-09-10
Ubuntu USN-1983-1 2013-10-01
Ubuntu USN-1984-1 2013-10-01
Ubuntu USN-1985-1 2013-10-01

Comments (2 posted)

cgit: directory traversal

Package(s):cgit CVE #(s):CVE-2013-2117
Created:June 6, 2013 Updated:July 17, 2013
Description:

From the Red Hat Bugzilla entry:

Today I found a nasty directory traversal:

http://somehost/?url=/somerepo/about/../../../../etc/passwd

[...] Cgit by default is not vulnerable to this, and the vulnerability only exists when a user has configured cgit to use a readme file from a filesystem filepath instead of from the git repo itself. Until a release is made, administrators are urged to disable reading the readme file from a filepath, if currently enabled.

Alerts:
Fedora FEDORA-2013-9522 2013-06-06
Fedora FEDORA-2013-9498 2013-06-06
openSUSE openSUSE-SU-2013:1207-1 2013-07-17

Comments (none posted)

chromium-browser: multiple vulnerabilities

Package(s):chromium-browser CVE #(s):CVE-2013-2855 CVE-2013-2856 CVE-2013-2857 CVE-2013-2858 CVE-2013-2859 CVE-2013-2860 CVE-2013-2861 CVE-2013-2862 CVE-2013-2863 CVE-2013-2865
Created:June 11, 2013 Updated:June 12, 2013
Description: From the Debian advisory:

CVE-2013-2855: The Developer Tools API in Chromium before 27.0.1453.110 allows remote attackers to cause a denial of service (memory corruption) or possibly have unspecified other impact via unknown vectors.

CVE-2013-2856: Use-after-free vulnerability in Chromium before 27.0.1453.110 allows remote attackers to cause a denial of service or possibly have unspecified other impact via vectors related to the handling of input.

CVE-2013-2857: Use-after-free vulnerability in Chromium before 27.0.1453.110 allows remote attackers to cause a denial of service or possibly have unspecified other impact via vectors related to the handling of images.

CVE-2013-2858: Use-after-free vulnerability in the HTML5 Audio implementation in Chromium before 27.0.1453.110 allows remote attackers to cause a denial of service or possibly have unspecified other impact via unknown vectors.

CVE-2013-2859: Chromium before 27.0.1453.110 allows remote attackers to bypass the Same Origin Policy and trigger namespace pollution via unspecified vectors.

CVE-2013-2860: Use-after-free vulnerability in Chromium before 27.0.1453.110 allows remote attackers to cause a denial of service or possibly have unspecified other impact via vectors involving access to a database API by a worker process.

CVE-2013-2861: Use-after-free vulnerability in the SVG implementation in Chromium before 27.0.1453.110 allows remote attackers to cause a denial of service or possibly have unspecified other impact via unknown vectors.

CVE-2013-2862: Skia, as used in Chromium before 27.0.1453.110, does not properly handle GPU acceleration, which allows remote attackers to cause a denial of service (memory corruption) or possibly have unspecified other impact via unknown vectors.

CVE-2013-2863: Chromium before 27.0.1453.110 does not properly handle SSL sockets, which allows remote attackers to execute arbitrary code or cause a denial of service (memory corruption) via unspecified vectors.

CVE-2013-2865: Multiple unspecified vulnerabilities in Chromium before 27.0.1453.110 allow attackers to cause a denial of service or possibly have other impact via unknown vectors.

Alerts:
Debian DSA-2706-1 2013-06-10
Mageia MGASA-2013-0194 2013-07-01
Gentoo 201309-16 2013-09-24

Comments (none posted)

kde: weak passwords generated by PasteMacroExpander

Package(s):kde CVE #(s):CVE-2013-2120
Created:June 12, 2013 Updated:June 17, 2013
Description: From the Red Hat bugzilla:

A security flaw was found in the way PasteMacroExpander of paste applet of kdeplasma-addons, a suite of additional plasmoids for KDE desktop environment, performed password generation / derivation for user provided string. An attacker could use this flaw to obtain plaintext form of such a password (possibly leading to their subsequent ability for unauthorized access to a service / resource, intended to be protected by such a password).

Alerts:
Fedora FEDORA-2013-10130 2013-06-12
Fedora FEDORA-2013-10182 2013-06-16

Comments (6 posted)

kernel: multiple vulnerabilities

Package(s):kernel CVE #(s):CVE-2013-1935 CVE-2013-1943 CVE-2013-2017
Created:June 11, 2013 Updated:June 13, 2013
Description: From the Red Hat advisory:

* A flaw was found in the way KVM (Kernel-based Virtual Machine) initialized a guest's registered pv_eoi (paravirtualized end-of-interrupt) indication flag when entering the guest. An unprivileged guest user could potentially use this flaw to crash the host. (CVE-2013-1935, Important)

* A missing sanity check was found in the kvm_set_memory_region() function in KVM, allowing a user-space process to register memory regions pointing to the kernel address space. A local, unprivileged user could use this flaw to escalate their privileges. (CVE-2013-1943, Important)

* A double free flaw was found in the Linux kernel's Virtual Ethernet Tunnel driver (veth). A remote attacker could possibly use this flaw to crash a target system. (CVE-2013-2017, Moderate)

Red Hat would like to thank IBM for reporting the CVE-2013-1935 issue and Atzm WATANABE of Stratosphere Inc. for reporting the CVE-2013-2017 issue. The CVE-2013-1943 issue was discovered by Michael S. Tsirkin of Red Hat.

Alerts:
Red Hat RHSA-2013:0911-01 2013-06-10
Oracle ELSA-2013-0911 2013-06-11
CentOS CESA-2013:0911 2013-06-12
Oracle ELSA-2013-2534 2013-06-12
Oracle ELSA-2013-2534 2013-06-12
Scientific Linux SL-kern-20130612 2013-06-12
Ubuntu USN-1940-1 2013-09-06
Ubuntu USN-1939-1 2013-09-06

Comments (none posted)

libraw: code execution

Package(s):libraw CVE #(s):CVE-2013-2126
Created:June 7, 2013 Updated:July 31, 2013
Description:

From the Secunia advisory:

Two vulnerabilities have been reported in LibRaw, which can be exploited by malicious people to potentially compromise an application using the library.

1) A double-free error exists when handling damaged full-color images within Foveon and sRAW files.

2) An error during exposure correction can be exploited to cause a buffer overflow.

Successful exploitation may allow execution of arbitrary code.

Alerts:
Mageia MGASA-2013-0167 2013-06-06
Fedora FEDORA-2013-9773 2013-06-11
Fedora FEDORA-2013-9798 2013-06-11
Ubuntu USN-1884-1 2013-06-18
Ubuntu USN-1885-1 2013-06-18
openSUSE openSUSE-SU-2013:1083-1 2013-06-26
openSUSE openSUSE-SU-2013:1085-1 2013-06-26
openSUSE openSUSE-SU-2013:1168-1 2013-07-10
Mageia MGASA-2013-0223 2013-07-21
Mageia MGASA-2013-0219 2013-07-21
Fedora FEDORA-2013-13112 2013-07-24
Fedora FEDORA-2013-13038 2013-07-24
Fedora FEDORA-2013-13499 2013-07-30
Mageia MGASA-2013-0269 2013-09-01
Gentoo 201309-09 2013-09-15

Comments (none posted)

mediawiki: insecure file uploading

Package(s):mediawiki CVE #(s):CVE-2013-2114
Created:June 7, 2013 Updated:July 22, 2013
Description:

From the Red Hat bug report:

MediaWiki user Marco discovered that security checks for file uploads were not being run when the file was uploaded in chunks through the API. This option has been available to users who can upload files since MediaWiki 1.19.

Alerts:
Fedora FEDORA-2013-9622 2013-06-07
Fedora FEDORA-2013-9616 2013-06-07
Mageia MGASA-2013-0221 2013-07-21
Mageia MGASA-2013-0226 2013-07-21

Comments (none posted)

mod_security: denial of service

Package(s):mod_security CVE #(s):CVE-2013-2765
Created:June 6, 2013 Updated:July 2, 2013
Description:

From the Red Hat Bugzilla entry:

Fixed Remote Null Pointer DeReference (CVE-2013-2765). When forceRequestBodyVariable action is triggered and an unknown Content-Type is used, mod_security will crash trying to manipulate msr->msc_reqbody_chunks->elts; however, msr->msc_reqbody_chunks is NULL. (Thanks Younes JAAIDI)

Alerts:
Fedora FEDORA-2013-9518 2013-06-06
Fedora FEDORA-2013-9519 2013-06-06
Mageia MGASA-2013-0179 2013-06-26
Mandriva MDVSA-2013:187 2013-07-02
openSUSE openSUSE-SU-2013:1331-1 2013-08-14
openSUSE openSUSE-SU-2013:1336-1 2013-08-14
openSUSE openSUSE-SU-2013:1342-1 2013-08-14

Comments (none posted)

PackageKit: only allow patches for regular updates

Package(s):PackageKit CVE #(s):CVE-2013-1764
Created:June 10, 2013 Updated:June 12, 2013
Description: From the openSUSE advisory:

The PackageKit zypp backend was fixed to only allow patches to be updated. Otherwise a regular user could install new packages or even downgrade older packages to ones with security problems.

Alerts:
openSUSE openSUSE-SU-2013:0889-1 2013-06-10

Comments (none posted)

php: code execution

Package(s):php CVE #(s):CVE-2013-2110
Created:June 11, 2013 Updated:June 24, 2013
Description: From the Slackware advisory:

A heap-based overflow in the quoted_printable_encode() function could be used by a remote attacker to crash PHP or execute code as the 'apache' user.

Alerts:
Slackware SSA:2013-161-01 2013-06-10
Ubuntu USN-1872-1 2013-06-11
Mageia MGASA-2013-0172 2013-06-18
Mageia MGASA-2013-0176 2013-06-19
Fedora FEDORA-2013-10233 2013-06-23

Comments (none posted)

pki-tps: two vulnerabilities

Package(s):pki-tps CVE #(s):CVE-2013-1885 CVE-2013-1886
Created:June 6, 2013 Updated:June 12, 2013
Description:

From the Red Hat bugzilla entries [1, 2]:

CVE-2013-1885: It was reported that Certificate System suffers from XSS flaws in the /tus/ and /tus/tus/ URLs, such as:

GET /tus/tus/%22%2b%61%6c%65%72%74%28%34%38%32%36%37%29%2b%22

or

GET /tus/%22%2b%61%6c%65%72%74%28%36%31%34%35%32%29%2b%22

which will in turn output something like:

<!--
var uriBase = "/tus/"+alert(85384)+";
var userid = "admin";

This was reported against Certificate System 8.1 and may also affect Dogtag 9 and 10.

CVE-2013-1886: It was reported that Certificate System suffers from a format string injection flaw when viewing certificates. This could allow a remote attacker to crash the Certificate System server or, possibly, execute arbitrary code with the privileges of the user [running] the service (typically run as an unprivileged user, such as pkiuser).

Alerts:
Fedora FEDORA-2013-9258 2013-06-06

Comments (none posted)

pymongo: denial of service

Package(s):pymongo CVE #(s):CVE-2013-2132
Created:June 11, 2013 Updated:July 8, 2013
Description: From the Debian advisory:

Jibbers McGee discovered that pymongo, a high-performance schema-free document-oriented data store, is prone to a denial-of-service vulnerability. An attacker can remotely trigger a NULL pointer dereference causing MongoDB to crash.

Alerts:
Debian DSA-2705-1 2013-06-10
openSUSE openSUSE-SU-2013:1064-1 2013-06-21
Ubuntu USN-1897-1 2013-07-03
Mageia MGASA-2013-0201 2013-07-06
Red Hat RHSA-2013:1170-01 2013-08-21

Comments (none posted)

rubygem-passenger: insecure temp files

Package(s):rubygem-passenger CVE #(s):CVE-2013-2119
Created:June 11, 2013 Updated:July 10, 2013
Description: From the Red Hat bugzilla:

Michael Scherer reported that the passenger ruby gem, when used in standalone mode, does not use temporary files in a secure manner. In the lib/phusion_passenger/standalone/main.rb's create_nginx_controller function, passenger creates an nginx configuration file insecurely and starts nginx with that configuration file:

       @temp_dir        = "/tmp/passenger-standalone.#{$$}"
       @config_filename = "#{@temp_dir}/config"
If a local attacker were able to create a temporary directory that passenger uses and supply a custom nginx configuration file they could start an nginx instance with their own configuration file. This could result in a denial of service condition for a legitimate service or, if passenger were executed as root (in order to have nginx listen on port 80, for instance), this could lead to a local root compromise.
Alerts:
Fedora FEDORA-2013-9789 2013-06-11
Fedora FEDORA-2013-9771 2013-06-11
Mageia MGASA-2013-0205 2013-07-09
Red Hat RHSA-2013:1136-01 2013-08-05

Comments (none posted)

samba: multiple vulnerabilities

Package(s):samba CVE #(s):
Created:June 10, 2013 Updated:June 12, 2013
Description: From the openSUSE advisory:

  • Add support for PFC_FLAG_OBJECT_UUID when parsing packets; (bso#9382).
  • Fix "guest ok", "force user" and "force group" for guest users; (bso#9746).
  • Fix 'map untrusted to domain' with NTLMv2; (bso#9817).
  • Fix crash bug in Winbind; (bso#9854).
  • Fix panic in nt_printer_publish_ads; (bso#9830).
Alerts:
openSUSE openSUSE-SU-2013:0916-1 2013-06-10

Comments (none posted)

subversion: denial of service

Package(s):subversion CVE #(s):CVE-2013-1968 CVE-2013-2112
Created:June 10, 2013 Updated:June 28, 2013
Description: From the Debian advisory:

CVE-2013-1968: Subversion repositories with the FSFS repository data store format can be corrupted by newline characters in filenames. A remote attacker with a malicious client could use this flaw to disrupt the service for other users using that repository.

CVE-2013-2112: Subversion's svnserve server process may exit when an incoming TCP connection is closed early in the connection process. A remote attacker can cause svnserve to exit and thus deny service to users of the server.

Alerts:
Debian DSA-2703-1 2013-06-09
Mandriva MDVSA-2013:173 2013-06-13
openSUSE openSUSE-SU-2013:1006-1 2013-06-14
Mageia MGASA-2013-0175 2013-06-19
Ubuntu USN-1893-1 2013-06-27
openSUSE openSUSE-SU-2013:1139-1 2013-07-04
Fedora FEDORA-2013-13672 2013-08-15
Gentoo 201309-11 2013-09-23

Comments (none posted)

wireshark: denial of service

Package(s):wireshark CVE #(s):CVE-2013-3561
Created:June 7, 2013 Updated:June 12, 2013
Description:

From the CVE database entry:

Multiple integer overflows in Wireshark 1.8.x before 1.8.7 allow remote attackers to cause a denial of service (loop or application crash) via a malformed packet, related to a crash of the Websocket dissector, an infinite loop in the MySQL dissector, and a large loop in the ETCH dissector.

Alerts:
Mageia MGASA-2013-0168 2013-06-06
openSUSE openSUSE-SU-2013:1084-1 2013-06-26
openSUSE openSUSE-SU-2013:1086-1 2013-06-26
Gentoo GLSA 201308-05:02 2013-08-30

Comments (none posted)

wireshark: multiple vulnerabilities

Package(s):wireshark CVE #(s):CVE-2013-4074 CVE-2013-4081 CVE-2013-4083
Created:June 12, 2013 Updated:September 30, 2013
Description: From the CVE entries:

The dissect_capwap_data function in epan/dissectors/packet-capwap.c in the CAPWAP dissector in Wireshark 1.6.x before 1.6.16 and 1.8.x before 1.8.8 incorrectly uses a -1 data value to represent an error condition, which allows remote attackers to cause a denial of service (application crash) via a crafted packet. (CVE-2013-4074)

The http_payload_subdissector function in epan/dissectors/packet-http.c in the HTTP dissector in Wireshark 1.6.x before 1.6.16 and 1.8.x before 1.8.8 does not properly determine when to use a recursive approach, which allows remote attackers to cause a denial of service (stack consumption) via a crafted packet. (CVE-2013-4081)

The dissect_pft function in epan/dissectors/packet-dcp-etsi.c in the DCP ETSI dissector in Wireshark 1.6.x before 1.6.16, 1.8.x before 1.8.8, and 1.10.0 does not validate a certain fragment length value, which allows remote attackers to cause a denial of service (application crash) via a crafted packet. (CVE-2013-4083)

Alerts:
Mandriva MDVSA-2013:172 2013-06-12
Debian DSA-2709-1 2013-06-17
Mageia MGASA-2013-0180 2013-06-26
Mageia MGASA-2013-0181 2013-06-26
Gentoo 201308-05 2013-08-28
Gentoo GLSA 201308-05:02 2013-08-30
Fedora FEDORA-2013-17661 2013-09-28

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 3.10-rc5, released on June 8. In the announcement Linus Torvalds made it clear he was not completely pleased with the patches he was getting this late in the cycle: "Guys, guys, guys. I'm going to have to start cursing again unless you stop sending me non-critical stuff. So the next pull request I get that has "cleanups" or just pointless churn, I'm going to call you guys out on, and try to come up with new ways to insult you, your mother, and your deceased pet hamster."

Stable updates: Five stable kernels were released on June 7: 3.9.5, 3.4.48, and 3.0.81 by Greg Kroah-Hartman; from Canonical's extended stable trees Kamal Mostafa released 3.8.13.2 and Luis Henriques released 3.5.7.14.

The 3.9.6, 3.4.49, and 3.0.82 stable kernels are currently under review. They can be expected June 13 or soon after.

Comments (2 posted)

Quotes of the week

If companies are going to go off and invent the square wheel, and that makes *them* suffer the loss of being able to merge back into the mainline kernel, thereby making *their* job of moving forward with their kernel versions much harder, then yes, we *are* happy.
Russell King

Randomly not being able to connect AT ALL to wireless networks is not a valid "rate control" model.
Linus Torvalds

First of all, we really need to stop thinking about choosing [CPU] frequency (at least for x86). that concept basically died for x86 6 years ago.
Arjan van de Ven

Comments (5 posted)

Kernel development news

Skiplists II: API and benchmarks

June 12, 2013

This article was contributed by Andrew Shewmaker

A skiplist is composed of a hierarchy of ordered linked lists, where each higher level contains a sparser subset of the list below it. In part one, I described the basic idea of a skiplist, a little history of various attempts to use it in the Linux kernel, and Chris Mason's new cache-friendly skiplist for index ranges. This article will continue with a description of the current state of Chris's skiplist API and his future plans for it. I'll also discuss the performance of skiplists and rbtrees in a simple RAM test, as well as Chris's more interesting IOMMU comparison.

Skiplist API

A skiplist can be declared and initialized to an empty state with lines like:

    #include <linux/skiplist.h>

    struct sl_list list;
    sl_init_list(&list, GFP_ATOMIC);

Once the list exists, the next step is to populate it with data. As is shown in the data structure diagram, each structure to be placed in the list should embed an sl_slot structure; pointers to this embedded structure are used with the skiplist API.

Insertion into the skiplist requires the programmer to get a "preload token" — skiplist_preload() ensures that the necessary memory is available and disables preemption. With the token in hand, it's possible to actually insert the item, then re-enable preemption. Preloading helps avoid the need for atomic allocations and also minimizes the time spent inside a leaf's lock during insertion. The preload function takes a pointer to a skiplist and a "get free page" mask describing the type of allocation to be performed, and it returns an integer token to be used later:

    int skiplist_preload(struct sl_list *list, gfp_t gfp_mask);

Note that preemption is disabled by skiplist_preload() and must not be re-enabled during insertion because the function is holding an RCU read lock and working with per-CPU data structures.

The function that actually adds an item to the list, skiplist_insert(), is called with that list, a slot to be inserted, and a token returned by skiplist_preload():

    int skiplist_insert(struct sl_list *list, struct sl_slot *slot, 
			int preload_token);

Here's an example insertion into a skiplist:

    int preload_token, ret;
    preload_token = skiplist_preload(skiplist, GFP_KERNEL);

    if (preload_token < 0)
    	return preload_token;

    ret = skiplist_insert(skiplist, slot, preload_token);
    preempt_enable();

Deletion only requires one function call, though it is implemented in two phases if a leaf becomes empty. In that case, the leaf is marked "dead," then it is unlinked from the skiplist level by level. In either case, it returns the slot pointer of what it deleted from the list.

    struct sl_slot *skiplist_delete(struct sl_list *list, unsigned long key,
           	       		    unsigned long size);
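
A deletion might look roughly like the sketch below, which assumes — as in the insertion example above — that a hypothetical mystruct embeds the slot, and that skiplist_delete() returns NULL when no matching range is found; the free_mystruct_mem() helper is likewise purely illustrative:

    struct sl_slot *slot;
    struct mystruct *mystruct;

    slot = skiplist_delete(skiplist, key, size);
    if (!slot)
    	return -ENOENT;	/* assumes a NULL return means "not found" */

    /* recover and free the structure that embedded the deleted slot */
    mystruct = sl_slot_entry(slot, struct mystruct, slot);
    free_mystruct_mem(mystruct);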

Adding or removing a key to/from an existing leaf is simple and only requires a lock at the leaf. However, if a leaf is created or destroyed, then more locking is required. Leaves with higher levels require locks to be taken on neighboring nodes all the way down to level zero so that everything can be re-linked without having a neighbor deleted out from under them. The list of affected leaves is tracked in a temporary sl_node list referred to as a cursor. (Chris is reworking his code to get rid of cursors.) The best-case scenario is a modification at level zero where only a couple of locks are required. Both the preallocation and the insertion code are biased in favor of creating a level-zero leaf. Regardless, the locking is only required for a small window of time.

Unlike an rbtree, rebalancing of the skiplist is not required, even when simultaneous insertions and deletions are being performed in different parts of the skiplist.

A specialized insertion function is provided that finds a free index range in the skiplist that is aligned and of a given size. This isn't required by filesystems, but Chris implemented it so that he could directly compare rbtrees to skiplists in the IOMMU code. The IOMMU requires this functionality because each PCIE device's domain requires an aligned range of memory addresses.

Calls to skiplist_insert_hole() take a hint of where a hole might be inserted, and must be retried with a new hint if the return value is -EAGAIN. That error return happens when simultaneous holes are being created and the one you hinted at was good, but was stolen before you could use it. On successful insertion, the slot passed in is updated with the location of the hole.

    int skiplist_insert_hole(struct sl_list *list, unsigned long hint,
    			     unsigned long limit, unsigned long size, unsigned long align,
    			     struct sl_slot *slot, gfp_t gfp_mask);
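
A caller might use it roughly as in the following sketch; pick_new_hint() is purely hypothetical — how a fresh hint is chosen after a collision is up to the caller — and the gfp mask and error handling are illustrative only:

    int ret;

    do {
    	ret = skiplist_insert_hole(skiplist, hint, limit, size, align,
    				   slot, GFP_KERNEL);
    	if (ret == -EAGAIN)
    		/* the hinted hole was stolen by a concurrent insertion */
    		hint = pick_new_hint(skiplist, size, align);
    } while (ret == -EAGAIN);

    if (ret)
    	return ret;
    /* on success, the slot passed in describes the reserved hole */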

Tearing down a whole skiplist requires a fair amount of work. First free the structures embedding the slots of each leaf, then use sl_free_leaf(), and finally, zero the pointers in the head of the skiplist. Wrappers around container_of() for obtaining the leaf embedding a node or the structure embedding a slot are provided by sl_entry(ptr) and sl_slot_entry(ptr, type, member), respectively. Comments in the code indicate future plans to add skiplist zeroing helpers, but for now you must roll your own as Chris did for his IOMMU patch.

Here's a generic example of destroying a skiplist:

    struct sl_node *p;
    struct sl_leaf *leaf;
    struct sl_slot *slot;
    struct mystruct *mystruct;

    sl_lock_node(skiplist->head);
    p = skiplist->head->ptrs[0].next;
    while (p) {
	    leaf = sl_entry(p);
	    for (i = 0; i < leaf->nr; i++) {
		    slot = leaf->ptrs[i];
		    mystruct = sl_slot_entry(slot, struct mystruct, slot);
		    free_mystruct_mem(mystruct);
	    }
	    p = leaf->node.ptrs[0].next;
	    sl_free_leaf(leaf);
    }

    memset(skiplist->head->ptrs, 0, sl_node_size(SKIP_MAXLEVEL));
    sl_unlock_node(skiplist->head);

Chris considered including slot iterators equivalent to rb_next() and rb_prev(), but decided against it because of the overhead involved in validating a slot with each call. Instead, skiplist_next() and skiplist_prev() are leaf iterators that allow a caller to more efficiently operate on slots in bulk. Chris hasn't posted the updated API yet, but it seems likely that the iterators will resemble the existing sl_next_leaf() and friends.

Calls to sl_first_leaf() and sl_last_leaf() return pointers to the first and last entries of the skiplist. The sl_next_leaf() call is a little different in that you must provide it with an sl_node (embedded in your current leaf), and since each node potentially has many next entries, you must also provide the level l you want to traverse.

    struct sl_leaf *sl_first_leaf(struct sl_list *list);
    struct sl_leaf *sl_last_leaf(struct sl_list *list);
    struct sl_leaf *sl_next_leaf(struct sl_list *list, struct sl_node *p, int l);
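
Combined with the slot helpers shown in the teardown example above, a bulk walk over every slot at level zero might look roughly like this sketch; the necessary locking (or the RCU protection discussed below) is omitted for brevity:

    struct sl_leaf *leaf;
    struct sl_slot *slot;
    struct mystruct *mystruct;
    int i;

    for (leaf = sl_first_leaf(skiplist); leaf;
         leaf = sl_next_leaf(skiplist, &leaf->node, 0)) {
    	for (i = 0; i < leaf->nr; i++) {
    		slot = leaf->ptrs[i];
    		mystruct = sl_slot_entry(slot, struct mystruct, slot);
    		/* operate on each mystruct in key order */
    	}
    }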

Since this skiplist implementation focuses on index ranges (or extents) defined by key and size parameters, it can provide search functions. This is in contrast to rbtrees—they are more diverse, so users must roll their own search functions. Each of the skiplist search functions needs to be passed a pointer to the skiplist, the key you are looking for, and the slot size (the number of extents in a leaf). If successful, they return a pointer to the slot matching the key.

    struct sl_slot *skiplist_lookup(struct sl_list *list, unsigned long key,
				    unsigned long size);
    struct sl_slot *skiplist_lookup_rcu(struct sl_list *list, unsigned long key,
    				        unsigned long size);

The first, skiplist_lookup(), is appropriate for when a skiplist is experiencing high read/write contention. It handles all the locking for you. It protects the skiplist with read-copy-update (RCU) while it finds the correct leaf and then it protects the leaf with a spinlock during a binary search to find the slot. If no slot corresponds to the key, then a NULL pointer is returned.

If skiplist contention is low or you need more control, then use the second variant. Before calling skiplist_lookup_rcu(), you must call rcu_read_lock() and you must take care of details such as reference counting yourself. The search for the leaf uses the same helper function as skiplist_lookup(), but the leaf spinlock is not held. Instead, it depends on the skiplist's RCU read lock being held to also protect the slots in a leaf while it performs a sequential search. This search is sequential because Chris does not do the copy part of RCU. He does order the operations of insertion/deletion to try to make the sequential search safe, and that should usually work. However, it might not return the slot of interest, so it is the responsibility of the caller to verify the key of the returned slot, and then call skiplist_lookup_rcu() again if the returned slot's key doesn't match the key being searched for.
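
Under those rules, a low-contention lookup might be sketched as below; the slot's key field and the point at which a reference should be taken are assumptions based on the description above rather than part of the posted API:

    struct sl_slot *slot;

    rcu_read_lock();
    do {
    	slot = skiplist_lookup_rcu(skiplist, key, size);
    	/* retry if a racing insert/delete handed back the wrong slot */
    } while (slot && slot->key != key);

    if (slot) {
    	/* take a reference on the containing object here,
    	   before the RCU read lock is dropped */
    }
    rcu_read_unlock();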

Chris elaborated on his future plans for the API in a private email:

In terms of changes coming to the patches, the biggest will be in the insert code. Right now skiplist_insert does the search, cursor maintenance, and the insert, but that won't work for XFS because they need more control over the EEXIST condition.

It'll get split out into search and insert steps the caller can control, and you'll be able to call insert with just a locked leaf from any level...

The searching API will also improve, returning both the leaf and the slot. This allows skiplist versions of rb_next() and rb_prev().

The skiplist code also indicates that there is work to be done to make lockdep understand Chris's skiplist locking. It needs to be taught that holding multiple locks on the same level of a skiplist is allowed as long as they are taken left to right.

Testing

In addition to the IOMMU comparison between rbtrees and skiplists that Chris posted numbers for, his patch also includes a simple RAM-only comparison in the form of a kernel module called skiplist_test. I tested 100,000 items for 100,000 rounds with multiple numbers of threads.

This table shows the results:

ADT            Threads   Fill time (ms)   Check time (ms)   Delete time (ms)   Avg. thread time (s)
rbtree            1            37               9                 12                  0.752
skiplist-rcu      1            18              15                 23                  2.615
rbtree            2            36               8                 12                  2.766
skiplist-rcu      2            19              19                 27                  2.713
rbtree            4            36              11                 10                  6.660
skiplist-rcu      4            23              24                 21                  3.161

These results show skiplists beating rbtrees in fill time, but losing on check and delete times. The skiplist average thread time is only slightly better with two threads, and beats rbtree soundly with four threads (they take half the time). However, rbtree wins the single threaded case, which surprises Chris because it doesn't match what he sees in user-space testing. He told me, "Most of the difference is the cost of calling spin_lock (even single threaded)."

The more interesting numbers are from Chris's IOMMU comparison. Even though he is mostly interested in using skiplists for Btrfs extents, he chose to use the IOMMU because it is easier to isolate the performance of the two data structures, which makes it both easier for non-Btrfs people to understand and more meaningful to them. He also says, "... with the IOMMU, it is trivial to consume 100% system time on the rbtree lock." The rbtree lock is, in effect, a global lock held once at the start and once at the end of an IO.

Chris kept the basic structure of the IOMMU code so that he could compare skiplists to rbtrees. He was not trying to design a better IOMMU that looked for free ranges of addresses differently or fix the IOMMU contention, though he told me he would work with David Woodhouse on a proper solution that tracks free extents later this year.

His benchmarks were run on a single socket server with two SSD cards. He used a few FIO jobs doing relatively large (20MB) asynchronous/direct IOs with 16 concurrent threads and 10 pending IOs each (160 total). Here are his results for streaming and random writes:

Streaming writes
IOMMU off  2,575MB/s
skiplist   1,715MB/s
rbtree     1,659MB/s

Not a huge improvement, but the CPU time was lower.

[...]

16 threads, iodepth 10, 20MB random writes
IOMMU off  2,548MB/s
skiplist   1,649MB/s
rbtree        33MB/s

The existing rbtree-based IOMMU slows streaming writes down to 64.4% of the maximum, and the skiplist's throughput is slightly better at 66.6% while using less CPU time. Evidently the skiplist's advantages in concurrency and in maintaining a balanced overall structure only give it a modest advantage in the streaming write case. However, random writes cause rbtree performance to achieve only 1.3% of the maximum throughput. In this case, a skiplist fares much better, dropping only to 64.7% of the maximum, because different threads can hold locks simultaneously while in different parts of the skiplist and it doesn't need to go through a costly rebalancing operation like the rbtree.

16 threads, iodepth 10, 20MB random reads
IOMMU off  2,861MB/s (mostly idle)
skiplist   2,484MB/s (100% system time)
rbtree        99MB/s (100% system time)

... lowering the thread count did improve the rbtree performance, but the
best I could do was around 300MB/s ...

Reads are easier than writes, and we could expect streaming read results to all be close and relatively uninteresting. Certainly both the rbtree and skiplist do better at random reads than random writes. In fact, the skiplist achieves higher throughput for random reads than it does for streaming writes although it has to work hard to do so. And in case anyone thought the thread count was particularly unfair for rbtree in these tests, Chris points out that the best he got for random IOs with rbtree was around 300MB/s. That's still only 10% of the maximum throughput. Furthermore, Chris noted that all of the CPU time spent in the skiplist was in skiplist_insert_hole(), which isn't optimized.

In a recent discussion on the Linux filesystems mailing list, Mathieu Desnoyers proposed another data structure that he is calling RCU Judy arrays. They can't be compared with skiplists just yet since the Judy arrays are only implemented in user space so far, but the competition between the two ideas should improve them both.

Even though there are plenty of opportunities for refinement, this is a promising start for a cache-friendly skiplist for the Linux kernel. It should provide better performance for any subsystem that has high levels of contention between concurrent accesses of its rbtrees: various filesystem indexes, virtual memory areas (VMAs), the high-resolution timer code, etc. CPU schedulers will probably not see any benefit from skiplists because only one thread is making the scheduling decision, but perhaps multiqueue schedulers for the network or block layer might in the case where they have one queue per NUMA node.

Comments (3 posted)

Plans for hot adding and removing memory

By Jake Edge
June 12, 2013
LinuxCon Japan 2013

At LinuxCon Japan, Yasuaki Ishimatsu of Fujitsu talked about the status of memory hotplug, with a focus on what still needs to be done to fully support both hot adding and hot removing memory. If a memory device is broken in a laptop or desktop, you can just replace that memory, but for servers, especially ones that need to stay running, it is more difficult. In addition, having a way to add and remove memory would allow for dynamic reconfiguration on systems where the hardware has been partitioned into two or more virtual machines.

The focus of the memory hotplug work is on both scenarios: broken memory hardware and dynamic reconfiguration. Memory hotplug will be supported in KVM, Ishimatsu said. It is currently supported by several operating systems, but Linux does not completely support it yet. Fixing that is the focus of this work.

There are two phases to memory hotplug: physically adding or removing memory (hot add or hot remove) and logically changing the amount of memory available to the system (onlining or offlining memory). Both phases have to be completed before Linux can use any new memory, and taking the memory offline (so that Linux is no longer using it) is required before it can be removed.

The memory management subsystem manages physical memory by using two structures, he said. The page tables hold a direct mapping for virtual to physical addresses. The virtual memory map manages page structures. In order to offline memory, any data needs to be moved out of the memory and those data structures need to be updated. Likewise, when adding memory, new page table and virtual memory map entries must be added.

Pages are managed in zones and, when using the sparse memory model that is needed for memory hotplug systems, zones are broken up into sections that are 128M in size. Sections can be switched from online to offline and vice versa using the /sys/devices/system/memory/memoryX/state file. By echoing offline or online into that file, the pages in that section have their state changed to unusable or usable respectively.
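
As a minimal illustration of that interface (the section number here is arbitrary, and root privileges are required), a small user-space program could take a section offline by writing to that file:

    #include <stdio.h>

    int main(void)
    {
    	/* memory32 is just an example; pick a real section from sysfs */
    	FILE *f = fopen("/sys/devices/system/memory/memory32/state", "w");

    	if (!f) {
    		perror("memory32/state");
    		return 1;
    	}
    	fprintf(f, "offline\n");
    	return fclose(f) ? 1 : 0;
    }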

In the 3.2 kernel, hot adding memory and onlining it were fully supported. Offlining memory was supported with limitations, and hot removing it was not supported at all. Work started in July 2012 to remove the offline limitations and to add support for hot remove, Ishimatsu said.

The work for hot remove has been merged for the 3.9 kernel. It will invalidate page table and virtual memory map entries that correspond to the memory being removed. But, since the memory must be taken offline before it is removed, the limitations on memory offline still make it impossible to remove arbitrary memory hardware from the system.

When memory that is to be offlined has data in it, that data is migrated to other memory in the system. But the only pages that are migratable this way are the page cache and anonymous pages, which are known as "movable" pages. If the memory contains non-movable memory, which Ishimatsu called "kernel memory", the section cannot be offlined.

There are two ways to handle that problem that are being considered. The first is to support moving kernel memory when offlining pages that contain it. The advantages to that are that all memory can be offlined and there is no additional performance impact for NUMA systems since there are no restrictions on the types of allocations that can be made. On the downside, though, the kernel physical to virtual address relationship will need to change completely. The other alternative is to make all of a node's memory movable, which would reuse the existing movable memory feature, but means that only page cache and anonymous pages can be stored there, which will impact the performance of that NUMA node.

Ishimatsu said that he prefers the first solution personally, but, as a first step they are implementing the second: creating a node that consists only of movable memory. Linux has the idea of a movable zone (i.e. ZONE_MOVABLE), but zones of that type are not created automatically. If a node consists only of movable memory, all of it can be migrated elsewhere so that the node can be taken offline.

A new boot option, movablecore=acpi, is under development that will use the memory affinity structure in the ACPI static resource affinity table (SRAT) to choose which nodes will be constructed of movable memory. The existing use for movablecore allows setting aside a certain amount of memory that will be movable in the system, but it spreads it evenly across all of the nodes rather than concentrating it only on the nodes of interest. The "hotpluggable" bit for a node in the SRAT will be used to choose the target nodes in the new mode.

Using the online_movable flag to the sysfs memory state file (rather than just online) allows an administrator to tell the system to make that memory movable. Without that, the onlined memory is treated as ZONE_NORMAL, so it may contain kernel memory and thus not be able to be offlined. The online_movable feature was merged for 3.8. That reduces the limitations on taking memory offline, but there is still work to do.

Beyond adding the movablecore=acpi boot option (and possibly a vm.hotadd_memory_treat_as_movable sysctl), there are some other plans. Finding a way to put the page tables and virtual memory map into the hot-added memory is something Ishimatsu would like to see, because it would help performance on that node, but would not allow that memory to be offlined unless those data structures can be moved. He is thinking about solutions for that. Migrating vmalloc() data to other nodes when offlining a node is another feature under consideration.

Eventually, being able to migrate any kernel memory out of a node is something he would like to see, but solutions to that problem are still somewhat elusive. He encouraged those in attendance to participate in the discussions and to help find solutions for these problems.

[I would like to thank the Linux Foundation for travel assistance to Tokyo for LinuxCon Japan.]

Comments (7 posted)

Outreach program for women—kernel edition

By Jake Edge
June 12, 2013

While three kernel internships for women were originally announced in late April, the size of the program has more than doubled since then. Seven internships have been established for kernel work through the Outreach Program for Women (OPW); each comes with a $5000 stipend and a $500 travel grant. The program officially kicks off on June 17, but the application process already brought in several hundred patch submissions from eighteen applicants, 137 of which were accepted into the staging and Xen trees—all in thirteen days.

The program was initiated by the Linux Foundation, which found sponsors for the first three slots, but Intel's Open Source Technology Center added three more while the OPW itself came up with funding for another. The OPW has expanded well beyond its GNOME project roots, with eighteen different organizations (e.g. Debian, KDE, Mozilla, Perl, Twisted, and many more) participating in this round.

The program pairs the interns with a mentor from a participating project to assist the intern with whatever planned work she has taken on for the three months of each program round. OPW is patterned after the Google Summer of Code project, but is not only for students and programmers, as other kinds of projects (and applicants) are explicitly allowed. As the name would imply, it also restricts applicants to those who self-identify as a woman.

The kernel effort has been guided by Sarah Sharp, who is a USB 3.0 kernel hacker for Intel. She is also one of the mentors for this round. In late May, she put together a blog post that described the application process and the patches it brought in. Sharp filled us in on the chosen interns. In addition, most of the patches accepted can be seen in her cherry-picked kernel git tree.

The interns

Sharp will be mentoring Ksenia (Xenia) Ragiadakou who will be working on the USB 3.0 host driver. Ragiadakou is currently studying for her bachelor's degree in computer science at the University of Crete in Greece. In addition to her cleanup patches for the rtl8192u wireless staging driver, Ragiadakou has already found a bug in Sharp's host controller driver.

Two of the interns will be working on the Xen subsystem of the kernel with mentors Konrad Wilk of Oracle and Stefano Stabellini of Citrix. They are Lisa T. Nguyen, who received a bachelor's degree in computer science from the University of Washington in 2007, and Elena Ufimtseva, who got a master's degree in computer science from St. Petersburg University of Information Technologies in 2006. Nguyen did several cleanup patches for Xen (along with various other cleanups) as part of the application process, while Ufimtseva focused on cleanups in the ced1401 (Cambridge Electronics 1401 USB device) driver in staging.

Lidza Louina will be working with Greg Kroah-Hartman as a mentor on further cleanups in staging drivers. She was working on a bachelor's degree in computer science at the University of Massachusetts but had to take time off to work full-time. Her contributions were to the csr wireless driver in the staging tree.

Tülin İzer is working on parallelizing the x86 boot process with mentor PJ Waskiewicz of Intel. She is currently pursuing a bachelor's degree in computer engineering at Galatasaray University in Istanbul, Turkey. Her application included fixes for several staging drivers.

Two other Intel-mentored interns are in the mix: Hema Prathaban will be working with Jacob Pan on an Ivy Bridge temperature sensor driver, while Laura Mihaela Vasilescu will be working on Intel Ethernet drivers, mentored by Carolyn Wyborny and Anjali Singhai. Prathaban graduated in 2011 from KLN College of Engineering in India with a bachelor's degree in computer science. She has been a full-time mother for the last year, so the internship provides her a way to get back into the industry. Vasilescu is a master's student at the University of Politehnica of Bucharest, Romania and is also the student president of ROSEdu, an organization for Romanian open source education. Both did a number of patches; Prathaban in the staging tree (including fixing a bug in one driver) and Vasilescu in Intel Ethernet drivers.

Getting started

As with many budding kernel developers, most of the applicants' patches were to various staging drivers. There was a short application window as the kernel portion didn't get announced until a little under two weeks before the deadline. But that didn't seem to slow anything down as there were 41 applicants for the internships, with eighteen submitting patches and eleven having those patches accepted into the mainline.

That level of interest—and success—is partly attributable to a first patch tutorial that she wrote, Sharp said. The tutorial helps anyone get started with kernel development from a fresh Ubuntu 12.04 install. It looks at setting up email, getting a kernel tree, using git, building the kernel, creating a patch, and more. The success was also due to strong applicants and mentors that were "patient and encouraging", she said.

The kernel OPW program was mentioned multiple times at the recently held Linux Foundation conferences in Japan as a helpful step toward making the gender balance of kernel developers better represent the world we live in (as Dirk Hohndel put it). It is also nice to see the geographical diversity of the interns, with Asia, Europe, and North America all represented. Hopefully South America, Africa, and Oceania will appear in follow-on rounds of the program—Antarctica may not make the list for some time to come.

Another round of the OPW, including kernel internships, is planned for January through March 2014 (with application deadlines in December). The program is seeking more interested projects, mentors, and financial backers for the internships. While there are certainly critics of these types of efforts, they have so far proved to be both popular and effective. Other experiments, using different parameters or criteria, are definitely welcome, but reaching out and making an effort to bring more women into the free-software fold is something that will hopefully be with us for some time—until that hoped-for day when it isn't needed at all anymore.

Comments (2 posted)

Patches and updates

Kernel trees

Core kernel code

Development tools

Device drivers

Filesystems and block I/O

Memory management

Networking

Architecture-specific

Security-related

Virtualization and containers

Miscellaneous

Page editor: Jake Edge

Distributions

Tizen compliance

By Nathan Willis
June 12, 2013
Tizen Dev Con 2013

Tizen is intended to serve as a base Linux distribution for consumer electronics products, from phones to automobile dash systems, all built and sold by different manufacturers—yet offering a consistent set of application-level APIs. Consequently, one of the problems the project clearly needs to address is assessing the compliance of products marketed as Tizen offerings. At the 2013 Tizen Developer Conference in San Francisco, several sessions examined the compliance program and the testing process involved.

First, Intel's Bob Spencer addressed the goals and scope of the compliance program. Perhaps the biggest distinction he made was that only hardware products would be subject to any compliance testing; applications will not need to be submitted for tests. That said, applications will need to conform to packaging and security guidelines, but as he put it "acceptance into the Tizen app store means success" on that front. The project is working on a tool to flag build and packaging errors for app developers, but it is not ready for release.

As for hardware, Spencer said, companies developing a Tizen-based product will not need to send in physical devices. Instead, they will need to send the results of the compliance test suite to the Tizen Association, along with a "branding request" for approval of the use of the Tizen trademark.

[Spencer at Tizen Dev Con]

In broad strokes, he said, the compliance model consists of a set of specifications and the test suite that checks a device against them. The specification distinguishes between requirements (i.e., "MUST" statements) and recommendations ("SHOULD" statements). There is a common core specification that applies to all Tizen devices, plus separate "profile" specifications that address particular device categories. Each includes a combination of hardware and system software features. To get the compliance seal of approval, a device must pass both the common specification plus one or more device class profiles.

The "or more" requirement might allow a device to qualify as both a tablet (which falls under the "mobile" profile) and as a "convertible" (which, sadly, is not part of the automotive profile, but rather the "clamshell" profile also used for laptops). Interestingly enough, devices will not be allowed to qualify for Tizen compliance by meeting only the common core specification. At present, the common core specification and the "mobile" profile have been published for the Tizen 2.1 release as a "public draft." Spencer said that the final 2.1 specification is expected by the end of June.

The draft [PDF] does not separate out the common core and mobile profile requirements into distinct sections, which the site says will be done only once there are multiple device profiles published. On the compliance discussion list, Mats Wichmann said that this was due to the need to get a mobile profile specification out the door.

Spencer provided an overview of the specification in his session. He described the hardware requirements as being designed for flexibility, supporting low-end feature phones up to high-end smartphones, tablets from simple e-readers on up, and perhaps even watches (which, incidentally, marked the first mention of Tizen powered watches I have encountered). The list includes 512MB of RAM, 1GB of storage, at least one audio output, some form of Internet connectivity (which SHOULD be wireless), display resolution of at least 480×320, USB 2.0, and touch-screen support (which MUST support single-touch, but SHOULD support multi-touch).

There is considerably more flexibility regarding the vast assortment of sensors and radios found in phones today; the specification indicates that things like GPS, Near-Field Communications (NFC), and accelerometers are all optional, but that if a device provides any of them, it must implement the associated APIs.

At the moment, the draft requires supporting both Tizen's HTML5 APIs and its native APIs; Spencer said there were internal discussions underway as to whether there should be separate "web-only" and "web-plus-native" profile options. In addition to the application APIs, the software side of the specification requires that devices be either ARM or x86 architecture, defines application packaging and management behavior, lists required multimedia codecs, SDK and development tool compliance, and mandates implementation of the "AppControl" application control interface (which defines a set of basic cross-application operations like opening and displaying a file).

The requirements are a bit more stringent in one area: web runtimes. A device must provide both a standard web browser and a web runtime for HTML5 applications. In addition, both must be built from the official Tizen reference implementations (which are based on WebKit), and must not alter the exposed behavior implemented upstream. The browser and web runtime must also report a specific user agent string matching the reference platform and version information.

Testing, testing, testing

Immediately after Spencer finished his overview of the compliance specification, Samsung's Hojun Jaygarl and Intel's Cathy Shen spoke about the Tizen Compliance Test (TCT) used to assess devices. TCT is designed to verify that the version of Tizen running on a product conforms to the specifications, they said, although the project requires that the Tizen reference code be ported to each device, rather than implemented from scratch. Consequently, the TCT tests are designed to test features that ensure a consistent application development environment and a consistent customer experience, but allow manufacturers to differentiate the user experience (UX).

The TCT battery of tests includes both automated and manual tests, they explained. The manual tests cover those features that require interoperating with other devices, such as pairing with another device over WiFi Direct, or human interaction (such as responding to button presses). The automated tests are unit tests addressing the mandatory hardware and software requirements of the specification, and compliance with any of the optional features the vendor chooses to implement.

[Shen and Jaygarl at Tizen Dev Con]

TCT splits the native API and Web APIs into separate categories (although, again, both are currently required for any device to pass). The native TCT involves a native app called FtApp that executes individual tests on the device in question. The tests themselves are built on the GTest framework developed by Google. Tests are loaded into FtApp from a PC connected to the device via the Smart Development Bridge (SDB) tool in the Tizen SDK. There is also a GUI tool for the host PC to monitor test progress and generate the reports necessary for submission. The "native" tests cover the native application APIs, plus application control, conformance to security privileges, and the hardware features.

The web TCT can use the GUI tool to oversee the process, but there is a command line utility as well. This test suite involves loading an embedded web server onto the device, since it tests the compliance of the device's web runtime with the various Web APIs (including those coming from the W3C and the supplementary APIs defined by Tizen). It also tests the device's web runtime for compliance with package management, security, and privacy requirements, and can run tests on the device's hardware capabilities. These may not be completely automated, for example, involving a human to verify that the screen rotates correctly when the device is turned sideways. Finally, there is a tool called TCT-behavior that tests interactive UI elements; it, too, requires a person to operate the device.

The web TCT currently covers more than 10,000 individual tests, while the native TCT incorporates more than 13,000. Shen and Jaygarl said the automated tests take three to four hours to complete, depending on the device. The manual tests add about one more hour. Reports generated by the test manager are fairly simple; they list the pass/fail result for each test case, the elapsed time, the completion ratio (if applicable), and link to a more detailed log for each case. The test management tool is an Eclipse plugin, designed for use with the Tizen SDK.

During the Q&A at the end of the session, the all-important question of source code availability was raised by the audience. Shen and Jaygarl said that they expected to release the TCT test tools by the end of June. Currently, they are still working on optimizing the manual test cases—although it also probably goes without saying that the TCT can hardly be expected before the final release of the specification it is intended to test.

With more than 23,000 test cases, compliance with Tizen 2.1 will hardly be a rubber-stamp, though requiring vendors to port the reference code ought to take much of the guesswork out of the process. Jaygarl and Shen also commented that developers will be able to write their own test cases in GTest format and run them using the official TCT tools, so when the toolset arrives it may offer something to application developers as well as system vendors.

Compliance with a specification is not necessarily of interest to everyone building an embedded Linux system, nor even to everyone building a system based on Tizen. The program is designed to meet the needs of hardware manufacturers, after all, who already have other regulatory and development tests built into their product cycle. It will be interesting to see how the Tizen Compliance Program evolves to handle the non-mobile-device profiles in the future, but even if that takes a while, it could be amusing to run the tests against the first batches of commercially available Tizen phones, which are reported to arrive this year.

[The author wishes to thank the Linux Foundation for travel assistance to Tizen Dev Con.]

Comments (none posted)

Brief items

Distribution quote of the week

Quite a lot of people have said they find the current layout a bit confusing, but then, we tried two other layouts before this one and people found both of those confusing too. At this point we are running out of possibilities, but perhaps we could label the button 'Unicorn' and have it orbit the screen randomly. That would at least be different.
-- Adam Williamson

Comments (none posted)

FreeBSD 8.4 released

The FreeBSD Release Engineering Team has announced the availability of FreeBSD 8.4. See the detailed release notes for more information.

Comments (none posted)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

Linux Mint 15: Solid, But Unsettled (Datamation)

Bruce Byfield reviews Linux Mint 15. "Linux Mint 15 is a solid release, but not an end in itself. Rather, it is part of the ongoing process of refining Cinnamon and Mate while minimizing innovation to keep users comfortable. With this release, the process is starting to meet some of its early promise, but remains ongoing. It is still uncertain whether any distribution with Linux Mint's goal is capable of more than mild innovations. Users, perhaps, might consider this limited scope a good thing—and, after some of the events of the last few years, I can understand this attitude. At this point, many users must be weary of thinking so much about their desktop environments. Such users have settled on Linux Mint precisely because it allows them to forget about their interfaces and concentrate on their work."

Comments (2 posted)

ROSA Desktop Fresh R1 brings Azure and Steam support (The H)

The H takes a brief look at the latest version of ROSA's Desktop Fresh. "The developers say that users of Desktop Fresh R1 can now install Valve's Steam distribution platform on it, giving them access to over a hundred commercial games. The default desktop environment in ROSA Fresh R1 is KDE and the distribution includes version 4.10.3 of the desktop environment. The developers promise that GNOME and LXDE editions of the distribution will follow."

Comments (none posted)

Page editor: Rebecca Sobol

Development

Little things that matter in language design

June 8, 2013

This article was contributed by Neil Brown

The designers of a new programming language are probably most interested in the big features — the things that just couldn't be done with whichever language they are trying to escape from. So they are probably thinking of the type system, the data model, the concurrency support, the approach to polymorphism, or whatever it is that they feel will affect the expressiveness of the language in the way they want.

There is a good chance they will also have a pet peeve about syntax, whether it relates to the exact meaning of the humble semicolon, or some abhorrent feature such as the C conditional expression which (they feel) should never be allowed to see the light of day again. However, designing a language requires more than just addressing the things you care about. It requires making a wide range of decisions concerning various sorts of abstractions, and making sure the choices all fit together into a coherent, and hopefully consistent, whole.

One might hope that, with over half a century of language development behind us, there would be some established norms that could simply be taken as "best practice" without further concern. While this is true to an extent, there appears to be plenty of room for languages to diverge even on apparently simple concepts.

Having begun an exploration of the relatively new languages Rust and Go and, in particular, having two languages to provide illuminating contrasts, it seems apropos to examine some of those language features that we might think should be uncontroversial to see just how uniform they have, or have not, become.

Comments

Programmers first coming to C [PDF] from Pascal may find the usage of braces a bit of a surprise. While Pascal sees them as one option for enclosing comments, C sees them as a means of grouping statements. This harsh conflict between the languages is bound to cause confusion, or at least a little friction, when moving from one language to the other, but fortunately it appears to be a thing of the past.

One last vestige of this sort of confusion can be seen in the configuration files for BIND, the Berkeley Internet Name Domain server. In the BIND configuration files semicolons are used as statement terminators, while in the database files they introduce comments.

When not hampered by standards conformance as these database files are, many languages have settled on C-style block comments:

   /* This is a comment */

and C++-style one-line comments:

   // This line has a comment

these having won out over the other Pascal option of:

   (* similar but different block comments *)

and Ada's:

   -- again a similar yet different single line comment.

The other popular alternative is to start comments with a "#" character, which is a style championed by the C-shell and Bourne shell, and consequently used by many scripting languages. Thankfully the idea of starting a comment with "COMMENT" and ending with "TNEMMOC" never really took off and may be entirely apocryphal.

Both Rust and Go have embraced these trends, though not as fully as BIND configuration files and other languages like Crack which allow all three (/* */, //, #). Rust and Go only support the C and C++ styles.

Go doesn't use the "#" character at all, allowing it only inside comments and string constants, so it is available as a comment character for a future revision, or maybe for something else.

Rust has another use for "#" which is slightly reminiscent of its use by the preprocessor in C. The construct:

  #[attribute....]

attaches arbitrary metadata to nearby parts of the program which can enable or disable compiler warnings, guide conditional compilation, specify a license, or any of various other things.

Identifiers

Identifiers are even more standard than comments. Any combination of letters, digits, and the underscore that does not start with a digit is usually acceptable as an identifier providing it hasn't already been claimed as a reserved word (like if or while).

With the increasing awareness of languages and writing systems other than English, UTF-8 is more broadly supported in programming languages these days. This extends the range of characters that can go into an identifier, though different languages extend it differently.

Unicode defines a category for every character, and Go simply extends the definition given above to allow "Unicode letter" (which has 5 sub-categories: uppercase, lowercase, titlecase, modifier, and other) and "Unicode decimal digit" (which is one of 3 sub-categories of "Number", the others being "Number,letter" and "Number,other") to be combined with the underscore. The Go FAQ suggests this definition may be extended depending on how standardization efforts progress.

Rust gives a hint of what these efforts may look like by delegating the task of determining valid identifiers to the Unicode standard. The Unicode Standard Annex #31 defines two character classes, "ID_Start" and "ID_Continue", that can be used to form identifiers in a standard way. The Annex offers these as a resource, rather than imposing them as a standard, and acknowledges that particular use cases may extend them in various ways. It particularly highlights that some languages like to allow identifiers to start with an underscore, which ID_Start does not contain. The particular rule used by Rust is to allow an identifier to start with an ASCII letter, underscore, or any ID_Start character, and to be continued with ASCII letters, ASCII digits, underscores, or Unicode ID_Continue characters.

Allowing Unicode can introduce interesting issues if case is significant, as Unicode supports three cases (upper, lower, and title) and also supports characters without case. Most programming languages very sensibly have no understanding of case and treat two characters of different case as different characters, with no attempt to fold case or have a canonical representation. Go, however, does pay some attention to case.

In Go, identifiers where the first character is an uppercase letter are treated differently in terms of visibility between packages. A name defined in one package is only exported to other packages if it starts with an uppercase letter. This suggests that writing systems without case, such as Chinese, cannot be used to name exported identifiers without some sort of non-Chinese uppercase prefix. The Go FAQ acknowledges this weakness but shows a strong reluctance to give up the significance of case in exports.
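A minimal sketch of how that works in practice (the package and function names here are invented for illustration):

    package geometry

    // Area starts with an uppercase letter, so it is exported and can be
    // called as geometry.Area() by any package that imports this one.
    func Area(w, h int) int {
            return w * h
    }

    // perimeter starts with a lowercase letter, so it is visible only
    // within the geometry package itself.
    func perimeter(w, h int) int {
            return 2 * (w + h)
    }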

Numbers

Numbers don't face any new issues with Unicode, though possibly that is just due to continued English parochialism, as Unicode does contain a complete set of Roman numerals as well as those from more current numeral systems. So you might think that numbers would be fairly well standardized by now. To a large extent they are, but there still seems to be wiggle room.

Numbers can be integers or, with a decimal point or exponent suffix (e.g. "1.0e10"), floating point. Integers can be expressed in decimal, octal with a leading "0", or hexadecimal with a leading "0x".

In C99 and D [PDF], floating point numbers can also be hexadecimal. The exponent suffix must then have a "p" rather than "e" and gives a power of two expressed in decimal. This allows precise specification of floating point numbers without any risk of conversion errors. D (and GCC, as an extension to C) also allows a "0b" prefix on integers to indicate a binary representation (e.g. "0b101010"), and D allows underscores to be sprinkled through numbers to improve readability, so 1_000_000_000 is clearly the same value as 1e9.

Neither Rust nor Go has included hexadecimal floats. While Rust has included binary integers and the underscore spacing character, Go has left these out.

Another subtlety is that while C, D, Go, and many other languages allow a floating point number to start with a period (e.g. ".314159e1"), Rust does not. All numbers in Rust must start with a digit. There does not appear to be any syntactic ambiguity that would arise if a leading period were permitted, so this is presumably due to personal preference or accident.

In the language Virgil-III, the reason for this choice is much clearer. Virgil has a fairly rich "tuple" concept [PDF] which provides a useful shorthand for a list of values. Members of a tuple can be accessed with a syntax similar to structure field references, only with a number rather than a name. So in:

    var x:(int, int) = (3, 4);
    var w:int = x.1;

The variable "w" is assigned the value "4" as it is element one of the tuple "x". Supporting this syntax while also allowing ".1" to be a floating point number would require the tokenizer to know when to report two tokens ("dot" and "int") and when it is just one ("float"). While possible, this would be clumsy.

Many fractional numbers (e.g. 0.75) will start with a zero even in languages which allow a leading period (.75). Unlike the case with integers, the leading zero does not mean these numbers are interpreted in base eight. For 0.75 this is unlikely to cause confusion. For 0777.0 it might. Best practice for programmers would be to avoid the unnecessary digit in these cases, and it would be nice if the language required that.

As well as prefixes, many languages allow suffixes on numbers, with a couple of different meanings. Those few languages which have "complex" as a built-in type need a syntax for specifying "imaginary" constants. Go, like D, uses an "i" suffix. Python uses "j". Spreadsheets like LibreOffice Calc or Microsoft Excel allow either "i" or "j". It is a pity more languages don't take that approach. Rust doesn't support native complex numbers, so it doesn't need to choose.
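In Go, that looks something like the following trivial example:

    package main

    import "fmt"

    func main() {
            c := 3 + 4i                   // the "i" suffix marks an imaginary literal
            fmt.Println(real(c), imag(c)) // 3 4
            fmt.Println(c * c)            // (-7+24i)
    }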

The other meaning of a suffix is to indicate the "size" of the value - how many bytes are expected to be used to store it. C and D allow u, l, ll, or f for unsigned, long, long long, and float, with a few combinations permitted. Rust allows u, u8, u16, u32, u64, i8, i16, i32, i64, f32, and f64 which cover much the same set of sizes, but are more explicit. Perhaps fortunately, i is not a permitted suffix, so there is room to add imaginary numbers in the future if that turned out to be useful.

Go takes a completely different approach to the sizing of constants. The language specification talks about "untyped" constants though this seems to be some strange usage of the word "untyped" that I wasn't previously aware of. There are in fact "untyped integer" constants, "untyped floating point" constants, and even "untyped boolean" constants, which seem like they are untyped types. A more accurate term might be "unsized constants with unnamed types" though that is a little cumbersome.

These "untyped" constants have two particular properties. They are calculated using high precision with overflow forbidden, and they can be transparently converted to a different type provided that the exact value can be represented in the target type. So "1e15" is an untyped floating point constant which can be used where an int64 is expected, but not where an int32 is expected, as it requires 50 bits to store as an integer.

The specification states that "Constant expressions are always evaluated exactly"; however, some edge cases are to be expected:

    print((1 + 1/1e130)-1, "\n")
    print(1/1e130, "\n")

results in:

     +9.016581e-131
     +1.000000e-130

so there does seem to be some limit to precision. Maintaining high precision and forbidding overflow means that there really is no need for size suffixes.

Strings

Everyone knows that strings are enclosed in single or double quotes. Or maybe backquotes (`) or triple quotes ('''). And that while they used to contain ASCII characters, UTF-8 is preferred these days. Except when it isn't, and UTF-16 or UTF-32 are needed.

Both Rust and Go, like C and others, use single quotes for characters and double quotes for strings, both with the standard set of escape sequences (though Rust inexplicably excludes \b, \v, \a, and \f). This set includes \uXXXX and \UXXXXXXXX so that all Unicode code-points can be expressed using pure ASCII program text.

Go chooses to refer to character constants as "runes" and provides the built-in type "rune" to store them. In C and related languages "char" is used both for ASCII characters and 8-bit values. It appears that the Go developers wanted a clean break with that and do not provide a char type at all. rune (presumably more aesthetic than wchar) stores (32-bit) Unicode characters, while byte or uint8 stores 8-bit values.
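A trivial sketch of the distinction:

    package main

    import "fmt"

    func main() {
            r := 'é'       // a rune: a 32-bit Unicode code point
            b := byte('A') // a byte: an 8-bit value
            fmt.Printf("%T %d, %T %d\n", r, r, b, b)
            // prints "int32 233, uint8 65"; rune and byte are aliases
            // for int32 and uint8 respectively
    }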

Rust keeps the name char for 32-bit Unicode characters and introduces u8 for 8-bit values.

The modern trend seems to be to disallow literal newlines inside quoted strings, so that missing quote characters can be quickly detected by the compiler or interpreter. Go follows this trend and, like D, uses the back quote (rather than the Python triple-quote) to surround "raw" strings in which escapes are not recognized and newlines are permitted. Rust bucks the trend by allowing literal newlines in strings and does not provide for uninterpreted strings at all.
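A quick illustration of the two string forms in Go:

    package main

    import "fmt"

    func main() {
            interpreted := "tab:\there" // "\t" becomes a real tab character
            // In a raw (backquoted) string, escapes are not processed and
            // literal newlines would be permitted.
            raw := `tab:\there`
            fmt.Println(interpreted)
            fmt.Println(raw)
    }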

Both Rust and Go assume UTF-8. They do not support the prefixes of C (U"this is a string of 32bit characters") or the suffixes of D ("another string of 32bit chars"d) for declaring strings with wider character encodings.

Semicolons and expressions

The phrase "missing semicolon" still brings back memories from first-year computer science and learning Pascal. It was a running joke that whenever the lecturer asked "What does this code fragment do?" someone would call out "missing semicolon", and they were right more often than you would think.

In Pascal, a semicolon separates statements while in C it terminates some statements — if, for, while, switch and compound statements do not require a semicolon. Neither rule is particularly difficult to get used to, but both often require semicolons at the end of lines that can look unnecessary.

Go follows Pascal in that semicolons separate statements — every pair of statements must be separated. A semicolon is not needed before the "}" at the end of a block, though it is permitted there. Go also follows the pattern seen in Python and JavaScript where the semicolon is sometimes assumed at the end of a line (when a newline character is seen). The details of this "sometimes" is quite different between languages.

In Go, the insertion of semicolons happens during "lexical analysis", which is the step of language processing that breaks the stream of characters into a stream of tokens (i.e. a tokenizer). If a newline is detected on a non-empty line and the last token on the line was one of:

  • an identifier,
  • one of the keywords break, continue, fallthrough, or return,
  • a numeric, rune, or string literal, or
  • one of ++, --, ), ], or },

then a semicolon is inserted at the location of the newline.

This imposes some style choices on the programmer such that:

   if some_test
   {
   	some_statement
   }

is not legal (the open brace must go on the same line as the condition), and:

   a = c
     + d
     + e

is not legal — the operation (+) must go at the end of the first line, not the start of the second.

In contrast to this, JavaScript waits until the "parsing" step of language processing, when the sequence of tokens is gathered into syntactic units (statements, expressions, etc.) following a context-free grammar. JavaScript will insert a semicolon, provided that the semicolon would serve to terminate a non-empty statement, if:

  • it finds a newline in a location where the grammar forbids a newline, such as after the word "break" or before the postfix operator "++";
  • it finds a "}" or end-of-file that is not expected by the grammar; or
  • it finds any token that is not expected and that was separated from the previous token by at least one newline.

This often works well but brings its own share of style choices including the interesting suggestion to sometimes use a semicolon to start a statement.

While both of these approaches are workable, neither really seems ideal. They both force style choices which are rather arbitrary and seem designed to make life easy for the compiler rather than for the programmer.

Rust takes a very different approach to semicolons than Go, JavaScript, or many other languages. Rather than making them less important and often unnecessary, Rust makes them more important and gives them a significant semantic meaning.

One use involves the attributes mentioned earlier. When followed by a semicolon:

  #[some_attribute];

the attribute applies to the entity (e.g. the function or module) that the attribute appears within. When not followed by a semicolon, the attribute applies to the entity that follows it. A missing semicolon could certainly make a big difference here.

The primary use of semicolons in Rust is much like that in C — they are used to terminate expressions by turning the expressions into statements, discarding any result. The effect is really quite different from C because of a related difference: many things that C considers to be statements, Rust considers to be expressions. A simple example is the if expression.

    a = if b == c  { 4 } else { 5 };

Here the if expression returns either "4" or "5", which is stored in "a".

A block, enclosed in braces ({ }), typically includes a sequence of expressions with semicolons separating them. If the last expression is also followed by a semicolon, then the block-expression as a whole does not have a value — that last semicolon discards the final value. If the last expression is not followed by a semicolon, then the value of the block is the value of the last expression.

If this completely summed up the use of semicolons, it would produce some undesirable requirements.

    if condition {
        expression1;
    } else {
        expression2;
    }
    expression3;

This would not be permitted, as there is no semicolon to discard the value of the if expression before expression3. However, a semicolon after the last closing brace would be ugly, and that if expression doesn't actually return a value anyway (both internal expressions are terminated with semicolons), so the language does not require the ugly semicolon and the above is valid Rust code. If the internal expressions did return values, for example if the internal semicolons were missing, then a semicolon would be required before expression3.

Following this line of reasoning leads to an interesting result.

    if condition {
    	function1()
    } else {
    	function2()
    }
    expression3;

Is this code correct or is there a missing semicolon? To know the answer you need to know the types of the functions. If they do not return a value, then the code is correct. If they do, a semicolon is needed, either one at the end of the whole "if" expression, or one after each function call. So in Rust, we need to evaluate the types of expressions before we can be sure of correct semicolon usage in every case.

Now the above is probably just a silly example, and no one would ever write code like that, at least not deliberately. But the rules do seem to add an unnecessary complexity to the language, and the task of programming is complex enough as it is — adding more complexity through subtle language rules is not likely to help.

Possibly a bigger problem is that any tool that wishes to accurately analyze the syntax of a program needs to perform a complete type analysis. It is a known problem that the correct parsing of C code requires you to know which identifiers are typedefs and which are not. Rust isn't quite that bad as missing type information wouldn't lead to an incorrect parse, but at the very least it is a potential source of confusion.

Return

A final example of divergence on the little issues, though perhaps not quite so little as the others, can be found in returning values from functions using a return statement. Both Rust and Go support the traditional return and both allow multiple values to be returned: Go by simply allowing a list of return types, Rust through the "tuple" type which allows easy anonymous structures. Each language has its own variation on this theme.

If we look at the half million return statements in the Linux kernel, nearly 35,000 of them return a variable called "ret", "retval", "retn", or similar, and a further 20,000 return "err", "error", or similar. Together those account for more than 10% of the uses of return in the kernel. This suggests that there is often a need to declare a variable to hold the intended result of a function, rather than to just return a result as soon as it is known.

Go acknowledges this need by allowing the signature of a function to give names to the return values as well as the parameter values:

    func open(filename string, flags int) (fd int, err int)

Here the (hypothetical) open() function returns two integers named fd (the file descriptor) and err. This provides useful documentation of the meaning of the return values (assuming programmers can be more creative than "retval") and also declares variables with the given names. These can be set whenever convenient in the code of the function and a simple:

    return

with no expressions listed will use the values in those variables. Go requires that this return be present, even if it lists no values and is at the end of the function, which seems a little unnecessary, but isn't too burdensome.
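As a contrived but complete sketch of the mechanism (the function here is invented for illustration):

    package main

    import "fmt"

    // divide declares its results "quotient" and "ok" in the signature;
    // the bare "return" statements send back whatever those variables
    // hold at that point.
    func divide(a, b int) (quotient int, ok bool) {
            if b == 0 {
                    return
            }
            quotient = a / b
            ok = true
            return
    }

    func main() {
            fmt.Println(divide(10, 3)) // 3 true
            fmt.Println(divide(1, 0))  // 0 false
    }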

There is evidence [YouTube] that some Go developers are not completely comfortable with this feature, though it isn't clear whether the feature itself is a problem, or rather the interplay with other features of Go.

We have already glimpsed Rust's variation on this theme in the observation that Rust favors "expressions" over "statements". The whole body of a function can be viewed as an expression and, provided it doesn't end with a semicolon, the value produced by that expression is the value returned from the function. The word return is not needed at all, though it is available, and an explicit return expression within the function body will cause an early return with the given value.

Conclusion

There are many other little details, but this survey provides a good sampling of the many decisions that a language designer needs to make even after they have made the important decisions that shape the general utility of the language. There certainly are standards that are appearing and broadly being adhered to, such as for comments and identifiers, but it is a little disappointing that there is still such variability concerning the available representations of numbers and strings.

The story of semicolons and statement separation is clearly not one we have heard the end of yet. While it is good to see language designers exploring the options, none of the approaches explored above seem entirely satisfactory. Treating a line break as distinct from other kinds of white space amounts to a clear acknowledgment that the two-dimensional appearance of the code is relevant to parsing it. It is therefore a little surprising that we don't see the line indent playing a bigger role in interpretation of code. The particular rules used by Python may not be to everyone's liking, but the principle of making use of this very obvious aspect of a program seems sound.

We cannot expect ever to converge on a single language that suits every programmer and every task, but the more uniformity we can find on the little details, the easier it will be for programmers to move from language to language and maximize their productivity.

Comments (158 posted)

Brief items

Quotes of the week

Make sure you've got meaningful examples. If you have functions named foo or bar, then that's a warning sign right there.
James Hague, explaining how to write quality functional programming tutorials.

Wow, seems like Stallman was right all along. I might have take back my words on his antics after all.
William McBee, on the recent privacy and government surveillance dust-up in the US.

Comments (none posted)

rsyslog 7.4 released

Version 7.4 of the rsyslog system logger has been released. This is the first version of the new 7.4 stable branch and it joins version 7.2.7 as supported versions of the tool. New headline features include support for the systemd journal (both as input and output) along with log file encryption, signatures, and anonymization.

Comments (18 posted)

Newscoop 4.2 available

Version 4.2 of Newscoop, the open source content management system for news sites, is available. This release adds a REST API, Arabic support, and improvements to theming.

Comments (none posted)

GNU Autoconf Archive 2013.06.09 released

A new stable release of the GNU Autoconf Archive is available. The archive includes more than 450 user-contributed macros for GNU Autoconf.

Full Story (comments: none)

GNU Teseq 1.1 released

After five years of development, version 1.1 of GNU Teseq has been released. Teseq is a utility for analyzing files that contain control characters and control sequences. This new release adds support for color output, descriptions and labels of a variety of non-standard controls, and automatic character-set recognition.

Full Story (comments: none)

Newsletters and articles

Development newsletters from the past week

Comments (none posted)

The Wayland Situation: Facts About X vs. Wayland (Phoronix)

Over at Phoronix, Eric Griffith has attempted to set the record straight on X and Wayland, with assistance from X/Wayland developer Daniel Stone. He looks at the failings of X and the corresponding "fixings of Wayland", along with some misconceptions about the two and some generic advantages for Wayland. "'X is Network Transparent.' Wrong. [It's] not. Core X and DRI-1 were network transparent. No one uses either one. Shared-Memory, DRI-2 and DRI-3000 are NOT network transparent, they do NOT work over the network. Modern day X comes down to synchronous, poorly done VNC. If it was poorly done, async, VNC then maybe we could make it work. But [it's] not. Xlib is synchronous (and the movement to XCB is a slow one) which makes networking a NIGHTMARE."

Comments (153 posted)

U-Boot Creator Wolfgang Denk on the Great Achievements of Embedded Linux (Linux.com)

Linux.com interviews Wolfgang Denk, creator of the U-Boot bootloader, about two great things that embedded Linux has achieved: abstracting away hardware differences for application developers and the rapid adoption of the Yocto project. "But the really dramatic changes do not happen in Linux, but in the hardware. If you consider the landslide-like move from Power Architecture to ARM systems in the last two or three years it is highly notable that this happened without disconcertment for both developers and users: thanks to Linux, the low level hardware details are well abstracted away, and on application level it does not really matter at all which exact architecture or SoC you are working with. This is really a great achievement."

Comments (none posted)

Stapelberg: Systemd survey results, part 1

At his blog, Michael Stapelberg examines the first batch of responses to Debian's recent survey about systemd support. This round covers concerns over the number of dependencies, complexity, and feature creep. Stapelberg concludes that "While systemd consumes more resources than sysvinit, it uses them to make more information available about services; its finer-grained service management requires more state-keeping, but in turn offers you more control over your services." Presumably more posts will follow, addressing more of the survey responses.

Comments (none posted)

Page editor: Nathan Willis

Announcements

Brief items

FSFE: German Parliament tells government to strictly limit patents on software

The Free Software Foundation Europe reports that the German Parliament decided upon a joint motion to limit software patents. "The Parliament urges the German Government to take steps to limit the granting of patents on computer programs. Software should exclusively be covered by copyright, and the rights of the copyright holders should not be devalued by third parties' software patents. The only exception where patents should be allowed are computer programs which replace a mechanical or electromagnetic component. In addition the Parliament made clear that governmental actions related to patents must never interfere with the legality of distributing Free Software."

Full Story (comments: 10)

Upcoming Events

Schedule of openSUSE Conference

The schedule is available for the openSUSE Conference (oSC13), which will take place July 18-22 in Thessaloniki, Greece.

Comments (none posted)

Linux Plumbers Conference Deadlines and Updates

Linux Plumbers Conference will take place September 18-20, 2013 in New Orleans, Louisiana. The refereed track paper submission deadline is June 17. Twelve microconferences have been announced, with some slots still open. LPC is expected to sell out, and the conference hotel has other events going on at the same time, so make your travel plans soon.

Full Story (comments: none)

Events: June 13, 2013 to August 12, 2013

The following event listing is taken from the LWN.net Calendar.

June 10-14: Red Hat Summit 2013 (Boston, MA, USA)
June 13-15: PyCon Singapore 2013 (Singapore, Republic of Singapore)
June 17-18: Droidcon Paris (Paris, France)
June 18-20: Velocity Conference (Santa Clara, CA, USA)
June 18-21: Open Source Bridge: The conference for open source citizens (Portland, Oregon, USA)
June 20-21: 7th Conferenza Italiana sul Software Libero (Como, Italy)
June 22-23: RubyConf India (Pune, India)
June 26-28: USENIX Annual Technical Conference (San Jose, CA, USA)
June 27-30: Linux Vacation / Eastern Europe 2013 (Grodno, Belarus)
June 29-July 3: Workshop on Essential Abstractions in GCC, 2013 (Bombay, India)
July 1-5: Workshop on Dynamic Languages and Applications (Montpellier, France)
July 1-7: EuroPython 2013 (Florence, Italy)
July 2-4: OSSConf 2013 (Žilina, Slovakia)
July 3-6: FISL 14 (Porto Alegre, Brazil)
July 5-7: PyCon Australia 2013 (Hobart, Tasmania)
July 6-11: Libre Software Meeting (Brussels, Belgium)
July 8-12: Linaro Connect Europe 2013 (Dublin, Ireland)
July 12: PGDay UK 2013 (near Milton Keynes, England, UK)
July 12-14: 5th Encuentro Centroamerica de Software Libre (San Ignacio, Cayo, Belize)
July 12-14: GNU Tools Cauldron 2013 (Mountain View, CA, USA)
July 13-19: Akademy 2013 (Bilbao, Spain)
July 15-16: QtCS 2013 (Bilbao, Spain)
July 18-22: openSUSE Conference 2013 (Thessaloniki, Greece)
July 22-26: OSCON 2013 (Portland, OR, USA)
July 27: OpenShift Origin Community Day (Mountain View, CA, USA)
July 27-28: PyOhio 2013 (Columbus, OH, USA)
July 31-August 4: OHM2013: Observe Hack Make (Geestmerambacht, the Netherlands)
August 1-8: GUADEC 2013 (Brno, Czech Republic)
August 3-4: COSCUP 2013 (Taipei, Taiwan)
August 6-8: Military Open Source Summit (Charleston, SC, USA)
August 7-11: Wikimania (Hong Kong, China)
August 9-11: XDA:DevCon 2013 (Miami, FL, USA)
August 9-12: Flock - Fedora Contributor Conference (Charleston, SC, USA)
August 9-13: PyCon Canada (Toronto, Canada)
August 11-18: DebConf13 (Vaumarcus, Switzerland)

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds