
LWN.net Weekly Edition for June 13, 2013

A report from pgCon 2013

June 10, 2013

This article was contributed by Josh Berkus

This year's pgCon, which concluded May 25th, included an unusually high number of changes to the PostgreSQL community, codebase, and development. Contributors introduced multiple new major projects which will substantially change how people use PostgreSQL, including parallel query, a new binary document store type, and pluggable storage. In addition, Tom Lane switched jobs, four new committers were selected, pgCon had the highest attendance ever at 256 registrations, and held its first unconference after the regular conference. Overall, it was a mind-bending and exhausting week.

pgCon is a PostgreSQL developer and advanced user conference held in Ottawa, Canada every year in May. It usually brings together most of the committers and major contributors to the project in order to share ideas, present projects and new features, and coordinate schedules and code changes. The main conference days are preceded by various summits, including the PostgreSQL Clustering Summit, the Infrastructure Team meeting, and the Developer Meeting. The latter consists of a closed summit of 18 to 25 top code contributors to PostgreSQL who coordinate feature development and plans for the next version (PostgreSQL 9.4).

Parallel query

The longest and most interesting discussion at the developer meeting was about adding parallel query capabilities to PostgreSQL. Currently, the database is restricted to using one process and one core to execute each individual query, and cannot make use of additional cores to speed up CPU-bound tasks. While it can execute dozens of separate queries simultaneously, the lack of individual-query parallelism is still very limiting for users of analytics applications, for which PostgreSQL is already popular.

Bruce Momjian announced on behalf of EnterpriseDB that its engineering team would be focusing on parallelism for the next version of PostgreSQL. Noah Misch will be leading this project. The project plans to have index building parallelized for 9.4, but most of its work will be creating a general framework for parallelism. According to Momjian, there are three things you need for any parallel operation in the database:

  • An efficient way of passing data to the parallel backends, probably using a shared memory facility.
  • A method for starting and stopping worker processes.
  • The ability for the worker processes to share the reference data and state information of the parent process.

The EnterpriseDB team had explored using threads for worker processes, but these were not seen as a productive approach, primarily because the PostgreSQL development team is used to working with processes and the backend code is structured around them. While the cost of starting up processes is high compared to threads, the additional locking required for threading looked to be just as expensive in performance terms. Momjian put it this way:

With threads, everything is shared by default and you have to take specific steps not to share. With processes, everything is unshared by default, and you have to specifically share things. The process model and explicit sharing is a shorter path from where we are currently.

The PostgreSQL developers plan to build a general framework for parallelism and then work on parallelizing one specific database task at a time. The first parallel feature is expected to be building indexes in parallel using parallel in-memory sort. This is an important feature for users because building indexes is often slower than populating the underlying table, and it is often CPU-bound. It's also seen as a good first task because index builds run for minutes rather than milliseconds, so optimizing worker startup costs can be postponed until later development.
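To make the shape of a parallel in-memory sort concrete, here is a minimal Python sketch of the chunk-and-merge structure: the input is split into slices, each slice is sorted independently, and the sorted runs are combined with a k-way merge. It is not PostgreSQL code and does not reflect the EnterpriseDB design; the function names are invented for illustration. The sketch sorts the chunks serially by default and accepts any map-like callable, so a process pool can be substituted for the independent step.

```python
import heapq
from typing import Callable, List, Sequence


def split_into_chunks(values: Sequence[int], n_chunks: int) -> List[List[int]]:
    """Split values into at most n_chunks contiguous, roughly equal slices."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [list(values[i:i + size]) for i in range(0, len(values), size)]


def chunked_sort(values: Sequence[int], n_chunks: int = 4,
                 map_fn: Callable = map) -> List[int]:
    """Sort each chunk independently, then merge the sorted runs.

    map_fn is the only step that could run in parallel. The built-in map
    (the default) runs it serially; a hypothetical caller could pass
    multiprocessing.Pool(n).map instead, from code guarded by
    `if __name__ == "__main__":`, to sort chunks in worker processes.
    """
    chunks = split_into_chunks(values, n_chunks)
    sorted_runs = list(map_fn(sorted, chunks))
    # heapq.merge performs a k-way merge of already-sorted inputs.
    return list(heapq.merge(*sorted_runs))
```

With processes, each chunk has to be copied to a worker and the sorted run copied back, which is the explicit sharing cost Momjian describes; for a job that runs for minutes, that overhead and the worker startup time matter much less than they would for a short query.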

PostgreSQL and non-volatile memory

Major contributor KaiGai Kohei of NEC brought up the recent emergence of non-volatile memory (NVRAM), or persistent memory, devices and discussed ideas on how to take advantage of them for PostgreSQL. Intel engineer Matthew Wilcox further reinforced the message that NVRAM was coming in his lightning talk on the first day of pgCon. NVRAM persists its contents after a system power cycle, but is addressed like main memory and is around 50% as fast.

Initially, Kohei is interested in using NVRAM for the PostgreSQL Write Ahead Log (WAL), an append-only set of files that is used to guarantee transactional integrity and crash safety. This will work with the small sizes and limited write cycles of the early NVRAM devices. For servers with NVRAM, WAL writes would go to a memory region on the device allocated using mmap(). In later generations of NVRAM, developers can look at using it for the main database files.
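As a rough illustration of writing an append-only log into a memory-mapped region, the following Python sketch maps an ordinary file and appends length-prefixed records to it. Everything here is an assumption made for illustration: the file stands in for an NVRAM device, the record layout is invented and is not the PostgreSQL WAL format, and the class name is hypothetical. In particular, the flush() call does not answer the write-ordering question that remains open for real NVRAM hardware.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<I")  # invented layout: 4-byte little-endian length


class MappedLog:
    """Append-only log stored in a fixed-size memory-mapped region."""

    def __init__(self, path: str, size: int = 4096) -> None:
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT)
        os.ftruncate(self._fd, size)          # reserve the whole region
        self._map = mmap.mmap(self._fd, size)
        self._end = 0                         # offset of the next free byte

    def append(self, payload: bytes) -> int:
        """Write one record and return the offset where it starts."""
        needed = HEADER.size + len(payload)
        if self._end + needed > len(self._map):
            raise ValueError("log region is full")
        start = self._end
        self._map[start:start + HEADER.size] = HEADER.pack(len(payload))
        self._map[start + HEADER.size:start + needed] = payload
        self._map.flush()                     # push the mapped pages out
        self._end += needed
        return start

    def read(self, offset: int) -> bytes:
        """Return the payload of the record starting at offset."""
        (length,) = HEADER.unpack_from(self._map, offset)
        begin = offset + HEADER.size
        return bytes(self._map[begin:begin + length])

    def close(self) -> None:
        self._map.close()
        os.close(self._fd)


def demo() -> list:
    """Append two records to a temporary file and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        log = MappedLog(os.path.join(tmp, "wal.bin"))
        try:
            offsets = [log.append(b"BEGIN"), log.append(b"COMMIT")]
            return [(off, log.read(off)) for off in offsets]
        finally:
            log.close()
```

The fixed-size region mirrors the small capacity of early devices: once the region is full, a real system would have to recycle or archive old records, which this sketch does not attempt.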

There are many unknowns about this technology, such as what method can be employed to guarantee absolute write ordering. Developers speculated about whether transactional memory could somehow be employed for this. Right now, the PostgreSQL community is waiting to get its collective hands on an NVRAM device for testing and development.

Disqus keynote

Mike Clarke, Operations Lead of Disqus, delivered the keynote for pgCon this year. Disqus is the leading comment-hosting platform, which is used extensively by blogs and news sites all over the internet. Its technology platform includes Python, Django, RabbitMQ, Cassandra and PostgreSQL.

Most of its over three terabytes of data, including comments, threads, forums, and user profiles, is stored in PostgreSQL. This adds up to over 50,000 writes per second to the database, and millions of new posts and threads a day.

Clarke extolled the virtues of SSD storage. Disqus uses a 6-node master-slave replication database cluster running on fast machines with RAIDed SSD storage. Only SSDs have allowed it to continue scaling to its current size. Prior to moving to SSD-based storage, Disqus was at 100% IO utilization and had continual problems with long IO wait times. Now utilization is down and wait times are around one millisecond.

He also complained about some of the pain points of scaling out PostgreSQL. Disqus uses Slony-I, PostgreSQL's older replication system, which the company has customized to its workload, and feels that it can't afford to upgrade. For that reason, Clarke is eagerly awaiting the new logical replication system expected with PostgreSQL 9.4 next year. He also was unhappy about the lack of standard design patterns for PostgreSQL proxying and failover; everyone seems to build their stack differently. On the other hand, he praised extensions as the best feature of PostgreSQL, since they allow building applications inside the database.

Clarke ended with a request for some additional PostgreSQL features. He wants tools to enable sharded multiserver databases to be built inside PostgreSQL more easily, such as by improving PL/Proxy, the distributed table interface extension for PostgreSQL introduced by Skype. He'd also like to see a query progress indicator, something that was later presented at pgCon by Jan Urbański.

HStore and the future of JSON

During the regular talks, Oleg Bartunov and Teodor Sigaev introduced the prototype of the next version of their "hstore" extension for PostgreSQL. hstore allows storing a simple key-value store, a "hash" or "dictionary", in a PostgreSQL field, and allows indexing the keys. Today, many users of PostgreSQL and JSON use it to store "flattened" JSON objects so that they can be indexed on all keys. The presentation introduced a new version of hstore which supports nesting as well as arrays, so that it will be a closer match for fully structured JSON, as well as for complex multi-level hashes and dictionaries in Perl and Python.
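To show what "flattening" means in practice, here is a small Python sketch that turns a nested JSON-like object into the single-level map of string keys to string values that the current hstore can hold. The dotted-path key convention and the function name are assumptions chosen for illustration; applications are free to use other schemes.

```python
import json
from typing import Any, Dict


def flatten(value: Any, prefix: str = "") -> Dict[str, str]:
    """Flatten nested dicts and lists into one level of dotted keys.

    Leaf values become strings, because a flat key-value store of text
    has no nested types. Empty dicts and lists produce no keys at all,
    one of the ways this transformation loses information.
    """
    if isinstance(value, dict):
        items = ((str(key), child) for key, child in value.items())
    elif isinstance(value, list):
        items = ((str(index), child) for index, child in enumerate(value))
    else:
        # Leaf: keep strings as they are, serialize everything else.
        text = value if isinstance(value, str) else json.dumps(value)
        return {prefix: text}

    flat: Dict[str, str] = {}
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(child, child_prefix))
    return flat
```

For example, flatten({"user": {"name": "ann", "tags": ["a", "b"]}, "visits": 3}) yields the keys user.name, user.tags.0, user.tags.1, and visits. The structure survives only as a naming convention inside the keys, which is the gap a nesting-aware type is meant to close.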

This prototype new hstore also supports indexing which enables very fast lookup of keys, values, and even document fragments many levels deep. In their tests, they used the Del.icio.us dataset, which includes 1.2 million bookmark documents, and were able to search out all values matching a complex nesting expression in 0.1 seconds, or all instances of a common key in 0.5 seconds. The indexes are also reasonably sized, at around 75% of the size of the data to which they are attached. Earlier attempts to index tree-structured text data in PostgreSQL and other databases have resulted in indexes which are significantly larger than the base table. Individual hstore fields can be up to 512MB in size.

While many attendees were excited and impressed by the prototype, some were unhappy. Several contributors were upset that the new type wasn't JSON. They argued that the PostgreSQL project didn't need a non-standard type and interface, when what developers want is a binary, indexed JSON type. After extensive discussion, Bartunov and Sigaev agreed to work on JSON either instead of, or in addition to, a new hstore for the next version.

Hopefully, this means that users can expect a JSON type for version 9.4 that supports arbitrary nested key lookup and complex search expressions. This would make PostgreSQL more suitable for applications which currently use a JSON document database, such as MongoDB or CouchDB. With the addition of compatibility projects like Mongres, users might even be able to run such applications largely unaltered.

Pluggable storage

The final day of pgCon this year was the conference's first-ever "unconference day". An unconference is a meeting in which the attendees select sessions and compose the schedule at the event. Unconferences tend to be more discussion-oriented than regular conferences, and center more around recent events and ideas. Around 75 of the pgCon attendees stayed for the unconference.

One of the biggest topics discussed at the unconference was the idea of making PostgreSQL's storage "pluggable". Historically, companies have wanted to tailor the database for particular workloads by adding support for column store tables, clustered storage, graphs, streaming data, or other special-purpose data structures. These changes have created incompatible forks of PostgreSQL, such as Greenplum or Vertica, cutting off any development collaboration with those vendors. Other companies, such as Huawei and Salesforce, who are newly involved in PostgreSQL, would like to be able to change the storage model without forking the code.

The PostgreSQL contributors discussed methods of accomplishing this. First, they discussed the possibility of using the Foreign Data Wrapper (FDW) facility to attach new storage types. Foreign Data Wrappers allow users to attach external data, such as other databases, through a table interface. After some discussion, this was seen as unsuitable in most cases, since users want to actually manage tables, including creation, backup, and replication, through PostgreSQL, not just "have a window" into them. They also want to support creating indexes on different storage types.

If FDW won't work, the developers will need to create a new set of hooks and an API for "storage managers". This was actually supported by early versions of POSTGRES at the University of California, which had prototypes of both an in-memory and a write-once media (WORM) storage manager. However, that code has atrophied and doesn't support most current PostgreSQL features.

For any potential storage, the storage manager would need to support several conventions of PostgreSQL, including:

  • having tuples (rows) which are structured like PostgreSQL's tuples, including metadata
  • being transactional
  • providing a method for resolving data visibility
  • providing a physical row identifier for index building

The PostgreSQL system catalogs would stay on the current conventional native storage, regardless of what new types of storage managers were added.
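One way to picture the conventions listed above is as an abstract interface that every storage manager would implement. The Python sketch below is purely hypothetical: no such API exists in PostgreSQL, every class and method name is invented, and the visibility rule is a drastic simplification in which the caller supplies the set of committed transaction IDs.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Iterator, Optional, Set, Tuple

RowId = Tuple[int, int]  # hypothetical physical identifier: (block, offset)


@dataclass
class StoredTuple:
    """A row plus the metadata needed to decide who may see it."""
    row_id: RowId
    values: tuple
    created_by: int                   # ID of the inserting transaction
    deleted_by: Optional[int] = None  # ID of the deleting transaction


class StorageManager(ABC):
    """Hypothetical contract covering the four conventions above."""

    @abstractmethod
    def insert(self, txid: int, values: tuple) -> RowId: ...

    @abstractmethod
    def delete(self, txid: int, row_id: RowId) -> None: ...

    @abstractmethod
    def fetch(self, row_id: RowId) -> StoredTuple: ...

    @abstractmethod
    def is_visible(self, tup: StoredTuple, committed: Set[int]) -> bool: ...

    @abstractmethod
    def scan(self, committed: Set[int]) -> Iterator[StoredTuple]: ...


class InMemoryStorage(StorageManager):
    """Toy implementation that keeps every row version in a dict."""

    def __init__(self) -> None:
        self._rows: Dict[RowId, StoredTuple] = {}
        self._next_offset = 0

    def insert(self, txid: int, values: tuple) -> RowId:
        row_id = (0, self._next_offset)   # everything lives in "block 0"
        self._next_offset += 1
        self._rows[row_id] = StoredTuple(row_id, tuple(values), txid)
        return row_id

    def delete(self, txid: int, row_id: RowId) -> None:
        # Mark the row rather than removing it, so that transactions
        # that cannot see the delete can still read the row.
        self._rows[row_id].deleted_by = txid

    def fetch(self, row_id: RowId) -> StoredTuple:
        return self._rows[row_id]

    def is_visible(self, tup: StoredTuple, committed: Set[int]) -> bool:
        # Simplified rule: the insert is committed, the delete is not.
        return tup.created_by in committed and tup.deleted_by not in committed

    def scan(self, committed: Set[int]) -> Iterator[StoredTuple]:
        return (t for t in self._rows.values() if self.is_visible(t, committed))
```

An index would store the RowId values returned by insert() and use fetch() to reach the row, which is why a stable physical row identifier appears on the list of requirements.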

If implemented, this would be a major change to the database system. It would become possible to use PostgreSQL as a query engine, transaction manager, and interface for very different types of databases, both proprietary and open source. It might even become possible for MySQL "storage engine" vendors, such as Infobright and Tokutek, to port their products. Peter van Hardenberg of Heroku suggested it might also make it possible to run PostgreSQL on top of HDFS.

Committer changes

The most frequently quoted news from pgCon this year was that Tom Lane, lead committer on PostgreSQL, was changing employers from Red Hat to Salesforce. While announced in a rather low-key way through the Developer Meeting notes and Lane's show badge, this was big enough news that Wired picked it up. Lane had worked at Red Hat for 11 years, having joined to support Red Hat Database, its distribution of PostgreSQL. While Red Hat Database was eventually canceled, Lane stayed on at Red Hat, which was very supportive of his contributions to the project.

Lane's move is more significant in what it says about Salesforce's commitment to PostgreSQL than any real change in his expected activities as a committer. Until now, most commentators have suggested that Salesforce's mentions of PostgreSQL were merely posturing, but hiring Lane suggests that it plans to follow through on migrating away from Oracle Database. Six other Salesforce staff also attended pgCon. Its exact plans were not shared with the community, although it's reasonable to hypothesize from development discussions at the conference that Salesforce plans to contribute substantially to the open-source project, and that pluggable storage is a development target.

Lane memorialized his change of employment by putting his nine-year-old Red Hat laptop bag into the charity auction at the end of pgCon. It sold for $170.

The PostgreSQL Core Team, a six-member steering committee for the project, announced the selection of four new committers to PostgreSQL: Jeff Davis of Aster Data, author of the range types feature in version 9.2; Fujii Masao of NTT Data, main author of the synchronous replication feature; Stephen Frost of Resonate, author of several security features; and Noah Misch of EnterpriseDB, author of numerous SQL improvements. This brings the number of committers on PostgreSQL to twenty.

More PostgreSQL

Of course, there were many other interesting presentations and talks at pgCon. Keith Paskett ran a tutorial on optimizing and using PostgreSQL on ZFS atop OmniOS (an OpenSolaris fork), while other users talked about using PostgreSQL on ZFS for Linux. Jeff Davis presented strategies to use PostgreSQL's new anti-disk-corruption features. Josh McDermott ran another Schemaverse tournament, as a qualifier for the upcoming Defcon tournament. Robert Haas showed the most common failures of the PostgreSQL query planner, and sparked discussion about how to fix them.

On the second full conference day, Japanese community members presented the newly-formed PostgreSQL Enterprise Consortium of Japan, a group of 39 Japanese companies aiming to promote and improve PostgreSQL. This group is currently working on clustered PostgreSQL, benchmarking, and migration tools to migrate from other database systems. And just for fun, Álvaro Hernández Tortosa demonstrated creating one billion tables in a single PostgreSQL database.

Overall, it was the most exciting pgCon I've attended, and shows the many new directions in which PostgreSQL development is going. Anyone there got the impression that the project would be completely reinventing the database within a few years. If you work with PostgreSQL, or are interested in contributing to it, you should consider attending next year.

[ Josh Berkus is a member of the PostgreSQL Core Team. ]


OSS meetups, OLPC, and OpenRelief

By Jake Edge
June 12, 2013
LinuxCon Japan 2013

As always, there were more sessions at the recently completed triumvirate of Linux Foundation conferences in Tokyo than can be written up. In fact, also as usual, there were more sessions available than people to cover them. The Automotive Linux Summit Spring, LinuxCon Japan, and CloudOpen Japan covered a lot of ground in five days. Here are reports from three presentations at LinuxCon.

OSS meetups in Japan

[Hiro Yoshioka]

Hiro Yoshioka spoke about the types of open source gatherings that go on in Japan. He is the technical managing officer for Rakuten, which is a large internet services company in Japan. Before that, he was the CTO of Miracle Linux from 2000 to 2008. The goal of his talk was to encourage other Japanese people in the audience to start up their own "meetups" and other types of technical meetings and seminars, but the message was applicable anywhere. Organizing these meetings is quite rewarding, and lots of fun, but it does take some time to do, he said.

Yoshioka used the "kernel code reading party" that he started in Yokohama in April 1999 as a case study. He wondered if he would be able to read the kernel source code, so he gathered up some members of the Yokohama Linux Users Group to create an informal technical seminar to do so. The name of the meeting has stuck, but the group no longer reads kernel source. Instead, they have presentations on kernel topics, often followed by a "pizza and beer party".

There are numerous advantages to being the organizer of such a meeting, he said. You get to choose the date, time, and location for the event, as well as choosing the speakers. When he wants to learn about something in the kernel, he asks someone who knows about it to speak. Presenters also gain from the experience because they get to share their ideas in a relaxed setting. In addition, they can talk about an "immature idea" and get "great feedback" from those attending. Attendees, of course, get to hear "rich technical information".

Being the organizer has some downsides, mostly in the amount of time it takes. The organizer will "need to do everything", Yoshioka said, but sometimes the community will help out. In order to make the meetings "sustainable", the value needs to exceed the cost. So either increasing the value or decreasing the cost are ways to help make the meetings continue. Finding great speakers is the key to making the value of the meetings higher, while finding inexpensive meeting places is a good way to bring down costs.

How to find the time to organize meetings like those he mentioned was one question from the audience. It is a difficult question, Yoshioka said, but as with many things it comes down to your priorities. Another audience member noted that convincing your employer that the meeting will be useful in your job may allow you to spend some of your work time on it. "Make it part of your job".

Another example that Yoshioka gave was the Rakuten Technology Conference, which has been held yearly since 2007. It is a free meeting with content provided by volunteers. In the past, it has had keynotes from Ruby creator Matz and Dave Thomas of The Pragmatic Programmer. Proposals for talks are currently under discussion for this year's event, which will be held on October 26 near Shinagawa station in Tokyo. Unlike many other technical meetings in Japan, the conference is all in English, he said.

The language barrier was of interest to several non-Japanese audience members. Most of the meetings like those Yoshioka described are, unsurprisingly, in Japanese, but for non-speakers there are a few possibilities. The Tokyo hackerspace has half of its meetings in English, he said, and the Tokyo Linux Users Group has a web page and mailing list in English. In addition, Yoshioka has an English-language blog with occasional posts covering the kernel code reading party meetings and other, similar meetings.

One laptop per child

[Madeline Kroah-Hartman]

A Kroah-Hartman different from the usual suspect spoke in the student track. In a presentation that followed her father's, Madeline Kroah-Hartman looked at the One Laptop Per Child (OLPC) project, its history, and some of its plans for the future. She has been using the OLPC for a number of years, back to the original XO version, and she brought along the newest model, XO-4 Touch, to show.

The project began in 2005 with the goal of creating a laptop for children that could be sold for $100. It missed that goal with the initial XO, but did ship 2.5 million of the units, including 83,000 as part of the "Give 1 Get 1" program that started in 2007. The idea was to have a low-powered laptop that would last the whole school day, which the XO is capable of, partly because it "sleeps between keystrokes" while leaving the display on, she said.

Recharging the laptops has been something of a challenge for the project, particularly in developing countries where electricity may not be available. Various methods have been tried, from a hand crank to a "yo-yo charger" that was never distributed. Using the yo-yo got painful after ten minutes, she said, but it took one and a half hours to fully charge the device. Solar-powered charging is now the norm.

OLPCs were distributed in various countries, including 300,000 to Uruguay (where every child in the country got one) and 4,500 to women's schools in Afghanistan, as well as to Nicaragua, Rwanda, and others. In Madagascar, the youngest students were teaching the older ones how to use the laptops, while in India the attendance rate neared 100% for schools that had OLPCs, she said.

OLPCs generally run the Sugar environment on top of Fedora. It is a "weird" interface that sometimes doesn't work, she said, but it is designed for small children. That means it has lots of pictures as part of the interface to reduce clutter and make it more intuitive for that audience. There are lots of applications that come with the OLPC, including the Etoys authoring environment, a Python programming environment, the Scratch 2D animation tool, a physics simulation program, a local copy of Wikipedia in the native language, a word processor, and more. The Linux command line is also available in a terminal application, though children may not actually use it in practice, she said.

The first model was designed so that you could "throw it at a wall" and it wouldn't break, she said. Various other versions were created over the years, including the XO-1.5, a dual-touchscreen XO-2 that was never released, and the XO-4 Touch. The latter will be shipping later this year. There is also the Android-based XO tablet that will be selling at Walmart for $100 starting in June. It is "very different" from the Sugar-based XOs, Kroah-Hartman said, but will come pre-loaded with education and mathematical apps.

There are lots of ways to participate in the project, she said, many of which are listed on the Participate wiki page. She noted that only 30% of the XO software is translated to Japanese, so that might be one place for attendees to start.

OpenRelief

[OpenRelief plane]

In an update to last year's presentation, Shane Coughlan talked about the progress (and setbacks) for the OpenRelief project. That project had its genesis at the 2011 LinuxCon Japan—held shortly after the earthquake, tsunami, and nuclear accident that hit Japan—as part of a developer panel discussion about what could be done to create open source technical measures to help out disaster relief efforts. That discussion led to the creation of the OpenRelief project, which seeks to build a robotic airplane (aka drone) to help relief workers "see through the fog" to get the right aid to the right place at the right time.

The test airframe he displayed at last year's event had some durability flaws: "airframes suck", he said. In particular, the airframe would regularly break in ways that would be difficult to fix in the field. Endurance is one of the key features required for a disaster relief aircraft, and the project had difficulty finding one that would both be durable and fit into its low price point ($1000 for a fully equipped plane, which left $100-200 for the airframe).

In testing the original plane, though, OpenRelief found that the navigation and flight software/hardware side was largely a solved problem, through projects like ArduPilot and CanberraUAV. Andrew Tridgell (i.e. "Tridge" of Samba and other projects) is part of the CanberraUAV team, which won the 2012 Outback Rescue Challenge; "they completely rock", Coughlan said. The "unmanned aerial vehicle" (UAV) that was used by CanberraUAV was "a bit big and expensive" for the needs of OpenRelief, but because it didn't have to focus on the flight software side of things, the project could turn to other parts of the problem.

One of those was the airframe, but that problem may now be solved. The project was approached by an "aviation specialist" who had created a regular airframe as part of a project to build a vertical takeoff and landing (VTOL) drone to be sold to the military. It is a simple design with rails to attach the wings and wheels as well as to hang payloads (e.g. cameras, radiation detectors, ...). There are dual servos for the control surfaces which provides redundancy. It is about the same size as the previous airframe, but can go 40km using an electric engine rather than 20km as the older version did. It can also carry 9kg of payload vs. the 0.5kg available previously. With an optional gasoline-powered engine, the range will increase to 200-300km.

OpenRelief released the design files for this new airframe on the day of Coughlan's talk. It is something that "anyone can build", he said. Test flights are coming soon, but he feels confident that the airframe piece, at least, is now under control. There is still plenty of work to do in integrating all of the different components into a working system, including adding some "mission control" software that can interface with existing disaster relief systems.

Coughlan also briefly mentioned another project he has been working on, called Data Twist. The OpenStreetMap (OSM) project is popular in Japan—where Coughlan lives—because the "maps are great", but the data in those maps isn't always easy to get at. Data Twist is a Ruby program that processes the OSM XML data to extract information to build geo-directories.

A geo-directory might contain "all of the convenience stores in China"—there were 43,000 of them as of the time of his talk—for example. Data Twist uses the categories tagged in the OSM data and can extract the locations into a Wordpress Geo Mashup blog post, which will place the locations on maps in the posts.
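The extraction step can be sketched in a few lines. Data Twist itself is a Ruby program whose internals are not described here, so the Python below is only an illustration of the general idea: walk the node elements of an OSM XML document, collect each node's tag elements into a dictionary, and keep the nodes whose category matches. The function name and the sample data are invented; shop=convenience is the usual OSM tag for convenience stores.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

# Invented sample data in the OSM XML layout (nodes carrying k/v tags).
SAMPLE_OSM = """\
<osm version="0.6">
  <node id="1" lat="35.71" lon="139.72">
    <tag k="shop" v="convenience"/>
    <tag k="name" v="Corner Store"/>
  </node>
  <node id="2" lat="35.72" lon="139.73">
    <tag k="amenity" v="cafe"/>
    <tag k="name" v="Some Cafe"/>
  </node>
  <node id="3" lat="35.73" lon="139.74">
    <tag k="shop" v="convenience"/>
  </node>
</osm>
"""

Place = Dict[str, Union[str, float]]


def extract_places(osm_xml: str, key: str, value: str) -> List[Place]:
    """Return name and coordinates of every node tagged key=value.

    Only <node> elements are examined; places mapped as ways or
    relations are ignored in this sketch.
    """
    root = ET.fromstring(osm_xml)
    places: List[Place] = []
    for node in root.iter("node"):
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        if tags.get(key) == value:
            places.append({
                "name": tags.get("name", ""),   # unnamed nodes get ""
                "lat": float(node.get("lat")),
                "lon": float(node.get("lon")),
            })
    return places
```

Calling extract_places(SAMPLE_OSM, "shop", "convenience") returns the two tagged nodes with their coordinates. A country-sized extract would need a streaming parser such as ET.iterparse rather than loading the whole document into memory.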

Data Twist is, as yet, just an experiment in making open data (like OSM data) more useful in other contexts. It might someday be used as part of OpenRelief, but there are other applications too. The idea was to show someone who didn't care about open source or disaster relief efforts some of the benefits of open data. It is in the early stages of development and he encourages others to take a look.

Wrap-up

All three conferences were held at the Hotel Chinzanso Tokyo and its associated conference (and wedding) center. It was a little off the beaten track—if that phrase can ever be applied to a city like Tokyo—in the Mejiro section of the city. But the enormous garden (complete with fireflies at night) was beautiful; it tended to isolate the conferences from the usual Tokyo "hustle and bustle". As always, the events were well-run and featured a wide array of interesting content.

[I would like to thank the Linux Foundation for travel assistance to Tokyo for LinuxCon Japan.]


RSS feed reading in Firefox

By Nathan Willis
June 12, 2013

Google Reader, arguably the most widely-used feed aggregation tool, is being unceremoniously dumped and shut down at the end of June. As such, those who spend a significant chunk of their time consuming RSS or Atom content have been searching for a suitable replacement. There are a variety of options available, from third-party commercial services to self-hosted web apps to desktop applications. Trade-offs are involved simply in choosing which application type to adopt; for example, a web service provides access from anywhere, but it also relies on the availability of a remote server (whether someone else administrates it or not). But there is at least one other option worth exploring: browser extensions.

As Luis Villa pointed out in April, browsers do at best a mediocre job of making feed content discoverable, and they do nothing to support feed reading directly. But there are related features in Firefox, such as "live bookmarks," which blur together the notion of news feeds and periodically polling a page for changes. Several Firefox add-ons attempt to build a decent feed-reading interface where none currently exists—not all of them exploit the live bookmark functionality for this, although many do. Since recent Firefox releases are capable of synchronizing both bookmarks and add-ons, it lets the user access the same experience across multiple desktop and laptop machines (although an extension-based feed reader does not offer universal availability, as a purely web-based solution does).

Bookmarks, subscriptions; potato, potahto

The most lightweight option available for recent Firefox builds is probably MicroRSS, which offers nothing more than a list of subscribed feeds down the left hand margin, and text of the entries from the selected feed on the right. For some users that may be plenty, of course, but as a practical replacement for Google Reader it falls short, since there is no way to import an existing list of subscriptions (typically as an Outline Processor Markup Language (OPML) file). It also does not count unread items, much less offer searching, sorting, starring, or other news-management features. On the plus side, it is actively maintained, but the license is not specified.

Feed Sidebar is another lightweight option. It essentially just displays the existing "live bookmarks" content in a persistent Firefox sidebar. This mechanism requires the user to subscribe to feeds as live bookmarks, but it has the benefit of being relatively simple. The top half of the sidebar displays the list of subscriptions, with each subscription as a separate folder; its individual items are listed as entries within the folder, which the user must click to open in the browser. Notably, the Firefox "sidebar" is browser chrome and not a page element, which makes the feed sidebar visible in every tab, as opposed to confining the feed reading experience to a single spot. Feed Sidebar is licensed under GPLv2, which is a tad atypical for Firefox extensions, where the Mozilla Public License (MPL) dominates.

When it comes to the simple implementations, it is also worth mentioning that Thunderbird can subscribe to RSS and Atom feeds natively. This functionality is akin to the email client's support for NNTP news; like newsgroups, a subscription is presented to the user much as a POP or IMAP folder is. Feeds with new content appear in the sidebar, new messages are listed in the top pane, and clicking on any of them opens the content in the bottom message pane. Subscribing to news feeds does require setting up a separate "Blogs and News Feeds" account in Thunderbird, though, and users can only read one feed at a time; one cannot aggregate multiple feeds into a folder, for example.

Moving up a bit on the functionality ladder, Sage is an MPL-1.1-licensed extension that stores your subscribed feeds in a (user-selectable) bookmark folder. For reading, it provides a Firefox sidebar with two panes; the upper one presents a list of the subscriptions, and the lower one presents a list of available articles from the selected subscription. The main browser tab shows a summary of each entry in the selected feed, although opening any entry opens up the corresponding page on the original site, rather than rendering it inside the feed-reader UI. As rendering the original page in the browser might suggest, Sage does not store any content locally, so it does not offer search functionality.

The project is actively developed on GitHub, although it is also worth noting that one of the project's means of fundraising is to insert "affiliate" links into feed content that point toward certain online merchants.

The high end

Digest is a fork of a no-longer-developed extension called Brief. It attempts to provide a more full-featured feed-reading experience than some of the other readers; it keeps a count of unread items for each feed, allows the user to "star" individual items or mark them as unread, and quite a few other features one would expect to find in a web service like Google Reader.

[Digest]

As is the case with several other extensions, Digest stores feed subscriptions in a (selectable) bookmarks folder. However, it also downloads entries locally, which enables it to offer content search; thankfully, the user can choose how long old downloads are preserved. It also renders its entire interface within the browser tab in HTML, unlike some of the competition. Digest is licensed under MPL 2.0, and is under active development by its new maintainer on GitHub. It can import (and export) OPML subscription files.
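For readers who have not looked inside an OPML subscription file: it is an XML outline in which each feed is conventionally an "outline" element carrying an "xmlUrl" attribute, with folders expressed as nested outlines. The following is a minimal Python sketch of pulling the feed URLs out of such a file. The function name and the sample data are made up for illustration, and nothing here is taken from Digest's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real exports usually carry more attributes.
SAMPLE_OPML = """<?xml version="1.0"?>
<opml version="1.0">
  <head><title>Subscriptions</title></head>
  <body>
    <outline text="News">
      <outline type="rss" text="Example A" xmlUrl="http://example.com/a.rss"/>
    </outline>
    <outline type="rss" text="Example B" xmlUrl="http://example.org/b.atom"/>
  </body>
</opml>
"""

def feed_urls(opml_text):
    """Return the xmlUrl of every outline that has one, in document order.

    Folder outlines (those without xmlUrl) are skipped, but their
    children are still visited because iter() walks the whole tree.
    """
    root = ET.fromstring(opml_text.encode("utf-8"))
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]

if __name__ == "__main__":
    for url in feed_urls(SAMPLE_OPML):
        print(url)
```

Because the folder structure is flattened here, a real importer that wants to preserve folders would need to track each outline's parent as well.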

Like Digest, Newsfox replicates much of the Google Reader experience inside Firefox. The look is a bit different, since Newsfox incorporates a three-pane interface akin to an email client. This UI is implemented in browser chrome, but unlike the earlier live-bookmark–based options, it still manages to reside entirely within one tab. That said, Newsfox expects to find subscriptions in the default Live Bookmarks folder, and there does not appear to be a way to persuade it to look elsewhere. Perhaps more frustrating, it either does not understand subfolders within the Live Bookmarks folder, or it chooses to ignore them, so dozens or hundreds of feeds are presented to the user in a single, scrolling list.

[Newsfox]

On the plus side, Newsfox offers multi-tier sorting; one can tell it to first sort feeds alphabetically (increasing or decreasing), then sort by date (again, increasing or decreasing), and so on, up to four levels deep. It can also encrypt the locally downloaded feed content, which might appeal to laptop users, and is an option none of the other extensions seems to feature. Downloaded entries can be searched, which is a plus, and on the whole the interface is fast and responsive, more so than Digest's HTML UI.
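To make the idea of multi-tier sorting concrete, here is a small Python sketch of one common way to implement it: apply a stable sort once per level, starting with the least significant level. This is an illustration of the general technique only; it is not how Newsfox is known to implement the feature, and the field names and the four-level cap in the code are assumptions drawn from the description above.

```python
from operator import itemgetter

MAX_TIERS = 4  # assumption: mirrors the "four levels deep" described above

def multi_tier_sort(entries, tiers):
    """Sort a list of dicts by several (field, descending) tiers.

    The first tier is the most significant. Python's list.sort() is
    stable, even with reverse=True, so sorting by the last tier first
    and the first tier last yields the combined ordering.
    """
    if len(tiers) > MAX_TIERS:
        raise ValueError("at most %d sort levels are supported" % MAX_TIERS)
    result = list(entries)  # leave the caller's list untouched
    for field, descending in reversed(tiers):
        result.sort(key=itemgetter(field), reverse=descending)
    return result

if __name__ == "__main__":
    # Hypothetical entries; "feed" and "date" are illustrative field names.
    items = [
        {"feed": "b", "date": 2},
        {"feed": "a", "date": 1},
        {"feed": "a", "date": 3},
    ]
    # Feed name ascending, then date descending within each feed.
    for item in multi_tier_sort(items, [("feed", False), ("date", True)]):
        print(item["feed"], item["date"])
```

Sorting once per tier costs a few extra passes compared with building a single composite key, but it handles mixed ascending and descending levels without needing to negate non-numeric fields.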

The last major option on the full-fledged feed-aggregator front is Bamboo, an MPL-1.1-licensed extension that appears to be intentionally aiming for Google Reader replacement status—right down to the UI, which mimics the dark gray "Google toolbar" currently plastered across all of the search giant's web services. The interface is rendered in HTML, and uses the decidedly Google Reader–like sidebar layout, rendering feed content within the right-hand pane. Bamboo supports all of the basic features common to the high-end aggregators already discussed: OPML import/export, folders, search, sorting, marking items as read/unread, and locally storing feed content. It also adds more, such as the ability to star "favorite" items, the ability to save items for offline reading, a toggle-able headline-or-full-item display setting, and a built-in ad blocker.

[Bamboo]

Interestingly enough, despite its comparatively rich feature set, Bamboo does not allow the user to select the bookmark folder in which it keeps track of feed subscriptions. Instead, like Newsfox, it only examines the default Live Bookmarks folder.

And the rest

If one goes searching for "RSS" on the Firefox Add-ons site, there are plenty more options that turn up, many of which reflect entirely different approaches to feed aggregation. For example, SRR offers a "ticker"-style scroll of headlines from subscribed feeds, which is useful for a handful of feeds at best. Dozens or hundreds, however, will overpower even the toughest attention span. Or there is Newssitter, which provides a "bookshelf"-style interface that seems visually designed for reading primarily on a mobile device. That may meet the needs of many news junkies, of course, but it bears little resemblance to the Google Reader experience; getting a quick overview of dozens of feeds is not possible, for example.

Selecting a Google Reader replacement is not a simple task; everyone uses the service in slightly different ways, and all of the options offer different (and overlapping) subsets of the original product's feature set.

The bare-bones feed-reading extensions all have big limitations that probably make them less useful as a drop-in replacement; for instance, they may not check for new content in the background, and they certainly do not provide much search functionality. For a user with a lot of subscriptions, supplementary features like searching and saving items can take the application from mediocre to essential. After all, it is frequently hard to backtrack to a barely-remembered news story weeks or months after reading the original feed.

To that end, the more fleshed-out Google Reader alternatives offer a much more useful experience in the long run. Only time will tell how solid they are over the long haul, of course—it is not beyond reason to think that some of them will start to slow down or wobble with months of saved content to manage. On the other hand, none of them can offer one key feature of Google Reader: the months' (or in many cases years') worth of already read news items. Most individual feeds do not publish their site's entire history, but Google Reader could search years' worth of already read material. That is just one of the things people lose when a web service shuts down.

Based on my early experiments, Bamboo offers the most features, while Newsfox is faster, but Digest is more flexible. It is tempting to fall back on that familiar old saying: you pays your nickel and you takes your chances (though sans nickel in free software circles). But because all three options can follow and display the same set of feeds, it may be worth installing more than one and giving them a simultaneous test drive for a week or so. At the very least, Firefox can synchronize the bookmarks and add-ons, providing you with some way to get at your subscriptions when away from home—at least if there is a Firefox installation nearby.


Page editor: Nathan Willis

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds