June 10, 2013
This article was contributed by Josh Berkus
This year's pgCon, which concluded
May 25th,
brought an unusually high number of changes to the PostgreSQL community,
codebase, and development. Contributors introduced multiple new major
projects which will substantially change how people use PostgreSQL,
including parallel query, a new binary document store type, and pluggable
storage. In addition, Tom Lane switched jobs, four new committers were
selected, pgCon
had the highest attendance ever at 256 registrations, and held its first unconference after the regular conference. Overall, it was a mind-bending and exhausting week.
pgCon is a PostgreSQL developer and advanced user conference held in
Ottawa, Canada every year in May. It usually brings together most of the
committers and major contributors to the project in order to share ideas,
present projects and new features, and coordinate schedules and code
changes. The main conference days are preceded by various summits,
including the PostgreSQL Clustering Summit, the Infrastructure Team meeting, and the Developer Meeting. The latter is a closed meeting of 18 to 25 top code contributors to PostgreSQL who coordinate feature development and plans for the next version (PostgreSQL 9.4).
Parallel query
The longest and most interesting discussion at the developer meeting was about adding parallel query capabilities to PostgreSQL. Currently, the database is restricted to using one process and one core to execute each individual query, and cannot make use of additional cores to speed up CPU-bound tasks. While it can execute dozens of separate queries simultaneously, the lack of individual-query parallelism is still very limiting for users of analytics applications, for which PostgreSQL is already popular.
Bruce Momjian announced on behalf of EnterpriseDB that its engineering
team would be focusing on parallelism for the next version of PostgreSQL.
Noah Misch will be leading this project. The project plans to have index
building parallelized for 9.4, but most of its work will be creating a general framework for parallelism. According to Momjian, there are three things you need for any parallel operation in the database (a toy sketch in Python follows the list):
- An efficient way of passing data to the parallel backends, probably using a shared memory facility.
- A method for starting and stopping worker processes.
- The ability for the worker processes to share the reference data and state information of the parent process.
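To make the process-based model concrete, here is a toy sketch using Python's multiprocessing module: data is passed to worker processes explicitly, the parent starts and stops the workers, and nothing is shared by default. It only illustrates the model described above; none of these names correspond to PostgreSQL APIs, and the actual framework will be written in C.

```python
# Toy illustration of the process model: explicit data passing, explicit
# worker start/stop, nothing shared by default. Not PostgreSQL code.
from multiprocessing import Process, Queue

def worker(tasks, results):
    # Each worker sorts the chunks it is handed, a stand-in for one piece
    # of a parallel in-memory sort such as a parallel index build might use.
    for chunk in iter(tasks.get, None):      # None is the stop sentinel
        results.put(sorted(chunk))

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results)) for _ in range(4)]
    for w in workers:
        w.start()                            # start the worker processes
    chunks = [[5, 3, 9], [2, 8, 1], [7, 4, 6], [0, 9, 9]]
    for chunk in chunks:
        tasks.put(chunk)                     # pass data to the workers explicitly
    for _ in workers:
        tasks.put(None)                      # ask each worker to stop
    partial = [results.get() for _ in chunks]
    for w in workers:
        w.join()
    print(sorted(sum(partial, [])))          # the parent merges the partial results
```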
The EnterpriseDB team had explored using threads for worker processes,
but these were not seen as a productive approach, primarily because the
PostgreSQL development team is used to working with processes and the
backend code is structured around them. While the cost of starting up
processes is high compared to threads, the additional locking required for
threading looked to be just as expensive in performance terms. Momjian put it this way:
With threads, everything is shared by default and you have to take specific steps not to share.
With processes, everything is unshared by default, and you have to specifically share things.
The process model and explicit sharing is a shorter path from where we are currently.
The PostgreSQL developers plan to build a general framework for parallelism and then work on parallelizing one specific database task at a time. The first parallel feature is expected to be building indexes in parallel using a parallel in-memory sort. This is an important feature for users because building indexes is often slower than populating the underlying table, and it is often CPU-bound. It's also seen as a good first task because index builds run for minutes rather than milliseconds, so optimizing worker startup costs can be postponed until later development.
PostgreSQL and non-volatile memory
Major contributor KaiGai Kohei of NEC brought up the recent emergence of non-volatile memory (NVRAM), or persistent memory, devices and discussed ideas on how to take advantage of them for PostgreSQL. Intel engineer Matthew Wilcox further reinforced the message that NVRAM was coming in his lightning talk on the first day of pgCon. NVRAM persists its contents across a system power cycle, but is addressed like main memory and runs at roughly half its speed.
Initially, Kohei is interested in using NVRAM for the PostgreSQL Write Ahead Log (WAL), an append-only set of files that is used to guarantee transactional integrity and crash safety. This will work with the small sizes and limited write cycles of the early NVRAM devices. For servers with NVRAM, WAL writes would go to a memory region on the device allocated using mmap(). In later generations of NVRAM, developers can look at using it for the main database files.
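As a very rough sketch of the mmap() idea, the following Python snippet appends length-prefixed log records through a memory mapping, with an ordinary file standing in for an NVRAM region. It only illustrates the access pattern; PostgreSQL's WAL code is written in C, and the file name, record format, and sizes here are invented for the example.

```python
# Illustration only: append log records through a memory mapping. An ordinary
# file stands in for an NVRAM region; nothing here is PostgreSQL WAL code.
import mmap
import os
import struct

PATH, SIZE = "wal_region.bin", 1 << 20          # invented 1 MiB "NVRAM" region

fd = os.open(PATH, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, SIZE)
region = mmap.mmap(fd, SIZE)                    # map the region into memory

offset = 0

def append_record(payload: bytes):
    """Write one length-prefixed record directly into the mapped region."""
    global offset
    record = struct.pack("I", len(payload)) + payload
    region[offset:offset + len(record)] = record
    offset += len(record)
    region.flush()      # stand-in for whatever write-ordering barrier NVRAM needs

append_record(b"INSERT INTO t VALUES (1)")
append_record(b"COMMIT")
region.close()
os.close(fd)
```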
There are many unknowns about this technology, such as what method can be employed to guarantee absolute write ordering. Developers speculated about whether transactional memory could somehow be employed for this. Right now, the PostgreSQL community is waiting to get its collective hands on an NVRAM device for testing and development.
Disqus keynote
Mike Clarke, Operations Lead of Disqus, delivered the
keynote for pgCon this year. Disqus is the leading comment-hosting
platform, which is used extensively by blogs and news sites all over the
internet. Its technology platform includes Python, Django, RabbitMQ, Cassandra, and PostgreSQL.
Most of its over three terabytes of data, including comments, threads, forums, and user profiles, is stored in PostgreSQL. This adds up to over 50,000 writes per second to the database, and millions of new posts and threads a day.
Clarke extolled the virtues of SSD storage. Disqus uses a 6-node
master-slave replication database cluster running on fast machines with
RAIDed SSD storage. Only SSDs have allowed it to continue scaling to its
current size. Prior to moving to SSD-based storage, Disqus was at 100% IO
utilization and had continual problems with long IO wait times. Now
utilization is down and wait times are around one millisecond.
He also complained about some of the pain points of scaling out
PostgreSQL. Disqus uses Slony-I, PostgreSQL's older replication system,
which the company has customized to its workload so heavily that it feels it cannot afford to upgrade. For that reason, Clarke is eagerly awaiting the new logical
replication system expected with PostgreSQL 9.4 next year. He also was
unhappy about the lack of standard design patterns for PostgreSQL proxying and failover; everyone seems to build their stack differently. On the other hand, he praised extensions as the best feature of PostgreSQL, since they allow building applications inside the database.
Clarke ended with a request for some additional PostgreSQL features. He
wants tools to enable sharded multiserver databases to be built inside
PostgreSQL more easily, such as by improving PL/Proxy, the
distributed table interface extension for PostgreSQL introduced by Skype.
He'd also like to see a query progress indicator, something that was later presented at pgCon by Jan Urbański.
HStore and the future of JSON
During the regular talks, Oleg Bartunov and Teodor Sigaev introduced the
prototype of the next version of their "hstore"
extension for PostgreSQL. hstore allows a simple key-value store,
a "hash" or "dictionary", to be stored in a PostgreSQL field, with indexing on the keys. Today, many PostgreSQL users store "flattened" JSON objects in hstore so that they can be indexed on all keys. The presentation introduced a new version of hstore that can nest and can store arrays, making it a closer match for fully structured JSON, as well as for complex multi-level hashes and dictionaries in Perl and Python.
The new hstore prototype also supports indexing that enables very fast
lookup of keys, values, and even document fragments many levels deep. In
their tests, they used the Del.icio.us dataset, which includes 1.2 million
bookmark documents, and were able to search out all values matching a
complex nesting expression in 0.1 seconds, or all instances of a common key
in 0.5 seconds. The indexes are also reasonably sized, at around 75% of
the size of the data to which they are attached. Earlier attempts to index tree-structured text data in PostgreSQL and
other databases have resulted in indexes which are significantly larger
than the base table.
Individual hstore fields can be up to 512MB in size.
While many attendees were excited and impressed by the prototype, some were unhappy. Several contributors were upset that the new type wasn't JSON. They argued that the PostgreSQL project didn't need a non-standard type and interface, when what developers want is a binary, indexed JSON type. After extensive discussion, Bartunov and Sigaev agreed to work on JSON either instead of, or in addition to, a new hstore for the next version.
Hopefully, this means that users can expect a JSON type for version 9.4 that supports arbitrary nested key lookup and complex search expressions. This would make PostgreSQL more suitable for applications which currently use a JSON document database, such as MongoDB or CouchDB. With the addition of compatibility projects like Mongres, users might even be able to run such applications largely unaltered.
Pluggable storage
The final day of pgCon this year was the conference's first-ever "unconference day". An unconference is a meeting in which the attendees select sessions and compose the schedule at the event. Unconferences tend to be more discussion-oriented than regular conferences, and center more around recent events and ideas. Around 75 of the pgCon attendees stayed for the unconference.
One of the biggest topics discussed at the unconference was the idea of making PostgreSQL's storage "pluggable". Historically, companies have wanted to tailor the database for particular workloads by adding support for column-store tables, clustered storage, graphs, streaming data, or other special-purpose data structures. These changes have created incompatible forks of PostgreSQL, such as Greenplum or Vertica, cutting off any development collaboration with those vendors. Other companies, such as Huawei and Salesforce, which are newly involved in PostgreSQL, would like to be able to change the storage model without forking the code.
The PostgreSQL contributors discussed methods of accomplishing this.
First, they discussed the possibility of using the Foreign Data
Wrapper (FDW) facility to attach new storage types. Foreign Data
Wrappers allow users to attach external data, such as other databases,
through a table interface. After some discussion, this was seen as
unsuitable in most cases, since users want to actually manage tables,
including creation, backup, and replication, through PostgreSQL, not just
"have a window" into them. They also want to support creating indexes on
different storage types.
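For context, the snippet below illustrates the existing FDW interface that was judged a poor fit, using the file_fdw extension that ships with PostgreSQL to expose a CSV file as a table (driven here through psycopg2; the file path, server, and table names are invented for the example):

```python
# Illustration of the existing FDW interface using file_fdw; names invented.
import psycopg2

conn = psycopg2.connect("dbname=test")
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS file_fdw")
cur.execute("CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw")
cur.execute("""
    CREATE FOREIGN TABLE readings (ts timestamptz, value numeric)
        SERVER csv_files
        OPTIONS (filename '/tmp/readings.csv', format 'csv')
""")
# The foreign table is queried like any other, but PostgreSQL neither owns nor
# manages its storage, which is why FDWs fall short for pluggable storage.
cur.execute("SELECT count(*) FROM readings")
print(cur.fetchone())
conn.commit()
```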
If FDW won't work, the developers will need to create a new set of hooks and an API for "storage managers". This was actually supported by early versions of POSTGRES at the University of California, which had prototypes of both an in-memory and a write-once media (WORM) storage manager. However, that code has atrophied and doesn't support most current PostgreSQL features.
For any potential storage, the storage manager would need to support several conventions of PostgreSQL (a hypothetical interface sketch follows the list), including:
- having tuples (rows) which are structured like PostgreSQL's tuples, including metadata
- being transactional
- providing a method for resolving data visibility
- providing a physical row identifier for index building
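The sketch below simply restates those conventions as a Python interface; it is purely hypothetical, since no such storage-manager API exists in PostgreSQL, and every name in it is invented for illustration.

```python
# Purely hypothetical interface; no such API exists in PostgreSQL.
from abc import ABC, abstractmethod

class StorageManager(ABC):
    @abstractmethod
    def insert_tuple(self, relation, heap_tuple):
        """Store a row structured like a PostgreSQL tuple (with its metadata)
        and return a physical row identifier that indexes can point at."""

    @abstractmethod
    def fetch_tuple(self, relation, row_id, snapshot):
        """Return the row only if it is visible under the given snapshot;
        the storage must provide a way to resolve data visibility."""

    @abstractmethod
    def commit(self, transaction_id):
        """Make a transaction's changes durable; the storage must be
        transactional."""

    @abstractmethod
    def abort(self, transaction_id):
        """Throw away a transaction's changes."""
```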
The PostgreSQL system catalogs would stay on the current conventional native storage, regardless of what new types of storage managers were added.
If implemented, this would be a major change to the database system. It
would become possible to use PostgreSQL as a query engine, transaction
manager, and interface for very different types of databases, both
proprietary and open source. It might even become possible for MySQL
"storage engine" vendors, such as Infobright and Tokutek, to port their
products. Peter van Hardenberg of Heroku suggested it might also make it
possible to
run PostgreSQL on top of HDFS.
Committer changes
The most frequently quoted news from pgCon this year was that Tom Lane, lead committer on PostgreSQL, was changing employers from Red Hat to Salesforce. While announced in a rather low-key way through the Developer Meeting notes and Lane's show badge, this was big enough news that Wired picked it up. Lane had worked at Red Hat for 11 years, having joined to support Red Hat Database, its distribution of PostgreSQL. While Red Hat Database was eventually canceled, Lane stayed on at Red Hat, which was very supportive of his contributions to the project.
Lane's move is more significant in what it says about Salesforce's
commitment to PostgreSQL than any real change in his expected activities as
a committer. Until now, most commentators have suggested that Salesforce's
mentions of PostgreSQL were merely posturing, but hiring Lane suggests that
it plans to follow through on migrating away from Oracle Database. Six other Salesforce staff also attended pgCon. Its exact plans were not shared with the community, although it's reasonable to hypothesize from development discussions at the conference that Salesforce plans to contribute substantially to the open-source project, and that pluggable storage is a development target.
Lane memorialized his change of employment by putting his nine-year-old Red Hat laptop bag into the charity auction at the end of pgCon. It sold for $170.
The PostgreSQL Core Team, a six-member steering committee for the project, announced the selection of four new committers to PostgreSQL: Jeff Davis of Aster Data, author of the range types feature in version 9.2; Fujii Masao of NTT Data, main author of the synchronous replication feature; Stephen Frost of Resonate, author of several security features; and Noah Misch of EnterpriseDB, author of numerous SQL improvements. This brings the number of committers on PostgreSQL to twenty.
More PostgreSQL
Of course, there were many other interesting presentations and talks at pgCon. Keith Paskett ran a tutorial on optimizing and using PostgreSQL on ZFS atop OmniOS (an OpenSolaris fork), while other users talked about using PostgreSQL on ZFS for Linux. Jeff Davis presented strategies to use PostgreSQL's new anti-disk-corruption features. Josh McDermott ran another Schemaverse tournament, as a qualifier for the upcoming Defcon tournament. Robert Haas showed the most common failures of the PostgreSQL query planner, and sparked discussion about how to fix them.
On the second full conference day, Japanese community members presented the newly formed PostgreSQL Enterprise Consortium of Japan, a group of 39 Japanese companies aiming to promote and improve PostgreSQL. This group is currently working on clustered PostgreSQL, benchmarking, and tools for migrating from other database systems. And just for fun, Álvaro Hernández Tortosa demonstrated creating one billion tables in a single PostgreSQL database.
Overall, it was the most exciting pgCon I've attended, and it shows the many new directions in which PostgreSQL development is going. Anyone who was there came away with the impression that the project will be completely reinventing the database within a few years. If you work with PostgreSQL, or are interested in contributing to it, you should consider attending next year.
[ Josh Berkus is a member of the PostgreSQL Core Team. ]
Comments (6 posted)
As always, there were more sessions at the recently completed triumvirate
of Linux Foundation conferences in Tokyo than can be written up. In fact,
also as usual, there were more sessions available than people to cover
them. The Automotive
Linux Summit Spring, LinuxCon
Japan, and CloudOpen
Japan covered a lot of ground in five days. Here are reports from three presentations at
LinuxCon.
OSS meetups in Japan
Hiro Yoshioka spoke about the types of open source gatherings that go on in
Japan. He is the technical managing officer for Rakuten, which is a large
internet services company in Japan. Before that, he was the CTO of Miracle
Linux from 2000 to 2008.
The goal of his talk was to encourage other Japanese people in the audience to start
up their own "meetups" and other types of technical meetings and seminars,
but the
message was
applicable anywhere. Organizing these meetings is quite rewarding, and
lots of fun, but it does take some time to do, he said.
Yoshioka used the "kernel code reading party" that he started in Yokohama in
April 1999 as a case study. He wondered if he would be able to read the kernel
source code, so he gathered up some members of the Yokohama Linux Users
Group to create an informal technical seminar to do so. The name of
the meeting
has stuck, but the group no longer reads kernel source. Instead, they have
presentations on kernel topics, often followed by a "pizza and beer party".
There are numerous advantages to being the organizer of such a meeting, he
said. You get to choose the date, time, and location for the event, as
well as choosing the speakers. When he wants to learn about something in
the kernel, he asks someone who knows about it to speak. Presenters also
gain from the experience because they get to share their ideas in a relaxed
setting. In addition, they can talk about an "immature idea" and get
"great feedback" from those attending. Attendees, of course, get to hear
"rich technical information".
Being the organizer has some downsides, mostly in the amount of time it
takes. The organizer will "need to do everything", Yoshioka said, but
sometimes the community will help out. In order to make the meetings
"sustainable", the value needs to exceed the cost. So either increasing
the value or decreasing the cost are ways to help make the meetings
continue. Finding great speakers is the key to making the value of the
meetings higher, while finding inexpensive meeting places is a good way to
bring down costs.
How to find the time to organize meetings like those he mentioned was one
question from the audience. It is a difficult question, Yoshioka said, but
as with many things it comes down to your priorities. Another audience
member noted that convincing your employer that the meeting will be useful
in your job may allow you to spend some of your work time on it. "Make it
part of your job".
Another example that Yoshioka gave was the Rakuten Technology
Conference, which has been held yearly since 2007. It is a free
meeting with
content provided by volunteers. In the past, it has had keynotes from Ruby
creator Matz and Dave Thomas of The Pragmatic Programmer. Proposals
for talks are currently under discussion for this year's event, which will
be held on October 26 near Shinagawa station in Tokyo. Unlike many other
technical meetings in Japan, the conference is all in English, he said.
The language barrier was of interest to several non-Japanese audience
members. Most of the meetings of the kind Yoshioka described are, unsurprisingly,
in Japanese, but for non-speakers there are a few possibilities. The Tokyo
hackerspace has half of its meetings in English, he said, and the Tokyo
Linux Users Group has a web page and
mailing list in English.
In addition, Yoshioka has an English-language blog with occasional posts covering the
kernel code
reading party meetings and other, similar meetings.
One laptop per child
A Kroah-Hartman different from the usual suspect spoke in the student track.
In a presentation that followed her father's, Madeline Kroah-Hartman looked
at the One Laptop Per Child (OLPC) project, its history, and some of its
plans for the future. She has been using the OLPC for a number of years,
back to the original XO version,
and she brought along the newest model, XO-4 Touch, to show.
The project began in 2005 with the goal of creating a laptop for children
that could be sold for $100. It missed that goal with the initial XO, but did
ship 2.5 million of the units, including 83,000 as part of the "Give 1 Get
1" program that started in 2007. The idea was to have a low-powered laptop
that would last the whole school day, which the XO is capable of, partly
because it "sleeps between keystrokes" while leaving the display on, she said.
Recharging the laptops has been something of a challenge for the project,
particularly in developing countries where electricity may not be
available. Various methods have been tried, from a hand crank to a "yo-yo
charger" that was never distributed. Using the yo-yo got painful after
ten minutes, she said, but it took one and a half hours to fully charge the
device. Solar-powered charging is now the norm.
OLPCs were distributed in various countries, including 300,000 to Uruguay
(where every child in the country got one) and 4,500 to women's schools in
Afghanistan, as well as to Nicaragua, Rwanda, and others. In Madagascar, the youngest
students were teaching the older ones how to use the laptops, while in
India the attendance rate neared 100% for schools that had OLPCs, she said.
OLPCs generally run the Sugar
environment on top of Fedora. It is a "weird" interface that sometimes
doesn't work, she said, but it is designed for small children. That means
it has lots of pictures as part of the interface to reduce clutter and make
it more intuitive for that audience. There are lots of applications
that come with the OLPC, including the Etoys authoring environment, a Python
programming environment, the Scratch 2D animation tool, a physics
simulation program, a local copy of
Wikipedia in the native language, a word processor, and more. The Linux
command line is also available in a terminal application, though children
may not actually use it in practice, she said.
The first model was designed so that you could "throw it at a wall" and it
wouldn't break, she said. Various other versions were created over the
years, including the XO-1.5, a
dual-touchscreen XO-2 that was
never released, and the XO-4 Touch. The latter will be shipping later this
year. There is also the Android-based XO tablet
that will be selling at Walmart for $100 starting in June. It is "very
different" than the Sugar-based XOs, Kroah-Hartman said, but will come
pre-loaded with education and mathematical apps.
There are lots of ways to participate in the project, she said, many of
which are listed on the Participate wiki page.
She noted that only 30% of the XO software is translated to Japanese, so
that might be one place for attendees to start.
OpenRelief
In an update to last year's
presentation, Shane Coughlan talked about the progress (and setbacks)
for the OpenRelief project. That project had its genesis at the 2011
LinuxCon Japan—held shortly after the earthquake, tsunami, and nuclear
accident that hit Japan—as part of a developer panel discussion about
what could be done
to create open source technical measures to help out disaster relief efforts.
That discussion led to the creation
of the OpenRelief project, which seeks
to build a robotic airplane (aka drone) to help relief workers "see through
the fog" to get the right aid to the right place at the right time.
The test airframe he displayed at last year's event had some durability
flaws: "airframes suck", he said. In particular, the airframe would
regularly break in ways that would be difficult to fix in the field.
Endurance is one of the key features required for a disaster relief
aircraft, and the project had difficulty finding one that would both be
durable
and fit into
its low price point ($1000 for a fully equipped plane, which left
$100-200 for the airframe).
In testing the original plane, though, OpenRelief found that the navigation
and flight software/hardware
side was largely a solved problem, through projects like ArduPilot and CanberraUAV. Andrew Tridgell
(i.e. "Tridge" of Samba and other projects) is part of the CanberraUAV
team, which won the 2012 Outback Rescue Challenge;
"they completely rock", Coughlan said. The "unmanned aerial vehicle" (UAV)
that was used by CanberraUAV was "a bit big and expensive" for the
needs of OpenRelief, but because it didn't have to focus on the flight
software side of things, the project could turn to other parts of the problem.
One of those was the airframe, but that problem may now be solved. The
project was approached by an "aviation specialist" who had created a
regular airframe as part of a project to build a vertical takeoff and
landing (VTOL) drone to be sold to the military. It is a simple design
with rails to attach the wings and wheels as well as to hang payloads
(e.g. cameras, radiation detectors, ...). Dual servos on the
control surfaces provide redundancy. It is about the same size as
the previous airframe, but can go 40km using an electric engine rather than
20km as the older version did. It can also carry 9kg of payload vs. the
0.5kg available previously. With an optional gasoline-powered engine, the
range will increase to 200-300km.
OpenRelief released
the design files for this new airframe on the day of Coughlan's talk. It
is something that "anyone can build", he said. Test flights are coming
soon, but he feels confident that the airframe piece, at least, is now
under control. There is still plenty of work to do in integrating all of
the different components into a working system, including adding some
"mission control"
software that can interface with existing disaster relief systems.
Coughlan also briefly mentioned another project he has been working on,
called Data Twist. The
OpenStreetMap (OSM) project is
popular in Japan—where Coughlan lives—because the "maps are great", but the
data in those maps isn't always easy to get at. Data Twist is a Ruby
program that processes the OSM XML data to extract information to build
geo-directories.
A geo-directory might contain "all of the convenience stores in
China"—there were 43,000 of them as of the time of his talk—for example.
Data Twist uses the
categories tagged in the OSM data and can extract the locations into a WordPress
Geo Mashup blog
post, which will place the locations on maps in the posts.
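Data Twist itself is written in Ruby and its code is not shown here, but the same kind of extraction can be sketched in a few lines of Python: walk an OpenStreetMap XML extract and pull out every node tagged as a convenience store (the file name is a placeholder).

```python
# Rough sketch of OSM-XML extraction in the spirit of Data Twist; not its code.
import xml.etree.ElementTree as ET

def convenience_stores(osm_file):
    """Yield (name, lat, lon) for every node tagged shop=convenience."""
    for _, elem in ET.iterparse(osm_file):
        if elem.tag == "node":
            tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
            if tags.get("shop") == "convenience":
                yield (tags.get("name", "unnamed"),
                       elem.get("lat"), elem.get("lon"))
        if elem.tag in ("node", "way", "relation"):
            elem.clear()          # keep memory bounded on large extracts

for store in convenience_stores("japan-extract.osm"):
    print(store)
```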
Data Twist is, as yet, just an experiment in making open data (like OSM
data) more useful in other contexts. It might someday be used as part of
OpenRelief, but there are other applications too. The idea was to show
someone who didn't care about open source or disaster relief efforts some
of the benefits of open data. It is in the early stages of development and
he encourages others to take a look.
Wrap-up
All three conferences were held at the Hotel Chinzanso Tokyo and
its associated conference (and wedding) center. It was a little off the
beaten track—if that phrase can ever be applied to a city like Tokyo—in the
Mejiro section of the city. But the enormous garden (complete with
fireflies at night) was beautiful; it tended to isolate the conferences
from the usual Tokyo "hustle and bustle". As always, the events were
well-run and featured a wide array of interesting content.
[I would like to thank the Linux Foundation for travel assistance to Tokyo
for LinuxCon Japan.]
Comments (none posted)
By Nathan Willis
June 12, 2013
Google Reader, arguably the most widely-used feed aggregation tool,
is being unceremoniously dumped and shut down at the end of June. As
such, those who spend a significant chunk of their time consuming RSS
or Atom content have been searching for a suitable replacement. There
are a variety of options available, from third-party commercial
services to self-hosted web apps to desktop applications. Trade-offs
are involved simply in choosing which application type to adopt; for
example, a web service provides access from anywhere, but it also
relies on the availability of a remote server (whether someone else
administers it or not). But there is at least one other option
worth exploring: browser extensions.
As Luis Villa pointed out in April,
browsers do at best a mediocre job of making feed content
discoverable, and they do nothing to support feed reading directly.
But there are related features in Firefox, such as "live
bookmarks," which blur together the notion of news feeds and
periodically polling a page for changes. Several Firefox add-ons
attempt to build a decent feed-reading interface where none currently
exists—not all of them exploit the live bookmark functionality
for this, although many do. Since recent Firefox releases can
synchronize both bookmarks and add-ons, users can get
the same experience across multiple desktop and laptop machines (although an
extension-based feed reader does not offer universal availability, as
a purely web-based solution does).
Bookmarks, subscriptions; potato, potahto
The most lightweight option available for recent Firefox builds is
probably MicroRSS,
which offers nothing more than a list of subscribed feeds down the
left-hand margin, and the text of the entries from the selected feed on
the right. For some users that may be plenty, of course, but as a
practical replacement for Google Reader it falls short, since there is
no way to import an existing list of subscriptions (typically as an Outline Processor Markup
Language (OPML) file). It also does not count unread items, much
less offer searching, sorting, starring, or other news-management
features. On the plus side, it is actively maintained, but the
license is not specified.
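For those unfamiliar with the format, importing an OPML file amounts to little more than reading feed URLs out of an XML document. The rough Python sketch below lists the subscriptions in a Google Reader-style subscriptions.xml export; it is not code from any of the extensions reviewed here, and the file name is a placeholder.

```python
# Rough sketch of reading an OPML subscription list; not from any extension.
import xml.etree.ElementTree as ET

def feeds_from_opml(path):
    """Return (title, feed URL) pairs from an OPML subscription list."""
    tree = ET.parse(path)
    return [(o.get("title") or o.get("text", ""), o.get("xmlUrl"))
            for o in tree.iter("outline") if o.get("xmlUrl")]

for title, url in feeds_from_opml("subscriptions.xml"):
    print(title, url)
```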
Feed
Sidebar is another lightweight option. It essentially just
displays the existing "live bookmarks" content in a persistent Firefox
sidebar. This mechanism requires the user to subscribe to feeds as
live bookmarks, but it has the benefit of being relatively simple.
The top half of the sidebar displays the list of subscriptions, with
each subscription as a separate folder; its individual items are
listed as entries within the folder, which the user must click to open
in the browser. Notably, the Firefox "sidebar" is browser chrome and
not a page element, which makes the feed sidebar visible in every tab,
as opposed to confining the feed reading experience to a single spot.
Feed Sidebar is licensed under GPLv2, which is a tad atypical for
Firefox extensions, where the Mozilla Public License (MPL) dominates.
When it comes to the simple implementations, it is also worth
mentioning that Thunderbird can subscribe to RSS and Atom feeds
natively. This functionality is akin to the email client's support
for NNTP news; like newsgroups, a subscription is presented to the
user much as a POP or IMAP folder is. Feeds with new content appear in
the sidebar, new messages are listed in the top pane, and clicking on
any of them opens the content in the bottom message pane. Subscribing to
news feeds does require setting up a separate "Blogs and News Feeds"
account in Thunderbird, though, and users can only read one feed at a
time—one cannot aggregate multiple feeds into a folder, for example.
Moving up a bit on the functionality ladder, Sage is an MPL-1.1-licensed extension
that stores your subscribed feeds in a (user-selectable) bookmark
folder. For reading, it provides a Firefox sidebar with two panes; the upper
one presents a list of the subscriptions, and the lower one presents a
list of available articles from the selected subscription. The main
browser tab shows a summary of each entry in the selected feed,
although opening any entry opens up the corresponding page on the
original site, rather than rendering it inside the feed-reader UI. As
rendering the original page in the browser might suggest, Sage does
not store any content locally, so it does not offer search
functionality.
The project is actively developed on GitHub, although it is
also worth noting that one of the project's means of fundraising is to
insert "affiliate" links into feed content that points toward certain
online merchants.
The high end
Digest
is a fork of a no-longer-developed extension called Brief. It attempts to provide a
more full-featured feed-reading experience than some of the other
readers; it keeps a count of unread items for each feed, allows the
user to "star" individual items or mark them as unread, and offers quite a
few other features one would expect to find in a web service like
Google Reader.
As is the case with several other extensions, Digest stores feed
subscriptions in a (selectable) bookmarks folder. However, it also
downloads entries locally (thankfully allowing the user to choose how long
old downloads are preserved), which enables it to offer
content search. It also renders its entire interface within the
browser tab in HTML, unlike some of the competition. Digest is licensed as MPL
2.0, and is actively under development by its new maintainer at GitHub. It can import
(and export) OPML subscription files.
Like Digest, Newsfox
replicates much of the Google Reader experience inside Firefox. The
look is a bit different, since Newsfox incorporates a three-pane
interface akin to an email client. This UI is implemented in browser
chrome, but unlike the earlier live-bookmark–based options, it still
manages to reside entirely within one tab. That said, Newsfox expects
to find subscriptions in the default Live Bookmarks folder, and there
does not appear to be a way to persuade it to look elsewhere. Perhaps
more frustrating, it either does not understand subfolders within the
Live Bookmarks folder, or it chooses to ignore them, so dozens or
hundreds of feeds are presented to the user in a single, scrolling
list.
On the plus side, Newsfox offers multi-tier sorting; one can tell
it to first sort feeds alphabetically (increasing or decreasing), then
sort by date (again, increasing or decreasing), and so on, up to four
levels deep. It can also encrypt the locally downloaded feed content, which
might appeal to laptop users, and is an option none of the other extensions seems to feature.
Downloaded entries can be searched, which is a plus, and on the whole
the interface is fast and responsive, more so than Digest's HTML UI.
The last major option on the full-fledged feed-aggregator front is
Bamboo,
an MPL-1.1-licensed extension that appears to be intentionally
aiming for Google Reader replacement status—right down to the
UI, which mimics the dark gray "Google toolbar" currently plastered
across all of the search giant's web services. The interface is
rendered in HTML, and uses the decidedly Google Reader–like sidebar
layout, rendering feed content within the right-hand pane. Bamboo supports all of
the basic features common to the high-end aggregators already
discussed: OPML import/export, folders, search, sorting, marking items
as read/unread, and locally storing feed content. It also adds more,
such as the ability to star "favorite" items, the ability to save
items for offline reading, a toggle-able headline-or-full-item display
setting, and a built-in ad blocker.
Interestingly enough, despite its comparatively rich feature set,
Bamboo uses a bookmark folder to keep track of feed subscriptions, but
it does not allow the user to select the folder where subscriptions
are saved. Instead, like Newsfox, it only examines the default Live
Bookmarks folder.
And the rest
If one goes searching for "RSS" on the Firefox Add-ons
site, there are plenty more options that turn up, many of which
reflect entirely different approaches to feed aggregation. For
example, SRR
offers a "ticker"-style scroll of headlines from subscribed feeds, which is useful for a handful of feeds at
best. Dozens or hundreds, however, will overpower even the toughest
attention span. Or there is Newssitter,
which provides a "bookshelf"-style interface that seems visually
designed for reading primarily on a mobile device. That may meet the
needs of many news junkies, of course, but it bears little resemblance
to the Google Reader experience; getting a quick overview of dozens of
feeds is not possible, for example.
Selecting a Google Reader replacement is not a simple task;
everyone uses the service in slightly different ways, and all of the
options offer different (and overlapping) subsets of the original
product's feature set.
The bare-bones feed reading extensions all have big limitations
that probably make them less useful as a drop-in replacement; for
instance, they may not check for new content in the background, and
they certainly do not provide much search functionality. For a user
with a lot of subscriptions, supplementary features like searching and
saving items can take the application from mediocre to essential.
After all, it is frequently hard to backtrack to a barely-remembered
news story weeks or months after reading the original feed.
To that end, the more fleshed-out Google Reader alternatives offer
a much more useful experience in the long run. Only time will tell
how solid they are over the long haul, of course—it is not
beyond reason to think that some of them will start to slow down or
wobble with months of saved content to manage. On the other hand,
none of them can offer one key feature of Google Reader: the months'
(or in many cases years') worth of already read news items. Most
individual feeds do not publish their site's entire history, but
Google Reader could search years' worth of already read material.
That is just one of the things people lose when a web service shuts down.
Based on my early experiments, Bamboo offers the most features,
while Newsfox is faster, but Digest is more flexible. It is tempting
to fall back on that familiar old saying: you pays your nickel and you
takes your chances (though sans nickel in free software circles). But
because all three options can follow and display the same set of
feeds, it may be worth installing more than one and giving them a
simultaneous test drive for a week or so. At the very least, Firefox
can synchronize the bookmarks and add-ons, providing you with some way
to get at your subscriptions when away from home—at least if there is a
Firefox installation nearby.
Comments (18 posted)
Page editor: Nathan Willis
Inside this week's LWN.net Weekly Edition
- Security: Tizen content scanning and app obfuscation; New vulnerabilities in cgit, chromium, kernel, php, ...
- Kernel: Skiplists API and benchmarks; Hot adding and removing memory; OPW—kernel edition
- Distributions: Tizen compliance; FreeBSD, ...
- Development: Little things in language design; Facts about X vs Wayland; The achievements of embedded Linux; Debian's systemd survey; ...
- Announcements: German Parliament tells government to strictly limit patents on software, events.