Leading items
In search of a home for Thunderbird
After nearly a decade of trying, Mozilla is finally making the move of formally spinning off ownership of the Thunderbird email client to a third party. The identity of the new owner is still up for debate; Simon Phipps prepared a report [PDF] analyzing several possible options. But Mozilla does seem intent on divesting itself of the project for real this time. Whoever does take over Thunderbird development, though, will likely face a considerable technical challenge, since much of the application is built on frameworks and components that Mozilla will soon stop developing.
Bird versus fox
To say that Mozilla has had a difficult relationship with Thunderbird would be putting things mildly. The first release was in 2003, with version 1.0 following in late 2004. As soon as 2007, though, Mozilla's Mitchell Baker announced that Mozilla wished to rid itself of Thunderbird and find a new home for the project. Instead, Mozilla ended up separating Thunderbird off into a distinct unit (Mozilla Messaging) under the Mozilla Foundation umbrella. It then reabsorbed that unit in 2011, with Baker noting:
But, in July 2012, Mozilla began pulling paid developers from Thunderbird and left its development primarily in the hands of community volunteers, with a few Mozilla employees performing QA and build duties to support the Extended Support Release (ESR) program. At the time, Baker offered this justification:
By 2014, Mozilla had ramped down its involvement to the point where the Thunderbird team lacked any clear leadership, so the developer community voted to establish a Thunderbird Council made up of volunteers.
Most recently, Baker announced in December 2015 that Thunderbird would be formally separated from Mozilla. Phipps was engaged to research the options that he later published in the aforementioned report. In April 2016, Gervase Markham announced that the search for a new home for the project was underway, with Phipps's recommendations serving as a guide.
Lizard tech
For fans of Thunderbird, the repeated back-and-forth from Mozilla leadership can be a source of frustration on its own, but it probably does not help that Mozilla has started multiple other non-browser projects (such as ChatZilla, Raindrop, Grendel, and Firefox Hello) over the years while insisting that Thunderbird was a distraction from Firefox. Although it might seem like Mozilla management displays an inconsistent attitude toward messaging and other non-web application projects, each call for Mozilla to rid itself of Thunderbird has also highlighted the difficulty of maintaining Thunderbird and Firefox in the same engineering and release infrastructure.
In recent years, due in no small part to pressure coming from the rapid release schedule of Google's Chrome, the Firefox development process has shifted considerably. New stable releases are made approximately every six weeks, and development builds are provided for the next two releases in separate release channels.
In addition, the Firefox codebase itself is changing. The XUL and XPCOM frameworks are on their way out, to be replaced with components and add-ons written in JavaScript. The Gecko rendering engine is also marked for replacement by Servo, and the entire Firefox architecture may be replaced with the multi-process Electrolysis model.
While these changes are exciting news for Firefox, none of them have made their way into Thunderbird. In April, Mozilla's Mark Surman highlighted the divergence issue in a blog post, noting:
Surman also pointed to a new job listing posted by Mozilla for a contractor who would oversee the transition. The posting describes two key responsibilities: to list all significant technical issues facing Thunderbird (including impact assessments) and to compile an outline of the options available to address those issues to move Thunderbird forward.
Former Mozilla developer Daniel Glazman responded to Surman's post on his own blog, with a more blunt assessment of the technical challenges facing Thunderbird developers. He pointed to the job posting's mention of XUL and XPCOM deprecation and said:
- rewrite the whole UI and the whole JS layer with it
- most probably rewrite the whole SMTP/MIME/POP/IMAP/LDAP/... layer
- most probably have a new Add-on layer or, far worse, no more Add-ons
Glazman concluded that it is too soon to select a new host for the Thunderbird project, given that a decision has yet to be made about how to rewrite the application. Furthermore, he pointed out, Mozilla has not yet begun the transition away from XUL and XPCOM in the Firefox codebase. Only when that process starts, he said, will it be possible to assess the complexity of such a move for Thunderbird.
As far as the build infrastructure goes, Markham sent a proposal to the Thunderbird Council in March suggesting a path forward for separating Thunderbird from the Firefox engineering infrastructure. It did not spawn much discussion, but there did not seem to be any objection either.
Out of the nest
For now, Mozilla seems set on finding a new fiscal and organizational sponsor for Thunderbird, with The Document Foundation and the Software Freedom Conservancy (both highlighted in Phipps's report) currently the leading candidates. But the discussion has only just begun on the technical aspects of maintaining and evolving Thunderbird as a standalone application.
Surman contended that the needs of Firefox and Thunderbird are simply too different today for them to be tied to the same codebase and release process. Essentially, the web changes rapidly, while email changes slowly. It is hard to argue with that assertion (setting aside discussions of how email should change), but Thunderbird fans might contend that Mozilla's decision not to contribute developer time to the Thunderbird codebase only exacerbates any inherent difference between the browser and the email client.
Whether one thinks Mozilla has not adequately supported Thunderbird over the years or has done its level best, the Thunderbird and Firefox projects today are moving in different directions. Given their shared history, it may seem sad to watch them part ways, but perhaps the Thunderbird community can make the most of the opportunity and drive the application forward where Mozilla could (or would) not.
Caravel data visualization
One aspect of the heavily hyped Internet of Things (IoT) that can easily get overlooked is that each of the Things one hooks up to the Internet invariably spews out a near non-stop stream of data. While commercial IoT users (such as utility companies) generally have a well-established grasp of what data interests them and how to process it, the DIY crowd is better served by flexible tools that make exploring and transforming data easy. Airbnb maintains an open-source Python utility called Caravel that provides such tools. There are many alternatives, of course, but Caravel does a good job of ingesting data and smoothly molding it into nice-looking interactive graphs, with a few exceptions.
My own interest in data-visualization tools stems from IoT projects (namely home-automation and automotive sensors), but Caravel itself is in no way limited to such uses. Like most contemporary web-based service providers, Airbnb collects a lot of data about its users and their transactions (in this case, short-term housing rentals, renters, and property owners). The company also prides itself on having a slick-looking web interface, and Caravel reflects that: it sports modern charts and graphs. There are no crusty old PNGs with jagged lines generated by Graphviz here; everything is done in JavaScript.
In a nutshell, what Caravel provides is a connection layer supporting a variety of database types, the tools to configure the metrics of interest for any tables one wishes to explore, and an interactive utility for creating data visualizations. Several dozen visualization options are built in, and all of the charts the user creates can be saved and put into convenient "dashboards" for regular usage.
On top of all that, Caravel's interface is web-based and is almost entirely point-and-click. Perhaps the closest parallel would be to a tool like Orange, where the goal is to mask over the complexities of SQL and statistics. Caravel does not quite walk the user through adding new data sources or defining metrics, but it does take care of as many of the repetitive steps as it can.
For example, when you add a database table to your Caravel workspace, there are rows of checkboxes by every field. If you want to track the minima, maxima, or sums for certain fields, you check them at load time, and those metrics are automatically available on the relevant pages of the application from then on. Similar checkboxes are available for selecting which fields should be used as categorical groups and which should be available for filtering the data set.
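To make those checkbox metrics concrete, the following sketch shows roughly what minimum, maximum, and sum metrics over one field, grouped by a categorical field, amount to in SQL. It uses Python's built-in sqlite3 module rather than Caravel itself, and it is not a description of how Caravel generates its queries. The `readings` table, its `sensor` and `temp_c` columns, and the `field_metrics()` helper are all hypothetical.

```python
import sqlite3


def field_metrics(conn, table, value_field, group_field):
    """Return {group: (min, max, sum)} of value_field per group_field value.

    Identifiers are interpolated directly into the query, so they must come
    from trusted configuration and never from user input.
    """
    query = (
        "SELECT {g}, MIN({v}), MAX({v}), SUM({v}) "
        "FROM {t} GROUP BY {g} ORDER BY {g}"
    ).format(g=group_field, v=value_field, t=table)
    return {row[0]: tuple(row[1:]) for row in conn.execute(query)}


# Hypothetical sample data: temperature readings from two sensors.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, temp_c REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("attic", 30.0), ("attic", 34.0), ("garage", 18.0), ("garage", 22.0)],
)

metrics = field_metrics(conn, "readings", "temp_c", "sensor")
print(metrics)
```

Here the `sensor` column plays the role of a field marked as a categorical group, while `temp_c` is a field with the minimum, maximum, and sum boxes checked.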
The first public release of Caravel was in September 2015. The most recent is version 0.8.9, from April 2016. The code is hosted on GitHub and packages are also available on the Python Package Index (PyPI). For the moment, only Python 2.7 is supported. On Linux, installation also requires the development packages for libssl and libffi. Once Caravel is installed, one only needs to initialize the database and create an administrator account to get started.
A Caravel instance is multi-user, and the system supports an array of permissions and access controls. For testing, though, that is not necessary. Out of the box, the system provides a local web UI and comes pre-loaded with a demo data set. SQLite support is built in, and any other database (local or remote) with SQLAlchemy support can be used as well. Druid database clusters are also supported, and users can define a custom schema for any database that requires one. For those working with large data sets, the good news is that Caravel also supports a number of open-source caching layers, although none of them are required. All of these configuration options are presented in the web UI's "add a database" screen.
The bird's-eye view of Caravel usage is that the user adds a new database, then selects and adds each table of interest. From then on, working with Caravel is a matter of using the visualization builder to home in on a chart or graph that presents some meaningful information. The visualizations include everything from line charts to bubble graphs, box plots to directed graphs, and heatmaps to Sankey diagrams. There are also less scientific options, such as word clouds.
A visualization can be saved as a "slice," and any number of slices can be collected onto the same page as a "dashboard." Dashboards are updated regularly as the database is refreshed, so they can be deployed for internal or public consumption. Finally, although dashboard graphics are interactive JavaScript (with additional information shown where the mouse hovers), all charts and graphs can also be exported as image files.
This set of features is fairly complete, but one might well ask whether the implementation is up to snuff. For the most part, the answer to that question is yes.
Adding new databases and choosing which tables to use borders on trivial, thanks to the well-optimized add-and-edit pages. There are a few caveats, such as the fact that the user cannot simply add all of the tables of interest from a database at once—each table requires a separate round trip through the "add a table" page. And when Caravel does not like something about a table, it is hard to debug.
For example, Caravel includes special treatment for time-series data; the user can mark any field in a table as being of the datetime type and it will be automatically plugged into various time-series charts in the visualization tool. But Caravel could not make sense out of the timestamps in one example data set I downloaded from datahub.io. There is no easy way to inspect the data directly, nor does there seem to be any way in the UI to transform the timestamps into an acceptable datetime format, or even to see what Caravel thinks is wrong with them.
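One way around this sort of problem is to normalize the timestamps outside of Caravel before loading the table. The sketch below is only an illustration of that idea: the candidate input formats are guesses (the format of the troublesome data set is not known here), the output format is simply a common SQL-style datetime string rather than something Caravel is documented to require, and the `normalize_timestamp()` and `split_rows()` helpers are hypothetical.

```python
from datetime import datetime

# Candidate input formats. These are assumptions and should be adjusted
# to match whatever the data set at hand actually contains.
CANDIDATE_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%d/%m/%Y %H:%M",
    "%Y-%m-%d",
)


def normalize_timestamp(raw):
    """Return raw as 'YYYY-MM-DD HH:MM:SS', or None if no format matches."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue
    return None


def split_rows(values):
    """Separate values into (normalized, rejected) lists for inspection."""
    normalized, rejected = [], []
    for value in values:
        result = normalize_timestamp(value)
        if result is None:
            rejected.append(value)
        else:
            normalized.append(result)
    return normalized, rejected


good, bad = split_rows(
    ["2016-04-01T12:30:00", "01/04/2016 08:15", "sometime in April"]
)
print(good)
print(bad)
```

Keeping the rejected values in a separate list provides the kind of inspection that the UI lacks: it shows exactly which entries failed to parse.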
Clearly, this issue falls under a "you must know your data" warning, which is a fair expectation. But the error reporting that Caravel presents yanks the user right out of the UI, displaying a generic, low-level exception warning and a traceback from the Python interpreter.
And this sometimes happens through no fault of the user, such as when the user selects a new graph type from the drop-down menu in the visualization builder and the newly selected graph takes a different number of parameters. By and large, the visualization tool is quite handy: the point-and-click settings and controls are not merely a coat of "UI paint" on the top; they help the user play around with their data sets to find the visualization settings that work best. That makes it all the more disappointing when that friendly interactivity breaks down.
There are a couple of troubling technical limitations to mention. First, users must construct any new metrics of interest (other than sample counts, sums, and minima/maxima) by entering raw SQL expressions. Some additional statistical tools would be handy. Perhaps more fundamental is the fact that Caravel cannot join or query multiple tables; all of the visualizations are therefore limited to what information one can extract from a single table.
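A possible workaround for the single-table restriction is to flatten the join into one table in the database itself, before pointing Caravel at it. The following sketch does that with Python's built-in sqlite3 module and then evaluates a metric written as a raw SQL expression, of the sort one would otherwise type into Caravel's metric definition. The `sensors`, `readings`, and `readings_flat` tables are hypothetical, and this approach has not been tested against Caravel here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sensors (id INTEGER PRIMARY KEY, room TEXT);
    CREATE TABLE readings (sensor_id INTEGER, temp_c REAL);
    INSERT INTO sensors VALUES (1, 'attic'), (2, 'garage');
    INSERT INTO readings VALUES (1, 30.0), (1, 34.0), (2, 18.0), (2, 22.0);

    -- Materialize the join as one flat table that a single-table
    -- tool can consume.
    CREATE TABLE readings_flat AS
        SELECT s.room AS room, r.temp_c AS temp_c
        FROM readings AS r
        JOIN sensors AS s ON s.id = r.sensor_id;
""")

# A metric beyond counts, sums, and minima/maxima, written as a raw
# SQL expression.
metric_expression = "AVG(temp_c)"
query = "SELECT room, {} FROM readings_flat GROUP BY room ORDER BY room".format(
    metric_expression
)
averages = dict(conn.execute(query).fetchall())
print(averages)
```

The obvious cost is that the flattened table is a snapshot; it must be rebuilt whenever the underlying tables change.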
It might be interesting to pair Caravel with a tool like OpenRefine that specializes in data transformation, but I suspect that for a great many users, what Caravel can do already will serve them well. It handles the database connectivity in the background, putting the emphasis on exploring and manipulating visualizations. The visualizations and dashboards it provides are top-notch by modern standards, but the fact that they are easy for the user to create is Caravel's real advantage.
Page editor: Jonathan Corbet