Leading items
OpenBSD and the latest OpenSSL bugs
The latest round of OpenSSL bugs was disclosed to the public on June 5, but it is clear that some organizations and distributions had earlier knowledge of the flaws. That is fairly typical for security holes of this sort; distributions get some time to fix the flaws before they are made public (typically simultaneously with the release of the updates). But OpenBSD was not one of the organizations notified in advance; why that is, and whose fault it is, have been much in dispute ever since.
OpenBSD project leader Theo de Raadt complained about the lack of early notice in a message to the OpenBSD misc and tech mailing lists. OpenBSD famously forked the OpenSSL code post-Heartbleed into a new library called LibReSSL (or LibreSSL). Both its OpenSSL and LibReSSL packages were affected by the bugs, though, so it is unsurprising that de Raadt is unhappy about first hearing of the bugs several days after others had been informed.
According to a timeline published by OpenSSL project member Mark J. Cox, who handled the issues for the project, the distros mailing list was notified of the problem on June 2. That allowed members of that private list—restricted to security representatives from Linux distributions and the BSDs—to request the patches and a copy of the draft advisory. OpenBSD is conspicuously absent from the list of those participating in that list.
As it turns out, de Raadt had been asked if he wanted to join the distros list back in early May. A different OpenSSL problem led Red Hat security response team member Kurt Seifried to CC de Raadt on the report and ask if he or some other OpenBSD member would like to join the list. The distros mailing list is meant to disclose and discuss security problems that affect the entire Unix ecosystem (rather than those that just affect Linux, for which there is a linux-distros mailing list). In characteristic fashion, de Raadt replied:
We don't get paid. And therefore, I don't know where I should find the time to be on another mailing list. It is not like I would have sent a mail to anyone. In general our processes are simply commit & publish. So I'll decline.
Once Cox's timeline made it clear that most other distributions (both Linux and BSD) had been given an advance heads-up about the issue, de Raadt and other OpenBSD developers accused OpenSSL of knowingly keeping the knowledge of the bugs from the project: "Unfortunately I find myself believing reports that the OpenSSL people intentionally asked others for quarantine, and went out of their way to ensure this information would not come to OpenBSD and LibreSSL."
For his part, Cox states that OpenSSL chose the distros mailing list as its means of disclosing the bug early to the various affected operating systems. Because OpenBSD was not on the list, it didn't find out, he said in a comment on his timeline post. Furthermore, "OpenBSD have approached us to be notified about future issues and we've asked them to join the list as they certainly would qualify and would find it beneficial not just for any future OpenSSL issues."
The timeline shows that there were some organizations that received early warning of the bugs, including a few well in advance of the distros posting. Others were notified at around the same time as the posting, but without any details. Whether OpenSSL considered notifying OpenBSD separately from the mailing list is not clear. The project is certainly aware of the LibReSSL effort (and likely unhappy with how its code has been characterized by the OpenBSD crowd), and that it would likely be affected by these problems. But it is entirely possible that notifying OpenBSD just slipped through the cracks as well.
The conversation fairly quickly degenerated. It is clear that de Raadt and others do not see the distros list as the appropriate venue for early disclosure of vulnerabilities. They believe that affected organizations and projects should be contacted individually, it seems. Regardless of whether anyone at OpenBSD gets paid to read security mailing lists, though, it is undeniable that having a representative on the list would have gotten the project the early disclosure it is looking for.
The conversation is also a bit hard to follow since various participants, including Seifried and distros/linux-distros administrator Solar Designer (Alexander Peslyak), sent private mail to de Raadt that he responds to publicly. In addition, de Raadt's emails don't seem to thread correctly for some reason. But he makes it abundantly clear that he is livid about the issue and he lashes out at Peslyak, Seifried, and Cox.
But, ultimately, it is de Raadt's opposition to embargoes (which typically come with early disclosure) that is part of the reason no one from OpenBSD is on the relevant list. Peslyak said that de Raadt had been invited to join the list in 2012, but declined not just for himself but for the entire OpenBSD project. Peslyak, who has been a voice of reason throughout (for example, he has encouraged OpenSSL to contact LibreSSL directly in the future), also said that de Raadt's anti-embargo stance contributed to the current situation:
It is most unfortunate for their users that OpenBSD and LibReSSL did not get the extra few days to fix the problems found in OpenSSL. It is not exactly clear who is most "to blame" for that, but it is clear that things could be done better (by both OpenBSD and OpenSSL) in the future. For some on the OpenBSD/LibReSSL "side", this episode is evidence of why those projects cannot work with OpenSSL. That may be, but the tone and contents of the emails from de Raadt and others may have also made it obvious (again) why it is hard for anyone outside of the OpenBSD clique to work with that project. It is a project that does a lot of good work, but it is not one that is known for getting along with others.
Accessing Wikipedia offline
Every now and then, one finds oneself in a place where the near-ubiquitous Internet connectivity of today is absent, unusably slow, or prohibitively expensive. Some network functionality (like email) may be worth the hassle and expense, while other kinds (like streaming media) are not. Somewhere in between, though, lies reference data, which would be nice to cache locally for offline access, if it were technically feasible. To that end, some "open content" projects, such as OpenStreetMap, make configuring offline access relatively painless, but many others do not. For Wikipedia and the related Wikimedia projects (Wiktionary, Wikivoyage, etc.), the combination of an exceptionally large data set, constant editing, and multiple languages makes for a more challenging target—and a niche has developed for offline Wikipedia access software.
Of course, the "correct" solution to providing offline Wikipedia access would arguably be to run a mirror of the real site, which it is certainly possible to do. But, even then, mirrors start with a hefty Wikipedia database dump that requires considerable storage space: around 44GB for the basic text of the English Wikipedia site, without the "talk" or individual user pages. The media content is larger still; around 40TB are currently in Wikimedia's Commons, of which roughly 37TB is still images. Moreover, the database-import method does not allow a mirror to keep up with ongoing edits, although doing so would consume considerable system resources anyway.
On the other hand, in many cases, Wikipedia's usefulness as a general-purpose reference does not depend on having the absolute newest version of each article. Wikimedia makes periodic database dumps, which can suffice for weeks or even months at a time, depending on the subject. It is probably no surprise, then, that the most popular offline-Wikipedia tools focus on turning these periodic database releases into an approximation of the live site. Many also take a number of steps to conserve space—usually by storing a compressed version of the data, but in some cases by omitting major sections of the content as well. There are two actively developed open-source tools for desktop Linux systems at present: XOWA and Kiwix. Both support storing compressed, searchable archives of multiple Wikimedia sites, although they differ on quite a few of the details.
Kiwix
![[Kiwix library management]](https://static.lwn.net/images/2014/06-kiwix-browse-sm.png)
Kiwix uses the openZIM file format for its content storage. The Wikipedia database dump is converted into static HTML beforehand, then compressed into the ZIM format. The basic ZIM format includes a metadata index that supports searching article titles, but to enable full-text search, the file must be indexed. The Kiwix project offers both indexed and unindexed archives for download; the indexed files are (naturally) larger, and they also come bundled with the Windows build of Kiwix. The ZIM format is designed with this usage in mind; its development is spearheaded by Switzerland's Wikimedia CH.
As far as content availability is concerned, the Kiwix project periodically updates its official ZIM releases for Wikipedia only—albeit in multiple languages (69 at present, not counting image-free variants available for a handful of the larger editions). In addition, volunteers produce ZIM files for other sites, at the moment including Wikivoyage, Wikiquote, Wiktionary, and Project Gutenberg, with TED and other efforts still in the works.
![[Kiwix showing Wikiquote]](https://static.lwn.net/images/2014/06-kiwix-wikiquote-sm.png)
Kiwix itself is a GPLv3-licensed, standalone graphical application that most closely resembles a "help browser" or e-book reader. The content displayed is HTML, of course, but the user interface is limited to the content installed in the local "library." Users can search for new ZIM content from within the application as well as check for updates to the installed files.
Interestingly enough, there are many more ZIM archives listed within Kiwix's available-files browser than there are listed on the project's web site; why any particular offering is listed in the application is not clear, since some of the options appear to be personal vanity-publishing works. Searching and browsing installed archives is simple and fast; type-ahead search suggestions are available and one can bookmark individual pages. There are also built-in tools for checking the integrity of downloaded archives and exporting pages to PDF.
XOWA
In broad strokes, XOWA offers much the same experience as Kiwix: one installs a browser-like standalone application (AGPL-licensed, in this case), for which individual offline-site archives must be manually installed. Like Kiwix, XOWA can download and install content from its own, official archives. But while Kiwix archives contain indexed, pre-generated HTML, XOWA archives include XML from the original database dumps (stored in SQLite files), which is then dynamically rendered into HTML whenever a new page is opened.
In theory, the XML in the Wikipedia database dumps is the original Wiki markup of the articles, so it should be more compact than the equivalent rendered HTML. In practice, though, such a comparison is less simple. The latest Kiwix ZIM file for the English Wikipedia is 42GB with images, 12GB without, whereas the latest XOWA releases are 89.6GB with images and 14.6GB without. But XOWA also makes a point of the fact that it includes not only the basic articles, but also the "Category," "Portal," and "Help" namespaces, as well as multiple sizes of the included images.
![[XOWA library maintenance]](https://static.lwn.net/images/2014/06-xowa-maintenance-sm.png)
When comparing the two approaches, it is also important to note that XOWA is specifically designed for use with Wikimedia database dumps, a choice that has both pros and cons. In the pro column, virtually any compatible database dump can be used with the application; XOWA offers Wikipedia for 30 languages and a much larger selection of the related sites (Wiktionary, Wikivoyage, Wikiquote, Wikisource, Wikibooks, Wikiversity, and Wikinews, which are bundled together for most languages). XOWA's releases also tend to be more up-to-date; at present none is older than a few months, while some of the less-popular Kiwix archives are several years old.
The downsides, though, start with the fact that only Wikimedia-compatible content is supported. Thus, there is no Project Gutenberg archive available, nor could your favorite Linux news site generate a handy offline article archive should it feel compelled to do so. But perhaps more troubling is the fact that XOWA archives do not support full-text searching. Lookup by title is supported, but that may not always be sufficient for research.
![[XOWA showing Wiktionary]](https://static.lwn.net/images/2014/06-xowa-wiktionary-sm.png)
The browsing experience of the XOWA application is similar to Kiwix; both HTML renderers use Mozilla's XULRunner. XOWA also supports bookmarking pages and library maintenance. XOWA gains a point for allowing the user to seamlessly jump between installed wikis; a Wikipedia link to a Wiktionary page works automatically in XOWA, while a Kiwix user must return to the "library" screen and manually open up a second archive in order to change sites.
On the other hand, XOWA does not support printing or PDF export, and there is a noticeable lag between clicking on a link and seeing the page load. The status bar at the bottom of the window is informative enough to indicate that the delay is due to XOWA's JTidy-based parser; it reports the loading of the page content as well as each template and navigation element used. The parser can also still trip up in its XML-to-HTML conversion. If one is concerned about the accuracy of the conversion, of course, Kiwix's pre-generated HTML offers no guarantees either, but at least its results are static and will not crash on an odd bit of Wiki-markup syntax.
The archive wars continue
Ultimately, though, if the question is whether XOWA or Kiwix generates pages more like those one sees in the web browser from the live Wikimedia site, neither standalone application is perfect. But users may chafe at the very need to run a separate application to read Wikipedia to begin with. Fortunately, both projects are also pursuing another option: serving up their content with an embedded web server, which permits users to access the offline archives from any browser they choose.
XOWA's server can be started with:
java -jar /xowa/xowa_linux.jar --app_mode http_server --http_server_port 8080
Kiwix's server (which, like Kiwix, is written in C++) can be started from the command line with:
kiwix-serve --port=8000 wikipedia.zim
or launched from the application's "Tools" menu. A nice touch for those experimenting with both is that Kiwix defaults to TCP port 8000, XOWA to port 8080. The XOWA project also offers a Firefox extension that directs xowa: URIs to the local XOWA web server process.
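For those running both servers side by side, a quick check that each one is actually listening can be handy. The following is a minimal Python sketch and not part of either project: it assumes only the ports used in the commands above (8000 for kiwix-serve, 8080 for XOWA) and that the servers run on the local machine; the helper name `port_is_open` is made up for illustration.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection() raises OSError (including timeouts)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the commands above; 127.0.0.1 is an assumption.
    for name, port in (("kiwix-serve", 8000), ("XOWA", 8080)):
        state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A successful TCP connection only shows that something is bound to the port; pointing a browser at the same address is still the real test that the archive is being served.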
Moving forward, it will be interesting to watch how both projects are affected by changes to Wikimedia's infrastructure. The XOWA internal documentation notes that Wikipedia is, at some point, planning to implement diff-style database update releases in addition to its full-database dumps. Incremental updates are one of the factors that make OpenStreetMap so usable in offline mode, and Wikipedia's lack of such updates is what contributes the most pain to Kiwix and XOWA usage: waiting for those multi-gigabyte downloads to finish.
As unsatisfying as it may seem, neither application emerges as the clear winner for someone inspired to head off to a rustic cabin in the mountains and read Wikipedia at length. At its most basic, the trade-off would seem to be Kiwix's support for non-Wikimedia sites and its full-text search versus XOWA's cross-wiki link support and more predictable update process. Either will likely serve the casual user well.
What's in a (CentOS) version number?
The CentOS project has made its reputation by doing one thing very well: repackaging the Red Hat Enterprise Linux (RHEL) distribution into a freely distributable form. For users who are able to do without the support services offered by Red Hat, CentOS has been an invaluable resource. It is perhaps not surprising that CentOS users worry about the future of this distribution; they are getting a lot for free and many of them know that such situations are not always sustainable. For CentOS, keeping its user base depends on maintaining a certain level of trust so that users know it will continue to be available, stable, and free. The discussion around a proposal on version numbers shows just how easy that trust could be to lose.
The relationship between CentOS and Red Hat has always been interesting. Red Hat provides the source packages that, after removal of branding elements, are built into the CentOS release. By one measure, CentOS is sustaining freeloaders who want to benefit from Red Hat's work without paying for it. By another, CentOS helps Red Hat by bringing users into its ecosystem; some of those users eventually become paying Red Hat customers. So it is not surprising that users can see the recent acquisition of CentOS by Red Hat in two different lights: it's either an attempt to squash a competing distribution or an effort to sustain that distribution with much-needed support.
Either way, changes were always going to happen after the acquisition. CentOS users will certainly be happy about the first of those changes: support for CentOS developers so they can work on the distribution full time, and support for the infrastructure needed to keep CentOS going. But when CentOS project leader Karanbir Singh proposed a change to the seemingly trivial issue of version numbers, users were quick to express their disapproval.
Traditionally, CentOS releases have used the same version number as the RHEL release they are based on; CentOS 6.5 is a rebuild of the RHEL 6.5 release, for example. The CentOS developers now want to change to a scheme where the major number matches the RHEL major number, but the minor number is generated from the release date. So, if the CentOS version of RHEL 7.0 were to come out in July 2014, it might have a version number like 7.1407. Derivative releases from CentOS special interest groups (SIGs) would have an additional, SIG-specific tag appended to that number.
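The proposed scheme can be sketched in a few lines of Python. This is only an illustration of the numbering described above, assuming the minor number is the two-digit year followed by the two-digit month (as the 7.1407 example suggests). The function name `centos_version`, the optional `sig_tag` parameter, and the "-" separator used for it are all hypothetical; the format of the SIG-specific tag is not spelled out here.

```python
from datetime import date


def centos_version(rhel_major, release_date, sig_tag=None):
    """Build a date-based version string such as "7.1407".

    rhel_major   -- major number of the RHEL release being rebuilt
    release_date -- date of the CentOS release
    sig_tag      -- hypothetical SIG-specific suffix (format assumed)
    """
    # Minor number: two-digit year followed by two-digit month (assumed).
    minor = release_date.strftime("%y%m")
    version = f"{rhel_major}.{minor}"
    if sig_tag:
        # The "-" separator is an assumption for illustration only.
        version = f"{version}-{sig_tag}"
    return version


# A rebuild of RHEL 7.0 released in July 2014:
print(centos_version(7, date(2014, 7, 1)))  # prints 7.1407
```

Note that nothing in the resulting string records which RHEL minor release the rebuild corresponds to.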
To the CentOS developers, this change offers a number of advantages. The close tie with RHEL version numbers, it is claimed, can confuse users into believing that a release is supported with security updates when it is not; see this detailed message from Johnny Hughes for an explanation of the reasoning there. Putting the release date into the version number makes the age of a release immediately obvious, presumably inspiring users to upgrade to current releases. This scheme would also make it easier to create releases that are not directly tied to RHEL releases; that is something that the SIGs, in particular, would like to be able to do.
Supporting the SIGs is a big part of the project's plan for the future in general. Karsten Wade described it this way:
So it seems that CentOS wants to follow Red Hat into the cloud. Simply providing a rebuild of RHEL is not as exciting as it once was, so the project wants to expand into other areas where, it is hoped, more users are to be found.
It should be possible to expand in this way as long as the core CentOS distribution remains what it has always been. Unfortunately, some users are worried that things will not be that way. Ljubomir Ljubojevic, the maintainer of the CentOS Facebook page, described his feelings about the change:
A large number of "me too" posts made it clear that Ljubomir is not alone in feeling this way. There is a lot of concern that the project might break the core distribution and that adopting a new version numbering scheme looks like a first step in that direction.
For their part, the CentOS developers have tried to address that concern. Karanbir stated directly that there is no plan to change how the core distribution is managed:
For the most part, the users in the discussion seemed to accept that promise, but that made them no happier about the version numbering change. The date-based numbers, they say, make it harder to know which version of RHEL a CentOS release is based on, and it can make it harder to justify installations (or upgrades) to management. All told, it was hard to find a single supportive voice for this change outside of the CentOS core developers.
Those developers have not said anything about what changes, if any, they might make to their plans in response to the opposition on the list. They are in a bit of a difficult position: they want to make changes aimed at attracting a broader set of users, but those changes appear threatening to their existing users, most of whom are quite happy with the distribution as it is now and are not asking for anything different. If the existing users start to feel that their concerns are not being heard, they may start to look for alternatives. In this case, the powers that be at CentOS may want to make a show of listening to those users and finding a way to resolve their version number concerns that doesn't appear to break the strong connection between RHEL and CentOS releases.
Page editor: Jonathan Corbet