
Development

Emacs and changing documentation formats

By Nathan Willis
December 10, 2014

The GNU Emacs project is debating the idea of changing the format in which its official documentation is written and maintained. Proposing the change is Eric S. Raymond, who argues that the Texinfo format currently used is archaic and constitutes a barrier to entry. His proposal has its supporters—including Richard Stallman—but plenty of other project members contend that whatever shortcomings the Emacs documentation may have, replacing Texinfo as Raymond suggests is not the fix.

For those unfamiliar with it, Texinfo is a GNU project with a long history. It is the format behind the command-line info tool (although Emacs uses its own, built-in info page browser) and is the officially blessed manual format for all GNU projects. The syntax itself is akin to many other text-markup formats, but it was designed with structured, cross-referenced user manuals in mind. Thus, it provides some features not found in other markup languages. The most prominent example is its support for document-wide indexes. In addition to their usage in terminal commands like info, Texinfo documents can also be converted to HTML or (through TeX) to print-quality PostScript output. Texinfo is used by a number of non-GNU projects as well; users may have encountered references to info documents in Linux man pages.

On December 5, Raymond wrote to the Emacs development list to announce his intention to migrate the project's documentation away from GNU's Texinfo as its master format, in order to adopt AsciiDoc instead. Raymond's message came on the heels of a lengthy discussion thread about attracting and retaining new contributors, which is certainly an issue most free-software projects can relate to. In the earlier discussion, several barriers to entry for newcomers were mentioned, such as the fact that Emacs's official how-to-contribute document is a text file installed in a local directory, rather than a web page.

Raymond cited that concern in his message, saying that "Emacs's web resources are weak, scattered, and unfocused", which he said was primarily because "the Emacs development culture is still largely stuck in a pre-Web mindset" that comes across to the public as behind-the-times. And because the Emacs documentation is written in Texinfo, he said, it actually is behind the times. He concluded by saying that a change is required:

The solution must be partly a change in mechanism and partly a change in policy and attitude. The change in technology is the simple part; info and Texinfo must die. They must be replaced with a common format for documentation masters that is Web-friendly, and by Web presentation.

I have discussed this with RMS and, pending my ability to actually write proper translation tools, we have agreed on asciidoc as a new master format.

Raymond volunteered to take on the "tools end" of such a change—starting with writing software to translate from Texinfo to AsciiDoc—but called for someone else to take on the "policy/organization end" of the transition.

Formats and converts

As might be expected for a change as large as the one proposed, reaction to Raymond's email was varied. Several list members expressed support, although many of those who supported the idea of dropping Texinfo favored other documentation formats. Plenty of others were opposed, either on the grounds that Raymond's assessments of suitability of the various documentation formats were off or out of conviction that the costly migration proposed would do little if anything to increase the ranks of Emacs contributors.

Chris Webber was one of the first to register agreement with Raymond's dissatisfaction about the status quo, but he called the choice of AsciiDoc as an alternative "very confusing" and asked why it was chosen. Raymond replied that AsciiDoc was in use by both the Git project and by Linux kernel developers (although it was later pointed out that only a small portion of the kernel's documentation makes use of AsciiDoc). Moreover, AsciiDoc is "a modern, lightweight markup in general use outside the Emacs project." Among the other markup options, he said, Markdown suffers from a lack of standardization (and he is not convinced the current standardization effort will bear fruit) and is not designed for structured documentation, while reStructuredText and Sphinx have not been adopted by as many high-profile projects as AsciiDoc.

Altogether, AsciiDoc did not seem to elicit much support from list members. In fact, among those who agreed that Texinfo should be replaced, there were more voices in favor of adopting Emacs's Org mode as the markup style of choice than any other option. Rasmus Pank Roulund raised the idea first, noting that Emacs does not have a reliable editing mode for AsciiDoc and, at present, AsciiDoc tools can only generate HTML output. Org mode, in contrast, is well-supported in Emacs and can generate output in a variety of formats.

Raymond, both in his reply to Webber and elsewhere, responded that Org-mode markup would be a poor choice for a documentation format because its use is too limited to the existing Emacs community:

org mode may be functionally capable enough - I don't know yet - but I think it's the wrong kind of positioning; it says "We're Emacs, we're going to stick to our weird ingrown rituals and not-invented-here hostility, go away".

Stallman also weighed in on that suggestion, initially dismissing Org mode because it is a program, not a format, and furthermore it is a program that only runs within Emacs.

Org mode's supporters, like Achim Gratz, disagreed, pointing to several other org-mode compliant programs and the official specification for the syntax. Stallman subsequently conceded the points, concluding:

If we develop software to browse HTML manuals made by Texinfo with all the good features of info, then we can drop use of Info format.

Then there will be no reason to insist on one particular source format for manuals for GNU packages. We could allow any source markup format that can generate the three kinds of output we want:

* Nicely formatted PS or PDF. (Ideally, passed through TeX.)

* HTML used like Info version 2.

* Plain ASCII.

If Org format can do these jobs, it would be one of the acceptable source markup formats.

However, there is nothing close to consensus as to whether Org mode's syntax (or, for that matter, AsciiDoc's) is better than Texinfo's. In fact, it has proven fairly difficult to come to any agreement on what "better" means in this context. Raymond contended that Texinfo syntax involves duplicating a document's structure in several places, and said he appreciates the fact that AsciiDoc is readable as plain text. "I get to write *foo* instead of <emphasis>foo</emphasis> and you know what? It's better - lower overhead not just in typed characters but in the amount of attention required to read it structurally."

In contrast, David Kastrup said that "When looking at existing Texinfo source, I get a good idea of how to write Texinfo markup of my own. When looking at AsciiDoc, I have no clue since it is not apparent what is formatting, and what is content."

A number of people also pointed out that, contrary to Raymond's initial claim, Texinfo can (and does) generate HTML content from its source files. Rüdiger Sonderfeld pointed out that Texinfo supports images and can produce eye-pleasing output, citing GNU Octave as an example. Kastrup, similarly, pointed to LilyPond as a positive showpiece.

On the other hand, the most oft-cited feature of Texinfo is its support for indexes: when writing text, the user can add any term to the manual's index with a markup tag, making the term easy to look up or search for. The format comes with built-in support for a variety of separate indexes, and adding additional indexes is relatively trivial. Critics of the other formats discussed were quick to point out that similar features are not available in the competition, but Eli Zaretskii (Emacs's documentation maintainer) countered that it is not the ability to mark up index entries that is the killer feature; rather, it is the support in the various info browsers for searching the index of a manual.
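
For readers who have not seen Texinfo's index markup, the following is a rough sketch of the idea rather than anything taken from the Emacs manuals or from the makeinfo implementation. The sample text, the node names, and the build_indexes() helper are all made up for illustration; only the @node, @cindex, @findex, @vindex, and @kindex commands are real Texinfo. The sketch collects each index entry under the index it feeds and records the node in which it appears, which is roughly the information an info browser needs for an index search.

```python
import re
from collections import defaultdict

# Hypothetical Texinfo fragment, invented for this example.
SAMPLE = """\
@node Saving Files
@chapter Saving Files
@cindex saving buffers
@cindex backup files
Text about saving.

@node Auto Save
@section Auto Save
@cindex saving buffers
@findex auto-save-mode
Text about auto-saving.
"""

# Four of Texinfo's predefined index commands and the index each one feeds.
INDEX_COMMANDS = {
    "cindex": "concept",
    "findex": "function",
    "vindex": "variable",
    "kindex": "keystroke",
}


def build_indexes(source):
    """Map index name -> term -> list of nodes where the term is marked."""
    indexes = defaultdict(lambda: defaultdict(list))
    node = None
    for line in source.splitlines():
        match = re.match(r"@(\w+)\s+(.*)", line)
        if not match:
            continue  # ordinary text or a blank line
        command, arg = match.group(1), match.group(2).strip()
        if command == "node":
            node = arg  # remember which node later entries belong to
        elif command in INDEX_COMMANDS:
            indexes[INDEX_COMMANDS[command]][arg].append(node)
    return indexes


indexes = build_indexes(SAMPLE)
for name, entries in sorted(indexes.items()):
    print(name)
    for term in sorted(entries):
        print("  ", term, "->", ", ".join(entries[term]))
```

The point of the sketch is that the index lives in the source as one-line tags next to the text they describe, so a term marked in several places ends up as a single entry pointing at every relevant node.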

Change

Ultimately, arguments about markup formats are probably just as doomed as arguments about programming languages: too many of the perceived benefits and drawbacks come down to personal judgment calls on which a large group of people will rarely reach a consensus. Most of the list membership agreed that work could be done to improve the HTML output generated for Emacs's documentation. The notion that a different source format was required to improve the output, however, was a harder sell.

But the proposed change in formats also encountered resistance on another front: Raymond's assertion that switching to AsciiDoc would attract more Emacs contributors. Several people, such as Ted Zlatanov, did acknowledge that they considered Texinfo "a nasty hurdle", even discouraging them from contributing documentation.

Kastrup, though, argued that switching formats might attract different people, but was unlikely to attract more. Furthermore, switching formats would likely have the unwanted effect of causing existing documentation writers to lose interest. Zaretskii, for his part, called the AsciiDoc plan "simply insane", given that Texinfo is the only source format in use by the existing documentation authors.

Zaretskii also argued that:

[...] no matter to which source language we switch, it will not magically solve our documentation issues. Contrary to what Eric is saying, 90% of the effort of writing good documentation is not producing markup -- each one is just a couple of keys in Emacs's texinfo-mode. No, the main part of the effort is thinking how to describe a feature in a logical and didactically correct manner, and then expressing that in clear and concise English text. No markup language will ever help us solve these problems, as they are fundamentally human activities based on human creativity.

Few documentation writers would likely disagree with that sentiment. What is harder to assess, however, is the degree to which Texinfo, AsciiDoc, or any other documentation tool plays a role in attracting or repelling a hypothetical new recruit. Zaretskii's position is that "until more people start contributing to the docs, it makes very little sense to me to change the tools", which Raymond called "exactly backwards".

Reconciling those stances would not be easy under any circumstances. It does seem, though, that Raymond may have made his own task substantially more difficult by the manner in which he presented his proposal. He approached the list with a solution already in hand—one that seemingly arose from private conversation with Stallman and not with Zaretskii or other Emacs documentation team members. It can hardly be surprising, then, that his scheme was not received with unbridled enthusiasm by the volunteers on whom he announced it would be enacted.

Raymond, though, is certainly no stranger to controversy nor to the job of pushing a project forward on a difficult change. He recently spearheaded Emacs's move to Git revision control—a process that was difficult and at times prickly to deal with, but one that most people seem to regard as a good idea. He positioned dropping Texinfo in favor of AsciiDoc as a similar endeavor: modernizing the project for long-term benefit. But not all of Raymond's attempts to push the bar forward are particularly successful; some may recall his ill-fated mid-2000 effort to radically rework the kernel configuration system.

Stallman's support can often be sufficient to push through even a major change in Emacs, although at present it is not clear whether strong objections from Zaretskii and the other documentation authors are enough to block the plan. It seems a bit unlikely that Raymond's original changeover from Texinfo to AsciiDoc will be implemented as-is (at least, without quite a bit more discussion), although it is interesting to note that Stallman has exhibited a surprisingly open mindset about the possibility of ditching Texinfo for other formats. But Raymond's proposal has clearly prompted a number of Emacs contributors to take another look at Texinfo's HTML output and the web presence of the project's documentation.

As Filipp Gunbin observed, most people's first instinct when looking for an answer to a programming question is to search online. Historically, Emacs has not optimized its documentation for that approach. Whether the project can improve its documentation to the level expected by adapting its tools to produce slicker-looking, well-indexed HTML is hard to say—but there seems to be a willingness to try.


Brief items

Quotes of the week

FWIW, I really think that the standalone Info viewer should be kept secret, because it does a disservice to the reputation of the Info format.
Stefan Monnier

Maybe if we rebrand the ELF format as a container implementation people would feel better about static linking.
Kelsey Hightower


A new set of Docker tools

Docker has announced a new set of container management tools: Machine (for system provisioning), Swarm (native clustering for Dockerized applications), and Compose (assembly of multi-container applications). "Finally, Docker Swarm has a pluggable architecture and ships 'batteries included' with a default scheduler. Stay tuned for the public API in the first half of 2015 which will allow swapping-in a scheduler implemented by an ecosystem partner or even your own custom implementation. Nevertheless, regardless of the underlying scheduler implementation, the interface to the app remains consistent, meaning that the app remains 100% portable."


Qt 5.4 released

Version 5.4 of the Qt toolkit is now available. It provides better interaction with web-based content, improved graphics, Bluetooth Low Energy support, and a lot more, including a licensing change: "As announced earlier, the open-source version for Qt 5.4 is also made available under the LGPLv3 license. The new licensing option allows us at The Qt Company to introduce more value-add components for the whole Qt ecosystem without making compromises on the business side. It also helps to protect 3rd party developers’ freedom from consumer device lock-down and prevents Tivoization as well as other misuse."


Ghostscript 9.14.0 available

Version 9.14.0 of GNU Ghostscript has been released. The major feature to note in this version is a license change; Ghostscript is now published under the Affero GPL version 3.


sendmail 8.15.1 available

sendmail 8.15.1 has been released. There are new TLS-related features and several bugfixes included, but the major visible change in this version is that sendmail "uses uncompressed IPv6 addresses by default, which is an incompatible change that requires to update IPv6 related configuration data."


Bokeh 0.7 released

Version 0.7 of the Bokeh data-visualization library has been released. Many new tools are included, as are touch-capable UI widgets, an improved high-level charting interface, and a "vastly improved" linked-data table.


Python 2.7.9 released

The Python 2.7.9 release is out. The 2.7 series is in deep maintenance mode, but this update still includes a new SSL module (taken from Python 3.4) and validation of SSL certificates by default. This release also adds the ensurepip module, making the "pip" package manager available in all installations.


Newsletters and articles

Development newsletters from the past week


Hutterer: pointer acceleration in libinput - building a DPI database for mice

Peter Hutterer describes a new mechanism aimed at providing consistent acceleration behavior across mice. "For us, useless and unpredictable is bad, especially in the use-case of everyday desktops. To work around that, libinput 0.7 now incorporates the physical resolution into pointer acceleration. And to do that we need a database, which will be provided by udev as of systemd 218 (unreleased at the time of writing). This database incorporates the various devices and their physical resolution, together with their sampling rate. udev sets the resolution as the MOUSE_DPI property that we can read in libinput and use as reference point in the pointer accel code." The developers are looking for help to populate this new database.


Kocialkowski: A hacker's journey: freeing a phone from the ground up, first part

Paul Kocialkowski shares his experience with porting Replicant to the LG Optimus Black. "Every once in a while, an unexpected combination of circumstances ends up enabling us to do something pretty awesome. This is the story of one of those times. About a year ago, a member of the Replicant community started evaluating a few targets from CyanogenMod and noticed some interesting ones. After some early research, he picked a device: the LG Optimus Black (P970), bought one and started porting Replicant to it. After a few encouraging results, he was left facing issues he couldn't overcome and decided to give up with the port. As the device could still be an interesting target for Replicant, we decided to buy the phone from him so that I could pick up the work where he stalled." (Thanks to Paul Wise)


Page editor: Nathan Willis


Copyright © 2014, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds