By Nathan Willis
December 10, 2014
The GNU Emacs project is debating the idea of changing the format
in which its official documentation is written and maintained.
Proposing the change is Eric S. Raymond, who argues that the Texinfo
format currently used is archaic and constitutes a barrier to entry. His proposal
has its supporters—including Richard Stallman—but plenty of
other project members contend that whatever shortcomings the Emacs
documentation may have, replacing Texinfo as Raymond suggests is not
the fix.
For those unfamiliar with it, Texinfo is a GNU project with a long
history. It is the format behind the command-line info
tool (although Emacs uses its own, built-in info page browser) and is
the officially blessed manual format for all GNU projects.
The syntax itself is akin to many other text-markup formats, but it
was designed with structured, cross-referenced user manuals in mind.
Thus, it provides some features not found in other markup languages.
The most prominent example is its support for document-wide indexes.
In addition to their usage in terminal commands like info, Texinfo
documents can also be converted to HTML or (through TeX) to
print-quality PostScript output. Texinfo is used by a number of
non-GNU projects as well; users may have encountered references to
info documents in Linux man pages.
On December 5, Raymond wrote to the
Emacs development list to announce his intention to migrate the
project's documentation away from GNU's Texinfo as its master format, in
order to adopt AsciiDoc instead. Raymond's message came on the heels
of a lengthy discussion thread about attracting and retaining new
contributors, which is certainly an issue most free-software projects
can relate to. In the earlier discussion, several barriers to entry
for newcomers were mentioned, such as the fact that Emacs's official
how-to-contribute document is a text file installed in a local directory, rather than a web page.
Raymond cited that concern in his message, saying that
"Emacs's web resources are weak, scattered, and
unfocused," which he said was primarily because "the Emacs
development culture is still largely stuck in a pre-Web
mindset" that comes across to the public as behind-the-times.
And because the Emacs documentation is written in Texinfo, he said, it
actually is behind the times. He concluded by saying that a
change is required:
The solution must be partly a change in mechanism and partly a change
in policy and attitude. The change in technology is the simple part;
info and Texinfo must die. They must be replaced with a common format
for documentation masters that is Web-friendly, and by Web
presentation.
I have discussed this with RMS and, pending my ability to actually write
proper translation tools, we have agreed on asciidoc as a new master
format.
Raymond volunteered to take on the "tools end" of such
a change—starting with writing software to translate from
Texinfo to AsciiDoc—but called for someone else to take on the
"policy/organization end" of the transition.
Formats and converts
As might be expected for a change as large as the one proposed,
reaction to Raymond's email was varied. Several list members
expressed support, although many of those who supported the idea of
dropping Texinfo favored other documentation formats. Plenty of
others were opposed, either on the grounds that Raymond's assessments
of the suitability of the various documentation formats were off or out of
conviction that the costly migration proposed would do little if anything
to increase the ranks of Emacs contributors.
Chris Webber was one of the first to register agreement with Raymond's dissatisfaction
about the status quo, but he called the choice of AsciiDoc as an
alternative "very confusing" and asked why it was
chosen. Raymond replied that AsciiDoc
was in use by both the Git project and by Linux kernel developers
(although it was later pointed out that only
a small portion of the kernel's documentation makes use of AsciiDoc).
Moreover, AsciiDoc is "a modern, lightweight markup
in general use outside the Emacs project." Among the
other markup options, he said, Markdown suffers from a lack of
standardization (and he is not convinced the current standardization effort will bear fruit) and is not designed for structured documentation,
while reStructuredText
and Sphinx have not been adopted by as many
high-profile projects as AsciiDoc.
Altogether, AsciiDoc did not seem to elicit much support from list
members. In fact, among those who agreed that Texinfo should be
replaced, there were more voices in favor of adopting Emacs's Org mode as the markup style of choice than any other option. Rasmus Pank
Roulund raised the idea first, noting
that Emacs does not have a reliable editing mode for AsciiDoc and, at
present, AsciiDoc tools can only generate HTML output. Org mode, in contrast, is
well-supported in Emacs and can generate output in a variety of
formats.
Raymond, both in his reply to Webber and elsewhere, responded that
Org-mode markup would be a poor choice for a documentation format
because its use is too limited to the existing Emacs community:
org mode may be functionally
capable enough - I don't know yet - but I think it's the wrong kind
of positioning; it says "We're Emacs, we're going to stick to our
weird ingrown rituals and not-invented-here hostility, go away".
Stallman also weighed in on that
suggestion, initially dismissing Org mode because it is a program, not
a format, and furthermore it is a program that only runs within Emacs.
Org mode's supporters, like Achim Gratz, disagreed, pointing to several other
org-mode compliant programs and the official specification
for the syntax. Stallman subsequently conceded the points, concluding:
If we develop software to browse HTML manuals made by Texinfo with all
the good features of info, then we can drop use of Info format.
Then there will be no reason to insist on one particular source format
for manuals for GNU packages. We could allow any source markup format
that can generate the three kinds of output we want:
* Nicely formatted PS or PDF. (Ideally, passed through TeX.)
* HTML used like Info version 2.
* Plain ASCII.
If Org format can do these jobs, it would be one of the
acceptable source markup formats.
However, there is far from any consensus as to whether or not Org
mode's syntax—or, for that matter, AsciiDoc's—is better
than Texinfo's. In fact, it has proven fairly difficult to come to
any agreement on what "better" means in this context. Raymond
contended that Texinfo syntax involves
duplicating a document's structure in several places, and said he appreciates the fact that
AsciiDoc is readable as plain text. "I get to write *foo*
instead of <emphasis>foo</emphasis> and you know what? It's better -
lower overhead not just in typed characters but in the amount of
attention required to read it structurally."
In contrast, David Kastrup said
that "When looking at existing Texinfo source, I get a good idea of how to
write Texinfo markup of my own. When looking at AsciiDoc, I have no
clue since it is not apparent what is formatting, and what is
content."
A number of people also pointed out that, contrary to Raymond's initial
claim, Texinfo can (and does) generate HTML content from its source
files. Rüdiger Sonderfeld pointed out
that Texinfo supports images and can produce eye-pleasing output,
citing GNU Octave as an example. Kastrup, similarly, pointed to LilyPond as a positive showpiece.
On the other hand, the most oft-cited feature of Texinfo is its
support for indexes: when writing text, the user can add any term to
the manual's index with a markup tag, making the term easy to look up or
search for. The format comes with built-in support for a variety of
separate indexes, and adding additional indexes is relatively trivial.
Critics of the other formats discussed were quick to point out that
similar features are not available in the competition, but Eli
Zaretskii (Emacs's documentation maintainer) countered that it is not the ability to
mark up index entries that is the killer feature—rather, it is the
support in the various info browsers for searching the index of a manual.
Change
Ultimately, arguments about markup formats are probably just as
doomed as arguments about programming languages: too many of the
perceived benefits and drawbacks come down to personal judgment calls
on which a large group of people will rarely reach a consensus. Most
of the list membership agreed that work could be done to improve the
HTML output generated for Emacs's documentation. The notion that a
different source format was required to improve the output, however,
was a harder sell.
But the proposed change in formats also encountered resistance on
another front: Raymond's assertion that switching to AsciiDoc would
attract more Emacs contributors. Several people, such as Ted
Zlatanov, did acknowledge that they
considered Texinfo "a nasty hurdle," even discouraging
them from contributing documentation.
Kastrup, though, argued that
switching formats might attract different people, but was
unlikely to attract more. Furthermore, switching formats
would likely have the unwanted effect of causing existing
documentation writers to lose interest. Zaretskii, for his part, called the AsciiDoc plan "simply
insane," given that Texinfo is the only source format in use by
the existing documentation authors.
Zaretskii also argued that:
[...] no matter to which source language we switch, it will
not magically solve our documentation issues. Contrary to what Eric
is saying, 90% of the effort of writing good documentation is not
producing markup -- each one is just a couple of keys in Emacs's
texinfo-mode. No, the main part of the effort is thinking how to
describe a feature in a logical and didactically correct manner, and
then expressing that in clear and concise English text. No markup
language will ever help us solve these problems, as they are
fundamentally human activities based on human creativity.
Few documentation writers would likely disagree with that
sentiment. What is harder to assess, however, is the degree to which
Texinfo, AsciiDoc, or any other documentation tool plays a role in
attracting or repelling a hypothetical new recruit. Zaretskii's position is that "until more people
start contributing to the docs, it makes very little sense to me to
change the tools," which Raymond called "exactly backwards."
Reconciling those stances would not be easy under any
circumstances. It does seem, though, that Raymond may have made his
own task substantially more difficult by the manner in which he presented his
proposal. He approached the list with a solution already in hand—one
that seemingly arose from private conversation with Stallman and not
with Zaretskii or other Emacs documentation team members. It can
hardly be surprising, then, that his scheme was not received with
unbridled enthusiasm by the volunteers on whom he announced it would
be enacted.
Raymond, though, is certainly no stranger to controversy nor to the
job of pushing a project forward on a difficult change. He recently
spearheaded Emacs's move to Git revision control—a process that
was difficult and at times prickly to deal with, but one that most
people seem to regard as a good idea. He positioned dropping Texinfo
in favor of AsciiDoc as a similar endeavor: modernizing the project
for long-term benefit. But not all of Raymond's attempts to push the
bar forward are particularly successful; some may recall his ill-fated
mid-2000 effort to radically rework the kernel configuration system.
Stallman's support can often be sufficient to push through even a major
change in Emacs, although at present it is not clear if strong
objections from Zaretskii and the other documentation authors are
enough to block the plan. It seems a bit unlikely that Raymond's
original changeover from Texinfo to AsciiDoc will be implemented as-is
(at least, without quite a bit more discussion), although it is
interesting to note that Stallman has
exhibited a surprisingly open mindset about the possibility of
ditching Texinfo for other formats. But Raymond's proposal has
clearly prompted a number of Emacs contributors to take another look
at Texinfo's HTML output and the web presence of the project's documentation.
As Filipp Gunbin observed, most people's first instinct
when looking for an answer to a programming question is to search
online. Historically, Emacs has not optimized its documentation for
that approach. Whether the project can improve its documentation to
the level expected by adapting its tools to produce slicker-looking,
well-indexed HTML is hard to say—but there seems to be a
willingness to try.