
Development

A survey of the DocBook landscape

September 12, 2006

This article was contributed by John L. Clark

Introduction

The OpenDocument Format, developed under OASIS (Organization for the Advancement of Structured Information Standards), has been getting quite a bit of attention lately. ODF is an Open Standard and it serves as an important vehicle for the Free Software community and this community's information; the Software Freedom Law Center recently confirmed that ODF is safe from patent claims from its OASIS Technical Committee members. Version 1.0 of the format was ratified in May of 2005 by this TC, and ODF recently arrived at one of the last stages in its process towards ISO/IEC adoption as ISO/IEC 26300. The state of Massachusetts underwent a grueling and well-scrutinized process last year in which it decided to use ODF for its official documents; at least one vendor strongly opposed this decision, but even this vendor has recently announced work on interoperability with ODF.

All this attention is well-deserved, for ODF intends to provide the structure for many of the documents that store many users' information: "office" documents. The basic purpose of a format for office documents is to encode the presentation of information. Most commonly, office documents encode how to present page-based sequential documents in print, spreadsheets in various media, and slides in interactive display and various other media. One alternative approach to authoring content focuses on the semantics of the information; this approach requires more discipline but can provide some advantages, particularly when it comes to reusing the information. In addition to ODF, OASIS also oversees the development of DocBook, which takes this alternative approach. Several significant events in DocBook development warrant some attention in that direction.

DocBook was originally developed as an SGML application and has been modernized to simultaneously support SGML and XML; it focuses on the semantics of software and hardware documentation. DocBook also provides a clear and rich representation of the semantics of general-purpose documentation, including detailed structures for bibliographic information, glossaries, and a variety of contextual devices such as footnotes. Many free software projects make use of DocBook (or a variant), including KDE, GNOME, and OpenDarwin. Not surprisingly, The Linux Documentation Project makes heavy use of DocBook.

What can you do to read a DocBook file if you (unexpectedly) receive one? Perhaps the easiest approach is to use the DocBook XSL stylesheets to format the file as HTML, then view it with your favorite web browser. The xsltproc utility provides XML translation functionality, and it is easy to install if your distribution does not already provide it. Using xsltproc, you can translate a DocBook file to HTML with the command:

    xsltproc http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl file.docbook > file.html

Other translation tools and stylesheets exist, and perhaps the best solution is to use a native reader or editor of DocBook, such as Vex or Conglomerate, to view and interact with the file directly.
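For anyone who would rather script the conversion, the xsltproc invocation can be wrapped in a few lines of Python. This is only a sketch: it assumes xsltproc is installed and on the PATH, and the helper names build_xsltproc_command and docbook_to_html are made up for this illustration. The stylesheet URL is the one given in the command.

```python
import subprocess
from pathlib import Path

# Stylesheet URL taken from the article's command; fetching it needs network access.
HTML_STYLESHEET = "http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"


def build_xsltproc_command(source, stylesheet=HTML_STYLESHEET):
    """Return the argument list for: xsltproc <stylesheet> <source>."""
    return ["xsltproc", stylesheet, str(source)]


def docbook_to_html(source, target):
    """Run xsltproc and write its standard output to `target`.

    Assumes xsltproc is installed; subprocess.run raises CalledProcessError
    if the transformation exits with a non-zero status.
    """
    with open(target, "wb") as out:
        subprocess.run(build_xsltproc_command(source), stdout=out, check=True)
    return Path(target)


if __name__ == "__main__":
    # Only print the command here; call docbook_to_html() to actually run it.
    print(" ".join(build_xsltproc_command("file.docbook")))
```

Keeping the command construction separate from its execution makes it easy to swap in a locally installed copy of the stylesheets instead of the network URL.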

The DocBook language: present and future

The DocBook 4 development line currently produces the stable version of DocBook: DocBook 4.4. The current "OASIS Standard" version of DocBook, however, is DocBook 4.1, which is why you often see projects using DocBook 4.1.2—the latest bug-fix version of DocBook 4.1. DocBook 4.5 is nearly completed, and has also been submitted for approval as an OASIS Standard. Release Candidate 3 (released in June) will likely become the newest stable version; RC2 was itself almost accepted as an OASIS Standard until a small bug in the specification forced the version bump.

As a matter of DocBook project policy, individual DocBook minor versions within a major version are backwards compatible with previous minor versions in the same major version. For example, all documents written in DocBook 4.1.2 are valid DocBook 4.4 documents and all DocBook 4.4 documents will be valid DocBook 4.5 documents when that version is available. These minor versions of DocBook 4 have subtly added to its expressiveness in addition to adding completely new elements, such as user-requested markup for describing tasks.

A new major version of DocBook, version 5, is rapidly approaching. DocBook 5 explicitly breaks backwards compatibility in order to move in some new directions, which largely have to do with aspects of the underlying technology. The naming and semantics of markup in DocBook 5, on the other hand, strongly reflect DocBook 4. DocBook 5 makes a break from its SGML roots, moving to aspects of XML technology that are not represented in the SGML model.

The most prominent of the architectural changes is that DocBook 5 now uses an XML namespace for its element set. This namespace will be used by the stable version when it is released so users will not need to migrate to a different namespace once DocBook 5 stabilizes. The use of an XML namespace allows DocBook to more cleanly take advantage of other XML dialects such as SVG and MathML; it also allows other languages to more easily integrate DocBook, or subsets of DocBook, in places where they want to express prose documentation.
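A small, hedged illustration of what the namespace buys: the sample below mixes DocBook 5 elements with a MathML fragment, and a namespace-aware parser can tell the two vocabularies apart without any special knowledge of either. The sample document is invented for this sketch and has not been validated against the DocBook 5 schema; the helper names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative document only; not checked against the DocBook 5 schema.
SAMPLE = """\
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Namespaces at work</title>
  <para>An inline formula:
    <inlineequation>
      <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
        <mml:mi>x</mml:mi>
      </mml:math>
    </inlineequation>
  </para>
</article>
"""


def namespace_of(element):
    """Extract the namespace URI from ElementTree's '{uri}local' tag form."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def count_by_namespace(xml_text):
    """Count the elements in the document, grouped by namespace URI."""
    counts = {}
    for element in ET.fromstring(xml_text).iter():
        ns = namespace_of(element)
        counts[ns] = counts.get(ns, 0) + 1
    return counts


if __name__ == "__main__":
    for ns, count in count_by_namespace(SAMPLE).items():
        print(count, ns)
```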

Validation and new features

Document validation is an important tool for supporting document interoperability. Through version 4, DocBook has primarily provided a Document Type Definition (DTD) for assessing document validity. DTDs are well supported and built into the core XML specification, but they are not able to deal with XML Namespaces and they are not as expressive as more modern tools. For these and other reasons, DocBook 5 (like ODF) provides a RELAX NG schema as its basis for validation. RELAX NG is more context-aware, which means that in several places certain DocBook constructs have been simplified or merged, and a number of previously unenforceable constraints are now enforced.
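One way to see why DTDs and namespaces do not mix: a DTD declares literal element names, prefix included, whereas namespace-aware tools compare the namespace URI together with the local name. The sketch below (sample strings and the function name are invented for illustration) shows two spellings of the same DocBook 5 paragraph that a namespace-aware parser treats as identical, though a DTD written for one spelling would not match the other.

```python
import xml.etree.ElementTree as ET

# The same element written with a default namespace and with a prefix.
DEFAULT_FORM = '<para xmlns="http://docbook.org/ns/docbook">Hello</para>'
PREFIXED_FORM = '<db:para xmlns:db="http://docbook.org/ns/docbook">Hello</db:para>'


def expanded_name(xml_text):
    """Return the root element's name in '{namespace-uri}local' form."""
    return ET.fromstring(xml_text).tag


if __name__ == "__main__":
    print(expanded_name(DEFAULT_FORM))
    print(expanded_name(PREFIXED_FORM))
```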

The DocBook 5 schema in RELAX NG is also highly modular, which means that anyone interested in modifying the language can easily pick and choose from small components to build their custom language. If needed, users can also use less accurate, monolithic DTDs or W3C XML Schemas that are generated from the RELAX NG schema. In addition to RELAX NG, the DocBook 5 schema uses a set of optional Schematron assertions to help validate those hard-to-reach places.

DocBook 5 also sports new and improved facilities for expressing content. Instead of native hypertext markup, it uses XLink for hypertext references. Interestingly, in DocBook 5 almost every element can serve as a hyperlink: if xlink is bound to the XLink namespace, then simply set xlink:href="target" on an element to have that element point at the target. In XLink, these types of links are called Simple Links; DocBook 5 also adds support for XLink Extended Links using the new, imaginatively named extendedlink element.
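To make the "almost every element can serve as a hyperlink" point concrete, here is a minimal sketch that collects every XLink Simple Link in a DocBook 5 fragment, regardless of which element carries it. The fragment, the example.org target, and the simple_links helper are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# xlink:href as ElementTree sees it once the prefix is resolved.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Illustrative fragment; the link targets are placeholders.
SAMPLE = """\
<para xmlns="http://docbook.org/ns/docbook"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  See <link xlink:href="http://docbook.org/">the DocBook site</link> or run
  <command xlink:href="http://example.org/xsltproc">xsltproc</command>.
</para>
"""


def simple_links(xml_text):
    """Return (element local name, target) pairs for every xlink:href found."""
    links = []
    for element in ET.fromstring(xml_text).iter():
        href = element.get(XLINK_HREF)
        if href is not None:
            local_name = element.tag.rsplit("}", 1)[-1]
            links.append((local_name, href))
    return links


if __name__ == "__main__":
    for name, target in simple_links(SAMPLE):
        print(name, "->", target)
```

Note that the command element is picked up just like the dedicated link element; the attribute, not the element name, is what makes something a link.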

DocBook 5 continues to use XInclude to support transclusion. In addition to many fixes, the removal of several obsolete components, and a number of small adjustments, it also introduces elements designed to support new features, such as a general mechanism for annotating content and a structure for noting the correspondence between a term and its definition.
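Transclusion with XInclude can be tried out with nothing but the Python standard library. The sketch below resolves an xi:include in a DocBook 5 book using xml.etree.ElementInclude; to keep it self-contained, the included chapter lives in an in-memory dictionary rather than on disk. The file name, the sample content, and the memory_loader and assemble helpers are all hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DOCBOOK = "{http://docbook.org/ns/docbook}"

MAIN = """\
<book xmlns="http://docbook.org/ns/docbook"
      xmlns:xi="http://www.w3.org/2001/XInclude" version="5.0">
  <title>Assembled book</title>
  <xi:include href="chapter1.xml"/>
</book>
"""

# Hypothetical stand-in for files on disk, keyed by href.
FILES = {
    "chapter1.xml": """\
<chapter xmlns="http://docbook.org/ns/docbook">
  <title>First chapter</title>
  <para>Pulled in by XInclude.</para>
</chapter>
""",
}


def memory_loader(href, parse, encoding=None):
    """Loader for ElementInclude: parsed XML for parse='xml', raw text otherwise."""
    if parse == "xml":
        return ET.fromstring(FILES[href])
    return FILES[href]


def assemble(xml_text):
    """Parse the document and replace each xi:include with its target."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=memory_loader)
    return root


if __name__ == "__main__":
    book = assemble(MAIN)
    print([child.tag.rsplit("}", 1)[-1] for child in book])
```

In a real project the default loader, which reads from the file system, would normally be enough; the custom loader here only exists so the example runs without any files.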

Practical considerations

DocBook 5 will likely have a stable release soon. Norman Walsh, the main hacker, er, lead architect of DocBook 5, published his first experiments with the new language in May of 2003 and the first official beta of DocBook 5 was published in October of 2005. It is currently at beta 7, and there will be several release candidates before the Technical Committee applies the official DocBook 5.0 seal of approval.

Many of the tools for processing DocBook have gained DocBook 5 support as DocBook 5 has developed. Many users take advantage of the (previously mentioned) DocBook XSL stylesheets for converting DocBook to other formats for publication, such as HTML and XSL-FO (an intermediate step toward producing PDF). The stable version of the DocBook XSL stylesheets is 1.70.1, and it includes support for DocBook 5.0; the next testing version of these stylesheets, version 1.71.0, was released recently. Work has also begun on a rewrite of the DocBook XSL stylesheets using XSLT 2; these are unsurprisingly called the DocBook XSL 2 stylesheets. Developers of some DocBook editors and other tools have worked to integrate support for DocBook 5.

Jirka Kosek, card-carrying member of the DocBook illuminati, has written and currently maintains DocBook V5.0: The Transition Guide, which covers the above DocBook 5 issues in more detail and which will be very useful to anyone interested in migrating from DocBook 4 to DocBook 5.

DocBook offers authors a powerful level of expressiveness, and both the stable version 4 and the new version 5 will soon reach important milestones. DocBook 5 is a refactoring, intended to better integrate with XML technologies and to be easier to use by authors and users who need to customize the language itself. It is written with the intention of avoiding major disruptions of patterns of authoring that exist with DocBook 4. New versions of both DocBook 4 and DocBook 5 continue to offer enhancements that allow authors to better express their thoughts and convey information.


System Applications

Audio Projects

Rivendell 0.9.73 announced

Version 0.9.73 of the Rivendell radio automation system is out with new features and bug fixes. "Rivendell is a full-featured radio automation system targeted for use in professional broadcast environments. It is available under the GNU General Public License."


LDAP Software

LAT 1.1.90 announced

Version 1.1.90 of LAT, the LDAP Administration Tool, is available. "This is the first beta for the 1.2 release. Check it out. If you find any bugs, please report them."


Security

Sussen 0.29 announced

Version 0.29 of Sussen, a vulnerability and configuration scanner, is out with bug fixes.


Web Site Development

ccHost 3.0 released

Version 3.0 of ccHost has been announced. "Creative Commons, a nonprofit organization that provides flexible copyright licenses for authors and artists along with the Creative Commons Developer Community released the ccHost 3.0 today. ccHost is an Open Source web-based media sharing software. This major feature release comes on the heals of winning the Linux Journal Linux World Expo Award for "Best Open Source Solution" and combines approximately five months of development, usage, and testing into packages that anyone may download, install, and use to empower on-line media sharing communities."


Plone 2.5.1 and 2.1.4 released

Two new releases of Plone, a web content management system, have been announced. "We have prepared two new releases of the 2.5.x and 2.1.x series with default policy improvements to counter the spam attacks that some Plone sites have been a victim of lately. This is a required upgrade for all Plone sites, please be a responsible administrator and update your sites as soon as possible."


Zope News

The August 16-31, 2006 edition of Zope News is available with coverage of the Zope content management system.


Web Services

Separation of Concerns in Web Service Implementations (O'ReillyNet)

Tieu Luu discusses the separation of concerns in web service implementations in an O'Reilly article. "Separation of concerns is a core principle of Service-Oriented Architectures. Unfortunately, this principle is often lost when it comes to the implementations of SOA services. All too often we see a big implementation class with multiple concerns such as security, transaction management, and logging all mixed in with the business logic. Using the Spring Framework and principles of Aspect Oriented Programming (AOP), we can drive the separation of concerns down into the implementation of services. In this article, we show how to develop a Web service using Apache Axis and Spring, and secure it with Acegi Security--all while keeping the concerns nicely separated."


Desktop Applications

Accessibility

Accessibility Test Suite

Rodney Dawes has posted an update on a GNOME accessibility test suite that he is working on; testers are needed. "Lately, I've been working on some tools to help us improve the level of accessibility support in our desktop. In doing so, I ended up creating a python module to minimize the code duplication between scripts, as each application being tested, needs its own script. The module itself does a little initialization and shutdown stuff, and writes out an HTML file to present a nice tabular report of missing Name and Description identifiers on accessible widgets, using LDTP."


Audio Applications

What to expect in Ardour2

A new article about the Ardour multi-track audio editor package, entitled What to expect in Ardour2, is out; it describes the plans for the next version in detail. New features will include GTK2 support, a control surface architecture, OSC support, a redone sound file browser/importer, saved undo, a revamped UI, destructive recording, support for 64-bit sound formats, and more. (Thanks to Taybin Rutkin.)


Desktop Environments

Desktop memory usage comparison

Lubos Lunak has documented a comparison of memory usage with four popular desktop environments running a variety of applications. "These memory benchmarks are meant to measure various cases of desktop configuration and compare KDE to some other desktop environments. Specifically, I compared against Xfce 4.2.2 (as shipped with SUSE Linux 10.0) as the so-called lightweight desktop, WindowMaker 0.92.0 as a plain window manager and GNOME. GNOME, built using GARNOME, was originally version 2.12.2, later redoing it with 2.14.0 (without actually measuring noticeable difference in these specific cases, despite 2.14 release notes claiming performance improvements). As I no longer have the same setup I cannot redo it with the very recent 2.16 unfortunately. Simply consider this to be a bit old. The others are for comparison anyway :). KDE itself was KDE 3.5.2 with my performance patches, all of which are already upstream by now." (Thanks to Alexander Neundorf.)


GARNOME 2.16.0 released

Version 2.16.0 of GARNOME, the bleeding-edge GNOME distribution, is out. "This release incorporates the GNOME 2.16.0 Desktop and Developer Platform, fine-tuned with love by the GARNOME Team. It includes updates and fixes after the GNOME 2.16.0 freeze, together with a host of third-party GNOME packages, Bindings and the Mono(tm) Platform -- this release is the first of a new stable GNOME branch and ships with the latest and greatest releases."


GNOME Software Announcements

You can find new GNOME software releases at gnomefiles.org.


KDE Software Announcements

You can find new KDE software releases at kde-apps.org.


KDE Commit-Digest (KDE.News)

The September 10, 2006 edition of the KDE Commit-Digest has been announced. The content summary says: "Work begins on Ruby language support in KDevelop 4. Work continues in the KReversi code rewrite. Kalzium gets functionality to visually show the country an element was discovered in. Automatic regression testing for Kate. Mimetype and metadata support for the XML Paper Specification format. Strigi can now use outside applications to index files outside its core scope, such as PDF files. KJots gets greatly improved find and replace functionality. Many improvements in supporting different archive formats in KArchiver."


Electronics

gEDA/gaf 20060906 announced

Version 20060906 of gEDA/gaf, a collection of electronic design tools, has been announced. "This is primarily a bug fix release. Hopefully all of the autosave bugs have been squashed along with a few other annoying bugs fixed. This release also includes Peter Brett's new print dialog which is a vast improvement over the Ales' "piece of something" print dialog box that was part of gschem since almost the beginning. I *highly* recommend that everybody upgrade to this release, especially if you are experiencing random crashes."


Financial Applications

SQL-Ledger 2.6.19 released

Version 2.6.19 of SQL-Ledger, a web-based accounting package, has been announced; it features several bug fixes.


GUI Packages

Qt 4.2 Release Candidate Issued (KDE.News)

KDE.News notes the availability of a Qt 4.2 release candidate. "Trolltech has issued a release candidate of Qt 4.2 under an evaluation licence. This version features CSS-like widget styling capability, a new 2D canvas class called QGraphicsView, text completion, new calendar and font selection widgets, and new desktop integration features."


Music Applications

Amuc 1.3 announced

Version 1.3 of Amuc, the Amsterdam Music Composer, is out. "This version has quite some modifications, and now also can import MIDI files."


Office Suites

KOffice 1.6 Beta 1 Released (KDE.News)

KDE.News reports the release of KOffice 1.6 beta1. "This release incorporates a number of new features, mainly from the Google Summer of Code projects, as well as a great number of bug fixes. It also signals the start of the feature freeze that always preceeds a release of a major new version, thus giving the developers exactly a month to fix outstanding bugs. We urge everybody that is interested in KOffice to install and test this version to make sure that the final 1.6 has a high quality." More details are available in the announcement and the full changelog.


Languages and Tools

Caml

Caml Weekly News

The September 12, 2006 edition of the Caml Weekly News is out with new Caml language articles.


Perl

Weekly Perl 6 mailing list summary (O'Reilly)

The September 2-9, 2006 edition of the Weekly Perl 6 mailing list summary is out with coverage of the Perl 6 mailing lists.


Ruby

Ruby Weekly News

The September 10th, 2006 edition of the Ruby Weekly News looks at the latest discussions on the ruby-talk mailing list and comp.lang.ruby newsgroup.


Tcl/Tk

Dr. Dobb's Tcl-URL!

The September 12, 2006 edition of Dr. Dobb's Tcl-URL! is online with new Tcl/Tk articles and resources.


Page editor: Forrest Cook

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds