
A quick look at Conglomerate 0.70

The DocBook format is often promoted as the format of choice for free (and non-free) documentation. DocBook, as an SGML and XML standard, is compliant with as many buzzwords as anybody could wish for. The standard is well developed and highly expressive. And DocBook, of course, is all about structure. More, perhaps, than any other markup language, DocBook forces the author to concentrate on the structure of the document without thinking about how it will be rendered in any particular medium.

Anybody who has had to create a serious work in full frontal DocBook knows the rest of the story, however. DocBook is complex and verbose. Like PostScript, DocBook requires that the author maintain a deep stack in mind to track the current state of the document. And, like PostScript, DocBook is best used as the output of a higher-level tool, rather than created directly by the author.

Unfortunately, given the current state of the tools available, manipulating DocBook directly with a text editor is often the only option available. So your editor, who is currently in the process of updating a substantial book which is, of course, in DocBook format, was more than usually interested in the recent announcement that Conglomerate 0.70 had been released. As stated in the announcement:

Conglomerate is a free, user-friendly XML editor. It is particularly aimed at DocBook editing, but should be able to handle arbitrary XML document types.

For authors working in DocBook, a nice editor would be worth a great deal. So Conglomerate seemed worth checking out.

The first challenge with bleeding-edge software, of course, is getting it installed and running. For Conglomerate, an attempted install on Debian sid proved doomed to failure; the maze of dependencies proved too twisted, and the packaged version in experimental had not been updated. On the other hand, version 0.70 configured, built, and installed on a Red Hat Linux 9 system without trouble. There are advantages to having a variety of distributions sitting around.

What resulted was a tool that shows some serious promise, but which is not yet ready for production use. The sample text used (Chapter Two of Linux Device Drivers, Third Edition) required significant editing (with a text editor) before Conglomerate would accept it. Conglomerate does not recognize common entities (e.g. &ndash;), and there are differences of opinion on how certain types of tag (such as <indexterm>) should be terminated in some situations. The tool spews out an unending series of Gtk warnings, crashes occasionally, and is generally slow. It is missing fundamental features, such as an "undo" operation. It does, however, work well enough to give a good idea of where the developers are going.

True to the basic premise of DocBook, Conglomerate is all about structure. Looking at a document in DocBook will not tell you much about how it will appear in printed (or web) form, but it is full of information on how the document goes together. To that end, the window (see the screen shot on the right) is divided into two panes. The left side shows the overall structure of the document, in the usual tree presentation. The main window, on the right, shows the text. But this is no WYSIWYG presentation; instead, the document is presented as a set of nested boxes showing, once again, how things are structured. Subtrees of the document can be expanded or hidden at will, providing a sort of zoom feature.
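The tree pane described above is essentially an outline of the element hierarchy. As a rough sketch of that idea (this is not Conglomerate's code; the sample chapter and the `outline` helper are made up for illustration), Python's standard XML parser can produce a similar view:

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment, invented for this illustration.
SAMPLE = """<chapter>
  <title>Sample chapter</title>
  <sect1>
    <title>First section</title>
    <para>Some text.</para>
  </sect1>
</chapter>"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(SAMPLE))
print("\n".join(tree_lines))
```

Each level of indentation corresponds to one level of nesting, which is the same information the nested boxes in the main pane convey.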

At the structural element level, the right mouse button yields an impressive array of new elements (86 of them) which can be added as child or sibling elements. Once you get below the paragraph level, however, a whole new menu with various types of low-level markup (e.g. <emphasis>) appears instead. Conglomerate does not, of course, change the presentation of the text to reflect this sort of markup. So, for example, rather than italicizing text marked <literal>, it simply indicates that the tag is present. The tool displays internal comments in a highlighted form, but does not appear to provide a way to add or edit comments.

There is no shortage of features that this tool still needs: undo, an easy way to join paragraphs, the ability to read and fix not-quite-perfect files, entity definitions, and some sort of way to quickly see what formatted output would look like. The performance and stability issues need some work. But none of this should detract from the fact that the Conglomerate developers have made substantial progress toward the creation of a desperately needed tool. Conglomerate is headed in the right direction; we're looking forward to the next release.



Display

Posted Sep 3, 2003 19:30 UTC (Wed) by jamienk (subscriber, #1144) [Link]

Even though the software -- and this article -- are all about structure and not display, I'm really struck by the GTK widgets (whatever theme), nice fonts, etc. Over the past few months, the Linux screenshots I've been seeing (http://www.kde-look.org/content/pics/7559-1.png too) have exceeded Windows and even the Mac in clean, professional simplicity and beauty. No small feat.

LaTeX

Posted Sep 3, 2003 19:35 UTC (Wed) by subhasroy (guest, #325) [Link]

LaTeX is an alternative. Many large technical books have been
written with it. When I was at grad school (early 90's), LaTeX
was the standard tool for all computer science research paper
publications. There is a LaTeX2HTML tool also. PDF generation
is easy. Also, LaTeX comes with all Linux distributions. It is
easy to use (I used it to author many papers).

LaTeX

Posted Sep 3, 2003 20:18 UTC (Wed) by trutkin (guest, #3919) [Link]

LyX is a wonderful editor for LaTeX and TeX documents. I used it for all my papers in college and I imagine that it has only improved since then.

http://www.lyx.org

A quick look at Conglomerate 0.70

Posted Sep 3, 2003 23:46 UTC (Wed) by tosk (guest, #5697) [Link]

give xmlmind a chance, i think it fulfills your needs. with xml-plugins jedit is quite cool too. nevertheless a stable gnome2 docbook-editor would be great!!! best regards are

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 1:57 UTC (Thu) by erich (subscriber, #7127) [Link]

That editor is not free, is it?
So it's not an option - i'm using it for personal stuff not for work.

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 4:51 UTC (Thu) by tosk (guest, #5697) [Link]

i only did a quick review of the license of the standard edition:

1.1 Installation and use
The Software may be installed on any number of Licensee's host machines, either for personal use or for corporate use inside an organization.


it's free enough for my needs ;o) best regards!

A quick look at Conglomerate 0.70

Posted Jan 9, 2004 7:31 UTC (Fri) by iulian_velea (guest, #18562) [Link]

If you don't like XMLMind you can try another editor which I like:
<oXygen/> XML Editor (www.oxygenxml.com).
You can download this application from www.oxygenxml.com/download.html

A quick look at Conglomerate 0.70

Posted Jan 9, 2004 7:34 UTC (Fri) by iulian_velea (guest, #18562) [Link]

To make it easier: <oXygen> XML Editor

jEdit (A quick look at Conglomerate 0.70)

Posted Sep 4, 2003 3:44 UTC (Thu) by pdc (subscriber, #1353) [Link]

jEdit + the XML and XSLT plug-ins are what I used for editing XML documents. It has some really nice features:

-- folding (hiding the layers of the document structure),

-- a structure browser (for easy navigation),

-- end-tag auto-completion when you type '</',

-- DTD-driven verification as you type (or when you save the file if you want to be pestered less often).

It requires Java to run.

XMLMind XML Editor

Posted Sep 4, 2003 8:39 UTC (Thu) by fern (subscriber, #7253) [Link]

I also have to endorse XMLMind.

It has two licenses, one is free to just use the simple editor. The other you pay for extra features and freedoms. The editor is nice, fully implemented with CSS2 stylesheets, default styling for Docbook WYSIWYG editing, etc etc. The developers also seem genuinely interested in making a good product, and constantly improving it.

Give it a shot, you might or might not like it. But it's certainly a good tool to have available. Also you might decide that paying for features that you don't need is a good way to support the continuing development of a wonderful tool.

A quick look at Conglomerate 0.70

Posted Sep 12, 2003 14:07 UTC (Fri) by jeremiah (guest, #1221) [Link]

After reading your post, I took a look at XMLMind. Our support staff, who write our documentation, now want to bear your children. They have been using XMLSpy for quite some time (2+ years, maybe), and the difference is absolutely amazing. Thanx for mentioning it, and anyone else who did.

LyX exports DocBook

Posted Sep 4, 2003 1:32 UTC (Thu) by stevenj (subscriber, #421) [Link]

LyX, initially a front-end for LaTeX, has been able to export DocBook files for some time now (as well as exporting LaTeX, PDF, etc.).

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 4:00 UTC (Thu) by tpeters (guest, #4579) [Link]

IMHO the problem with DocBook is not how tedious it is to edit, but how hard it is to get into print. One of several long tool chains has to be set up to actually get a paper printout, and it is a pain to get any of them to work. Moreover, the default style sheets do a poor job. In fact, I only ever got the route -> LaTeX -> dvi -> ps working properly, and the LaTeX generated from DocBook yields poorer layout and print than if I had written it in LaTeX directly.

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 4:47 UTC (Thu) by tosk (guest, #5697) [Link]

yes, i think the same. i've tried the fo-xsl/fop combination too, and sometimes it works. but the quality of manuals typed in/with latex is much better, if you're looking for pdf-output. i think, the main advantage is the structure, because you can parse your document easily. i use this feature to implement my online-help. btw, has somebody some links to advanced/stylish stylesheets for docbook? i thank you in advance for hints. best regards!

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 6:32 UTC (Thu) by wookey (subscriber, #5501) [Link]

I too have struggled with the joys of docbook-SGML and the conversion toolchain. I think the concept is marvellous but the actuality is less good. I use it for producing books in HTML and PDF form at the same time, and it also allows me to generate several slightly different versions from the same source.

The biggest problem is that indexing works fine in HTML but I can't get it to work properly in PDF. Another major annoyance is that if I want the pictures to appear I have to go via DVI but then I lose the internal PDF links in the contents. I have yet to get both at once.

As others have remarked, the layout of the generated PDF leaves plenty of room for improvement too - tables and titles are often orphaned, for example. This wouldn't matter too much except that it's _really_ hard to fix - you need a deep and loving understanding of TeX and scheme and DTDs and how it all fits together. I've tried quite hard and failed to get to the bottom of it in the time available. It clearly is possible as O'Reilly's books have working indexes and look OK and I believe they use this technology. I'd be happy to resort to a professional in this area if they could fix my doc-generation process for me but I haven't found any we could afford the couple of times I looked.

So IMHO the data-entry is not the problem - the toolchain and its configuration are. A mode that understands SGML tags is handy (emacs does a reasonable job and even jed can help out a bit) and one that understands the DTD and can remind you what is currently allowed is also handy, but in general even doing it all in plain text is only a bit dull. I suppose the time when this method falls down is if you are hacking a text about a lot - you need help not to get your sections out of step - presumably that's what corbet is doing.

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 13:16 UTC (Thu) by priyadi (guest, #6583) [Link]

I suggest converting DocBook SGML to DocBook XML. Then use this route: DocBook XML -> XSL-FO -> PDF. Images and links are working great at the same time for me here. However, the XSL-FO -> PDF conversion tools still need a bit of maturing. For example you'll still find orphans when using FOP (and I can't get PassiveTeX working). But you can use XEP (commercial) instead of FOP if you need better output.
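For readers who have not set this route up, here is a minimal sketch of the two steps, assuming xsltproc for the DocBook XML -> XSL-FO transformation and FOP for XSL-FO -> PDF. The file names and the stylesheet path are hypothetical and depend on where the DocBook XSL stylesheets are installed; the script only builds and prints the command lines rather than running them.

```python
def build_commands(source, stylesheet, stem):
    """Build the command lines for DocBook XML -> XSL-FO -> PDF."""
    fo_file = stem + ".fo"
    pdf_file = stem + ".pdf"
    return [
        # Step 1: apply the FO stylesheet to the DocBook XML source.
        ["xsltproc", "-o", fo_file, stylesheet, source],
        # Step 2: let FOP render the formatting objects as PDF.
        ["fop", "-fo", fo_file, "-pdf", pdf_file],
    ]


# Hypothetical paths; adjust to the local stylesheet installation.
commands = build_commands("book.xml", "docbook-xsl/fo/docbook.xsl", "book")
for command in commands:
    print(" ".join(command))
```

To actually run the chain, each list could be handed to `subprocess.run(command, check=True)` once both tools are installed.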

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 13:06 UTC (Thu) by priyadi (guest, #6583) [Link]

I don't know if you already know this, but if you are using DocBook XML (not SGML) there is an excellent book at http://www.sagehill.net/; the whole book is even available to read online. To generate PostScript, I recommend the route DocBook XML -> XSL FO -> PDF -> PS; it is working great for me.

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 13:13 UTC (Thu) by oak (guest, #2786) [Link]

And doing (La)TeX with LyX frontend is much easier...

LyX can export ASCII (with tables!), HTML, DocBook (OO can do that too), Postscript, PDF. LyX/LaTeX is not quite as structured as DocBook (doesn't have as many types), but it's much easier to use than native DocBook editors (Emacs sgml-mode :)) and the output is superior.

For generating PDF from LyX or LaTeX files, I recommend the script at tex2pdf.berlios.de; it has really beautiful output, including PDF index and page thumbnail generation.

Earlier LyX GUI used the crummy pre 1.0 version of xforms widget toolkit, but now it has also a (much nicer) Qt frontend.

A quick look at Conglomerate 0.70

Posted Sep 4, 2003 19:20 UTC (Thu) by Per_Bothner (subscriber, #7375) [Link]

I decided to go the Docbook -> LaTeX route, using an XSLT "stylesheet" to convert the DocBook file to standard LaTeX. This allowed me to use other people's LaTeX styles, including 2-column conference formats. It works pretty well. The scripts and some very terse documentation are available from here.

Screenshots are GIFs

Posted Sep 4, 2003 8:25 UTC (Thu) by stuart (subscriber, #623) [Link]

Any reason for using gifs for the screenshots over PNGs? PNG is usually smaller and PNGs look nicer as they can be in TrueColour. Oh and PNGs have not been patent encumbered.

Just a thought.
Stu.

Screenshots are GIFs

Posted Sep 4, 2003 8:30 UTC (Thu) by corbet (editor, #1) [Link]

Because I was lazy, and the patent has run out in my part of the world. Lame excuse, I know.

Screenshots are GIFs

Posted Sep 4, 2003 10:45 UTC (Thu) by piman (subscriber, #8957) [Link]

I'm guessing LWN pays for its bandwidth?

-rw-r--r-- 1 piman piman 96268 Sep 4 11:40 current.gif
-rw-r--r-- 1 piman piman 70639 Sep 4 11:40 current.png
-rw-r--r-- 1 piman piman 69017 Sep 4 11:40 current-small.png

The first PNG is the result of a simple convert(1); the second is convert + pngcrush. So you save about 28% with about 10 seconds of work.
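The saving can be checked from the byte counts in the listing above. A small sketch (the `saving_percent` helper is just for illustration):

```python
# Byte counts copied from the directory listing in the comment.
sizes = {
    "current.gif": 96268,
    "current.png": 70639,
    "current-small.png": 69017,
}


def saving_percent(original, converted):
    """Percentage by which the converted file is smaller than the original."""
    return 100.0 * (original - converted) / original


savings = {
    name: round(saving_percent(sizes["current.gif"], size), 1)
    for name, size in sizes.items()
    if name != "current.gif"
}
print(savings)
```

By this arithmetic the plain conversion saves about 26.6% and the pngcrush version about 28.3% relative to the GIF.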

Screenshots are GIFs

Posted Sep 4, 2003 11:07 UTC (Thu) by tjc (guest, #137) [Link]

Way off topic, but what window manager are you using? It looks like Red Hat's Bluecurve theme running on metacity, but the corners are messed up. Is this a Red Hat beta of something newer than 9?

Window manager

Posted Sep 4, 2003 11:17 UTC (Thu) by corbet (editor, #1) [Link]

It's pretty standard RH9 stuff, though I hacked a bunch of stuff off the desktop.

DocBook: high-level enough! XML writing thoughts...

Posted Sep 4, 2003 9:52 UTC (Thu) by denials (subscriber, #3413) [Link]

I live and breathe XML and SGML documentation formats. I'm a technical writer, and my tool of choice is still (g)Vim--the text editor par excellence from http://vim.sf.net. I, for one, do not agree with the author's statement that DocBook is meant to be produced from higher-level tools; tagging documentation with XML elements gives you the ability to create extremely well-structured documents and actually helps keep you focused on delivering a sensible structure, instead of focusing on what the headings look like at a given time. I've been forced to write chapters of a book in Word using styles, and let me tell you--I am much happier being able to say "the code sample starts here <code>... and ends here </code>" than having to painstakingly position my cursor, highlight a section, apply a style, and hope that the tagging underneath the covers got it right.

That being said, I am open to better ways of creating XML documents, so I took a look at the Conglomerate interface. I really like the way it displays the tagging; many editors put clunky graphical representations of the tags inline around the text, which makes me want to just put the damn text tags in place. Contextually-sensitive XML markup is a really nice feature, one that Emacs users have had for a while with Emacs+PSGML (but then you need to know Emacs... heh). Conglomerate looks like it does a pretty good job, and once it stabilizes maybe I'll give it a try on my Mandrake box.

BTW, the latest release candidate of OpenOffice.org 1.1 imports and exports DocBook; however, it's not perfect yet either. I took a perfectly valid HOWTO (DB2, in XML format) from the Linux Documentation Project and imported it; OOo didn't really like the metadata, but surfaced the XML tags as separate styles in the stylist. It also provides some basic WYSIWYG functionality for those tags, which is nice. Still, it would take a fair bit of cleanup to convert the output into real DocBook.

Yet another option: XAE

Posted Sep 4, 2003 15:15 UTC (Thu) by Xman (guest, #10620) [Link]

Honestly, I've come to like XAE. It has the advantage of being built on top of Emacs, so you have all those powerful editing features that Emacs has that the author sorely missed in Conglomerate.

A quick look at Conglomerate 0.70

Posted Sep 8, 2003 11:47 UTC (Mon) by mly (guest, #2171) [Link]

A completely different approach to producing nice documents and (for the time being a subset of) DocBook is to use docutils and reStructuredText.

See http://docutils.sourceforge.net/

Off-topic: Linux Device Drivers, 3rd edition

Posted Sep 10, 2003 23:38 UTC (Wed) by chiromancer (guest, #10692) [Link]

> The sample text used (Chapter Two of Linux Device Drivers, Third Edition) required significant

I failed to google out anything about 3rd edition of LDD, it seems the latest is still the 2nd. Is the 3rd in progress? Or, is it just a
typo?

Off-topic: Linux Device Drivers, 3rd edition

Posted Sep 11, 2003 6:48 UTC (Thu) by corbet (editor, #1) [Link]

The third edition is in progress; that's why I'm having to deal with DocBook again. It's going to be a little while yet...

OpenOffice

Posted Sep 11, 2003 0:32 UTC (Thu) by guym@arizona.edu (guest, #14981) [Link]

OpenOffice 1.1 can save to DocBook. If anyone has evaluated it for this purpose, I'd love to hear about it -- perhaps a future LWN report?

A reply from the maintainer...

Posted Sep 11, 2003 11:41 UTC (Thu) by dave_malcolm (subscriber, #15013) [Link]

Thanks for your great article on Conglomerate.

I have some questions for the author:

  • Looking at the screenshot - is your source document DocBook XML or is it actually DocBook SGML? There are plenty of small but fiddly differences. Conglomerate has an SGML importer but it's a bit flaky at the moment. That may have caused some of your problems.
  • Also, did your test document contain a DTD declaration? I'm guessing that it was an entity representing a single chapter from a book, and hence didn't have a DTD. Conglomerate can filter the tags that can be inserted if a DTD is available, and so hopefully the 86 tags you saw would be reduced to a comprehensible level. Also, I believe this would have allowed the use of the &ndash; entity.
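The point about entities can be illustrated outside Conglomerate. The sketch below uses Python's standard XML parser, not Conglomerate's own code, and declares the entity in a small internal DTD subset rather than pulling in the full DocBook DTD; the sample paragraph and the `try_parse` helper are made up. Without any declaration the parser rejects &ndash; as undefined, and with one it expands the entity to an en dash.

```python
import xml.etree.ElementTree as ET

# A fragment with no DOCTYPE: the parser has no definition for &ndash;.
WITHOUT_DTD = "<para>pages 10&ndash;12</para>"

# The same fragment with an internal subset that declares the entity.
WITH_DTD = (
    '<!DOCTYPE para [<!ENTITY ndash "&#8211;">]>'
    "<para>pages 10&ndash;12</para>"
)


def try_parse(text):
    """Return the element text, or a short message if parsing fails."""
    try:
        return ET.fromstring(text).text
    except ET.ParseError as error:
        return "parse error: " + str(error)


without_result = try_parse(WITHOUT_DTD)
with_result = try_parse(WITH_DTD)
print(without_result)
print(with_result)
```

A chapter stored as a separate file without its own DOCTYPE would hit the first case in any XML parser that does not see the declarations, which is consistent with the guess above.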

Looking ahead - some of the problems reported in the article should now be fixed in the CVS version, so stay posted for the next tarball (in a few days, I hope) - or grab it from CVS if you're interested in getting more involved - and please report any bugs you find here

Feel free to add feature requests this way as well; we're trying to make Conglomerate the best XML editor out there (free or otherwise).

Or join us on the development mailing list

Dave Malcolm

Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds