
OASIS to create an open office application format

The OASIS Standards Consortium has announced the creation of a "technical committee" which will develop an open, XML-based file format specification for office applications. The goal of the project, of course, is to facilitate interoperability and data exchange between applications. Should they succeed, the days of trying to reverse engineer Word files could come to an end.

It is hard to overemphasize the importance of this effort. Microsoft's office suite monopoly is based on two things: (1) that suite's feature set, and (2) the ability to exchange documents with the rest of the world. There are numerous other office suites which are closing the feature gap (though there is still some ground to cover for the free applications, to say the least). But, without the ability to easily exchange documents with MS Office users (and have them look good when they get there), adoption of alternative office suites will remain limited.

And there, of course, lies the rub. A new, XML-based office suite file format will have a rough life if Microsoft does not play along with it. It is worth pointing out that Microsoft is a member of OASIS; the company has also said that Office will use an XML-based format in the future. But the list of supporting companies in the press release (Arbortext, Boeing, Corel, Drake Certivo, and Sun) does not include Microsoft.

Even without Microsoft, standards for document data can only be a good thing. This particular standard is getting a jump start from Sun, which is contributing the OpenOffice.org format under royalty-free terms (OASIS, in general, is quite happy with RAND or UFO (uniform fee only) terms). Should the committee create a standard based on this format, the existence of a free reference implementation should encourage adoption of the standard in both free and proprietary packages.

Proprietary data formats are a problem for a number of reasons, of which proprietary lock-in is only one. Another is the ability of proprietary applications to surprise users by retaining information in documents that those users thought they had deleted (or never put there in the first place). Future historians will find that much of the documentation of this era is encoded in formats which are no longer readable. An open format for office information will not, by itself, solve any of these problems. But it sure would be a good start.



OASIS to create an open office application format

Posted Nov 21, 2002 1:14 UTC (Thu) by akumria (guest, #7773) [Link] (3 responses)

Office XP already has an XML-based format; while the schemas may not be public, it isn't too hard to derive them.

OASIS to create an open office application format

Posted Nov 21, 2002 17:14 UTC (Thu) by dbreakey (guest, #1381) [Link] (1 responses)

Just because Office XP uses an XML-based file format doesn't mean it will be significantly easier to decipher; it only simplifies and standardizes the parsing engine used to break the document down into its constituent parts.

There's nothing stopping Microsoft from defining tags whose meanings are unclear. For instance, from a purely technical POV, what stopped the originators of HTML from defining (to use an absurd example) <xylophone> as the opening tag for a paragraph, instead of <p>?

Absolutely nothing. It was simply an interest in making HTML markup 'human-readable' that led them to choose the tags they did; a computer couldn't care less what the tag is called.

Use of an XML-based file format can simplify the process of decoding the structure of a document, but it's certainly possible, if they chose to, for Microsoft to make it incredibly difficult to decode any meaningful context from that file. And what stops them from using a proprietary and exclusive compression algorithm to compress the XML file, anyway?
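To make the point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The sample document, its tag names (xylophone, zz9), its attributes, and the outline() helper are all made up for illustration; nothing here is taken from any real Office format. It shows that a generic parser recovers the full structure of a well-formed file without knowing what a single tag means.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: well-formed XML whose tag and attribute
# names were chosen to carry no obvious meaning.
OPAQUE_DOC = """<doc>
  <xylophone q="1">Hello, world.</xylophone>
  <xylophone q="2">Second block of text.</xylophone>
  <zz9 ref="a7f3"/>
</doc>"""


def outline(xml_text):
    """Return (depth, tag, attributes, text) for every element, in document order."""
    root = ET.fromstring(xml_text)
    rows = []

    def walk(elem, depth):
        # Whitespace-only text between child elements is reduced to "".
        text = (elem.text or "").strip()
        rows.append((depth, elem.tag, dict(elem.attrib), text))
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    return rows


if __name__ == "__main__":
    for depth, tag, attrs, text in outline(OPAQUE_DOC):
        print("  " * depth + f"{tag} {attrs} {text!r}")
```

The parser hands back a tidy tree, but whether xylophone is a paragraph, a heading, or a table cell, and what q or ref refer to, still has to be guessed or reverse engineered.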

So, does Office XP pull any of these tricks? I'm curious to know ...

OASIS to create an open office application format

Posted Nov 22, 2002 1:46 UTC (Fri) by titousensei (guest, #4144) [Link]

You're right: not only can unclear tags be used, but the interaction between tags is also undefined. RTF is an ASCII-based format, fully documented by MS, but impossible to use because of conflicting document layout definitions.
Moreover, there are many ways to describe how to render pages. MS's description might not be the one we are looking for. After all, RTF is not, PostScript is not, TeX is not, HTML is not, AbiWord's XML is not.

OASIS to create an open office application format

Posted Nov 28, 2002 9:46 UTC (Thu) by job (guest, #670) [Link]

Do you *know* that or are you guessing?

<xml>
<data type="MsWordDocument" license="patented,proprietary">
adOYU9fLAasfa79faglgi315oiK9rtaoiIAKLyq9wrof
</data>
</xml>

would be a really nice XML-format...

OASIS to create an open office application format

Posted Nov 21, 2002 16:41 UTC (Thu) by iabervon (subscriber, #722) [Link]

With a standard and comprehensive way of representing documents, it would be possible for different packages to use the same document converters, which would mean that every office program using this format would be at least as good at converting documents to and from other formats as the best free one. Of course, it would be nice if Microsoft used the same format to eliminate the extra step, but everybody else working together can probably reverse-engineer any format they come up with.

OASIS to create an open office application format

Posted Nov 21, 2002 19:23 UTC (Thu) by kreutzm (guest, #4700) [Link]

There are efforts under way (sorry, no URI handy at the moment) to create a standard open office format in Germany and possibly the EU, based (most likely) on the OpenOffice format. I hope these two efforts will coordinate.


Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds