Microsoft: Boiling Frogs Since 1975

Posted Apr 16, 2012 22:00 UTC (Mon) by anselm (subscriber, #2796)
In reply to: Microsoft: Boiling Frogs Since 1975 by khim
Parent article: Paoli: Microsoft will engage with the open source and standards communities

And engineers absolutely can document them, but why should they if ISO is already satisfied?

The point here is that ISO wasn't actually satisfied (in a meaningful way). People identified all sorts of technical problems with the proposed OOXML standard but Microsoft had carefully arranged to (a) allow as little discussion of the technical issues as possible and (b) have various ISO countries which up to that point had not shown any interest whatsoever in the standardisation of office document formats send voting representatives who just incidentally happened to be employed by Microsoft partner companies. This is what allowed the standard to pass, not its technical excellence (which wasn't anything to write home about in the first place).

the fact that managers were able to get ISO approval means that they have done admirable job obtaining this elusive seal of approval.

There are various adjectives in the English language that one might use to describe this sort of behaviour, but at least as far as I'm concerned »admirable« is not among them. »Reprehensible« or »sleazy« would be more to the point.


Microsoft: Boiling Frogs Since 1975

Posted Apr 16, 2012 22:20 UTC (Mon) by khim (subscriber, #9252) [Link]

The point here is that ISO wasn't actually satisfied (in a meaningful way).

ISO ratified and published the standard. This finishes the story. Or rather it finishes the first part: now the task is to keep it on the back burner with just enough activity to keep it alive and not provoke ISO to withdraw its approval. This task also goes fine.

This is what allowed the standard to pass, not its technical excellence (which wasn't anything to write home about in the first place).

Sure. But all standards have shortcomings. ODF has many holes, too (just the latest wart), especially ODF 1.0 (which was ratified by ISO, remember?), so why are you so hostile WRT OOXML technical excellence?

The sad fact in the OOXML saga is the simple truth: ODF was just as immature (perhaps even more immature) when it was approved by ISO. Of course there was a significantly more urgent need because the previous ISO abomination was widely rejected by office suites, but as far as the pure technical side is concerned… no, ODF wasn't anything to write home about.

Microsoft: Boiling Frogs Since 1975

Posted Apr 16, 2012 22:47 UTC (Mon) by anselm (subscriber, #2796) [Link]

But all standards have shortcomings. ODF has many holes, too (just the latest wart), especially ODF 1.0 (which was ratified by ISO, remember?), so why are you so hostile WRT OOXML technical excellence?

Personally I would be very reluctant to ascribe any sort of »technical excellence« to a standard that takes up 7000 pages and still is not sufficient to enable third parties to implement conforming software – which ODF, for all its faults, appears to manage well, and at a fraction of the size.

Microsoft: Boiling Frogs Since 1975

Posted Apr 16, 2012 23:05 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

Have you checked the recent ODF standard with all referenced standards? It's far more than 7000 pages.

Microsoft: Boiling Frogs Since 1975

Posted Apr 16, 2012 23:31 UTC (Mon) by anselm (subscriber, #2796) [Link]

Yes, but most of that (like the XML specification) is independently useful. Where I come from, building on other existing standards is generally considered a Good Thing™.

On the other hand, Microsoft apparently came up with enough new stuff for OOXML – stuff that does not appear to be standardised elsewhere – for them to take 7000 pages to write it all down. If you add pre-existing stuff like the XML specification, which the OOXML standard also references but does not actually include, the document set gets bigger still.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 1:15 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

>Yes, but most of that (like the XML specification) is independently useful

Yeah? How about SVG, which has exactly _zero_ fully-conforming renderers? And now they've decided to play the favorite OpenSource game and rewrite SVG 2.0 almost from scratch. I worked with VML back in the early 2000s and it was many times easier to write a simple renderer for it.

Then there are questions of CSS in ODF, but my comments would be unprintable.

Frankly, supporting a subset of OOXML without crazy magic formatting options has about the same complexity as supporting ODF. Except that OOXML is internally more consistent.

Now, OOXML is clearly not ideal. But it's OK-ish, and ISO quite fairly standardized it on its technical merits.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 5:33 UTC (Tue) by khim (subscriber, #9252) [Link]

Except that OOXML is internally more consistent.

Only someone with an agenda can claim that these three lines (which mark text as red in .docx, .xlsx and .pptx) are consistent:
    <w:color w:val="FF0000"/> (.docx)
    <color rgb="FFFF0000"/> (.xlsx)
    <a:srgbClr val="FF0000"/> (.pptx)

No, OOXML is most definitely not consistent. It encodes the warts of all the legacy documents quite well (as that was the goal of its creation) but consistency? Not in the cards. It was not the goal of its creators, thus you can hardly fault them.
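To make the inconsistency concrete, here is a minimal sketch that reads the three encodings quoted above and reduces each to the same RGB triple. The namespace URIs are placeholders chosen only so that the prefixes parse, and the helper name `to_rgb` is hypothetical; the sketch also assumes that an eight-digit value carries a leading alpha byte, as in the spreadsheet example.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (an assumption): they only make the
# "w:" and "a:" prefixes parseable, they are not the real OOXML URIs.
NAMESPACES = 'xmlns:w="urn:example:w" xmlns:a="urn:example:a"'

SNIPPETS = {
    ".docx": '<w:color w:val="FF0000"/>',
    ".xlsx": '<color rgb="FFFF0000"/>',
    ".pptx": '<a:srgbClr val="FF0000"/>',
}


def to_rgb(snippet):
    """Return (r, g, b) for one of the three colour elements above."""
    root = ET.fromstring(f"<root {NAMESPACES}>{snippet}</root>")
    element = root[0]
    # Each element carries exactly one attribute, but under a different name.
    (value,) = element.attrib.values()
    # Keep the last six hex digits, dropping a leading alpha byte if present.
    hex_rgb = value[-6:]
    return tuple(int(hex_rgb[i:i + 2], 16) for i in (0, 2, 4))


colours = {kind: to_rgb(snippet) for kind, snippet in SNIPPETS.items()}
```

Three element names, three attribute names and two value layouts all end up as the same colour, so a reader needs a separate code path per document type.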

Now, OOXML is clearly not ideal. But it's OK-ish, and ISO quite fairly standardized it on its technical merits.

Note: I never said it. OOXML is Microsoft's internal format and it's one of the better documented proprietary formats. Usually ISO was quite reluctant to standardize such things because, frankly, they have only limited appeal for anyone except the primary author. Usually just some subset was standardized (such as PDF/X, PDF/A, PDF/E, PDF/VT, or PDF/UA). The fact that the full pig was pushed through ISO tarnishes its reputation in the IT industry beyond repair, but this is a separate issue from the standard's technical quality.

Microsoft: Boiling Frogs Since 1975

Posted Apr 16, 2012 23:23 UTC (Mon) by khim (subscriber, #9252) [Link]

Personally I would be very reluctant to ascribe any sort of »technical excellence« to a standard that takes up 7000 pages and still is not sufficient to enable third parties to implement conforming software – which ODF, for all its faults, appears to manage well, and at a fraction of the size.

Why should it achieve this goal if it was not designed for that? It was designed to preserve formatting of the old documents - and it does it well. I've seen plenty of documents which survive .doc⟷.docx or .xls⟷.xlsx roundtrip just fine but completely lose formatting when roundtripped as .doc⟷.odt or .xls⟷.ods. Some of them even keep formatting when opened in LibreOffice, yet they still break apart when saved as .odt or .ods!

Once again: OOXML has achieved all the goals it was designed to achieve. It's pretty compatible with legacy formats (when MS Office is used) and it's pretty compatible with other office suites (such as iWork) which are developed with Microsoft's help. Thus it obviously achieved all the goals it was set to achieve and as such is quite a professional and successful standard.

You bitch about the fact that it does not make it possible to create credible competitor for the MS Office - but this, too, was the important goal and it was successfully achieved so what's your problem?

You don't like it when standards are used to stifle competition? This is not a technical problem with the standard, sorry.

Microsoft: Boiling Frogs Since 1975

Posted Apr 16, 2012 23:37 UTC (Mon) by dlang (✭ supporter ✭, #313) [Link]

it's hard to say if it's really compatible with all the legacy formats as there isn't any software that fully implements it (even MS Office doesn't fully implement it, at best it only ever implemented some draft version)

but it's easy to define a format with 100% backwards compatibility on paper.

File format: one line of ASCII text followed by an additional block of data whose format is identified by the line of text.

the line of text can have the value

'ODF 1.0' in which case the second block is in the format specified by ODF 1.0

'Word 2007' in which case the second block is in the format of Microsoft Word 2007

etc.

In a couple of pages I could specify a format with better compatibility than anything that Microsoft has ever even thought about.

and it would be about as much use as the OOXML format in terms of actually allowing anyone to implement it
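A minimal sketch of a reader for the wrapper format described above: one ASCII line naming the payload format, then the payload itself. The function name `parse_container` and the choice of a newline as the separator are assumptions made for illustration.

```python
def parse_container(blob: bytes):
    """Split a wrapper file into (format name, payload bytes).

    Assumes the identifying line is ASCII and ends at the first newline.
    """
    header, separator, payload = blob.partition(b"\n")
    if not separator:
        raise ValueError("missing format line")
    return header.decode("ascii"), payload


# The wrapper is trivially "compatible" with anything: the payload is
# passed through untouched, so all the real work is left to the reader.
format_name, payload = parse_container(b"Word 2007\n\xd0\xcf\x11\xe0")
```

Parsing the wrapper takes a handful of lines, yet it says nothing about how to interpret the payload, which is the point being made: backwards compatibility on paper does not make a format implementable.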

your repeated claims that OOXML has achieved success only work if you completely ignore all technical measures of success and only consider political and marketing shenanigans, and even then there was enough of a stink over this that it's only a qualified success at best

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 5:13 UTC (Tue) by khim (subscriber, #9252) [Link]

but it's easy to define a format with 100% backwards compatibility on paper.

Sure. That's why I'm talking about real world documents (mostly financial), not about some abstract paper.

and it would be about as much use as the OOXML format in terms of actually allowing anyone to implement it

It'll be much harder to implement it even partially, but yes, you'll achieve the compatibility goal. ODF has not done it and thus it's not compatible. It's as simple as that.

your repeated claims that OOXML has achieved success only work if you completely ignore all technical measures of success

Bullshit. I've given you a criterion: take a bunch of old documents from some collection (private and/or government one), convert them to ODF and OOXML, convert them back, watch the result. This is a technical criterion (it was a guiding policy for the Unicode standard BTW, only there they converted plain text documents in different encodings back and forth). OOXML works (as implemented by MS Office), ODF fails (as implemented by LibreOffice). Most of the failures are caused exactly by the refusal of ODF to support the legacy warts which you so despise.
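The round-trip criterion can be sketched with the plain-text case mentioned in the parenthesis, since character encodings are easy to test without an office suite. The helper name `survives_round_trip` and the sample strings are assumptions; the ASCII converter stands in for any target format that cannot represent everything in the source, and is an analogy only.

```python
def survives_round_trip(document: bytes, convert, convert_back) -> bool:
    """True if converting to the new format and back reproduces the input."""
    return convert_back(convert(document)) == document


legacy = "»café«".encode("cp1252")  # a "legacy" document in Windows-1252

# Target that can represent everything in the source: Unicode text.
lossless = survives_round_trip(
    legacy,
    convert=lambda data: data.decode("cp1252"),
    convert_back=lambda text: text.encode("cp1252"),
)

# Target that cannot: plain ASCII, with unsupported characters replaced.
lossy = survives_round_trip(
    legacy,
    convert=lambda data: data.decode("cp1252").encode("ascii", "replace"),
    convert_back=lambda data: data.decode("ascii").encode("cp1252"),
)
```

The same harness would apply to office documents if `convert` and `convert_back` wrapped real converters and the equality check were replaced by a layout comparison.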

Microsoft: Boiling Frogs Since 1975

Posted Apr 18, 2012 21:15 UTC (Wed) by vonbrand (subscriber, #4458) [Link]

For a fair comparison, convert StarOffice documents to ODx and back, not MSFT cruft. And BTW, the "conversion" there is mostly wrapping/unwrapping, there is no real "conversion." Or use Office to convert from DOC to ODT and back.

Microsoft: Boiling Frogs Since 1975

Posted Apr 18, 2012 22:03 UTC (Wed) by khim (subscriber, #9252) [Link]

For a fair comparison, convert StarOffice documents to ODx and back, not MSFT cruft.

I don't know where your idea about “fair” comes from. There are bazillion documents created in the format of old versions of MS Office. Number of documents in StarOffice format is minuscule in comparison.

That's why OOXML was designed and presented as format usable and compatible with “old proprietary formats of MS Office”, not with “old proprietary formats of all existing programs”. This is how it was presented in ISO and this is what said format does.

If ISO accepted this goal as design criteria (and looks like it did) then indeed OOXML is the best possible solution.

And BTW, the "conversion" there is mostly wrapping/unwrapping, there is no real "conversion."

Sure, but it works. Not perfectly, but much better than ODT.

Or use Office to convert from DOC to ODT and back.

The results are usually significantly worse than with LibreOffice. That's because when LibreOffice is faced with something unsupported by ODF (but supported by old, proprietary formats) it tries to somehow change the document to fit. MS Office just drops these parts on the floor. And it's easy to understand why: LibreOffice has no choice while MS Office has a much better format (better for the purpose of editing old documents, that is): OOXML.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 0:52 UTC (Tue) by vonbrand (subscriber, #4458) [Link]

I'm sorry, but roundtrips DOC --> DOCX --> DOC done by the very same people who inflicted said formats on us don't tell anything at all.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 5:19 UTC (Tue) by khim (subscriber, #9252) [Link]

It does. It proves that such a round-trip is possible. The next step is, of course, to see if anyone else can do that. And, surprise, surprise, it's possible: LibreOffice can not save documents in OOXML format, but it can read them, and a half-round trip (convert from legacy format to OOXML using MS Office, open said OOXML in LibreOffice, then save as legacy format and open in MS Office) still works better than an ODF roundtrip.

You may dislike Microsoft's shenanigans as much as you want, but OOXML is technically the best candidate for the problem it was designed to solve. Whether such a technical problem (a standard with support for Microsoft's legacy formats) deserves ISO approval is a separate issue.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 8:32 UTC (Tue) by anselm (subscriber, #2796) [Link]

It proves that such round-trip is possible.

Considering that OOXML is mostly an XML encoding of the legacy Microsoft Office formats, anything else would be … unusual. On the other hand, Microsoft might have botched even that, the way they have botched so many other things, so it's probably best to be grateful for small miracles.

Finally, the alleviation for the »changing the printer driver changes the formatting« in Word is apparently to save your document as RTF and then load it back from the RTF file. Go figure.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 10:34 UTC (Tue) by cortana (subscriber, #24596) [Link]

The fact that changing your printer driver alters the formatting of a document kind of boggles my mind. WTF does the printer driver have to do with it? What if you have more than one printer installed?

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 13:33 UTC (Tue) by khim (subscriber, #9252) [Link]

What if you have more than one printer installed?

Then you can find out that your painstakingly composed document tested on 360dpi dot-matrix printer falls apart when you try to do a final output on 600dpi LaserJet.

The fact that changing your printer driver alters the formatting of a document kind of boggles my mind.

This is quite unfortunate, but you should remember that the first versions of MS Word were supposed to create something acceptable for a 72dpi printer! It's impossible to create anything good-looking for such a low resolution unless your document is completely built around this final output device.

And then we have this pesky backward-compatibility concerns, of course.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 14:14 UTC (Tue) by anselm (subscriber, #2796) [Link]

It's impossible to create anything good-looking for such a low resolution unless your document is completely built around this final output device.

Tell that to Donald Knuth, who came up with TeX in the late 1970s/early 1980s. At that time, TeX output looked substantially the same (modulo resolution) whether produced on a really low-res Xerox graphics printer or a several-thousand DPI phototypesetter, and certainly a lot nicer than Word output does even today. Especially when mathematics are involved. These scientists certainly gave Microsoft's »engineers« a run for their money.

Anyway, 72 dpi is a bit on the low side. Even the 24-pin dot matrix printer I used to have in the 1980s was capable of 360-dpi output in graphics mode, and was actually reasonably fast at 180 dpi. Wikipedia claims that »Word for DOS has been designed for use with high-resolution displays and laser printers, even though none were yet available to the general public«, which makes the strange printer driver issues even more mystifying considering that, even leaving TeX out of the game, there were various other word processing packages on the market at the time, for different platforms including not just DOS and the Mac but also the Apple II and Atari ST, which managed to produce output on a par with (or surpassing) Word but identically across a range of different output devices.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 16:25 UTC (Tue) by khim (subscriber, #9252) [Link]

Tell that to Donald Knuth, who came up with TeX in the late 1970s/early 1980s.

I'm pretty sure Knuth knows the limitations of TeX better than me. It was basically unusable on a 72dpi printer. If you had a high-quality printer with a real double-strike 144dpi mode then yes, it was possible to use TeX - just barely. When 24-pin 180dpi/360dpi printers arrived, TeX became a reality on the PC, but it was way too late: WordPerfect and Lotus 1-2-3 were the “established standard” by then. Later MS Office replaced them.

These scientists certainly gave Microsoft's »engineers« a run for their money.

Rilly? You mean in your universe Joe Average uses TeX and not MS Word to print his creations? This is some kind of interesting universe, I must admit. But here and now TeX is historical curiosity (even mathematical journals often use MS Word instead of TeX and general public does not even know TeX exists).

Even the 24-pin dot matrix printer I used to have in the 1980s was capable of 360-dpi output in graphics mode, and was actually reasonably fast at 180 dpi.

By that time battle was basically already won: in the era of 9-dot matrix printers TeX was doubly unusable (typical computer had no memory to run TeX and 72dpi printer generated unreadable output if you used it) and people used WordPerfect and WordStar. WordPerfect was more popular by far. Later MS Office won the crown (that's why you see so many WordPerfect warts in OOXML), but, of course, it needed compatibility to do that.

Which makes the strange printer driver issues even more mystifying considering that, even leaving TeX out of the game, there were various other word processing packages on the market at the time, for different platforms including not just DOS and the Mac but also the Apple II and Atari ST, which managed to produce output on a par with (or surpassing) Word but identically across a range of different output devices.

Which ones do you have in mind? How popular they were back then?

People rarely moved documents between computers in that era, but quality of the output was important. Even if printers had 144dpi mode it was slow and unreliable. MS Word was actually not widely used (it came later but of course it inherited warts of previous editors), but layout in most popular editors was already dependent on printer driver. Later MS Word needed to keep the status quo to be accepted.

The lesson here is simple: scientists may create truly beautiful things… which will be used by other scientists (and scientists wannabe). But engineers produce things for “real users” and this is quite different art.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 21:59 UTC (Tue) by anselm (subscriber, #2796) [Link]

I'm pretty sure Knuth knows the limitations of TeX better than me. It was basically unusable on a 72dpi printer.

The problem, if there was one at all, wasn't one with TeX – it is really that the Computer Modern fonts (at the time essentially the only game in town) don't look their best on low-resolution output devices, and as far as CM is concerned, 600 dpi is »low resolution«.

TeX itself was perfectly capable of producing good-looking output even on 72dpi printers (I think you mean 9-pin dot matrix printers), within the limits of the device, if you were willing to wait for the output. It was also possible to adapt Knuth's line breaking/spacing algorithm from TeX to optimise output on dot matrix printers using the built-in fonts at text-printing speed (the relevant paper was published in Software Practice & Experience, I forget the correct citation); I spent quite a lot of time playing with this, way back then, and always wondered why the word processor people wouldn't pick this up.
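The idea of choosing line breaks for a whole paragraph at once, rather than greedily line by line, can be sketched as follows. This is a simplified dynamic-programming variant, not Knuth's actual algorithm and not the method from the paper mentioned above: it assumes a fixed-width font (as with a dot matrix printer's built-in fonts), penalises each line except the last by the square of its unused width, and the name `break_lines` is hypothetical.

```python
def break_lines(words, width):
    """Break words into lines of at most `width` characters, minimising
    the sum of squared trailing space over all lines except the last."""
    if any(len(word) > width for word in words):
        raise ValueError("a word is wider than the line")
    n = len(words)
    best = [0] + [float("inf")] * n   # best[j]: minimal cost for first j words
    prev = [0] * (n + 1)              # prev[j]: where the last line starts
    for j in range(1, n + 1):
        length = -1                   # no space before the first word
        for i in range(j, 0, -1):     # candidate last line: words[i-1:j]
            length += len(words[i - 1]) + 1
            if length > width:
                break
            cost = 0 if j == n else (width - length) ** 2
            if best[i - 1] + cost < best[j]:
                best[j] = best[i - 1] + cost
                prev[j] = i - 1
    lines = []
    j = n
    while j > 0:                      # walk the break points backwards
        lines.append(" ".join(words[prev[j]:j]))
        j = prev[j]
    return lines[::-1]


lines = break_lines("aaa bb cc ddddd".split(), 6)
```

A greedy breaker would fill the first line with "aaa bb" and leave "cc" alone on the second; the whole-paragraph version returns "aaa", "bb cc", "ddddd", trading a slightly short first line for a much fuller second one.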

The popularity (or not) of TeX is a non-issue here; you claimed that it was impossible to produce good-looking output on a low-res device without tying one's software to that device, and I cited TeX as a counter-example. A single counter-example disproves »impossible«. I win.

Which ones do you have in mind?

In the late '80s I (like many other students at my university) used an Atari ST, and people would always prepare documents at home and print them at the university because the laser printers there were a lot more convenient than the dinky dot matrix printers they had at home. I don't recall people complaining that Signum or StarWriter (the great-great-grandfather of today's LibreOffice) would mess up the formatting when a file was moved from one computer to the other. By that time I was a TeX user, anyway, so it was a non-issue for me. But even earlier on the Apple II it made no difference whether you printed stuff on an Epson FX-80 or a Centronics 737 (which were the two printers I was using at the time); they would look subtly different because the built-in fonts were different, but the spacing, page breaks, etc. would of course be identical – which is something that Word apparently hasn't nailed in 2012.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 23:10 UTC (Tue) by khim (subscriber, #9252) [Link]

TeX itself was perfectly capable of producing good-looking output even on 72dpi printers (I think you mean 9-pin dot matrix printers), within the limits of the device, if you were willing to wait for the output.

You are mixing draft mode (fast, 72dpi, one pass) with NLQ mode (slow, 144dpi, two passes). Cheap printers cheated and had no “real” NLQ at all (they just passed over the same dots a second time - which produced a darker picture but kept the resolution @ 72dpi) and more expensive ones were about 5 times slower in NLQ mode, thus most documents were printed in draft mode (only letters sent by mail were sometimes printed in NLQ mode).

you claimed that it was impossible to produce good-looking output on a low-res device without tying one's software to that device, and I cited TeX as a counter-example

Yup. At 72dpi TeX produces unreadable text. You either need to use “large, friendly letters” (often not an option because maximum size of letter was limited) or you needed to use slower 144dpi mode - which was also often not an option (even if you had it on your device it was too slow).

A single counter-example disproves »impossible«.

Right. But you don't have it. TeX is unusable @ 72dpi thus it's obviously not a proper counter-example. And 72dpi is important: Mac used this dpi on screen - exactly to make it easier to produce best possible WYSIWYG for these printers.

But even earlier on the Apple II it made no difference whether you printed stuff on an Epson FX-80 or a Centronics 737 (which were the two printers I was using at the time); they would look subtly different because the built-in fonts were different, but the spacing, page breaks, etc. would of course be identical.

Of course these will be identical! They have identical resolutions and identical printer fonts (as far as spacing is concerned). MS Word handles such cases just fine - and always did. It's only when you go from dot-matrix @ 144dpi to dot-matrix @ 360dpi or to laser @ 300✕Ndpi you get real problems.

I don't recall people complaining that Signum or StarWriter (the great-great-grandfather of today's LibreOffice) would mess up the formatting when a file was moved from one computer to the other.

Not sure about Signum, but StarWriter had the same problem as TeX: it produced ugly-looking documents on low-resolution devices. This was probably one reason for why the students tried to print on laser printer whenever possible.

The popularity (or not) of TeX is a non-issue here

It's absolutely the issue here. TeX was not initially popular because it was unusable on low-res output devices, and when high-res devices became available people continued to use what they were accustomed to.

1001st repeat of the same story: scientists claim they won because their solution is more technically advanced and engineers claim they won because they created a solution which is actually used by people. Shows the difference between a good scientist and a good engineer quite nicely, doesn't it?

Microsoft: Boiling Frogs Since 1975

Posted Apr 18, 2012 7:03 UTC (Wed) by anselm (subscriber, #2796) [Link]

TeX is unusable @ 72dpi thus it's obviously not a proper counter-example.

That's what you say. This is what is called the »No true Scotsman« logical fallacy.

TeX output at 72dpi is quite readable as far as I'm concerned. Stuff using Computer Modern is not exactly beautiful at 72dpi but that (as I said) is mostly a font problem, not a TeX problem. I spent considerable time in the 1980s writing a DVI previewer for the Atari ST, which had approximately that screen resolution, and it worked fine.

TeX was not initially popular because it was unusable on low-res output devices

I think that was more to do with the fact that in the early to mid '80s there was no good cheap/free TeX implementation for the PCs of the day. TeX was fine on real computers. Also, TeX was originally meant for typesetting books, not office correspondence, so the observation that it didn't catch on for office correspondence does not detract from the fact that its output was basically OK, which is the original issue.

Microsoft: Boiling Frogs Since 1975

Posted Apr 17, 2012 16:01 UTC (Tue) by oldtomas (guest, #72579) [Link]

"The fact that changing your printer driver alters the formatting of a document kind of boggles my mind."

It boggled mine too -- at the time. But I can confirm that. Quite a while ago, a friend of mine (to whom I played the role of "local friendly hacker") upgraded from a dot matrix printer (360 dpi, remember those?) to an inkjet (300 dpi).

His carefully written tables kind of exploded :-)

Turns out that the internal measuring unit of Word used to be "whatever The Printer is able to resolve", in a typically Microsoftish nonchalant way.

So tab stops could, with some luck, be at such points that the different resolution pushed some material one tab stop further -- some not.
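A small worked example of how measuring in printer dots can move text across a tab stop. The model is hypothetical and much simpler than anything Word actually did: every character has a nominal advance of 70 thousandths of an inch, each advance is rounded up to whole printer dots, and tab stops sit every half inch.

```python
from fractions import Fraction


def advance_in_dots(advance_mils: int, dpi: int) -> int:
    """Character advance rounded up to whole dots (integer ceiling)."""
    return -(-advance_mils * dpi // 1000)


def next_tab_stop(text_len, advance_mils, dpi, tab_interval=Fraction(1, 2)):
    """Position in inches of the first tab stop after the end of the text."""
    end = Fraction(text_len * advance_in_dots(advance_mils, dpi), dpi)
    stops_passed = end // tab_interval      # whole tab stops already reached
    return (stops_passed + 1) * tab_interval


# 14 characters at 360 dpi: 26 dots each, 364 dots, just over 1 inch.
stop_360 = next_tab_stop(14, 70, 360)
# 14 characters at 300 dpi: 21 dots each, 294 dots, just under 1 inch.
stop_300 = next_tab_stop(14, 70, 300)
```

Under these assumptions the same 14 characters end just past the one-inch mark at 360 dpi and just before it at 300 dpi, so the following column lands at 1.5 inches on one printer and at 1 inch on the other.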

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds