
# MathML: horrible

Posted Apr 28, 2011 14:17 UTC (Thu) by danielpf (subscriber, #4723)
Parent article: MathML, Firefox, and Firemath

What a terrible syntax, and for a rather poor result. The comparison table between TeX and MathML shows how far MathML is from TeX (to expert eyes the MathML rendering is awful, almost à la Word). Obviously MathML is nowhere near improving on the work of Donald Knuth.

The given example

<msup> <mi>x</mi> <mn>2</mn></msup>

is expressed in TeX (switching from text mode) as

$x^2$

which is identical (except for the \$'s) to the convention of most computer algebra languages (Maxima, Maple, etc.) which don't need a special notation to know about the functional meaning (even bulkier in the given example).

I get the impression that once more some people need to reinvent the wheel and dismiss past experience, resulting after years of work in a big failure IMHO.

MathML: horrible

Posted Apr 28, 2011 17:40 UTC (Thu) by cladisch (✭ supporter ✭, #50193) [Link]

One of the primary goals of XML is to make its processing by computers easier; XML is not intended to be written by humans.

MathML rendering quality depends on the implementation. There are MathML-to-TeX converters, and it would be possible to use TeX's formatting rules in Firefox.
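To make the idea of a MathML-to-TeX converter concrete, here is a minimal sketch in Python. It is not any existing converter: `mathml_to_tex` and `convert` are hypothetical names, only a handful of presentation elements are handled, and the input is assumed to carry no XML namespace prefix.

```python
import xml.etree.ElementTree as ET


def mathml_to_tex(node):
    """Convert a tiny subset of presentation MathML to TeX (hypothetical helper)."""
    tag = node.tag
    # Convert children first; token elements such as <mi> have none.
    children = [mathml_to_tex(child) for child in node]
    if tag in ("mi", "mn", "mo"):
        return (node.text or "").strip()
    if tag == "msup":
        base, exponent = children
        return f"{base}^{{{exponent}}}"
    if tag == "msub":
        base, index = children
        return f"{base}_{{{index}}}"
    if tag == "mfrac":
        numerator, denominator = children
        return f"\\frac{{{numerator}}}{{{denominator}}}"
    if tag in ("mrow", "math"):
        return " ".join(children)
    raise ValueError(f"unsupported element: {tag}")


def convert(source):
    """Parse a MathML string (assumed namespace-free) and return TeX."""
    return mathml_to_tex(ET.fromstring(source))


print(convert("<msup> <mi>x</mi> <mn>2</mn></msup>"))  # x^{2}
```

Because the markup is already a tree, the converter is a plain recursive walk with no tokenizer or grammar of its own, which is the kind of machine processing the comment refers to.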

MathML: horrible

Posted Apr 28, 2011 20:09 UTC (Thu) by n8willis (editor, #43041) [Link]

How is the syntax "terrible" precisely? Solely because it is not as compact? In any event, one thing I did not touch on in the main story is that presentation MathML is designed to express *notation*. The x² example might be the equivalent of x*x, as a computer algebra system would interpret x^2, but it might also mean the second element in a tensor named x in Einstein notation, the charge of a particle, or any number of other things. By being neutral on that, MathML is as a result more flexible.

Nate

MathML: horrible

Posted Apr 28, 2011 21:05 UTC (Thu) by viro (subscriber, #7872) [Link]

The same is true for TeX; as a matter of fact, $$\Gamma^i_{kl} = \frac{1}{2}g^{im}(g_{mk,l} + g_{ml,k} - g_{kl,m})$$ is how you normally spell the formula for Christoffel symbols. I hate to think what that would turn into in your notation... caret-something is just "make something an upper index", no more and no less. No semantics attached; it's up to the reader of the resulting text how to interpret the damn thing.

The thing is, TeX is just a notation as well - one designed by a guy who took care to check how that kind of stuff was expressed in the real world, i.e. by typesetters dealing with math texts. As for the flexibility - meh... S-expressions are just as flexible and require as little parsing as that piece of bloat. But then *ML is The Wave Of Future(tm) and S-exp is not, so who cares if it's vomit-inducing...
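The S-expression comparison can be illustrated with a short Python sketch that rewrites a MathML tree as a nested S-expression. The function name `to_sexp` is hypothetical, and the sketch assumes attributes and tail text can be ignored, which holds for the article's example but not for MathML in general.

```python
import xml.etree.ElementTree as ET


def to_sexp(node):
    """Render an element tree as an S-expression (hypothetical helper).

    Attributes and tail text are deliberately dropped in this sketch.
    """
    text = (node.text or "").strip()
    parts = [node.tag]
    if text:
        parts.append(text)
    parts.extend(to_sexp(child) for child in node)
    return "(" + " ".join(parts) + ")"


tree = ET.fromstring("<msup> <mi>x</mi> <mn>2</mn></msup>")
print(to_sexp(tree))  # (msup (mi x) (mn 2))
```

Both forms carry the same tree; the S-expression simply omits the repeated closing tag names.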

MathML: horrible

Posted Apr 28, 2011 23:40 UTC (Thu) by n8willis (editor, #43041) [Link]

I'm confused by what you mean when you refer to "my notation" -- ostensibly that should mean MathML (which, for the record, I did not create in the slightest), but then you go on to take a swipe at the caret, which is not MathML at all.

In the first paragraph, you seem to be saying that TeX, like MathML, is semantics-free, which is true. Presentation MathML and TeX notation are equivalent there. Don't forget, though, that MathML also encompasses Content MathML, which, as is explained in the 3.0 docs, is aligned with OpenMath, which *is* a semantic encoding. In *all* non-semantic encodings, it is always left up to the reader to "interpret the damn thing" (and there is nothing you as an author can do to prevent someone, somewhere, from misunderstanding you). The same is true with words, is it not?
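The difference between the two flavours can be sketched in a few lines of Python. The Content MathML string below uses the standard `apply`, `power`, `ci` and `cn` elements; the `evaluate` function is a hypothetical toy that covers only `power` and `plus`, and is not part of any MathML library.

```python
import xml.etree.ElementTree as ET

# Presentation MathML: says only "x with a raised 2".
PRESENTATION = "<msup><mi>x</mi><mn>2</mn></msup>"
# Content MathML: says "apply the power operator to x and 2".
CONTENT = "<apply><power/><ci>x</ci><cn>2</cn></apply>"


def evaluate(node, variables):
    """Evaluate a tiny Content MathML subset (hypothetical helper)."""
    if node.tag == "cn":
        return float(node.text)
    if node.tag == "ci":
        return variables[node.text.strip()]
    if node.tag == "apply":
        # The first child names the operator, the rest are its operands.
        operator, *operands = list(node)
        values = [evaluate(operand, variables) for operand in operands]
        if operator.tag == "power":
            return values[0] ** values[1]
        if operator.tag == "plus":
            return sum(values)
    raise ValueError(f"no semantics defined for <{node.tag}>")


print(evaluate(ET.fromstring(CONTENT), {"x": 3.0}))  # 9.0
```

Passing `PRESENTATION` to the same function raises `ValueError`: nothing in `msup` says whether the raised 2 is an exponent, a tensor index, or a charge.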

But I'm also confused by what you're trying to say in the second paragraph, where you appear to be knocking MathML again, this time on the grounds that it is not connected to the "real world" (meaning, apparently, solely printed books and journal articles?). It is clear, is it not, that MathML is a _web_ publishing technology? The syntax is inherited from HTML, and you're certainly free to love or hate HTML, but its regularity is what makes things like the linking and CSS styling mentioned earlier possible.

I'd suggest you take a look at Mozilla's MathML "demo" pages for a more detailed discussion on how MathML's alignment with HTML and the DOM make it a superior fit for publishing mathematics _on_the_web_. There are more examples there than I had room to discuss. No matter how much you love TeX, you can't enable tooltip-style hover annotations in a printed & bound manual -- even if it was typeset with TeX. I'm not sure you could embed a link in the middle of a TeX-formatted equation in any CMS currently supporting the format, but I wouldn't want to be the first one to try it.

Finally, like most of us I have endless respect for Donald Knuth, but let's not let that cause us to fall into the trap of pretending that TeX's mathematical typesetting is free of errors or the need for manual adjustment when expressing a complex equation or notation. In particular, when left on automatic, TeX often chooses less-than-ideal sizes for tokens in multi-level stacks or continued fractions, and as Mozilla documents on the <mo> tracking page (here: http://www.mozilla.org/projects/mathml/demo/mo.xhtml), TeX is only capable of producing symmetric fences.

Nate

MathML: horrible

Posted Apr 29, 2011 0:01 UTC (Fri) by Comet (subscriber, #11646) [Link]

Shortly after I gave up on the idea of using MathML for a small project, simply because the MathML2 docs drove me to drink, someone on a mailing-list started a rant about XML schema abuse, to which my response was:

<paragraph><sentence><word pos="pronoun">You</word><word pos="verb"
subpos="auxilliary">should</word><word pos="verb">be</word><word
<word pos="pronoun">this</word><word pos="verb">is</word><word
<punctuation>.</punctuation></sentence></paragraph>

*That* is what is wrong with MathML -- this contrived example of abusive English markup is what MathML does to molest any poor innocent equation it encounters.

XML: horrible

Posted May 3, 2011 16:36 UTC (Tue) by i3839 (guest, #31386) [Link]

No, that is what is wrong with *all* XML based crap.

MathML: horrible

Posted May 4, 2011 15:02 UTC (Wed) by bronson (subscriber, #4806) [Link]

Bravo! Very well said. It seems like the people driving MathML don't actually like math very much but they adore XML.

MathML: horrible

Posted May 5, 2011 15:16 UTC (Thu) by PaulTopping (guest, #74605) [Link]

You guys are the crazy ones. MathML and all things XML are computer representations. That means that computer programs are the ones that are supposed to read and write them, not humans.

MathML: horrible

Posted May 5, 2011 15:20 UTC (Thu) by mjg59 (subscriber, #23239) [Link]

Then why isn't it a binary representation?

MathML: horrible

Posted May 6, 2011 7:15 UTC (Fri) by cladisch (✭ supporter ✭, #50193) [Link]

One of the goals of XML is to be compatible with SGML, which was designed to be written by humans. (Not very successfully.)

MathML: horrible

Posted May 6, 2011 13:03 UTC (Fri) by mjg59 (subscriber, #23239) [Link]

In fact, http://www.w3.org/TR/xml11/#sec-intro clearly states that XML is intended to be human legible.

MathML: horrible

Posted May 6, 2011 16:08 UTC (Fri) by bronson (subscriber, #4806) [Link]

Crazy? Look around, Paul. Most XML is edited by hand: xhtml, spring configs, ant files, pom files (editors exist but are horrid), docbook, etc etc etc.

This is true of mathml too, where the few automated editors that I can try are even more painful to use than hand-editing, presentation-only, and often get it wrong.

MathML: horrible

Posted May 11, 2011 10:00 UTC (Wed) by oldtomas (guest, #72579) [Link]

This is one of the worst things about the XMLs. If someone says "unreadable!" (and (s)he is right!) the apologists answer: "it's meant to be machine-readable, not human readable". If someone says "redundant!" (WTF do you have to repeat the whole tag name on close? Why are there *two* quote characters for attribute values? Why the F don't you have a plain straight regular escape syntax in attribute values, but have to mis-use entity syntax for that (entities are meant for completely different things, remember!), creating the mess with "internal" vs. "external" entities? On and on!) -- then the apologists retort: "it's to catch (human) input errors!".

I have another theory: XML is a denial-of-service attack on all of us.

OK, to be more precise: XML is passable as a (traditional) document representation language (family). As a data representation language it's plain perverse.

And yes, the line between document and data is somewhat broad, but for maths (and music, and vector graphics), I'd say we're clear off this line... firmly on the "data" side.

MathML: horrible

Posted May 12, 2011 18:59 UTC (Thu) by PaulTopping (guest, #74605) [Link]

XML is plain text so that it can be easily interchanged between computers, applications, etc. It is true that some people have created mini XML-based languages that they expect people to type. This works ok for them because their language and what it describes consists of a few named options with values. As you point out here, even for this application XML is not a great language. For something like math notation, the things it represents are complicated and its XML representation is complicated as a result. In short, if the thing being described is simple, typing XML is ok but not great. If the thing being described is complex, like math, typing it is ridiculous. Most uses of XML are of the latter kind and are completely behind the scenes.

MathML: horrible

Posted Apr 29, 2011 3:14 UTC (Fri) by PaulTopping (guest, #74605) [Link]

In comparing TeX to MathML it is easy to make a category error. TeX is a human input language. In other words, it was designed to be typed by humans. MathML is a computer representation and was intended as an internal representation and not something for mere mortals to type. RTF (Rich Text Format) is similar: it is text and can be typed, but it is not intended to be typed. RTF and all XML languages, even HTML, are not really designed to be input languages. That is not their forte.

MathML: horrible

Posted May 4, 2011 15:56 UTC (Wed) by danielpf (subscriber, #4723) [Link]

I don't buy that. If MathML was intended to be read only by computers then a proper design would care about efficiency and MathML would be coded in some kind of portable binary format, like Java code. If MathML was designed to be read by some humans, then it is absurd to invent a hard to read notation just to simplify the MathML parser and interpreter.

MathML: horrible

Posted May 12, 2011 20:05 UTC (Thu) by PaulTopping (guest, #74605) [Link]

You should read some of the more general introductions to XML. Wikipedia might be a good start. These days with fast computers and connections, but expensive software development, we choose to represent data in XML so that common software tools can be used to operate on them. XML is a text format for ease of interchange and ease of development.

For those who want speed, you can zip XML files and they get really compact. This is what Microsoft's .docx format does. The W3C is also working on a compression scheme that is specific to XML that will probably do even better than zip compression. What is cool about it is that all XML-based file formats will take advantage of it. All XML tools will be updated to deal with the compression/decompression and will work on all those formats.
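A rough way to check the compactness claim is to compress some repetitive markup and compare sizes. The Python sketch below uses `zlib`, which implements DEFLATE, the method commonly used inside ZIP containers. The sample document and the `compression_ratio` helper are made up for illustration; real files are less repetitive and will compress less well.

```python
import zlib


def compression_ratio(text):
    """Return compressed size divided by original size (hypothetical helper)."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)


# Artificial sample: the article's <msup> example repeated 200 times.
document = "<math>" + "<msup><mi>x</mi><mn>2</mn></msup>" * 200 + "</math>"

ratio = compression_ratio(document)
print(f"{len(document)} bytes, compressed to {ratio:.1%} of the original size")
```

The repeated tag names that make XML verbose are also what a dictionary-based compressor removes most easily, so the size penalty mostly disappears once the data is zipped.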