
Converting Between XML and JSON (O'Reilly)

Stefan Goessner discusses the conversion between XML and JSON (JavaScript Object Notation) in an O'Reilly article. "More and more web service providers seem to be interested in offering JSON APIs beneath their XML APIs. One considerable advantage of using a JSON API is its ability to provide cross-domain requests while bypassing the restrictive same domain policy of the XmlHttpRequest object. On the client-side, JSON comes with a native language-compliant data structure, with which it performs much better than corresponding DOM calls required for XML processing. Finally, transforming JSON structures to presentational data can be easily achieved with tools such as JSONT."

Converting Between XML and JSON (O'Reilly)

Posted Jun 8, 2006 9:16 UTC (Thu) by xoddam (subscriber, #2322)

The translation from XML to JSON presented in the article doesn't
preserve some essential features of the source language.

Nested tags are, for some reason, treated as though they were
*attributes* (with special and ugly ["@names"]) of the parent tag, thus
losing their ordering; and textual content between opening and closing
tags is treated as another special ["#text"] attribute whose value is a
string. This makes the use of text containing tags (that is, *markup*!
What is XML for exactly?) even more problematic. The suggested preferred
solution is to keep the nested tags in XML format!
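To make the ordering complaint concrete, here is a minimal Python sketch. The function naive_to_obj is a hypothetical, simplified stand-in for a keyed mapping in which child tags become named properties and text is collected under ["#text"]; it is not the article's actual code. Two documents that differ only in the order of their inline tags come out as the same object:

```python
import xml.etree.ElementTree as ET


def naive_to_obj(elem):
    """Hypothetical keyed mapping: each child becomes a property named after its tag."""
    obj = {}
    # All text fragments are merged into one string, so their positions
    # relative to the child tags are forgotten.
    text = (elem.text or "") + "".join(child.tail or "" for child in elem)
    if text:
        obj["#text"] = text
    for child in elem:
        obj.setdefault(child.tag, []).append(naive_to_obj(child))
    return obj


first = naive_to_obj(ET.fromstring("<p>one <b>two</b> three <i>four</i></p>"))
second = naive_to_obj(ET.fromstring("<p>one <i>four</i> three <b>two</b></p>"))

# Object members carry no order, so the two different documents compare equal.
print(first == second)
```

Once the markup has been flattened this way, nothing in the result says which tag came first or where the text sat between them.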

That just doesn't make sense to me. XML attributes are simple
properties: their values are simple text, must be unique and are
order-independent, so it makes sense to implement them as key/value pairs
in a dictionary (the JSON term is 'object').

The *content* between the opening and closing tag, on the other hand, is
ordered and may have arbitrary nested tags (objects) interspersed with
textual data, so the only sensible value for the special content
attribute (which I wouldn't call ["#text"] but maybe ["#content"]) is a
*list* (JSON 'array'), consisting of strings (text) and objects (tags)
interspersed, preserving order. Each tag object obviously then requires
a ["#tagname"] property, since it is not otherwise referenced by name.
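The structure described above can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the function names element_to_obj and xml_to_json are made up for this example, attributes are stored under their own names (they cannot collide with "#tagname" or "#content", since "#" is not legal in an XML name), and the standard library parser used here silently drops comments and processing instructions.

```python
import json
import xml.etree.ElementTree as ET


def element_to_obj(elem):
    """Turn one element into a dict with "#tagname", its attributes, and "#content"."""
    obj = {"#tagname": elem.tag}
    # Attributes are unique and order-independent: plain key/value pairs.
    obj.update(elem.attrib)
    # Content is an ordered list of strings (text) and dicts (nested tags).
    content = []
    if elem.text:
        content.append(elem.text)
    for child in elem:
        content.append(element_to_obj(child))
        if child.tail:
            # Text that follows the child, still inside this element.
            content.append(child.tail)
    obj["#content"] = content
    return obj


def xml_to_json(xml_text):
    """Parse an XML string and serialize the resulting structure as JSON."""
    return json.dumps(element_to_obj(ET.fromstring(xml_text)))


print(xml_to_json('<p class="x">Hello <b>big</b> world</p>'))
```

Mixed content survives because the text before, between, and after the nested tags stays in its original position in the list.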

A JSON structure built this way has a clear one-to-one correspondence
with the XML source. If anything else makes more sense for a particular
application, this probably indicates that XML wasn't such an appropriate
application language in the first place!
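As a rough check on the correspondence, here is a sketch of the reverse direction: rebuilding XML text from a structure of the shape described in this comment. The name obj_to_xml is hypothetical, and the sketch assumes every attribute value is a string. The round trip is exact only for elements, attributes, and text; details such as attribute quoting style, comments, and processing instructions are not represented.

```python
from xml.sax.saxutils import escape, quoteattr


def obj_to_xml(obj):
    """Serialize a {"#tagname", attributes..., "#content"} dict back to XML text."""
    tag = obj["#tagname"]
    # Every key other than the two special ones is treated as an attribute.
    attrs = "".join(
        f" {name}={quoteattr(value)}"
        for name, value in obj.items()
        if name not in ("#tagname", "#content")
    )
    parts = []
    for item in obj.get("#content", []):
        if isinstance(item, str):
            parts.append(escape(item))      # text node
        else:
            parts.append(obj_to_xml(item))  # nested tag
    body = "".join(parts)
    if not body:
        return f"<{tag}{attrs}/>"
    return f"<{tag}{attrs}>{body}</{tag}>"


doc = {
    "#tagname": "p",
    "class": "x",
    "#content": ["Hello ", {"#tagname": "b", "#content": ["big"]}, " world"],
}
print(obj_to_xml(doc))
```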

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds