Who owns your data?

Posted May 10, 2012 13:55 UTC (Thu) by pboddie (subscriber, #50784)
In reply to: Who owns your data? by Comet
Parent article: Who owns your data?

They have archived copies of the Microsoft document format specifications; much as we might dislike it, the content they need to preserve is the content created by most of the populace.

Although welcome, this raises additional issues. Given this apparent safety net, people are now likely to say "Great, we're covered!" And then they will carry on churning out proprietary format content. But we are not covered.

Firstly, we don't even know if the specifications are complete or accurate. This is Microsoft we're talking about, so although it is possible that these published specifications have had some auditing as part of a regulatory action in the European Union, we can't be sure that they are usable until someone produces a separate implementation.

Secondly, people will happily start producing content in later versions of those formats which aren't covered by publicly available specifications. Again, we're talking about Microsoft, so any remedy for trouble they have managed to get themselves into will only last as long as the company is under scrutiny. Then, it's back to business as usual. Meanwhile, nobody in wider society will have been educated about the pitfalls of such proprietary formats and systems.

Thirdly, the cost of preservation under such initiatives may well be borne by the people whose data is now imprisoned in such formats, instead of the people responsible for devising the format in the first place. In various environments, there are actually standards for archiving, although I can well imagine that those responsible for enforcing such standards have been transfixed by the sparkle of new gadgetry, the soothing tones of the sales pitch, and the quick hand-over of an awkward problem to a reassuring vendor. Public institutions and the public in general should not have to make up for the vendors' lack of investment.

Finally, standards compliance is awkward enough even when standards are open and documented. One can argue that a Free Software reference implementation might encourage overdependence on a particular technology and its peculiarities, potentially undermining any underdocumented standard, but this can really only be fixed when you have a functioning community and multiple Free Software implementations: then, ambiguities and inconsistencies are brought to the surface and publicly dealt with.

Sustainable computing and knowledge management require a degree of redundancy. Mentions of the celebrated case of the BBC Domesday Project often omit the fact that efforts were made to properly document the technologies involved - it is usually assumed that nobody had bothered, which is not the case - but had that project been able to take advantage of widely supported, genuinely open standards, misplacing documentation would have had a substantially smaller impact on preservation activities.

Indeed, with open formats and appropriate licensing of the content, the output of the project might have been continuously preserved, meaning that the content and the means of deploying it would have adapted incrementally as technology progressed. That's a much more attractive outcome than sealing some notes in a box and hoping that future archaeologists can figure them out.



Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds