Posted May 10, 2012 10:48 UTC (Thu) by stevan (subscriber, #4342)
Parent article: Who owns your data?
Here in Scotland, I have become involved in the creation of a community digital archive. It is fully free-software based (postgresql and dspace), KVM-virtualised, and all ancillary aspects, such as presentation of the archival objects, are handled by free software too. We view virtualisation as an aspect of longevity as well, and we chose dspace only after checking that scripts were available to migrate if necessary. We mandate that only non-proprietary formats are used in the Archive, and we include standard (non-Adobe-encumbered) pdf in that definition. We have had a request from historians to ensure audio is stored lossless (that's OK - flac), but when it comes to things like video, especially compressed video, it becomes more problematic. It's quite straightforward to mandate vorbis or similar, in the absence of knowing where webm is heading. The difficulty is trying to guess what is likely to be demanded of the file in the future, and to what extent current open standards are going to be transferable in future. You could argue that storing everything uncompressed is the answer, but it's not a wildly practical one. The same thing applies to images. A 6cm x 6cm negative from 50 years ago still contains a huge amount of information, but original digital images vary.
I'm not complaining - I think both free software and free and open formats are by no means lacking, and intellectually they are the way to go, but when it comes to applying them it's quite difficult to think practically into the longer term.
And when I say "we," above, I mean "I," as many of these nuances are lost on users of the Archive. It is necessary to have a story to tell to explain why docx gets a swift blow from the digital lead piping when it reaches the Archive.
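For illustration only, a format mandate of this kind could be checked at ingest with a simple allow-list. This is a minimal sketch and not the Archive's actual configuration: pdf and flac come from the policy described above, while the other extensions and the function names `is_acceptable` and `triage` are assumptions made up for the example. It also only looks at the file extension, which says nothing about whether a pdf is really free of proprietary extensions.

```python
from pathlib import Path

# Hypothetical allow-list. "pdf" and "flac" are named in the comment above;
# the remaining entries are illustrative assumptions, not the Archive's policy.
ALLOWED_EXTENSIONS = {".pdf", ".flac", ".txt", ".png", ".odt"}


def is_acceptable(filename: str) -> bool:
    """Return True if the file's extension is on the allow-list."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def triage(filenames):
    """Split submitted filenames into (accepted, rejected) lists."""
    accepted = [name for name in filenames if is_acceptable(name)]
    rejected = [name for name in filenames if not is_acceptable(name)]
    return accepted, rejected


if __name__ == "__main__":
    accepted, rejected = triage(["interview.flac", "minutes.docx", "map.pdf"])
    print("accepted:", accepted)
    print("rejected:", rejected)
```

A rejection list like this at least gives you something concrete to point at when explaining to a depositor why a docx has to be converted first.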