
Accessible? Yeah, right.

Posted Mar 25, 2009 10:24 UTC (Wed) by khim (subscriber, #9252)
In reply to: Huh? by pboddie
Parent article: Stallman: the JavaScript trap

Frequently, document formats can be converted to purely textual formats and remain accessible.

Have you actually tried to convert a document with a lot of rich data (tables, graphs, etc.) to a textual format? It's as legible as text on websites with JavaScript turned off: you can decipher the content... sometimes... if you are lucky... But in general it's illegible.

I haven't studied PDF in depth, but it would appear to be a lot more like a genuine document format, despite various programmatic extensions for things like form filling, than PostScript.

It was true some time ago. Recent versions include an ECMAScript interpreter - and you can do a lot with it... actually some tools already use this capability. And a lot of texts are only available in PostScript form.

Sure, the text in a PostScript document is "in there somewhere", but you don't really want to be given the job of writing a program to get at it.

It's not even necessarily true. Have you tried to work with a PDF created from TeX not via pdftex, but via the "good old" tex->dvi->ps->pdf route? It's a mess. The DVI to PS conversion turns "Hello world!" into something like !"##$%&$'#() and there is no easy way to get legible text back (it's done not out of malice but because it was easier: to reduce the size of the PS file dvips will create special fonts with glyphs: the first letter in the document will be put as " ", the next one as "!" and so on - when all 95 positions in the first font are filled the second one is started). Thus the resulting PS (and then PDF) can be viewed and printed but that's it - you cannot easily pull the text back... It's easy for any decent cryptanalyst, but not for a normal person...
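
The slot assignment described above can be illustrated with a toy model. This is only a sketch of the scheme as the comment describes it, not dvips's actual implementation; the names `scramble`, `first_code` and `slots` are made up for the illustration. Starting at "!" (0x21) rather than " " happens to reproduce the sample string quoted in the comment.

```python
def scramble(text, first_code=0x20, slots=95):
    """Give every new glyph the next free slot, opening a new font when one fills up.

    Returns the re-encoded text as (font number, character) pairs together
    with the mapping table, which is exactly the piece that the viewer of
    the final file does not get in readable form.
    """
    table = {}   # original character -> (font number, code point)
    out = []
    for ch in text:
        if ch not in table:
            n = len(table)
            table[ch] = (n // slots, first_code + n % slots)
        font, code = table[ch]
        out.append((font, chr(code)))
    return out, table


encoded, table = scramble("Hello world!", first_code=0x21)
visible = "".join(ch for _, ch in encoded)

# Getting the text back is trivial only if you still have the table.
inverse = {slot: ch for ch, slot in table.items()}
recovered = "".join(inverse[(font, ord(ch))] for font, ch in encoded)
```

Without the table, what remains is a simple substitution cipher per font, which is why the text is recoverable in principle but not by ordinary copy and paste.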

In contrast, HTML documents should generally preserve the accessibility of their content.

Yeah, it was the idea behind HTML. But, like PDF, HTML evolves and this idea is in the past. Today HTML is treated like a "new PostScript": you have the original version of the content somewhere, but what the site actually serves is not an easily parseable document but more like an opaque program for the web browser...
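
A minimal sketch of the difference, using only Python's standard `html.parser`. The two page strings, the `/api/report` path and the `render`/`fetchData` calls inside the script are invented for the illustration; they do not come from any real site.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text a reader would see, ignoring script and style bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth of enclosing <script>/<style> elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical pages: one carries its content, the other only carries code.
STATIC_PAGE = "<html><body><h1>Report</h1><p>Sales rose 4%.</p></body></html>"
SCRIPTED_PAGE = (
    '<html><body><div id="app"></div>'
    '<script>render(fetchData("/api/report"));</script></body></html>'
)

static_text = visible_text(STATIC_PAGE)
scripted_text = visible_text(SCRIPTED_PAGE)
```

The static page yields its text directly, while the scripted page yields nothing at all: the content only exists after a browser has run the script.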

Try saving the page source in a JavaScript-intense application - you won't get anything meaningful, even though getting the content being shown is a legitimate thing to do. That's why the ability to control and modify the code has become an important and desirable thing to do.

You lost me at the last step. Why is this ability not important and desirable for PostScript and PDF but suddenly important and desirable for HTML? If HTML is "a new PostScript" then it should be treated as such: demand content in easy to use and understand formats (like ODS or even "simple HTML" with just a few markup tags), don't try to turn the sausage back into a cow...



Accessible? Yeah, right.

Posted Mar 25, 2009 19:20 UTC (Wed) by pboddie (guest, #50784) [Link] (9 responses)

It was true some time ago. Recent versions include an ECMAScript interpreter - and you can do a lot with it... actually some tools already use this capability.

That's why I wrote "despite various programmatic extensions for things like form filling" which is where one usually sees these features.

In contrast, HTML documents should generally preserve the accessibility of their content.
Yeah, it was the idea behind HTML. But, like PDF, HTML evolves and this idea is in the past. Today HTML is treated like a "new PostScript": you have the original version of the content somewhere, but what the site actually serves is not an easily parseable document but more like an opaque program for the web browser...

But isn't this part of the problem? People have decided to subvert the original objectives of the Web in order to use it as yet another opaque platform.

Try saving the page source in a JavaScript-intense application - you won't get anything meaningful, even though getting the content being shown is a legitimate thing to do. That's why the ability to control and modify the code has become an important and desirable thing to do.
You lost me at the last step. Why is this ability not important and desirable for PostScript and PDF but suddenly important and desirable for HTML?

I wasn't saying it wasn't desirable for PostScript and PDF. Various programs do a reasonable job at, for example, copying text from those kinds of documents, but the effort required is substantial and the results not necessarily reliable.

If HTML is "a new PostScript" then it should be treated as such: demand content in easy to use and understand formats (like ODS or even "simple HTML" with just a few markup tags), don't try to turn the sausage back into a cow...

But the point is that HTML isn't supposed to be a new PostScript, and HTML plus CSS isn't anything comparable to PostScript. I think that even our verbose guest contributor asserting that JavaScript is "content" can accept that. A principal benefit of the Web is that your data (the actual content) is supposed to be delivered to you in a way that makes it relatively easy to access (like a "view source" function actually working). I think that our verbose contributor could acknowledge that JavaScript changes all that.

However, the genie is out of the bottle, and people are turning the Web into yet another platform where the data is locked away behind code which, as Stallman points out, you might not be able to improve or to fix. In your terminology: there's only sausage on the menu. Again, I think Stallman sees the bigger picture - the risks of "cloud computing" and software as a service - before the majority does.

Accessible? Yeah, right.

Posted Mar 26, 2009 5:30 UTC (Thu) by TRS-80 (guest, #1804) [Link] (7 responses)

A principal benefit of the Web is that your data (the actual content) is supposed to be delivered to you in a way that makes it relatively easy to access (like a "view source" function actually working). I think that our verbose contributor could acknowledge that JavaScript changes all that.

However, the genie is out of the bottle, and people are turning the Web into yet another platform where the data is locked away behind code which, as Stallman points out, you might not be able to improve or to fix. In your terminology: there's only sausage on the menu. Again, I think Stallman sees the bigger picture - the risks of "cloud computing" and software as a service - before the majority does.

I doubt very much that data is natively stored as HTML - it's simply not a format useful for storing data in. So whether we get the data as HTML transformed from the SQL database server-side or piped via JSON and then transformed client-side, it's not the preferred means of storage; the data is already hidden behind code. Arguably, JSON interfaces are better since you can write your own JavaScript application that runs on a page you control (modulo same-origin restrictions, but running a proxy is easy). Of course, none of this is necessary if you have direct access to the data in question because you're running it on your own server. That's where Stallman should be focusing his efforts, rather than worrying about those who use hosted systems - they've already lost.

The view source principle is about understanding the structure of a webpage so you can write your own, not extracting your data from it.

Already lost? Not really.

Posted Mar 26, 2009 6:14 UTC (Thu) by khim (subscriber, #9252) [Link] (2 responses)

That's where Stallman should be focusing his efforts, rather than worrying about those who use hosted systems - they've already lost.

I beg your pardon. I can pull all my data from GMail via IMAP - and this is a lot better than just having some "free JavaScript"! I can take my data and go elsewhere - I don't need any "free JavaScript" for that. Yes, I cannot easily change the interface, but that's a secondary grief. I can pull my data - and that's enough for me. As for Google Docs - I'm avoiding them not because the JavaScript is not free, but because their export capabilities suck: an exported .ods often has broken macros and you cannot even export a Presentation to .odp! This is a much bigger grief to me than the freeness of the JavaScript.

I care about my data first, about code availability second. And even if you care about the availability of code, discussing it separately for the server and client parts makes no sense: they are tied more tightly than Siamese twins!

Already lost? Not really.

Posted Mar 26, 2009 7:48 UTC (Thu) by TRS-80 (guest, #1804) [Link] (1 responses)

I care about my data first, about code availability second. And even if you care about the availability of code, discussing it separately for the server and client parts makes no sense: they are tied more tightly than Siamese twins!
The sentence previous to the one you quoted was "Of course, none of this is necessary if you have direct access to the data in question because you're running it on your own server." IMAP is direct access to the data in question, I'm not sure why you think we disagree on the uselessness of replacing the client-side JavaScript.

I've talked about cloud...

Posted Mar 26, 2009 12:04 UTC (Thu) by khim (subscriber, #9252) [Link]

The sentence previous to the one you quoted was "Of course, none of this is necessary if you have direct access to the data in question because you're running it on your own server."

Yes, but I'm not running GMail on my own server! This was a side note about the cloud. Even if you are NOT using "your own server" your data is not always held hostage: it's trivial to pull all your data from GMail via IMAP (and it's probably a good idea to do this from time to time - who knows when Google will declare you a nasty spammer and remove your account?). So it's not true that people who are using hosted systems have "already lost". But it's much better to use a sane protocol (IMAP) for that rather than to try to pull the data via "free JavaScript".
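
As a rough sketch of what "pulling all data via IMAP" looks like with Python's standard `imaplib`: the function names `fetch_all_messages` and `backup_account` are made up here, and the host, user and password are placeholders that the caller has to supply. Error handling is deliberately minimal.

```python
import imaplib


def fetch_all_messages(conn, mailbox="INBOX"):
    """Return the raw RFC 822 bytes of every message in one mailbox."""
    status, _ = conn.select(mailbox, readonly=True)  # read-only: never alter flags
    if status != "OK":
        raise RuntimeError(f"cannot open mailbox {mailbox!r}")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("search failed")
    messages = []
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status != "OK":
            continue
        # fetch() returns (envelope, body) tuples mixed with bare b")" items
        for part in parts:
            if isinstance(part, tuple):
                messages.append(part[1])
    return messages


def backup_account(host, user, password):
    """Log in over TLS and download the inbox; all arguments are placeholders."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        return fetch_all_messages(conn)
```

Each returned item is a complete message that can be written to a file or parsed with the standard `email` package, with no dependence on the web interface.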

IMAP is direct access to the data in question, I'm not sure why you think we disagree on the uselessness of replacing the client-side JavaScript.

This is not about client-side JavaScript. This is about the cloud: as long as my data can be pulled out of the cloud I don't particularly care whether the cloud server is free software or not (I don't have the resources to run my own replacement anyway): it's an acceptable situation (of course free software is better... but it's only one factor out of many). If my data is held hostage on the server and can only be accessed by a JavaScript client - I don't really care if it's free software or not: I'll try to avoid such a service as much as possible.

Accessible? Yeah, right.

Posted Mar 26, 2009 12:57 UTC (Thu) by pboddie (guest, #50784) [Link] (3 responses)

I doubt very much data is natively stored as HTML - it's simply not a format useful for storing data in. So whether we get the data as HTML transformed from the SQL database server-side or piped via JSON and then transformed client-side, it's not the preferred means of storage; the data is already hidden behind code.

The question of native storage is not directly relevant to the issue of accessibility: HTML is principally an interchange format, and an obviously desirable property of such formats is that the data "stored" in exchanged documents can be accessed by the recipient. Similarly, JSON is an interchange format. I don't know whether Google Docs, for example, uses JSON or a similar open interchange format, but if it did not, then obviously the lack of a documented interface would be a hindrance to anyone who has issues (technical or other) with the code.

Arguably, JSON interfaces are better since you can write your own JavaScript application that runs on a page you control

Indeed. But you're referring to applications built to be interoperable, which is a step up from "black box" applications who ask your browser to run code on their behalf. Another factor is what kind of data the JSON interfaces expose. If you're only getting small fragments of the larger whole, building up an entire document is likely to be very awkward.

I have personally had to migrate e-mail data out of a Web-only system where POP and IMAP support was not available. In the end, I had to write a script to pull out each message one at a time, and I had to settle for an inferior version of the original content. I've also used another Web-based system where I was fortunately able to use POP - "fortunately" because the Web interface was very heavy on the JavaScript, and automating the Web browser and then traversing the browser's DOM would have been necessary to access the content. Moreover, that application's JavaScript didn't always work on various browsers that were normally adequate for browsing the Web - another reason for wanting to improve the JavaScript employed by that application.

Sometimes I think that people who apparently don't see the need for the kind of interoperability advocated in this matter - summarised as "I can't see why you'd want this" - have either been fortunate enough never to experience data access or migration issues, or didn't really care when a chunk of their personal data went away once upon a time.

"I CAN see why you'd want this" - I just don't think it's realistic

Posted Mar 26, 2009 13:51 UTC (Thu) by khim (subscriber, #9252) [Link] (2 responses)

The question of native storage is not directly relevant to the issue of accessibility: HTML is principally an interchange format, and an obviously desirable property of such formats is that the data "stored" in exchanged documents can be accessed by the recipient.

It's quite relevant, of course. There are two choices:
1. Give sane access to the native storage (POP, IMAP, etc) or
2. Create stable server<->client API and build free JavaScript client on top of that.

Without a stable server<->client API free JavaScript is pretty useless and I cannot see how FOSS guys can demand a stable API from web-sites after declarations like this one. If you care about data at all you need to talk about solution #1 because solution #2 is totally unrealistic.

I have personally had to migrate e-mail data out of a Web-only system where POP and IMAP support was not available.

And this IS the problem with such systems. If they had IMAP access - would you still need "Free JavaScript(tm)", or not? It's easier to implement IMAP than to offer free and usable JavaScript on the client...

"I CAN see why you'd want this" - I just don't think it's realistic

Posted Mar 26, 2009 17:52 UTC (Thu) by pboddie (guest, #50784) [Link] (1 responses)

The question of native storage is not directly relevant to the issue of accessibility
It's quite relevant, of course.

I should have written "in the context of the original complaint". And when someone says "I doubt very much data is natively stored as HTML", it's a total red herring: we're talking about applications where the only interface may well be HTML plus CSS plus JavaScript, with no "line of sight" to the native storage.

I have personally had to migrate e-mail data out of a Web-only system where POP and IMAP support was not available.
And this IS the problem with such systems. If they had IMAP access - would you still need "Free JavaScript(tm)", or not? It's easier to implement IMAP than to offer free and usable JavaScript on the client...

Well, of course I wouldn't need to run a modified version of the JavaScript code if I could have access to the underlying data, but we don't always get the choice. And of course Stallman talks about solution #1, but I guess he realises that sometimes solution #2 is worth demanding if that's all you're likely to get.

Meanwhile, in all this discussion, I think we've completely demolished the "JavaScript is just content" notion, which is what I mostly objected to in the beginning.

"Let them eat cake" approach just does not work...

Posted Mar 27, 2009 15:29 UTC (Fri) by khim (subscriber, #9252) [Link]

Well, of course I wouldn't need to run a modified version of the JavaScript code if I could have access to the underlying data, but we don't always get the choice.

If you cannot convince people to give you this access, how the hell are you planning to ask them to give you a stable client<->server API - without which free JavaScript is useless? It's usually easier to give you access to the raw data than to create and support a stable client<->server API...

Meanwhile, in all this discussion, I think we've completely demolished the "JavaScript is just content" notion, which is what I mostly objected to in the beginning.

I see no such demolishing. Sure, JavaScript is what the server uses to show you tables, circles and other figures but it's no different from how PostScript and PDF are used. Either you should declare all these countless PostScript and PDF papers "programs" and fight for freedom (good luck) or you should accept that JavaScript is just content in a pretty opaque form...

Yeah, it was nice idea

Posted Mar 26, 2009 6:23 UTC (Thu) by khim (subscriber, #9252) [Link]

A principal benefit of the Web is that your data (the actual content) is supposed to be delivered to you in a way that makes it relatively easy to access (like a "view source" function actually working).

Unfortunately this idea was killed while the Web was in its infancy. Almost before it was born. If you remember, initially HTML was supposed to contain only content and have no style. It got some usage but the Web only started to spread like wildfire when this idea was abandoned and Netscape added tons of tags to make styling possible. Gopher (which refused to budge) was more-or-less killed. To think that people will stop at this point is utterly ridiculous.

