Simple is good enough

Posted Nov 15, 2010 10:10 UTC (Mon) by mjthayer (guest, #39183)
In reply to: Simple is good enough by quotemstr
Parent article: LPC: Life after X

> "Why does copying this image freeze the program for 30 seconds? Lol that doesnt happen under Windows."

If that was also addressed to me, I will make an attempt at defending my proposal (consider the correction I added above as part of it).

* Clipboard data is written to a file on disk in order to share it, but thanks to disk caching the disk need not be a bottleneck (even disregarding the fact that /var/clipboard could be a tmpfs). This would need real-life testing of course.
* Some latency is acceptable (ESR's estimate is 0.7 seconds - http://www.faqs.org/docs/artu/ch10s01.html). In this case we have additional room for manoeuvre, as we also have the time the user needs to switch from the copying to the pasting application.
* I will also point out that this proposal actually removes a potential source of latency (one that does occur in the wild) with the X11 selection protocol - when an application pastes X11 clipboard data it requires several rounds of communication between the two applications via the X server. If the application offering the data is currently busy the application pasting will often freeze until the data can be served. With the scheme I proposed the data will be available at once.

I realise that it might still not be workable despite all that, but I do think that there is a chance it might be.
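
To make the file-based scheme above a bit more concrete, here is a minimal Python sketch of how a copying application might publish data and a pasting application might read it. The function names, the "current" file name and the directory passed in are assumptions made for illustration only; nothing here is part of an existing clipboard API. The write goes to a temporary file that is then renamed into place, so a reader never sees half-written data and never has to wait for the copying application.

```python
import os
import tempfile
from pathlib import Path


def publish_clipboard(clip_dir: Path, payload: bytes, name: str = "current") -> Path:
    """Write payload to clip_dir/name atomically (hypothetical layout)."""
    clip_dir.mkdir(parents=True, exist_ok=True)
    target = clip_dir / name
    # Write to a temporary file in the same directory, then rename it.
    # os.replace is atomic when source and target are on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(clip_dir))
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


def read_clipboard(clip_dir: Path, name: str = "current") -> bytes:
    """Read whatever was last published; no round trip to the copier."""
    return (clip_dir / name).read_bytes()
```

If clip_dir sits on a tmpfs the data never touches a physical disk, although, as said above, whether the latency is acceptable would still need real-life testing.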

Simple is good enough

Posted Nov 15, 2010 10:30 UTC (Mon) by mjthayer (guest, #39183)

>> "Why does copying this image freeze the program for 30 seconds? Lol that doesnt happen under Windows."

> If that was also addressed to me, I will make an attempt at defending my proposal (consider the correction I added above as part of it).

Replying to myself. One clear weakness of my proposal would be that it might not work well with select and middle button paste, as selecting is something you do more often than copying (again, that would need testing to be sure). I would give it more chances of working with drag and drop (which I personally prefer over middle button paste, but I greatly fear I am in a minority here with that).

Simple is good enough

Posted Nov 16, 2010 13:00 UTC (Tue) by i3839 (guest, #31386)

Not really. People probably don't want to copy and paste complex things
like images with select and middle mouse button. So if applications are
smart they only copy simple things that are quick to copy when selecting,
and only do the slow copy when users explicitly copy something.

Simple is good enough

Posted Nov 16, 2010 14:11 UTC (Tue) by mjthayer (guest, #39183)

> So if applications are smart they only copy simple things that are quick to copy when selecting, and only do the slow copy when users explicitly copy something.

Shouldn't applications be doing what the user asks them to rather than being smart? Selecting and middle click pasting an image works now. If the user selects it, should the application really assume that they don't want to paste it? Of course, it might still turn out that users don't select things often enough that the overhead would be a big issue.

Simple is good enough

Posted Nov 16, 2010 22:26 UTC (Tue) by i3839 (guest, #31386)

It depends on the program and context. If you select an image in a browser I wouldn't copy the image. But if it's an image editing program, I would.
But copying it into a hundred formats isn't something you should do for every selection of any random thing (or ever, but if you do...).

Simple is good enough

Posted Nov 16, 2010 22:42 UTC (Tue) by mjthayer (guest, #39183)

> It depends on the program and context. If you select an image in a browser I wouldn't copy the image. But if it's an image editing program, I would.

This does happen currently though (I tested it earlier today).

> But copying it into a hundred formats isn't something you should do for every selection of any random thing (or ever, but if you do...).

I changed that aspect of the proposal in a previous comment - in the new version a file containing a single mime type (with a well-known magic number) is saved to disk. To handle conversions, a (large) set of filters is installed on the system, and the application reading the clipboard must iterate through the installed filters to find ones which convert the file to a format it can use. This is roughly what BeOS did, which apparently worked well (or so I am told by a colleague who developed for BeOS).
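
As a rough illustration of the filter idea, here is a small Python sketch. Everything in it is hypothetical (the registry, the function names, the single example filter); it only shows the lookup described above: the clipboard holds one MIME type, and the pasting application walks the installed filters to find one that produces a format it accepts.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical registry of installed filters:
# (source MIME type, target MIME type) -> conversion function.
Filter = Callable[[bytes], bytes]
FILTERS: Dict[Tuple[str, str], Filter] = {}


def register_filter(src: str, dst: str, fn: Filter) -> None:
    FILTERS[(src, dst)] = fn


class _TextExtractor(HTMLParser):
    """Collects only the character data of an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(data: bytes) -> bytes:
    """Example filter: text/html -> text/plain by dropping the tags."""
    parser = _TextExtractor()
    parser.feed(data.decode("utf-8"))
    parser.close()
    return "".join(parser.parts).encode("utf-8")


def paste(clip_type: str, clip_data: bytes,
          accepted: List[str]) -> Optional[Tuple[str, bytes]]:
    """Return (type, data) in the first accepted format, or None.

    accepted is ordered by the pasting application's preference.
    """
    for want in accepted:
        if want == clip_type:
            return want, clip_data
        fn = FILTERS.get((clip_type, want))
        if fn is not None:
            return want, fn(clip_data)
    return None


register_filter("text/html", "text/plain", html_to_text)
```

With that one filter installed, paste("text/html", b"<p>Hello <b>world</b></p>", ["image/png", "text/plain"]) falls through image/png and returns the plain-text version. A real system would presumably also chain filters; this sketch only tries single-step conversions.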

Simple is good enough

Posted Nov 19, 2010 22:13 UTC (Fri) by i3839 (guest, #31386)

> I changed that aspect of the proposal in a previous comment - in the new
> version a file containing a single mime type (with a well-known magic
> number) is saved to disk. To handle conversions, a (large) set of filters
> is installed on the system, and the application reading the clipboard
> must iterate through the installed filters to find ones which convert the
> file to a format it can use. This is roughly what BeOS did, which
> apparently worked well (or so I am told by a colleague who developed for
> BeOS).

Well, the problem is that for complex types you can't easily convert from one to the other, because only the program doing the copying has all the info.

Example: If you copy a bit of a webpage, it can be either plain text, the raw HTML code, or formatted text depending on the style etc. Converting to plain text is almost always possible, but anything else doesn't really work. So a single mime type isn't always sufficient.

There are two sides to a copy and paste system: the program ABI used to do the copying and pasting, and the system ABI that determines how it's actually done. I think the latter shouldn't be set in stone, only the former, to keep the implementation flexible. So all copying and pasting should happen through the system-installed copy&paste library, or the copy and paste programs (simple front-ends for the lib).

(I still haven't found time to start implementing this, hopefully next week.)

Simple is good enough

Posted Nov 22, 2010 14:52 UTC (Mon) by mjthayer (guest, #39183)

> Well, the problem is that for complex types you can't easily convert from one to the other, because only the program doing the copying has all the info.

Actually the idea was that the application doing the copying provided the data in its native/internal format, which by definition should have all the information. It could always define an x- or a vnd. mime format for this and provide whatever converters it wanted to transform that data into other formats (they could probably double up as export filters too).

> Example: If you copy a bit of a webpage, it can be either plain text, the raw HTML code, or formatted text depending on the style etc. Converting to plain text is almost always possible, but anything else doesn't really work. So a single mime type isn't always sufficient.

In this case the native format is presumably "text/html", which should be convertible to either plain text or formatted text without the copying application even having to provide its own converters.

Simple is good enough

Posted Nov 25, 2010 21:44 UTC (Thu) by i3839 (guest, #31386)

The problem is that in the case of HTML you generally lose the formatting information, because that's not in the part you copied but higher up or in a CSS file. So there is no native format: you don't want to copy raw HTML code into a word processor, nor the plain text, but something that more or less looks like what you copied. Not to mention that the selected part is usually "broken" HTML because not all tags are closed. So it's not that simple, and I don't think it's safe to get rid of the list support.

Images and other data formats with an obvious raw format are much easier and better suited to automatic conversion. That can be done automatically without changing the API.

Simple is good enough

Posted Nov 25, 2010 21:56 UTC (Thu) by mjthayer (guest, #39183)

> Problem is that in the case of html, you generally lose the formatting information because that's not in the part you copied, but higher up or in a css file. So there is no native format, you don't want to copy raw html code into a word processor, nor the plain text, but something that more or less looks like what you copied.

Just for interest I copied some text in Firefox and ran my clipboard format viewer. Here are the results:

$ ../tmp/viewclipformats
Found clipboard format: TIMESTAMP
Found clipboard format: TARGETS
Found clipboard format: MULTIPLE
Found clipboard format: text/html
Found clipboard format: text/_moz_htmlcontext
Found clipboard format: text/_moz_htmlinfo
Found clipboard format: UTF8_STRING
Found clipboard format: COMPOUND_TEXT
Found clipboard format: TEXT
Found clipboard format: STRING
Found clipboard format: text/x-moz-url-priv

Without knowing, it wouldn't surprise me if one of those contained both the html and the formatting information, which I think should be feasible with my proposal too.

Simple is good enough

Posted Nov 26, 2010 22:26 UTC (Fri) by mjthayer (guest, #39183)

You also have to ask, when an application puts HTML data into the clipboard, what data it is actually putting there. When I select a section of text, pictures and whatever in Firefox and copy I get HTML data in the clipboard. But Firefox can't just put the source of the document from the point where the selection begins to the point where it ends into the clipboard, as it is announcing HTML data, and as you point out, that wouldn't be HTML, it would be broken HTML. So Firefox has no choice but to massage the HTML data anyway, and if it is doing that already, adding the style information inline is no great hardship.
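
To illustrate the kind of massaging meant here, the following Python sketch repairs a selection that starts or ends in the middle of an element: it drops closing tags whose opening tag lies outside the selection and closes whatever is still open at the end. It is purely an illustration using the standard html.parser module and says nothing about how Firefox actually does it; the names balance_fragment and VOID are made up for this example, comments and doctypes are silently dropped, and inlining the style information is not attempted.

```python
from html.parser import HTMLParser
from typing import List

# Elements that never take a closing tag (partial list, for illustration).
VOID = {"br", "img", "hr", "input", "meta", "link"}


class _Balancer(HTMLParser):
    def __init__(self) -> None:
        # Keep entities as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out: List[str] = []
        self.stack: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag not in VOID:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # opening tag was outside the selection: drop it
        # Close everything opened after the matching start tag.
        while self.stack:
            top = self.stack.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def balance_fragment(fragment: str) -> str:
    """Return fragment with orphan end tags removed and open tags closed."""
    parser = _Balancer()
    parser.feed(fragment)
    parser.close()
    while parser.stack:
        parser.out.append("</%s>" % parser.stack.pop())
    return "".join(parser.out)
```

For example, a selection like "world</b> and <i>more" comes out as "world and <i>more</i>", which is at least something a pasting application can parse.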

Of course, if you want to reuse that data as is as HTML for some other web page then you are probably out of luck, but if you think of it that makes no sense anyway - if you want to do that you should probably be copying the source of the HTML as plain text. If you select and copy part of a page in Firefox, chances are that what you are actually about to do is to paste it either as plain text (the text visible on the page, not the HTML source) or as formatted text into e.g. OpenOffice.

And if you were copying the data inside some visual HTML editor, it would probably still not make sense for the editor to insert the data as naive HTML - chances are there would be no way to paste the data in any form resembling the source of the page the editor was generating, and in any case, if you were trying to get at the generated source it would make more sense to ask the editor directly than copying and pasting to get at it. In fact I would expect the visual editor to use some internal format which was not valid HTML at all when copying to the clipboard, but which another instance of the editor would know what to do with when pasting it. It might provide a filter to convert it to HTML, but not for the purposes of viewing the source - you don't use the clipboard for that - but rather as a stepping stone for converting it to OOXML or something else.

Hope that made sense, as I am rather short of sleep currently. I would really like to be clear that I am not trying to argue for the sake of arguing here, but rather because responding to the points you make forces me to think things through myself.

Simple is good enough

Posted Nov 27, 2010 10:39 UTC (Sat) by i3839 (guest, #31386)

> I would really like to be clear that I am not trying to argue for the
> sake of arguing here, but rather because responding to the points you
> make forces me to think things through myself.

Same here, we're trying to figure out if a list of formats is really needed, or if always providing only one and having convertors is sufficient. This choice determines the API, so it's pretty important.

Only having one format and providing convertors is simpler, but less complete. My main concern is that it's not always sufficient, or that it makes implementing copy harder than necessary for some applications, because they have to create one "complete" format and convertors.

Another concern is that you convert from simple->complex->simple, when also supporting a complex type, hoping that the "simple" in the end is the same as what you started with. So the unrelated complex type makes simple types more complex too, with too much room for errors in my opinion. Or in other words, copying simple types is not simple anymore, if you also copy a complex one.

Lastly, I don't really see a way to support multiple types when pasting. It should be the pasting program's decision what type to paste, if it supports multiple types. I don't see any way around supporting a list of types in the pasting API anyway, and then you may as well support lists in the copy API too.

I think you make too many assumptions about what the user or pasting program expects in your line of thinking.

All in all I think the automatic conversion idea is good, but not always sufficient. Combined with today's multiple format support in applications, I think it's best to support multiple formats, but to encourage convertor usage when possible.

Then when someone pastes something the lists are compared, and if they have no common format, a convertor is used.
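
A rough Python sketch of that comparison, with made-up names (negotiate, and a set of available conversions standing in for the installed convertors); it is only meant to show the order of the checks, not a finished API:

```python
from typing import List, Optional, Set, Tuple


def negotiate(offered: List[str], accepted: List[str],
              convertors: Set[Tuple[str, str]]) -> Optional[Tuple[str, str]]:
    """Pick (source format, target format) for a paste, or None.

    offered:    formats the copying program put on the clipboard
    accepted:   formats the pasting program can take, in order of preference
    convertors: available (from, to) conversions (hypothetical registry)
    """
    # 1. Prefer a format both sides share; no conversion needed.
    for want in accepted:
        if want in offered:
            return want, want
    # 2. No common format: fall back to a convertor.
    for want in accepted:
        for have in offered:
            if (have, want) in convertors:
                return have, want
    return None
```

Note that the pasting program's order of preference decides, which matches the point above that it should be the pasting program's decision what type to paste.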

A list of formats is basically "more of the same", so I think the added complexity, both for the API and implementation, is small enough.

Now we just have to find some time to implement this. I think I'll give it a stab next week. I'll keep you informed (my email address is indan@nul.nu).

Simple is good enough

Posted Dec 9, 2010 19:16 UTC (Thu) by Lestibournes (guest, #71790)

Maybe something like this will work:
1. Program A indicates that it is ready to supply data by writing its identifier to clipboard/source.
2. Program B requests the data by writing its identifier to clipboard/destination.
3. Program A writes the data files in clipboard/data.
4. Program A indicates that it finished writing the data by erasing the content of clipboard/source and clipboard/destination.
5. Program B reads the data files from clipboard/data.

If no one requests the data from Program A, then it will still dump the data when it terminates. The only weaknesses I detect are a delay when the Paste operation is performed, and that the data will be lost if Program A crashes. There should be a separate clipboard folder for each session to avoid conflicts, such as two users who share an account overwriting each other's Copy operations.
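
The five steps above could be sketched in Python roughly as follows. The class and function names are invented for the example, the root directory stands in for the per-session clipboard folder, and there is no locking or change notification, which a real implementation would need (Program A has to notice somehow that clipboard/destination was written).

```python
from pathlib import Path
from typing import Dict, Optional


class Clipboard:
    """Hypothetical per-session clipboard folder: source, destination, data/."""

    def __init__(self, root: Path) -> None:
        self.source = root / "source"
        self.destination = root / "destination"
        self.data = root / "data"
        self.data.mkdir(parents=True, exist_ok=True)
        self.source.write_text("")
        self.destination.write_text("")


def offer(clip: Clipboard, program_id: str) -> None:
    """Step 1: Program A announces that it can supply data."""
    clip.source.write_text(program_id)


def request(clip: Clipboard, program_id: str) -> None:
    """Step 2: Program B asks for the data."""
    clip.destination.write_text(program_id)


def serve(clip: Clipboard, program_id: str, files: Dict[str, bytes]) -> bool:
    """Steps 3 and 4: Program A writes the data, then clears both markers."""
    if clip.source.read_text() != program_id or not clip.destination.read_text():
        return False  # not the current owner, or nobody has asked yet
    for old in clip.data.iterdir():
        old.unlink()
    for name, payload in files.items():
        (clip.data / name).write_bytes(payload)
    clip.source.write_text("")
    clip.destination.write_text("")
    return True


def collect(clip: Clipboard) -> Optional[Dict[str, bytes]]:
    """Step 5: Program B reads the data once both markers are empty.

    Only meaningful after Program B has called request(); returns None
    while Program A has not finished writing.
    """
    if clip.source.read_text() or clip.destination.read_text():
        return None
    return {p.name: p.read_bytes() for p in clip.data.iterdir()}
```

In this sketch the delay at paste time shows up as collect() returning None until serve() has run, and the crash weakness is visible too: if Program A dies between offer() and serve(), the markers are never cleared and nothing is ever written to data/.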