you do NOT need to write all your programs together to make them work together.

Posted Jan 29, 2013 13:59 UTC (Tue) by HelloWorld (guest, #56129)
In reply to: you do NOT need to write all your programs together to make them work together. by mgb
Parent article: Poettering: The Biggest Myths

The funny thing about this "text streams" meme is that Unix doesn't actually use text streams, yet nobody seems to realise that. A Unix pipe is a byte stream, not a text stream. To interpret a byte stream as text, one needs to negotiate an encoding, and Unix pipes don't offer a means to do that.
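A minimal sketch of that point in Python (the sample string and the `decode_as` helper are made up for illustration): the reader of a pipe gets only bytes, and the text it ends up with depends entirely on the encoding it assumes.

```python
# The bytes a writer might put on a pipe. The pipe itself carries no
# information about which encoding produced them.
raw = "na\u00efve caf\u00e9".encode("utf-8")  # "naïve café" as UTF-8 bytes


def decode_as(data: bytes, encoding: str) -> str:
    """Interpret a byte stream as text under an assumed encoding."""
    return data.decode(encoding)


# Two readers that assume different encodings see different text.
as_utf8 = decode_as(raw, "utf-8")
as_latin1 = decode_as(raw, "latin-1")

print(len(raw), "bytes on the pipe")
print("read as UTF-8:  ", as_utf8)
print("read as Latin-1:", as_latin1)
```

Both decodes succeed without error, which is the problem: nothing in the byte stream tells the reader that the second interpretation is wrong.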

And besides, even if Unix did offer text streams, I'd still disagree. Text is not a universal interface: there's a *lot* of data for which a text format makes no sense at all, such as video, audio, images, and much else. In fact, most programs today no longer mess around with text files, but use structured data representations and libraries to (de)serialise them. D-Bus is a binary protocol because that's faster, but also because nobody cares: everyone uses it via a library (and code generators) anyway.


you do NOT need to write all your programs together to make them work together.

Posted Jan 29, 2013 17:24 UTC (Tue) by smurf (subscriber, #17840)

> one needs to negotiate an encoding,
> and Unix pipes don't offer a means to do that.

One might argue that the negotiation is implied by the LOCALE setting.
Or that it is not necessary these days, because anybody who does not use UTF-8 deserves to lose. :-P

you do NOT need to write all your programs together to make them work together.

Posted Jan 29, 2013 18:17 UTC (Tue) by HelloWorld (guest, #56129)

> One might argue that the negotiation is implied by the LOCALE setting.
The LOCALE setting is just a way to manually specify the information that can't be negotiated through the pipe. If pipes were actually text streams, there'd be no need to do this manually and things would just work.

Anyway, I don't think such a design would be desirable, because as I said before, text is in fact not a universal interface. Which is of course why many programs today communicate with much more structured protocols like D-Bus.

While I do sympathise with your views about UTF-8, there's a large amount of data stored in legacy encodings, and it's not going away any time soon.

you do NOT need to write all your programs together to make them work together.

Posted Jan 29, 2013 20:28 UTC (Tue) by anselm (subscriber, #2796)

> A Unix pipe is a byte stream, not a text stream. To interpret a byte stream as text, one needs to negotiate an encoding, and Unix pipes don't offer a means to do that.

In the interest of fairness it should be mentioned that, at the time Doug McIlroy made the quoted statement, ASCII was still the text encoding of choice (at least if you were in the US). The idea that an encoding might have to be »negotiated« before a pipe could be useful didn't really show up on anyone's radar.

Also, most of the usual Unix tools assume that their input consists of lines of text separated by newline characters, rather than arbitrary »byte streams«, and generate output to suit this assumption. Note that UTF-8 was carefully defined (by Ken Thompson, no less) to fit the common Unix notion of »text« – up to a point where many Unix tools will do »reasonable« things when fed UTF-8 data even if they do not include explicit support for UTF-8.
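To illustrate the last point, here is a small Python sketch (the sample strings are invented): a tool that knows nothing about UTF-8 and simply splits its input on the newline byte still cuts UTF-8 data at the right places, because every byte of a multi-byte UTF-8 sequence has its high bit set and so can never be mistaken for an ASCII character. UTF-16 is shown for contrast, since it gives no such guarantee.

```python
# Three lines of non-ASCII text, encoded as UTF-8.
text = "gr\u00f6\u00dfe\nna\u00efve\n\u65e5\u672c\u8a9e\n"
data = text.encode("utf-8")

# Split on the raw newline byte, the way a line-oriented tool with no
# UTF-8 support would.
byte_lines = data.split(b"\n")

# Each piece is still valid UTF-8: no multi-byte character was cut apart.
decoded = [line.decode("utf-8") for line in byte_lines]
print(decoded)

# Every non-ASCII character is encoded using only bytes >= 0x80.
multibyte = [b for b in data if b >= 0x80]
print(all(b != 0x0A for b in multibyte))

# Contrast: in UTF-16, U+010A contains the byte 0x0A, so a byte-oriented
# tool would see a newline in the middle of a character.
utf16_hazard = "\u010a".encode("utf-16-le")
print(b"\n" in utf16_hazard)
```

The trailing empty string in `decoded` comes from the final newline, which is the same behaviour a plain ASCII input would produce.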

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds