Exaggerated.

Posted Feb 23, 2007 18:57 UTC (Fri) by AJWM (guest, #15888)
Parent article: How an Accident of Hardware Design Encouraged Open Source (O'ReillyNet)

The article makes a big deal out of the byte ordering differences between IBM's 360/370 series and DEC's PDP-11 series (both of which used an 8-bit byte).

It had nothing to do with byte order. It had everything to do with the fact that the IBM used EBCDIC as its native character set, which among its other faults, is non-contiguous in its alphabetic sequences. That's what makes text manipulation such a pain in the butt on those systems. The article doesn't even mention EBCDIC.
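To make the non-contiguity concrete, here is a minimal sketch in C. It is an illustration only, not code from the article: the function names are made up, and the code points are the usual EBCDIC letter positions (A-I at 0xC1-0xC9, J-R at 0xD1-0xD9, S-Z at 0xE2-0xE9). In ASCII a single range test classifies an upper-case letter; in EBCDIC it takes three, and "next letter" is not simply `c + 1`.

```c
#include <stdbool.h>

/* ASCII: 'A'..'Z' form one contiguous run, 0x41..0x5A. */
bool ascii_is_upper(unsigned char c)
{
    return c >= 0x41 && c <= 0x5A;
}

/* EBCDIC: the upper-case letters sit in three separate runs with gaps. */
bool ebcdic_is_upper(unsigned char c)
{
    return (c >= 0xC1 && c <= 0xC9)    /* A..I */
        || (c >= 0xD1 && c <= 0xD9)    /* J..R */
        || (c >= 0xE2 && c <= 0xE9);   /* S..Z */
}

/* Successor of an upper-case EBCDIC letter (caller passes A..Y).
 * The two special cases hop over the gaps after I and after R. */
unsigned char ebcdic_next_upper(unsigned char c)
{
    if (c == 0xC9) return 0xD1;        /* I -> J */
    if (c == 0xD9) return 0xE2;        /* R -> S */
    return (unsigned char)(c + 1);
}

/* Count the code points classified as upper-case EBCDIC letters. */
int ebcdic_upper_count(void)
{
    int n = 0;
    for (int c = 0; c < 256; c++)
        if (ebcdic_is_upper((unsigned char)c))
            n++;
    return n;
}
```

Code that assumes `'A' <= c && c <= 'Z'` means "is a letter" will quietly accept the non-letter code points in the gaps (0xCA, for example) when built for an EBCDIC system.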

(Mind, that's more an OS limitation than hardware -- Amdahl's Unix V7 for the 370 architecture, UTS, used ASCII, and porting apps from a PDP-11 to a 370 running UTS was trivial.)

A contemporary major line of systems, Burroughs', used a 48-bit word and strings of either 6-bit characters (common on many architectures, hence the lack of lower case letters) or 8-bit characters, the latter either EBCDIC or ASCII. The systems had dedicated string processing hardware; the segment descriptor that pointed to the string specified the character set.


Exaggerated.

Posted Feb 25, 2007 12:09 UTC (Sun) by eru (subscriber, #2753)

It had everything to do with the fact that the IBM used EBCDIC as its native character set, which among its other faults, is non-contiguous in its alphabetic sequences. That's what makes text manipulation such a pain in the butt on those systems.

An anglo-centric view. For most other languages, alphabetic sequences are non-contiguous in all widely used character sets.

Exaggerated.

Posted Feb 26, 2007 23:17 UTC (Mon) by jzbiciak (✭ supporter ✭, #5246)

Well, perhaps today, but put it in context. In the relevant era (the 1970s in this case), when you're comparing EBCDIC vs. ASCII, character classification is quite a bit easier in ASCII than in EBCDIC. EBCDIC might not have been so bad to work with on BCD-centric hardware, but on machines that focused only on two's complement, I can see it causing some heartburn in some cases.

In the modern day, obviously the debate's moot in the presence of Unicode, UTF-8, etc., and plenty of MIPS, RAM and disk to go around (as compared to the 1970s)....

Exaggerated.

Posted Feb 26, 2007 18:19 UTC (Mon) by MBR (guest, #43632)

I think you're misunderstanding the relevance of the IBM 360/370 to my point.

In the early 1980s, a number of startups wanting to build the next generation of desktop computers looked to Unix as an alternative to writing their own operating system from scratch because it was written in C rather than assembler. Many of them (e.g. Sun Microsystems) were using CPUs like the Motorola 68000, whose byte numbering scheme was exactly the same as that used by the IBM 360. Whether the Motorola engineers who designed the 68000 had copied this from the 360 or copied it from someone who copied it from someone who copied it from the 360 is impossible to say. This generation of computers all used ASCII. No one in their right mind would have used EBCDIC, except for IBM, which was stuck trying to maintain backward compatibility with its earlier offerings.

The PDP-11 is relevant because Unix had been rewritten from assembler into C on a PDP-11 in 1973. It had ported easily from DEC's PDP-11 to DEC's VAX because DEC had designed the VAX to be a grown-up PDP-11. The VAX followed the PDP-11's byte numbering scheme. By the early 1980s Unix was mature enough that a startup could consider using it for their OS.

The result was that programmers at countless Silicon Valley startups in the early 1980s found themselves constantly struggling to eliminate byte-order dependencies in C code as they ported Unix from one of the two standard VAX implementations, the Berkeley distribution or the Bell Labs distribution, to the new microcomputers. In comparison to the hordes of programmers in Silicon Valley, around Boston's Route 128, and elsewhere, who were porting code from ASCII-based little-endian machines to ASCII-based big-endian machines, the number of programmers porting code between EBCDIC-based IBM machines and these new machines was small. And so the former group had a much greater influence on common programming practice in the Unix world than the latter group.
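For readers who have not run into one, here is a minimal sketch of the kind of byte-order dependency being described. It is an illustration under my own assumptions, not code from any Unix distribution, and the function names are hypothetical. The first function copies four bytes of memory straight into an integer, so its result depends on the host's byte order; the other two build the value from individual bytes with shifts and give the same answer on any machine.

```c
#include <stdint.h>
#include <string.h>

/* Byte-order dependent: reinterprets four bytes of memory as an integer.
 * A little-endian host (VAX style) and a big-endian host (68000 style)
 * return different values for the same input bytes. */
uint32_t load_u32_host(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Portable: first byte is the most significant (big-endian layout). */
uint32_t load_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Portable: first byte is the least significant (little-endian layout). */
uint32_t load_u32_le(const unsigned char *p)
{
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the lowest-addressed byte of an integer holds its
 * least significant bits, i.e. the host is little-endian. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

Code written in the style of `load_u32_host` works until it is moved to a machine with the other byte order, which is roughly the situation a port from the VAX to a 68000-based system would have exposed.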
