Big endian vs little endian

Posted Mar 6, 2007 9:03 UTC (Tue) by xoddam (subscriber, #2322)
In reply to: Big endian vs little endian by giraffedata
Parent article: How an Accident of Hardware Design Encouraged Open Source (O'ReillyNet)

> (big-endian) is much easier for humans to visualize and talk about.

For "humans", substitute "people who use big-endian representations in their everyday written language" and you'll be correct. Numbers are little-endian in written Arabic and some other languages.

There is a totally inconsistent exception for *telephone* numbers. A telephone number is a sequence of digits, written from left to right *even in Arabic*.


Big endian vs little endian

Posted Mar 6, 2007 16:34 UTC (Tue) by giraffedata (subscriber, #1954) [Link]

> Numbers are little-endian in written Arabic and some other languages.

Thanks. I've wondered about that. In those languages, is there a more explicit form of writing numbers, such as the English "three thousand two hundred five"? If so, is it little endian or big endian?

> There is a totally inconsistent exception for *telephone* numbers. A telephone number is a sequence of digits, written from left to right *even in Arabic*.

I don't see any inconsistency. A telephone number isn't a number; it's just a digit sequence. There's only one sane way to write a telephone number: in the order in which you dial it. And there was only one practical way to build the early telephone switches: most significant digit first.

I know of one such inconsistency, though: Some Chinese is written from right to left with numbers as Western numerals left to right. The reader skips ahead and reads the numeral most significant digit first (as it is spoken).

Big endian vs little endian

Posted Mar 8, 2007 20:20 UTC (Thu) by netizen (guest, #43966) [Link]

I think it important to remember that when the IBM System/360 was announced on 01 APR 1964 as an all-around scientific -and- commercial data processing device (yes, "data processing"; "information technology" had not even been coined then), the primary purpose for even commercial processing was numerical processing! Text processing was just something else it did, sometimes with a move command, but usually with a subroutine which utilized specialized text processing instructions. Also, while its native mode was Enhanced Binary Coded Decimal for Interchange Code -- EBCDIC -- all instructions had an ASCII-bit for switching to ASCII interpretation. ASCII was new at the time. From 1964 until today I have never heard of anyone actually using the ASCII-bit. EBCDIC was an extension/enhancement to the code structure which had been used in IBM's previous mainframes -- both commercial and scientific oriented number processors.

System/360 operated with 32-bit and 64-bit instructions. And, yes, some few instructions worked on only 16 or 8 bits at a time. A decimal number, e.g. 750124, was converted by the hardware into a series of hexadecimal units. Hexadecimal had the advantage that the hex-bits laid out the same way, left-to-right, in HEX, BCD, EBCDIC, the way they were written on a bank check (for instance) and the same way they would have been punched into an IBM-card.

Inside the CPU a single instruction could fetch a customer's entire record. A second instruction could load the customer's account balance into a register of choice. The third instruction could subtract the customer's bank check payment from that register, after which a fourth instruction stored the resulting new balance back into the customer's record. Finally, a fifth instruction would write the updated record back onto a designated storage device. There was no bit manipulation. Just Bif!, Bam! And that was all using assembler language instructions!

Such was the power of a "big iron" instruction set with big-endian data representation. Multiple (i.e. sixteen) 32-bit registers cost big time, and instruction set execution which could process such registers (or pairs of such registers for 64-bit operations) in one cycle was expensive.

The Intel 4000-series (4004, etc.) and 8000-series (8008, 8080, etc.) understandably took a different approach. IBM's main customers were primarily insurance companies which were drowning in a sea of paper record processing and desperately needed to automate or die. (Word was that many sent clerks to IBM programming school; those who passed still had jobs.)

Years later, Intel's 4000/8000 customers were, originally, vending machine manufacturers who wanted to escape the breakdown-prone mechanical change-making processors which were then being used in machines which sold more than one type of product at more than one price. Little-endian programming seemed appropriate for such (no insult intended) nickel-and-dime processing using a minimal instruction set. {Intel counted adding to A-reg and adding to B-reg as two different instructions; IBM counted an instruction which could add to any register, 0 through 15, as one instruction.}

IMHO, I prefer big-iron type powerful instruction sets which operate in one fell swoop on entire chunks of data using open-source code and open-format data representation. It takes billions of us to operate this planet, and anyone who feels the need or urge ought to be able to have a fair go at it and, if successful, leave behind a trail others can extend. {Well, that's my 64 bits' worth! :) Thanks for reading if you stayed this far.}

Big endian vs little endian

Posted Mar 8, 2007 22:31 UTC (Thu) by giraffedata (subscriber, #1954) [Link]

> while its native mode was Enhanced Binary Coded Decimal for Interchange Code -- EBCDIC

I've heard this said before, but I've done a great deal of programming in later implementations of that architecture and I can't recall the CPU ever being cognizant of what character code I was using except in those instructions that convert between EBCDIC and ASCII. And ISTR there were some with which you could take advantage of the fact that the lower 4 bits of the code for a digit were also the binary representation of that digit. (That's true for EBCDIC and ASCII.) How is EBCDIC native, and what did the ASCII bit do?
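The low-nibble property mentioned above is easy to check. Here is a minimal Python sketch, assuming Python's built-in `cp037` codec as a stand-in for EBCDIC (it is only one of several EBCDIC code pages) and a hypothetical helper name `digit_value`:

```python
def digit_value(code_byte: int) -> int:
    """Return the low 4 bits of a character code (hypothetical helper)."""
    return code_byte & 0x0F


for ch in "0123456789":
    ascii_code = ch.encode("ascii")[0]   # 0x30..0x39
    ebcdic_code = ch.encode("cp037")[0]  # 0xF0..0xF9 in code page 037 (assumed EBCDIC variant)
    # In both codes the low nibble is the digit's binary value.
    assert digit_value(ascii_code) == int(ch)
    assert digit_value(ebcdic_code) == int(ch)
    print(f"{ch}: ASCII {ascii_code:#04x}  EBCDIC {ebcdic_code:#04x}")
```

Only the high nibble differs between the two codes (3 for ASCII, F for this EBCDIC code page), which is why masking with `0x0F` works for either.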

> A decimal number, e.g. 750124, was converted by the hardware into a series of hexadecimal units.

What is a hexadecimal unit? It sounds a lot like you're talking about what IBM calls packed decimal and everyone else calls BCD: a number code in which each decimal digit is represented in binary in 4 bits and those nybbles are lined up big-endian. But that has nothing to do with hexadecimal.

Also, that was one of two number codings used on S/360. The other was the big-endian pure binary code we've been talking about. Pure binary is easiest to do arithmetic on, but packed decimal is easiest to do input and output on. Many early applications were much more input and output than arithmetic.
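To make the difference between the two codings concrete, here is a rough Python sketch. It assumes a simplified unsigned packed format (real S/360 packed decimal also carries a sign nibble, which is omitted here), and the names `pack_bcd` and `unpack_bcd` are hypothetical:

```python
def pack_bcd(n: int) -> bytes:
    """Pack a non-negative integer two decimal digits per byte, most significant first."""
    if n < 0:
        raise ValueError("this sketch handles unsigned values only")
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits  # pad to a whole number of bytes
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )


def unpack_bcd(data: bytes) -> int:
    """Reverse pack_bcd: read each nibble as one decimal digit."""
    return int("".join(f"{b >> 4}{b & 0x0F}" for b in data))


packed = pack_bcd(750124)
binary = (750124).to_bytes(3, "big")  # big-endian pure binary, for comparison

print(packed.hex())  # 750124
print(binary.hex())  # 0b722c
assert unpack_bcd(packed) == 750124
```

A hex dump of the packed form reads the same as the decimal number, which may be what the "hexadecimal units" description was getting at, and which makes input and output cheap. The pure binary form needs a real conversion to print, but arithmetic on it is simpler.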

Incidentally, Intel 8080 also had instructions for packed decimal/BCD.

> the hex-bits laid out the same way, left-to-right, in HEX, BCD, EBCDIC,

But there is no left or right in computer memory, so this doesn't figure into the choice of big-endian or little-endian. Big-endian is not left to right. Big-endian means the most significant byte is in the location with the lowest address.

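The address-based definition above can be sketched in Python with the standard `struct` module. The helper name `memory_image` and the sample value are assumptions for illustration; index 0 of the returned bytes stands for the lowest address:

```python
import struct


def memory_image(value: int, order: str) -> bytes:
    """Return the 4 bytes of a 32-bit unsigned value in ascending address order.

    order is a struct byte-order prefix: ">" for big-endian, "<" for little-endian.
    """
    return struct.pack(order + "I", value)


value = 0x0A0B0C0D  # most significant byte is 0x0A, least significant is 0x0D
big = memory_image(value, ">")
little = memory_image(value, "<")

for offset in range(4):
    print(f"address +{offset}: big-endian {big[offset]:02x}   little-endian {little[offset]:02x}")

# Lowest address holds the most significant byte only in the big-endian layout.
assert big[0] == 0x0A
assert little[0] == 0x0D
```

Nothing here involves left or right. The apparent left-to-right order only shows up when a dump is printed with addresses ascending across the page.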
> Little-endian programming seemed appropriate for such (no insult intended) nickel-and-dime processing using a minimal instruction set.

What is the connection between little-endian and minimal instruction set?

> {Intel counted adding to A-reg and adding to B-reg as two different instructions; IBM counted an instruction which could add to any register, 0 through 15, as one instruction.}

I think you're really pointing out that IBM had expensive general purpose registers, while Intel had special purpose registers. In fact, the only Intel CPU of that era that I programmed (8080) had only one register you could add to: A (the accumulator). But I must be missing your point anyway; why do we care how people classify the instructions?

Big endian vs little endian

Posted Mar 23, 2007 22:09 UTC (Fri) by BugLess (guest, #43869) [Link]

"I don't see any inconsistency. A telephone number isn't a number; it's just a digit sequence. There's only one sane way to write a telephone number: in the order in which you dial it."

How does that make -any- sense? If you're reading right-to-left, you'd read the numbers right-to-left and dial right-to-left.

Big endian vs little endian

Posted Mar 24, 2007 0:22 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

> "I don't see any inconsistency. A telephone number isn't a number; it's just a digit sequence. There's only one sane way to write a telephone number: in the order in which you dial it."
>
> How does that make -any- sense? If you're reading right-to-left, you'd read the numbers right-to-left and dial right-to-left.

I failed to notice that the quote to which I was responding is contradictory. My statement makes sense if you believe the first half of it ("telephone numbers are an exception to writing numbers little-endian"), but is nonsense if you believe the second half ("telephone numbers are left to right").

I confirmed at http://www2.ignatius.edu/faculty/turner/arabic/anumbers.htm that numbers in Arabic are written little-endian (least significant digit on the right).

Now the only question is in what direction telephone numbers are written. Common sense tells me the "left to right" from the original is a typo and is supposed to say "right to left." That way, it is big-endian, which is inconsistent with the way numbers are written, but is in the order of dialing. Which would make my objection correct: there's no real inconsistency because telephone numbers aren't numbers.

(While telephone numbers aren't numbers, I consider the digits to have significance in the same way numbers do; the digits that select among the largest geographical areas in the original geographical numbering system are the more significant.)
