Software and hardware obsolescence in the kernel

Posted Aug 31, 2020 6:34 UTC (Mon) by epa (subscriber, #39769)
In reply to: Software and hardware obsolescence in the kernel by pizza
Parent article: Software and hardware obsolescence in the kernel

Yikes! I guess that’s one concrete reason why left-to-right scripts are superior (apart from ink smudging). So in conclusion, numbers are *written* in big-endian direction in all common scripts, and probably read in that direction too, but this may be the opposite direction to the normal one.



Software and hardware obsolescence in the kernel

Posted Aug 31, 2020 17:11 UTC (Mon) by marcH (subscriber, #57642) [Link]

> Yikes!

This zig-zag doesn't feel like a very hard handwriting challenge, at least not unless you have to deal with crazy long numbers. For computers and terminals it's apparently a bit harder :-)

> I guess that’s one concrete reason why left-to-right scripts are superior (apart from ink smudging).

Ink smudging _and_ hiding what you just wrote. Look at how left-handed people tend to bend their wrist, even with a pencil.

My urban legend is that right-to-left languages were superior for... carving. Ten commandments and all that :-)

Software and hardware obsolescence in the kernel

Posted Aug 31, 2020 20:53 UTC (Mon) by nybble41 (subscriber, #55106) [Link] (7 responses)

> So in conclusion, numbers are *written* in big-endian direction in all common scripts, and probably read in that direction too, but this may be the opposite direction to the normal one.

They're read big-endian but written little-endian. Endianness is determined by the position (address) of each digit, not the temporal order in which the digits were written. The least-significant digit is located at the lowest address, closest to the beginning of the text. When "serialized" (read aloud or subvocalized) the numbers are converted into big-endian format, with the most significant digit spoken first.
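
A rough Python illustration of that distinction (the value and width are arbitrary): the same number puts different bytes at the lowest address depending on the endianness chosen, independent of the order in which anything was "written".

    value = 1234
    big    = value.to_bytes(2, "big")     # most significant byte at the lowest address
    little = value.to_bytes(2, "little")  # least significant byte at the lowest address
    print(list(big))     # [4, 210]
    print(list(little))  # [210, 4]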

Software and hardware obsolescence in the kernel

Posted Aug 31, 2020 21:22 UTC (Mon) by marcH (subscriber, #57642) [Link] (6 responses)

> The least-significant digit is located at the lowest address, closest to the beginning of the text.

No because the numbers are not part of the text, they're a left-to-right insert in a right-to-left text. There are effectively two "address spaces" embedded in one another (a.k.a. "zig-zag").

As explained here, Arabic speakers start with the most significant digit when they read and write, just like everyone else, and that is what should define what the "lowest address" is; otherwise non-Arabic speakers are misled into thinking Arabic speakers do something different, which is exactly what happened in this thread. Screen readers would be confused too.
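
For what it's worth, that is also how Unicode stores it, at least in the usual case: the digits of a number sit in the string in reading order, most significant first, even when the surrounding text runs right to left; only the visual layout zig-zags. A small Python sketch (the Arabic word, "year", is just an example):

    # "\u0633\u0646\u0629" is the Arabic word for "year"; the digits follow it.
    s = "\u0633\u0646\u0629 123"
    print([hex(ord(c)) for c in s])
    # ['0x633', '0x646', '0x629', '0x20', '0x31', '0x32', '0x33']
    # '1' (0x31), the most significant digit, comes first in the logical
    # (storage) order; the bidi algorithm only changes how it is displayed.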

Software and hardware obsolescence in the kernel

Posted Aug 31, 2020 22:22 UTC (Mon) by nybble41 (subscriber, #55106) [Link] (4 responses)

> No because the numbers are not part of the text, they're a left-to-right insert in a right-to-left text.

> As explained here, Arabic speakers start with the most significant digit when they read and write, just like everyone else, and that is what should define what the "lowest address" is…

It doesn't make sense to talk about big-endian or little-endian without a single, consistent frame of reference for the addressing which is independent of the content. In a context where you would write the elements of a list right-to-left, that means starting with the lowest address on the right and monotonically increasing toward the left. Only after having defined this addressing scheme can we venture to answer whether the components of the list are written big-endian or little-endian with respect to that surrounding context.

The digit you read or write first (temporally, not spatially) has nothing to do with endianness. The order in which you wrote the digits is not part of the written record. Someone coming along later can't even tell what order the digits were recorded in; it makes no difference to them whether you wrote the least- or most-significant digit first. All they can see is the order of the digits as they are laid out visually on the page.

In serial communication the standard is different. There it matters which digit is pronounced first, because the temporal order of the symbols is *all* you can observe.
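
Network byte order is the usual concrete example: the most significant byte is transmitted first, much as the most significant digit is spoken first. A minimal sketch (the value is arbitrary):

    import struct

    # "!" selects network (big-endian) byte order for serialization.
    wire = struct.pack("!I", 1234)
    print(wire.hex())   # 000004d2 -- most significant byte goes out first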

Software and hardware obsolescence in the kernel

Posted Sep 1, 2020 1:23 UTC (Tue) by marcH (subscriber, #57642) [Link] (3 responses)

> The order in which you wrote the digits is not part of the written record. Someone coming along later can't even tell what order the digits were recorded in

Of course they can, that's called "reading". I can hardly believe you wrote this...

Computers are not as smart though, so they may need some additional clues: https://www.w3.org/International/articles/inline-bidi-mar...
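
One layer of those clues is the bidirectional category Unicode assigns to every character, which the bidi algorithm works from; explicit direction marks and the inline markup in the linked W3C article handle the ambiguous cases. A quick look with Python's standard library (sample characters chosen arbitrarily):

    import unicodedata

    # 'L'  = left-to-right letter, 'AL' = right-to-left Arabic letter,
    # 'EN' = "European number" digit, laid out left-to-right even inside
    #        right-to-left text.
    for ch in ["a", "\u0627", "1"]:
        print(repr(ch), unicodedata.bidirectional(ch))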

Software and hardware obsolescence in the kernel

Posted Sep 1, 2020 15:01 UTC (Tue) by nybble41 (subscriber, #55106) [Link] (2 responses)

> Of course they can, that's called "reading". I can hardly believe you wrote this...

Are you being deliberately obtuse? If I sent you a picture of some digits I wrote left-to-right and some digits I wrote right-to-left, "reading" is not going to be enough to tell them apart. Here, I'll demonstrate:

1234
1234

To simulate physical writing I filled both lines with spaces and then overwrote the spaces with digits. One line was filled in left-to-right, and the other line right-to-left. Please tell me, which one was written left-to-right?

Software and hardware obsolescence in the kernel

Posted Sep 1, 2020 16:28 UTC (Tue) by marcH (subscriber, #57642) [Link] (1 responses)

> Are you being deliberately obtuse?

I thought you were.

Natural languages are all about context, that's why computers need Unicode bidi = a bit more help. This has been well discussed and explained in several other places in this thread (thanks to all those who did), but if you're not obtuse you are definitely not receptive. Never mind.

Software and hardware obsolescence in the kernel

Posted Sep 2, 2020 3:51 UTC (Wed) by nybble41 (subscriber, #55106) [Link]

> Natural languages are all about context, that's why computers need Unicode bidi = a bit more help.

Indeed, natural language is all about context. I get the feeling that we are talking about two completely different things and getting frustrated because the other person's answers make no sense in the context of what we each thought the conversation was about. I have been trying to describe how the terms "big-endian" or "little-endian" would apply to the *visual* layout of Arabic numerals *at rest*, for example as symbols written on paper—akin to the individual bytes of an integer field which is part of a larger structure stored in RAM or a file on disk. You seem to be interpreting my statements in the context of data which is *being* written, or read, or typed into a computer—a *serialization* of the data. Or perhaps you are referring to the particular way that the digits would be serialized as Unicode code points in a text file. Naturally my statements would seem like nonsense when taken that way; they were not intended for that context.

For data at rest there is no "time" component; all that matters is the relationships between the addresses or coordinates where each of the digits is stored. For digits written in a single line on paper this corresponds to linear physical coordinates; a digit may appear either to the left or the right of another symbol. In terms of the analogy to the storage of an array of multi-byte integers in computer memory, a system in which the most-significant digit of each number in a list of numbers is physically located on the same side as the first element of the list is "big-endian" and a system in which the least-significant digit is physically closest to the first element of the list is "little-endian". Any given serialization of the data (the process of reading or writing, for example) may employ a different "endianness" independent of the visual layout, and indeed that is the case for Arabic numerals: they are stored or rendered (on paper or other visual medium) as little-endian, but read, written, typed, or spoken aloud with the most significant digit first, in big-endian format.
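
A toy model of that framing in Python (the function and its layout rules are made up purely for illustration): address 0 is wherever the first element of the list sits, and addresses grow in the script's reading direction.

    def addresses_of(number, script):
        digits = str(number)        # visual order: most significant digit on the left
        if script == "rtl":
            digits = digits[::-1]   # addresses grow right-to-left in an RTL context
        return dict(enumerate(digits))

    print(addresses_of(1234, "ltr"))  # {0: '1', 1: '2', 2: '3', 3: '4'} -> big-endian
    print(addresses_of(1234, "rtl"))  # {0: '4', 1: '3', 2: '2', 3: '1'} -> little-endian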

Anyway, this debate is almost as pointless as the fictional conflict from which we get the terms "big-endian" and "little-endian".[1] I only replied in hopes of conveying that we are arguing *past* each other more than we are actually disagreeing about anything of substance.

[1] https://www.ling.upenn.edu/courses/Spring_2003/ling538/Le...

Software and hardware obsolescence in the kernel

Posted Aug 31, 2020 23:23 UTC (Mon) by kjpye (subscriber, #81527) [Link]

Actually, everybody reads numbers in a zig-zag fashion.

If you are reading a number like 8034175, you start "eight million", but you can't get past the "eight" until you have scanned from the right of the number to the left to determine the magnitude.

So a non-Arabic speaker reading left to right will encounter the number, skip to its end, scan back to determine the magnitude, then read the number left to right and continue reading towards the right.

An Arabic speaker will encounter the right-hand end of the number first, scan across it to determine the magnitude and then read the number left to right. Then they will jump back to the left of the number and continue reading towards the left.

The only real difference is in whether the jump occurs before reading the number (non-Arabic) or after (Arabic).
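
A small sketch of why that scan is unavoidable (hypothetical code; the place-name table only covers this example): the name of the leading digit's place depends on the total digit count, so the far end of the number has to be seen before the first word can be spoken.

    PLACE = {7: "million", 6: "hundred thousand", 5: "ten thousand",
             4: "thousand", 3: "hundred", 2: "ten", 1: "unit"}

    def leading_words(digits):
        n = len(digits)                   # the scan to the other end of the number
        return f"{digits[0]} {PLACE[n]}"  # only now is the first word known

    print(leading_words("8034175"))       # 8 million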

Software and hardware obsolescence in the kernel

Posted Sep 1, 2020 0:43 UTC (Tue) by notriddle (subscriber, #130608) [Link]

August 31, 2020 at 12:00 PM

Figure out the endianness of THAT notation!

