The future of 32-bit Linux
Posted Dec 6, 2020 7:22 UTC (Sun) by atomsymbol (guest, #109825)
Parent article: The future of 32-bit Linux
In my opinion, this is a highly unbalanced article in the sense that it does not mention any of the following items:
- Many mobile and desktop applications (perhaps 80% of them) do not require a 64-bit address space in order to function properly, for example:
- A Bash script that would require more than 1 GiB of memory is highly improbable, thus Bash (which is implemented in C) does not actually require 64-bit pointers in order to function properly
- Anybody running top/htop on their machine once in a while can clearly see that many apps (random examples: bluetoothd, nm-applet, xscreensaver) consume about 5-30 MiB of memory
- 32-bit pointers are more efficient than 64-bit pointers in terms of how big a working set size can fit into L1D/L2 caches
- The belief that time_t is somehow required to be linked to the size of the address space is false. That is, a 64-bit time_t and 32-bit pointers are perfectly compatible with each other.
- It is possible to have both 32-bit and 64-bit versions of the same library installed on a Linux-based system. In other words, it is very easy for an application developer to freely choose a 32-bit or a 64-bit address space depending on the particular needs of the application.
- Maybe if clang/gcc made the choice between the command-line options -m32 and -m64 mandatory, it would prompt programmers to give the question some thought
Posted Dec 6, 2020 9:39 UTC (Sun)
by jem (subscriber, #24231)
[Link] (6 responses)
> The belief that time_t is somehow required to be linked to the size of the address space is false.

I think everybody agrees on that. The link is more of a practical one: the size of a pointer and the size of the largest native integer type typically go hand in hand. 32-bit architectures have 32-bit registers, and both pointers and native arithmetic instructions are limited to 32 bits. This has led to C compilers defining the primitive integer data types as having a maximum size of 32 bits, which in turn is reflected in the ABIs. Furthermore, the time_t type is defined as an arithmetic type, i.e. it must support arithmetic using C's arithmetic operators.
Posted Dec 6, 2020 11:29 UTC (Sun)
by glaubitz (subscriber, #96452)
[Link] (5 responses)
Just deprecating old hardware because it's convenient leads to obscure bugs and reduced portability.
We found lots of such weird bugs in Debian while building the distribution for 20 different architectures.
Posted Dec 6, 2020 13:39 UTC (Sun)
by pizza (subscriber, #46)
[Link] (4 responses)
That only works if you have the source code to everything and/or infinite developer-hours at your disposal.
Binary compatibility going back decades is the primary reason Intel and Windows are still relevant. Arm is now in the same boat, with the stratospheric rise of Cortex-A and Cortex-M making binary blobs commercially (and routinely) viable to distribute.
Posted Dec 6, 2020 14:08 UTC (Sun)
by glaubitz (subscriber, #96452)
[Link] (1 responses)
We have the first requirement fulfilled in the open source world. And you don't really need infinite developer hours; it's not as bad as you claim.
> Binary compatibility going back decades is the primary reason Intel and Windows are still relevant.
Windows isn't really binary compatible. You can't run many Windows 95 binaries on Windows 10 anymore. Linux has, in fact, a much better binary compatibility allowing you to run software from the 90s provided that you have the necessary shared libraries as well.
Posted Dec 6, 2020 15:18 UTC (Sun)
by pizza (subscriber, #46)
[Link]
"Provided you have the necessary libraries" can be a pretty massive undertaking, and is one of the points of TFA. It's also not a new problem; commercial users of Linux have been complaining about this for over twenty years. Another point of TFA is that commercial users are beginning to discover just how much they've relied upon volunteers, and if they want something different, they're going to have to start paying for it. And perhaps the main point of TFA is that "solving this in software" is exactly what's being done at the kernel level, because that's a lot smaller/simpler/cheaper than getting every random binary recompiled/fixed for 64-bit systems, and new software built/tested/fixed for 32-bit systems.
Meanwhile, IIRC, it generally wasn't the Win95 applications that had problems, it's the *installers* for those Win95-era applications, which relied on 16-bit Win3.1-era binaries that don't work on x86_64.
Offhand I can only think of a handful of Win95-era applications/games that simply didn't work on more modern systems, one relied on Win95 not trapping an illegal memory access, and the others relied on reserved filenames like "CON". (I'm also deliberately skipping over applications that required specific hardware/device drivers that never got ported to later versions of Windows)
But there were plenty of applications that technically ran, but were only ever tested on Win95 and therefore assumed they had full administrator access and behaved accordingly, writing files anywhere they wanted and other questionable assumptions. I should point out that this was in violation of MS's guidelines. WinXP SP2 further tightened down many other areas that were critical for changing Windows' reputation as security swiss cheese, but this "broke binary compatibility" and pissed off a lot of users.
But with regard to the latter: random decades-old Linux binaries will find themselves in the same boat when plonked down into modern distributions; assumptions they made about how things are laid out might not hold, and if they directly interfaced with kernel interfaces (e.g. OSS, specific /proc files, V4L1, or whatever) they might not get very far.
(In more modern times, the general attitude towards this sort of backwards compatibility has shifted towards "Just run the old system in a VM and firewall the crap out of it" -- That's what MS finally did with Win10, including XP in a sandboxed VM)
Posted Dec 12, 2020 10:15 UTC (Sat)
by flussence (guest, #85566)
[Link] (1 responses)
Java already does this automatically.
Posted Dec 12, 2020 14:49 UTC (Sat)
by pizza (subscriber, #46)
[Link]
That's great, but we're not talking about Java.
(Or the fact that Java shops inevitably end up becoming 3rd-world body farms)
Posted Dec 6, 2020 13:36 UTC (Sun)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted Dec 16, 2020 20:23 UTC (Wed)
by atomsymbol (guest, #109825)
[Link]
Posted Dec 6, 2020 13:49 UTC (Sun)
by arnd (subscriber, #8866)
[Link] (2 responses)
For the performance benefit of running 32-bit code, the tradeoff is less clear than you describe it. While there are obvious benefits to running smaller code because of cache footprint, TLB usage and memory interface limits, the two most common architectures also benefit a lot from running 64-bit code: on x86, 32-bit mode is limited to eight general-purpose registers compared to 16 on 64-bit, and on Arm, you go from 16 to 32 registers as well as a more modern instruction set. Unlike other traditional architectures (powerpc, sparc, mips, parisc, s390), this means you only get the clear performance benefits when you run the 64-bit instruction set with an ILP32 toolchain (x32 in the case of x86). As I described, these have largely failed to gain traction despite being faster, because of the cost of maintaining a third set of binaries in addition to the ones that are already needed.
Even the mixed 32/64 environments are on their way out in common distros. This used to be common in Android and ChromeOS and well supported in Red Hat, SUSE, Ubuntu and others, but these days the binary distros either only offer full builds for 64-bit or, like Android, always pick the 64-bit binaries when both are available. Debian is the most notable exception of course, allowing you to mix and match binaries from three different targets (arm64/armhf/armel or x86_64/i386/x32), plus more if you count qemu-user. A more common and more practical way of mixing nowadays is to use containers, e.g. running a 32-bit Debian or Adelie docker image on an Arm64 or x86-64 host, as this avoids tricky library dependencies. Note also that a mixed root file system is less portable on Arm as it does not work on modern aarch64-only processors, and it adds overhead for keeping two sets of shared libraries on disk and in memory, but that is a similar overhead to any container setup.
Posted Dec 16, 2020 21:20 UTC (Wed)
by atomsymbol (guest, #109825)
[Link]
Posted Jan 1, 2021 11:14 UTC (Fri)
by wtarreau (subscriber, #51152)
[Link]
On ARM, I consistently found that 32-bit thumb2 code runs 20% faster than 64-bit armv8 code for compilation jobs, leading me to build my toolchains in 32-bit and run them on a 64-bit system in compat mode as well.
My feeling is that compat mode is another good solution that would allow the kernel to abandon older 32-bit support (at least for the largest setups, which no longer make sense).
Yes, we are reading the same article, but the point is that my viewpoint greatly differs from the article's. The article treats 32-bit pointers on 64-bit machines as a last resort: use 32-bit pointers only if you are forced to. My viewpoint is the opposite: use 32-bit pointers on 64-bit machines whenever it makes sense, and use 64-bit pointers only if you are forced to.
-atom
I agree with most parts of your comment. However, just some notes: