
Software and hardware obsolescence in the kernel

Posted Aug 30, 2020 4:15 UTC (Sun) by marcH (subscriber, #57642)
In reply to: Software and hardware obsolescence in the kernel by willy
Parent article: Software and hardware obsolescence in the kernel

> No individual machine has more than 16TiB of memory, but we want to have a unique address for each byte across the entire cluster.

I'm not sure it's wise to waste silicon and bandwidth with local address lines to address remote memory. RDMA is not the only option.

> So we're going to need 128 bit CPUs sooner rather than later.

The era of custom silicon for HPC is long gone.



Software and hardware obsolescence in the kernel

Posted Aug 30, 2020 14:54 UTC (Sun) by willy (subscriber, #9762)

There's really no such thing as address lines any more. Everything uses packets of data on high speed serial lines. Intel's QPI supports 46 bits of physical address space. I assume UPI supports more. PCIe supports 64 bits.
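For scale, the address widths mentioned here can be turned into sizes with a little arithmetic. The following is only a sketch; the helper names (`bits_needed`, `space_in_tib`, `max_nodes`) are made up for illustration and do not come from any kernel or vendor API.

```c
#include <stdint.h>

/* Hypothetical helper: address bits needed to give each of `bytes`
 * bytes a unique address, i.e. ceil(log2(bytes)). Requires bytes >= 1. */
static unsigned bits_needed(uint64_t bytes)
{
    unsigned bits = 0;

    /* bytes - 1 is the highest address; count its significant bits. */
    for (uint64_t max_addr = bytes - 1; max_addr != 0; max_addr >>= 1)
        bits++;
    return bits;
}

/* Hypothetical helper: size in TiB of an address space that is `bits`
 * wide. Only valid for 40 <= bits < 104 so the shift stays in range. */
static uint64_t space_in_tib(unsigned bits)
{
    return UINT64_C(1) << (bits - 40);
}

/* Hypothetical helper: how many nodes, each needing `node_bits` of
 * address, fit in a flat `total_bits` address space.
 * Requires total_bits - node_bits < 64. */
static uint64_t max_nodes(unsigned total_bits, unsigned node_bits)
{
    return UINT64_C(1) << (total_bits - node_bits);
}
```

Under these definitions, 46 physical address bits cover 64 TiB, a 16 TiB machine needs 44 bits, and a flat 64-bit space has room for 2^20 (about a million) such machines.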

But I'm not talking about supporting lots of physical address bits. I'm talking about supporting:

1. Files larger than 16 EiB
2. Storage devices larger than 16 EiB
3. Virtual address spaces larger than 16 EiB

We can hack around the missing 128-bit data types for a while. We did it on 32-bit systems for a decade before 64-bit systems were so prevalent that we stopped caring about inefficient 32-bit systems.
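As a rough illustration of that kind of workaround, a 128-bit offset can be carried as two 64-bit halves, much as 64-bit values were split into two words on 32-bit systems. This is a minimal sketch, not how the kernel does or would do it; `struct u128`, `u128_add` and `u128_cmp` are hypothetical names. (GCC and Clang also offer `unsigned __int128` as an extension on 64-bit targets, which avoids the manual carry.)

```c
#include <stdint.h>

/* Hypothetical emulated 128-bit unsigned integer: two 64-bit halves. */
struct u128 {
    uint64_t hi;
    uint64_t lo;
};

/* Add two emulated 128-bit values, wrapping modulo 2^128. */
static struct u128 u128_add(struct u128 a, struct u128 b)
{
    struct u128 r;

    r.lo = a.lo + b.lo;                 /* wraps modulo 2^64 */
    r.hi = a.hi + b.hi + (r.lo < a.lo); /* carry out of the low half */
    return r;
}

/* Three-way compare: returns -1, 0 or 1 for a < b, a == b, a > b. */
static int u128_cmp(struct u128 a, struct u128 b)
{
    if (a.hi != b.hi)
        return a.hi < b.hi ? -1 : 1;
    if (a.lo != b.lo)
        return a.lo < b.lo ? -1 : 1;
    return 0;
}
```

Since 16 EiB is 2^64 bytes, the first offset that no longer fits in a 64-bit type is the one where `hi` becomes 1 and `lo` wraps to 0. Every operation on such a value costs several instructions instead of one, which is the inefficiency being referred to.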

The era of custom silicon for HPC is very much still with us. Fujitsu's A64FX and Sunway's SW26010 are in 2 of the top 5 supercomputers. And HPC is far from the only user of large virtual addresses.

Software and hardware obsolescence in the kernel

Posted Aug 30, 2020 15:23 UTC (Sun) by Paf (guest, #91811)

Yeah, speaking as someone who works in HPC storage, file systems are pushing towards an exabyte in total size *now* (hundreds of petabytes), which means 16 exabytes is only a few years away, and single files in that range are on the decadal horizon for sure.

Also, the era of custom silicon for HPC is ... complex. It’s *mostly* over; the machines with it tend to be exceptions. Fujitsu is the last major vendor doing its own CPUs for HPC, and the builders of the Chinese machine noted above can’t buy top-class CPUs (plus they want to design their own to close that gap).

IBM, the former Cray and SGI (now both HPE), Bull in Europe... none of them have done a full-up HPC CPU in quite a while. Cray is pushing towards 20 years since its last one, and SGI, I think, is even further out from its last MIPS. IBM comes the closest, but its Cell and big Power chips have always been intended to have significant other markets.

Software and hardware obsolescence in the kernel

Posted Aug 30, 2020 23:09 UTC (Sun) by marcH (subscriber, #57642)

> There's really no such thing as address lines any more. Everything uses packets of data on high speed serial lines.

That doesn't really change the problem: you're still forcing all local memory addresses to pay the additional price of a significant number of extra, constant zeroes only for the programming convenience of an addressing scheme unified with non-local memories that have totally different performance characteristics.
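To put a number on the "constant zeroes", here is a small sketch of the arithmetic behind this objection. It assumes, purely for illustration, that every address is carried at full width; real interconnects may encode addresses differently. The function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: bits of an address field that are always zero
 * when only `used_bits` are needed to reach all local memory. */
static unsigned constant_zero_bits(unsigned addr_bits, unsigned used_bits)
{
    return addr_bits > used_bits ? addr_bits - used_bits : 0;
}

/* Same quantity as a whole-number percentage of the address field,
 * rounded down. Requires addr_bits > 0. */
static unsigned constant_zero_percent(unsigned addr_bits, unsigned used_bits)
{
    return constant_zero_bits(addr_bits, used_bits) * 100 / addr_bits;
}
```

With the 16 TiB (44-bit) local memory from the start of the thread, a 64-bit address carries 20 constant zero bits, while a 128-bit address would carry 84, roughly two thirds of the field.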

Hardware engineers "wasting" bandwidth and other resources in their design for software convenience? That doesn't sound very likely.

