
Another round of speculative-execution vulnerabilities

Posted Aug 12, 2023 16:51 UTC (Sat) by farnz (subscriber, #17727)
In reply to: Another round of speculative-execution vulnerabilities by atnot
Parent article: Another round of speculative-execution vulnerabilities

Itanium failed to outperform AMD64 on hand-coded assembly as well as on C code. It wasn't killed by the C model; it was killed by a failure to deliver greater performance than other CPUs. VLIW CPUs like Transmeta failed because VLIW code is inherently low-density in memory, and our current bottleneck for performance tends to be L1 cache size. Mill has never reached a point where hand-written code in simulation outperforms hand-written code for AMD64 given the same simulated resources as AMD64. EDGE is an ongoing research project, and may (or may not) prove worthwhile - there's certainly not been an effort to build a good EDGE CPU that can be compared to something "C-friendly" like RISC-V.

Similar failures apply to Lisp Machines. While they had dedicated hardware to make running Lisp code faster, they lost out because RISC CPUs like SPARC and MIPS were even faster at running Lisp code for a given energy input than Lisp Machines were. Again, not about programming model, but about the Lisp Machines being worse hardware for running Lisp than MIPS or SPARC.

In terms of competing models of computation that have actually made it to retail sale, FPGAs are a commercial success, but are not programmed like CPUs, because they're defined as a sea of interconnected logic gates, and you are better off exploiting that via a Hardware Description Language than via something like C, FORTRAN or COBOL. GPUs are a commercial success; individual threads on a GPU are similar to a CPU with SIMD, with many threads per core (8 on Intel, more on others), and a hardware thread scheduler that allows you to have a pool of cores sharing thousands or even hundreds of thousands of threads.

None of this is about the "C model"; underpinning all of the noise is that humans struggle to coordinate concurrent logic in their heads, and prefer to think about a small number of coordination points (locks, message channels, rendezvous points, whatever) with a single thread of execution between those points. OoOE with speculative execution is one of the two local minima we've found for such a mental model of programming, and supports the case where a single thread of logic is the bottleneck. The other model that works well is the workgroup model used by GPU programming, where something distributes a very large number of input values to a pool of workers, and lets the workers build a large number of output values. Between the input and output values, there's little, if any, coordination between workers.
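As a rough illustration of that workgroup shape (a sketch only, using a Python thread pool as a stand-in for the GPU's hardware scheduler; the names kernel and run_workgroup are made up for this example):

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(x: int) -> int:
    """Per-item work: depends only on its own input, touches no shared state."""
    return x * x + 1


def run_workgroup(inputs, workers: int = 4) -> list:
    # The executor stands in for the scheduler: it hands each input to
    # whichever worker is free. The workers never talk to each other;
    # the only coordination points are "inputs go in" and "outputs come out".
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, whichever worker ran them.
        return list(pool.map(kernel, inputs))


outputs = run_workgroup(range(8))
```

The programmer still only reasons about a single thread of logic (the body of kernel), which is the point being made above.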

The 6502 is not supported upstream in any of the big C compilers, but neither are many other CPUs of the same vintage. The Z80 is not supported in any of the big C compilers, nor is the 6809, for example, and both of those were big-selling CPUs at the time the 6502 was current; the Z80 is also a lot friendlier to C than the 6502, since it lets you put the stack anywhere in memory, whereas the 6502 limits you to a single 256-byte stack fixed in page 1. I've never personally programmed a 6809 system, but I believe that it's also a lot more C-friendly than the 6502.
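To make the 6502 stack limit concrete, here is a toy model (a sketch, not an emulator; the class name Stack6502 is invented for this example, and it assumes startup code has set the stack pointer to 0xFF). The stack pointer is 8 bits wide and always indexes into page 1, so it silently wraps after 256 bytes:

```python
class Stack6502:
    """Toy model of the 6502 hardware stack: 8-bit pointer, fixed in page 1."""

    BASE = 0x0100  # the stack always lives at 0x0100-0x01FF

    def __init__(self) -> None:
        self.mem = bytearray(0x10000)  # 64 KiB address space
        self.sp = 0xFF                 # assumption: software initialised SP to 0xFF

    def push(self, value: int) -> None:
        # Store at page 1 + SP, then decrement SP modulo 256.
        self.mem[self.BASE + self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF

    def pull(self) -> int:
        # Increment SP modulo 256, then load from page 1 + SP.
        self.sp = (self.sp + 1) & 0xFF
        return self.mem[self.BASE + self.sp]


stack = Stack6502()
stack.push(0xAA)            # lands at 0x01FF
for _ in range(256):
    stack.push(0x55)        # the 256th of these wraps round onto 0x01FF
clobbered = stack.mem[0x01FF]
```

After the loop, the first byte pushed has been overwritten. A C compiler that wants ordinary stack frames for locals and arguments therefore has to build a separate software stack elsewhere in memory, which is part of what makes the 6502 awkward for C.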

Fundamentally, the thing that has killed every alternative to date is that the surviving processor types are simply faster for commercially significant problems than any competitor was, even with alternative programming models. This applies to VLIW, and to EPIC, and to Lisp Machines.



Another round of speculative-execution vulnerabilities

Posted Aug 14, 2023 20:08 UTC (Mon) by mtaht (guest, #11087)

I remain fond of the Mill set of ideas for many reasons, but was not aware of any benchmarks of the compiler, or public sim information? I have not kept track.

Weirdly enough, I do not care about IPC; what I care about is really rapid context and priv switching, something that unwinding speculation on the TLB flush for Spectre really impacted. I am tired of building processors that can only go fast in a straight line. And like everyone here, tired of all these vulnerabilities.

The Mill held the promise of context or priv switching in 3 clocks. The implicit zero feature and byte-level protections seemed like a win. But it has been a long 10+ years since that design was announced. Have there been any updates?

Another round of speculative-execution vulnerabilities

Posted Aug 14, 2023 21:52 UTC (Mon) by mathstuf (subscriber, #69389)

I recently perused the forum and it seems that they're in another funding round and looking to go from startup to a proper company (salaries, etc.). Technical progress (well, at least publicizing it) is blocked on that. To be fair, they are apparently in it for the money (based on the Q&A in at least one of the talks that have been released).

Another round of speculative-execution vulnerabilities

Posted Aug 17, 2023 14:43 UTC (Thu) by farnz (subscriber, #17727)

It's a while since I saw the information (around 10 years), so I don't have links to hand, and it was investor-targeted. They seemed to be making the same mistake as the Itanium designers, though - they compared hand-optimized code on their Mill simulator to GCC output on a then-current Intel chip (Haswell, IIRC), showing that simulated Mill was better than GCC output on Haswell. The claim was that compiler improvements needed for Mill would bring Mill's performance on compiled code ahead of Haswell's performance; but it failed to take into account that, with a lot of human effort and using GCC's output as a starting point, I could get better performance from Haswell with hand-optimized code than they got with GCC output.

I am inherently sceptical of "compiler improvement" claims that will benefit one architecture and not another; while I'll accept that the improvement is not evenly distributed, until Mill Computing can show that their architecture with their compiler can outperform Intel, AMD, Apple, ARM or other cores with a modern production-quality (e.g. GCC, LLVM) compiler for the same language, I will tend towards the assumption that anything that they improve in the compiler will also benefit other architectures.

This holds especially true for compiler improvements around scheduling, which is what Mill depends upon, and what Itanium partially needed to beat OoOE - improvements to scheduling of instructions benefit OoOE by making the static schedule closer to optimal, leaving the OoOE engine to deal with the dynamic issues only, and not statically predictable hazards.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds