The "Retbleed" speculative execution vulnerabilities [LWN.net]

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 17:29 UTC (Tue) by tome (subscriber, #3171) [Link] (3 responses)

> mitigating Retbleed has unfortunately turned out to be expensive: we have measured between 14% and 39% overhead with the AMD and Intel patches respectively.

Ouch. It especially sucks to run Intel processors now.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 23:01 UTC (Tue) by fulke (guest, #140430) [Link]

Not Intel but both old processors

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 16:09 UTC (Wed) by developer122 (guest, #152928) [Link] (1 responses)

How about earlier AMD processors? for example the pre-zen 15h and 16h, or even 10h?

I remember in 2018 (in the original round of patches) skylake would fall back to the branch target buffer when the return stack buffer underflowed. Skylake was forced to use IBRS instead of retpolines for a while until the return-stack stuffing sequence was perfected.

So, do any of the pre-zen architectures speculate on returns the same way?

The "Retbleed" speculative execution vulnerabilities

Posted Jul 14, 2022 18:39 UTC (Thu) by peri (guest, #159703) [Link]

It would be irresponsible of me to speculate on speculation for your vendor but: I expect this to result in a lot of additional patches/emergency updates in lots of processors and devices beyond the original announcement.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 18:38 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

It has been 0 days since the last Spectre vulnerability was disclosed.

See you all in a few months or so, when we get to reset the counter again.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 18:39 UTC (Tue) by NHO (subscriber, #104320) [Link] (1 responses)

Could it still be turned off by mitigations=off in kernel command line?

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 18:46 UTC (Tue) by JoeBuck (subscriber, #2330) [Link]

+	retbleed=	[X86] Control mitigation of RETBleed (Arbitrary
+			Speculative Code Execution with Return Instructions)
+			vulnerability.
+
+			off          - no mitigation
+			auto         - automatically select a migitation
+			auto,nosmt   - automatically select a mitigation,
+				       disabling SMT if necessary for
+				       the full mitigation (only on Zen1
+				       and older without STIBP).
+			ibpb	     - mitigate short speculation windows on
+				       basic block boundaries too. Safe, highest
+				       perf impact.
+			unret        - force enable untrained return thunks,
+				       only effective on AMD f15h-f17h
+				       based systems.
+			unret,nosmt  - like unret, will disable SMT when STIBP
+			               is not available.
+
+			Selecting 'auto' will choose a mitigation method at run
+			time according to the CPU.
+
+			Not specifying this option is equivalent to retbleed=auto.
+

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 19:46 UTC (Tue) by iabervon (subscriber, #722) [Link] (2 responses)

I wonder if control flow integrity would help to mitigate this. Does it notice that the integrity check don't pass, or does it just speculate that they're going to pass?

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 16:04 UTC (Wed) by developer122 (guest, #152928) [Link] (1 responses)

Probably the latter. It's speculation, after all.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 18:02 UTC (Wed) by iabervon (subscriber, #722) [Link]

True, but I could imagine the CPU reacting to having a shadow stack by predicting that nothing will alter the shadow stack, and knowing that the shadow stack is a good way of refilling your return prediction buffer if it had overflowed previously.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 20:16 UTC (Tue) by flussence (guest, #85566) [Link] (5 responses)

It's obvious this approach isn't working.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 23:23 UTC (Tue) by nix (subscriber, #2304) [Link] (4 responses)

Quite. I mean I have an AMD box for which mitigation means turning SMT off entirely *and* using untrained return thunks (another big performance loss)... and I bought it new *last year*. Half a decade after this stuff was first spotted and still there are holes in nearly new CPUs big enough that if you turned on everything necessary to mitigate them you might as well just go back to a machine from 2010: the speed would be more or less the same.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 22:30 UTC (Wed) by flussence (guest, #85566) [Link] (3 responses)

My decision to hold on to a bunch of first-gen Atom hardware (unaffected by any of this) is looking better with every passing logo-and-website...

The "Retbleed" speculative execution vulnerabilities

Posted Jul 14, 2022 4:05 UTC (Thu) by scientes (guest, #83068) [Link] (2 responses)

Is Atom really an in-order CPU?

The "Retbleed" speculative execution vulnerabilities

Posted Jul 16, 2022 6:56 UTC (Sat) by flussence (guest, #85566) [Link] (1 responses)

The first few models are in-order; it's Pentium 1 logic on 2010 silicon with hyperthreading slapped on to compensate for the complete lack of superscalar ability. `lscpu` does in fact print "Not affected" down the board for them and the Spectre test programs fail too.

Later Atoms are more or less Celerons, they changed focus to network appliances after the initial goal of "sandbag ARM out of the mainstream market" failed.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 19, 2022 17:02 UTC (Tue) by anton (subscriber, #25547) [Link]

Bonnell (first-generation Atom) is a two-wide in-order CPU, like the P5 (first Pentium). But Bonnell has 16-19 pipeline stages (according to wikichip), while P5 has 5. Silvermont (second-generation Atom) is a two-wide OoO CPU. Celeron is a marketing name used for many different microarchitectures; some Silvermont chips were sold as Celerons, but I guess you mean something else.

performance

Posted Jul 12, 2022 21:13 UTC (Tue) by yodermk (subscriber, #3803) [Link] (2 responses)

Does this mean basically all compute-intensive operations on our computers will be that much slower? Or just some stuff in kernel code?

If the former, there's going to be a lot of mitigation disabling. I can't really see paying that price on a single-user computer where, if someone else got access, I'd be screwed anyway.

performance

Posted Jul 12, 2022 21:30 UTC (Tue) by deepfire (guest, #26138) [Link] (1 responses)

Do you think you[*] can avoid running WASM code in your browser?

Are we being collectively marched down a death valley?

--
*: substitute the average user here.

performance

Posted Jul 13, 2022 6:46 UTC (Wed) by davmac (guest, #114522) [Link]

I'm not sure this particular vulnerability can be exploited via WASM. Part of the exploit requires training the indirect branch prediction buffers by jumping to kernel addresses. While that's possible in user space (albeit resulting in a fault) I don't see how you could manage such a thing via WASM.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 22:14 UTC (Tue) by ndesaulniers (subscriber, #110768) [Link]

https://usercontent.irccloud-cdn.com/file/SslYH25a/mifune...
Spoiler: Capt Mifune doesn't survive that scene.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 22:24 UTC (Tue) by Sesse (subscriber, #53779) [Link] (4 responses)

So, these slowdowns… are they with benchmarks that just run syscalls over and over again? Or with some more realistic desktop/server load?

It seems Zen 3 is unaffected, perhaps?

The "Retbleed" speculative execution vulnerabilities

Posted Jul 12, 2022 22:44 UTC (Tue) by Smon (guest, #104795) [Link] (2 responses)

According to the security researchers, Retbleed only impacts AMD Zen 1/1+/2 and Intel Core 6th through 8th Gen processors.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 0:05 UTC (Wed) by JoeBuck (subscriber, #2330) [Link] (1 responses)

I think you read it wrong. They say that those were the processors that they verified were vulnerable. They did not say that other Intel and AMD processors are not affected.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 5:19 UTC (Wed) by jukivili (subscriber, #60126) [Link]

In their paper they also tested Intel 9th and 12th gen and AMD Zen3 and found those not to be affected.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 6:24 UTC (Wed) by Otus (subscriber, #67685) [Link]

According to the paper the performance numbers are for unixbench: https://github.com/kdlucas/byte-unixbench

"Retbleed" even with REP prefix?

Posted Jul 12, 2022 22:29 UTC (Tue) by jreiser (subscriber, #11027) [Link]

Sometimes I see the sequence REP; RET advocated as a technique to speed execution on x86 because REP stops instruction prefetch. Does this sequence have any interaction with Retbleed?

Fuck.

Posted Jul 12, 2022 23:03 UTC (Tue) by Subsentient (guest, #142918) [Link] (2 responses)

Fuck. More performance pain. I already own systems that have gotten noticeably slower since the original Meltdown/Spectre stuff.

If this one doesn't affect my ancient Intel CPUs like Core 2 Quad and Nehalem i5 in my Thinkpad T410, that will be something.
Retbleed does, apparently, hit one of my personal AMD boxes with a Ryzen 3 2200G.

I almost wish they didn't find these vulnerabilities at all. The cost is proving to be quite high. Maybe ignorance is bliss here? I guess if they didn't disclose them, three letter agencies would be using them instead.

Fork

Posted Jul 13, 2022 12:46 UTC (Wed) by eru (subscriber, #2753) [Link] (1 responses)

I wonder at which point the workarounds for speculative execution vulnerabilities slow the CPU more than what speculative execution speeds it up?
Also, could these workarounds be a fool's errand, because the vulnerabilities drop out of how speculative execution works by design, so they will always be present in some form if the CPU speculates?

Fork

Posted Jul 19, 2022 17:10 UTC (Tue) by anton (subscriber, #25547) [Link]

No, these vulnerabilities come from treating microarchitectural state as being irrelevant.

The CPU architects know very well how to implement speculation properly: they do it for architectural state; if a speculative register change or memory write turns out to be wrong, it is never committed to permanent state.

For microarchitectural state (e.g., caches) they thought (and maybe still do) that they don't need to go to the lengths that they do for architectural state and can, e.g., permanently load a speculatively loaded cache line into the L1 cache, thus providing a side channel to attackers.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 4:33 UTC (Wed) by amarao (guest, #87073) [Link] (9 responses)

Do they have a working exploit? Before I see one, sorry, I'm turning this thing off

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 5:07 UTC (Wed) by epa (subscriber, #39769) [Link] (8 responses)

The Retbleed site includes a video showing a leak of kernel memory.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 9:24 UTC (Wed) by amarao (guest, #87073) [Link] (7 responses)

I understand they have a demo. There are tons of demos for very crazy things (like sniffing private keys via mic and capacitor noise for CPU power unit). Before I gave up 75% of the CPU price (simpler CPU is about 39% slower than top CPU), I want to see working exploit. I believe there going to be a lot of troubles in real-world, and I will conciser such crazy sacrifice only if I can see this exploit is actually doing something which I consider unacceptable.

39% is extremely high number to loose. In reverse, not having this 'fix' is having 2.5 acceleration. This is number I got by switching 10y.o. Core 2 Duo to a shiny new Ryzen CPU, and is really hurtful.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 13:00 UTC (Wed) by birdie (guest, #114905) [Link] (2 responses)

To this date there have been no malware [detected] in the wild which uses any of Transient execution CPU vulnerabilities.

It doesn't mean it doesn't exist though but hackers don't seem to be interested. Application/system level vulnerabilities are easier to detect and exploit.

Lastly, all the Transient execution CPU vulnerabilities require you to be able to run arbitrary code on a remote system which means you need a way to exploit it first.

All these vulnerabilities are primarily the headache for cloud providers where they share the same CPU/RAM among a large number of clients - those guys need to enable all the mitigations or risk clients sniffing each others VMs secrets and data.

Home/Enterprise users? They may as well run with `mitigations=off`.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 15:59 UTC (Wed) by pclouds (guest, #76590) [Link] (1 responses)

> to be able to run arbitrary code on a remote system

Isn't that door wide open (ok not "wide") with JIT in browser already? I'm aware there was some mitigation to prevent javascript from exploit Spectre the last time, but would that still work with Retbleed?

The "Retbleed" speculative execution vulnerabilities

Posted Jul 14, 2022 9:27 UTC (Thu) by roc (subscriber, #30627) [Link]

Browser JS mitigations against Spectre are designed to stop Javascript using Spectre vulnerabilities to sniff the contents of other sites running in the same rendering process. Browsers supporting full site isolation (Firefox and Chrome on desktop) don't apply those mitigations AFAIK, because by design they have no more than one site's content in a given rendering process.

An uncompromised browser rendering process running hostile JS would find it very difficult, perhaps impossible, to exploit this Retbleed vulnerability. For example, non-buggy JS engine is not going to try to branch into kernel space so you won't be able to poison BTBs that way.

OTOH compromising a browser rendering process (e.g. by exploiting a bug in the JS engine) and then sniffing kernel memory from inside the browser renderer sandbox is a definite possibility.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 14:23 UTC (Wed) by birdie (guest, #114905) [Link] (1 responses)

Why do I even bother to post anything on this website and seeing "Your comment has been queued for moderation" each time I post a comment? Not a single website on this planet premoderates my comments. It looks extremely insulting and belittling. And it's not like my profile is new. I've had it for years now without a single SPAM comment. I hate being treating like a piece of sh it, so over and out, LWN. Logging out until something piques my interest again.

And please don't BS me with "We are a small website, etc. etc. etc.":

1. You have a very small number of people commenting on your articles
2. I've not seen any SPAM attacks on you ever
3. I single-handedly moderate a website where hundreds of people leave comments - I've never enabled comments premoderation.

Moderation

Posted Jul 13, 2022 14:24 UTC (Wed) by corbet (editor, #1) [Link]

In response to your complaints:

You don't see spam on the site because we go out of our way to prevent that from happening. It is certainly not indicative of a lack of spammers.
Spamming is a reason for being put on moderation, but so is personal attacks against others.
Please look to your own posting history to understand why your account was placed on moderation.

Anybody who reads LWN for any period of time knows that we are not inclined to over-moderate; indeed, we are frequently criticized for not moderating enough. I do not think that we are overstepping here.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 15:11 UTC (Wed) by Paf (subscriber, #91811) [Link] (1 responses)

I mean, the demo is a program running unprivileged on the local system.

I understand leaving the mitigations off but this isn’t some crazy “we used mic power levels” thing.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 14, 2022 8:51 UTC (Thu) by amarao (guest, #87073) [Link]

It's running at very specific configuration. We don't know how many hours had they tinkered system and execution parameters and how many attempts were before.

The main difference between demo and exploit is 'reproducibility'. Exploit shows that it can, indeed, on a given (arbitrary or almost arbitrary) system do harm. And this is what I (as operator/sysadmin/devops/sre/younameit) care. If some research may set up an amazing quantum entanglement system which steals bytes with spooky action in a distance, that's cool, but to mitigate it I want to see it doing this on my (sacrificial) server with a usual workload.

If network interrupts are coming, and few system processes active, is it still viable, or it required pristine conditions to run? If it's a threat for virtualization, what happens if we have few guests with active network interfaces, doing usual things (like ignoring ARP/BUM junk on the net)?

If their exploit can work through this, this is a big story. If not, well, it's a nice paper for future research to reference to, nothing more.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 16:35 UTC (Wed) by wtarreau (subscriber, #51152) [Link] (5 responses)

I hope at some point people will understand that there really are two different needs of hardware machines:
- the fast ones to do fast stuff with trusted users (development, games, appliances, etc)
- the slow ones to do slow stuff with tons of untrusted users (including VMs, clouds etc)

And when the workload is critical, just don't share your resources.

You simply can't have it both fast and safe with untrusted users. Everything is observable. We're still sharing resources and as soon as you suffer from someone else's activity, in a way or another you observe it. Sometimes it's so much finegrained that you can observe in amazing details.

I'm wondering what the TOP500 of HPC machines would look like if they enabled all those ridiculous mitigations! In addition they're a pain for software engineers who constantly scratch their heads trying to do better for certain use cases, and once used differently that's a total disaster and mitigations are complicated.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 13, 2022 22:47 UTC (Wed) by JoeBuck (subscriber, #2330) [Link] (1 responses)

Alternatively, perhaps for an extra fee cloud providers could sell customers a flow that guarantees that they won't share a physical processor with anyone else, so any processes that could observe cache behavior or other side channels would belong to the same customer (and perhaps they even want those kinds of observations for performance monitoring). Then mitigations wouldn't be needed and processors could speculate freely. It would be up the customer not to run untrusted code in this mode.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 14, 2022 9:29 UTC (Thu) by roc (subscriber, #30627) [Link]

You can rent bare-metal boxes in AWS and elsewhere.

It's also pretty easy to see which AWS instances guarantee you an entire socket --- it's the ones where AWS lets you use the PMU. They don't want you using the PMU to sniff other customers on the same socket via side channels. However, I don't think AWS *guarantees* that no-one else is on the socket.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 14, 2022 15:19 UTC (Thu) by dmoulding (subscriber, #95171) [Link]

Isn't it more than just that though? I think cloud providers, and users of those clouds, aren't the only ones potentially impacted by these kinds of vulnerabilities. That's because arbitrary code execution vulnerabilities in other software pop up regularly. Those vulnerabilities, when combined with one of these hardware vulnerabilities, can turn a "I can execute arbitrary code as an unprivileged user" exploit into an "I can execute arbitrary code as root" exploit.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 14, 2022 18:39 UTC (Thu) by peri (guest, #159703) [Link]

You can ask their kernel teams in a few weeks. SPECTRE caused tons of rework and there are so many different vendors/distributors that have small cores with speculative execution enabled for kiosks/point of sale/transaction processing systems. If your low power CPU is running an intel-like architecture x64 without a lot of specific mitigations, expect to patch a ton of things and rack/rerack a lot of devices.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 15, 2022 12:36 UTC (Fri) by eduperez (guest, #11232) [Link]

The tons of JavaScript code that my browser executes each time I visit a website... in which category do they fall?

The "Retbleed" speculative execution vulnerabilities

Posted Jul 18, 2022 15:51 UTC (Mon) by paulj (subscriber, #341) [Link] (8 responses)

More and more of the speculative execution logic in CPUs, designed to extract instruction-level parallelism from code, is turning out to have security issues. We get updates that basically disable chunks of logic in existing CPU designs; and more logic in new designs to try workaround the security issues - all with performance costs (outright and perf/Watt - especially the first approach).

The CPU design people who, in the 90s, argued to move away from ever more complex, OOO, super-scalar architectures might have had a point? They argued for CPU designs moving to simple, in-order cores, and fast switching between different (and parallel running) execution contexts.

Is that what basically what people have got - performance wise - when they're running a recent Intel or AMD CPU with 16+ cores, with most of the speculative logic disabled? :)

The "Retbleed" speculative execution vulnerabilities

Posted Jul 19, 2022 16:53 UTC (Tue) by anton (subscriber, #25547) [Link] (7 responses)

Who are those people that you mean? The only thing that comes to my mind is Niagara (UltraSPARC T1), but that came out in 2005, and maybe Bonnell (first-Generation Atom), but that came out in 2008. The end of both stories is that they switched to OoO (SPARC T4 in 2011 and Intel Silvermont in 2013).

Unfortunately, we have not seen a hardware fix for Spectre yet. I think that a good fix would not cost much silicon nor much performance. And no, disabling speculation is not a good fix.

I don't think anyone runs these CPUs "with most of the speculative logic disabled". But given that we don't see good hardware fixes, it seems to me that the hardware guys think they can throw this stuff over the wall to the software people who (they think) shall employ mitigations that include eliminating branch prediction (and thus speculation), e.g. retpolines.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 19, 2022 18:49 UTC (Tue) by atnot (guest, #124910) [Link] (6 responses)

> it seems to me that the hardware guys think they can throw this stuff over the wall to the software people who (they think) shall employ mitigations

It works the other way around too. Here goes: "Software people demand that hardware people make them faster and faster processors without changing the way it is programmed, that they should just go and apply more architectural optimizations. The whole reason all of this is done, why x86 is at this point a declarative language for describing how to distribute parallel compute tasks over hardware resources using an abstract control flow graph and dataflow labels (aka "registers") is to keep the fiction that computers haven't changed since the 80s alive to programmers.

Hardware people offered solutions long ago: Itanium had explicit speculation instructions. They were perfectly aware of the troubles the current direction would bring. But software people rejected it in favor of a 64bit pdp11 because it didn't look familiar enough to them and their PDP11 languages, then made fun of them. To the point that nobody has dared publish serious research on novel CPU architectures since around 2010."

This might not be completely fair, but it's not wrong either. Software developers are at least as culpable for the current situation as hardware vendors are. There's barriers yes, but they need to be broken in both directions.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 19, 2022 20:33 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

> But software people rejected it in favor of a 64bit pdp11 because it didn't look familiar enough

No. People rejected Itanic because it just Did Not Work. It's not possible to statically predict how the program will run, because a lot of timings are dependent on inputs. Even when caches are not in play, a good old divide instruction can take anywhere from 1 to 10 cycles to complete.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 19, 2022 20:52 UTC (Tue) by deater (subscriber, #11746) [Link] (3 responses)

it's not that you couldn't design a performant VLIW setup. It's just programming such a thing is really hard and Intel/Itanium thought they could hide this with a good compiler which turned out not to be possible.

You should look into the later itanium, such as Poulson, which had speculation and out-of-order in an attempt to catch up.

Also look into modern x86 systems where many of the divide instructions have a constant latency.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 19, 2022 22:55 UTC (Tue) by atnot (guest, #124910) [Link] (2 responses)

> It's just programming such a thing is really hard and Intel/Itanium thought they could hide this with a good compiler which turned out not to be possible.

Absolutely, targeting C-like languages at VLIW is very difficult and requires advanced scheduling which was not really available at the time. This was a huge factor. It fared much better with GPUs which were targeted with more easily parallelizable languages. Even those would move away eventually though, coincidentally around the time CUDA and GPGPU came about.

Itanium was definitely far from perfect. The initial implementation was terrible and the decision to encode many implementation details of the first CPUs directly into the ISA was a mistake they quickly recognized. But so was x86, we've just gotten used to it. Certainly, today's 12-wide CPUs would have a lot easier time emulating a mediocre 2000s explicitly parallel VLIW CPU than a mediocre 80s microprocessor. Even with it's flaws, Itanium is still significantly less far off from what a modern CPU actually looks like.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 20, 2022 10:24 UTC (Wed) by farnz (subscriber, #17727) [Link]

The other issue with advanced scheduling is that an out-of-order execution design also benefits from a well-scheduled program. An out-of-order processor has a limited instruction window within which it can reschedule dynamically, and a well-scheduled program is set up so that all the rescheduling that can be done in that window is a consequence of the data the program is processing.

GPUs are a different case because they're designed for the world where single threaded performance is not particularly interesting - as long as all threads complete their work in a millisecond or so, we don't care how long each individual thread took. It's thus possible to avoid OoOE in favour of having more threads available to hardware, and better hardware for switching between threads when one thread gets blocked. In contrast, the whole point of CPUs in a modern system (with GPUs as well as CPUs) is to deal with the code where the time for one thread to complete its work sets the time for the whole operation.

I suspect that, for the subset of compute where the performance of a single thread is the most important factor, an out-of-order CPU is the best possible option. The wide-open question is whether we can design an ISA that allows us to avoid unwanted speculation completely; Itanium had that, because it was designed around making all the possible parallelism explicit, but Itanium wasn't a good ISA for out-of-order execution, and had low instruction density.

The other issue that Itanium's explicit speculation didn't account for is that we're starting to see uses of value prediction, not just memory access prediction; do we want to be explicit about all the possible speculative paths (e.g. "you can speculate that the value in r2 is less than the value in r3", or "you can speculate if you believe that r2 is between -16 and +96"), or do we instead want to find a good way to block speculation completely where it's potentially dangerous?

The "Retbleed" speculative execution vulnerabilities

Posted Jul 20, 2022 18:56 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

> Absolutely, targeting C-like languages at VLIW is very difficult

Targeting ANY languages with VLIW is difficult. The fundamental issue is that scheduling depends on input data, and no language can change that.

> Even those would move away eventually though, coincidentally around the time CUDA and GPGPU came about.

Yup. It's just not efficient to use VLIW for anything, even when OOO is not needed.

The "Retbleed" speculative execution vulnerabilities

Posted Jul 20, 2022 6:52 UTC (Wed) by anton (subscriber, #25547) [Link]

Software people demand that hardware people make them faster and faster processors without changing the way it is programmed

Not sure about "demand", but that's the way it works in those areas affected by the software crisis (with the classic criterion being that software costs more than hardware), i.e., pretty much everything but supercomputers and tiny embedded controllers. It would be just too costly to rewrite all software for something like the TILE or (more extreme) Greenarrays hardware, especially in a way that performs at least as well as on mainstream hardware plus Spectre fixes.

Concerning IA-64 (later renamed into Itanium processor family), that's probably not what paulj meant, because Itanium is not simple, and switching between parallel running execution contexts was not envisioned for it in the 1990s (although it was implemented around 2010 or so). Poulson is not OoO AFAIK.

As for implementing ("emulating") IA-64 with the techniques for today's OoO hardware (the widest of which is 8-wide AFAIK), I doubt that that would be easier than implementing AMD64, Aarch64, or RISC-V; I don't see anything in IA-64 that helps the implementation significantly, and one would have to implement all the special features like the ALAT that are better handled by the dynamic alias predictor in modern CPUs; likewise, one would have to implement the compiler speculation support (based on imprecise static branch prediction) and (for performance) still have to implement the much better dynamic branch prediction and hardware speculation.

Single-issue in-order (what you call a PDP11) turns out to be a very good software-hardware interface (i.e., an architectural principle), even if the implementation is pretty far from what a straightforward implementation looks like.