Intel's "redundant prefix issue"

Posted Nov 15, 2023 21:55 UTC (Wed) by pizza (subscriber, #46)
In reply to: Intel's "redundant prefix issue" by paulj
Parent article: Intel's "redundant prefix issue"

> The hardware was too sophisticated, it needed special management, open-source hackers would never have the expertise to understand, only the vendor would be able to XYZ, ...

It's hardly controversial to state that without meaningful (if not comprehensive) documentation of a given CPU's internal design, a microcode patch file is pretty much gobbledygook. And it's also hardly controversial to state that modern CPU cores are _extremely_ complicated.

> When we got drivers, it turned out none of them were as magic as claimed, developers outside the vendors were well capable of working on them (often making better drivers!), and there was often a lot of things that could be improved and even generalised.

I'm well aware of this argument, having personally reverse-engineered dozens of printer drivers (and firmware) to create 100% Free Software alternatives.

I've also spent most of my career writing device drivers (some Free, some not), and more recently, building hardware models that have to work with existing (often binary-only, timing-critical) software.

Intel's "redundant prefix issue"

Posted Nov 16, 2023 10:53 UTC (Thu) by paulj (subscriber, #341)

Ok, let's say that writing drivers and (even more so) dealing with concurrency in OS kernels is a level of complexity that only (completely hand-waving) 1% of programmers could handle with some level of competence. Let's say CPU microcode is an order of magnitude more complex - so 0.1% of programmers. And of course, the number of engineers who would be /really good/ at this stuff is a tenth of each of those. So 0.1% for low-level systems software, and 0.01% for microcode, say.
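
To make the hand-waving concrete, here is a minimal Python sketch of that arithmetic. The fractions are the guesses above; the population figure is a hypothetical round number picked for illustration, not an estimate of how many programmers exist.

```python
from fractions import Fraction

# Guessed fractions from the comment above (hand-waving, not measured).
COMPETENT_SYSTEMS = Fraction(1, 100)          # 1% can handle kernel-level work
COMPETENT_MICROCODE = COMPETENT_SYSTEMS / 10  # an order of magnitude fewer: 0.1%
REALLY_GOOD_SHARE = Fraction(1, 10)           # a tenth of each group is /really good/


def really_good(population: int) -> dict[str, Fraction]:
    """Head count of /really good/ people in each area for a given population."""
    return {
        "systems": population * COMPETENT_SYSTEMS * REALLY_GOOD_SHARE,
        "microcode": population * COMPETENT_MICROCODE * REALLY_GOOD_SHARE,
    }


# Hypothetical population of one million programmers (illustration only).
print(really_good(1_000_000))
```

Whatever population you plug in, the point is only that the last number is small but not zero.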

That still means there will be a few engineers and researchers in the world at large who would be as good at understanding this stuff, and furthering the state of the art, as any engineer within Intel. The world would be better off to have it open.

Maybe Intel wouldn't be, in the short term. That said, just like in the 90s, the PHBs were afraid of opening up device drivers for similar economic/financial reasons, along with the "But others could see our IP and sue us!" arguments made in other parts of these comments. And again, their fears didn't come to pass. Hardware makers have /not/ been economically disadvantaged by going open-source. We have all seen the quality of drivers increase a lot, and one could argue good hardware becomes more valuable when a rising tide has equalised the software.

So, I just don't buy these arguments, when it comes to hardware makers.

Intel's "redundant prefix issue"

Posted Nov 25, 2023 4:54 UTC (Sat) by calumapplepie (guest, #143655)

You switched words there -- did you notice? "Programmer" and "Engineer". Those are different titles!

As described in the other comments, microcode is meaningless without detailed CPU design information. By "detailed", I mean "is literally the blueprint to build the CPU design". There are entire college courses dedicated to understanding simplified implementations of RISC architectures; nobody without at least that experience is going to get anything out of such a blueprint. To get something meaningful, or find a new problem, you'd need to work pretty heavily with those designs; unlike programming, where you can basically scan for risky constructions (array access without bounds check, etc), in CPU design, everything is a risky construction!

There are some engineers who are as good at microcode as anyone within Intel. They mostly work for AMD.

Intel's "redundant prefix issue"

Posted Nov 27, 2023 12:29 UTC (Mon) by paulj (subscriber, #341)

Yes, the Patterson and Hennessy "Computer Organization and Design" book is a great text and the basis for many of those classes. Previous editions focused more on MIPS; more recent editions appear to have been revised to use RISC-V - which obviously has modern, open, complete designs available. Many EEE and CS students have done these classes and have a decent enough working knowledge of the innards of CPUs that they could work on microcode. IIRC that book actually goes into detail on microcode and has example problems on it (I can dig up my copy to double-check).

For sure, at least some of said classes had assignments that required students to /develop/ microcode functions to implement some instruction on the (basic) CPU architecture in the P&H CAO / RISC book.

Is this stuff hard, and would it take work to become competent enough at this kind of thing to be able to make some small, but useful changes? No doubt.

Is every CS/EEE graduate (or even student) outside of Intel and AMD too stupid to develop that? Absolutely not.

It's just dumb to argue that only engineers at $MAJOR_CPU_MAKERS have the ability to work on microcode, ergo that's a good reason for keeping it closed. It was used against open-sourcing software, and was invalid. It remains invalid when this argument is used wrt CPUs. (Hello, RISC-V!).

Intel's "redundant prefix issue"

Posted Nov 27, 2023 13:55 UTC (Mon) by pizza (subscriber, #46)

> It's just dumb to argue that only engineers at $MAJOR_CPU_MAKERS have the ability to work on microcode, ergo that's a good reason for keeping it closed.

That's not the argument being made.

The *actual* argument is that, without access to more or less the complete design files (schematics, blueprints, verilog, whatever), even a highly-skilled-in-the-arts engineer won't be able to accomplish much of anything because the "microcode" files in question are *tightly coupled to a particular design/implementation*.

So by arguing that "microcode should be open" you're actually arguing "the entire CPU design should be open" which is a very different kettle of fish.

Intel's "redundant prefix issue"

Posted Nov 27, 2023 14:50 UTC (Mon) by paulj (subscriber, #341)

It's not true that the ability to programme microcode is useless without full schematics of the CPU. You just need the *interface* to the hardware which the microcode can interact with, and the visible state changes arising from it, to be documented. You simply do not need detailed transistor-level schematics.

Could a good chunk of the microcode consist of deep twiddling of internal units - enabling or disabling functionality therein under certain conditions, say? Sure. Perhaps a vendor would be reluctant to document such things. Though, it'd still be useful to some user somewhere someday to know this stuff (see sibling comment from farnz).

A lot of it will also be a bit more mundane: details on how to use internal functional units (ALUs, vector engines, register files, etc.). Stuff like that could be useful to a programmer who wants to add support for a macro-instruction to (say) support some new crypto primitive (à la the AES instructions). Or a macro-instruction to accelerate some common bit manipulation (string handling, some common construct in some new network protocol).

AMD and Intel cannot cater - at their level - to all the possible little optimisations for all the various things out there; they haven't the engineering time, nor the microcode SRAM space. A user in a specific area, focused on that, could have the time - and they can create the space by dropping other macro-instructions they don't need (obsolete ones, e.g., or ones added in later ISA revs that they don't use themselves).

Intel's "redundant prefix issue"

Posted Nov 27, 2023 15:06 UTC (Mon) by farnz (subscriber, #17727)

For typical microcode, the interface to the hardware that the microcode interacts with is a significant chunk of the design. Further, that interface is not set in stone - it can change with every stepping of the CPU, and is completely redone between generations as some things move to fixed logic and some things move to microcode.

Realistically, it's not practical for Intel or AMD to document their microcode format without giving away the entire CPU design - there's just too much coupling between the hardware design and the microcode. It'd be like asking for the interface between the Linux kernel and all possible kernel modules, but without disclosing any Linux source code (so that I can write a non-Free module against this interface without risk of copyright issues). Yes, it's theoretically possible, but it's a lot of work, and it has to be updated for every single kernel release.

Intel's "redundant prefix issue"

Posted Nov 27, 2023 15:28 UTC (Mon) by paulj (subscriber, #341)

I don't buy that it changes completely from generation to generation. Intel architects have described how functional blocks are re-used again and again. Didn't you link to a talk with a P6 architect who was proud that some of the P6 core functional blocks had survived to CPUs many generations later?

Sure, some specifics - bit patterns to access something - could change, but the behaviour of much of the functional block is going to stay the same. And you're telling me Intel doesn't have internal documentation for microcode writers? :)

I'd accept things are far from perfect for what you'd want to release as a supported external interface. But that isn't what's asked for. Docs suck for much of the open-source world too - doesn't stop people hacking on it, or even /creating/ such documentation (and some basic interfacing information can be auto-generated from HDLs).

Intel's "redundant prefix issue"

Posted Nov 27, 2023 15:42 UTC (Mon) by farnz (subscriber, #17727)

But it's those specifics that you care hugely about, and not the broader functional units; great, the LSQ is the same, but now you've got a completely different set of interactions with microcode to last time, because the hardware designers are confident in different bits of it, and have attached microcode hooks to different bits of the LSQ. And no, I don't believe Intel does have internal documentation for microcode writers; IME, microcode is written by people embedded in the CPU design team, and they use the actual design sources as the documentation.

I suspect that Intel doesn't have anything it can release as a microcode "interface"; you're asking them to write this documentation, unlock the microcode (making reverse engineering their CPUs easier for competitors) and for no gain to Intel. This isn't something Intel will do, for obvious reasons - it's all costs to them, for no gains, and it's stuff they have to redo for each generation.

Intel's "redundant prefix issue"

Posted Nov 27, 2023 16:28 UTC (Mon) by paulj (subscriber, #341)

Ok, fair enough, I'll defer to the more specific experience you and pizza have.

I still think that, even without _any_ documentation, if the microcode were at least user-replaceable, there would be clever people out there - outside of AMD and Intel - who would start to reverse engineer at least some of its functions, and we'd get some useful things out of it. There'd be bored EEE students writing fuzzers, measuring the visible changes on x86 ISA, and mapping out at least some subset of the microcode to hit on something useful.

Intel's "redundant prefix issue"

Posted Nov 27, 2023 16:35 UTC (Mon) by farnz (subscriber, #17727)

I'd certainly agree that people would reverse engineer the microcode and do things with it - I'm just not sure how I'd write that as a positive for Intel, since most of them will be aiming to either find a cool new vulnerability in Intel's designs, or trying to work out how Intel manages to be "better" than their preferred design at doing something.

We only got what we got for GPUs because Intel (and later AMD) needed something to get them design wins over the leading GPUs on the market. The business case was that, while writing OSS drivers would help competitors reverse-engineer their designs, having OSS drivers would result in sales to customers who see OSS drivers as a positive. With ATI and nVidia winning (for Intel), and later nVidia dominating over AMD, they valued those design wins over protecting themselves from reverse-engineering. And this was especially clear since people could, and did, reverse engineer nVidia's closed drivers because they had documentation for the CPUs they ran on, even though they didn't have documentation for the GPUs.

Intel's "redundant prefix issue"

Posted Nov 27, 2023 15:52 UTC (Mon) by pizza (subscriber, #46)

> Sure, some specifics - bit patterns to access something - could change, but the behaviour of much of the functional block is going to stay the same. And you're telling me Intel doesn't have internal documentation for microcode writers? :)

These "bit patterns to access something" are what 95% of these "microcode updates" actually manipulate. A typical change is to invert a chicken bit that [dis]ables a bit of hardwired logic, but only when bits #12213 and #21123 are set, and the rest is writing a _patch_ to the microcode baked into ROM to deal with the changed reality.
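
As a rough sketch of what such a conditional chicken-bit flip amounts to, here it is as a Python toy model. The guard bit numbers are the ones mentioned above; the chicken bit number, the function name, and the set-of-bits representation are all hypothetical and say nothing about any real microcode format.

```python
# Toy model: the configuration state is the set of bit numbers currently set.
GUARD_BITS = (12213, 21123)  # from the example above
CHICKEN_BIT = 7              # hypothetical bit number, chosen for illustration


def apply_chicken_bit_patch(state, chicken_bit=CHICKEN_BIT, guards=GUARD_BITS):
    """Invert chicken_bit, but only when every guard bit is set.

    Returns a new set; the input state is left unmodified.
    """
    if all(bit in state for bit in guards):
        # Symmetric difference toggles the bit: sets it if clear, clears it if set.
        return set(state) ^ {chicken_bit}
    return set(state)


# Both guards set, so the chicken bit gets inverted.
patched = apply_chicken_bit_patch({12213, 21123})
# Only one guard set, so nothing changes.
untouched = apply_chicken_bit_patch({12213})
```

The code is trivial; knowing which bit to flip, and under which conditions, is where the design information comes in.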

Even in simple/trivial designs there can be literally dozens of these bits that mean nothing without detailed design information. To provide an example from pop-ish culture: what does the switch labeled "SCE to AUX" do, and when would it matter? Why not always run with it turned on?

(And I speak about this from experience, having a couple of SoCs under my professional belt, including developing patches for ROM'd code. One chicken bit in particular saved us from having to do a complete respin due to a problem with an untestable-until-silicon-gets-back analog component.)

Intel's "redundant prefix issue"

Posted Nov 27, 2023 16:13 UTC (Mon) by paulj (subscriber, #341)

Thanks for the comment, informative examples. Cheers.

Just to be clear, that's what I had in mind. The bit pattern to change the FROB function from the FOO mode to BAR mode in the FROBBASHER functional unit could be wired up to bit-pattern 12213 in one implementation and 54213 in another. However, this does not present some fundamental obstacle to documenting the FROBBASHER block and how it could be programmed from micro-code - surely that is obvious?

The answer is that the block is documented as having a FROB_MODE setting, which selects either the FOO or BAR mode, which implies <etc, etc.>. Then the internal architecture documentation for the Super-Duper-Micro-Electronics SDME 20001 specifies it uses the FROBBASHER v2.1 block, with its various programmable operations wired to:

...
FROB_MODE 12213
...

And the SDME 20100 similar, with (say)

...
FROB_MODE 54213, with X set to 1.
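
A minimal Python sketch of that split between block documentation and per-chip wiring might look like the following. The chip names, the block name, the bit numbers and X come from the made-up example above; the FOO=0/BAR=1 encoding and the table layout are assumptions added purely for illustration.

```python
# Block-level documentation: symbolic modes of the FROBBASHER's FROB_MODE setting.
# The numeric encoding (FOO=0, BAR=1) is an assumption for this sketch.
FROB_MODES = {"FOO": 0, "BAR": 1}

# Per-implementation documentation: where FROB_MODE is wired on each chip,
# plus any extra setting that has to accompany it.
FROB_MODE_WIRING = {
    "SDME 20001": {"bit": 12213, "requires": {}},
    "SDME 20100": {"bit": 54213, "requires": {"X": 1}},
}


def frob_mode_setting(chip, mode):
    """Translate a symbolic FROB_MODE choice into chip-specific settings."""
    wiring = FROB_MODE_WIRING[chip]
    return {
        "bit": wiring["bit"],
        "value": FROB_MODES[mode],
        "requires": dict(wiring["requires"]),
    }


# The same symbolic request resolves to different bits on the two chips.
old_chip = frob_mode_setting("SDME 20001", "BAR")
new_chip = frob_mode_setting("SDME 20100", "BAR")
```

The idea being that code written against the symbolic name only has to change its lookup table between implementations.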

There are, 100% for sure, a number of fairly well re-used functional units in Intel CPUs. And further, a number of such functional units implement things that are very common, similar and well-understood in high-level operation across all CPUs. This is particularly so on x86 CPUs, with their CISC ancestry, which have a number of (fairly new!) instructions that are simply there to optimise logical constructs of the software and which are implemented in micro-code (VGATHER, VBLEND, etc.).

Intel's "redundant prefix issue"

Posted Nov 27, 2023 16:17 UTC (Mon) by paulj (subscriber, #341)

I.e., there are specialised logical optimisations that one could implement as a macro-instruction entirely on top of even the existing subset of Intel instructions that are /documented/ as single-micro-op (and have been for a number of generations), and that could speed up certain applications. Applications too niche for Intel/AMD to go add an instruction themselves.

Intel's "redundant prefix issue"

Posted Nov 27, 2023 16:26 UTC (Mon) by farnz (subscriber, #17727)

That's not how reused functional units work in a CPU, though. You have a FROBBASHER functional unit, whose mode bits are set by the CPU in different ways in different implementations; so, in the SDME 20001, FROBBASHER takes its mode bits directly from microcode, while in the 20100 where the FROBBASHER is also used by non-microcoded operations, its mode bit comes from elsewhere, unless the microcode execution engine is in control in which case it comes from a hidden state register (not from the microcode), and you now have to document that hidden state register and how it's controlled by the system.
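
To put the same point as a sketch: the question is less "which bit number" and more "who drives the mode at all". A toy Python model of the two hypothetical chips described above (the function name and the string labels are made up for illustration):

```python
def frobbasher_mode_source(chip, microcode_in_control):
    """Say where the FROBBASHER's mode bit comes from in this toy model."""
    if chip == "SDME 20001":
        # Mode bits are taken directly from the microcode word.
        return "microcode word"
    if chip == "SDME 20100":
        # The unit is shared with non-microcoded operations, so microcode
        # never drives the mode bit directly on this implementation.
        if microcode_in_control:
            return "hidden state register"
        return "non-microcoded logic"
    raise ValueError(f"unknown chip: {chip}")


source_20001 = frobbasher_mode_source("SDME 20001", microcode_in_control=True)
source_20100 = frobbasher_mode_source("SDME 20100", microcode_in_control=True)
```

A per-chip table of bit numbers can't express the second case; you'd have to document the hidden state register and everything that controls it as well.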

Intel's "redundant prefix issue"

Posted Nov 27, 2023 13:50 UTC (Mon) by farnz (subscriber, #17727)

However, Intel and AMD are working under time pressures, and backport pressures; an external engineer is under a different set of pressures. It doesn't matter if my hobby change to the microcode on the 2012 Ivy Bridge CPU (a Xeon E3-1245v2) I have to hand for experiments takes me 6 months and is incompatible with current Intel silicon; however, both of those would be unacceptable to Intel. Similarly, I don't care if I break compatibility with kernels I don't run with my microcode change; Intel, however, cares deeply about things like the Windows kernel. Additionally, I don't care if by making my change, I break multi-socket Xeons, or break i3/i5 CPUs, because my Xeon E3 isn't affected by either of those; Intel does care. I don't care if I break systems without ECC RAM, or which use the acceleration part of the iGPU in my E3-1245v2; Intel does.

Because I don't care about the same things as Intel, I can do a better job for my needs than they can, even if I'm less expert at doing the job than Intel or AMD engineers are. I can take much more time than Intel engineers would, because I'm under a different set of pressures. And when I've demonstrated that my change has value, I can share it with other people who'll fill in the bits that I don't care about - or it remains private to me because it only has value for me.