An update on Apple M1/M2 GPU drivers

By Jake Edge
October 30, 2024

XDC

The kernel graphics driver for the Apple M1 and M2 GPUs is, rather famously, written in Rust, but it has achieved conformance with various graphics standards, which is also noteworthy. At the X.Org Developers Conference (XDC) 2024, Alyssa Rosenzweig gave an update on the status of the driver, along with some news about the kinds of games it can support (YouTube video, slides). There has been lots of progress since her talk at XDC last year (YouTube video), with, of course, still more to come.

It is something of an XDC tradition, since she began it in Montreal in 2019 (YouTube video), for Rosenzweig to give her presentations dressed like a witch. This year's edition was no exception, though this time she started her talk in French, which resulted in some nervous chuckles from attendees. After a few sentences, she switched to English ("I'm just kidding") and continued with her talk.

Updates and tessellation

Last year at XDC, she and Asahi Lina reported that the driver had reached OpenGL ES 3.1 conformance. They also talked about geometry shaders, because "that was the next step". Since then, the driver has become OpenGL 4.6 conformant. That meant she was going to turn to talking about tessellation shaders, "as I threatened to do at the end of last year's talk".

Tessellation, which is a technique that "allows detail to be dynamically added and subtracted" from a scene, is required for OpenGL 4.0, and there is a hardware tessellator on the Apple GPU—but, "we can't use it". The hardware is too limited to implement any of the standards; "it is missing features that are hard required for OpenGL, Vulkan, and Direct3D". That makes it "pretty much useless to anybody who is not implementing Metal". Apple supports OpenGL 4.1, though it is not conformant, but if you use any of the features that the hardware does not support, it simply falls back to software; "we are not going to do that".

As far as Rosenzweig is aware, the hardware lacks point mode, where points are used instead of the usual triangles; it also lacks isoline support, but those two things can be emulated. The real problem comes with transform feedback and geometry shaders, neither of which is supported by the hardware, but the driver emulates them with compute shaders. However, the hardware tessellator cannot be used at all when those are being emulated because minute differences in the tessellation algorithms used by the hardware and the emulation would result in invariance failures. She is not sure whether that is a problem in practice or not, "but the spec says not to do it", so she is hoping not to have to go that route.

Instead, "we use software". In particular, Microsoft released a reference tessellator a decade or more ago, which was meant to show hardware vendors what they were supposed to implement when tessellation was first introduced. It is "a giant pile of 2000 lines of C++" that she does not understand, despite trying multiple times; "it is inscrutable". The code will tessellate a single patch, which gave the driver developers an idea: "if we can run that code, we can get the tessellation outputs and then we can just draw the triangles or the lines with this index buffer".
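
The idea of tessellating one patch and then drawing the result through an index buffer can be illustrated with something far simpler than the reference tessellator. The Python sketch below is not Microsoft's algorithm and is not taken from the driver; it assumes a uniform integer tessellation level on a unit quad, and the name tessellate_quad is hypothetical.

```python
def tessellate_quad(level):
    """Uniformly tessellate the unit quad into level x level cells.

    Returns (vertices, indices): vertices are (u, v) pairs in the patch
    domain, and indices holds three vertex numbers per triangle.
    """
    if level < 1:
        raise ValueError("level must be at least 1")
    n = level + 1  # number of vertices along each edge
    vertices = [(x / level, y / level) for y in range(n) for x in range(n)]
    indices = []
    for y in range(level):
        for x in range(level):
            i = y * n + x  # lowest-numbered corner of this cell
            indices += [i, i + 1, i + n]          # first triangle of the cell
            indices += [i + 1, i + n + 1, i + n]  # second triangle of the cell
    return vertices, indices
```

Once the vertex positions and the index list exist, drawing the patch is an ordinary indexed draw; the tessellator itself never has to talk to the rasterizer.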

There are some problems with that approach, however, starting with the fact that the developers are writing a GPU driver; "famously, GPUs do not like running 2000 lines of C++". But, she announced, "we have conformant OpenCL 3.0 support" thanks to Karol Herbst, though it has not yet been released. OpenCL C is "pretty much the same as regular CPU C", though it has a few limitations and some extensions for GPUs. So the idea would be to turn the C++ tessellation code into OpenCL C code; "we don't have to understand any of it, we just need to not break anything when we do the port".

That works, but "tessellator.cl is the most unhinged file of my career"; doing things that way was also the most unhinged thing she has done in her career "and I'm standing up here in a witch hat for the fifth year in a row". The character debuted in the exact same room in 2019 when she was 17 years old, she recalled.

The CPU tessellator only operates on a single patch at a time, but a scene might have 10,000 patches—doing them all serially will be a real problem. GPUs are massively parallel, though, so having multiple threads each doing tessellation is "pretty easy to arrange". There is a problem with memory allocation; the CPU tessellator just allocates for each operation sequentially, but that will not work for parallel processing. Instead, the driver uses the GPU atomic instructions to manage the allocation of output buffers.
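
Allocation through atomics can be sketched as a bump allocator: each thread reserves a slice of one shared output buffer by atomically adding its size to a counter and using the old value as its offset. The Python below is only an illustration under that assumption; the AtomicCounter class emulates a GPU atomic add with a lock, and all names are hypothetical rather than taken from the driver.

```python
import threading


class AtomicCounter:
    """Stand-in for a GPU atomic add; the lock makes fetch_add indivisible."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, amount):
        # Return the old value and advance the counter in one step.
        with self._lock:
            old = self._value
            self._value += amount
            return old


def allocate_outputs(index_counts, workers=4):
    """Reserve a private slice of one shared index buffer for each patch.

    index_counts[p] is the number of indices patch p will write.
    Returns (offsets, total), where offsets[p] is the start of patch p's slice.
    """
    heap = AtomicCounter()
    offsets = [None] * len(index_counts)

    def run(worker_id):
        # Worker w handles patches w, w + workers, w + 2 * workers, ...
        for patch in range(worker_id, len(index_counts), workers):
            offsets[patch] = heap.fetch_add(index_counts[patch])

    threads = [threading.Thread(target=run, args=(w,)) for w in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return offsets, heap.fetch_add(0)
```

The order in which patches receive their slices depends on thread scheduling, but the slices never overlap, which is the property the parallel tessellators need.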

In order to draw the output of the tessellators, though, there is a need to use draw instructions with packed data structures as specified by the GPU. That is normally done from the C driver code using functions that are generated by the GenXML tool. Since the tessellators are simply C code, "thanks to OpenCL", the generated functions can be included into the code that runs on the GPU. Rosenzweig went into more detail, which fills in the holes (and likely inaccuracies) of the above description; those interested in the details should look at the presentation video and her slides.
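
To give a flavor of what a generated packing function does, here is a minimal Python sketch that packs named fields into a 32-bit word. The real functions are C code generated from hardware descriptions; the field names, bit positions, and widths below are invented for illustration and do not describe the Apple GPU's actual command format.

```python
import struct

# Hypothetical layout: (field name, first bit, width in bits).
DRAW_FIELDS = [
    ("primitive", 0, 4),
    ("index_size", 4, 2),
    ("index_count", 8, 24),
]


def pack_draw(**values):
    """Pack the named fields into one little-endian 32-bit word."""
    word = 0
    for name, start, width in DRAW_FIELDS:
        value = values.get(name, 0)
        if value < 0 or value >= (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << start
    return struct.pack("<I", word)
```

Because a function like this is plain integer arithmetic, the same source can be compiled for the CPU and for the GPU, which is the property the talk relies on.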

"Does it work? Yes, it does." She showed an image of terrain tessellation from a Vulkan demo. It was run on an M2 Mac with "entirely OpenCL-based tessellation". There is also the question of "how is the performance of this abomination?" The answer is that it is "okay". On the system, software-only terrain tessellation runs at less than one frame-per-second (fps), which "is not very fun for playing games"; for OpenCL, it runs at 265fps, which is "pretty good" and is unlikely to be the bottleneck for real games. The hardware can do 820fps; "I did wire up the hardware tessellator just to get a number for this talk." There is still room for improvement on the driver's numbers, she said.

Vulkan and gaming

She also announced Vulkan 1.3 conformance for the Honeykrisp M1/M2 GPU driver. It started by copying the NVK Vulkan driver for NVIDIA GPUs, "smashed against the [Open]GL 4.6 [driver]", which started passing the conformance test suite "in about a month". That was six months ago and, since then, she has added geometry and tessellation shaders, transform feedback, and shader objects. The driver now supports every feature needed for multiple DirectX versions.

There are a lot of problems "if we want to run triple-A (AAA) games on this system", however. A target game runs on DirectX and Windows on an x86 CPU with 4KB pages, but "our target hardware is running literally none of those things". What is needed is to somehow translate DirectX to Vulkan, Windows to Linux, x86 to Arm64, and 4KB pages to 16KB pages. The first two have a well-known solution in the form of the DXVK driver and Wine, which are "generally packaged into Proton for Steam gaming". Going from x86 to Arm64 also has off-the-shelf solutions: FEX-Emu or Box64. She has a bias toward FEX-Emu; "when I am not committing Mesa crimes, I am committing FEX-Emulation crimes". The big problem, though, is the page-size difference.

FEX-Emu requires 4KB pages; Box64 has a "hack to use 16KB pages, but it doesn't work for Wine, so it doesn't help us here". MacOS can use 4KB pages for the x86 emulation, but "this requires very invasive kernel support"; Asahi Linux already has around 1000 patches that are making their way toward the mainline kernel, but "every one of those thousand is a challenge". Making changes like "rewriting the Linux memory manager" is not a reasonable path.
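
One way to see why the page size matters: software built for 4KB pages may place mappings at addresses that are aligned to 4KB but not to 16KB, and a kernel with 16KB pages cannot start a page there. The Python sketch below only checks alignment; the example address is a hypothetical one chosen for illustration, not taken from any particular game.

```python
PAGE_4K = 4 * 1024
PAGE_16K = 16 * 1024


def is_page_aligned(address, page_size):
    """True if a mapping could start at this address with this page size."""
    return address % page_size == 0


def aligned_fraction(guest_page, host_page):
    """Fraction of guest-page-aligned addresses that are also host-page-aligned."""
    if host_page % guest_page != 0:
        raise ValueError("host page size must be a multiple of the guest page size")
    return guest_page / host_page


# Hypothetical address that a 4KB-page program might ask for.
EXAMPLE_ADDRESS = 0x401000
```

Here is_page_aligned(EXAMPLE_ADDRESS, PAGE_4K) is true while is_page_aligned(EXAMPLE_ADDRESS, PAGE_16K) is false, and only a quarter of 4KB-aligned addresses are usable as 16KB page boundaries.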

It turns out that, even though Linux does not support heterogeneous page sizes between different processes, it does support them between different kernels; "what I mean by that is virtualization". A KVM guest kernel can have a different page size than the host kernel. So, "this entire mess", consisting of FEX-Emu, Wine, DXVK, Honeykrisp, Steam, and the game, "we are going to throw that into a virtual machine, which is running a 4KB guest kernel".

There is some overhead, of course, but it is hardware virtualization, so that should have low CPU overhead. The problem lies with the peripherals, she said. So, instead of having Honeykrisp in the host kernel, it runs in the guest using virtgpu native contexts; all of the work to create the final GPU command buffer is done in the guest and handed to the host, rather than making all of the Vulkan calls individually traverse the virtual-machine boundary. The VirGL renderer on the host then hands that to the GPU, which "is not 100% native speed, but definitely well above 90%", Rosenzweig said.
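
The benefit of building the whole command buffer in the guest can be pictured by counting how often the virtual-machine boundary is crossed. The Python below is a toy model under that assumption, not the virtgpu protocol; the Boundary class and both forwarding functions are hypothetical.

```python
class Boundary:
    """Toy guest-to-host boundary that counts how many times it is crossed."""

    def __init__(self):
        self.crossings = 0
        self.received = []

    def submit(self, commands):
        self.crossings += 1
        self.received.extend(commands)


def forward_each_call(calls, boundary):
    """Send every API call across the boundary on its own."""
    for call in calls:
        boundary.submit([call])


def forward_command_buffer(calls, boundary):
    """Build the full command buffer in the guest, then cross once."""
    command_buffer = list(calls)  # all of the driver work happens in the guest
    boundary.submit(command_buffer)
```

Both approaches deliver the same commands to the host; the second pays the cost of crossing the boundary once per buffer instead of once per call.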

The good news is that the overheads for the CPU and GPU do not stack, since the two run in parallel. "So all the crap overhead we have in the CPU is actually crap that is running in parallel to the crap overhead on the GPU, so we only pay the cost once."
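
The arithmetic behind "we only pay the cost once" can be written down as a simple model: if the CPU and GPU work on different frames at the same time, the time per frame is set by the slower of the two rather than by their sum. The Python sketch below assumes that idealized pipelining; the function names and the millisecond figures in the note are hypothetical, not measurements from the talk.

```python
def frame_time_serial(cpu_ms, gpu_ms):
    """Per-frame time if the CPU and GPU work one after the other."""
    return cpu_ms + gpu_ms


def frame_time_pipelined(cpu_ms, gpu_ms):
    """Per-frame time if the CPU and GPU overlap: the slower side dominates."""
    return max(cpu_ms, gpu_ms)
```

With made-up costs of 14 ms of CPU work and 16 ms of GPU work per frame, the pipelined model gives 16 ms per frame, where a serial model would give 30 ms; extra CPU overhead is hidden as long as the CPU stays the faster side.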

"'Does it work?' is the question you all want to know." It does, she said; it runs games like Portal and Portal 2. She also listed a number of others: Castle Crashers, The Witcher 3, Fallout 4, Control, Ghostrunner, and Cyberpunk 2077.

All of the different pieces that she mentioned were made available on October 10, the day of the talk. For those running the Fedora Asahi Remix distribution, she suggested immediately updating to pick up the pieces that she had described. Before taking questions, she launched Steam, which took some time to come up, in part because of the virtual machine and the x86 emulation. Once it came up, she launched Control, which ran at 45fps on an M1 Max system.

There was a question about resources from someone who has a Mac with 8GB of RAM. Rosenzweig said that the high-end gaming titles are only likely to work on systems with 16GB or more. She noted that she was playing Castle Crashers on an 8GB system during the conference, so some games will play; Portal will also work on that system. She hopes that the resources required will drop over time.

Another question was about ray-tracing support, since Control can use that feature. Rosenzweig suggested that patches were welcome but that she did not see that as a high priority ("frankly, I think ray tracing is a bit of a gimmick feature"). Apple hardware only supports it with the M3 and the current driver is for M1 and M2 GPUs, though she plans to start working on M3 before long. The session concluded soon after that, though Rosenzweig played Control, admittedly poorly, as time ran down.

[ I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Montreal for XDC. ]


Very nice!

Posted Oct 30, 2024 20:59 UTC (Wed) by titaniumtown (subscriber, #163761) [Link]

Alyssa Rosenzweig is very cool. I watched her talk and it was fascinating and I loved how she presented. I'm following her work very closely, very impressive!

Great !

Posted Oct 30, 2024 20:59 UTC (Wed) by matp75 (subscriber, #45699) [Link]

Whoah ! This looks crazy and huge huge work even if I am far away from understanding it all !

Fun and impressive

Posted Oct 30, 2024 22:49 UTC (Wed) by Paf (subscriber, #91811) [Link]

"when I am not committing Mesa crimes, I am committing FEX-Emulation crimes"

Hah!

From the outside, this looks really impressive, and it’s presented with great style.

I find it a little disturbing . . .

Posted Oct 31, 2024 3:13 UTC (Thu) by himi (subscriber, #340) [Link] (5 responses)

. . . that she's apparently ~22 and doing world-beating stuff like this . . .

It's always amazing seeing kids doing this sort of thing. I'm old enough to remember seeing Rasterman making everyone else feel old back in the day (even though I'm a few years younger than him, he still made me feel old) - it's nice to see that in this aspect at least, the world of 2024 is as exciting and hopeful as the world of 1999.

I find it a little disturbing . . .

Posted Oct 31, 2024 17:02 UTC (Thu) by Lennie (subscriber, #49641) [Link] (4 responses)

And on a side note, they are women, the presenter of this talk but also the person who started Asahi Linux.

I find it a little disturbing . . .

Posted Oct 31, 2024 22:19 UTC (Thu) by himi (subscriber, #340) [Link]

Yeah, that's one of the things that makes me hopeful - there are so few women in this industry, and even fewer in the open source/free software subset, having young women out there doing exciting things and having fun with it is great to see.

I find it a little disturbing . . .

Posted Nov 1, 2024 0:05 UTC (Fri) by lockecole2 (✭ supporter ✭, #63710) [Link] (2 responses)

Thought it was Hector Martin who founded it:

https://web.archive.org/web/20210217130206/https://asahil...

I find it a little disturbing . . .

Posted Nov 2, 2024 11:41 UTC (Sat) by Lennie (subscriber, #49641) [Link] (1 responses)

I assumed the name of Lina came first, but I guess not.

I find it a little disturbing . . .

Posted Nov 2, 2024 22:27 UTC (Sat) by Lonjil (guest, #152573) [Link]

But the presenter is Alyssa Rosenzweig, not Lina.

Very impressive

Posted Oct 31, 2024 4:29 UTC (Thu) by flewellyn (subscriber, #5047) [Link] (5 responses)

The combination of brilliance, madness, and willingness to commit coding atrocities necessary to make this work is quite something! Ms Rosenzweig has done some fantastic work. A shame Apple didn't make things a lot easier, though.

I admit to some curiosity: why the witch hat? Has Ms Rosenzweig ever explained this?

Very impressive

Posted Oct 31, 2024 9:12 UTC (Thu) by ballombe (subscriber, #9523) [Link] (1 responses)

witch + apple is calling for seven dwarves.

Very impressive

Posted Oct 31, 2024 10:59 UTC (Thu) by Wol (subscriber, #4433) [Link]

Where's the BUNCH?

Cheers,
Wol

Very impressive

Posted Oct 31, 2024 15:35 UTC (Thu) by Paf (subscriber, #91811) [Link]

It is very close to Halloween, so I assume that’s *part* of the reason

Very impressive

Posted Nov 1, 2024 2:47 UTC (Fri) by indrora (guest, #167938) [Link]

She has a long history of committing source wizardry on GPU drivers. We have a libre Mali-T860 driver because of her.

in/re the hat... *snrk* he doesn't know about the witch hats (/s)

To quote her Mastodon profile: "Princesse-sorcière de Linux qui respecte la politique de l'OQLF" -- I'll let you put the parts together. A surprising number of kernel hackers working on fringe hardware (or just... different stuff) are trans folk and a *lot* of them are of the witchy persuasion.

Very impressive

Posted Nov 1, 2024 9:20 UTC (Fri) by atnot (subscriber, #124910) [Link]

> I admit to some curiosity: why the witch hat?

You're a mysterious woman spending your days in a dim room doing mystical incantations with strange hardware people don't understand but are very impressed by, opening the news to see a new satanic panic, riots, hate crimes and politicians trying to ban your existence. You may as well get the fun hat to go with it.

libkrun

Posted Oct 31, 2024 7:08 UTC (Thu) by pbonzini (subscriber, #60935) [Link]

Shout out to the libkrun project, which is used to run the lightweight VMs with 4K pages. :)


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds