The state of eBPF

Posted Feb 1, 2024 23:52 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)
In reply to: The state of eBPF by Manifault
Parent article: The state of eBPF

> A month? Well that seems extremely unrealistic, but OK. If it would take less than a month, I guess we can all expect your patch set soon?

Yes, a month is enough. I've done something very similar (JIT-compiling Excel-like formulae and running them in a sandbox) with Wasmtime. The code just writes itself, it's so straightforward.

> And herein lies the problem. It's not difficult to write something that's validated in user space. What's difficult is building something that's validated completely in the kernel, and which interacts safely and ergonomically with the surrounding kernel ecosystem. For example, BPF allows you to take references to struct task_struct * objects, and ensures that if you're holding a refcounted task that you either store it in a map (in which case the reference will be automatically dropped when the map is freed), or release it manually in your BPF prog.
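
To make the reference-tracking rule quoted above concrete, here is a minimal sketch in Python. It is not the kernel verifier's algorithm: the instruction names (`acquire`, `release`, `store_map`, `branch`, `exit`) and the `check_references` function are invented for illustration. It only shows the general idea of walking every path through a program and rejecting it if any path exits while still holding a counted reference.

```python
# Toy instruction set (hypothetical, for illustration only):
#   ("acquire", ref)     take a counted reference named ref
#   ("release", ref)     drop the reference explicitly
#   ("store_map", ref)   hand ownership to a map, which drops it later
#   ("branch", offset)   either fall through or jump forward by offset
#   ("exit",)            end of program


class VerifierError(Exception):
    """Raised when the toy checker rejects a program."""


def check_references(prog):
    """Explore every path and reject any that exits while holding a reference."""
    pending = [(0, frozenset())]  # (program counter, references held)
    seen = set()
    while pending:
        pc, held = pending.pop()
        if (pc, held) in seen:
            continue
        seen.add((pc, held))
        if pc >= len(prog):
            raise VerifierError("fell off the end of the program")
        op = prog[pc]
        kind = op[0]
        if kind == "acquire":
            if op[1] in held:
                raise VerifierError(f"{op[1]!r} acquired twice at {pc}")
            pending.append((pc + 1, held | {op[1]}))
        elif kind in ("release", "store_map"):
            if op[1] not in held:
                raise VerifierError(f"{kind} of unheld {op[1]!r} at {pc}")
            pending.append((pc + 1, held - {op[1]}))
        elif kind == "branch":
            if op[1] <= 0:
                raise VerifierError("only forward branches are allowed")
            pending.append((pc + 1, held))      # branch not taken
            pending.append((pc + op[1], held))  # branch taken
        elif kind == "exit":
            if held:
                raise VerifierError(f"exit at {pc} leaks {sorted(held)}")
        else:
            raise VerifierError(f"unknown instruction {kind!r}")
    return True


# Both paths dispose of the reference: one releases it, one stores it in a map.
ok = [("acquire", "task"), ("branch", 3), ("release", "task"), ("exit",),
      ("store_map", "task"), ("exit",)]

# The taken branch skips the release and exits while still holding "task".
leaky = [("acquire", "task"), ("branch", 2), ("release", "task"), ("exit",)]
```

Calling `check_references(ok)` returns `True`, while `check_references(leaky)` raises `VerifierError`. The sketch says nothing about the harder parts the quoted paragraph alludes to, such as typed kernel pointers or how a map takes ownership.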

Again, all this is easily done by providing API wrappers exposed to the code. There is nothing whatsoever complicated about doing it, that's also how eBPF runtime itself does it. I follow eBPF development, and I don't see anything that would require some sort of complicated development, and yes, this even includes BTF for debugging.

> Years? I thought you said it could be done in less than a month. And citation needed about the political aspect.

Yes, years. Because the userspace infrastructure that does it is complicated. And moving it into the kernel would require rewriting it in C, and it'll also have to be able to completely emulate eBPF at this point. Nobody is going to allow two separate implementations.

And then it'll get vetoed anyway.

> This is not the same thing. If you want to dynamically load a program into the kernel, you're still going to need some kind of in-kernel verifier component. It doesn't matter if rustc compiled that binary in user space. It's not trusted if it's not verified in the kernel.

I'm talking about using WASM runtimes with JIT (wasmtime, wasmer, wasmedge). They ALL provide that functionality, there's pretty much nothing in eBPF that is unique anymore. And whatever unique features it had, it had already thrown away long ago.

My personal wish for eBPF folks would be to stop. Just stop all feature development for a year, and think about what you've done. Then make an actual plan of what eBPF should look like in 10 years.



The state of eBPF

Posted Feb 2, 2024 1:07 UTC (Fri) by Manifault (guest, #155796) [Link] (9 responses)

I'll add double-quotes where relevant so people can see the parts of my previous comment that you conveniently declined to include in your replies.

>Yes, a month is enough. I've done something very similar (JIT-compiling Excel-like formulae and running them in a sandbox) with Wasmtime. The code just writes itself, it's so straightforward.

Everything you're saying is handwavey word salad with absolutely nothing concrete to tie it to. Excel in a WASM sandbox? Sorry, what? I'm talking about running code in the kernel. I'm talking about actual code that's merged upstream that's used extensively *in the kernel* across the entire tech industry. You're just throwing meaningless jargon at the wall and expecting it to stick to something that sounds JIT-like. Yes, I realize WASM is used in the tech industry as well, but not in the kernel because *nobody has implemented it yet, so it's meaningless to even talk about it*.

If it's so easy, then just do it. Proposing WASM as an alternative to BPF is completely meaningless until there's something concrete to hand to people. Why are you wasting your time complaining about BPF on LWN instead of just implementing WASM in the kernel?

>>>Moving it into the kernel is not easy, though.

>> And herein lies the problem. It's not difficult to write something that's validated in user space. What's difficult is building something that's validated completely in the kernel, and which interacts safely and ergonomically with the surrounding kernel ecosystem. For example, BPF allows you to take references to struct task_struct * objects, and ensures that if you're holding a refcounted task that you either store it in a map (in which case the reference will be automatically dropped when the map is freed), or release it manually in your BPF prog.

**And the part you didn't choose to include in your reply**

>>And yes, I'm sure you could handwave and say, "Oh that's easy to do in WASM. You just need to expose an API... [snip]" like you did above for locks. Great. Go build it. That will be a lot more useful to people than a handwaved explanation about how it _could_ work if only someone would actually build it.

And your predictable reply:

>Again, all this is easily done by providing API wrappers exposed to the code.

Are you capable of explaining anything concrete beyond just handwaving and saying to use API wrappers? "Just use API wrappers" is a comically naive watering down of everything that's involved.

>There is nothing whatsoever complicated about doing it, that's also how eBPF runtime itself does it. I follow eBPF development, and I don't see anything that would require some sort of complicated development, and yes, this even includes BTF for debugging.

You "follow eBPF development" and you haven't seen anything that would require "complicated development". Ah, thanks for clarifying, I didn't realize you were an expert. I guess it's not complicated because you just have to use API wrappers, right?

Also, if this is exactly what BPF does, why are you telling people to use WASM? You still haven't actually answered why WASM is somehow a better alternative.

>>Years? I thought you said it could be done in less than a month. And citation needed about the political aspect.

>Yes, years. Because the userspace infrastructure that does it is complicated. And moving it into the kernel would require rewriting it in C, and it'll also have to be able to completely emulate eBPF at this point. Nobody is going to allow two separate implementations.

Again, why is it suddenly now years instead of a month? You promised it was easy -- you've run Excel in a WASM sandbox after all! Also, I don't know what you mean by "emulate BPF". Yes, you would probably have to provide many of the features that BPF provides, because they're useful and specific to being in the kernel (and often in the Linux kernel in particular). Those are in fact the most useful part of BPF -- the instruction set itself is only a small part of the BPF programming ecosystem in the kernel. If it would take you years to do that, then it's not a month-long project.

Here's another instance of you purposefully misrepresenting something:

>>>Perhaps once Rust support in Linux matures enough.

>>This is not the same thing. If you want to dynamically load a program into the kernel, you're still going to need some kind of in-kernel verifier component. It doesn't matter if rustc compiled that binary in user space. It's not trusted if it's not verified in the kernel.

>I'm talking about using WASM runtimes with JIT (wasmtime, wasmer, wasmedge). They ALL provide that functionality, there's pretty much nothing in eBPF that is unique anymore. And whatever unique features it had, it had already thrown away long ago.

Actually, you literally said Rust two comments ago. And anyways, if BPF and WASM accomplish the same thing then why do you keep spamming people with handwavey WASM platitudes? Go write your C WASM verifier.

>My personal wish for eBPF folks would be to stop. Just stop all feature development for a year, and think about what you've done. Then make an actual plan of what eBPF should look like in 10 years.

Let's recap the sum of your argument:

1. WASM is a better alternative to BPF. You haven't actually ever said why beyond that "the code writes itself", but OK.

2. It would take you a month to build a WASM JIT engine in the kernel because all you need are wrapper APIs, and the code "writes itself".

3. You won't implement WASM because it would actually take years to implement, given how difficult it would be to write a WASM JIT engine in C in the kernel, and you'd have to implement all of the features that are currently provided by BPF. I don't see how this doesn't directly contradict your point in (2), but OK.

4. The BPF community should stop development because we need to think harder about what we've done, and where BPF will be in 10 years.

This is all self-contradicting nonsense. I can't stop you from falsely advertising this non-existent in-kernel WASM engine that you seem to believe is such a better alternative to BPF. I can, however, call you out on it.

The state of eBPF

Posted Feb 2, 2024 1:50 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

> I'll add double-quotes where relevant so people can see the parts of my previous comment that you conveniently declined to include in your replies.

I don't want to write page-long replies to educate you on the state-of-the-art outside of the narrow kernel world.

> Yes, I realize WASM is used in the tech industry as well, but not in the kernel because *nobody has implemented it yet, so it's meaningless to even talk about it*.

First, this is actually untrue: https://github.com/wasmerio/kernel-wasm WASM runtimes have also been ported to raw hardware. There is nothing whatsoever complicated in hosting a WASM runtime in the kernel.

> **And the part you didn't choose to include in your reply**

Because they are just the first things that you thought of, without taking a second to think about obvious ways they can be implemented.

> Also, if this is exactly what BPF does, why are you telling people to use WASM? You still haven't actually answered why WASM is somehow a better alternative.
> Actually, you literally said Rust two comments ago.
> Go write your C WASM verifier.

You clearly have no idea about WASM. Please go and at least do a cursory Google about it. The most advanced WASM runtimes that do JIT compilation are written in Rust, so that's why you need a substantial Rust infrastructure in the kernel, or a large rewrite into C. There are also no WASM verifiers because any WASM program is safe by construction.

And finally, WASM is superior because it's not a NIH-ed evolved mess without a clear vision.

> This is all self-contradicting nonsense. I can't stop you from falsely advertising this non-existent in-kernel WASM engine that you seem to believe is such a better alternative to BPF. I can, however, call you out on it.

Sorry, I'm not educating you for free at this point when you've clearly demonstrated a total lack of understanding of the subject matter.

If the kernel folks can chime in and say that adding WASM to the kernel would be OK at least in theory, even though it requires bringing in a substantial Rust-based infrastructure, then I'd be glad to do at least some of the work.

The state of eBPF

Posted Feb 2, 2024 2:21 UTC (Fri) by Manifault (guest, #155796) [Link] (7 responses)

>I don't want to write page-long replies to educate you on the state-of-the-art outside of the narrow kernel world.

That doesn't seem fair, given that I've written page-long replies educating you on the kernel side of things.

>First, this is actually untrue: https://github.com/wasmerio/kernel-wasm WASM runtimes have also been ported to raw hardware. There is nothing whatsoever complicated in hosting a WASM runtime in the kernel.

You've sent me a link to a project that hasn't had a commit in 4 years and doesn't even seem to compile. Also, taken from the link you just sent me:

"Running user code in kernel mode is always a dangerous thing. Although we use many techniques to protect against different kinds of malicious code and attacks, it's advised that only trusted binaries should be run through this module, in a short term before we fully reviewed the codebase for security."

It's not hard to write a C JIT engine that runs some generic ISA. It's hard to do it _safely_. If the WASM engine doesn't have a robust verifier, that's absolutely useless and goes against literally every point I've made repeatedly in this entire discussion. It's not difficult to write some JIT engine that executes a few instructions and can crash your system. In this 4-year old project you've sent me, I don't see any way to store and safely read kernel pointers, for example. Or to run WASM code from an NMI handler. Or to prevent infinite loops.

Here's another fun excerpt from that repo: "Check and ensure that your kernel has preemption enabled. Attempting to run WASM user code without kernel preemption will freeze your system."

>You clearly have no idea about WASM. Please go and at least do a cursory Google about it. The most advanced WASM runtimes that do JIT compilation are written in Rust, so that's why you need a substantial Rust infrastructure in the kernel, or a large rewrite into C. There are also no WASM verifiers because any WASM program is safe by construction.

I've done cursory readings about WASM. For example, I understand that infinite loops are possible in WASM. Guess what, that means they're unsuitable to run in the kernel. The fact that you just assume that WASM is safe because it's WASM shows how little you actually even care about the problem space. Also, if the kernel doesn't have the infrastructure you need and it would take years to add, then it will take years, and not a month. There's nothing else to discuss -- anything else is just hypothetical and divorced from reality. The BPF community had to implement and upstream an entire backend compiler in LLVM to get the project off the ground. They didn't go on LWN and complain about how bad kernel modules are and expect something to happen from that.

>And finally, WASM is superior because it's not a NIH-ed evolved mess without a clear vision.

This is obviously not a real criticism that deserves to be taken seriously.

>Sorry, I'm not educating you for free at this point when you've clearly demonstrated a total lack of understanding of the subject matter.

Sorry, what was that? I couldn't read your sentence because my machine is crashing because a buggy WASM program running in the kernel had an infinite loop.

>If the kernel folks can chime in and say that adding WASM to the kernel would be OK at least in theory, even though it requires bringing in a substantial Rust-based infrastructure, then I'd be glad to do at least some of the work.

Right, well, I'm sure if you suggest that people use WASM instead of BPF on more LWN articles, the "kernel folks" will take notice and tell you to go right ahead.

The state of eBPF

Posted Feb 2, 2024 10:32 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

> >I don't want to write page-long replies to educate you on the state-of-the-art outside of the narrow kernel world.

> That doesn't seem fair, given that I've written page-long replies educating you on the kernel side of things.

I hate to say it, but it would help if you read what Cyberax wrote.

Almost the first thing I picked up was you conflating Cyberax' months vs years. ONE month to WRITE. YEARS to get ACCEPTED. They're two completely different things, so don't treat them the same.

Please go back. Read what Cyberax wrote. Then maybe he might write page-long replies, if he thinks they're actually going to get read, and not skimmed and mis-understood.

(I shouldn't be too harsh - I have a habit of doing exactly what you're doing :-)

Cheers,
Wol

The state of eBPF

Posted Feb 2, 2024 16:08 UTC (Fri) by Manifault (guest, #155796) [Link]

> Almost the first thing I picked up was you conflating Cyberax' months vs years. ONE month to WRITE. YEARS to get ACCEPTED. They're two completely different things, so don't treat them the same.

Yes, I understand they're different, from extensive, painful *first hand* knowledge. I understand why you're confused by my replies, because I was also dealing with him saying conflicting things, and him giving handwavy, poorly described explanations of the work involved. In [0], he says this:

> Moving it into the kernel is not easy, though. And the biggest problem will be political. I'm pretty sure the NIH folks in Linux eBPF will veto the changes.

Yes, he's highlighting politics as the main blocker.

BUT, he also says the following in [1]:

>Yes, years. Because the userspace infrastructure that does it is complicated. And moving it into the kernel would require rewriting it in C, and it'll also have to be able to completely emulate eBPF at this point. Nobody is going to allow two separate implementations.

Most of this is an engineering problem, and he's conflating writing a tiny, useless, unsafe WASM engine as "1 month of work", and "making it so people will actually accept this as a useful thing by actually adding features that BPF already has" as "years of political work". He doesn't understand at all what BPF even does, despite his claims to the contrary. Search even just the #BPF section of the LWN index and you'll see what I mean. There's no way on earth that he would be able to write a fully safe and featureful WASM verifier in the kernel in one month. For example, he kindly linked me to this GitHub page: https://github.com/wasmerio/kernel-wasm, which contains a very minimal WASM engine that doesn't seem to compile, and has large warnings saying, "THIS IS UNSAFE" and "This will hang the kernel if preemption isn't enabled." If people throw a NAK in his face for adding that, it's not political, it's because he's several years from being done with the technical part of the project.

>Please go back. Read what Cyberax wrote. Then maybe he might write page-long replies, if he thinks they're actually going to get read, and not skimmed and mis-understood.

Everything he's saying is inconsistent, handwavey, and hypothetical, so it's easy to read two things he's said and come up with two different logical conclusions. I'm honestly pretty surprised the takeaway from that whole convo is, "David is completely missing his point about how long it would take to write a WASM JIT engine". Hey, maybe that's deserved, and I should curb my tone a bit. In fact, I'm sure that's true. But I am tired of reading his constant whinging about how BPF sucks compared to WASM on every single BPF article that's ever posted on this site (including ones that I've written), when it's all completely hypothetical, and he's not indicated an intention to lift a finger to actually build the thing. That's the actual salient point here.

It's an open data plane and he's free to write and say whatever he wants, just like I'm free to call him a hypocrite for saying that the BPF community should halt all work immediately because we have no idea what we're doing (see [1] for his actual wording), while he's doing literally nothing but sitting around complaining on LWN about how nobody uses his non-existent tool that he doesn't even have any plans on building.

[0]: https://lwn.net/Articles/960518/
[1]: https://lwn.net/Articles/960521/

The state of eBPF

Posted Feb 2, 2024 17:51 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

> That doesn't seem fair, given that I've written page-long replies educating you on the kernel side of things.

I know how the kernel development works. And I have first-hand experience with eBPF and WASM.

> You've sent me a link to a project that hasn't had a commit in 4 years and doesn't even seem to compile. Also, taken from the link you just sent me:

This is to show you that just hosting an existing WASM runtime in the kernel is trivial. It's a job that can be done within a week, and adding eBPF support is another couple of weeks (and would you know it, there's a project doing this: https://github.com/eunomia-bpf/wasm-bpf ). Adding the second 90% of the work (BTF, SPECTRE mitigations) will be more complicated, but nothing non-trivial.

> It's not hard to write a C JIT engine that runs some generic ISA. It's hard to do it _safely_. If the WASM engine doesn't have a robust verifier, that's absolutely useless and goes against literally every point I've made repeatedly in this entire discussion.

WASM is meant to execute and contain outright malicious code. It's designed to be safe by construction. And it's also designed to support features that eBPF is going to grow in the near future: coroutines and async, functions, loops, recursion.

> I've done cursory readings about WASM. For example, I understand that infinite loops are possible in WASM.

What did I say about educating you?... OK, this is the last one that is for free.

WASM runtimes support a concept called "metering". You can allocate a certain amount of "fuel" to a WASM program, and the runtime will either suspend or terminate the program, once the fuel is exhausted. Fuel accounting is updated at backbranches and function calls, so this makes arbitrary loops safe. Example: https://github.com/bytecodealliance/wasmtime/blob/main/ex...
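
As a rough illustration of the metering idea described above, here is a toy stack machine in Python that charges one unit of fuel each time a backward branch is taken. This is a sketch under simplifying assumptions, not Wasmtime's implementation: the instruction set, the `run` function and the `OutOfFuel` exception are made up for the example, and function calls are left out entirely.

```python
class OutOfFuel(Exception):
    """Raised when the guest program has used up its fuel allowance."""


def run(prog, fuel):
    """Run a toy program, charging one unit of fuel per backward branch taken.

    Instructions (hypothetical): ("push", n), ("dec",), ("jmp", target),
    ("jnz", target) which jumps if the top of the stack is non-zero, ("halt",).
    """
    stack, pc = [], 0
    while True:
        op = prog[pc]
        if op[0] == "push":
            stack.append(op[1])
            pc += 1
        elif op[0] == "dec":
            stack[-1] -= 1
            pc += 1
        elif op[0] in ("jmp", "jnz"):
            taken = op[0] == "jmp" or stack[-1] != 0
            if not taken:
                pc += 1
                continue
            if op[1] <= pc:  # backward branch: the only way to loop
                if fuel == 0:
                    raise OutOfFuel(f"out of fuel at instruction {pc}")
                fuel -= 1
            pc = op[1]
        elif op[0] == "halt":
            return stack, fuel
        else:
            raise ValueError(f"unknown instruction {op[0]!r}")


# Counts down from 3, taking the backward branch twice.
countdown = [("push", 3), ("dec",), ("jnz", 1), ("halt",)]

# Never terminates on its own.
spin = [("jmp", 0)]
```

With these definitions, `run(countdown, fuel=10)` returns `([0], 8)`, and `run(spin, fuel=5)` raises `OutOfFuel` after five trips around the loop. Because forward-only code always makes progress, checking at backward branches is enough to bound this toy machine; the sketch does not address what should happen to resources the guest holds when it is cut off.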

Alternatively, it's possible to terminate/suspend a WASM program simply via an epoch counter mechanism ( https://github.com/bytecodealliance/wasmtime/blob/main/cr... ), that can be triggered by a timer on timeout. Example: https://github.com/bytecodealliance/wasmtime/blob/main/ex...
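
The epoch approach can be sketched in the same spirit. In the Python model below, a shared counter is bumped by a timer, and the guest loop compares it against a deadline at every back-edge. The `Engine` class, `spin_until_interrupted` and `EpochInterrupt` are names invented for this sketch, a user-space `threading.Timer` stands in for whatever would drive the counter in a real host, and none of this is taken from Wasmtime's code.

```python
import threading


class EpochInterrupt(Exception):
    """Raised when the guest is still running after its epoch deadline."""


class Engine:
    """Holds the shared epoch counter (hypothetical name for this sketch)."""

    def __init__(self):
        self.epoch = 0

    def increment_epoch(self):
        # Called from outside the guest, for example by a timer.
        self.epoch += 1


def spin_until_interrupted(engine, deadline_ticks=1):
    """A guest loop that never exits on its own.

    The comparison at the loop back-edge is the only way out: once the
    epoch reaches the deadline, the loop raises EpochInterrupt.
    """
    deadline = engine.epoch + deadline_ticks
    while True:
        if engine.epoch >= deadline:
            raise EpochInterrupt(f"deadline {deadline} reached")


def demo(timeout_seconds=0.05):
    """Start a timer that bumps the epoch, then run the spinning guest."""
    engine = Engine()
    timer = threading.Timer(timeout_seconds, engine.increment_epoch)
    timer.start()
    try:
        spin_until_interrupted(engine)
    except EpochInterrupt:
        return "interrupted"
    finally:
        timer.join()
    return "completed"
```

Calling `demo()` returns `"interrupted"` once the timer fires. Compared with fuel, the check is a single comparison against a counter that something else advances, so this model depends on that something else actually getting to run.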

eBPF initially didn't need anything like this, because it can statically prove that eBPF programs always terminate after a fixed number of instructions. But then the reality intruded, and eBPF grew a host of helper functions for string manipulation, unbounded external iterators, and so on. Moreover, the maximum instruction limit got increased, so now it's trivial to write eBPF programs that are "bounded" in name only.
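
For contrast, the static guarantee described above can be modelled with a small Python checker that rejects any program containing a backward jump and then computes the longest possible path. It is a sketch of the general technique only: the instruction names, the `worst_case_steps` function and the default limit are illustrative assumptions, and the real verifier does far more than this.

```python
class Rejected(Exception):
    """Raised when the toy checker refuses to load a program."""


def worst_case_steps(prog, max_insns=4096):
    """Return the longest instruction path, rejecting anything that could loop.

    Instructions (hypothetical): ("jmp", target), ("jnz", target), ("exit",),
    and anything else is treated as a straight-line instruction.
    The max_insns default is an illustrative value for this sketch.
    """
    n = len(prog)
    if n == 0 or n > max_insns:
        raise Rejected(f"program size {n} outside 1..{max_insns}")
    longest = [0] * n
    # Walk backwards so every successor has been computed already.
    for pc in range(n - 1, -1, -1):
        kind = prog[pc][0]
        if kind == "exit":
            longest[pc] = 1
            continue
        if kind == "jmp":
            successors = [prog[pc][1]]
        elif kind == "jnz":
            successors = [pc + 1, prog[pc][1]]
        else:
            successors = [pc + 1]
        for target in successors:
            if target <= pc:
                raise Rejected(f"back-edge from {pc} to {target}")
            if target >= n:
                raise Rejected(f"instruction {pc} runs past the end")
        longest[pc] = 1 + max(longest[t] for t in successors)
    return longest[0]


straight = [("mov",), ("jnz", 3), ("mov",), ("exit",)]
looping = [("mov",), ("jmp", 0)]
```

Here `worst_case_steps(straight)` returns `4`, and `worst_case_steps(looping)` raises `Rejected`. The bound only counts instructions in the program itself; if one of those instructions is a call into a helper whose running time depends on its input, the count stops telling you how long the program runs, which is the point being made above.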

As a result, eBPF had to add full-blown exceptions to support error conditions and cancellations: https://elixir.bootlin.com/linux/v6.8-rc1/source/kernel/b...

> Right, well, I'm sure if you suggest that people use WASM instead of BPF on more LWN articles, the "kernel folks" will take notice and tell you to go right ahead.

I'm not too invested in eBPF/WASM, but it's terrible to see a project keep digging themselves deeper into a hole.

The state of eBPF

Posted Feb 2, 2024 18:47 UTC (Fri) by Manifault (guest, #155796) [Link] (3 responses)

>This is to show you that just hosting an existing WASM runtime in the kernel is trivial. It's a job that can be done within a week, and adding eBPF support is another couple of weeks (and would you know it, there's a project doing this: https://github.com/eunomia-bpf/wasm-bpf ). Adding the second 90% of the work (BTF, SPECTRE mitigations) will be more complicated, but nothing non-trivial.

Uh, that project literally *uses BPF in the kernel* and has a WASM runtime in user space? That's not even remotely the same thing as having a WASM engine itself run entirely in the kernel. Seriously, stop sending hypothetical examples that show how easy something _could_ be. It's worth literally nothing until you have actual code.

>> I've done cursory readings about WASM. For example, I understand that infinite loops are possible in WASM.

>What did I say about educating you?... OK, this is the last one that is for free.

I was basing this off of Mozilla documentation: https://developer.mozilla.org/en-US/docs/WebAssembly/Refe.... What you're describing below sounds like some likely-inappropriate-for-kernel-space detail of a specific WASM runtime. I may be wrong about that being specific to a WASM runtime though, I freely admit that I'm not at all a WASM expert (unlike your willingness to admit that you know very little about BPF).

>WASM runtimes support a concept called "metering". You can allocate a certain amount of "fuel" to a WASM program, and the runtime will either suspend or terminate the program, once the fuel is exhausted. Fuel accounting is updated at backbranches and function calls, so this makes arbitrary loops safe. Example: https://github.com/bytecodealliance/wasmtime/blob/main/ex...

Ok, so how does that work if you have a synchronous program that can't be interrupted? Or if you're in an RCU read region? All of these little details that matter a lot in kernel space that you can just nuke away in user space.

>Alternatively, it's possible to terminate/suspend a WASM program simply via an epoch counter mechanism ( https://github.com/bytecodealliance/wasmtime/blob/main/cr... ), that can be triggered by a timer on timeout. Example: https://github.com/bytecodealliance/wasmtime/blob/main/ex...

Ok, so what does the runtime do if it's running in an NMI handler? Or even just with IRQs disabled? No timer anymore. I guess the WASM runtime will have special logic that verifies that a WASM program running in those contexts is safe without timers?

Maybe there are answers to these questions, but they require actual code instead of hypotheticals and links to things that work in some user space runtime. Point me to as many GitHub repos as you want. They mean literally nothing until you have working code that's not broken and actually runs in the kernel.

>As a result, eBPF had to add full-blown exceptions to support error conditions and cancellations: https://elixir.bootlin.com/linux/v6.8-rc1/source/kernel/b...

Notice how you linked actual upstream source code? As opposed to all of your examples of links to either defunct unsafe projects, or WASM projects that run in user space and literally use BPF under the hood to run in the kernel? And yes, I'm well aware of exceptions. I also recall your totally level-headed comment in [0] where you decried how useless it was.

[0]: https://lwn.net/Articles/938435/

>Guys, really? You are now adding freaking exceptions.
>This is not just "feature creep", it's a "feature runaway train at 100mph".

Imagine my shock when I saw that Cyberax was upset that someone had added an incredibly useful feature to BPF. If you hated it so much, why didn't you say anything on the mailing list? Oh, and here's another one:

>One of the bragging points of BPF was "but it always returns a value!", and that's why it's apparently better than WASM.
>With the addition of exceptions, this guarantee is lost.
>Not that it mattered either way in practice, but still. BPF is now just adding features without even considering their impact on the overall BPF model.

Sorry, what are you even talking about? You keep boiling BPF down to some very simple JIT thing and ignoring the extremely rich *Linux kernel* programming ecosystem of BPF that's far more robust than just "does it return a value". Also, things can evolve. Here, I recommend reading an article [1] that I wrote about the future direction of BPF.

[1]: https://lwn.net/Articles/909095/

Shocker, there's another Cyberax comment on that one too:

>Can we just switch from BPF to WASM?

You made that comment almost a year and a half ago. I wonder how many WASM-in-the-kernel patches you've sent to LKML since then?

>I'm not too invested in eBPF/WASM, but it's terrible to see a project keep digging themselves deeper into a hole.

It's abundantly clear that you're not invested in BPF based on how little you know about it, and how you never actually participate in any discussion forums where your contributions would be meaningful.

Anyways, I'm done with this convo. Feel free to send some patches upstream that actually add WASM to the kernel, and I promise you that I will give them a thorough and genuine review. Or if you'd prefer to keep spamming LWN with your hatred of BPF and your dreams of a hypothetical WASM utopia, go ahead.

The state of eBPF

Posted Feb 2, 2024 19:16 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> Ok, so how does that work if you have a synchronous program that can't be interrupted? Or if you're in an RCU read region?

All back-branches are instrumented with the fuel check, so any JIT-compiled WASM program can be interrupted. Nothing whatsoever makes it a problem with RCU. Also, eBPF is just as interruptible.

I'm done trying to educate you, because you quite clearly have no idea what you're talking about, and you're throwing half-understood things (RCU! Kernel! Synchronous!) around without even bothering to do a cursory check.

The state of eBPF

Posted Feb 2, 2024 19:48 UTC (Fri) by Manifault (guest, #155796) [Link] (1 responses)

>All back-branches are instrumented with the fuel check, so any JIT-compiled WASM program can be interrupted. Nothing whatsoever makes it a problem with RCU. Also, eBPF is just as interruptible.

Sigh, I wasn't talking about whether the program is _running_ in an RCU read region. I was saying that it itself opens an RCU read region and is in the middle of it when the magical WASM runtime meter thing decides it needs to exit: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-n...
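
The scenario in question can be modelled in a few lines of Python: a guest takes a resource, then gets cut off by the meter before it reaches its own release. The sketch below shows one conceivable answer, in which the runtime records what the guest holds and force-releases it when the guest is terminated. Everything here is hypothetical (`ToyRuntime`, `back_edge`, the `"rcu_read_lock"` label is just a string), and it makes no claim about how BPF or any WASM runtime handles this, or about whether such an approach would be acceptable in kernel context.

```python
class OutOfFuel(Exception):
    """Raised at a back-edge when the guest has no fuel left."""


class ToyRuntime:
    """Tracks what the guest holds so it can clean up after a forced exit."""

    def __init__(self, fuel):
        self.fuel = fuel
        self.held = []  # resources currently held, in acquisition order
        self.log = []   # record of what happened, for inspection

    def acquire(self, name):
        self.held.append(name)
        self.log.append(f"acquire {name}")

    def release(self, name):
        self.held.remove(name)
        self.log.append(f"release {name}")

    def back_edge(self):
        # The guest calls this at every loop back-edge.
        if self.fuel == 0:
            raise OutOfFuel("fuel exhausted")
        self.fuel -= 1

    def run(self, guest):
        try:
            guest(self)
            return "completed"
        except OutOfFuel:
            return "terminated"
        finally:
            # Release anything still held, most recent first.
            while self.held:
                self.log.append(f"forced release {self.held.pop()}")


def stuck_guest(rt):
    """Opens a read-side section, then spins and never reaches the release."""
    rt.acquire("rcu_read_lock")
    while True:
        rt.back_edge()
    rt.release("rcu_read_lock")  # unreachable in this guest
```

Running `ToyRuntime(fuel=3).run(stuck_guest)` returns `"terminated"`, and the runtime's log ends with `"forced release rcu_read_lock"`. The model sidesteps the hard questions raised in this thread, such as what state the guest leaves behind mid-section and who is allowed to run the cleanup.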

>I'm done trying to educate you, because you quite clearly have no idea what you're talking about, and you're throwing half-understood things (RCU! Kernel! Synchronous!) around without even bothering to do a cursory check.

Thanks. Looking forward to seeing your contribution to the Linux kernel community on the next BPF-related LWN article.

The state of eBPF

Posted Feb 2, 2024 23:53 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

I'm done educating you for free. RCU or locking are not at all a problem for WASM.

Hint: eBPF supports exceptions and RCU. How does it do that? Hint 2: eBPF supports only one RCU lock at a time.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds