Lua in the kernel?

Posted Sep 18, 2020 21:41 UTC (Fri) by HelloWorld (guest, #56129)
In reply to: Lua in the kernel? by farnz
Parent article: Lua in the kernel?

> You can compile a Turing complete language to a non-Turing complete language, by rejecting valid programs.

If you reject valid programs, then you're not compiling the Turing-complete language but a non-Turing-complete language that happens to be a subset of the Turing-complete one.

Lua in the kernel?

Posted Sep 19, 2020 13:33 UTC (Sat) by farnz (subscriber, #17727) [Link] (19 responses)

While that's true, it's a statement that only applies in theory - we compile Turing complete languages such as C down to finite state machines like x86-64 and AArch64 all the time, and there's no useful outcome from saying that we don't, in fact, compile C, we compile a non-Turing complete language that happens to be a subset of C.

Lua in the kernel?

Posted Sep 19, 2020 14:10 UTC (Sat) by HelloWorld (guest, #56129) [Link] (18 responses)

I have already addressed this nonsense elsewhere:

https://lwn.net/Comments/831793/

I believe our editor already asked you to stop with the personal insults...

Posted Sep 20, 2020 22:00 UTC (Sun) by sdalley (subscriber, #18550) [Link] (17 responses)

...so you are now in my rather short plonk list. The poor signal-to-noise ratio just isn't worth it.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 21, 2020 5:13 UTC (Mon) by smurf (subscriber, #17840) [Link] (16 responses)

That remark was about the content of your post, not about your person. Thus it was not a personal insult by definition.

Yes, calling a post "nonsense" is harsh. On the other hand, a discussion only works by listening to the other side's arguments, and @farnz plainly didn't do that.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 21, 2020 8:48 UTC (Mon) by farnz (subscriber, #17727) [Link] (14 responses)

Actually, smurf, I believe that HelloWorld is making an argument without a point and refusing to listen to anyone who disagrees with him - this is emphasised by him pointing to a comment he made in another subthread ; Turing-completeness is of limited practical value, as it's an argument from theory. But in practice, we have no Turing machines to run programs on; we always run on finite state machines in as far as our physics goes (we have reason to believe that the distance over which you can communicate is finite, that everything is contained in quantum states, and that the number of quantum states in a finite space is also finite).

If you are going to appeal to Turing-completeness, then you also need to explain why, in this case, the distinction you are drawing is helpful. Otherwise, it's just a claim that the observable universe is a finite space.

As I said in my comment, which HelloWorld has completely and utterly failed to respond to with anything other than an insult, the question is not about Turing completeness - it's about which programs are rejected that could have been accepted. That is an interesting question, and worth considering. Are the programs rejected by compiling C to BPF actually interesting in this context, or are they of no consequence?

I believe our editor already asked you to stop with the personal insults...

Posted Sep 22, 2020 23:31 UTC (Tue) by HelloWorld (guest, #56129) [Link] (13 responses)

> Actually, smurf, I believe that HelloWorld is making an argument without a point and refusing to listen to anyone who disagrees with him - this is emphasised by him pointing to a comment he made in another subthread ; Turing-completeness is of limited practical value, as it's an argument from theory. But in practice, we have no Turing machines to run programs on; we always run on finite state machines

Yes, I pointed to a comment that I made in another subthread. In that thread excors wrote this:
> In practice C is not Turing complete, because it runs on systems with limited memory and limited time.
This is the *exact same point* that you made:
> we compile Turing complete languages such as C down to finite state machines like x86-64 and AArch64

> If you are going to appeal to Turing-completeness, then you also need to explain why, in this case, the distinction you are drawing is helpful.
I already did in my other comment. The point is that while the number of states a computer can be in is finite, it is still so large that there is no *practical* way for arbitrary programs whether they will halt. A computer is thus a good enough approximation of a turing machine *in practice* that results from theoretical compsci do carry over. And the “compiling C to BPF” use case clearly shows this: it can't be done, which is why they can't compile C to BPF but a less general subset of C.

> As I said in my comment, which HelloWorld has completely and utterly failed to respond to
You just didn't understand the response, and that is not a problem on my side.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 2:06 UTC (Wed) by excors (subscriber, #95769) [Link]

Consider these languages:

1) C compiled to verifiable BPF
2) C compiled to a finite x86 machine
3) C compiled to a hypothetical unbounded machine
4) C compiled to verifiable BPF then implemented on a universal Turing machine implemented in Conway's Game of Life implemented on a hypothetical unbounded machine

(By "C" I mean some C-like language that is appropriate for the target architecture, not strictly ISO standard C.)

Theoretical CS says that the halting problem is solvable for 1, 2, and 4, but unsolvable for 3. Only 3 is Turing complete.

In practice, you can only determine whether a program will halt in a reasonable amount of time on 1.

Your original question was about how can GCC compile C to BPF. Obviously GCC can already compile C to x86 (and I assume you're happy to say C-on-x86 really is C). So your question was about the difference between languages 1 and 2, and the difference is not in their Turing completeness, because both are incomplete.

In general, proofs involving Turing machines seem perfectly happy to take a finite algorithm and make it exponentially more expensive, e.g. using unary representation for numbers, because it'll remain finite (which is all they care about). The Game of Life UTM is a reasonable theoretical construction, despite being completely absurd in practice. So the techniques used to examine Turing completeness are very different to the techniques you need to practically determine if a program will halt in reasonable time. When trying to solve almost any practical problem, Turing completeness is an irrelevant distraction, which is why I pushed back against framing the question that way.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 8:52 UTC (Wed) by farnz (subscriber, #17727) [Link] (11 responses)

> > If you are going to appeal to Turing-completeness, then you also need to explain why, in this case, the distinction you are drawing is helpful.
> I already did in my other comment. The point is that while the number of states a computer can be in is finite, it is still so large that there is no *practical* way for arbitrary programs whether they will halt. A computer is thus a good enough approximation of a turing machine *in practice* that results from theoretical compsci do carry over. And the “compiling C to BPF” use case clearly shows this: it can't be done, which is why they can't compile C to BPF but a less general subset of C.

That's still avoiding the key point - the important question is nothing to do with the halting problem, but whether what you can compile from C to BPF (or x86-64, or AArch64) is useful. If the subset of C that compiles to BPF is sufficiently large and expressive, then it doesn't matter that obscure capabilities are missing (e.g. true with AArch64, as the inability to have a single value in memory that needs 2**68 bits to represent it is not an issue). If it's not large enough, then there's a problem - but that requires not an appeal to Turing-completeness, but a pointer to language features that cannot be used in C->BPF compilation, together with an explanation of why those features matter in the real world.

> > As I said in my comment, which HelloWorld has completely and utterly failed to respond to
> You just didn't understand the response, and that is not a problem on my side.

No, that is a problem on your side - you are assuming that if people don't understand what you're saying, it's because they are the issue, and not your explanation. Taken to extremes, this attitude is why the British conquerors thought Africans were stupid - the Africans didn't understand English, and therefore the problem was on their side.

And I would note that this applies both ways round - you've ignored the meat of my disagreement in favour of showing off how clever you are and making statements about theoretical CS that don't apply directly to the real world. Does that imply that you just don't understand CS?

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 10:12 UTC (Wed) by HelloWorld (guest, #56129) [Link] (10 responses)

> That's still avoiding the key point - the important question is nothing to do with the halting problem, but whether what you can compile from C to BPF (or x86-64, or AArch64) is useful.

I started this thread, so I get to determine what the question was about. I asked how it's possible to compile C to BPF when that is clearly impossible due to the halting problem (nitpicking about finite memory notwithstanding). And the answer is that they *don't* compile C to BPF but only a subset. Whether that subset is useful or not is completely besides the point.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 10:17 UTC (Wed) by farnz (subscriber, #17727) [Link] (8 responses)

But, by that very statement, compiling C to x86-64 is also impossible, and yet you've said that the subset of C that compiles to x86-64 is fine by you, citing the fact that the limits on what C can be compiled to x86-64 are so large that they are practically not relevant.

I'm asking you why you don't feel the same way about the subset of C that compiles to a practical BPF program - after all, I can just raise the limits on BPF (as we do with x86-64) to get into the same situation where the limits exist but are so huge that they amount to the same limits as x86-64.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 12:05 UTC (Wed) by foom (subscriber, #14868) [Link] (3 responses)

It's different precisely because you *cannot* simply raise the limits to fix the problem. BPF requires the program to be proven to normally terminate on all possible inputs, before the program can be run. This is an entirely different ball game from C or x86-64, which require no such proof.

And -- of course -- most programs which in fact DO always terminate (without raising an exception, and within the proscribed run time bounds) cannot be proven to do so by the BPF verifier, and are therefore incorrectly rejected. This will not change if you increase the limits.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 17:40 UTC (Wed) by farnz (subscriber, #17727) [Link] (2 responses)

Except that BPF has no such requirement - the kernel's eBPF validator will not accept a program that it cannot verify will terminate normally on all possible inputs, but that's a consequence of that restricted execution environment, and not an eBPF requirement.

The same restriction would apply to x86-64 if the execution environment wanted to verify that it terminates on all possible inputs. That does not mean that you can't compile C to eBPF or x86-64 - it just means that in the context of compiling for the kernel eBPF validator (which would be the same for x86-64, or Lua bytecode, or WebAssembly), you can't use constructs that the verifier would reject.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 21:13 UTC (Wed) by foom (subscriber, #14868) [Link] (1 responses)

> Except that BPF has no such requirement

Oh come on, this is a useless nitpick.

There is effectively no use for eBPF, except to feed it to the Linux kernel. And for that purpose it must be verifiable. So, yes, fine -- when I said "BPF requires" I really meant "BPF-with-verifier-as-implemented-by-the-linux-kernel-for-use-from-user-space requires". _That_ is the target we're talking about in this whole thread, not the practically-useless "BPF-without-verifier" target.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 24, 2020 10:23 UTC (Thu) by farnz (subscriber, #17727) [Link]

But that limit can be raised, too - it's an artificial limit the kernel imposes so that it can safely accept user programs that get to hold the CPU. We can change that limit if it's a problem - we can (for example) allow programs that can't be proven to terminate a fixed number of executed instructions after which a default result is produced.

Which puts us right back at the original problem that HelloWorld is continuing to not answer; what practical programs would you like to express in C that cannot be compiled down to eBPF right now, and can we fix that up?

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 22:34 UTC (Wed) by HelloWorld (guest, #56129) [Link] (3 responses)

> I'm asking you why you don't feel the same way about the subset of C that compiles to a practical BPF program

Because that C subset is much, much smaller than the C subset that is implemented by common C implementations. It's reasonable to say that C is a good approximation of a Turing machine (from a computability perspective), but the same cannot be said for the C subset that can be compiled to verified BPF.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 24, 2020 10:26 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

Still no specifics - if there are things you want to express in C that cannot be compiled to eBPF as it stands today, what exactly are they? How difficult is it to change the rules of the verifier such that those things are accepted?

The reason I don't care about x86-64 being more limited than C on a Turing Machine is that there's nothing practical we want to do with C that doesn't also run on x86-64; as far as I can see, the same currently applies with eBPF (in that, while there are things I can't do in C compiled to eBPF, they're also things I don't want to do with an eBPF program anyway).

I believe our editor already asked you to stop with the personal insults...

Posted Sep 24, 2020 12:26 UTC (Thu) by nix (subscriber, #2304) [Link] (1 responses)

Neither loops of runtime-computed length nor indirect function calls through function pointers are exactly unusual, yet neither are verifiable in BPF as it stands, and the latter is not even expressible (which was one of the motivations for xBPF).

Indeed, both are so far from unusual that the BPF verifier itself makes extensive use of both.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 25, 2020 8:01 UTC (Fri) by Wol (subscriber, #4433) [Link]

> Neither loops of runtime-computed length

And yet the original FORTRAN spec didn't permit these either (nor did it deny them). It explicitly permitted storing the loop counter in a read-only register, which imho BPF should certainly permit.

Cheers,
Wol

I believe our editor already asked you to stop with the personal insults...

Posted Sep 23, 2020 19:31 UTC (Wed) by smurf (subscriber, #17840) [Link]

> I asked how it's possible to compile C to BPF when that is clearly impossible due to the halting problem

Well … we're pretty far away from the kind of complexity where the halting problem, in the CS theory sense of that word, would be a meaningful boundary for verifying BPF-or-whatever programs.

There are lots of valid algorithms that can easily be proven to yield a correct result within some small bounded runtime – but which you cannot write in BPF because its set of operations is incomplete and/or because its verifier is too dumb.

This difference matters because a somewhat-extended instruction set plus an improved verification algorithm (with back propagation and a couple of other tricks) might well be able to cope with a large-enough class of BPF programs so that the current problems, esp. with optimizing compilers, can be solved.

This kind of verifier probably is too large and complex to implement inside the kernel, but doing it in userspace and signing the result would work just as well.

I believe our editor already asked you to stop with the personal insults...

Posted Sep 21, 2020 9:16 UTC (Mon) by sdalley (subscriber, #18550) [Link]

"Nonsense" is actually an OK thing to call an argument if it doesn't make any sense. I had just followed HelloWorld's own link to where he accuses subscribers worthy of a good deal more respect than his behaviour deserves, of being smart-arses. I prefer not to waste time with such unpleasantness.