DeVault: Announcing the Hare programming language
Posted May 4, 2022 13:56 UTC (Wed) by Vipketsh (guest, #134480)
In reply to: DeVault: Announcing the Hare programming language by felix.s
Parent article: DeVault: Announcing the Hare programming language
It would also be great if there were words to differentiate between undefined behaviour that can not be avoided (e.g. use-after-free) and those which we talk about only because compiler authors decided to explicitly add transformations based on said allowances (e.g. deleting NULL pointer checks due to NULL dereference being undefined). Lastly, I think we would be in a much better situation if 'undefined behaviour' would be read as 'allowed by the underlying machine, forbidden for the compiler'.
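The second kind looks like this in practice (a minimal sketch with an illustrative function, not taken from any real codebase):

int read_checked(int *p) {
    int v = *p;        /* UB if p == NULL, so the compiler assumes p != NULL... */
    if (p == NULL)     /* ...and deletes this branch as unreachable */
        return -1;
    return v;
}

On the underlying machine both the load and the comparison are perfectly meaningful; it is only the "undefined behaviour" licence that lets the compiler silently drop the check.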
Posted May 4, 2022 14:25 UTC (Wed) by mathstuf (subscriber, #69389)
Everyone keeps repeating this, but consider a case where you have some inline-able utility functions that all safeguard against `NULL` internally (because that's the sensible thing to do). Do you not want to let the compiler optimize out those checks when inlining the bodies into a caller that, itself, has a NULL-then-bail check on that pointer (a dereference implicitly being such a check as well)? If you want to remove this optimization, what kinds of optimizations are you willing to leave on the floor?
FWIW, once you get to LTO, the language it was written in may be long gone and seeing something like (forgive my lack of asm familiarity):

xor ecx, ecx ; if (!ptr) return;
jnz pc+2
ret
…
load eax, *ecx ; int x = *ptr;
…

and not remove that `xor/jnz/ret` sequence?
Posted May 4, 2022 14:48 UTC (Wed) by Vipketsh (guest, #134480)
It is a very reasonable optimisation, but you don't need "dereferencing NULL is undefined behaviour" to make it! Proving that optimisation valid can be done with range analysis: since you have the first check for NULL, you know that the pointer is not NULL afterwards, and thus any following checks against NULL can not evaluate to true.
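A minimal sketch of that reasoning (illustrative names):

static int helper(int *p) {
    if (p == NULL)     /* defensive check inside the utility function */
        return 0;
    return *p + 1;
}

int caller(int *p) {
    if (p == NULL)     /* the caller already bails on NULL... */
        return -1;
    return helper(p);  /* ...so after inlining, range analysis proves
                          the inner check is always false */
}

No UB reasoning is needed: the first test establishes p != NULL on that path.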
> what kinds of optimizations are you willing to leave on the floor?
If someone would actually quantify how much one or another optimisation gets us, we could have a discussion. I would guess that in many cases, it is something I could live with, but if people who do compiler benchmarks would tell me "well, that can lose you up to 10%" (or similar) I may very well concede that the undefined behaviour is worth keeping.
> FWIW, once you get to LTO, the language it was written in may be long gone
Yes, and no. LTO operates on the compiler's middle-end representation -- the same one on which almost all transformations occur. So, while the original language is indeed lost, all information from it needed to decide whether one or another transformation is valid is still present.
Posted May 4, 2022 15:55 UTC (Wed) by khim (subscriber, #9252)
It can be done and it was done. Here is an example with undefined behavior, here is another with unspecified behavior. They are not set in stone, yet they are the rules of the game. But most discussions about undefined behavior go like “why do you say I couldn't hold the basketball and run — I tried that and it works!”. Yes, it may work if the referee looks the other way. But it's against the rules. If you want to make it rules-compliant then you have to change the rules first! It is the end of story for a compiler writer or a C developer, just like “it's against the rules” is “the end of story” for the basketball player.

> They are *not* set in stone and there is no (technical) reason they can't be changed.

Sure. Raise that issue with the ISO/IEC JTC1/SC22/WG21 committee. If they agree — the rules will be changed. Or you can try to see what it takes to change the compiler and report back. But rules are rules, they are “the status quo”. Adherence to the rules doesn't need any justification, but changes to the rules do need one, sure. Such a thing already exists in the standard. It's called “implementation-defined behavior”. Undefined behavior is called undefined because it's, well… undefined.

P.S. There are cases where modern compilers break perfectly valid programs which don't, actually, trigger UB. That can only be called an act of sabotage, but those are different stories from what we are discussing here.
Posted May 4, 2022 18:03 UTC (Wed) by Vipketsh (guest, #134480)
It's also not like compiler writers have never ignored standards when it suited them. I think it was yourself who mentioned in some thread from a while ago that LLVM is doing all sorts of optimisations based on pointer provenance when there is exactly no mention of them in any standard. So, no, I can't just simply accept "it's against the rules" to end a discussion.
> “why do you say I couldn't hold basketball and run — I tried that and it works!”
To take your analogy further, maybe basketball would be better that way. If I think so, I can gather up a bunch of people, go to some court and try it for a while and, if it works, maybe the basketball rules committee will take an interest and change the rules, or maybe it's the birth of a new sport. So, yes, turning certain undefined behaviours into defined ones in a compiler is exactly the place to have this discussion and the rule changes can very well happen later.
Posted May 4, 2022 18:28 UTC (Wed) by mathstuf (subscriber, #69389)
FWIW, I go because I work on CMake and would like to ensure that modules are buildable (as would my employer). But I go to other presentations that I am personally interested in (when there's no schedule conflict with what I'm there to do) and participate.
With the pandemic making face-to-face less tenable, I expect it to be easier than ever for folks to attend committee meetings.
> I think it was yourself who mentioned in some thread from a while ago that LLVM is doing all sorts of optimisations based on pointer provenance when there is exactly no mention of them in any standard.
That was me I believe. My understanding is that without provenance, pointers are extremely hard to actually reason about and make any kind of sensible optimizations around (this includes aliasing, thread safety, etc.). It was something that was encountered when implementing the language that turned out to be underlying a lot of things but was never spelled out. There is work to figure out what these rules actually are and how to put them into standard-ese ("pointer zap" is the search term to use for the paper titles).
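A minimal sketch of the kind of reasoning provenance is meant to license (illustrative code, not from any paper): a pointer parameter cannot carry the provenance of a local whose address was never taken, so the store and the local cannot alias.

int observe(int *p) {
    int local = 1;
    *p = 2;            /* cannot refer to `local`: no pointer to it was ever created */
    return local;      /* can be constant-folded to 1 */
}

Without some such rule the compiler would have to assume `p` might point at `local` and reload it after the store.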
Posted May 4, 2022 18:51 UTC (Wed) by Vipketsh (guest, #134480)
(For reference, long ago I engaged with the Unicode people as part of a research group and that experience was one of dealing with megalomaniac cesspools, lobby groups, legal threats and the like -- truly awful)
Posted May 4, 2022 19:06 UTC (Wed) by khim (subscriber, #9252)
> That was me I believe.

That was a different discussion. This would have been great if, while these rules are not finalized, compilers produced working programs by default. Instead clang not just breaks programs by default, it even refuses to provide a -fno-provenance switch which may be used to stop miscompiling them!

And it's not as if the rules were actually designed, presented to the C/C++ committee (like happened with DR 236) and then rejected because of typical committee politics. At least then you could say “yes, changes to the standard are not accepted yet, but you can read about our rules here”. Instead, for two decades compilers silently miscompiled valid programs and their developers offered no rules which software developers could follow to develop programs which wouldn't be miscompiled! That's an act of deliberate sabotage.

And, worst of all, it has only gained prominence because of Rust: since Rust developers actually care about program correctness (and because LLVM miscompiled programs which they believed to be correct) they tried to find the actual list of rules that govern provenance in C/C++… and found nothing.

Yes, this provenance fiasco is an awful blow to C/C++ compiler developers. You couldn't say “the standard is a treaty, you have to follow it” and then turn around and add “oh, but it's too hard for me to follow it, thus I would violate it whenever it would suit me”. That's not a treaty anymore, it's a joke. But even then: the way forward is not to ignore the rules, but an attempt to patch them. Ideally new compilers should be written which would actually obey some rules, but that's too big of an endeavor, I don't think it'll happen.
Posted May 4, 2022 19:36 UTC (Wed) by mathstuf (subscriber, #69389)
There may be hope. Some docs: https://lightningcreations.github.io/lccc/xlang/pointer (though no mention of provenance yet(?)).
Posted May 5, 2022 9:59 UTC (Thu) by khim (subscriber, #9252)
Things like that (https://www.ralfj.de/blog/2020/12/14/provenance.html) should be impossible.

Today optimizations in compilers are a dark art: not only do they sometimes produce suboptimal code (that's probably something which we will never be able to fix), we discover, from time to time, that they are just simply invalid (as in: they turn perfectly valid programs into invalid ones).

Ideally we should ensure that all optimizations leave valid programs still valid (even if, perhaps, not optimal).
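A sketch close in spirit to the example in the linked post (illustrative, not copied from it):

#include <stdio.h>

int main(void) {
    int x = 0, y = 0;
    int *xp = &x + 1;                /* one-past-the-end of x: a valid pointer value */
    int *yp = &y;
    if ((void *)xp == (void *)yp) {  /* may well be true: y can sit right after x */
        *xp = 42;                    /* provenance says this may only touch x... */
        printf("%d\n", y);           /* ...so the optimizer may print 0 even though
                                        the two addresses compared equal */
    }
    return 0;
}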
Posted May 4, 2022 18:33 UTC (Wed) by khim (subscriber, #9252)
> I think it was yourself who mentioned in some thread from a while ago that LLVM is doing all sorts of optimisations based on pointer provenance when there is exactly no mention of them in any standard.

Indeed. That was a great violation of rules and that's what pushed me to start using Rust, finally. I was always sitting on the fence about it, but when it turned out that with C/C++ I have to watch not just for UBs actually listed in the standard, but also for other, vague and uncertain requirements which were supposed to be added to it more than a decade ago… at this point it became apparent that C/C++ are not suitable for any purpose. The number of UBs is just too vast, they are not enumerated (C tried but, as the provenance issue shows, failed to list them; C++ hasn't tried at all) and, more importantly, there is no automated way to separate code where UB may happen (and which I'm supposed to rewrite when a new version of the compiler is released) from code where UB is guaranteed to be absent (and which I can explicitly trust).

But that was just the final straw. Even without this gross violation of rules it was becoming more and more apparent that UBs have to go: it's not clear if a low-level language without UB is possible in principle, but even if you eliminate them from 99% of code using automated tools that would still be a great improvement.

Very rarely. Except for that crazy provenance business (which is justified by DR260, and the main issue lies with the fact that compilers started miscompiling programs without appropriate prior changes to the standard) I can only recall DR236 (where the committee acted as a committee and refused to offer any sane way to deal with the issue, just noting that the requirement of global knowledge is problematic). And in cases where it was shown that UB requirements are just too onerous to bear they changed the standard in favor of C++ developers, thus it was definitely not a one-way street.

You are perfectly free to do that with the compilers, too. Most are gcc/clang based nowadays, thus you can just start with implementing your proposed changes, then measure things and promote your changes. Indeed, but it's your responsibility to change the compiler and show that it brings real benefits. Just like most basketball players: they wouldn't even consider trying to play basketball with some changed rules unless you can show that other prominent players are trying that changed variant. Some changes may even become a new -fwrapv, who knows? But it's changes to the language that need a justification; the rules are rules by default.
Posted May 4, 2022 19:14 UTC (Wed) by Wol (subscriber, #4433)
It's easy to eliminate UB from any language. A computer language is Maths, and as such you can build a complete and pretty much perfect MODEL.
When you run your program, it's Science (or technology), and you can't guarantee that your model and reality coincide, but any AND ALL UB should be "we have no control over the consequences of these actions, because we are relying on an outside agent". Be it a hardware random number generator, a network card receiving stuff over the network, a buggy CPU chip, etc etc. The language should be quite clear - "here we have control therefore the following is *defined* as happening, there we have no control therefore what happens is whatever that other system defines". If that other system doesn't bother to define it, it's not UB in your language.
And that's why Rust and Modula-2 and that all have unsafe blocks - it says "here be dragons, we cannot guarantee language invariants".
Cheers,
Wol
Posted May 4, 2022 19:32 UTC (Wed) by khim (subscriber, #9252)
> And that's why Rust and Modula-2 and that all have unsafe blocks - it says "here be dragons, we cannot guarantee language invariants".

You just have to remember that not all unsafe blocks are marked with nice keywords. E.g. a presumably “safe” Rust program may open /proc/self/mem. But even then: it's time to stop making languages which allow UB to happen in just any random piece of code which does no such crazy things!

Sometimes the model is just too restrictive. E.g. in Rust you have to go into the unsafe realm just to create a queue or linked list. But even that rigid and limited model allows one to remove UB from a surprising percentage of your code. And it's the only way to write complex code. The C/C++ approach doesn't scale. We don't have enough DJBs.
Posted May 5, 2022 0:33 UTC (Thu) by tialaramex (subscriber, #21167)
Provenance is difficult, I would agree with you that C++ didn't do a great job here by trying to kick this ball into the long grass rather than wrestle with the difficult problem, but I think we should be serious for a moment and consider that if C++ 11 had tried to do what Aria is merely proposing (and of course hasn't anywhere close to consensus to actually do for real in stable yet) for Rust, it would have been stillborn.
The same exact people who are here talking about some hypothetical C dialect or new language where it does what they meant, (whatever the hell that is) would have said Provenance is an abomination, just let us do things the old way. Even though "the old way" doesn't have any coherent meaning which is why this came up as a Defect Report not as a future feature proposal.
Posted May 5, 2022 10:30 UTC (Thu) by khim (subscriber, #9252)
> Even though "the old way" doesn't have any coherent meaning which is why this came up as a Defect Report not as a future feature proposal.

It was abuse of the system, plain and simple. One part of the standard was saying, back then, that only visible values matter and that if two pointers are identical they should behave identically. That's the understanding of the vast majority of practical programmers and this is what should have been kept as the default (even if it affected optimizations). The other part of the standard was talking about pointer validity and contradicted the first, e.g. realloc in C99 (but, notably, not in C89) is permitted to return a different pointer which can be bitwise identical to the original one. That's what saboteurs wanted to hear and that's what they used to sabotage the C/C++ community.

It's not a “difficult problem” at all. If the sabotage had failed, then the rules for when identical pointers are not considered to be identical would have become not a “hidden”, “unwritten” part of the standard, but a non-standard mode, an extension. If it had been proven that they help to produce much better code then they would have been enabled explicitly in many projects. And people would have become aware. Instead the language was silently and without evidence changed behind the developers' backs. That's a real serious issue IMNSHO. A layman is not supposed to know all the fine points of the law. S/he especially is not supposed to know all the unwritten rules.
Posted May 5, 2022 12:47 UTC (Thu) by foom (subscriber, #14868)
> sabotage

This is inflammatory and unhelpful language. That certainly doesn't actually describe anyone involved, as I suspect you're well aware.
I believe what you actually mean is that you disagree strongly with some of the decisions made, and consider them to have had harmful effects. Say that, not "saboteurs".
Posted May 5, 2022 13:02 UTC (Thu) by wtarreau (subscriber, #51152)
The first rule should be not to break what has reliably worked for ages. *even if that was incorrect in the first place*. As Linus often explains, a bug may become a feature once everyone uses it; usage prevails over initial intent.
I'm pretty certain that most of the recent changes were driven exclusively by pride, to say "look how smart the compiler became after my changes", forgetting that their users would like it to be trustable instead of smart.
Posted May 5, 2022 13:52 UTC (Thu) by khim (subscriber, #9252)
> The first rule should be not to break what has reliably worked for ages. *even if that was incorrect in the first place*. As Linus often explains, a bug may become a feature once everyone uses it; usage prevails over initial intent.

That's one possibility, yes. But there's another possibility: follow the standard. Dūra lēx, sed lēx. Under that assumption you just follow the law. What the law says… goes. Even if what the law says is nonsense. That's what C/C++ compiler developers promoted for years. But if you pick that approach you cannot then turn around and say “oh, sorry, the law is too harsh, I don't want to follow it”. Either the law is the law and everyone has to follow it, and it's enough to follow it, or it's not the law.

Provenance rules are not like that. They allow clang and gcc to eliminate calls to malloc and free in some cases. This may bring amazing speedups. And if these things were an opt-in option I would have applauded these efforts, and it would have been a great way to promote them and to, eventually, add them to the standard. Instead they were introduced in a submarine-patent way, without any options, not even opt-out options. And they break standards-compliant programs. That's an act of sabotage, sorry.
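A minimal sketch of the kind of elimination meant here (behaviour varies by compiler and version):

#include <stdlib.h>

int f(void) {
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    *p = 42;
    int v = *p;
    free(p);
    return v;   /* modern clang/gcc can compile f() to just "return 42",
                   with the malloc/free pair removed entirely */
}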
Posted May 5, 2022 15:26 UTC (Thu) by wtarreau (subscriber, #51152)
> Under that assumption you just follow the law. What the law says… goes. Even if what the law says is nonsense.
> That's what C/C++ compiler developers promoted for years.

I wouldn't deny that, but:
- the law is behind a paywall
- lots of modern abstractions in interfaces sadly make it almost impossible to follow. Using a lot of foo_t everywhere without even telling you whether they're signed/unsigned, 32/64 causes lots of trouble when you have to perform operations on them, resulting in you being forced to cast them and enter into the nasty area of type promotion. That's even worse when you try hard to avoid an overflow based on a type you don't know and the compiler knows better than you and manages to get rid of it.

We're really fighting *against* the compiler to keep our code safe these years. This tool was supposed to help us instead. And it failed.
Posted May 5, 2022 13:38 UTC (Thu) by khim (subscriber, #9252)
> That certainly doesn't actually describe anyone involved, as I suspect you're well aware.

This describes the majority of people I have talked with. I was unable to find anyone who has honestly claimed that carefully reading the standard is enough to write code which wouldn't violate provenance rules. On the contrary: people expressed regret or, in many cases, anger about the fact that the standard doesn't enable usage of provenance rules by the compilers, yet none claimed that they follow from the existing standard. If the standard is the treaty between compiler and programmer then this was deliberate breakage of the treaty. Worse: it placed the users of the compiler at a serious disadvantage. How are they supposed to follow rules which don't exist? Especially if even the people who are yet to present these rules agree that they are really complex and convoluted? And when people, horrified, asked for the -fno-provenance option? They got the answer: although provenance is not defined by either standard, it is a real and valid emergent property that every compiler vendor ever agrees on. Sorry, but if that is not an act of sabotage, then I don't know what is. And people who are doing the sabotage are called saboteurs.

No. It's not about me agreeing with something or disagreeing with something. Let me summarize what happened: compilers started breaking standards-compliant programs, even ones compiled with -std=c89 (a variant of the C standard which does not include any provisions for provenance whatsoever), based on provenance rules that appear in no standard. Sorry, but when you knowingly break standards-compliant programs and refuse to support them in any shape or form — that's an act of sabotage.
Posted May 9, 2022 18:58 UTC (Mon) by tialaramex (subscriber, #21167)
Suppose it is the year 1889, the twentieth century seems bright ahead, and you have learned everything there is to know (or so you think) about Set Theory. (This is called Naive Set Theory). It seems to you that this works just fine.
Fast forward a few years, and this chap named Ernst Zermelo says your theory is incomplete and that's why it suffers from some famous paradoxes. His new axiomatised Set Theory requires several exciting new axioms, including the infamous Axiom Of Choice and with these axioms Zermelo vanquishes the paradoxes.
Now, is Ernst wrong? Was your naive theory actually fine? Shouldn't you be allowed to go back to using that simpler, "better" set theory and ignore Ernst's stupid paradoxes and his obviously nonsensical Axiom of Choice? No. Ernst was right, your theory was unsound, _and it was already unsound in 1889_, you just didn't know it yet. Your naive theory _assumed_ things which Zermelo made into axioms.
Likewise, the C89 compilers you have nostalgia for were actually unsound, and it would have been possible (or if you resurrect them, is possible today on those compilers) to produce nonsensical results because in fact pointer provenance may not have been mentioned in the C89 standard but it was relied upon by compilers anyway. It was silently assumed, and had been for many years.
The excellent index for K&R's Second Edition of "The C Programming Language" covering C89, doesn't even have an entry for the words "alias" or "provenance". Because there are _assumptions_ about these things baked in to the language, but they haven't been surfaced.
The higher level programming languages get to have lots of optimisations here because assumptions like "pointer provenance" are necessarily true in a language that only has references anyway. To keep those same optimisations (as otherwise they'd be slower!) C and C++ must make the assumptions too, and yet to deliver on their "low level" promise they cannot. Squaring this circle is difficult which is why the committees punt rather than do it over all these years.
I happen to think Rust (more by luck than judgement so far as I can see) got this right. If you define most of the program in a higher level ("safe" in Rust terms) language, you definitely can have all those optimisations and then compartmentalize the scary assumption-violating low level stuff. This is what Aria's Tower of Weakenings is about too. Aria proposes that even most of unsafe Rust can safely keep these assumptions, something like Haphazard (the Hazard Pointer implementation) doesn't do anything that risks provenance confusion and so it's safe to optimize with those assumptions and more stuff like that can be safely labelled safe, until only the type of code that really mints "pointers" from integers out of nowhere cannot justify the assumptions and accordingly cannot be optimised successfully.
It's OK if five machine instructions can't be optimised, probably even if they're in your tight inner loop, certainly it's better than accidentally optimising them to four *wrong* instructions. What's a problem for C and C++ is that the provenance problem is infectious and might spread from that inner loop to the entire program and then you're back to slower than Python.
Posted May 9, 2022 21:55 UTC (Mon) by Wol (subscriber, #4433)
Whoops. I know a lot of people think my view of maths is naive, but that's not what I understand an axiom to be. An axiom is something which is *assumed* to be true, because it can't be proven. Zermelo would have provided a proof, which would have changed your naive theory from axiom to proven false.
This is Euclid's axiom that parallel lines never meet. That axiom *defines* the special case of 3D geometry, but because in the general case it's false, it's not an axiom of geometry.
Cheers,
Wol
Posted May 9, 2022 22:39 UTC (Mon) by tialaramex (subscriber, #21167)
As I understand it, the problem with assumptions in naive set theories and C89 (and various other things) is that because it's an assumption rather than an axiom you don't spot where the problems are. You never write it down at all and so have no opportunity to notice that it's too vague, whereas when you're writing an axiom you can see what you're doing.
The naive theories let Russell's paradox sneak in by creating this poorly defined set, but the axioms in Zermelo's theory oblige you to define a set more carefully to have a set at all, and in that process the paradox gets defused. In particular ZFC has a "specification" axiom which says in essence OK, so, tell me how to make this "set" using another set and first order logic. The sets naive set theories were created for can all be constructed this way no problem, but weird sets with paradoxical properties cannot.
C89 assumes that pointers to things must be different, which sounds reasonable but does not formally explain how this works. I believe that it's necessary (in order for a language like C89 to avoid its equivalent of paradoxes, the programs which abuse language semantics to do something hostile to optimisation) to define such rules, and they're going to look like provenance.
I do not believe that C89 is fine, and thus that we should or could just implement C89 as if provenance isn't a thing and be happy. That's my point here. C89 wasn't written in defiance of a known reality, but in ignorance of an unknown one, like the Ptolemaic system. Geocentrists today are different from Ptolemy, but not because Ptolemy was right.
Posted May 10, 2022 2:18 UTC (Tue) by khim (subscriber, #9252)
> C89 assumes that pointers to things must be different, which sounds reasonable but does not formally explain how this works.

No. It's the opposite. C89 doesn't assume that pointers to things must be different. But yes, it asserts that if pointers are equal then they are completely equivalent — you can compare any two pointers and go from there. Note that it doesn't work in the other direction: it's perfectly legal to have pointers which are different yet point to the same object. That's easily observable in MS-DOS's large memory model.

Show me a Russell's paradox, please. Not in the form of “the optimizer could not do X or Y, which is a nice thing to be able to do, and thus we must punish the software developer who assumes pointers are just memory addresses”, but “this valid C89 program can produce one of two outputs depending on how we read the rules and thus it's impossible to define how it should work”. Then and only then would you have a point.

I think you are skipping one important step there. Yes, there was an attempt to add rules about how certain pointers have to point to different objects. But it failed spectacularly. It was never adopted and the final text of the C89 standard doesn't even mention it. In C89 pointers are just addresses, with no crazy talk about pointer validity, object creation and disappearance and so on. There are some inaccuracies left over from that failed attempt: C89 defines only two lifetimes, static and automatic… yet somehow the value of a pointer that refers to freed space is indeterminate. Yet if you just declare that pointers are addresses then it should be possible to fix these inaccuracies without much loss.

Where non-trivial lifetimes first appeared is C99, not C89. There, yes, it has become impossible to read from a "union member other than the last one stored into", there limitations on whether you can compare two pointers or not were added (previously it was possible to compare two arbitrary pointers, and if they happened to be valid and equal, they would be equivalent), etc. But I don't see why the C89 memory model would be, somehow, unsound. Hard to optimize? Probably. Wasteful? Maybe so. But I don't see where it's inconsistent. Show me. Please. Yes, it absolutely rejects pointer provenance in any shape or form (except something like CHERI, where provenance actually exists at runtime and is 100% consistent). Yes, it may limit optimization opportunities. But where's the Russell's paradox, hmm?
Posted May 10, 2022 15:04 UTC (Tue) by tialaramex (subscriber, #21167)
Your resulting C compiler is not the GCC I grew up with (well, OK, the one that teenage me knew), or that with some minor optimisation passes disabled, it's an altogether different animal, perhaps closer to Python. In this language, pointers are all just indexes into an array containing all of memory - including the text segment and the stack, and so you can do some amazing acrobatics as a programmer, but your optimiser is badly hamstrung. C's already poor type checking is further reduced in power in the process, which again makes the comparison to Python seem appropriate.
I don't believe there is or was an audience for this compiler. People weren't writing C because of the expressive syntax, the unsurpassed quality of the tooling or the comprehensive "batteries included" standard library, it didn't have any of those things - they were doing it because C compilers produce reasonable machine code, and this alternative interpretation of C89 doesn't do that any more.
> (except something like CHERI where provenance actually exist at runtime and is 100% consistent)
You can only do this at all under CHERI via one of two equally useless routes:
1. The "Python" approach I describe where you declare that all "pointers" inherit a provenance with 100% system visibility, this obviously doesn't catch any bugs, and you might as well switch off CHERI, bringing us to...
2. The compatibility mode. As I understand it Morello provides a switch so you can say that now we don't enforce CHERI rules, the hidden "valid" bit is ignored and it behaves like a conventional CPU. Again you don't catch any bugs.
This is because under your preferred C89 "no provenance" model there isn't any provenance, CHERI isn't a fairytale spell it's just engineering.
Posted May 10, 2022 16:10 UTC (Tue) by khim (subscriber, #9252)
> Your resulting C compiler is not the GCC I grew up with (well, OK, the one that teenage me knew), or that with some minor optimisation passes disabled, it's an altogether different animal, perhaps closer to Python.

But that is the language which Kernighan and Ritchie designed and used to write Unix. Their goal was not to create some crazy portable dream, they just wanted to keep supporting both the 18-bit PDP-7 and the 16-bit PDP-11 from the same codebase by rewriting some parts of the code written in PDP-7 assembler in a higher-level language. They had been using B, which had no types at all, and improved it by adding character types, then structs, arrays, pointers (yes, B conflated pointers and integers, it only had one type). Yet malloc was not special, free was not special, and not even all Unix programs used them (just look at the source of the original Bourne Shell some day).

How about “all the C users for the first decade of its existence”? Initially C was used exclusively in Unix, but in the 1980s it became used more widely. Yet I don't think any compilers of that era supported anything even remotely resembling “pointer provenance”. That craziness started after a failed attempt of the C standard committee to redo the language. They then went back and replaced that with simpler aliasing rules which prevented type punning, but even these weren't abused by compilers till the XXI century. Can you, please, stop rewriting history? C was quite popular way before ANSI C arrived and tried (but failed!) to introduce crazy aliasing rules. Yes, C compilers were restricted and couldn't do all the amazing optimizations… but C developers could do these instead! When John Carmack was adopting his infamous 0x5f3759df-based trick he certainly hadn't cared to think about the fact that there are some aliasing rules which may render the code invalid, and that was true for the majority of users who grew up in an era before GCC started breaking good K&R programs.

It's engineering, yes, but you can add provenance to C89. It just has to be consistent. You can even model it with “poor man's CHERI” aka the MS-DOS large memory model by playing tricks with segment and offset. E.g. realloc could turn an 0x0:0x1234 pointer into an 0x1:0x1224 pointer if it decided not to move an object. This way all the fast-path code would be negated and you would never have the situation where bitwise-identical pointers point to different objects. This may not be super-efficient but it is compatible with C89. Remember? Bitwise-different pointers can point to the same object, but the opposite is forbidden! Easy, no?

All these games where certain pointers can be compared but not others and it's the programmer's responsibility to remember all these unwritten rules… I don't know how that language can be used for development, sorry. The advice I have gotten from our toolchain-support team is to ask clang developers about low-level constructs which I may wish to create! So much for “the standard is a treaty” talk…
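For the curious, the arithmetic behind that example (real-mode 8086: linear address = segment * 16 + offset):

0x0000 * 0x10 + 0x1234 = 0x1234
0x0001 * 0x10 + 0x1224 = 0x0010 + 0x1224 = 0x1234

Two different bit patterns, one linear address: exactly the direction C89 permits (different pointers may reference the same object; equal pointers must be interchangeable).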
Posted May 10, 2022 18:03 UTC (Tue) by farnz (subscriber, #17727)
Early C did not have a formal specification - what the one and only implementation did was what the language did.
And the problem is that formally specified C - including K&R C, and C89 - left a huge amount unspecified; users of C assumed that the behaviour of their implementation of the language was C behaviour, and not merely the way their implementation behaved.
Up until the early 1990s, this wasn't a big deal. The tools needed for compilers to do much more than peephole optimizations simply didn't exist in usable form; remember that SSA, the first of the tools needed to start reasoning about blocks or even entire programs doesn't appear in the literature until 1988. As a result, most implementations happened, more by luck than judgement, to behave in similar ways where the specification was silent.
But then we got SSA, the polytope model, and other tools that allowed compilers to do significant optimizations beyond peephole optimizations on the source code, their IR, and the object code. And now we have a mess - the provenance model, for example, is compiler authors trying to find a rigorous way to model what users "expect" from pointers, not just what C89 permits users to assume, while C11's concurrent memory model is an effort to come up with a rigorous model for what users can expect when multiple threads of execution alter the state of the abstract machine.
Remember that all you are guaranteed about your code in C89 is that the code behaves as-if it made certain changes to the abstract machine for each specified operation (standard library as well as language), and that all volatile accesses are visible in the real machine in program order. Nothing else is promised to you - there's no such thing as a "compiler barrier" in C89, for example.
Posted May 10, 2022 19:57 UTC (Tue) by khim (subscriber, #9252)
> And the problem is that formally specified C - including K&R C, and C89 - left a huge amount unspecified; users of C assumed that the behaviour of their implementation of the language was C behaviour, and not merely the way their implementation behaved.

True, but irrelevant. The most important part that we are discussing here was specified in both: pointers are addresses, and if two pointers are equal they can be used interchangeably.

Yes. And there was an attempt to inject ideas that make these useful into C89. A failed one. The committee has created an unreal language that no one can or will actually use. It was ripped out (and replaced with crazy aliasing rules, but that's another story).

Can you, please, stop lying? Provenance models are trying to justify deliberate sabotage where fully standard-compliant programs are broken. It's not my words, the provenance proposal itself laments:

> These GCC and ICC outcomes would not be correct with respect to a concrete semantics, and so to make the existing compiler behaviour sound it is necessary for this program to be deemed to have undefined behaviour.

To make the existing compiler behavior sound, my ass. The whole story of provenance started with sabotage: after a failed attempt to bring provenance-like properties to C89, the saboteurs returned in C99 and, finally, succeeded in adding some (and thus rendered some C89 programs invalid in the process), but that wasn't enough: they got the infamous DR260 resolution which was phrased like that:

> After much discussion, the UK C Panel came to a number of conclusions as to what it would be desirable for the Standard to mean.

Note: the resolution hasn't changed the standard. It hasn't allowed saboteurs to break more programs. No. It was merely a suggestion to develop adjustments to the standards — and it listed three cases where such adjustments were supposed to cause certain outcomes. Nothing like that happened. For more than two decades compilers invented more and more clever ways to screw the developers and used that resolution as a fig leaf.

And then… Rust happened. Since Rust guys are pretty concerned about program correctness (and LLVM sometimes miscompiled IR-programs they perceived correct) they went to the C++ guys and asked “hey, what are the actual rules we have to follow”? And the answer was… “here is the defect report, we use it to screw the developers and miscompile their valid programs”. Rust developers weren't amused. And that is when the lie was, finally, exposed.

So, please, don't liken problems with pointer provenance to problems with the C11 memory model. Indeed, C89 or C99 doesn't allow one to write valid multi-threaded programs. Everything is defined strictly for a single-threaded program. To support programs where two threads of execution can touch objects “simultaneously” you need to extend the language somehow. The provenance excuse is used to break completely valid C and C++ programs. It's not about extending the language, it's about narrowing it. Certain formerly valid programs have to be deemed to have undefined behaviour. And after more than two decades we don't even have the rules which we are supposed to follow finalized! And they express it in the form of “if you believe pointers are mere addresses, you are not writing C++; you are writing the C of K&R, which is a dead language”. IOW: they know they sabotaged C developers — and they are proud of it.

Yes. And to express many things which would be needed to, e.g., write an OS kernel in C89, you need to extend the language in some way. This is deliberate: the idea was to make sure strictly-conforming C89 programs run everywhere, but conforming programs may require certain language extensions. Not ideal, but it works. The saboteurs managed to screw it all completely and categorically refuse to fix what they have done.
Posted May 10, 2022 20:05 UTC (Tue) by corbet (editor, #1)
This looks like a good time to stop

When we start seeing this type of language, and people accusing each other of lying, it's a strong signal that a thread may have outlived its useful lifetime. This article now has nearly 350 comments on it, so I may not be the only one who feels like that outliving happened a little while ago.

I would like to humbly suggest that we let this topic rest at this point.
Thank you.
Posted May 10, 2022 17:16 UTC (Tue) by Vipketsh (guest, #134480)
Re: your mathematics analogy. I think you have taken the wrong viewpoint there: pretty much all of mainstream mathematics is concerned with either extending existing and useful theory (e.g. how rational numbers were extended to create irrational numbers) or with putting existing theory on a more sound footing (e.g. Hilbert's axioms), possibly closing off various paradoxes. Realise how in pretty much all of the evolution of mathematics a very strong emphasis was placed on any new theory being backwards compatible -- no-one, taken seriously, ever wanted to end up with 1+1=3 but instead worked to solidify the intuition that 1+1=2. I think that if one wanted to paint mathematical evolution onto C, the definitions underpinning C would need to be changed in a way that (i) they are backwards compatible with existing programs and (ii) loopholes exploited by compiler writers are closed instead of officially sanctioned. Right now, it's the opposite: people are trying to convince C programmers that the intuition they had all along was always false and reality is actually completely different.
*: Possibly the one with the most problems is the idea that realloc() will, in the absence of failure, always (i) destroy the object passed to it, and (ii) will allocate a completely new one. This is counter to the intuition of many a programmer and there is no enforced rule in C that prevents programmers from carrying pointers over the realloc() call, which would make the idea actually work. The reason people are annoyed is that such code exists, is used and has worked for a long time and there is no evidence that this idea has much, if any, benefit on the compiled program.
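A minimal sketch of the pattern the footnote describes (illustrative sizes; the marked line is the one a provenance reading declares undefined):

#include <stdlib.h>

void grow(void) {
    int *p = malloc(4 * sizeof *p);
    if (p == NULL)
        return;
    p[0] = 1;
    int *q = realloc(p, 8 * sizeof *p);
    if (q == NULL) {
        free(p);
        return;
    }
    if (q == p) {
        p[1] = 2;   /* carrying `p` across realloc: has worked for decades,
                       but `p` is formally indeterminate after the call */
    }
    free(q);
}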
Posted May 11, 2022 16:09 UTC (Wed) by tialaramex (subscriber, #21167)
I shall quote K&R themselves on this subject in their second edition:
"An object, sometimes called a variable, is a location in storage, and its interpretation depends on two main attributes ..."
> This ends up bringing with it rules such as you can not just arbitrarily turn one object into another and back again (i.e. no pointer casts), you can not arbitrarily split one object into two (i.e. no pointer arithmetic)
Nope, a language can (and some do, notably Rust of course, but this is also possible with care in C and C++) provide pointer casts and pointer arithmetic. Provenance works just fine for these operations.
Rust's Vec::split_at_spare_mut() isn't even unsafe. Most practical uses for this feature are unsafe, but the call itself is fine, it merely gives you back your Vec<T>'s initialized elements (the Vec will now need to re-allocate if grown because any spare space at the end of it is gone) and that space which was not used yet as a mutable slice of MaybeUninit<T> to do with as you see fit.
> and you can not arbitrarily manufacture pointers out of random data.
But here's where your problem arises. Here provenance is conjured from nowhere. It's impossible magic.
> Unfortunately C is not such a language and by forcing provenance rules on it, one is in essence trying to retrofit some kind of object model to it without any of the expressiveness and enforced rules that are needed for the programmer to not make programmes that are obviously wrong under the assumptions (i.e. provenance)
As we saw C is in fact such a language after all. The fact that many of its staunchest proponents don't seem to understand it very well is a problem for them and for C.
Posted May 10, 2022 16:39 UTC (Tue) by farnz (subscriber, #17727)
When you say "if pointers are equal, then they are completely equivalent", are you talking at a single point in time, or over the course of execution of the program?
Given, for example, the following program, it is a fault to assume that ptr1 and ptr2 are equivalent throughout the runtime, because ptr1 is invalidated by the call to release_page:

handle page_handle = get_zeroed_page();
int test;
int *ptr2;
int *ptr1 = get_ptr_to_handle(page_handle);
*ptr1 = -1; // legitimate - ptr1 points to a page, which is larger than an int in this case and correctly aligned etc.
test = *ptr1; // makes test -1
release_page(page_handle);
page_handle = get_zeroed_page();
ptr2 = get_ptr_to_handle(page_handle); // ptr2 could have the same numeric value as ptr1.
if (ptr2 == ptr1 && *ptr1 == test) {
    puts("-1 == 0");
} else {
    puts("-1 != 0");
}
release_page(page_handle);

This is the sort of code that you need to be clear about; C89's language leaves it unclear whether it's legitimate to assume that *ptr1 == test, even though the only assignments in the program are to *ptr1 (setting it to -1) and test. The thing that hurts here is that even if, in bitwise terms including hidden CHERI bits etc, ptr1 == ptr2, it's possible for the underlying machine to change state over time, and any definition of "completely equivalent" has to take that into account.
One way to handle that is to say that even though the volatile keyword does not appear anywhere in that code snippet, you give dereferencing a pointer volatile-like semantics (basically asserting that locations pointed to can change outside the changes done by the C program), and say that each time it's dereferenced, it could be referring to a new location in physical memory. In that case, this program cannot print "-1 == 0", because it has to dereference ptr1 to determine that.
Another is to follow the letter of the C89 spec, which says that the only things that can change in the abstract machine's view of the world other than via a C statement are things marked volatile. In that case, this program is allowed to print "-1 == 0" or "-1 != 0" depending on whether ptr1 == ptr2, because the implementation "knows" that it is the only thing that can assign a value to *ptr1, and thus it "knows" that because no-one has assigned through *ptr1 since it read the value to get test it is tautologically true that *ptr1 == test.
Both are valid readings of this source under the rules set by C89, because C89 states explicitly that the only thing expected to change outside of explicit changes done by C code are things marked as volatile. But in this case, the get_zeroed_page and release_page pair change the machine state in a fairly dramatic way, but in a way that's not visible to C code - changing PTEs, for example.
And that's the fundamental issue with rewinding to C89 rules - C89 implies very strongly that the only interface between things running on the "abstract C89 machine" and the real hardware are things that are marked as volatile in the C abstract machine. In practice, nobody has bothered being that neat, and we accept that there's a whole world of murky, underdefined behaviour where the real hardware changes things that affect the behaviour of the C abstract machine, but it happens that C compilers have not yet exploited that.
Note, too, that I wasn't talking about optimization in either case - I'm simply looking at the semantics of the C abstract machine as defined in C89, and noting that they're not powerful enough to model a change that affects the abstract machine but happens outside it. I find it very tough, within the C89 language, to find anything that motivates the position that *ptr1 != test given that ptr2 == ptr1 and *ptr2 != test - it's instead explicitly undefined behaviour.
Posted May 10, 2022 17:00 UTC (Tue) by khim (subscriber, #9252)
> Both are valid readings of this source under the rules set by C89, because C89 states explicitly that the only thing expected to change outside of explicit changes done by C code are things marked as volatile.

No. Calling fread and fwrite can certainly change things, too.

Yes and no. The change is dramatic, sure. But it's most definitely visible to C code. By necessity such things have to either be implemented with volatile or by calling a system routine (which must be added to the list of functions like fread and fwrite as a system extension, or else you couldn't use them). The place where you pass your pointer to invlpg would be the place where the compiler would know that an object may suddenly change value.

In practice people who are doing these things have to use volatile at some point in the kernel, or else it just wouldn't work. Thus I don't see what you are trying to prove. The fact that real OSes have to expand the list of “special” functions which may do crazy things? It's obvious. In practice your functions are called mmap and munmap and they should be treated by the compiler similarly to read and write: the compiler either has to know what they are doing or it should assume they may touch and change any object they can legitimately refer to given their arguments.

No. You couldn't do things like changes to PTEs in a fully portable C89 program. It's pointless to talk about such programs since they don't exist. The only way to do it is via asm and/or a call to a system routine which, by necessity, needs extensions to the C89 standard to be usable. In both cases everything is fully defined.
Posted May 10, 2022 18:10 UTC (Tue) by farnz (subscriber, #17727)
fread and fwrite are poor examples, because they are C code defined in terms of the state change they make to the abstract machine, and with a QoI requirement that the same state change happens to the real machine. Indeed, everything that's defined in C89 has its impact on the abstract machine fully defined by the spec; the only get-out is that volatile marks something where all reads and writes through it must be visible in the real machine in program order.
But note that this is a very minimal promise; the only thing happening in the real machine that I can reason about in C89 is the program order of accesses to volatiles. Nothing else that happens in the abstract machine is guaranteed to be visible outside it - everything else is left to the implementation's discretion.
And no, the state change is not visible inside the C89 abstract machine; if I write through a volatile pointer to a PTE, the implementation must ensure that my write happens in the real machine as well as the abstract machine, but it does not have to assume that anything has changed in the abstract machine. That, in turn, means that it may not know that ptr1 now has changed in the "real" machine, because it's not volatile and thus changes in the real machine are irrelevant.
And I absolutely can change a PTE without assembly or a system routine, using plain C code; all I need is something that gives me the address of the PTE I want to change. Now, depending on the architecture, that almost certainly is not enough to guarantee an instant change - e.g. on x86, the processor can use old or new value of the PTE until the TLB is flushed, and I can force a TLB flush with invlpg to get deterministic behaviour - but I can bring the program into a non-deterministic state without calling into an assembly block or running a system routine, as long as I have the address of a PTE.
And there's no "list of system routines" in C89; the behaviour of fread, fwrite and other such functions is fully defined in the abstract machine by the spec, with a QoI requirement to have their behaviour be reflected in the "real" machine. By invoking the idea of a "list of system routines", you're extending the language beyond C89.
You're making the same mistake a lot of people make, of assuming that the behaviour of compilers in the early 1990s and earlier reflected the specification at the time, and wasn't just a consequence of limited compiler technology. If compilers really did implement C89 to the letter of the specification, then much of what makes C useful wouldn't be possible; provenance is not something that's new, but rather an attempt to let people do all the tricks like bit-stuffing into aligned pointers (which is technically UB in C89) while still allowing the compiler to reason about the meaning of your code in a way compatible with the C89 specification.
Posted May 10, 2022 19:01 UTC (Tue) by khim (subscriber, #9252)
Which is the only way to have PTEs in C code. If you have such an address then you have to extend the language somehow. Or, alternatively, don't touch it.

> By invoking the idea of a "list of system routines", you're extending the language beyond C89.

Of course. Because it's impossible to write a C89 program which changes the PTEs, such a concept just couldn't exist in it. You have to extend the language to cover that usecase.

…then such compilers would have been as popular as ISO 7185. Means: no one would have cared about these and no one would have mentioned their existence. Yes. But some programs would still be possible. Programs which do tricks with pointers would work just fine, programs which touch PTEs wouldn't.

Citation needed. Badly. Because, I would repeat once more, in C89 (not in C99 and newer) the notion of “pointers which have the same bit pattern yet are different” doesn't exist. If you add a few bits to the pointer converted to an integer and then clear these same bits you would get the exact same pointer — guaranteed. The fact that these bits are the lowest bits of the converted integer is an implementation-specific thing; you can imagine a case where they would live as top bits instead. So yes, that requires some implementation-specific extension. But a pretty mild and limited one.

Yes, provenance is an attempt to deal with the idea of C99+ that some pointers may be equal to others yet, somehow, still different — but that's not allowed in C89. If two pointers are equal then they are either both null pointers, or both point to the same object, end of story. Sure, it makes some optimizations hard and/or impossible. So what? This just means that you cannot do such optimizations in C89 mode. Offer a -fno-provenance option, enable it for -std=c89, done.
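A minimal sketch of that round trip (uintptr_t is the C99 spelling; C89-era code did the same trick with unsigned long):

#include <stdint.h>

int *tag_roundtrip(int *p) {
    uintptr_t bits = (uintptr_t)p;
    bits |= 1;               /* stash a flag in an alignment bit */
    bits &= ~(uintptr_t)1;   /* clear it again */
    return (int *)bits;      /* under "pointers are addresses" this must be
                                usable exactly like the original p */
}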
Posted May 10, 2022 17:26 UTC (Tue) by Vipketsh (guest, #134480)
The only way I can see your reasoning working is if you are somehow allowed to assume that function calls are an elaborate way of saying "nop".
Posted May 10, 2022 18:36 UTC (Tue) by farnz (subscriber, #17727)
If I promise that the function calls are just naming what code does, but its real behaviour is poking global volatile pointers, and those functions are implemented in pure C89, there's no difference in behaviour. Given the following C definitions of get_zeroed_page, get_ptr_to_handle and release_page, you still have the non-determinism, albeit I've introduced a platform dependency:

const size_t PAGE_SIZE;
struct pt_entry {
    volatile char *v_addr;
    volatile char *p_addr;
}
volatile struct pt_entry *pte; // Initialized by outside code, with suitable guarantees on v_addr for the compiler implementation and on p_addr
int *page_location = pte->v_addr;
void *get_zeroed_page() {
    pte->p_addr += PAGE_SIZE;
    memset(pte->v_addr, 0, PAGE_SIZE);
    return pte->v_addr;
}
void release_page(void *handle) {
    assert(handle == pte->v_addr);
    pte->p_addr -= PAGE_SIZE;
}
void *get_ptr_to_handle(void* handle) {
    assert(handle == pte->v_addr);
    return page_location;
}

This has semantics on the real machine, because of the use of volatile - the writes to *pte are guaranteed to occur in program order. But the compiler does not have any way to know that volatile int *v_addr ends up with the same value between two separate calls to get_ptr_to_handle but points to different memory.
Also, I'd note that C89 does not have language asserting what you claim - it actually says quite the opposite, that the compiler does not have to assume that *ptr1 has changed within the C abstract machine, since ptr1 is not volatile. It's just that early implementations made that assumption because to do otherwise would require them to analyse not just the function at hand, but also other functions, to determine if *ptr1 could change. Like khim, you're picking up on a limitation of 1980s and early 1990s compilers, and assuming it's part of the language as defined, and not merely an implementation detail.
Posted May 10, 2022 19:22 UTC (Tue) by Vipketsh (guest, #134480)
> void *get_zeroed_page()
If you don't have to assume that this writes over data pointed to by some other pointer it means that your aliasing rules say that no two pointers alias. Or put another way, for all practical purposes, having two pointers pointing to the same thing is unworkable. By some reading of C89 that may be the conclusion, but quite clearly that was never the intent and exactly no one expects things to work that way (including compiler writers, oddly enough).
> compiler does not have to assume that *ptr1 has changed within the C abstract machine
You mean across a function call ? That quite simply means that exactly no data could ever be shared by any two functions (in different compile units). Again, this would make the language completely unworkable and be counter what anyone expects.
> [...] you're picking up on a limitation of 1980s and early 1990s compilers, and assuming it's part of the language as defined, and not merely an implementation detail.
No. The language is defined, first and foremost, by what existing programs expect. If the standard allows interpretations and compilers to do things counter to what a majority of these programs expect, it is the standard that is broken and not the majority of all programs. I firmly believe that the job of a standard is to document existing behaviour and not to be a tool to change all programmes out there.
p.s.: I find it fascinating that instead of arguing about actual behaviour the C standard keeps coming up as if it were a bible handed down by some higher power and everything in it is completely infallible. Then the conclusion is that "See? It all sucks, so use Rust" because Rust is so excruciatingly well specified that, last I checked, it has no specification at all.
Posted May 10, 2022 20:10 UTC (Tue) by khim (subscriber, #9252)
Rust hasn't needed any specs because till very recently there was just one compiler. Today we have 1.5: LLVM-based rustc and GCC-based rustc. One more is in development, thus I assume formal specs will be higher on the list of priorities now. This being said, IMNSHO it's better to not have specs than to have ones which are silently violated by the actual compilers. At least when there are no specs you know that discussions between compiler developers and language users have to happen; when you have one which is ignored…
Posted May 10, 2022 23:48 UTC (Tue) by tialaramex (subscriber, #21167)
> Then the conclusion is that "See? It all sucks, so use Rust" because Rust is so excruciatingly well specified that, last I checked, it has no specification at all.

However, Rust does extensively document what is promised (and what is not) about the Rust language and its standard library, and especially the safe subset which Rust programmers should (and most do) spend the majority of their time working with.
For example, all that ISO document has to say about what happens if I've got two byte-sized signed integers which may happen to have the value 100 in them and I add them together is that this is "Undefined Behaviour" and offers no suggestions as to what to do about that besides try to ensure it never happens. In Rust the "no specification" tells us that this will panic in debug mode, but, if it doesn't panic (because I'm not in debug mode and I didn't enable this behaviour in release builds) it will wrap, to -56. I don't know about you, but I feel like "Absolutely anything might happen" is less specific than "The answer is exactly -56".
Rust also provides plenty of alternatives, including checked_add(), unchecked_add(), wrapping_add(), saturating_add() and overflowing_add(), depending on what you actually mean to happen on overflow, as well as the type wrappers Saturating and Wrapping which are useful here (e.g. Saturating<i16> is probably the correct type for a 16-bit signed integer used to represent CD-style PCM audio samples).
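The -56 is plain two's-complement wrapping; here is the same arithmetic sketched in C, since Rust's wrapping_add is the operation being described:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* 100 + 100 = 200 = 0xC8; reinterpreted as a signed 8-bit value
       that is 200 - 256 = -56 (on the usual two's-complement targets). */
    uint8_t wrapped = (uint8_t)(100 + 100);
    printf("%d\n", (int8_t)wrapped);   /* prints -56 */
    return 0;
}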
Posted May 10, 2022 23:48 UTC (Tue) by nybble41 (subscriber, #55106)
> [...]
> memset(pte->v_addr, 0, PAGE_SIZE);

> If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.[57]
> [57] This applies to those objects that behave as if they were defined with qualified types, even if they are never actually defined as objects in the program (such as an object at a memory-mapped input/output address).
Objects in the page at pte->v_addr behave as if they were defined as volatile objects because the content changes in ways not described by the C abstract machine when pte->p_addr is updated. The same applies to passing a pointer to volatile object(s) to memset(), which takes a non-volatile pointer.
The initializer for page_location (pte->v_addr) is also not a constant, but I assume this is just pseudo-code for the value being set by some initialization function not shown here.
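A minimal sketch of the rule being cited, with illustrative names:

volatile int reg;               /* stands in for a memory-mapped register */

int read_reg_wrong(void) {
    int *alias = (int *)&reg;   /* the cast discards the volatile qualifier */
    return *alias;              /* undefined: a volatile-qualified object
                                   accessed through a non-volatile lvalue */
}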