What every C Programmer should know about undefined behavior #2/3
> The end result of this is that we have lots of tools in the toolbox to find some bugs, but no good way to prove that an application is free of undefined behavior. Given that there are lots of bugs in real world applications and that C is used for a broad range of critical applications, this is pretty scary.
Posted May 16, 2011 15:54 UTC (Mon)
by gowen (guest, #23914)
[Link] (11 responses)
This one is interesting because (a) something very much like it caused a real security hole in the Linux kernel recently and (b) the ONLY reason it exists is because of C's "declarations go at the start of the block" rule.

Someone wants to declare a variable, and knows it's good practice to initialise it, and, in the interest of style, wants to avoid an extra nested block. So they write

void contains_null_check(int *P) {
    int dead = *P;
    if (P == 0)
        return;
    *P = 4;
}

rather than

void contains_null_check(int *P) {
    if (P == 0) return;
    {
        int dead = *P;
        *P = 4;
    }
}

Result: bug.

C++ (since RAII strongly encourages initialise-at-declaration) and C99 (should it ever catch on) should make this one considerably less common.
Posted May 16, 2011 16:37 UTC (Mon)
by cfischer (guest, #3983)
[Link] (3 responses)
Why not just say:

void contains_null_check(int *P) {
    int dead;
    if (P == 0)
        return;
    dead = *P;
    /* do with 'dead' whatever you want */
}

Wouldn't that be the more 'natural' C style?

Methinks, the

    int dead = *P;

in the example is a misguided result of the urge to 'initialize', as opposed to 'assign'. In C++ those might be different, but in old-style C, for simple datatypes, avoiding the assignment for such reasons could already be seen as bad influence by new-fangled object-oriented hip languages.

So the problem seems more one of mixing styles to me.
Posted May 16, 2011 17:43 UTC (Mon)
by ledow (guest, #11753)
[Link]
But then, I'd also explicitly use NULL rather than 0.
Posted May 16, 2011 18:28 UTC (Mon)
by iabervon (subscriber, #722)
[Link] (1 responses)
If you then have to write C89, the easy thing is to move the whole line, ...

Of course, there's also the situation of a programmer who doesn't think C99 has caught on modifying a C99 codebase, and moving the line with the declaration up to the start of the block; the result may not be "natural" C89 to write, but it might be a "natural" C89 change to a "natural" C99 block.
Posted May 16, 2011 19:57 UTC (Mon)
by jreiser (subscriber, #11027)
[Link]
Often it is better to make a block by inserting the braces which enclose the intended scope.
Posted May 16, 2011 21:35 UTC (Mon)
by cmccabe (guest, #60281)
[Link] (4 responses)
1. Initializing something before checking if it's NULL has nothing to do with whether you decide to put declarations at the start.
2. C99 doesn't have a "declarations go at the start of the block rule."
That being said, a lot of people consider it good style in C to put the declarations at the start of the block, because it encourages you to keep functions short and sweet.
Posted May 17, 2011 5:36 UTC (Tue)
by gowen (guest, #23914)
[Link] (3 responses)
I don't think that word means what you think it means. Clue: It doesn't mean "someone who doesn't share my opinion".
> 1. Initializing something before checking if it's NULL has nothing to do
> with whether you decide to put declarations at the start.
Combined with a coding rule that variables must be initialised when declared, it really kind of does. And that's a common coding rule, because using an uninitialised variable is - surprise - undefined behaviour. The vertical space between a variable's declaration and its initialisation represents a region in which using that variable is a bug. It is good practice to keep that space as small as possible ("but no smaller" being the extra advice that's forgotten in this case). Yes, it's not an unavoidable bug. But it's an unnecessary vector for bug transmission.

So, as mentioned above, this clash of good advice - or rather, a slightly blind application of usually-good advice - combined with C89's archaism on variable declaration resulted in a bug.
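To make the "gap" concrete, here is a minimal hypothetical sketch (not the kernel code in question): every line between the declaration and the assignment is a place where reading the variable would be undefined behaviour.

#include <stdio.h>
#include <string.h>

static void greet(const char *name) {
    size_t len;                 /* declared (uninitialised) here... */

    if (name == NULL)           /* ...reading 'len' anywhere in this */
        return;                 /* gap would be a bug... */

    len = strlen(name);         /* ...it only becomes safe to read here */
    printf("hello %s (%zu chars)\n", name, len);
}

int main(void) {
    greet("world");
    return 0;
}

Initialise-at-declaration closes that gap, but, as in the kernel bug, closing it by writing len = strlen(name) before the NULL check trades one bug for a worse one.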
For the recent Linux hole, why do *you* think the (presumably experienced) coder initialised the variable on declaration - which sadly preceded the NULL check?
> 2. C99 doesn't have a "declarations go at the start of the block rule."
I did actually mention that. So zero out of ten for reading the whole post.
Posted Jun 7, 2011 21:14 UTC (Tue)
by cmccabe (guest, #60281)
[Link] (2 responses)
Cherry-picking one example of a hole and then using it to justify your style preferences is fairly silly. I could equally well find a bug caused by signed overflow and say "aha! signed numbers are teh evil."

The reason why I prefer the C89 style of declaring the variables at the top of the block is that it tends to lead to shorter, clearer functions. If you end up with a page worth of declarations, it makes it clear even to the dullest programmer that his function is getting too long. It also makes it clearer how much stack space is actually being used, which is nice when you're really going for performance. And if you're not going for performance, what are you doing using C?

I understand the arguments for the C99/C++ "declare right before use" style. In C++ it's almost a must, because declarations trigger constructors to run, consuming CPU cycles. It also works well with C++'s RAII style. It can also move the definition closer to the use, making it easier to see the type. But again, that assumes you are writing gigantic, multi-page functions, which you *should not do*.
So basically, I think we are going to have to agree to disagree, for C at least. For C++, yes, you should declare as close as possible to where you use a variable.
Posted Jun 8, 2011 13:39 UTC (Wed)
by nix (subscriber, #2304)
[Link] (1 responses)
> If you end up with a page worth of declarations, it makes it clear even to the dullest programmer that his function is getting too long.

You have too much confidence in dull programmers. I have worked on functions with ten pages of variable declarations at the top. (The functions themselves were ten thousand lines long, which *anyone* should have realized was too long, but they had grown slowly to that length and nobody wanted to take the 'risk' of splitting them.)
Posted Jun 16, 2011 0:05 UTC (Thu)
by cmccabe (guest, #60281)
[Link]
Anyway, lazy or careless people can always find a way to do lazy or careless things. It is nice if you get a helpful hint that what you are doing is wrong, though. For example, using 4 or 8 space indents tends to give you a wake-up call that 20 levels of nesting might be more than the human mind can understand in C or C++. Using 1 or 2 space tabs does not, etc.
Posted May 17, 2011 8:40 UTC (Tue)
by marcH (subscriber, #57642)
[Link] (1 responses)
Posted May 17, 2011 14:55 UTC (Tue)
by nye (subscriber, #51576)
[Link]
Incorrect - it's not a result of dead code elimination. This bug arises even if dead is used for something later on, because the initial assignment invokes undefined behaviour in the case that P is NULL. It's therefore always conformant for the compiler to assume that P is *not* NULL and remove the check, since if that assumption is incorrect then the behaviour is undefined and the compiler can do whatever it damn well pleases.
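To spell that out, here is a sketch of the transformation the standard licenses (not any particular compiler's actual output):

/* As written: the load happens before the check. */
void contains_null_check(int *P) {
    int dead = *P;    /* undefined behaviour if P is NULL */
    if (P == 0)
        return;
    *P = 4;
}

/* What the optimizer may legitimately act as if you wrote: since the
 * load '*P' would be undefined for a NULL P, it may assume P != 0 and
 * delete the check -- whether or not 'dead' is used later. */
void contains_null_check_optimized(int *P) {
    int dead = *P;
    (void)dead;
    *P = 4;
}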
Posted May 16, 2011 16:14 UTC (Mon)
by benjamingeer (guest, #67678)
[Link] (3 responses)
Solution: Don't use C.
Posted May 16, 2011 16:25 UTC (Mon)
by felixfix (subscriber, #242)
[Link] (1 responses)
Posted May 16, 2011 18:34 UTC (Mon)
by rpbrennan (guest, #70904)
[Link]
Posted May 16, 2011 19:29 UTC (Mon)
by renox (guest, #23785)
[Link]
This doesn't mean that one should use C (premature optimisation is still the root of all evil), just that the reaction "undefined behaviour == don't use C" is naive at best.
Posted May 16, 2011 20:12 UTC (Mon)
by nix (subscriber, #2304)
[Link] (49 responses)
Posted May 16, 2011 20:26 UTC (Mon)
by jmalcolm (subscriber, #8876)
[Link] (6 responses)
The US Nuclear Arsenal could probably make the world come to an end, methinks. Is it controlled in any way by C (or Unix)? Or is it all in Ada?
This is not a slam on C by the way. I have a deep love of that crazy little language. It also commands enough respect to almost qualify as fear.
Your comment just made me chuckle.
Posted May 17, 2011 12:11 UTC (Tue)
by dgm (subscriber, #49227)
[Link] (5 responses)
Posted May 18, 2011 12:57 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (4 responses)
This is really a pointless comment. Here is a similar one: nothing can prevent the best burglars from breaking into your house (so why buy an expensive lock?)
> Some languages can help reduce "certain" kinds of errors, often trading-off execution speed and/or generality...
Yes.
> and introducing subtle, new kinds of errors.
No.
Posted May 19, 2011 15:22 UTC (Thu)
by dgm (subscriber, #49227)
[Link] (3 responses)
>This is really a pointless comment. Here is a similar one: Nothing can prevent the best burglars to break into your house (so why buy an expensive lock?)
I would not call it pointless, but I agree it's rather trivial. Anyway, it's useful to keep in mind when listening to vendors preaching the latest silver bullet.

>> and introducing subtle, new kinds of errors.
> No.
The logical conclusion would be, then, that a "perfect" language that prevents any kind of error is possible, which is absurd.
Posted May 19, 2011 23:45 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (1 responses)
Let's keep the trivial statements coming: every vendor is preaching the latest silver bullet. It's their job; they are paid for it. Their lies do not mean every product sucks.
> The logical conclusion would be, then, that a "perfect" language that prevents any kind of error is possible,
Your logic is really beyond me.
Posted May 20, 2011 3:01 UTC (Fri)
by viro (subscriber, #7872)
[Link]
Their lies do not mean that water is wet either...
Posted May 20, 2011 6:10 UTC (Fri)
by dark (guest, #8483)
[Link]
That doesn't follow. An alternate conclusion is that even the subtlest errors are already possible in existing languages.
Posted May 16, 2011 21:08 UTC (Mon)
by HelloWorld (guest, #56129)
[Link] (9 responses)
This is, and has always been, a bullshit argument. The fact that we can't write a program that will tell us whether some program will halt doesn't mean that we can't prove it for some specific program we care about.
Posted May 17, 2011 1:32 UTC (Tue)
by wahern (subscriber, #37304)
[Link] (3 responses)
Posted May 17, 2011 5:26 UTC (Tue)
by cmccabe (guest, #60281)
[Link] (2 responses)
There was another tool for Ruby that would randomly alter your code at runtime (!) and see how you handled the resulting errors. I am having a really hard time remembering the name, though...
Posted May 17, 2011 10:26 UTC (Tue)
by pager2 (guest, #72197)
[Link]
Posted May 17, 2011 18:35 UTC (Tue)
by wahern (subscriber, #37304)
[Link]
Posted May 17, 2011 21:16 UTC (Tue)
by kleptog (subscriber, #1183)
[Link] (2 responses)
It's not clear that the programs we write are anywhere near that kind of complexity. I feel it should be possible to prove some useful properties of programs, but we lack some of the infrastructure for describing the things we want to prove. You can say "all strings must be UTF-8", but how can you explain that to a computer? There are always exceptions to deal with (the result of read(), for example).

It's pointed out elsewhere in this thread that what you want is to check the program against a model. The problem is, we (often) don't have the model written in a formal way. So we might need a program that tries to determine the model, which we humans then check and which can then be used to check the program. So you can get messages like: function foo is always called with a UTF-8 string, except for that call over there.
Perhaps this can happen during development, so you get prompted: method foo was always called with object bar, and now also with object baz, correct y/n? Depending on the result the model is updated. The question is, can you make this so it doesn't get in your way too much.
Posted May 18, 2011 14:20 UTC (Wed)
by jd (guest, #26381)
[Link]
The reason for using something like Z is that it is implementation-independent, so it doesn't make any difference if you're writing in C, C++, LISP or Prolog. There will be a valid mapping.
The disadvantage of Z is that in the same way the same specification can produce many implementations, the same implementation can produce many specifications. These will not be of equal use and I know of no easy way to generate the specification in a way that guarantees usefulness.
(For those more familiar with Z as the starting point, I'm totally inverting the flow. This may shock some. Sorry. With this scheme, I'm solely concerned with back-engineering what the specification of the code is from the code, not in generating code from a specification.)
The rationale is that Z is intended to be easier to validate than code. Easier, not easy. It's still hard work. But doing the validation (which is the hardest part) in a single frame of reference and converting to it from the different languages (which is hard but not as bad) should require less work than writing a validator specific to each language.
Posted May 18, 2011 20:10 UTC (Wed)
by njs (subscriber, #40338)
[Link]
That isn't quite right. You can easily write a program that answers those questions *if they have an answer* -- the halting problem is related to the fact that there might not be an answer at all. It's entirely possible that those particular questions do have answers, though, in which case a computer program could find them. But this doesn't help much in practice, because that computer program probably won't finish until sometime after the heat-death of the universe.
> You can say "all strings must be UTF-8", but how can you explain that to a computer? There's always exceptions to deal with (the result of read() for example).
Oh, but actually this is easy! Any reasonably competent statically-typed language can do this. E.g., in C++, you define a type "utf8_string", and you make sure that any publicly accessible way of constructing an instance does appropriate validity checking. Then code that assumes valid utf8 just declares that it takes input of type utf8_string. read(), of course, doesn't return a utf8_string, so if you want to read a string and then you want to pass it to some code that assumes utf8 (maybe somewhere else entirely in your program, after it's been passed through 10 other functions), the compiler won't let you unless you sanitize it first. And as you refactor your APIs, the conversion naturally gets moved around to be in the right place.
It works great. But I've only seen one project that used C++ like this. It's very sad :-(.
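A minimal sketch of that pattern (utf8_string is the name used above; the validator here is simplified and skips some overlong/surrogate corner cases):

#include <cstddef>
#include <stdexcept>
#include <string>

// Every publicly obtainable utf8_string holds validated UTF-8, so any
// function taking a utf8_string may assume validity.
class utf8_string {
public:
    // The only public way in: validate raw bytes (e.g. from read()).
    static utf8_string validate(const std::string &bytes) {
        if (!is_valid(bytes))
            throw std::invalid_argument("not valid UTF-8");
        return utf8_string(bytes);
    }
    const std::string &bytes() const { return data_; }

private:
    explicit utf8_string(const std::string &b) : data_(b) {}

    // Simplified structural check of lead/continuation bytes.
    static bool is_valid(const std::string &s) {
        for (std::size_t i = 0; i < s.size();) {
            unsigned char c = static_cast<unsigned char>(s[i]);
            std::size_t extra;
            if (c < 0x80)                             { i += 1; continue; }
            else if ((c & 0xE0) == 0xC0 && c >= 0xC2) extra = 1;
            else if ((c & 0xF0) == 0xE0)              extra = 2;
            else if ((c & 0xF8) == 0xF0 && c <= 0xF4) extra = 3;
            else return false;
            if (i + extra >= s.size()) return false;
            for (std::size_t j = 1; j <= extra; ++j)
                if ((static_cast<unsigned char>(s[i + j]) & 0xC0) != 0x80)
                    return false;
            i += extra + 1;
        }
        return true;
    }

    std::string data_;
};

// Code that assumes valid UTF-8 says so in its signature; the compiler
// refuses a raw std::string here until it has been validated.
void store_in_database(const utf8_string &s) { (void)s; }

int main() {
    store_in_database(utf8_string::validate("caf\xC3\xA9"));  // throws if invalid
    return 0;
}

read() hands you raw bytes; the only way to get them into store_in_database is through validate(), wherever in the program that conversion ends up living.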
Posted May 18, 2011 11:34 UTC (Wed)
by Wol (subscriber, #4433)
[Link] (1 responses)
Just because you've proved it in the imaginary world of maths doesn't mean diddley-squat. The ONLY valid proof in the real world is "well, we haven't been wrong yet - but there's always a first time ...".
Cheers,
Wol
Posted May 18, 2011 14:22 UTC (Wed)
by jd (guest, #26381)
[Link]
Posted May 16, 2011 21:10 UTC (Mon)
by njs (subscriber, #40338)
[Link] (24 responses)
For example, type systems (if properly abused) let you prove all sorts of interesting properties about global data flow (e.g., the property "no string ever goes from the user to the database without being sanitized"). There's no good reason why the behavior of your X server, kernel, web server, browser, etc. should ever be *uncomputable*.

Large-scale proofs about programs are very hard, but that's a tools problem, not a deep conceptual abyss where no-one should ever even try to tread. (And we'd probably have better tools -- and programming languages -- if people were less scared of the abyss.)
Posted May 16, 2011 21:50 UTC (Mon)
by cmccabe (guest, #60281)
[Link] (23 responses)
In my humble (non-tenured) opinion, proofs belong in math class; model checking belongs in engineering. If you tried to prove any non-trivial thing about your life, you would quickly find it impossible.
Imagine trying to prove that you were going to go to the store and return with a gallon of milk. It sounds simple, but consider: how can you prove that you won't have a car crash or a heart attack? Even if none of those disasters happen, the store might be closed. The kind of milk you want might not be in stock. The traffic might be so heavy that it takes you an hour to drive there. If you can only get skim milk, does that still count as picking up the milk? If traffic takes an hour and the milk is warm when you get back, does that count as success?
Similarly, if you try to prove that your browser will successfully display a web page, you slam head-on into a mountain of difficulties. What if the RAM is bad? How about the hard disk? Will the network lose some packets? What if the web page is so complex that it takes 10 minutes to render on our puny CPU? What about the libraries I depend on? Is there a bug in there? What if the web page is rendered in an "ugly" way (a subjective term). Can I prove that that won't happen? Of course not.
As Einstein said: "As far as the laws of mathematics refer to reality, they are not certain, and as far as they are certain, they do not refer to reality."
The best I can do is set up a bunch of models and validate my program against them. Static type checking is one such model. Unit tests are another set of models. Tools like lint, sparse, cppcheck provide yet more tests. Another set of tests is giving the program to users and seeing if they like the responsiveness, the user interface, and the overall design.
Posted May 16, 2011 22:17 UTC (Mon)
by HelloWorld (guest, #56129)
[Link]
Of course it's not currently feasible to prove the correctness of a complex program like a web browser, not least because there is no formal mathematical specification for how HTML documents should be rendered. But it'd already be a success if certain properties of a program could be proven automatically. Properties such as "every file descriptor opened is also closed at some point" or "every array index is within the bounds of the array".

> What if the RAM is bad? How about the hard disk?

Bad RAM is not a software bug, thus it's out of scope for software developers. It's like saying that you can't prove anything in math because your axiom or logic may be "broken". You just have to assume something. In Peano arithmetic, it's the existence of the number 0, and if you want to prove stuff about programs, you have to assume that the machine you run it on works.
Posted May 16, 2011 22:33 UTC (Mon)
by njs (subscriber, #40338)
[Link] (13 responses)
In my experience, careful proofs can form a critical part of complex system design. I'm thinking of, for example, the problem of a VCS trying to merge a DAG of changes, each of which may contain arbitrary file adds/deletes/renames. This is a horrible problem with many incorrect solutions littering history, but it is solvable with careful formal reasoning.
> Similarly, if you try to prove that your browser will successfully display a web page, you slam head-on into a mountain of difficulties. What if the RAM is bad? How about the hard disk? Will the network lose some packets? What if the web page is so complex that it takes 10 minutes to render on our puny CPU? What about the libraries I depend on? Is there a bug in there? What if the web page is rendered in an "ugly" way (a subjective term). Can I prove that that won't happen? Of course not.
Yes, but that just means you tried to prove the wrong thing.
Can that web page trigger writes to arbitrary pages on my hard disk? Is there anywhere in my program that uses latin1 when it should be using UTF-8? Will this image decoder return either valid data or a defined error code on arbitrary inputs? Does this program invoke undefined behavior? Can this data be corrupted if certain code is scheduled concurrently? Proving those kinds of properties is totally useful, possible in principle (if you set things up right), and in some cases easily doable today.
> The best I can do is set up a bunch of models and validate my program against them. Static type checking is one such model.
But static type checking *is* a way to prove global properties of your program! Or another example: C++'s "private:" keyword is useful not because it gives some kind of 'security' or something (like many documents about it seem to think), but because it lets me know for certain (i.e., prove) that I only have to look at a certain bounded amount of code if I want to see how certain variables are modified. (And then that in turn is useful because it lets me verify that all that code maintains the relevant invariants, which is another kind of informal proof.)
I just want better tools for *non-heuristic* reasoning about programs. That's really not impossible -- though it might require changes to everything from the programming languages to your program's architecture -- and it would be useful to people without tenure, too.
Posted May 16, 2011 22:56 UTC (Mon)
by cmccabe (guest, #60281)
[Link] (11 responses)
> C++'s "private:" keyword is useful not because it gives some kind of
> 'security' or something (like many documents about it seem to think),
> but because it lets me know for certain (i.e., prove) that I only have
> to look at a certain bounded amount of code if I want to see how
> certain variables are modified

Actually, the "private" keyword in C++ allows you to prove no such thing. I can simply typecast the class to a byte buffer and modify to my heart's content.
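For instance, a hypothetical illustration of that typecast (itself relying on non-guaranteed layout assumptions):

#include <cstring>

class Counter {
public:
    int get() const { return n_; }
private:
    int n_ = 0;
};

int main() {
    Counter c;
    int evil = 42;
    // Treat the object as a byte buffer and overwrite the private member.
    // Assumes n_ sits at offset 0 (true for this standard-layout class,
    // but exactly the sort of thing 'private' cannot police).
    std::memcpy(reinterpret_cast<unsigned char *>(&c), &evil, sizeof evil);
    return c.get() == 42 ? 0 : 1;  // the "private" member was modified
}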
If I want to be even more evil, I can add:

#define public private

to the beginning of my .cc file and include the header file for your class. Then nobody will stop me from doing whatever I want with your private data -- and methods -- not the compiler, and certainly not the linker.

Java also has a way for "unauthorized" classes to get access to private data. I think you can use the java.lang.reflect package to get at it.
Private data in Java and C++ is not and was not intended to provide the kind of guarantees a proof checker would need.
On the other hand, if the goal is to allow the programmer to have a reasonable mental model, they work pretty well!
> I just want better tools for *non-heuristic* reasoning about programs.
Too bad. Artificial intelligence is about where chemistry was in 1000 AD. Theorem provers are good for proving theorems, but bad at real-world reasoning.

On the other hand, what I can offer you is sandboxed programming languages and model checkers. It's amazing how much more productive you can be when you have a garbage collector and a good type system.
C.
Posted May 16, 2011 22:58 UTC (Mon)
by cmccabe (guest, #60281)
[Link] (7 responses)
#define private public
#define protected public
:)
Posted May 16, 2011 23:12 UTC (Mon)
by tialaramex (subscriber, #21167)
[Link] (5 responses)
Of course in a debug mode, or in a non-compliant JVM, or if there's a bug, you may make this work. But in /theory/ at least they've thought of this, so it would be fair for a Java programmer (unlike a C++ programmer) to treat private members as genuinely private.
Posted May 16, 2011 23:30 UTC (Mon)
by cmccabe (guest, #60281)
[Link] (4 responses)
http://download.oracle.com/javase/1.5.0/docs/api/java/lan...

I have a pretty good hunch that this API gives me a hole in "private" big enough to drive my truck through.
I haven't tried it, though.
Posted May 17, 2011 1:07 UTC (Tue)
by foom (subscriber, #14868)
[Link]
Posted May 17, 2011 8:54 UTC (Tue)
by tialaramex (subscriber, #21167)
[Link] (1 responses)
In other cases it would be very deliberate e.g. we can imagine a system where untrusted code runs in a Java sandbox on a remote system, and has access to certain serialisable objects which are sensitive, so their serialisations are encrypted, versioned and signed with keys not available to the untrusted code.
Even if you're allowed to make the subclass (security policy again) - your subclass doesn't get to look at protected data members from other instances, so this will often be useless. Remember this is Java, so type restrictions are enforced at runtime.
Basically this goes on and on, unlike in C++ the designers actually intended this to be enforced, not just a vague guideline to help those willing to help themselves. So even if you find a crack in the wall, someone will fix it. There really aren't any gaps "big enough to drive a truck through" as you imagine and as is the case in something like C++. If you want to drive a truck in, you need someone to conveniently open the truck-sized gate from the other side by disabling the relevant security policy.
Posted May 17, 2011 9:01 UTC (Tue)
by tialaramex (subscriber, #21167)
[Link]
Actually this bit might be wrong. You might be able to just pass in a suitable instance and have the code inside your imposter subclass poke around in its protected internals. But again the security policy gets to decide whether you're allowed to make this subclass at all (unlike the 'final' keyword this places no such restriction on the author of the rest of the system, who may very well operate under a different policy).
Posted May 17, 2011 17:21 UTC (Tue)
by jeremiah (subscriber, #1221)
[Link]
Posted May 17, 2011 8:39 UTC (Tue)
by chad.netzer (subscriber, #4257)
[Link]
You also might need:

#define class struct
Posted May 17, 2011 1:28 UTC (Tue)
by foom (subscriber, #14868)
[Link]
> #define public private
> to the beginning of my .cc file and include the header file for your class. Then nobody will stop me from doing whatever I want with your private data -- and methods -- not the compiler, and certainly not the linker.

The compiler/linker will prevent that if you're using MS VC++: on that platform, the mangling for public and private methods is different! Forces you to be more evil than just #define to get your way. :)
Posted May 17, 2011 14:35 UTC (Tue)
by dgm (subscriber, #49227)
[Link]
<flame>Oh, yes! You can churn buggy *and s... l... o... w* code much faster (and in greater quantities!) with that. The new guy in the corner just does this all day long. Thanks C#!</flame>
Posted May 19, 2011 23:57 UTC (Thu)
by marcH (subscriber, #57642)
[Link]
To demonstrate that code verification tools do not work, you hit them with a sledgehammer. Interesting and conclusive.
Posted May 16, 2011 22:57 UTC (Mon)
by mpr22 (subscriber, #60784)
[Link]
Actually, given that C and C++ pointers have no fandango-on-core protection, to prove your private members never get edited by non-member non-friend code you have to prove the memory-access correctness of the entire program, including the underlying standard library.
Posted May 17, 2011 9:00 UTC (Tue)
by marcH (subscriber, #57642)
[Link] (7 responses)
Such a statement requires omniscience.
I am afraid you are not omniscient:
http://en.wikipedia.org/wiki/Polyspace (just a random example)
Posted May 17, 2011 22:38 UTC (Tue)
by cmccabe (guest, #60281)
[Link] (6 responses)
Polyspace looks like another such tool. It can prove that certain programs clearly violate C semantics, like dereferencing a NULL pointer. But it can't prove the program as a whole correct because it doesn't have enough information about the requirements and the environment.
There are other examples of little model checkers. Sparse allows kernel hackers to annotate functions with interesting properties like what locks they take, etc. cpplint finds potential errors in C++ programs. And -Wall and -Wextra add even more checks. Those things are great and we should have more of them.
But if you did a survey of most academic programming language departments, you would find that most of them focus on developing completely new programming languages and rewriting things from scratch. Nothing is more common than a grad student inventing a new functional programming language for his thesis. Nothing is more uncommon than an actual useful tool coming out of it. Even your own example confirms this: Polyspace is commercial and not developed in academia.
Posted May 18, 2011 0:34 UTC (Wed)
by price (guest, #59790)
[Link] (5 responses)
> Polyspace is commercial and not developed in academia.

Wrong. It is commercial, but it came from academia. Quick history of Polyspace, unfortunately quoting an obituary (http://christele.faure.pagesperso-orange.fr/AlainDeutsch.html - see original for many links):

So the tool was developed by a PhD researcher working at a research lab (and well-known hotbed of static type systems), refined for several years, and finally commercialized by him and others.

Another static analysis tool widely used in practice is Coverity -- that one came from a bunch of academics at Stanford. That's the usual story for this kind of tool.
Posted May 18, 2011 4:08 UTC (Wed)
by cmccabe (guest, #60281)
[Link] (4 responses)
It's also interesting that it was associated with the launch of a European rocket, Ariane 5. Everyone knows that the space race in the 1960s helped to push technology ahead in the United States; I guess that still happens, at least to a certain extent.
I think this kind of research is really interesting. It seems like it could help create more efficient compilers and perhaps better tools.
For what it's worth, I like static type systems. I really hope that in the future, we'll be able to have more and more information about programs even before running them. Programmers ought to be free from the drudgery of spotting typos or passing the wrong arguments to functions. Or even accidentally dereferencing a pointer before assigning it. As I said before, though, there are always things in the design that are impossible to "prove" (in the mathematical sense), like user interface, the performance of heuristic algorithms, and artistic design.
Posted May 18, 2011 5:02 UTC (Wed)
by price (guest, #59790)
[Link] (3 responses)
Definitely beats Tang, if you ask me.
Posted May 19, 2011 3:53 UTC (Thu)
by raven667 (subscriber, #5198)
[Link] (2 responses)
Posted May 19, 2011 13:11 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (1 responses)
Posted May 19, 2011 18:01 UTC (Thu)
by raven667 (subscriber, #5198)
[Link]
Posted May 16, 2011 23:08 UTC (Mon)
by salvarsan (guest, #18257)
[Link]
C is a well-defined devil, and the work-arounds are, too.
Posted May 17, 2011 8:38 UTC (Tue)
by stijn (subscriber, #570)
[Link] (5 responses)
Posted May 17, 2011 15:07 UTC (Tue)
by dgm (subscriber, #49227)
[Link] (4 responses)
I don't see why. What's the magic that makes C "blow harder" than any other language where a NULL pointer dereference is possible (assembler or Pascal, for example)?
> As a result we get duplicate mem and str libraries, already a good indication something is amiss.
Nonsense. Pascal, for instance, does not use zero-terminated strings (it uses size-prefixed strings), and neither do other languages like C++, C# or Java. And all of them have separate facilities for dealing with strings and arbitrary buffers.
> Reasoning about \0-terminated strings is hard, not because of buffer sizes, but because you always have to make sure that no stowaway \0 can possibly be present inside your string.
"Hard" is relative. I have foggy memories of having had some problem with an embedded zero in an string on my first week of writing C, like 20 years ago, but never after that.
Posted May 17, 2011 19:06 UTC (Tue)
by stijn (subscriber, #570)
[Link] (3 responses)
Posted May 17, 2011 21:57 UTC (Tue)
by baldridgeec (guest, #55283)
[Link]
So you use memchr() instead of strchr().
Or you use C++, and call it a string instead of a byte[]. (and use the actual C++ string-manipulation functions, not strchr - g++ warns about (byte[])string as being an invalid cast nowadays anyway though) :)
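For example, a small sketch of the difference, assuming a buffer whose length is tracked separately:

#include <stdio.h>
#include <string.h>

int main(void) {
    /* 11 bytes of data with an embedded '\0' in the middle. */
    const char buf[] = { 'h','e','l','l','o','\0','w','o','r','l','d' };

    /* strchr sees a C string: the search ends at the embedded '\0'. */
    const char *s = strchr(buf, 'w');              /* NULL */

    /* memchr takes an explicit length: an embedded '\0' is just data. */
    const void *m = memchr(buf, 'w', sizeof buf);  /* &buf[6] */

    printf("strchr: %s, memchr: %s\n",
           s ? "found" : "not found",
           m ? "found" : "not found");
    return 0;
}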
Posted May 19, 2011 14:49 UTC (Thu)
by dgm (subscriber, #49227)
[Link] (1 responses)
Posted May 19, 2011 15:49 UTC (Thu)
by stijn (subscriber, #570)
[Link]