LWN.net Logo

What every C Programmer should know about undefined behavior #2/3

What every C Programmer should know about undefined behavior #2/3

Posted May 18, 2011 20:10 UTC (Wed) by njs (guest, #40338)
In reply to: What every C Programmer should know about undefined behavior #2/3 by kleptog
Parent article: What every C Programmer should know about undefined behavior #2/3

> Indeed, the halting problem is (amongst other things) just a formal way of saying you can't write a universal theorem prover. Imagine, you could write a program that terminates on the first odd perfect number, or on an unexpected zero of the Riemann zeta function and your "halting problem solver" would give you the answer to questions mathematicians having been breaking their heads over for generations.

That isn't quite right. You can easily write a program that answers those questions *if they have an answer* -- the halting problem is related to the fact that there might not be an answer at all. It's entirely possible that those particular questions do have answers, though, which case a computer program could find it. But this doesn't help much in practice because that computer program probably won't finish until sometime after the heat-death of the universe.

> You can say "all strings must be UTF-8", but how can you explain that to a computer? There's always exceptions to deal with (the result of read() for example).

Oh, but actually this is easy! Any reasonably competent statically-typed language can do this. E.g., in C++, you define a type "utf8_string", and you make sure that any publicly accessible way of constructing an instance does appropriate validity checking. Then code that assumes valid utf8 just declares that it takes input of type utf8_string. read(), of course, doesn't return a utf8_string, so if you want to read a string and then you want to pass it to some code that assumes utf8 (maybe somewhere else entirely in your program, after its been passed through 10 other functions), the compiler won't let you unless you sanitize it first. And as you refactor your APIs, the conversion naturally gets moved around to be in the right place.

It works great. But I've only seen one project that used C++ like this. It's very sad :-(.


(Log in to post comments)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds