Buggifying critical core modules Posted Mar 18, 2008 21:09 UTC (Tue) by nix (subscriber, #2304) In reply to: Buggifying critical core modules by ncm Parent article: Who maintains dpkg?
To be honest, though, I've introduced this particular bug myself without noticing, and so has everyone else I know. The real problem here is C and C++ being deeply counterintuitive (even more than usual).
Buggifying critical core modules Posted Mar 19, 2008 2:20 UTC (Wed) by ncm (subscriber, #165) [Link] Which part of "don't break code that works, for no reason" is counterintuitive? (I ask merely for information.)
Buggifying critical core modules Posted Mar 19, 2008 8:31 UTC (Wed) by nix (subscriber, #2304) [Link] The counterintuitive part is NULL not always being the thing to use if you want a null pointer. Breaking code that works, well, I've done *that*, too, but generally when refactoring old and tangled messes. I didn't think dpkg would count (and random stylistic cleanups into a *different style than the rest of the code*, well, ick.)
null pointers Posted Mar 19, 2008 9:54 UTC (Wed) by tialaramex (subscriber, #21167) [Link] In what scenario does NULL not work if you want a null pointer? Specifically, when doesn't GCC's __null (or equivalent in other modern compilers) do what you expected a null pointer to do ?
null pointers Posted Mar 19, 2008 10:01 UTC (Wed) by tialaramex (subscriber, #21167) [Link] Ah, I think I get it now. So the concern is that on some (now obscure) platforms, there are a variety of pointer sizes available, and someone might actually be stupid enough to use one narrower (and thus less general) than void * in a vararg function. At which point NULL (being void *) will be the wrong size and cause unexplained problems. It seems to me that (type *) NULL is still a perfectly good choice of value for such a parameter and that we have modern vararg type checking for exactly this sort of reason, although admittedly that probably won't warn you about the portability problem if you're testing on platforms where pointers are always the size of the machine word.
null pointers Posted Mar 19, 2008 10:13 UTC (Wed) by cortana (subscriber, #24596) [Link] I wonder... is there a -W option to GCC that enables a warning if you try to pass NULL or 0 to a varargs function where a pointer to some data is expected?
null pointers Posted Mar 20, 2008 10:22 UTC (Thu) by msmeissn (subscriber, #13641) [Link]
If you mark up the function with __attribute__((sentinel)) then
it will warn.
All common functions are marked up already.
$ cat xx.c
#include <unistd.h>
void f() {
execl("hello","world","!",0);
}
$ gcc -Wall -O2 -c xx.c
xx.c: In function ‘f’:
xx.c:4: warning: missing sentinel in function call
$
I personally marked up X, GLIB and GTK for instance... :/
Ciao, Marcus
null pointers Posted Mar 22, 2008 17:03 UTC (Sat) by spitzak (subscriber, #4593) [Link] Aha! So that's what that damn warning means! It would really help if the warning said "missing cast of 0 to varargs function" rather than "missing sentinel"!
null pointers Posted Mar 19, 2008 15:54 UTC (Wed) by vmole (subscriber, #111) [Link] Actually, that's still not it. A varargs function that specifies a "(type *)" argument requires a "(type *)" argument, not something else, not even "(void *)". The problem with the modification made to dpkg was not the replacement of "0" with "NULL", but the removal of the "(char *)" cast.
null pointers Posted Mar 19, 2008 17:18 UTC (Wed) by quotemstr (subscriber, #45331) [Link] Name a POSIX system, or better yet, a platform on which dpkg runs, where code and data pointers are not the same size. Using NULL for function pointers is fine given certain assumptions, and these days, IMO, one can safely make these assumptions.
null pointers Posted Mar 19, 2008 17:54 UTC (Wed) by vmole (subscriber, #111) [Link] Once upon a time, people assumed that sizeof(int) == sizeof(_ptr_). Once upon a time, people assumed that sizeof(int) == sizeof(long). Once upon a time, people assumed that sizeof(int) == 2 and sizeof(long) == 4. Once upon a time, people assumed you could dereference NULL (or 0), and its value was 0. Once upon a time, people assumed that all the world was ASCII. Times change, and your "safe assumptions" are not so safe anymore.
null pointers Posted Mar 19, 2008 18:05 UTC (Wed) by quotemstr (subscriber, #45331) [Link] Assumptions are necessary for practicality. You assume that cars will stop at red lights when you cross the street, don't you? You assume that a char has eight bits, don't you? Standards are simply sets of assumptions we allow programs to make. POSIX implies that using NULL, with a sensible definition, is okay for every kind of pointer. Some assumptions are warranted and others not. It's not warranted to assume little-endian byte ordering, for example, because the choice is arbitrary and still not uniform. OTOH, it is warranted to assume that pointer types are the same size because they're uniformly so, and because there's a strong reason to think that the assumption will hold into the future.
null pointers Posted Mar 19, 2008 20:20 UTC (Wed) by vmole (subscriber, #111) [Link]
> You assume that cars will stop at red lights when you cross the street, don't you?
You obviously don't live in Houston. I'd be dead if I was that careless. (No smiley.)
> You assume that a char has eight bits, don't you?
No, why would I? Sure, that's the most common situation, but if I'm actually coding something that requires that (unusual), then I check. Now, I admit that I probably would just error-out, rather than spend the time coding for the unusual situation until I actually needed it.
Sure, we all make assumptions when we have to. But why make an assumption when you can just as easily follow the standard? If you're coding to POSIX, and POSIX really does imply that null-pointer types are all interchangeable, then fine. But the code isn't C standard compliant, and I've spent way too much of my life fixing code full of assumptions about platforms and compilers to have much positive to say about code that doesn't follow appropriate standards.
null pointers Posted Mar 19, 2008 21:39 UTC (Wed) by quotemstr (subscriber, #45331) [Link] If you've represented a UTF-8 encoded string with some kind of char array, you've relied on 'char' having at least eight bits. There is a tradeoff involved in not making assumptions. Say you're writing a portable program and you don't want to assume AF_UNIX support. You can either use some other IPC mechanism, likely to be less efficient, or code two versions, one that uses unix-domain sockets and another that uses something else. Either way, there is an overhead for not assuming AF_UNIX support, either in runtime overhead or maintenance. You can add build-time checks of your assumptions, true, but some assumptions hold true so often that it's often not worth even bothering to check. All I'm arguing is that for non-embedded systems, all pointers being the same size is the case often enough that it's not worth it to litter the code with strange casts. Checking whether the assumption held would be simple enough with something like autoconf, but there's no reason to clutter the code itself.
null pointers Posted Mar 19, 2008 22:39 UTC (Wed) by vmole (subscriber, #111) [Link] Who's talking about littering the code with strange casts? Do you really find:
somevarargfunc("these", "are", "some", "strings", (char *) NULL);
strange or confusing? Because that's all we're talking about. Any non-varargs function will have a prototype that will take care of this. (At this point someone will pipe-up with the comment that C89 doesn't require prototypes. That's true. But then you have to cast a lot of things, and my argument becomes stronger, not weaker.)
And of course there's a tradeoff in not making assumptions. But the kind of thing you're talking about, such as OS features available, is a whole different level, and no, I don't see anything wrong with assuming (for example) that a program is unix-like specific, and coding it as such, when the cost of doing otherwise is non-trivial. But why this desire to willfully violate a basic standard to save a few characters, when the (not-likely, but could happen) downside is obscure failures and tedious debugging? Oh, and there's a big difference between "assume char is 8 bits" and "assume char is at *least* 8 bits". The first is an assumption that will eventually bite you, the second is not an assumption at all, but is guaranteed by the C89 standard.
null pointers Posted Mar 20, 2008 10:19 UTC (Thu) by tfheen (subscriber, #17598) [Link] Whether C89 requires prototypes or not is irrelevant to dpkg though, as dpkg is using C99. (--std=gnu99, to be precise.)
null pointers Posted Mar 19, 2008 21:03 UTC (Wed) by nix (subscriber, #2304) [Link] It doesn't even hold into the present :(
null pointers Posted Mar 19, 2008 21:11 UTC (Wed) by quotemstr (subscriber, #45331) [Link] Can you name an example of a POSIX system for which the assumption does not hold?
null pointers Posted Mar 19, 2008 21:47 UTC (Wed) by nix (subscriber, #2304) [Link] Any IA64 or PPC64-based systems. I think HPPA too but can't remember. (On both these platforms it so happens that data pointers are all the same size: but function pointers are larger... and yes this has exposed bugs in free software.)
null pointers Posted Mar 19, 2008 21:57 UTC (Wed) by quotemstr (subscriber, #45331) [Link] I don't think that's true. See the downthread comments on the same topic. I'd be interested in knowing what the specific bugs were.
null pointers Posted Mar 19, 2008 22:18 UTC (Wed) by nix (subscriber, #2304) [Link] If I could remember, I'd say. I'll have a dig.
null pointers Posted Mar 28, 2008 22:33 UTC (Fri) by anton (guest, #25547) [Link]
> On [PPC64 and IA64] it so happens that data pointers are all the same size: but function pointers are larger.
Not on Linux-PPC64 (and probably not on Linux-IA64, either):
#include <stdio.h>
int main()
{
printf("%zu %zu\n", sizeof(void *), sizeof(int (*)())); /* sizeof yields size_t, so %zu */
return 0;
}
prints
8 8
null pointers Posted Mar 29, 2008 1:08 UTC (Sat) by nix (subscriber, #2304) [Link] Oh. My memory is failing me and I can't read simulator source code, it seems. (I was *sure* they were examples of arches using a descriptor consisting of a data pointer combined with other stuff.)
null pointers Posted Mar 20, 2008 15:07 UTC (Thu) by lysse (subscriber, #3190) [Link] > Assumptions are necessary for practicality... You assume that a char has eight bits, don't you? I think that's called "disproving your own argument by contradiction".
null pointers Posted Mar 19, 2008 20:54 UTC (Wed) by nix (subscriber, #2304) [Link] Have two, IA64 and PPC64. Code and data pointers being different sizes is not uncommon: generally, in that situation, the data pointer is a plain pointer and the code pointer is some sort of descriptor.
null pointers Posted Mar 19, 2008 21:22 UTC (Wed) by quotemstr (subscriber, #45331) [Link] No go. How would dlsym() work if that were true? Also, an IA64 function pointer is a pointer to a function descriptor, which is a pair of values -- a pointer to the start of the code and the global pointer to use for that function. The pointer to the descriptor is a normal data pointer, and is what corresponds to the C-level function pointer.
null pointers Posted Mar 19, 2008 21:48 UTC (Wed) by nix (subscriber, #2304) [Link] Ah, oops. Forgot about that. (It's tripped me up before, too.) (I still have memories of pointer sizing bugs in the area of dlsym(), but I can't remember what they were.)
null pointers Posted Mar 20, 2008 3:52 UTC (Thu) by rganesan (subscriber, #1182) [Link]
> Name a POSIX system, or better yet, a platform on which dpkg runs, where code and data pointers are not the same size. Using NULL for function pointers is fine given certain assumptions, and these days, IMO, one can safely make these assumptions.
I think you guys are missing the point. The issue is not code vs. data pointers. The issue is pointer vs. integer. A varargs function that expects a null pointer needs to be passed (void *) 0 or (char *) 0 or some other pointer-typed value. Passing just a 0, which is a legal null pointer constant in C, does not work on 64-bit systems: 0 is a 32-bit int, while (void *) 0 is a 64-bit quantity on LP64 (Unix/Linux) as well as LLP64 (Windows) platforms.
null pointers Posted Mar 19, 2008 20:44 UTC (Wed) by tialaramex (subscriber, #21167) [Link] How is what I said "still not it" when it's exactly the same as your explanation ?
null pointers Posted Mar 19, 2008 21:11 UTC (Wed) by vmole (subscriber, #111) [Link] Because it's not a matter of narrow vs. wide, or using something "narrower than (void *)". If the function specification says "This vararg list is terminated by a null character pointer", then, by the standard, you have to pass something equivalent to "(char *) 0", and "(void *) 0" isn't. The idea of a "void *" being generic is that variables of type "void *" can store pointers of any type, and pointers can be converted to "void *" and then back to the original type without information loss. But "void *" is a distinct type, and not necessarily interchangeable with other pointer types. Does this matter the vast majority of the time? No. Is it C standard lawyer nitpickery of the most annoying kind? Yes, of course. But the C89 standard is this way precisely because there were some systems for which this kind of nitpicking *did* matter. But yes, what you wrote, "It seems to me that (type *) NULL is still a perfectly good choice of value for such a parameter", is correct. I should have distinguished that in my original reply.
A small nit-pick... Posted Mar 19, 2008 22:27 UTC (Wed) by dw (subscriber, #12017) [Link] "The idea of a "void *" being generic is that variables of type "void *" can store pointers of any type, and pointers can be cast to void and then back to the original type without information loss." As far as I remember it is illegal to convert between void* and any function pointer type in ANSI C; POSIX' specification of dlsym() actually relies on what is essentially a vendor extension to work for function pointers.
A small nit-pick... Posted Mar 19, 2008 22:52 UTC (Wed) by vmole (subscriber, #111) [Link] Oops, you're correct about that. I just looked in Plauger and Brodie's "Standard C" (which is not the standard, but I think we can trust them), and they agree: it's *object* pointers that can be interconverted with void pointers. But the hilarious thing is this: The types _pointer_to_char_, _pointer_to_signed_char_, _pointer_to_unsigned_char_, and _pointer_to_void_ all share the same representation. So "(void *) 0" is interchangeable with "(char *) 0", but not "0" or "(int *) 0".
A small nit-pick... Posted Mar 19, 2008 23:15 UTC (Wed) by nix (subscriber, #2304) [Link] This is surely related to the aliasing rules, which also have a similar special hole allowing aliasing of void * and char *, but not of pointers of any other distinct types.
Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.