How Do I Make This Hard to Misuse?
Kernel hacker Rusty Russell has some thoughts on how to make APIs hard to misuse. The idea is that APIs should be not only easy to use but also hard to misuse. "So I've created a 'best' to 'worst' list: my hope is that by putting 'hard to misuse' on one axis in our mental graphs, we can at least make informed decisions about tradeoffs like 'hard to misuse' vs 'optimal'."
How Do I Make This Hard to Misuse? Posted Mar 31, 2008 18:53 UTC (Mon) by ajross (subscriber, #4563) [Link] I'm surprised to find a huge one missing: Don't provide two ways to do the same thing if one of them is wrong. Rusty is talking about kernel code, so I guess he might be assuming a higher quality of developer than I am. But misunderstanding of "thick" APIs is probably the source of more "API misuse" (and other bugs and misfeatures) than anything else I'm aware of. All library code has this disease to some extent or another (Java has it like the plague), and what it means in the real world is that coders with limited understanding of the library as a whole go thumbing through the documentation looking for a gadget that does what they want, and then plug it in, all the while failing to realize that another gadget would have been a much better choice. And since they don't understand the library as a whole, they don't have a prayer of understanding the tradeoffs and misfeatures that result from their choice. The solutions to this are either (1) make developers understand the design of the libraries they use at a deep level, or (2) write libraries with a minimal but complete feature set, such that developers don't get stuck using the wrong tool for the job. Option 1 is clearly preferable but very expensive -- there aren't many such developers. Option 2 seems like a much better choice. The only downsides are that more "boilerplate" code must be written to make up for the lack of "convenience" (ugh) functionality. And, I guess, there will be objections from people who don't understand this "minimal" aesthetic and want to choose the wrong solution from the choices offered by a much larger library. Personal plug: here's my take on "minimal but complete" functionality as expressed in a small embeddable scripting language: http://plausible.org/nasal.
How Do I Make This Hard to Misuse? Posted Mar 31, 2008 20:43 UTC (Mon) by nix (subscriber, #2304) [Link] Hear, hear, indeed. Forget Java: Win32 is the truly horrible example in this area, with dozens of ways to do some fundamental things, all with different (often poorly documented) shortcomings.
How Do I Make This Hard to Misuse? Posted Mar 31, 2008 21:11 UTC (Mon) by i3839 (subscriber, #31386) [Link] Nasal looks interesting. > Small! 146k source code. Is that in lines of code? ;-)
How Do I Make This Hard to Misuse? Posted Mar 31, 2008 21:24 UTC (Mon) by ajross (subscriber, #4563) [Link] Goodness no: kilobytes of C code. And it includes only the core library stuff, and ignores extension code (currently readline, pcre, sqlite, gtk and cairo) and soft-coded libraries (there's an XML parser and a few other gadgets). The current code in CVS is a little bigger now, at 158k. Looking at line endings as the LOC metric I count 5507 lines. Lines with semicolons are an easy trick to use if you want a measure that ignores comments (i.e. code complexity, not verbosity), and there are 2620 of those. I'm sure there are others out there, but I don't much care. The point is that it's small. :)
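The two counts mentioned above (line endings, and lines containing a semicolon) can be sketched in a few lines of C. This is only an illustration of the metric, not the tool ajross actually used; the names `loc_counts` and `count_loc` are made up for the example.

```c
/* Hypothetical result type: total lines, and lines containing ';'. */
struct loc_counts {
    long lines;
    long semicolon_lines;
};

/* Count the lines in src and how many of them contain at least one
 * semicolon. A final line with no trailing newline is counted too.
 * Note the trick is crude: a ';' inside a comment or string still counts. */
static struct loc_counts count_loc(const char *src)
{
    struct loc_counts c = {0, 0};
    int has_semi = 0;   /* current line has a ';' */
    int has_chars = 0;  /* current line has any characters at all */

    for (; *src; src++) {
        if (*src == '\n') {
            c.lines++;
            c.semicolon_lines += has_semi;
            has_semi = 0;
            has_chars = 0;
        } else {
            has_chars = 1;
            if (*src == ';')
                has_semi = 1;
        }
    }
    if (has_chars) {    /* unterminated last line */
        c.lines++;
        c.semicolon_lines += has_semi;
    }
    return c;
}
```

The semicolon count skips blank lines and most comment-only lines, which is why it tracks code complexity more closely than raw line count.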
How Do I Make This Hard to Misuse? Posted Mar 31, 2008 21:31 UTC (Mon) by rgmoore (subscriber, #75) [Link] > what it means in the real world is that coders with limited understanding of the library as a whole go thumbing through the documentation looking for a gadget that does what they want, and then plug it in, all the while failing to realize that another gadget would have been a much better choice. ... The solutions to this are either (1) make developers understand the design of the libraries they use at a deep level, or (2) write libraries with a minimal but complete feature set, I think that you're missing an important possibility: write really good documentation. Sometimes you really want a dozen gadgets, each of which is different from the others in some small but important way. That's a lot less likely to trip up your users if you:
I know that when I'm writing documentation for closely related functions, I try to do some or all of these. It's very helpful when coming back to use something I wrote a few years ago to find notes like "function_A does not guarantee that data will be written in any particular order. If output order is important, use function_B instead."
How Do I Make This Hard to Misuse? Posted Mar 31, 2008 21:44 UTC (Mon) by ajross (subscriber, #4563) [Link] > I think that you're missing an important possibility: write really good documentation. I limited my treatment to techniques that have been proven to actually work in practice. :) Seriously: what you describe would be great. I've just never seen anything like it. At best (or at least the best I've seen), you get documentation like what Sun provides for the JDK: a very clean, readable, hyperlinked guide to a true rat's nest of a library that only a tiny elite class of Java gurus actually understands. The core problem being that great documentation does nothing for those who don't read it, and the sheer size of modern libraries guarantees that users won't read the documentation. You can get around this by finding developers who can read and distill only the core architecture from the effusive documentation, but then you're basically implementing a version of my "option 1" above. For that matter, good developers tend to get the least relative benefit from all that "convenience code" anyway, and are happy to write the 2-3 lines of boilerplate needed to turn their iterator output into an array, or vice versa, etc... Which means that even given a guarantee that only talented developers will use your library, you're still better off making it minimal than you are adding functionality.
How Do I Make This Hard to Misuse? Posted Mar 31, 2008 22:12 UTC (Mon) by nix (subscriber, #2304) [Link] Oh, believe me, even when I describe exactly what functions do in the header files, complete with examples... they *still* don't get read, or people read the first line and ignore the DO NOT DO THIS in screaming flashing red with associated MIDI of a screaming police siren (or the closest I can get to that in source code). If I make whateveritis not compile if misused, it gets hacked by someone else so that it *does* compile when misused, because 'that was easier'. (No it bloody wasn't.) Given that I work in the financial sector I'm tempted to see if I can write something which if misused in an unlikely way transfers the contents of the misuser's bank account into mine, and document this as a failure mode. I'd be rich within the week! ;}
reasons for kmalloc GFP_ATOMIC... Posted Mar 31, 2008 20:10 UTC (Mon) by hmh (subscriber, #3838) [Link] AFAIK, kmalloc in fact CANNOT know if it could sleep or not. As it was explained to me in the in_atomic() thread on LKML, that information just doesn't exist in the kernel right now. You simply have to know in which context you are, and tell everyone about it (thus, GFP_ATOMIC).
reasons for kmalloc GFP_ATOMIC... Posted Mar 31, 2008 21:57 UTC (Mon) by jengelh (subscriber, #33263) [Link] And you might even have a reason to call it with GFP_ATOMIC even if you have a user context and could, theoretically, sleep!
How Do I Make This Hard to Misuse? Posted Apr 1, 2008 5:41 UTC (Tue) by jd (subscriber, #26381) [Link] A few others:
How Do I Make This Hard to Misuse? Posted Apr 1, 2008 5:41 UTC (Tue) by im14u2c (subscriber, #5246) [Link] One of my favorite C APIs to love to hate: fputc(int, FILE *) and fgets(char *, int, FILE *). Why is the file pointer the last argument?! As Rusty pointed out, "context" arguments such as handles (FILE * in this case) idiomatically belong at the front, just like in fprintf(FILE *, const char *, ...).
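To make the complaint concrete, here is a minimal sketch of what "context first" versions might look like. The wrapper names `stream_putc` and `stream_gets` are hypothetical and exist in no standard library; they only reorder the arguments of the real `fputc` and `fgets` so that the stream comes first, as it does in `fprintf`.

```c
#include <stdio.h>

/* Hypothetical wrapper: same behaviour as fputc(), stream argument first. */
static int stream_putc(FILE *stream, int c)
{
    return fputc(c, stream);
}

/* Hypothetical wrapper: same behaviour as fgets(), stream argument first. */
static char *stream_gets(FILE *stream, char *buf, int size)
{
    return fgets(buf, size, stream);
}
```

With these, every call that touches a stream reads the same way: `stream_putc(out, 'x')`, `stream_gets(in, buf, sizeof buf)`, `fprintf(out, "...")`.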
How Do I Make This Hard to Misuse? Posted Apr 1, 2008 10:21 UTC (Tue) by xbobx (guest, #51363) [Link] > Why is the file pointer the last argument?! That would probably be an optimization. C function calling convention is that arguments are pushed onto the stack in reverse order, so with this function the FILE pointer is pushed first. Then the caller is free to manipulate the stack without touching the FILE pointer, and possibly call these functions multiple times. Otherwise, code that repeatedly gets input from/outputs to stdout (e.g., *everything*, at least when libc was designed) has to push/pop the stdout FILE ptr around _every_ call to these functions. Not that this would ever be noticeable on modern hardware, just saying...
How Do I Make This Hard to Misuse? Posted Apr 1, 2008 11:14 UTC (Tue) by im14u2c (subscriber, #5246) [Link] Sounds plausible (and quaint!), but I'm not sure I'm following the scenario. I'm interpreting "reverse order" as "left most argument pushed last." So, suppose putchar(int c) is a wrapper around fputc(int c, FILE *f):
int putchar(int c)
{
    return fputc(c, stdout);
}
So, in pseudo-code, the resultant assembly ought to look roughly like this (assuming tail-call optimization):
POP 'c' into a register
PUSH 'stdout'
PUSH 'c' back on stack
JUMP to fputc and let it return for us.
It seems like if arguments went in reverse order, then adding the FILE * argument at the beginning would be the optimization:
PUSH 'stdout'
JUMP to fputc and let it return for us
I must be missing something.
How Do I Make This Hard to Misuse? Posted Apr 1, 2008 17:22 UTC (Tue) by vmole (subscriber, #111) [Link] My guess is that it's even simpler than that: we've got a function that puts a character to stdout, and we need one that puts to an arbitrary FILE *, so we'll add an argument. Many years ago, I worked on a Data General. The COPY and MOVE commands took, as a first argument, the destination file or directory. This made some sort of sense from an implementation point of view, but, as the TA warned at the time, "Sooner or later *all* of you are going to overwrite a source file." I think we all did.
How Do I Make This Hard to Misuse? Posted Apr 7, 2008 7:49 UTC (Mon) by liljencrantz (subscriber, #28458) [Link] Kind of how the ln command in unix works, then? I've always found this massively unintuitive.
How Do I Make This Hard to Misuse? Posted Apr 7, 2008 14:38 UTC (Mon) by nix (subscriber, #2304) [Link] So do I, but I think that's because of overexposure to C, where that sort of thing is helpfully always the other way around. If you think about it, ln(1) is perfectly consistent with cp(1): it creates or updates (for directories) the last thing you list.
How Do I Make This Hard to Misuse? Posted Apr 7, 2008 17:17 UTC (Mon) by vmole (subscriber, #111) [Link] It's consistent, kinda. The problem comes when describing: "cp foo bar" translates as "copy _foo_ to _bar_" okay, but the obvious translation of "ln foo bar" to "link _foo_ to _bar_" doesn't; the latter seems to say _bar_ is the original, at least to my taste. You have to process it as "create a link to _foo_ named _bar_". Or just memorize it. :-)
How Do I Make This Hard to Misuse? Posted Apr 8, 2008 8:24 UTC (Tue) by IkeTo (subscriber, #2122) [Link] > but the obvious translation of "ln foo bar" to "link _foo_ to _bar_" doesn't I see this problem as an inaccuracy of the translation "link _foo_ to _bar_". This seems to imply that both _foo_ and _bar_ are pre-existing, and somehow a "link" is created between them as a result of running the command. Obviously not what is done by "ln". It is instead to "build a link to _foo_ called _bar_". The cp is to "make a copy of _foo_ called _bar_". Pretty consistent to me.
How Do I Make This Hard to Misuse? Posted Apr 8, 2008 12:13 UTC (Tue) by im14u2c (subscriber, #5246) [Link] Although, since the final argument can be a directory, perhaps the best connector for both is "at":
Make a copy of foo _at_ bar
Make a link to foo _at_ bar
Or in the plural case:
Make copies of foo, bar, baz, quux _at_ dest
Make links to foo, bar, baz, quux _at_ dest
How Do I Make This Hard to Misuse? Posted Apr 9, 2008 3:06 UTC (Wed) by roelofs (subscriber, #2599) [Link] Also keep in mind that the target for ln is optional. Thus:
ln -s foo/bar .
ln -s foo/bar
...are equivalent. Sadly, cp foo/bar doesn't quite work.
Greg
How Do I Make This Hard to Misuse? Posted Apr 9, 2008 3:33 UTC (Wed) by im14u2c (subscriber, #5246) [Link] That aspect of 'cp' always drove me nuts, probably because I learned MS-DOS first. I've found myself tempted to write a wrapper around 'cp' to make that form work. I won't, though, only because I know it'll wreak havoc when I go to use someone else's account for whatever reason. (e.g. to show them how to do something.)
How Do I Make This Hard to Misuse? Posted Apr 1, 2008 18:00 UTC (Tue) by felixfix (subscriber, #242) [Link] Possibly push the file handle, push the char, call, replace the char at top of stack, call, repeat. But I am just guessing, and hate working on code like that. I spent two years dealing with the memory constraints that made such code tempting, and despised it.
How Do I Make This Hard to Misuse? Posted Apr 2, 2008 4:27 UTC (Wed) by xbobx (guest, #51363) [Link] True, for that specific case it is better the other way around. But suppose you have a function such as:
void print_strings(FILE *stream, int num_strings, const char **list) {
    int i;
    for (i = 0; i < num_strings; i++) {
        fputs(list[i], stream);
    }
}
In this case, the assembly for this function will push stream once, then just push/pop n pointers to strings onto the stack and call fputs to print all of the strings. One could imagine that this would be useful when, say, implementing fprintf or other similar higher-level functions which all output to the same FILE *.
How Do I Make This Hard to Misuse? Posted Apr 2, 2008 4:36 UTC (Wed) by im14u2c (subscriber, #5246) [Link] See? I knew I was missing something. Thanks to you and felixfix both, since you are both describing the same particular optimization. Cute, in an ugly, quaint way. :-D I'm certain I'm guilty of far worse horrors in my 2 decades of assembly programming. This optimization never occurred to me since I've always managed to pass arguments in registers. (An odd fluke of history, that. I've written whole-program assembly where I control all the conventions, C with inline-asm only (no function calls from asm) on machines with stack-based calling conventions, and C callable asm on machines with register based calling conventions.)
How Do I Make This Hard to Misuse? Posted Apr 1, 2008 6:42 UTC (Tue) by olecom (guest, #42886) [Link]
> 9. The compiler/linker won't let you get it wrong.
>
> As a C person, I like that[...]
> compile errors (it evaluates sizeof(char[1-2*!!(cond)]) which won't
> compile if cond is true).
>
> I use this in the kernel's module_param(name, type, perm) macro to
> check that the read/write permissions for the module parameter are sane
> (a common mistake was to specify 644 instead of 0644).
[]
> 1. Read the correct mailing list thread and you'll get it right.
>
> The reason some strange interface quirk exists might be for
> compatibility with some strange OS or compiler, weird corner case or even
> older versions of this codebase. In other words, historical reasons ("see,
> on the VAX we only had 6 characters for..."). You sometimes only find this
> when you send a patch to fix it and the original author yells at you.
>
> Sometimes they add it to the FAQ. That does not increase the interface's
> score very much: please try harder.
Q: Don't you think a stream editor can handle that?
A: Our tools have no such thing.
http://article.gmane.org/gmane.linux.kernel/659995
When will they get out of the C box, or just the programming-language box?
Extending gcc to waste more time, yes!
_________
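For readers who have not seen the compile-time trick in the quoted text, here is a minimal sketch. The `BUILD_BUG_ON` macro follows the `sizeof(char[1-2*!!(cond)])` expression quoted above; `CHECKED_PERM` and `default_perm` are hypothetical names invented for this example, and the permission test is a simplified assumption, not the kernel's actual module_param() check.

```c
/* When cond is false this is sizeof(char[1]); when cond is true it is
 * sizeof(char[-1]), which the compiler rejects. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical helper: evaluates to perm, but refuses to compile if perm
 * has any bits outside 0777. Decimal 644 is octal 01204, so it is caught. */
#define CHECKED_PERM(perm) \
    (BUILD_BUG_ON(((perm) & ~0777) != 0), (perm))

static int default_perm(void)
{
    /* Compiles. Writing CHECKED_PERM(644) here would be a compile error. */
    return CHECKED_PERM(0644);
}
```

The check costs nothing at runtime: `sizeof` is evaluated by the compiler and the result is discarded.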
Open-before-use Posted Apr 1, 2008 10:56 UTC (Tue) by epa (subscriber, #39769) [Link] > It's hard for the compiler to ensure that the user calls your "open" routine before your other routines, but an "assert()" can at least get you to this level. In C++ it would be normal practice to make the 'open' routine the constructor, so you automatically have to call it first before any member functions. But you can do this in C too, if your functions all take a handle argument and open() is the only one that generates such a handle.
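A minimal sketch of the C version of this idea follows. All names (`struct conn`, `conn_open`, `conn_send`, `conn_close`) are hypothetical. In a real library the struct definition would live in the .c file, with only a forward declaration in the header, so that callers have no way to build a handle except by calling the open routine.

```c
#include <stdlib.h>

/* Hypothetical resource. Shown in full here so the example is
 * self-contained; a header would expose only "struct conn;". */
struct conn {
    int bytes_sent;
};

/* The only function that produces a struct conn *. Returns NULL on failure. */
static struct conn *conn_open(void)
{
    return calloc(1, sizeof(struct conn));
}

/* Every other operation takes the handle as its first argument,
 * so it cannot be written without something that looks like an open. */
static int conn_send(struct conn *c, int nbytes)
{
    c->bytes_sent += nbytes;
    return c->bytes_sent;
}

static void conn_close(struct conn *c)
{
    free(c);
}
```

This pushes "call open first" from documentation into the type signature, although it cannot stop a caller from passing a pointer that never came from `conn_open`.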
Open-before-use Posted Apr 1, 2008 12:59 UTC (Tue) by nlucas (subscriber, #33793) [Link] But you can't do it in practice many times, because it's normal to need to "reopen" the resource (because of a connection error, the usb device was disconnected, you don't know the resource name beforehand, etc.), which means the added logic for this case is just the same as not opening it in the constructor (the default constructor, at least) and providing "open"/"Close" methods.
Open-before-use Posted Apr 1, 2008 15:35 UTC (Tue) by NAR (subscriber, #1313) [Link] That still wouldn't solve the problem, one could write code like this, even if only an "open" would return a valid handle:
handle_t* handle;
read(handle);
How Do I Make This Hard to Misuse? Posted Apr 1, 2008 16:28 UTC (Tue) by piggy (subscriber, #18693) [Link] > 5. Do it right or it will always break at runtime. The person who taught me my testing skills (Kevin Curry) had a nice way to phrase this: "Programmers fix core-dumps, so make sure that you dump core."
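One way to get the "always breaks at runtime" behaviour for handle misuse is to tag each handle and assert on the tag. The sketch below is an assumption-laden illustration: `struct handle`, `HANDLE_MAGIC` and the `handle_*` functions are invented names, and the check only catches NULL or zeroed handles reliably. A garbage pointer or a freed handle is still undefined behaviour, and the asserts vanish if the code is built with NDEBUG.

```c
#include <assert.h>
#include <stdlib.h>

#define HANDLE_MAGIC 0x48414e44u   /* arbitrary tag value, hypothetical */

struct handle {
    unsigned magic;   /* set by handle_open(), cleared by handle_close() */
    int reads;
};

static struct handle *handle_open(void)
{
    struct handle *h = calloc(1, sizeof(*h));
    if (h)
        h->magic = HANDLE_MAGIC;
    return h;
}

/* True only for a non-NULL handle carrying the tag set by handle_open(). */
static int handle_is_open(const struct handle *h)
{
    return h != NULL && h->magic == HANDLE_MAGIC;
}

static int handle_read(struct handle *h)
{
    assert(handle_is_open(h));   /* abort loudly instead of limping on */
    return ++h->reads;
}

static void handle_close(struct handle *h)
{
    assert(handle_is_open(h));
    h->magic = 0;
    free(h);
}
```

A failed assert calls abort(), which is exactly the core dump the quote asks for.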
Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds