How Do I Make This Hard to Misuse?
"So I've created a 'best' to 'worst' list: my hope is that by putting 'hard to misuse' on one axis in our mental graphs, we can at least make informed decisions about tradeoffs like 'hard to misuse' vs 'optimal'."
      Posted Mar 31, 2008 18:53 UTC (Mon)
                               by ajross (guest, #4563)
                              [Link] (6 responses)
       I'm surprised to find a huge one missing: Don't provide two ways
to do the same thing if one of them is wrong.
 Rusty is talking about kernel code, so I guess he might be assuming
a higher quality of developer than I am.  But misunderstandings of
"thick" APIs are probably the source of more "API misuse" (and other
bugs and misfeatures) than anything else I'm aware of.
 All library code has this disease to some extent or another (Java
has it like the plague), and what it means in the real world is that
coders with limited understanding of the library as a whole go
thumbing through the documentation looking for a gadget that does what
they want, and then plug it in, all the while failing to realize that
another gadget would have been a much better choice.  And since
they don't understand the library as a whole, they don't have a prayer
of understanding the tradeoffs and misfeatures that result from their
choice.
 The solutions to this are either (1) make developers understand the
design of the libraries they use at a deep level, or (2) write
libraries with a minimal but complete feature set, such that
developers don't get stuck using the wrong tool for the job.  Choice 1
is clearly preferable but very expensive -- there aren't many
such developers.
 Option 2 seems like a much better choice.  The only downsides are
that more "boilerplate" code must be written to make up for the lack
of "convenience" (ugh) functionality.  And, I guess, there will be objections from
people who don't understand this "minimal" aesthetic and want to
choose the wrong solution from the choices offered by a much larger
library.
 Personal plug: here's my take on "minimal but complete"
functionality as expressed in a small embeddable scripting language:
http://plausible.org/nasal.
      
           
     
    
      Posted Mar 31, 2008 20:43 UTC (Mon)
                               by nix (subscriber, #2304)
                              [Link] 
       
     
      Posted Mar 31, 2008 21:11 UTC (Mon)
                               by i3839 (guest, #31386)
                              [Link] (1 responses)
       
     
    
      Posted Mar 31, 2008 21:24 UTC (Mon)
                               by ajross (guest, #4563)
                              [Link] 
       
     
      Posted Mar 31, 2008 21:31 UTC (Mon)
                               by rgmoore (✭ supporter ✭, #75)
                              [Link] (2 responses)
       I think that you're missing an important possibility: write really good documentation.  Sometimes you really want a dozen gadgets, each of which is different from the others in some small but important way.  That's a lot less likely to trip up your users if you:
 I know that when I'm writing documentation for closely related functions, I try to do some or all of these.  It's very helpful when coming back to use something I wrote a few years ago to find notes like "function_A does not guarantee that data will be written in any particular order.  If output order is important, use function_B instead."
      
           
     
    
      Posted Mar 31, 2008 21:44 UTC (Mon)
                               by ajross (guest, #4563)
                              [Link] (1 responses)
       I limited my treatment to techniques that have been proven to
actually work in practice.  :)
 Seriously: what you describe would be great.  I've just never seen
anything like it.  At best (or at least the best I've seen), you get
documentation like what Sun provides for the JDK: a very clean,
readable, hyperlinked guide to a true rat's nest of a library that only
a tiny elite class of Java gurus actually understands.
 The core problem being that great documentation does nothing for
those who don't read it, and the sheer size of modern libraries
guarantees that users won't read the documentation.  You can
get around this by finding developers who can read and distill only
the core architecture from the effusive documentation, but then you're
basically implementing a version of my "option 1" above.
 For that matter, good developers tend to get the least relative
benefit from all that "convenience code" anyway, and are happy to
write the 2-3 lines of boilerplate needed to turn their iterator
output into an array, or vice versa, etc...  Which means that even
given a guarantee that only talented developers will use your library,
you're still better off making it minimal than you are adding
functionality.
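To make the "boilerplate" point concrete, here is a rough C sketch. The iterator type and both function names (range_iter, range_next, collect) are invented for illustration and do not come from any real library. The idea is only that a minimal API can expose iteration alone, and a caller who wants an array writes the short loop in collect() themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal iterator: yields 0, 1, ..., limit - 1. */
struct range_iter {
    int next;
    int limit;
};

/* Store the next value in *out and return 1, or return 0 when exhausted. */
int range_next(struct range_iter *it, int *out)
{
    if (it->next >= it->limit)
        return 0;
    *out = it->next++;
    return 1;
}

/* The caller-side "boilerplate": drain the iterator into an array.
 * Returns the number of elements written (at most cap). */
size_t collect(struct range_iter *it, int *dst, size_t cap)
{
    size_t n = 0;

    assert(it != NULL && dst != NULL);
    while (n < cap && range_next(it, &dst[n]))
        n++;
    return n;
}
```

The loop body is two lines, which is roughly the cost being traded against a larger API surface.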
      
           
     
    
      Posted Mar 31, 2008 22:12 UTC (Mon)
                               by nix (subscriber, #2304)
                              [Link] 
       
     
      Posted Mar 31, 2008 20:10 UTC (Mon)
                               by hmh (subscriber, #3838)
                              [Link] (1 responses)
       
     
    
      Posted Mar 31, 2008 21:57 UTC (Mon)
                               by jengelh (guest, #33263)
                              [Link] 
       
     
      Posted Apr 1, 2008 5:41 UTC (Tue)
                               by jd (guest, #26381)
                              [Link] 
       
 
     
      Posted Apr 1, 2008 5:41 UTC (Tue)
                               by jzbiciak (guest, #5246)
                              [Link] (13 responses)
       One of my favorite C APIs to love to hate: fputc(int, FILE *) and fgets(char *, int, FILE *).  Why is the file pointer the last argument?!  As Rusty pointed out, "context" arguments such as handles (FILE * in this case) idiomatically belong at the front, just like in fprintf(FILE *, const char *, ...).
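As an illustration of the handle-first convention, a pair of thin wrappers might look like the sketch below. The names stream_putc and stream_gets are made up here and are not part of any standard library; they only forward to the real fputc and fgets with the arguments reordered.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper: fputc with the stream as the first argument. */
int stream_putc(FILE *stream, int c)
{
    assert(stream != NULL);
    return fputc(c, stream);
}

/* Hypothetical wrapper: fgets with the stream as the first argument. */
char *stream_gets(FILE *stream, char *buf, int size)
{
    assert(stream != NULL && buf != NULL && size > 0);
    return fgets(buf, size, stream);
}
```

With these, every call reads stream first, the same way fprintf does. Of course, adding wrappers also means there are now two ways to do the same thing.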
     
    
      Posted Apr 1, 2008 10:21 UTC (Tue)
                               by xbobx (subscriber, #51363)
                              [Link] (12 responses)
       
     
    
      Posted Apr 1, 2008 11:14 UTC (Tue)
                               by jzbiciak (guest, #5246)
                              [Link] (11 responses)
       Sounds plausible (and quaint!), but I'm not sure I'm following the scenario.  I'm interpreting "reverse order" as "left most argument pushed last."  So, suppose putchar(int c) is a wrapper around fputc(int c, FILE *f): 
So, in pseudo-code, the resultant assembly ought to look roughly like this (assuming tail-call optimization):
 It seems like if arguments went in reverse order, then adding the FILE * argument at the beginning would be the optimization: I must be missing something. 
     
    
      Posted Apr 1, 2008 17:22 UTC (Tue)
                               by vmole (guest, #111)
                              [Link] (7 responses)
       My guess is that it's even simpler than that: we've got a function that puts a character to stdout, and we need one that puts to an arbitrary FILE *, so we'll add an argument.
 Many years ago, I worked on a Data General. The COPY and MOVE commands took, as a first argument, the destination file or directory. This made some sort of sense from an implementation point of view, but, as the TA warned at the time, "Sooner or later *all* of you are going to overwrite a source file." I think we all did.
      
           
     
    
      Posted Apr 7, 2008 7:49 UTC (Mon)
                               by liljencrantz (guest, #28458)
                              [Link] (6 responses)
       
     
    
      Posted Apr 7, 2008 14:38 UTC (Mon)
                               by nix (subscriber, #2304)
                              [Link] (5 responses)
       
     
    
      Posted Apr 7, 2008 17:17 UTC (Mon)
                               by vmole (guest, #111)
                              [Link] (4 responses)
       It's consistent, kinda. The problem comes when describing: "cp foo bar" translates as "copy _foo_ to _bar_" okay, but the obvious translation of "ln foo bar" to "link _foo_ to _bar_" doesn't; the latter seems to say _bar_ is the original, at least to my taste. You have to process it as "create a link to _foo_ named _bar_". Or just memorize it. :-)
      
           
     
    
      Posted Apr 8, 2008 8:24 UTC (Tue)
                               by IkeTo (subscriber, #2122)
                              [Link] (3 responses)
       
     
    
      Posted Apr 8, 2008 12:13 UTC (Tue)
                               by jzbiciak (guest, #5246)
                              [Link] (2 responses)
       
     
    
      Posted Apr 9, 2008 3:06 UTC (Wed)
                               by roelofs (guest, #2599)
                              [Link] (1 responses)
       
 
...are equivalent.  Sadly, cp foo/bar doesn't quite work.
 
Greg
      
           
     
    
      Posted Apr 9, 2008 3:33 UTC (Wed)
                               by jzbiciak (guest, #5246)
                              [Link] 
       
     
      Posted Apr 1, 2008 18:00 UTC (Tue)
                               by felixfix (subscriber, #242)
                              [Link] 
       
     
      Posted Apr 2, 2008 4:27 UTC (Wed)
                               by xbobx (subscriber, #51363)
                              [Link] (1 responses)
       
     
    
      Posted Apr 2, 2008 4:36 UTC (Wed)
                               by jzbiciak (guest, #5246)
                              [Link] 
       
     
      Posted Apr 1, 2008 6:42 UTC (Tue)
                               by olecom (guest, #42886)
                              [Link] 
       
     
      Posted Apr 1, 2008 10:56 UTC (Tue)
                               by epa (subscriber, #39769)
                              [Link] (2 responses)
       
     
    
      Posted Apr 1, 2008 12:59 UTC (Tue)
                               by nlucas (guest, #33793)
                              [Link] 
       
     
      Posted Apr 1, 2008 15:35 UTC (Tue)
                               by NAR (subscriber, #1313)
                              [Link] 
       
     
      Posted Apr 1, 2008 16:28 UTC (Tue)
                               by piggy (guest, #18693)
                              [Link] 
       
     
    How Do I Make This Hard to Misuse? 
      
Hear, hear, indeed. Forget Java: Win32 is the truly horrible example in 
this area, with dozens of ways to do some fundamental things, all with 
different (often poorly documented) shortcomings.
How Do I Make This Hard to Misuse? 
      
Nasal looks interesting.
> Small! 146k source code.
Is that in lines of code? ;-)
How Do I Make This Hard to Misuse? 
      
Goodness no: kilobytes of C code.  And it includes only the core library stuff, and ignores
extension code (currently readline, pcre, sqlite, gtk and cairo) and soft-coded libraries
(there's an XML parser and a few other gadgets).  The current code in CVS is a little bigger
now, at 158k.
Looking at line endings as the LOC metric I count 5507 lines.  Lines with semicolons are an
easy trick to use if you want a measure that ignores comments (i.e. code complexity, not
verbosity), and there are 2620 of those.  I'm sure there are others out there, but I don't
much care.  The point is that it's small. :)
How Do I Make This Hard to Misuse? 
      what it means in the real world is that coders with limited understanding of the library as a whole go thumbing through the documentation looking for a gadget that does what they want, and then plug it in, all the while failing to realize that another gadget would have been a much better choice. ... The solutions to this are either (1) make developers understand the design of the libraries they use at a deep level, or (2) write libraries with a minimal but complete feature set,
      I think that you're missing an important possibility: write really
good documentation.
How Do I Make This Hard to Misuse? 
      
Oh, believe me, even when I describe exactly what functions do in the 
header files, complete with examples... they *still* don't get read, or 
people read the first line and ignore the DO NOT DO THIS in screaming 
flashing red with associated MIDI of a screaming police siren (or the 
closest I can get to that in source code). If I make whateveritis not 
compile if misused, it gets hacked by someone else so that it *does* 
compile when misused, because 'that was easier'. (No it bloody wasn't.)
Given that I work in the financial sector I'm tempted to see if I can 
write something which if misused in an unlikely way transfers the contents 
of the misuser's bank account into mine, and document this as a failure 
mode. I'd be rich within the week! ;}
reasons for kmalloc GFP_ATOMIC...
      
AFAIK, kmalloc in fact CANNOT know if it could sleep or not.  As it was explained to me in the
in_atomic() thread on LKML, that information just doesn't exist in the kernel right now.  You
simply have to know in which context you are, and tell everyone about it (thus, GFP_ATOMIC).
reasons for kmalloc GFP_ATOMIC...
      
And you might even have a reason to call it with GFP_ATOMIC even if you have a user context
and could, theoretically, sleep!
      A few others:
How Do I Make This Hard to Misuse? 
      
      
> Why is the file pointer the last argument?!
That would probably be an optimization.  C function calling convention is that arguments are
pushed onto the stack in reverse order, so with this function the FILE pointer is pushed
first.  Then the caller is free to manipulate the stack without touching the FILE pointer, and
possibly call these functions multiple times.  Otherwise, code that repeatedly reads from stdin
or writes to stdout (e.g., *everything*, at least when libc was designed) has to push/pop
the FILE ptr around _every_ call to these functions.
Not that this would ever be noticeable on modern hardware, just saying...
How Do I Make This Hard to Misuse? 
      
int putchar(int c)
{
    return fputc(c, stdout);
}

    POP 'c' into a register
    PUSH 'stdout'
    PUSH 'c' back on stack
    JUMP to fputc and let it return for us.

    PUSH 'stdout'
    JUMP to fputc and let it return for us
How Do I Make This Hard to Misuse? 
      
Kind of how the ln command in unix works, then? I've always found this massively unintuitive.
How Do I Make This Hard to Misuse? 
      
So do I, but I think that's because of overexposure to C, where that sort 
of thing is helpfully always the other way around.
If you think about it, ln(1) is perfectly consistent with cp(1): it 
creates or updates (for directories) the last thing you list.
How Do I Make This Hard to Misuse? 
      
> but the obvious translation of "ln foo bar" to "link _foo_ to _bar_" doesn't
I see this problem as an inaccuracy of the translation "link _foo_ to _bar_".  This seems to
imply that both _foo_ and _bar_ are pre-existing, and somehow a "link" is created between them
as a result of running the command.  Obviously not what is done by "ln".  It is instead to
"build a link to _foo_ called _bar_".  The cp is to "make a copy of _foo_ called _bar_".
Pretty consistent to me.
How Do I Make This Hard to Misuse? 
      
Although, since the final argument can be a directory, perhaps the best connector for both is
"at":
Make a copy of foo _at_ bar
Make a link to foo _at_ bar
Or in the plural case:
Make copies of foo, bar, baz, quux _at_ dest
Make links to  foo, bar, baz, quux _at_ dest
How Do I Make This Hard to Misuse? 
      Also keep in mind that the target for ln is optional.  Thus:
   ln -s foo/bar .
   ln -s foo/bar
How Do I Make This Hard to Misuse? 
      
That aspect of 'cp' always drove me nuts, probably because I learned MS-DOS first.  I've found
myself tempted to write a wrapper around 'cp' to make that form work.
I won't, though, only because I know it'll wreak havoc when I go to use someone else's account
for whatever reason.  (e.g. to show them how to do something.)
How Do I Make This Hard to Misuse? 
      
Possibly push the file handle, push the char, call, replace the char at top of stack, call,
repeat.  But I am just guessing, and hate working on code like that.  I spent two years
dealing with the memory constraints that made such code tempting, and despised it.
How Do I Make This Hard to Misuse? 
      
      True, for that specific case it is better the other way around.
But suppose you have a function such as:

#include <stdio.h>

/* Write every string in list to the same stream. */
void print_strings(FILE *stream, int num_strings, const char **list) {
    int i;
    for (i = 0; i < num_strings; i++) {
        fputs(list[i], stream);
    }
}
In this case, the assembly for this function will push stream once, then just push/pop n pointers to strings onto the stack and call fputs to print all of the strings.  One could imagine that this would be useful when, say, implementing fprintf or other similar higher-level functions which all output to the same FILE *.
      
          How Do I Make This Hard to Misuse? 
      
See?  I knew I was missing something.  Thanks to you and felixfix both, since you are both
describing the same particular optimization.  Cute, in an ugly, quaint way.  :-D
I'm certain I'm guilty of far worse horrors in my 2 decades of assembly programming.  This
optimization never occurred to me since I've always managed to pass arguments in registers.
(An odd fluke of history, that.  I've written whole-program assembly where I control all the
conventions, C with inline-asm only (no function calls from asm) on machines with stack-based
calling conventions, and C callable asm on machines with register based calling conventions.)
How Do I Make This Hard to Misuse? 
      
> 9. The compiler/linker won't let you get it wrong.
>
>    As a C person, I like that[...]
> compile errors (it evaluates sizeof(char[1-2*!!(cond)]) which won't
> compile if cond is true).
>
>    I use this in the kernel's module_param(name, type, perm) macro to
> check that the read/write permissions for the module parameter are sane
> (a common mistake was to specify 644 instead of 0644).
[]
> 1. Read the correct mailing list thread and you'll get it right.
>
>    The reason that some strange interface quirk exists might be for
> compatibility with some strange OS or compiler, weird corner case or even
> older versions of this codebase. In other words, historical reasons ("see,
> on the VAX we only had 6 characters for..."). You sometimes only find this
> when you send a patch to fix it and the original author yells at you.
>
>    Sometimes they add it to the FAQ. That does not increase the interface's
> score very much: please try harder.
Q: Don't you think a stream editor can handle that?
A: Our tools have no such thing.
http://article.gmane.org/gmane.linux.kernel/659995
When will they get out of the C box, or just the programming-language box?
Extending gcc to waste more time, yes!
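For readers who have not seen the trick quoted at the top of this comment, here is a minimal sketch of how a sizeof(char[1-2*!!(cond)]) check can work. The macro and function names (MY_BUILD_BUG_ON, CHECK_PERM, register_perm) are hypothetical and the range test is a simplification; the kernel's actual module_param check may differ.

```c
#include <assert.h>

/* Hypothetical name. Fails to compile when cond is a true constant
 * expression, because the array type would have size -1. */
#define MY_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical check: permission bits must fit in three octal digits. */
#define CHECK_PERM(perm) MY_BUILD_BUG_ON((perm) < 0 || (perm) > 0777)

/* Hypothetical registration function with a runtime fallback for
 * values that are not compile-time constants. */
int register_perm(int perm)
{
    assert(perm >= 0 && perm <= 0777);
    return perm;
}

int demo(void)
{
    CHECK_PERM(0644);        /* octal 0644: compiles */
    /* CHECK_PERM(644); */   /* decimal 644 is octal 01204: would not compile */
    return register_perm(0644);
}
```

Decimal 644 is larger than 0777 (decimal 511), so the forgotten leading zero turns into a negative array size and the build stops.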
_________
Open-before-use
      It's hard for the compiler to ensure that the user calls your "open" routine before your other routines, but an "assert()" can at least get you to this level.
In C++ it would be normal practice to make the 'open' routine the constructor, so you automatically have to call it first before any member functions.  But you can do this in C too, if your functions all take a handle argument and open() is the only one that generates such a handle.
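A rough sketch of the handle approach in C, under the assumption of a made-up resource called conn (none of these names come from a real library). Only conn_open() produces a handle, every other function takes one, and an assert() catches a NULL or closed handle at runtime.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical resource. In a real library the struct body would
 * live in the .c file so callers cannot build one by hand. */
struct conn {
    int is_open;
    int reads;
};

/* The only function that creates a handle. Returns NULL on failure. */
struct conn *conn_open(void)
{
    struct conn *c = malloc(sizeof(*c));

    if (c != NULL) {
        c->is_open = 1;
        c->reads = 0;
    }
    return c;
}

/* Every other operation needs a handle obtained from conn_open().
 * Returns the number of reads performed so far. */
int conn_read(struct conn *c)
{
    assert(c != NULL && c->is_open);
    c->reads += 1;
    return c->reads;
}

/* Release the handle; it must not be used afterwards. */
void conn_close(struct conn *c)
{
    free(c);
}
```

This moves the mistake from "forgot to call open" to "has no handle to pass", which the compiler will at least complain about if the variable does not exist. It cannot catch a handle that was declared but never assigned.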
      
          Open-before-use
      
But often you can't do this in practice, because it's normal to need to "reopen" the
resource (because of a connection error, the USB device was disconnected, you don't know the
resource name beforehand, etc.), which means the added logic for this case is just the same as
not opening it in the constructor (the default constructor, at least) and providing
"open"/"close" methods.
Open-before-use
      
      That still wouldn't solve the problem: one could write code like this, even if only an "open" would return a valid handle:

handle_t* handle;   /* never initialized: open() was not called */
read(handle);
      
          How Do I Make This Hard to Misuse? 
      
> 5. Do it right or it will always break at runtime.
The person who taught me my testing skills (Kevin Curry) had a nice way to phrase this:
"Programmers fix core-dumps, so make sure that you dump core."
 
           