What If I Don't Actually Like My Users?

The other half of the "hard to misuse" list is now available on kernel hacker Rusty Russell's weblog. "Here begins our descent into hell; if an interface manages to achieve negative scores on the Hard To Misuse List, your users may detect the dull red glow of malignancy rather than incompetence." We linked to the positive half of his list earlier in the week.
What If I Don't Actually Like My Users?

Posted Apr 4, 2008 16:15 UTC (Fri) by dgm (subscriber, #49227) [Link]

I miss a case: "You will get it wrong but it will appear to work anyway".

What If I Don't Actually Like My Users?

Posted Apr 4, 2008 20:02 UTC (Fri) by nix (subscriber, #2304) [Link]

'It tries to intelligently DWIM and does the right thing nearly all the 
time, except when you really need it to, when it gets confused and does 
exactly the worst possible thing'.

Intelligence (-> irregular, DWIM behaviour) is useful, especially in user 
interfaces, but *you must be able to turn it off*.

What If I Don't Actually Like My Users?

Posted Apr 10, 2008 10:43 UTC (Thu) by NRArnot (subscriber, #3033) [Link]

Or more honestly

"It tries to rather simplemindedly DWIM and does the right thing often enough to keep a dimwit
happy. Indeed, you, the dimwit programmer, will be very happy indeed, because no other
interface to the system is provided, and the hotshots who might wish to show you up will be no
more able to use it reliably than you are"

(Remind you of a certain proprietary operating system at all? )

What If I Don't Actually Like My Users?

Posted Apr 10, 2008 23:37 UTC (Thu) by nix (subscriber, #2304) [Link]

It reminds me of entirely too much of my own code before I realised the 
problems with it. (The other problematic thing: thoughtless information 
hiding. Yes, reducing coupling is good, but if you have an internal 
parameter affecting the behaviour of the system, *export it* somehow, if 
need be by way of a separate wrapping shared library with different 
interface guarantees, so you can change the implementation and eliminate 
or change those parameters, breaking the interface of that wrapping 
library, without breaking the interface of the 'real' library. Why? So 
that testsuites, not necessarily just those you write, can peek at enough 
of the library's internal state that they can guarantee that they've 
exercised all its corners.)

What If I Don't Actually Like My Users?

Posted Apr 11, 2008 6:18 UTC (Fri) by nix (subscriber, #2304) [Link]

Good grief. I'm sorry about perpetrating that horrific run-on sentence. It 
just sort of... metastasized without my realising it.

What If I Don't Actually Like My Users?

Posted Apr 4, 2008 16:27 UTC (Fri) by lokpest (guest, #45764) [Link]

If you're thinking about throwing real end users in the blender, get help!

...they are heavy...

(http://www.youtube.com/watch?v=8NeR2LyILWQ)

What If I Don't Actually Like My Users?

Posted Apr 4, 2008 17:02 UTC (Fri) by JoeBuck (subscriber, #2330) [Link]

The gets() function actually achieves criterion 10: it's impossible to get right. Any use at all means that a very long line in the input will cause a buffer overflow.
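For contrast, here is a minimal sketch of a bounded line read built on fgets(), which takes the buffer size that gets() has no way to receive. The helper name read_line and its return convention are assumptions made up for this illustration, not anything from the thread or from a standard library.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf, which holds cap bytes
 * including the terminating NUL. The trailing newline is stripped.
 * Returns 1 if the whole line fit, 0 if it was truncated (the rest of
 * the line is discarded), -1 on end of input or read error.
 * It never writes past buf[cap - 1]. */
int read_line(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL && cap >= 2 && cap <= INT_MAX);

    if (fgets(buf, (int)cap, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline stored: skip whatever is left of this line and
     * remember whether anything actually had to be thrown away. */
    int c, dropped = 0;
    while ((c = getc(in)) != EOF && c != '\n')
        dropped = 1;
    return dropped ? 0 : 1;
}
```

A caller would typically treat a return of 0 as an error or a warning rather than silently using the truncated text.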

What If I Don't Actually Like My Users?

Posted Apr 4, 2008 20:06 UTC (Fri) by nix (subscriber, #2304) [Link]

Hm, yes. I was going to say that tmpnam() et al are impossible to get 
right, but they're not: you can run them in constrained environments in 
which you know you won't get attacked. You don't *need* to be attacked for 
gets() to shoot you in the head.

(Why oh why were gets(), puts() and the other pre-stdio functions not 
quietly retired when stdio was invented? At least gets() is rarely used in 
free software, although probably not as rarely as seekdir()/telldir(), 
which I've never even heard of anyone using.)

What If I Don't Actually Like My Users?

Posted Apr 4, 2008 20:38 UTC (Fri) by xbobx (subscriber, #51363) [Link]

> Why oh why were gets(), puts() and the other pre-stdio functions not quietly retired when stdio was invented?

puts() is still used all the time. In fact, for the almost-simplest of programs:

#include <stdio.h>
int main(void) {
    printf("hi\n");
    return 0;
}

gcc 4.1.2 on my system generates the following:

sub    $0x8,%rsp
mov    $0x4005f8,%edi
callq  400430 <puts@plt>
xor    %eax,%eax
add    $0x8,%rsp
retq

If you use anything more complicated than a constant static string it will actually call printf().

What If I Don't Actually Like My Users?

Posted Apr 4, 2008 20:49 UTC (Fri) by nix (subscriber, #2304) [Link]

Well, yeah, but as the analogue of gets(), entirely redundant to fputs(), 
both should have gone if either does, and gets() certainly should have gone.

But it didn't.

What If I Don't Actually Like My Users?

Posted Apr 5, 2008 3:33 UTC (Sat) by njs (subscriber, #40338) [Link]

Any use at all of gets() does cause a linker warning, at least.

What If I Don't Actually Like My Users?

Posted May 19, 2008 8:32 UTC (Mon) by TBBle (guest, #52146) [Link]

> At least gets() is rarely used in 
> free software, although probably not as rarely as seekdir()/telldir(), 
> which I've never even heard of anyone using.

Samba uses it... http://www.vnode.ch/fixing_seekdir

Mind you, I wouldn't have known Samba was using it either (and in fact it took me a little
while to wrap my head around why) before I saw that article.

What If I Don't Actually Like My Users?

Posted May 19, 2008 19:55 UTC (Mon) by nix (subscriber, #2304) [Link]

So a really quite substantial bug (affecting perhaps 3% of all calls to 
this function in nontrivial directories) persisted for *a quarter of a 
century* before anyone noticed it.

I suspect that seekdir()/telldir() has exactly one user: Samba. Given how 
horrible it makes filesystem implementations, and the closeness of Samba 
implementors to the kernel, I'm not sure that it's worth preserving this 
function for that one user (which is privileged in any case so the usual 
oops-it-might-use-up-too-much-memory arguments against a naive 
entirely-in-VFS implementation do not apply).

Votes to make seekdir()/telldir() root-only, anyone?

Pedantry

Posted Apr 5, 2008 9:46 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

It's not /impossible/ to get right although I agree that it provides no functionality you
would otherwise want that's not better implemented elsewhere in a modern POSIX API.

It's not impossible for the same reason that the non-thread safe functions aren't impossible
to get right, and the temporary file functions that aren't protected against races aren't
impossible to use safely. So long as you don't need that safety, they're fine. Similarly
gets() is fine so long as you don't need any protection against over-long strings.

e.g. suppose you have a socket, over which you receive instructions from a device driver in
the kernel, in this case gets() can be appropriate because the inverted privilege separation
means that crashing you with a buffer overflow achieves nothing - as a userspace process you
actually have fewer privileges than the kernel driver calling you. So long as there's an agreed
rule about buffer size (e.g. each instruction is the name of a file, so max pathname plus a
newline) and a mechanism ensuring that you are connected to that kernel driver as intended,
then gets() adds no new danger to your code.

Still, this is such a corner case, and there are so many better ways to approach this problem,
that the compiler/ linker is justified in complaining if you use this function in new code.

I've never found a sensible use for gets() in my own code, but I have used various other
"non-safe" functions, like tmpnam in contexts where the newer "safer" function didn't do what
I needed, and I judged that the safety issue was something I could cope with after reading
about it. I think I had to roll my own mkdtemp() some years ago for example, using "unsafe"
tmpnam and lots of careful reading of a treatise on race conditions in the filesystem.

A good example of an API that's _really_ impossible to use is one that Raymond Chen rants
about regularly, a Windows API call which purports to tell you whether a pointer is "valid" ie
whether you'd take a page fault if you tried to access it. This is simultaneously useless (it
doesn't do anything a userspace programmer should be trying to do) and dangerous (it will
itself inadvertently page in RAM if you call it on the edge of the stack for example) and
unreliable (the page may appear or vanish before control returns to your program with the
result). But having been foolishly offered to programmers in the past, Windows continues to
provide it for compatibility.

Pedantry

Posted Apr 5, 2008 10:05 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

Ah, I got part of that last example wrong. It's dangerous because it will disable the stack
extension, ie it doesn't cause a new page to be mapped at the edge of the stack, but rather
overrides the runtime's automatic mapping of that page, returns to you a result that it isn't
yet mapped, but doesn't restore the stack extension behaviour. So suddenly, and without
explanation, your stack can't grow.

For a POSIX example that's less dangerous but just as unreliable/ useless, try access(2) which
attempts to guess whether you'd be allowed to do something, but can't assure you whether you
will or will not be able to do it if you actually try. It's not even useful for checks like
those in ssh which wants to see if you obeyed the rules for your own safety, since ssh needs
to check access permissions for /other users/.
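One way to sidestep the check-then-act gap described above is to skip access() and simply attempt the operation, treating failure as the answer. The following is a minimal sketch under that assumption; the helper name open_for_reading and its out-parameter convention are invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open path read-only.
 * On success returns a file descriptor and sets *err to 0.
 * On failure returns -1 and stores the errno value in *err.
 * There is no separate "would this work?" query, so there is no window
 * between the check and the use in which permissions could change. */
int open_for_reading(const char *path, int *err)
{
    assert(path != NULL && err != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        *err = errno;   /* e.g. ENOENT or EACCES */
        return -1;
    }
    *err = 0;
    return fd;
}
```

The caller then reports or handles the specific error code instead of trusting an earlier guess.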

What If I Don't Actually Like My Users?

Posted Apr 4, 2008 21:42 UTC (Fri) by jzbiciak (✭ supporter ✭, #5246) [Link]

Whoo hoo! I see fputs made the list. It's in the family of functions I complained about last week.

Some years ago, I wrote a bunch of code that treated the VT-100 display as a "frame buffer." You'd doodle on an array representing the screen, and then you'd call an update function to refresh the display. Nothing too earth shattering. My intention was to get the C code working, and then whittle it down into a tiny graphic hack as an IOCCC entry.

Well, the IOCCC entry never happened, but I found myself instead using the code for more and more things. I threw all sorts of stuff into the stew. The resulting code was a nightmare, particularly on the topic of coordinates. I had no fewer than 3 separate conventions:

  1. (x, y) with zero based coordinates
  2. (x, y) with one based coordinates
  3. (row, col) with one based coordinates

Oy. Thankfully I can say that was almost half a lifetime ago for me.

What If I Don't Actually Like My Users?

Posted Apr 5, 2008 1:32 UTC (Sat) by pr1268 (subscriber, #24648) [Link]

> Oy. Thankfully I can say that was almost half a lifetime ago for me.

Darn... That happened just yesterday for me (it was with the blasted parameters to a function call).

In that same program, I experienced the below bundle of joy (and what is it with these stupid size_t types?!) Code:

size_t moonWalk(const vector<MyThingy>& stuff)
{
    size_t i = h_list.size();
    while(i >= 0) {
        /* do something with */ stuff[--i];
    }
    return i;
}

Oh, so subtle! It was a long afternoon of troubleshooting before I figured out why Mr. Fault (first name: Segmentation) kept interrupting my otherwise blissful coding session. (Note to self: Be sure to add brown paper bags to shopping list.)

Problem and solution

Posted Apr 5, 2008 11:01 UTC (Sat) by man_ls (subscriber, #15091) [Link]

Hmmm... before all of us ignoramuses spend a long afternoon figuring this out, could you show us the problem and the solution? The code looks fine to me...

Problem and solution

Posted Apr 5, 2008 12:08 UTC (Sat) by emk (subscriber, #1128) [Link]

It took me a second, too, because it's using a strange (incorrect) looping idiom. Here, size_t is an unsigned type, so it can never be negative. If i == 0, and you write --i, it wraps around to a huge positive value. To fix it, try:

while (i > 0) {

Thanks to the unsigned nature of size_t, it's surprisingly hard to loop backwards over STL vectors without tripping over this bug. You could choose to use reverse iterators, which are clunky but safer:

vector<MyThingy>::const_reverse_iterator iter = stuff.rbegin();
for (; iter != stuff.rend(); ++iter) {
    /* do something with */ *iter;
}

Also, I have no idea why the original function returns i. Was there a break somewhere in the original loop? Without one, i would always equal 0 (assuming the loop terminated successfully).
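For readers who want to keep an unsigned index, the same backward walk can be sketched in plain C. The idea is to test the counter before decrementing it, so the body never sees a wrapped value. The function find_last is a made-up example, not code from the original program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the last element equal to
 * key, or n if there is no such element. Walks the array backwards
 * with an unsigned index. */
size_t find_last(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);

    /* "i-- > 0" compares the old value with zero, then decrements.
     * Inside the body i runs n-1, n-2, ..., 0. When the old value is 0
     * the loop exits before the wrapped value is ever used, and an
     * empty array (n == 0) runs zero iterations. */
    for (size_t i = n; i-- > 0; ) {
        if (a[i] == key)
            return i;
    }
    return n;
}
```

Returning n for "not found" avoids needing a negative sentinel, which an unsigned type cannot represent.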

Problem and solution

Posted Apr 6, 2008 12:55 UTC (Sun) by olecom (guest, #42886) [Link]

> while (i > 0) { stuff[--i];
when using predecrement, use

i=max+1; do { --i; } while(i);

Problem and solution

Posted Apr 6, 2008 13:20 UTC (Sun) by olecom (guest, #42886) [Link]

> > while (i > 0) { stuff[--i];
> when using predecrement, use
>
> i=max+1; do { --i; } while(i);

Also beware on input!

max -- is maximum linear address
i   -- geek counter, counts downto zero including;
       thus, number of loops is (max + 1)

better is to have *all* counts downto zero,
i.e. zero address access and

i = max /* number of loops is (max + 1) */
do {
  /* use */ stuff[i];
} while (--i);

Problem and solution

Posted Apr 6, 2008 15:20 UTC (Sun) by man_ls (subscriber, #15091) [Link]

What if the array is empty? You would be doing one iteration, while the above idiom (while (i>0) {--i;}) skips the loop and works fine.

Problem and solution

Posted Apr 7, 2008 5:11 UTC (Mon) by olecom (guest, #42886) [Link]

> What if the array is empty?

Same thing -- check your input. `if' for check, `while' for loop/iterations.

Mixing the two isn't a good thing, worst optimization ever.
_____

Problem and solution

Posted Apr 7, 2008 6:17 UTC (Mon) by man_ls (subscriber, #15091) [Link]

It depends. Quite often it doesn't matter much, and the compact version can be better (at least it takes less time to write). You know what they say about premature optimization, don't you?

By the way, with Java your best optimization is to write while (i != 0) instead of while (i > 0), because otherwise the JVM will perform arithmetic comparisons instead of logical. (Yes, it is pitiful.) I believe they have fixed it now, but it would be nice to know for sure.

just correct loops (Problem and solution)

Posted Apr 7, 2008 8:28 UTC (Mon) by olecom (guest, #42886) [Link]

> It depends. Quite often it doesn't matter much, and the compact version
> can be better (at least it takes less time to write).

All was about decrement-after-check access in the loop.

> You know what they say about premature optimization, don't you?

Those loops are quite fine for me, YMMV.
BTW, I'm into systems programming, so I don't know what Java is, really .:)
-- 
sed 'sed && sh + olecom = love' << ''
-o--=O`C
 #oo'L O
<___=E M

Problem and solution

Posted Apr 10, 2008 15:13 UTC (Thu) by DennisJ (subscriber, #14700) [Link]

Supposing a signed type had been used for the counter, and nothing changed to the loop
condition, the last pass through the loop would have done something to stuff[-1] which is also
unlikely to be what the author wanted.

So isn't 'while(i>0)' the correct solution following an incorrect analysis?

What If I Don't Actually Like My Users?

Posted Apr 5, 2008 15:51 UTC (Sat) by nix (subscriber, #2304) [Link]

-Wall gives you a nice helpful warning about this case.

(I use size_t (and where appropriate ssize_t) religiously for things like 
array indexes, so I hardly ever see this problem. It only turns up when you 
have to interface with code that was written by people who don't 
understand that the index to arrays in C is *not* a fricking int or a 
long, it's a size_t... this is starting to matter now that int and long 
are different sizes again on some platforms.)

What If I Don't Actually Like My Users?

Posted Apr 5, 2008 16:36 UTC (Sat) by jzbiciak (✭ supporter ✭, #5246) [Link]

Erm... how often do you have 2 billion elements in a single array?  It's only when you get
larger than that in a *single array* that int vs. size_t matters for an *array index*.

Using more than 2 gigs of memory isn't unheard of, and is in fact quite common.  But, using
that much memory in a *single array* seems rather suspect.  Actually, no, it seems rather
outrageous, unless your screen display is a 1200 dpi bitmap on a 24" screen or something.

What If I Don't Actually Like My Users?

Posted Apr 5, 2008 17:43 UTC (Sat) by nix (subscriber, #2304) [Link]

It's not common. I'm just a correctness fiend. :)

What If I Don't Actually Like My Users?

Posted Apr 5, 2008 22:13 UTC (Sat) by jzbiciak (✭ supporter ✭, #5246) [Link]

Of course, if your size_t results in bugs like the one above (downcounters that just don't
quit), I'd argue it hurts correctness.

I don't recall anything in the C standard that suggests size_t is actually an appropriate type
for array indexes.  It *is* an appropriate type to pass to malloc, but that's what pops out of
sizeof(type).  It says nothing about the type of the index that you'll use on the resulting
array.

What If I Don't Actually Like My Users?

Posted Apr 6, 2008 0:29 UTC (Sun) by nix (subscriber, #2304) [Link]

The Standard implies it, and the implication is fairly obvious as these 
things go. Let's follow through the logic.

size_t is the upper limit on the size of any object in C, and arrays (like 
other derived types) are themselves objects (they are not functions nor 
incomplete types, the other classes of type).

The smallest addressable object type in C is 'char', which by definition 
occupies one byte; thus, the largest possible array is an array of char of 
size (size_t)-1.

Thus, the largest possible array index is by definition always the same as 
the largest possible allocated object, i.e., contained exactly within 
size_t.

Use another type and it will eventually hurt you. (If your algorithms rely 
on decrementing index counters below zero, I'd say they are themselves 
risky and should be rethought, because if you use that index, you'll be 
indexing an array before its start, which if it goes off the start of an 
allocated object invokes undefined behaviour.)

(As further evidence, the Standard contains a --- non-normative --- 
example of using sizeof to determine the length of an array, which
implies that the length of an array is a size_t, so its index probably is 
too...)

This concludes today's ludicrous pedantry. Don't make the mistake of 
thinking that any of this stuff is actually important. :)

What If I Don't Actually Like My Users?

Posted Apr 6, 2008 1:25 UTC (Sun) by jzbiciak (✭ supporter ✭, #5246) [Link]

I guess you could be even more pedantic and put 'U' suffixes on your array bounds too: int array[3U][5U]; ;-)

As for down-counting loops: The counter going negative is a red herring in terms of correct array accesses. What do the following two loops have in common?

for (i = 0; i < N; i++)
    do_something(array[i]);

for (i = N-1; i >= 0; i--)
    do_something(array[i]);

Answer? Both leave 'i' pointing one element past the end of the array. The only difference is which end.

I personally find negative array subscripting useful. The following is legitimate C code:

    /* Take a histogram of signed 8-bit values */
    int histogram[256];
    int *hist_mid = histogram + 128;
    signed char *data;

    /* ... */

    for (i = 0; i < N; i++)
        hist_mid[data[i]]++;

And as far as the standard goes, at least this example from the C0x standard uses int to define array bounds (in the context of the new "Variable Length Array" feature being added to C).

*shrug*

You're right, though, it doesn't matter a whole lot. Just don't take my signed integer indices away, and I'll let you keep your unsigned ones. :-)
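The histogram fragment above leaves out its declarations, so here is a self-contained sketch of the same negative-subscript idea. The function name histogram_s8 and the convention that the count for value v lands in hist[v + 128] are assumptions for this example, and it presumes the usual 8-bit signed char.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how often each signed 8-bit value occurs.
 * hist must have 256 entries; afterwards hist[v + 128] holds the count
 * for value v. Assumes signed char ranges over -128..127. */
void histogram_s8(const signed char *data, size_t n, int hist[256])
{
    assert(hist != NULL && (data != NULL || n == 0));

    for (size_t k = 0; k < 256; k++)
        hist[k] = 0;

    /* mid points at the middle of hist, so mid[-128] .. mid[127] all
     * stay inside the array. The negative subscripts are legal because
     * the resulting addresses are within the same object. */
    int *mid = hist + 128;
    for (size_t i = 0; i < n; i++)
        mid[data[i]]++;
}
```

Note that the loop counter here is unsigned while the subscript itself is signed, so the two conventions can coexist in one function.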

What If I Don't Actually Like My Users?

Posted Apr 6, 2008 14:28 UTC (Sun) by jbh (subscriber, #494) [Link]

The largest single array I've worked with lately was about 500 million 64-bit doubles (the
values array of a CRS matrix).

So I don't think 2 billion elements seems outrageous.

But I wouldn't run that on a 32-bit cpu, for obvious reasons.

What If I Don't Actually Like My Users?

Posted Apr 7, 2008 6:50 UTC (Mon) by gdt (subscriber, #6284) [Link]

Q: Erm... how often do you have 2 billion elements in a single array?

A: When it's being exploited.

brought to you by the letters R, G and B...

Posted Apr 16, 2008 21:49 UTC (Wed) by roelofs (subscriber, #2599) [Link]

> Erm... how often do you have 2 billion elements in a single array?

    unsigned char *pix = (unsigned char *)malloc(32768*32768*3*sizeof(unsigned char));

Not even a tiny bit far-fetched.

Greg

brought to you by the letters R, G and B...

Posted Apr 16, 2008 21:55 UTC (Wed) by jzbiciak (✭ supporter ✭, #5246) [Link]

Hmm... GIMP at least tiles that and uses a higher order tile structure.  Are there really
programs that actually allocate that in a single 3GB malloc()?

brought to you by the letters R, G and B...

Posted Apr 16, 2008 22:19 UTC (Wed) by roelofs (subscriber, #2599) [Link]

XV used to try. It included multiple image-format decoders that checked the height and/or width separately but either failed to check the product of the two or else did so but subsequently failed to check the product of the result with the byte-depth. Or they did so incorrectly--i.e., failing only for negative products, not realizing that h*w*d can easily wrap back into the positive range. You really want to use this pattern or its equivalent:

  // int, long, or other signed type:  w, h, npixels, bufsize
  npixels = w * h;
  bufsize = 3 * npixels;
  if (w <= 0 || h <= 0 || npixels/w != h || bufsize/3 != npixels) {
    FAIL();
  }
  buf = malloc(bufsize*sizeof(whatever_buf_is_made_of));

But realistically, you're right--you absolutely don't want to use a program that tries to allocate the whole thing simultaneously, regardless of whether it's in one piece or many. And you may want to avoid certain image formats for the same reason--tiled TIFF, for example, is well suited to very large images, but many (most?) other image formats are not. PNG, for all its simplicity (well, relative to TIFF, anyway), basically requires you to decode at least two full rows simultaneously, and rows can be up to 16 GB each (2^31 - 1 pixels wide * 8 bytes deep for 64-bit RGBA). Of course, 2G x 2G x 64-bit images are still in the realm of fantasy, AFAIK...

Greg
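A variation on the pattern above checks each product before performing it. Overflow of a signed multiplication is undefined behaviour in C, so testing the result afterwards is not guaranteed to work on every compiler. This is only a sketch: the helper name image_buffer_size and its signature are invented here, and it assumes the buffer size is wanted as a size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute w * h * depth bytes into *out.
 * Returns 1 on success. Returns 0, leaving *out untouched, if any
 * dimension is not positive or the product does not fit in size_t. */
int image_buffer_size(long w, long h, size_t depth, size_t *out)
{
    assert(out != NULL);

    if (w <= 0 || h <= 0 || depth == 0)
        return 0;
    /* Guard the conversions in case long is wider than size_t. */
    if ((unsigned long)w > SIZE_MAX || (unsigned long)h > SIZE_MAX)
        return 0;

    size_t uw = (size_t)w, uh = (size_t)h;

    /* uw * uh fits exactly when uw <= SIZE_MAX / uh. */
    if (uw > SIZE_MAX / uh)
        return 0;
    size_t npixels = uw * uh;

    /* Same test for the byte depth. */
    if (npixels > SIZE_MAX / depth)
        return 0;

    *out = npixels * depth;
    return 1;
}
```

The caller would pass the resulting size to malloc() only when the function returns 1, and would still need to check malloc()'s own return value.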

Correction and comment replies

Posted Apr 5, 2008 20:42 UTC (Sat) by pr1268 (subscriber, #24648) [Link]

s/h_list/stuff/ on line 3 of my code.

In reply to man_ls, as others have pointed out, size_t is an unsigned integer type (often, though not always, unsigned long). The pre-decrement --i wrapped around past zero instead of going negative, so the i >= 0 test could never fail.

In reply to emk, I did indeed change over to using a reverse iterator for this code, but I originally had some motivation not to do so, the reason for which now escapes me. And to think this was just two days ago - goes to show just how fast thoughts and ideas flee through my mind when writing software!

Thanks for the replies - it's not just anywhere where I can share such embarrassing C++ code and still stimulate a nice discussion.

What If I Don't Actually Like My Users?

Posted Apr 8, 2008 14:16 UTC (Tue) by aya (guest, #19767) [Link]

Ah, but you forget you're using an STL class.

size_t moonWalk(const vector<MyThingy>& stuff)
{
    for (vector<MyThingy>::const_reverse_iterator i= stuff.rbegin();
         i != stuff.rend();
         ++i)
    {
        /* do something with *i */
    }

    /* I don't get what you're returning, though. */
}

What If I Don't Actually Like My Users?

Posted Apr 5, 2008 15:09 UTC (Sat) by Wummel (subscriber, #7591) [Link]

The JavaScript parseInt() interface comes to my mind. Usually, it parses strings to integers with a decimal base. But it silently switches to octal when the string has leading zeros: parseInt('08') equals zero!

So, you pretty much always have to specify the base as second argument, which is annoying: parseInt('08', 10) gives the expected 8.

parseInt feature

Posted Apr 5, 2008 16:45 UTC (Sat) by ccyoung (guest, #16340) [Link]

thanks for that reminder - nothing like getting bit in the butt

What If I Don't Actually Like My Users?

Posted Apr 7, 2008 8:28 UTC (Mon) by Los__D (subscriber, #15263) [Link]

What's wrong with the 0 prefix giving octal? It does exactly the same in C, just as both C and
parseInt() treat 0x as hexadecimal.

Anyway, ActionScript v3 has dumped this, probably meaning it's also gone from ECMAScript, so I
guess JavaScript will pick up the changes sooner or later.

They kept the '0x' prefix for Hexadecimal though, which seems a bit strange to me, either keep
them both, or dump them both. Oh well, maybe it's just because no one really uses octal in
ECMAScript and derived scripts.

What If I Don't Actually Like My Users?

Posted Apr 7, 2008 11:09 UTC (Mon) by dvdeug (subscriber, #10998) [Link]

What's wrong with the leading 0 making it octal is that if you have to deal with people who
aren't mathematicians or computer scientists, they probably won't know what octal is. If they
add a leading zero by accident, odds are they'll not be able to figure out what they did wrong
or why the results came out the way they did, and unless you're lucky enough to be standing
over their shoulder while they do it, their bug reports won't have enough information to
replicate it. The difference, to them, between typing in 0507 and 507 is nothing. Hence using
parseInt with a UI is going to cause rare, hard-to-trace problems in exchange for a feature
that no one cares about. (Really, even among computer geeks, few people use octal.)
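C's strtol() shows the same split: with base 0 it guesses the base from the prefix, while an explicit base of 10 treats leading zeros as plain digits. Below is a minimal sketch of a parser for user-typed numbers that always passes 10. The helper name parse_decimal and its return convention are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse text as a base-10 integer.
 * Returns 1 and stores the value in *out on success.
 * Returns 0 if text is empty, has trailing junk, or is out of range. */
int parse_decimal(const char *text, long *out)
{
    assert(text != NULL && out != NULL);

    char *end;
    errno = 0;
    long v = strtol(text, &end, 10);   /* explicit base: no octal guess */

    if (end == text || *end != '\0' || errno == ERANGE)
        return 0;
    *out = v;
    return 1;
}
```

With this helper "0507" parses as 507, whereas strtol("0507", NULL, 0) would read it as octal and yield 327.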

What If I Don't Actually Like My Users?

Posted Apr 7, 2008 1:00 UTC (Mon) by gdt (subscriber, #6284) [Link]

I wrote this blog entry in reply to Rusty's first blog entry. I'll copy it here as my web server is on my home ADSL link.

The Linux kernel API, a user's view

Rusty's blog on API design is timely, as I'm struggling with the API for Linux's Netfilter. There's no shortage of HOWTOs on the topic and no shortage of production code to examine.

The problem is bit rot. The API to establish connection tracking has been deprecated, but the official HOWTO on the Netfilter website hasn't been changed. There's no documentation of the new nf_ct_expect_alloc() at all. A reasonable QA process would have rejected a code change which didn't update the official documentation.

The API to register a connection tracking helper has also silently changed. nf_conntrack_helper_register() no longer accepts a bitmask. Again the official documentation and sample code haven't been updated. The entirety of the documentation is a set of obscure commit comments and a short NetDev list discussion. Without finding those and understanding their significance you can't understand why the production code works when it differs from the official sample code and the large collection of older code in Patch-o-matic.

The broader networking API also has a newish function: skb_header_pointer(). All of the original SKB manipulation functions have documentation in the header file. Somehow this new function appeared with no documentation in that header file.

I can read source. But let's not pretend that "Use the Source Luke" is ideal. The code for nf_conntrack_helper_register() is about adding entries to a list. The connection tracking magic doesn't happen until a packet arrives and that list is searched and acted upon, which is handled in other code in a galaxy far, far away.

Source code is also hard at explaining why. Why and when should skb_header_pointer() be used in preference to direct access to the SKB's data? The source won't tell you that unless you are already so immersed in the kernel that you half-know the answer anyway.
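As a rough illustration of the "why", here is a userspace analogy and not kernel code. It assumes the usual explanation that a packet's bytes are not always stored in one contiguous region, so a header may straddle two pieces. The struct fake_pkt and the function header_pointer are invented for this sketch and only mimic the general shape of such an accessor: hand back a direct pointer when the bytes are contiguous, otherwise copy them into a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packet stored in two pieces: a contiguous head and a
 * separately stored fragment that logically follows it. */
struct fake_pkt {
    const unsigned char *head;
    size_t head_len;
    const unsigned char *frag;
    size_t frag_len;
};

/* Return a pointer to len bytes starting at offset within the packet.
 * If the bytes lie wholly inside head, point straight at them.
 * Otherwise gather them into scratch (at least len bytes) and return
 * scratch. Returns NULL if the packet is too short. */
const void *header_pointer(const struct fake_pkt *p, size_t offset,
                           size_t len, void *scratch)
{
    assert(p != NULL && scratch != NULL);

    size_t total = p->head_len + p->frag_len;
    if (offset > total || len > total - offset)
        return NULL;

    if (offset + len <= p->head_len)
        return p->head + offset;          /* fast path: no copy */

    unsigned char *dst = scratch;
    size_t copied = 0;
    if (offset < p->head_len) {
        /* Part of the range is in head: copy that part first. */
        copied = p->head_len - offset;
        memcpy(dst, p->head + offset, copied);
        offset = 0;
    } else {
        offset -= p->head_len;            /* range starts inside frag */
    }
    memcpy(dst + copied, p->frag + offset, len - copied);
    return scratch;
}
```

Direct pointer arithmetic on head alone would read past its end in the second and third cases, which is the kind of mistake such an accessor exists to prevent.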

Source code can also mislead. For example, looking at existing Netfilter code in the kernel would give you the idea that a 64KB buffer is needed for parsing incoming packets in a Netfilter module. That's not true at all, it's just that all of the modules which have been accepted into the kernel have needed a packet-sized buffer to parse for IETF-style protocol text or to decode ASN.1.

After struggling through all of this, I'll lay odds that posting the finished module will result in at least one put-down e-mail about some misuse of some Linux API.

A final thought. Is there a kernel API at all? Can something permanently partially obscured be said to exist? Or is the API like the Loch Ness Monster? In place of blurry photographs we have Linux device drivers, where even those who closely track the kernel API can be misled by poor design and worse documentation, such as with in_atomic().

Again, text processing (as in prev. discussion)

Posted Apr 7, 2008 9:41 UTC (Mon) by olecom (guest, #42886) [Link]

> I can read source. But let's not pretend that "Use the Source Luke" is
> ideal. The code for nf_conntrack_helper_register() is about adding entries
> to a list. The connection tracking magic doesn't happen until a packet
> arrives and that list is searched and acted upon, which is handled in other
> code in a galaxy far, far away.

Seems like you have working source code, isn't that enough? I mean,
somebody did that for you (and many many other/users). Maybe after that
any kind of documentation writing wasn't in IWANTNOW list of the author?

It's open source, many who use, few who contribute. So it can be boring and
upsetting for particular authors. Others can be outraged.

> After struggling through all of this, I'll lay odds that posting the
> finished module will result in at least one put-down e-mail about some
> misuse of some Linux API.

Maybe also a documentation patch and willingness to improve the kernel,
everyone needs? I'm sure original author will be happy.

> A final thought. Is there a kernel API at all?

I think it's just hard and boring to maintain. After some repetition
almost anything in programming can be automated. It's not harvesting or
fruit/mushroom collecting, which is by far manual-only work (making a
suitable robot is more complicated than rocket science).

Any repetition is boring for human or dumbing down.

Maybe the tool set must be upgraded (diff+patch in any form is technology
of the 1980s)?

I've started to work with text processing to make some automation for
maintaining big changes, i.e not just one-liners, which can be grep'ed.

With some input from coding-style policy department and developers making
tags/clues comments in hard to textually analyse cases, something can be
done, and i think quite useful.

http://kernelnewbies.org/olecom

prev. comment
http://lwn.net/Articles/275780/
_____

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds