GCC and pointer overflows

Posted Apr 17, 2008 17:14 UTC (Thu) by jzbiciak (subscriber, #5246)
In reply to: GCC and pointer overflows by mb
Parent article: GCC and pointer overflows

No, I meant a 32-bit system. If you do the pointer arithmetic first on an array of structs whose size is 65536 bytes, then any length > 32767 will give a byte offset that's >= 2^31 bytes away. A length of 65535 will give you a pointer that's one element before the base.

I guess it's an oversimplification to say 32767 specifically. Clearly, any len > 65535 will wrap back around and start to look "legal" again if the element size is 65536. Any len <= 65535 but greater than some number determined by how far the base pointer is from the end of the 32-bit memory map will wrap to give you a pointer that is less than the base. So, the threshold isn't really 32767, but something larger than the length of the array.
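To make the arithmetic concrete, here is a small sketch that models 32-bit addresses with uint32_t. Real pointer overflow is undefined behavior in C, so the model uses unsigned integers, where wraparound is well defined. The function name wrapped_addr and the base address used below are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: models "base + len" on a flat 32-bit machine,
 * where the byte offset elem_size * len is truncated to 32 bits. */
uint32_t wrapped_addr(uint32_t base, uint32_t elem_size, uint32_t len)
{
    /* Multiply in 64 bits, then truncate the way a 32-bit machine would. */
    uint32_t byte_offset = (uint32_t)((uint64_t)elem_size * len);

    return (uint32_t)(base + byte_offset);   /* wraps modulo 2^32 */
}
```

With an assumed base of 0x10000000 and an element size of 65536, a len of 32768 lands at 0x90000000 (2^31 bytes away), a len of 65535 lands at 0x0FFF0000 (one element below the base), and a len of 65536 lands right back on the base.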

That was my main point: once you factor in the size of the indexed type, both the threshold at which an index makes the pointer wrap around the end of the address space and the number of possible wrapping scenarios get you in trouble. Thus, the only safe way is to compare indices, not pointers.

For example, suppose the compiler didn't do this particular optimization, and you had written the following code on a machine for which sizeof(int)==4:

extern void foo(int*);

void b0rken(unsigned int len)
{
    int mybuf[BUFLEN];
    int *mybuf_end = mybuf + BUFLEN;
    int i;

    if (mybuf + len < mybuf_end && mybuf + len >= mybuf)
        for (i = 0; i < len; i++)
            mybuf[i] = 42;

    foo(mybuf);  /* prevent dead code elimination */
}

What happens when you pass in 0x40000001 for len? The computed pointer for mybuf + len will be equal to &mybuf[1], because under the hood, the compiler multiplies len by sizeof(int), giving 0x00000004 after truncating to 32 bits. How many times does the loop iterate, though?

This happens regardless of whether the compiler optimizes away the second test.
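The same experiment can be run safely by modeling the addresses as 32-bit unsigned integers. This is only a sketch: the function names and the base address passed in are hypothetical, and it assumes a flat 32-bit address space where pointer arithmetic wraps silently.

```c
#include <stdint.h>

/* Models the test in b0rken() on a 32-bit machine with sizeof(int) == 4.
 * Addresses are plain uint32_t values so the wraparound is well defined. */
int pointer_test_passes(uint32_t mybuf, uint32_t buflen, uint32_t len)
{
    uint32_t mybuf_end = mybuf + 4u * buflen;
    uint32_t computed  = mybuf + 4u * len;   /* 4*len truncated to 32 bits */

    return computed < mybuf_end && computed >= mybuf;
}

/* The index-based test, for comparison. */
int index_test_passes(uint32_t buflen, uint32_t len)
{
    return len < buflen;
}
```

For an assumed buffer of 16 ints at address 0x1000, pointer_test_passes(0x1000, 16, 0x40000001) returns 1 while index_test_passes(16, 0x40000001) returns 0. The pointer test waves the call through, and the loop then runs len times, not once.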

Now, there is a chance a different optimization saves your butt here. If the compiler does common subexpression elimination and some amount of forward substitution, it may rewrite that if statement's first test as follows. (Oversimplified, but hopefully it gives you an idea of what the compiler can do to/for you.)

Step 0: Represent pointer arithmetic as bare arithmetic (internally)

    mybuf + 4*len < mybuf_end

Step 1: Forward substitution.

    mybuf + 4*len < mybuf + 4*BUFLEN

Step 2: Remove common terms from both sides. (Assumes no integer overflow with pointer arithmetic--what started this conversation to begin with.)

    4*len < 4*BUFLEN

Step 3: Strength reduction. (Again, assumes no integer overflow with pointer arithmetic.)

    len < BUFLEN

That gets us back to what the test should have been to begin with. There's no guarantee that the compiler will do all of this though. For example, GCC 4.2.0 doesn't seem to optimize the test in this way. I just tried the Code Composer Studio compiler for C64x (v6.0.18), and it seems to do up through step 2 if the array is on the local stack, thereby changing, but not eliminating the bug.

It turns out GCC does yet a different optimization that changes the way in which the bug might manifest. It eliminates the original loop counter entirely, and instead transforms the loop effectively into:

    int *ptr;

    for (ptr = mybuf; ptr != mybuf + len; ptr++)
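        *ptr = 42;

This has different safety characteristics again. For the 0x40000001 case above, if mybuf + len wraps around to &mybuf[1], this loop stops after a single store, whereas the original index-based loop would run len times.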

Moral of the story? To avoid buffer overflows, compare indices, not computed pointers.
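As a minimal sketch of what that looks like, here is an index-checked variant of the fill loop. The name fill_checked, its signature, and its return convention are assumptions made for illustration, not anything from the code above.

```c
#include <stddef.h>

/* Hypothetical helper: fills the first len elements of buf with 42.
 * Returns 0 on success, -1 if len is rejected. Only indices are
 * compared, so no out-of-range pointer is ever computed. */
int fill_checked(int *buf, size_t buflen, size_t len)
{
    size_t i;

    /* Mirrors the len < BUFLEN test; allowing len == buflen
     * would also be safe for this loop. */
    if (len >= buflen)
        return -1;

    for (i = 0; i < len; i++)
        buf[i] = 42;

    return 0;
}
```

Called as fill_checked(mybuf, BUFLEN, len), a len of 0x40000001 is rejected outright no matter where mybuf sits in memory or how large the element type is.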
