LWN.net Logo

Glibc change exposing bugs

Glibc change exposing bugs

Posted Nov 11, 2010 23:03 UTC (Thu) by dafid_b (guest, #67424)
Parent article: Glibc change exposing bugs

I made an earlier comment at the wrong level - both technically - to the wrong part of the discussion, and also from the purely selfish perspective that the change in Glibc behaviour could break existing systems - potentially costing me my cash.

An earlier suggestion was to replace the change with a API violation detector that causes programs to crash rather than corrupt their state.

This is better than a silent corruption - but still antisocial. A program that used to work now fails. Not everyone can fix the cause of the crash. Not all software is unimportant to the user. It is a bit like suggesting that when you see a j-walker you should just mug the person next to them, as a deterrent for future jay-walkers.

I think it is fine that testing releases use the crash-bad-behaving applications change.

But the system released to users should provide the previously working Glibc.

And the Glibc developers should listen to Linus.

Cause, I really would prefer my software to work.

Selfish? Yes.


(Log in to post comments)

Glibc change exposing bugs

Posted Nov 12, 2010 0:15 UTC (Fri) by xilun (subscriber, #50638) [Link]

> Cause, I really would prefer my software to work.

Then do some system level tests.

Glibc maintainers are not responsible for your system integration and QA.

> Selfish? Yes.

Indeed.
But considering Glibc maintainers are not generous to the point the will do your system integration and QA, we have an impedance mismatch here, and they will just do as they want, which in the end seems quite logical.

Glibc change exposing bugs

Posted Nov 12, 2010 1:51 UTC (Fri) by dafid_b (guest, #67424) [Link]

Yes - More on that system level testing..

I would like to that in my spare time - not many hours - and leverage my subsequent relaxation browsing time...

What I am thinking is based on rough understanding, so please pass along any hints.

My idea is to provide a memcpy() that can safely be used in any application with minimal changes to the software behaviour, and yet provide logging of bad usage for proactive corrections.
It is ok if the system is slower.. but it should still work.

I don't really know how to do the logging as it should:
* not interfere with threads, signals etc
* be always available

The memcpy() is pretty simple..
A replacement memcpy() based on combination of memove() to test the parameters and the old memcpy() to provide the implementation for stability of my software.
When a memcpy() is made with bad args, that would normally invoke the special logic an error is logged by PID? and the old memcpy() still invoked to deliver the vanilla experience.

The replacement sounds like building a patched Glibc.

Any suggestions or hints for how to do the logging would be appreciated.

Glibc change exposing bugs

Posted Nov 12, 2010 5:00 UTC (Fri) by dafid_b (guest, #67424) [Link]

i hacked up something, around shared memory, which seems most likely to me (based on little experience) to be safe..

If this approach is reasonable.. then just need to link it into memcpy as outlined above to have a trace of last few hundred errors on the system in shared memory waiting to dumped...

Thoughts?

$ dd of=/tmp/data if=/dev/zero count=16

$ g++ test.cpp

$ ./a.out /tmp/data
ret=0x80489de, dest=0x976d008, src=0x8048b4d, len=4, pid=205b

$ cat test.cpp
#include <sys/types.h>
#include <unistd.h>
#include <sys/mman.h>
#include <syscall.h>
#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/mman.h>
#include <syscall.h>
#include <errno.h>
#include <string.h>

#define SHARED_MEM_SIZE 8192

struct data // trace data for a memcpy() error
{
pid_t pid ;
const void *ret ;
const void *dest ;
const void *src ;
size_t len ;
} ;

struct log
{
long int index ; // sequence or index of last entry used
struct data entry[ SHARED_MEM_SIZE/sizeof(struct data) - 1 ] ; // vector of logging instances
} ;
struct log *pLog ;
#define N_ENTRIES (sizeof(pLog->entry)/sizeof(pLog->entry[0]))

void
displayLog(int i)
{
int j = i % N_ENTRIES ; // restrict index.

fprintf(stderr, "ret=%p, dest=%p, src=%p, len=%ld, pid=%lx\n",
pLog->entry[j].ret,
pLog->entry[j].dest,
pLog->entry[j].src,
(long)pLog->entry[j].len,
(unsigned long)pLog->entry[j].pid
) ;

}

void
capture(const void *dest, const void *src, size_t n)
{
int newLoc, oldLoc ;
do {
oldLoc = pLog->index ;
newLoc = pLog->index + 1 ;
} while ( __sync_bool_compare_and_swap( pLog->index, newLoc, oldLoc ) ) ;

int j = newLoc % N_ENTRIES ; // restrict index.
pLog->entry[j].ret = __builtin_return_address(0) ; //__builtin_extract_return_address(ra) ;
pLog->entry[j].dest = dest ;
pLog->entry[j].src = src ;
pLog->entry[j].len = n ;
pLog->entry[j].pid = getpid() ;
}

void *
setup(char *av)
{
void * p = NULL ;
int fd ;

fd = open(av, O_RDWR, 0x777) ; // open the file.
if (fd < 0)
{
fprintf(stderr, "Failed to open file /%s/ errno=%d\n", av, errno) ;
return 0 ;
}
p = mmap(0, SHARED_MEM_SIZE, PROT_WRITE|PROT_READ, MAP_SHARED, fd, 0) ;
if (p == 0)
{
fprintf(stderr, "Failed to mmap file /%s/ errno=%d\n", av, errno) ;
close(fd) ;
return 0 ;
}
// have mapping in p of SHARED_MEM_SIZE bytes
fprintf(stderr, "mapped %s to %p on fd %d\n", av, p, fd) ;

return p ;
}

int main(int ac, char **av)
{
int fd ;

if (ac >1)
{
pLog = (struct log*)setup(av[1]) ;
}
else
fprintf(stderr, "map <file>\n") ;

if (pLog)
{
capture(strdup("pete"), "joe", 4 ) ;
displayLog(pLog->index) ;
}

return 0 ;
}

Glibc change exposing bugs

Posted Nov 12, 2010 6:44 UTC (Fri) by cmccabe (guest, #60281) [Link]

> fd = open(av, O_RDWR, 0x777)

This is not correct. You want

> fd = open(av, O_RDWR, 0777)

Yes, it's an octal constant. Or use the symbolic constants.

Also, this is more of a personal preference thing, but bumpyCaps and hungarian notation are frowned on by most.

In a larger sense, I think you don't want to rebuild glibc. You probably just want to use "the LD_PRELOAD trick"

If I were you, I would print my nastygrams to syslog, using the syslog(3) function. Most sysadmins don't check random areas of shared memory that often. If you do choose to use shm, try shm_open.

cheers,
C.

Glibc change exposing bugs

Posted Nov 12, 2010 9:09 UTC (Fri) by dafid_b (guest, #67424) [Link]

Thanks for the code review, much appreciated.

Is syslog() safe to call at this point?
It generates formatted output, which seems like it could itself call memcpy() or do other stuff in te library that the app did not allow for in its plan when it called memcpy.
Also is the system call that sends the message to the log safe, or can it have side effect such as signals and new error codes in errno?

I would be very happy if the answer to the above is: syslog() safe to call like this with no side-effects.

Glibc change exposing bugs

Posted Nov 13, 2010 3:02 UTC (Sat) by cmccabe (guest, #60281) [Link]

> Is syslog() safe to call at this point?
> It generates formatted output, which seems like it could itself call
> memcpy() or do other stuff in te library that the app did not allow for in
> its plan when it called memcpy

You raise a good issue. glibc's version of syslog is known to call malloc sometimes, which means that you shouldn't use it from within a signal handler. Surprisingly, memcpy isn't on the official list of "async-signal safe" functions, so you could argue that such an implementation would be POSIX conforming :)

But seriously. I think the best thing to do is probably implement your own version of syslog with no memory allocations or calls to memcpy. It's pretty easy to do in a few hundred lines. I had to do it before when writing a good signal handler.

C.

Glibc change exposing bugs

Posted Nov 12, 2010 18:21 UTC (Fri) by chad.netzer (guest, #4257) [Link]

Valgrind already does this, and MUCH more:

http://valgrind.org/docs/manual/mc-manual.html#mc-manual....

Not to deprive you of the experience of doing it yourself, which can be instructive. However, you should need to reinvent the wheel if all you want is the use of the tool. At the very least, you can see how valgrind does it. As for automatically invoking it, well that's an exercise for the reader. :)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds