The ups and downs of strlcpy()

Posted Jul 20, 2012 8:06 UTC (Fri) by renox (subscriber, #23785)
In reply to: The ups and downs of strlcpy() by cmccabe
Parent article: The ups and downs of strlcpy()

> Please. You really think that aborting the program is the right behavior when a string is too long?

There is a "fail fast" design that aborts as early as possible and lets the parent process handle the error: Erlang programs tend to act like this.

So aborting early is a reasonable design choice (Rust will be like this too), and it can be used in C too. Being snarky only makes you look foolish or ignorant.



The ups and downs of strlcpy()

Posted Jul 23, 2012 2:19 UTC (Mon) by cmccabe (guest, #60281) [Link]

Calling exit() because of a routine error is very bad form in C. For one thing, it makes your code impossible to use in a library.

From https://git.kernel.org/?p=linux/kernel/git/kay/libabc.git;...
> Never call exit(), abort(), be very careful with assert()
> - Always return error codes.
> - Libraries need to be safe for usage in critical processes that
> need to recover from errors instead of getting killed (think PID 1!).

Similar coding standards exist for Java. In fact the findbugs static code checker will flag calls to System.exit as problems.

Exiting an "erlang process" (really a green thread) doesn't terminate the whole application. I'm pretty sure you know this, so I don't know why you're bringing it up. You are comparing apples and oranges.

The ups and downs of strlcpy()

Posted Jul 23, 2012 2:24 UTC (Mon) by quotemstr (subscriber, #45331) [Link]

There's a difference between runtime errors and logic errors. The former are things that can go wrong for reasons outside the program's control. These should be reported in a way that allows recovery, and for these errors, exiting the program is inappropriate. The latter class of error always indicates a problem in the structure of the program, and the safest way to react to them is to abort the program. The idea behind strcpy_s is that an overlong string that makes it as far as strcpy_s represents a logic error in the program, and that there is no sensible way to continue past that point. If a program receives an untrusted string of unknown length as input, the program should first check the string's length, reject it with an actionable error if too long, and only then pass it to a lower layer that might use strcpy_s. strcpy_s should be used only on strings that _should_ be valid according to the programmer's mental model of the program. The function exists because it's easy to get these models subtly wrong.
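
The layering described here could be sketched roughly as follows. This is only an illustration under some assumptions: strcpy_s belongs to C11's optional Annex K and glibc does not provide it, so copy_or_abort() is a hypothetical stand-in, and set_name() and NAME_MAX_LEN are made-up names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32   /* hypothetical limit, for illustration only */

/* Lower layer: an overlong string reaching this point is a logic
   error, so abort. Hypothetical stand-in for strcpy_s(). */
static void copy_or_abort(char *dst, size_t dstsize, const char *src)
{
  size_t len = strlen(src);
  if (len >= dstsize) {
    fprintf(stderr, "copy_or_abort: %zu bytes do not fit in %zu\n",
            len + 1, dstsize);
    abort();
  }
  memcpy(dst, src, len + 1);   /* copies the terminating NUL too */
}

/* Boundary layer: an overlong untrusted string is a runtime error,
   so report it to the caller instead of dying. Returns 0 or -1. */
static int set_name(char name[NAME_MAX_LEN], const char *untrusted)
{
  if (strlen(untrusted) >= NAME_MAX_LEN)
    return -1;
  copy_or_abort(name, NAME_MAX_LEN, untrusted);
  return 0;
}
```

With this split, set_name() rejects bad input with an error the caller can act on, and copy_or_abort() only fires if the check above it was forgotten or wrong.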

The ups and downs of strlcpy()

Posted Jul 23, 2012 7:53 UTC (Mon) by renox (subscriber, #23785) [Link]

So? I didn't claim that the 'fail fast' design applies to all cases, just that it can be a reasonable way to implement an application.

> I don't know why you're bringing it up. You are comparing apples and oranges.

Nope. Many programs use the 'fail fast' design (a large share of Erlang programs do), and what is useful in Erlang can be useful in C.

The ups and downs of strlcpy()

Posted Jul 23, 2012 12:11 UTC (Mon) by nix (subscriber, #2304) [Link]

Quite. Not only is this useful for logic errors, it's useful for runtime error paths that are almost impossible to test and that it is nearly impossible to continue execution past.

The classic example of this is OOM. I would divide this into two subsets: if you're writing a library routine whose primary purpose is memory allocation and that can be expected to allocate a lot (e.g. a data structure's initialization function), and it runs out of memory, then by all means free what you allocated and return NULL. But if you're writing a library routine whose primary purpose is something else, and recovery from OOM is going to be tricky, then just exit() (and document this policy, of course). Your caller is unlikely to be able to do anything much on OOM anyway, exiting will free up memory at once, and if your caller is desperate to clean up or even jump out and keep going after freeing up memory, that's what atexit() is for.

But, be honest, your caller isn't going to jump away and keep going on OOM, your caller will just die: anything else is too hard to test properly. If you're lucky your caller might arrange to clean up in atexit() handlers, though I note the X server never does this and just appears to *hope* that none of its libraries exit on OOM. But perhaps this is because you can't even rely on cleaning up in atexit() handlers, because if you happen to OOM in a stack allocation the kernel is just going to kill you. So it doesn't matter if you have lots of complex cleanup-and-continue OOM code, you have to cater for an immediate exit without cleanup *anyway*. And you can't avoid this merely by not using malloc(): you have to not call functions either, at least not without 'preallocating' stack space by doing a deep recursion in advance. A few programs actually do this, but it's rare.

There is one thing I wish we could get, but is really hard to do properly -- an automatic backtrace on OOM, so we could tell roughly which allocation was failing and why. Unfortunately on most platforms that requires a modicum of debugging information for everything, and that's huge and not loaded by default, even if it's present, so you'd be unlikely to be able to consult it at OOM time anyway.
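
A minimal sketch of the exit-on-OOM policy described in this comment, for a routine whose main job is not allocation. The names xmalloc(), cleanup() and install_cleanup() are hypothetical, and a real library would also have to document that it behaves this way.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate or exit. Using exit() rather than
   abort() means the caller's atexit() handlers still get to run. */
static void *xmalloc(size_t size)
{
  void *p = malloc(size ? size : 1);   /* avoid the malloc(0) corner case */
  if (p == NULL) {
    fputs("out of memory\n", stderr);
    exit(EXIT_FAILURE);
  }
  return p;
}

/* Hypothetical caller-side cleanup, registered once at startup. */
static int cleanup_ran;

static void cleanup(void)
{
  cleanup_ran = 1;   /* a real handler would release external resources */
}

static int install_cleanup(void)
{
  return atexit(cleanup);   /* 0 on success */
}
```

As the comment notes, none of this helps when the failure is in stack growth: the kernel kills the process and no handler runs.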

The ups and downs of strlcpy()

Posted Jul 23, 2012 13:21 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

You can produce backtrace without symbol names and lookup symbol names later. Windows mini-dumps certainly allow this, though it's sometimes tricky to keep all the required debugging information at hand.

The ups and downs of strlcpy()

Posted Jul 24, 2012 17:16 UTC (Tue) by nix (subscriber, #2304) [Link]

Producing a backtrace of any sort without frame pointers is very hard, and x86-64 GCC disables them by default (as does x86-32 modern GCC). To produce backtraces on such systems, you need DWARF debugging info -- though perhaps the exception frame section would serve the same purpose, though of course it too is not loaded by default. I suppose you could write the entire stack to disk on OOM (as long as it's not too big -- coredumping may be much harder, as if you're out of memory the full coredump is likely to be huge).

The ups and downs of strlcpy()

Posted Jul 24, 2012 18:16 UTC (Tue) by renox (subscriber, #23785) [Link]

> x86-64 GCC disables them by default (as does x86-32 modern GCC)

I wonder why the GCC developers chose this default behaviour; x86-64 isn't register starved like x86-32.

The ups and downs of strlcpy()

Posted Jul 24, 2012 23:19 UTC (Tue) by nix (subscriber, #2304) [Link]

Because the ABI allows it, because it still provides a performance improvement (somewhere between 1% and 5%, not insignificant, though well below the 8--12% I've seen reported for x86-32), and because it's useless -- everything from GDB through libgcj and now I find even glibc backtrace() uses the DWARF unwinder tables instead. Why maintain a 'feature' which costs a register and adds runtime overhead to every function call when nobody needs it?

The ups and downs of strlcpy()

Posted Jul 25, 2012 15:11 UTC (Wed) by paulj (subscriber, #341) [Link]

One reason is debugging stack corruption, where normal tools may not give meaningful backtraces. With frame pointers, you can easily figure out where earlier, uncorrupted, frames really are, and figure out why it crashed.

The ups and downs of strlcpy()

Posted Jul 25, 2012 17:17 UTC (Wed) by nix (subscriber, #2304) [Link]

Yep. That's why frame pointers should be *enableable*. It doesn't mean they should be on by default.

The ups and downs of strlcpy()

Posted Jul 24, 2012 19:48 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

Windows does both. It uses the FS register pointing to thread information block to track frame pointers (since Windows uses SJLJ exceptions) and it also does crashdumps that contain offending threads' stacks. Minidumps are usually quite small.

The ups and downs of strlcpy()

Posted Jul 29, 2012 15:34 UTC (Sun) by pbonzini (subscriber, #60935) [Link]

Exception handler information pointed to by FS only tracks a (possibly very small) subset of frames, since most frames do not install an exception handler.

The ups and downs of strlcpy()

Posted Jul 24, 2012 20:37 UTC (Tue) by foom (subscriber, #14868) [Link]

The eh_frame section *is* loaded by default. And the backtrace function in glibc uses it, even.

So, no, it's not hard to produce a backtrace. You just call the function.

The ups and downs of strlcpy()

Posted Jul 24, 2012 23:10 UTC (Tue) by nix (subscriber, #2304) [Link]

Sorry, I misspoke, was thinking of other DWARF sections. .eh_frame is loadable, but is, of course, like all ELF sections, *mapped* in. In a severe OOM situation, it's very likely that you won't be able to map more pages in, and that not much of it is going to be mapped in at any given time.

(I must have been unlucky with backtrace(). It's never been willing to do more than coredump for me without frame pointers. Mind you I haven't tried it for years because I was so sure it was broken :) time to try it out in my next debugging blitz.)

The ups and downs of strlcpy()

Posted Jul 25, 2012 2:19 UTC (Wed) by quotemstr (subscriber, #45331) [Link]

> In a severe OOM situation, it's very likely that you won't be able to map more pages in, and that not much of it is going to be mapped in at any given time.

If you run your system in a sane configuration, the kernel should have swap somewhere it can use to evict the pages it'll need to hardfault the pages from the unwind section. It might be slow, but it'll work. If you run overcommit, there's no such guarantee, but if you run overcommit, why the hell are you complaining about OOM behavior?

The ups and downs of strlcpy()

Posted Jul 25, 2012 13:45 UTC (Wed) by nix (subscriber, #2304) [Link]

If you have swap free somewhere, you are not in an OOM situation. OOM happens when you're out of RAM *and* swap (or when you hit your RLIMIT_AS boundaries, I suppose).

If you are in overcommit state 2 with an overcommit ratio of 0, you may be right, but neither of these are the default -- and even then, I don't believe Linux reserves swap pages for every in-memory page, like Solaris does (a good thing too, it's fantastically annoying as even very-much-not-OOM systems can find fork() failing because there's not enough swap to back every page that might get dirtied in the new address space, even if it's only going to exec() and throw them all away).

The ups and downs of strlcpy()

Posted Jul 27, 2012 19:44 UTC (Fri) by quotemstr (subscriber, #45331) [Link]

> OOM happens when you're out of RAM *and* swap

OOM happens when you're out of commit. If you're doing it right, you paid for the commit for the pages you'll need when you loaded the image, so making the backtrace tables resident should still be possible.

> a good thing too, it's fantastically annoying as even very-much-not-OOM systems can find fork() failing because there's not enough swap to back every page that might get dirtied in the new address space, even if it's only going to exec() and throw them all away

I disagree: strict commit accounting makes a system more predictable in practice. If you find fork failing, you should either add more swap (which won't actually get used, as you note, except in the worst case) or change your program to use vfork or posix_spawn instead, both of which don't have the intrinsic commit-accounting problems of fork.
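
For reference, a small sketch of what replacing a plain fork()/exec() pair with posix_spawn might look like. The helper name run_and_wait() is hypothetical, and whether this actually sidesteps the commit accounting depends on how the C library implements posix_spawn() underneath.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run argv[0] (looked up in PATH) and wait for it.
   Returns the child's exit status, or -1 if it could not be spawned or
   did not exit normally. */
static int run_and_wait(char *const argv[])
{
  pid_t pid;
  int status;

  /* No file actions and no spawn attributes: the simplest case. */
  if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
    return -1;

  while (waitpid(pid, &status, 0) < 0) {
    if (errno != EINTR)
      return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything the child needs done before exec (redirections, signal masks and so on) has to be expressed through the file-actions and attributes arguments, which are left NULL here.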

The ups and downs of strlcpy()

Posted Jul 28, 2012 10:41 UTC (Sat) by nix (subscriber, #2304) [Link]

> If you find fork failing, you should either add more swap (which won't actually get used, as you note, except in the worst case) or change your program to use vfork or posix_spawn instead, both of which don't have the intrinsic commit-accounting problems of fork.

Right. So I'm a mere user on a system with 250 users. fork() is failing in my Emacs so I can't start a shell (Emacs is much bigger than a shell). And your proposal for fixing this awful user interface failure is either to beg the sysadmin to add swap (I did, he said no, of course turning overcommit off was out of the question as this machine was running a database, never mind that it was a test instance that nobody was using, also it was 'like Solaris does it' and he liked Solaris) or spend time hacking at Emacs and every other program that uses fork()/exec() -- i.e. nearly everything in Unix -- so it no longer does?! This despite the fact that vfork() cannot do many of the things you do between a fork() and exec(), and posix_spawn() cannot do any of them unless the developer of posix_spawn() thought of it, hence the appallingly insane complexity of the interface? And this on a machine with almost no memory left? And this when I'm supposed to be getting something else done?

Your former proposal betrays your single-user roots. Your latter proposal betrays your ignorance of what makes fork()/exec() better than the Windows model in the first place. Neither is at all times practical: the latter in particular is absolutely crackpot.

Thank goodness I can turn overcommit off on my own systems.

The ups and downs of strlcpy()

Posted Jul 28, 2012 10:43 UTC (Sat) by nix (subscriber, #2304) [Link]

The reason why my rant above sounds terribly specific is that this scenario actually happened to me. And kept happening to me, every week or so, for *years*, costing me perhaps time begging people to close other jobs down each time.

Needless to say the thought of rewriting (then X)Emacs's ferociously complex subprocess-handling infrastructure to use posix_spawn() never crossed my mind. (I tried vfork(), but that was clearly out of the question.)

The ups and downs of strlcpy()

Posted Jul 31, 2012 1:30 UTC (Tue) by khc (guest, #45209) [Link]

nevermind that posix_spawn() uses fork/exec on linux anyway

The ups and downs of strlcpy()

Posted Jul 31, 2012 23:38 UTC (Tue) by nix (subscriber, #2304) [Link]

True, so the underlying overcommit problem isn't actually fixed by it, except inasmuch as it sometimes falls back to vfork() for you. It just makes your software much much uglier, and makes it work better on major platforms such as MMU-less embedded systems, the Hurd, and Cygwin.

The ups and downs of strlcpy()

Posted Aug 1, 2012 15:53 UTC (Wed) by quotemstr (subscriber, #45331) [Link]

Emacs already uses vfork if it's available. (Read the source.) Perhaps something else was wrong with that system.

The ups and downs of strlcpy()

Posted Jul 29, 2012 2:18 UTC (Sun) by foom (subscriber, #14868) [Link]

fork() is pretty evil, especially now that we have multi-threaded programs.

It would be pretty cool if you could spawn an empty process in a stopped state, and then poke at it from the parent for a bit (open up new file descriptors/etc) before causing it to exec a real subprocess.

Doing things that way would avoid all the memory accounting issues, the performance issue of copying the page table for no good reason, and the significant complication of not actually being allowed to do anything that's not async-signal-handler-safe between fork() and exec(). (And nearly nothing actually falls into that category!)

The ups and downs of strlcpy()

Posted Jul 29, 2012 13:26 UTC (Sun) by nix (subscriber, #2304) [Link]

> It would be pretty cool if you could spawn an empty process in a stopped state, and then poke at it from the parent for a bit (open up new file descriptors/etc) before causing it to exec a real subprocess.

You can do that with PTRACE_O_TRACEFORK or PTRACE_O_TRACEEXEC, but as with anything involving ptrace() there are so many tentacles that virtually any alternative is preferable.

The ups and downs of strlcpy()

Posted Jul 30, 2012 1:51 UTC (Mon) by foom (subscriber, #14868) [Link]

Apparently not *any* alternative, or a new userspace API would have been merged upstream by now. :)

The ups and downs of strlcpy()

Posted Jul 30, 2012 8:46 UTC (Mon) by nix (subscriber, #2304) [Link]

Yeah, true. But if ptrace() was something everyone had to use, a replacement would have been merged by now, because ptrace() is just so odious in so very many ways. (Though the improvements in recent kernels have been substantial, and in maybe as few as five to ten years I'll be able to rely on them enough to actually use them in real software, which these days means "meant to be portable between Linux distros, including the dinosaur-era RHELs too many people insist on running their bleeding-edge software on". sigh.)

The ups and downs of strlcpy()

Posted Aug 1, 2012 15:50 UTC (Wed) by quotemstr (subscriber, #45331) [Link]

> Right. So I'm a mere user on a system with 250 users

That's a rare edge case these days, like it or not. If you do regularly use such a system, it's the administrator's job to make sure system resources are adequate. The kernel is there to accurately account for system resources, not work around your sysadmin's snobbery.

> This despite the fact that vfork() cannot do many of the things you do between a fork() and exec()

Such as?

> hence the appallingly insane complexity of the interface

I don't think the interface is particularly complex. It's less complex than pthreads, certainly.

> the latter in particular is absolutely crackpot.

Do you really need to make it personal?

> Thank goodness I can turn overcommit off on my own systems

I think you mean "on".

The ups and downs of strlcpy()

Posted Jul 25, 2012 15:40 UTC (Wed) by mmorrow (guest, #83845) [Link]

Backtracing on x86_64 is actually quite reasonable. Here are two methods:
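#if defined(USE_BACKTRACE)
/*
  -DUSE_BACKTRACE -rdynamic
  (-rdynamic for backtrace_symbols)
*/
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
void print_trace(void)
{
  const size_t n = 10;
  void *array[n];
  /* backtrace() returns the number of frames actually stored */
  int size = backtrace(array,n);
  /* backtrace_symbols() malloc()s its result and may return NULL */
  char **strings = backtrace_symbols(array,size);
  if(strings == NULL)
    return;
  for(int i = 0; i < size; i++)
    fprintf(stderr,"%s\n",strings[i]);
  free(strings);
}
#elif defined(USE_LIBUNWIND)
/*
  -DUSE_LIBUNWIND -lunwind-x86_64
*/
#include <libunwind.h>
#include <stdio.h>
void print_trace(void)
{
  unw_cursor_t cur;
  unw_context_t cxt;
  unw_getcontext(&cxt);
  unw_init_local(&cur,&cxt);
  /* walk outwards, one caller frame per step */
  while(unw_step(&cur) > 0)
  {
    unw_word_t off = 0, pc = 0;
    char fname[64] = {[0] = '\0'};
    unw_get_reg(&cur,UNW_REG_IP,&pc);
    unw_get_proc_name(&cur,fname,sizeof(fname),&off);
    /* unw_word_t is an integer type, so print it as one rather than with %p */
    printf("0x%llx: (%s+0x%llx) [0x%llx]\n",(unsigned long long)pc,fname,
           (unsigned long long)off,(unsigned long long)pc);
  }
}
#endif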

The ups and downs of strlcpy()

Posted Jul 25, 2012 15:55 UTC (Wed) by mmorrow (guest, #83845) [Link]

9c9
<   const size_t n = 10
---
>   const size_t n = 10;

The ups and downs of strlcpy()

Posted Jul 24, 2012 6:37 UTC (Tue) by kleptog (subscriber, #1183) [Link]

There are programs that attempt to recover from OOM, PostgreSQL for example. It has a pre-allocated area which it uses to create the error message to send to the client and the rip-cord allocator will quickly release any memory allocated to the current query context.

It's not perfect of course, if the client is using SSL then you have to rely on the SSL library to not do anything silly but it's worked every time for me. On the client you get a nice message along the lines of "server ran out of memory". Your transaction is aborted, but the rest of the server is still running.

This obviously only works if malloc() returns NULL, so memory overcommit needs to be off. OOM during stack growth is uncatchable; you can only try to mitigate the risk by keeping your stack small.

I just wanted to point out that it is possible to create code that handles OOM, and it's not helpful if libraries assume they can just die in that case.
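
The pre-allocated-area idea can be illustrated with a much simpler sketch than what PostgreSQL actually does. Everything below (emergency_msg, alloc_or_report()) is hypothetical and only shows the principle: the buffer for the error message is reserved when the program is loaded, so reporting the failure does not itself need to allocate. It assumes snprintf() needs no heap for this simple format.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reserved at load time, so it is available even when malloc() fails. */
static char emergency_msg[128];

/* Hypothetical helper: returns the allocation on success and sets *err
   to NULL; on failure returns NULL and points *err at a message held
   in the pre-allocated buffer. */
static void *alloc_or_report(size_t size, const char **err)
{
  void *p = malloc(size);
  if (p == NULL) {
    snprintf(emergency_msg, sizeof emergency_msg,
             "out of memory (request of %zu bytes failed)", size);
    *err = emergency_msg;
    return NULL;
  }
  *err = NULL;
  return p;
}
```

The caller can then send the message to the client, release whatever belongs to the current request, and carry on serving others.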


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds