Adding strlcpy() (and the related strlcat())
has been a perennial request for the
GNU C library (glibc),
commonly supported by a statement that strlcpy()
is superior to the existing alternatives.
Perhaps the earliest request to add these
BSD-derived functions to glibc took the form of
a patch submitted in 2000
by a fresh-faced Christoph Hellwig.
Christoph's request was rejected, and subsequent requests have similarly been rejected (or ignored).
It's instructive to consider the reasons why
strlcpy() has so far been rejected,
and why it may well not make its way into glibc in the future.
A little prehistory
In the days before programmers considered that someone else might want to
deliberately subvert their code, the C library provided just:
char *strcpy(char *dst, const char *src);
with the simple purpose of copying the bytes from the
string pointed to by src
(up to and including the terminating null byte)
to the buffer pointed to by dst.
Naturally, when calling strcpy(),
the programmer must take care that the bytes being
copied don't overrun the
space available in the buffer pointed to by dst.
The effect of such buffer overruns
is to overwrite other parts of a process's memory,
with the most common result being to corrupt data
or to crash the program.
If the programmer can with 100% certainty predict
at compile time the size of the src string,
then it's possible (if unwise) to preallocate a suitably sized
dst buffer and omit any argument checks before calling strcpy().
In all other cases, the call should be guarded with a suitable
if statement to check the size of its argument.
However, strings (in the form of input text)
are one of the ways that humans interact with computers,
and thus quite commonly the size of the
string is controlled by the user of a program, not the program's creator.
At that point, of course, it becomes essential for every call to
strcpy() to be guarded by a suitable if statement:
char dst[DST_SIZE];
if (strlen(src) < DST_SIZE)
    strcpy(dst, src);
(The use of < rather than <=
ensures that there's at least one extra byte available for the null terminator.)
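One way to make the guard harder to forget is to fold it into a small helper. The following is a minimal sketch only; the name checked_strcpy() and its return convention are hypothetical and not part of any C library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits entirely,
   including the terminating null byte. Returns 0 on success, or -1
   (leaving dst untouched) if src is too long for the buffer. */
int checked_strcpy(char *dst, size_t dst_size, const char *src)
{
    if (strlen(src) < dst_size) {   /* '<' leaves room for the null byte */
        strcpy(dst, src);
        return 0;
    }
    return -1;
}
```

The caller still has to decide what to do when the helper reports failure; the sketch only guarantees that no overrun occurs.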
But it was easy for programmers to omit such checks
if they were forgetful, inattentive, or cowboys.
And later, other more attentive programmers realized that by
carefully controlling what was written into the overflowed buffer,
and overrunning into more exotic places such as
call return addresses stored on the stack,
they could do
more interesting things
with buffer overruns than simply crashing the program.
(And because code tends to live a long time,
and the individual programmers creating it can be slow
to learn about the sharp edges of the tools they use,
even today buffer overruns remain one of the most
commonly reported vulnerabilities.)
Improving on strcpy()
Prechecking the arguments of each call to strcpy() is burdensome.
A seemingly obvious way to relieve the programmer
of that task was to add an API
that allowed the caller to inform the library
function of the size of the target buffer:
char *strncpy(char *dst, const char *src, size_t n);
The strncpy() function is like strcpy(),
but copies at most n bytes from src to dst.
As long as n
does not exceed the space allocated in dst,
a buffer overrun can never occur.
Although choosing a suitable value for n
ensures that strncpy() will never overrun dst,
it turns out that strncpy()
has problems of its own.
In particular, if there is no null terminator in the first n bytes of src,
then strncpy() does not place a null terminator after the bytes copied to dst.
If the programmer does not check for this event,
and subsequent operations expect a null terminator to be present,
then the program is once more vulnerable to attack.
The vulnerability may be more difficult to exploit than a buffer overflow,
but the security implications can be just as severe.
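The missing terminator is easy to observe. The sketch below assumes two hypothetical helpers: is_terminated(), which reports whether a buffer contains a null byte, and strncpy_terminated(), which shows the common idiom of terminating the buffer by hand after strncpy().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the first n bytes of buf
   contain a null byte, else 0. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* Hypothetical wrapper: strncpy() followed by explicit termination.
   Requires dst_size > 0. Note that it truncates silently when src
   does not fit. */
void strncpy_terminated(char *dst, const char *src, size_t dst_size)
{
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

Calling strncpy(dst, "abcdef", 4) on a four-byte buffer fills all four bytes with "abcd" and writes no null byte, so is_terminated(dst, 4) returns 0; the wrapper instead leaves "abc" followed by a terminator.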
One iteration of API design didn't solve the problems, but perhaps a further one can…
size_t strlcpy(char *dst, const char *src, size_t size);
strlcpy() is similar to strncpy(),
but copies at most size-1 bytes from src to dst,
and always adds a null terminator following the bytes copied to dst.
Thus, strlcpy() avoids buffer overruns and ensures that the output string is null terminated.
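To make these semantics concrete, here is a minimal sketch of a function that behaves as described. It is written under the hypothetical name sketch_strlcpy() to avoid clashing with any system-provided strlcpy(), and it is not taken from the BSD sources.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of strlcpy() semantics under a hypothetical name. Copies at
   most size - 1 bytes from src to dst and null-terminates dst whenever
   size > 0. Returns strlen(src), that is, the length the result would
   have had if no truncation had occurred. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);

    if (size > 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A return value greater than or equal to size tells the caller that the copy was truncated.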
So why have the glibc maintainers obstinately refused to accept it?
The essence of the argument against strlcpy()
is that it fixes one problem (sometimes failing to terminate dst
in the case of strncpy(),
buffer overruns in the case of
strcpy()) while leaving another:
the loss of data that occurs when the string copied from src to dst
is truncated because it exceeds size.
(In addition, there is still
an unusual corner case
where the unwary programmer can find that strlcat(),
the analogous function for string concatenation,
leaves dst without a null terminator.)
At the very least,
(silent) data loss is undesirable to the user of the program.
At the worst, truncated data can lead to security issues that
may be as problematic as buffer overruns,
albeit probably harder to exploit.
(One of the nicer features of strlcpy() and strlcat()
is that their return values do at least facilitate the detection of
truncation, if the programmer checks the return values.)
All of which brings us full circle:
to avoid unhappy users and security exploits,
in the general case even a call to strlcpy()
must be guarded by an if
statement checking the arguments,
if the state of the arguments
can't be predicted with certainty in advance of the call.
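A check of this kind might look like the sketch below. Because strlcpy() is not available everywhere, the example swaps in the standard snprintf(), whose return value likewise reports the length that would have been written; the helper name copy_or_report() and its return convention are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst, reporting truncation instead
   of hiding it. Uses snprintf() as a portable stand-in for strlcpy().
   Returns 0 if the whole string fit, or -1 if it was truncated or an
   output error occurred. */
int copy_or_report(char *dst, size_t size, const char *src)
{
    int needed = snprintf(dst, size, "%s", src);

    if (needed < 0 || (size_t) needed >= size)
        return -1;   /* dst holds at most a truncated prefix of src */
    return 0;
}
```

The important part is not which copying function is used but that the caller inspects the result and treats truncation as an error to be handled.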
Where are we now?
strlcpy() and strlcat() are present on many versions of UNIX
(at least Solaris, the BSDs, Mac OS X, and Irix),
but not all of them (e.g., HP-UX and AIX).
There are even implementations of these functions
in the Linux kernel
for internal use by the kernel code.
Meanwhile, these functions are not present in glibc,
and were rejected for inclusion in
the POSIX.1-2008 standard,
apparently for similar reasons to their rejection from glibc.
Reactions among core glibc contributors on the topic of including
strlcpy() and strlcat() have been varied
over the years.
Christoph Hellwig's early patch was rejected
in the then-primary maintainer's inimitable style.
But reactions from other glibc developers have been more nuanced,
with some indicating willingness to accept the functions.
Perhaps most insightfully,
Paul Eggert notes
that even when these functions are provided
(as an add-on packaged with the application),
projects such as OpenSSH,
where security is of paramount concern,
still manage to either misuse the functions (silently truncating data)
or use them unnecessarily (i.e., the traditional
strcpy() and strcat() could equally have been used without harm);
such a state of affairs does not constitute a
strong argument for including the functions in glibc.
The appearance of an
entry on this topic in the glibc FAQ,
with a brief rationale for why these functions are currently excluded,
and a note that "gcc -D_FORTIFY_SOURCE"
can catch many of the errors that strlcpy() and strlcat()
were designed to catch,
would appear to be something of a final word on the topic.
Those that still feel that these functions should be in glibc
will have to make do with
implementations provided outside glibc.
In case it isn't obvious by now,
it should of course be noted that the root of this problem lies
in the C language itself.
C's native strings are not
of the style natively provided in more modern languages such as Java, Go, and D.
In other words, C's strings have no notion of bounds checking
(or dynamically adjusting a string's boundary) built into the type itself.
Thus, when using C's native string type,
the programmer can never entirely avoid the task of
checking string sizes when strings are manipulated,
and no replacements for strcpy() and strcat()
will ever remove that need.
One might even wonder if the original C library implementers were clever
enough to realize from the start that strcpy() and strcat()
were sufficient, if it weren't for the fact that they also gave us gets().