Posted Mar 28, 2012 16:13 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)
Parent article: A turning point for GNU libc
Can we finally get a better interface, one not limited to POSIX crap?
Can we:
1) Convert between title case/lower case in a specific locale (opened by iconv_open)
2) Convert between UTC and arbitrary timezones without changing a global variable.
3) Get the next/previous DST change instant (and the DST changes history in general) for a specific timezone.
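For context on item 2, here is a minimal sketch of the usual workaround with today's interfaces: temporarily override the process-wide TZ variable, call tzset(), convert, then restore. The helper name localtime_in_zone() is hypothetical, not a glibc function, and the approach is not thread-safe, which is exactly the complaint.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of any libc): break down `t` as local time
 * in the zone named by `tz` by temporarily changing the global TZ setting.
 * Not thread-safe, because TZ and tzset() state are process-wide. */
int localtime_in_zone(time_t t, const char *tz, struct tm *out)
{
    const char *old = getenv("TZ");
    char *saved = NULL;
    int ok;

    if (old != NULL) {
        /* Copy the value: setenv() may invalidate getenv()'s storage. */
        saved = strdup(old);
        if (saved == NULL)
            return -1;
    }

    if (setenv("TZ", tz, 1) != 0) {
        free(saved);
        return -1;
    }
    tzset();
    ok = (localtime_r(&t, out) != NULL);

    /* Restore the previous global state. */
    if (saved != NULL)
        setenv("TZ", saved, 1);
    else
        unsetenv("TZ");
    tzset();
    free(saved);

    return ok ? 0 : -1;
}
```

A call such as localtime_in_zone(now, "Europe/Berlin", &tm) depends on that zone being present in the installed tz database; POSIX-style strings such as "EST5" do not.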
Posted Mar 28, 2012 16:19 UTC (Wed) by cmorgan (guest, #71980)
And along this line, can we deprecate strcat/strcpy and get http://www.gratisoft.us/todd/papers/strlcpy.html in place? Some of us still have to write C code and would welcome string functions that are less dangerous.
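For readers who have not met strlcpy(), a minimal sketch of its semantics follows: copy at most size - 1 bytes, always NUL-terminate when size is nonzero, and return the length of the source so the caller can detect truncation. The name sketch_strlcpy() is hypothetical, chosen to avoid clashing with a libc or libbsd that already provides strlcpy(); this is not the reference implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in that mirrors strlcpy() semantics. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size != 0) {
        /* Copy what fits, leaving room for the terminating NUL. */
        size_t n = (srclen < size - 1) ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    /* A return value >= size tells the caller the copy was truncated. */
    return srclen;
}
```

The caller checks for truncation with something like: if (sketch_strlcpy(buf, s, sizeof buf) >= sizeof buf) { handle the error }.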
A turning point for GNU libc
Posted Mar 28, 2012 16:25 UTC (Wed) by JoeBuck (subscriber, #2330)
If you want to use strlcpy/strlcat, you already can; you just have to link with a different library (libbsd). If your project standards forbid strncat, #define it to DONT_USE_strncat_YOU_IDIOT in some widely-included header.
A turning point for GNU libc
Posted Mar 28, 2012 16:47 UTC (Wed) by nix (subscriber, #2304)
Deprecating strcat/strcpy is vanishingly unlikely. They're unpleasant functions, but they are ubiquitous. Adding __attribute__((deprecated)) to them or making the linker warn at link time (as is done for e.g. gets()) would do nothing other than cause those warnings to be universally ignored.
A turning point for GNU libc
Posted Mar 29, 2012 2:05 UTC (Thu) by apoelstra (subscriber, #75205)
Is there an ANSI C alternative to strcat and friends? Because if not, such warnings will greatly irritate programmers who compile with -Werror (as well as the usual -W -Wall -Wextra and friends) and -ansi.
Plus, these sorts of pedantic programmers are certainly capable of using strcat safely (which is possible to do, unlike with gets()).
A turning point for GNU libc
Posted Jan 1, 2013 13:25 UTC (Tue) by shentino (subscriber, #76459)
They are here to stay, just like the Win32 API, the x86 architecture, and the QWERTY keyboard layout.
Anyone trying to replace them will do nothing more than rock the boat.
A turning point for GNU libc
Posted Mar 28, 2012 17:17 UTC (Wed) by arjan (subscriber, #36785)
There's nothing wrong with strcpy and co.
Seriously, everyone who wants to ban them outright has not thought it through.
(And by the way, the glibc versions of strcpy and co. actually do buffer overflow checks, with the help of the gcc compiler, in various cases.)
A turning point for GNU libc
Posted Mar 28, 2012 17:47 UTC (Wed) by jzbiciak (✭ supporter ✭, #5246)
Posted Mar 28, 2012 19:17 UTC (Wed) by cmorgan (guest, #71980)
I had missed that entirely. Thank you for the pointer to it, it was interesting.
A turning point for GNU libc
Posted Mar 28, 2012 21:00 UTC (Wed) by atai (subscriber, #10977)
How can you deprecate functions in the ANSI C standard?
A turning point for GNU libc
Posted Mar 29, 2012 12:38 UTC (Thu) by nix (subscriber, #2304)
Easily, if they suck, are widely considered to suck, and are rarely used. e.g. gets(), in the C standard but even there a compatibility relic of a pre-stdio I/O library, has produced a link warning for the entire lifespan of glibc 2 (RTH added the warning in 1996). To my knowledge nobody sane has ever complained about it, because gets() usage is rare, always considered a bug, and easily replaced with something else in the standard library. None of these things are true of strcpy().
A turning point for GNU libc
Posted Mar 29, 2012 12:57 UTC (Thu) by mina86 (subscriber, #68442)
gets() is actually removed in C11. The difference between gets() and strcpy(), though, is that with the latter you can validate the length before the call, while with the former you have no way of knowing in advance how long the line you are about to read from stdin is.
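To illustrate the point about strcpy(), here is a minimal sketch of checking the length before the call. The helper name copy_if_fits() is hypothetical; the only assumption is that the caller knows the size of the destination buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, NUL included.
 * Returns 0 on success, -1 (leaving dst untouched) if src is too long. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    if (strlen(src) >= dst_size)
        return -1;
    strcpy(dst, src);   /* safe here: the length was validated above */
    return 0;
}
```

No equivalent check is possible for gets(), since the input length is unknown until it has been read; the bounded fgets() is the usual replacement there.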
A turning point for GNU libc
Posted Mar 29, 2012 20:30 UTC (Thu) by nix (subscriber, #2304)
Agreed... but gets() was in C89, and C99, and is in C++11, and the deprecation warning was present long before C99 or C++11 existed. So we *do* deprecate stuff in ANSI C, but only if it really does suck (like gets()).
A turning point for GNU libc
Posted Mar 29, 2012 0:02 UTC (Thu) by Richard_J_Neill (subscriber, #23093)
While we're at it, could we please get rid of the broken behaviour that leading zeros make strings into Octal? This behavior is implemented so fundamentally in eg atoi() that many other programming languages, even those that do dynamic typing, inherit the bug. If I write "064", I generally don't mean "52"!
A leading zero practically NEVER means the user intentionally wants to work in base 8, it just means they did something naive with string-splitting or data-entry. Perhaps the user entered the reading from a digital-scale, complete with leading zero. Or they took a string such as $7.09 and parsed it as "trim off the leading '$', split at the decimal point, multiply the first number by 100, and add the two together, to get a result in cents".
I'd consider leading-zero-means-octal a nasty case of technical debt that will still be causing bugs in 100 years if we don't fix it.
One way would be to support a new notation, similar to the "0x" notation. Numbers beginning "0o" (that's a letter 'o') would be recognised as octal. Then we spend 5 years transitioning the existing legitimate instances of octal numbers (file permissions) to the new syntax, then for 5 years make the old format illegal, and then in 10 years' time we can start treating leading zeros as they were meant to be treated.
[At the same time, can I plead for a "streq()" function, being defined as "!strcmp()" ]
A turning point for GNU libc
Posted Mar 29, 2012 4:52 UTC (Thu) by slashdot (guest, #22014)
Are you sure about atoi supporting octal?
AFAICT POSIX 2008 forbids that, and requires atoi to support only decimal numbers.
At any rate, changing that is probably highly unwise, as it might break stuff.
A turning point for GNU libc
Posted Mar 29, 2012 10:37 UTC (Thu) by Richard_J_Neill (subscriber, #23093)
D'oh! You're quite right about atoi(). I meant strtol(), but wasn't thinking straight. strtol() allows you to optionally specify the base=10, but then won't permit 0x... for hexadecimal.
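A short sketch of the behaviour under discussion: strtol() applies the leading-zero-means-octal rule when the base argument is 0 (auto-detect), while base 10 reads the digits as decimal but then stops at the 'x' of a "0x" prefix. The wrapper names parse_auto() and parse_decimal() are hypothetical, added only for illustration.

```c
#include <stdlib.h>

/* Hypothetical wrappers around strtol(), for illustration only. */

/* Base 0: "0x" prefix means hex, a leading "0" means octal. */
long parse_auto(const char *s)
{
    return strtol(s, NULL, 0);
}

/* Base 10: leading zeros are harmless, but "0x..." parses as just 0. */
long parse_decimal(const char *s)
{
    return strtol(s, NULL, 10);
}
```

With these, parse_auto("064") yields 52 and parse_decimal("064") yields 64, while parse_auto("0x1A") yields 26 and parse_decimal("0x1A") yields 0.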
I agree that we can't change it right away. That's why I suggest adding a new prefix, "0o" to be used in the very rare case where the programmer deliberately intends to use octal. That wouldn't break anything, and over perhaps 5 years, people could migrate.
A turning point for GNU libc
Posted Mar 29, 2012 20:53 UTC (Thu) by cmccabe (guest, #60281)
> While we're at it, could we please get rid of the broken behaviour
> that leading zeros make strings into Octal? This behavior is
> implemented so fundamentally in eg atoi() that many other programming
> languages, even those that do dynamic typing, inherit the bug. If I
> write "064", I generally don't mean "52"!
It's a standard, like the QWERTY keyboard and the English language. Sorry, it's not going anywhere.
> At the same time, can I plead for a "streq()" function, being defined
> as "!strcmp()"
Why don't you define it yourself? It's a one-line function.
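A minimal sketch of such a definition, assuming C99 for bool and inline; streq() is the name proposed above, not an existing libc function:

```c
#include <stdbool.h>
#include <string.h>

/* True when the two NUL-terminated strings are byte-for-byte equal. */
static inline bool streq(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```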
A turning point for GNU libc
Posted Mar 29, 2012 21:03 UTC (Thu) by cmccabe (guest, #60281)
Yes, it would be nice to see strlcpy make it into the standard library (see the thread about that in the last glibc article).
However, strcpy also has its uses. Sometimes you really do know the length of the source string, and you know it fits in the destination.
strcpy() / strlcpy() / asprintf()
Posted Mar 30, 2012 11:07 UTC (Fri) by abacus (guest, #49001)
But why add strlcpy() to glibc when asprintf() has already been present in glibc for a considerable time?
strcpy() / strlcpy() / asprintf()
Posted Mar 30, 2012 12:50 UTC (Fri) by nix (subscriber, #2304)
And unlike strlcpy(), asprintf() really *is* very hard to implement in terms of other functions without either reimplementing printf format string parsing, or doing some sort of hideous hack involving mmap()ed regions, repeated sprintf()ing, and catching SIGSEGV to detect overruns (yes, I've done that, yes, it made me feel ill).
strcpy() / strlcpy() / asprintf()
Posted Mar 30, 2012 15:56 UTC (Fri) by nybble41 (subscriber, #55106)
It seems to me like asprintf() should be easy to implement in terms of two calls to vsnprintf():
    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    int vasprintf(char **strp, const char *fmt, va_list ap)
    {
        va_list ap_copy;
        char *str;
        int bytes;
        char nul;

        /* Determine the amount of memory required to format the string. */
        /* vsnprintf() returns the length required (excluding the NUL), */
        /* but with a one-byte buffer it only stores the NUL. */
        va_copy(ap_copy, ap);
        bytes = vsnprintf(&nul, 1, fmt, ap_copy);
        va_end(ap_copy);
        if (bytes < 0)
            return bytes;

        /* Allocate one extra byte for the terminating NUL. */
        str = malloc((size_t)bytes + 1);
        if (!str)
            return -1;

        /* Format the string into the destination buffer. */
        bytes = vsnprintf(str, (size_t)bytes + 1, fmt, ap);
        if (bytes < 0)
        {
            free(str);
            return bytes;
        }

        /* No errors; store pointer to destination buffer and return the string length. */
        *strp = str;
        return bytes;
    }

    int asprintf(char **strp, const char *fmt, ...)
    {
        int bytes;
        va_list ap;

        va_start(ap, fmt);
        bytes = vasprintf(strp, fmt, ap);
        va_end(ap);
        return bytes;
    }
Is there any reason this implementation wouldn't work? (Ignoring minor issues/typos; I'm making this up as I go.)
strcpy() / strlcpy() / asprintf()
Posted Mar 30, 2012 17:50 UTC (Fri) by nix (subscriber, #2304)
Ah yes, of course if you have a vsnprintf() you can do it. I forgot about that. (I spent way too long on deprived platforms with either no snprintf() or none that worked.)
strcpy() / strlcpy() / asprintf()
Posted Mar 31, 2012 21:10 UTC (Sat) by lacos (subscriber, #70616)
You can still implement it easily (but perhaps not too elegantly): vfprintf() the stuff to "/dev/null", and use the return value for allocation. Both the vfprintf() function and the /dev/null special file are mandated by SUSv1 (UNIX(R) 95) and possibly by earlier standards.
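A minimal sketch of that idea, for platforms without vsnprintf(): the first pass prints to /dev/null only to learn the length, the second pass formats into a buffer of exactly that size. The function name format_alloc() is hypothetical, and restarting the argument list with a second va_start() is a choice made here to avoid depending on C99's va_copy().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc()ed formatted string, or NULL on
 * error. The caller must free() the result. */
char *format_alloc(const char *fmt, ...)
{
    va_list ap;
    FILE *sink;
    char *str;
    int len;

    /* Pass 1: discard the output, keep only the character count. */
    sink = fopen("/dev/null", "w");
    if (sink == NULL)
        return NULL;
    va_start(ap, fmt);
    len = vfprintf(sink, fmt, ap);
    va_end(ap);
    fclose(sink);
    if (len < 0)
        return NULL;

    str = malloc((size_t)len + 1);   /* +1 for the terminating NUL */
    if (str == NULL)
        return NULL;

    /* Pass 2: restart the argument list and format into the buffer,
     * which pass 1 has shown to be large enough. */
    va_start(ap, fmt);
    len = vsprintf(str, fmt, ap);
    va_end(ap);
    if (len < 0) {
        free(str);
        return NULL;
    }
    return str;
}
```

Opening /dev/null on every call is wasteful; a longer-lived program would probably keep the stream open.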
(SUSv1 doesn't have snprintf(). Once I needed to compile a glibc-oriented program on OSF/1 4.0E, which one might have considered a reference implementation of SUSv1, even though it wasn't formally certified. The program needed snprintf() and there was none, so I printed the string first to /dev/null (for the size), then fully to a malloc()'d area, then copied the bytes that had room, then free()'d the area. ... Sorry if this sounds trivial and/or dumb :))
strcpy() / strlcpy() / asprintf()
Posted Apr 5, 2012 8:36 UTC (Thu) by nix (subscriber, #2304)
Never thought of using vfprintf(). That's a good idea. Ah well. It would surely be more elegant than the longjmp() thing.
A turning point for GNU libc
Posted Mar 28, 2012 20:06 UTC (Wed) by josh (subscriber, #17465)
Why does any of the functionality you described belong in libc, rather than a specialized library?
glibc already contains far more than it should. Arguably, it should contain only the bare-minimum interfaces for compatibility with standard C and POSIX; anything else ought to live in another library.
A turning point for GNU libc
Posted Mar 28, 2012 20:11 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)
>Why does any of the functionality you described belong in libc, rather than a specialized library?
Because libc already has to do this! glibc has to read the timezone database so it can translate between UTC and the current timezone. glibc has to support all kinds of encodings for iconv to work, and so on.
But every single real-world program working with Unicode has to rely on third-party libraries. Ditto for timezone handling.
>glibc already contains far more than it should. Arguably, it should contain only the bare-minimum interfaces for compatibility with standard C and POSIX; anything else ought to live in another library.
Yes, that would be one way. But it should have been done from the start; now it's just not possible because of compatibility issues.
So the next best thing is to create proper and useful interfaces so all the existing complexity is at least not wasted.
A turning point for GNU libc
Posted Mar 29, 2012 12:43 UTC (Thu) by nix (subscriber, #2304)
Timezone conversion should definitely be in glibc. It requires a large database -- though this is now universally supplied by other sources, glibc still has to do the conversion, so it makes little sense not to expose it. The only caveat here is picking an appropriate name: the getline() fiasco showed the problems created when you pick a name that is already in wide use (general rule: if your API addition breaks TeX, please pick a better name, and do so before POSIX standardizes it).
I'm not sure about proper Unicode handling: doing it right is very difficult. Even doing case conversion right is difficult, and you can already case-convert with iconv() with no new APIs, only a new convention for 'locales' whose only difference from the standard ones is that they transliterate case. (But doing it that way does seem like a kludge.)
A turning point for GNU libc
Posted Mar 30, 2012 21:58 UTC (Fri) by justincormack (subscriber, #70439)
Why should a timezone database be in libc? I would rather a tz update did not require a libc update. And the locale support in libc is legacy and should go. I do not want locales in my low-level libraries.
A turning point for GNU libc
Posted Mar 30, 2012 23:46 UTC (Fri) by nix (subscriber, #2304)
The tz database *itself* is no longer installed by glibc as of 2.16 (but is still in the source tree for the sake of testing).
The tz-manipulation code must stay, because widely-used functions in the exported API use it (notably tzset(), gmtime() and localtime()).
The locale support is very definitely not legacy: the C library is where all the internationalization code is located (found in libintl on many other Unix platforms). It is part of the API and ABI, used by every single internationalized program (which is most of them, these days) and can never ever be removed. (And printf() needs locale support for international printing of the decimal point and thousands grouping characters, and for the I alternative-output-digit flag character.)
A turning point for GNU libc
Posted Mar 31, 2012 10:01 UTC (Sat) by mpr22 (subscriber, #60784)
setlocale() and strcoll() (among other locale-related things) are both defined as part of the 1990 and 1999 versions of the ISO C standard, and I am not aware of them having been removed from the 2011 version. Therefore, every compliant hosted implementation of the C programming language is obliged to provide setlocale() and strcoll(), and by saying you don't want locales in your low-level libraries you are saying you don't want a compliant hosted implementation of the C programming language.
A turning point for GNU libc
Posted Apr 5, 2012 8:29 UTC (Thu) by nix (subscriber, #2304)
setlocale(), strcoll(), and for that matter localeconv() are, of course, still there in C11, just as they were in C99 and C89.
A turning point for GNU libc
Posted Mar 29, 2012 13:02 UTC (Thu) by mina86 (subscriber, #68442)
Actually, the DST change history may work very unreliably. In some countries, DST changes are announced only days in advance, on completely arbitrary dates. This means that you have no good way of knowing when the next DST change is going to happen.
As for past changes, this is doable, but I dunno if tzdata actually stores all the historical data.
A turning point for GNU libc
Posted Mar 29, 2012 19:34 UTC (Thu) by Jonno (subscriber, #49613)
tzdata records historical time zones and all civil changes since 1970, as well as some older historical information where available.