A GNU C Library update

By Jonathan Corbet
February 24, 2015

A traditional feature of the tools track at the Linux Foundation's Collaboration Summit is an update from the developers of the GNU C Library (glibc); that tradition was upheld in fine form at the 2015 event. Glibc developer Roland McGrath noted that while the project is a critical component in vast numbers of Linux installations, it does not have a lot of developers working on it. Still, even with a relatively small developer base, some real progress has been made over the last year.

Recent history

The glibc 2.20 release came out last August featuring contributions from 69 developers. This release fixed 160 bugs, four of which had CVE numbers attached to them. It added support for Intel's MPX memory protection mechanism to the dynamic loader; that support is not useful for much now, but it will be a useful preparation once MPX-enabled applications appear in the future. Lock elision support has been added for the s390 architecture. There are new, optimized string functions for the ARMv7 and ARM64 subarchitectures, while support for the ancient AM33 subarchitecture has been removed.

Also in 2.20, the _BSD_SOURCE and _SVID_SOURCE feature test macros have been removed. The declarations formerly available under those macros are now under _DEFAULT_SOURCE, but, since it's the default, one need not set it explicitly. Roland's suggestion, though, was that most code would (continue to) want to use _GNU_SOURCE or one of the POSIX-specific macros. (See this article for an introduction to glibc feature test macros.)
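As a rough illustration of how a feature test macro is used, the sketch below defines _DEFAULT_SOURCE before any header is included and then calls strsep(), a BSD-derived function that glibc declares under that macro. The helper name count_fields() is made up for this example; code that must also build against older glibc releases might need to define _BSD_SOURCE as well.

```c
/* Feature test macros only take effect if they are defined
 * before the first #include. */
#define _DEFAULT_SOURCE 1

#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the ':'-separated fields in a writable
 * string. strsep() overwrites each delimiter with '\0' and returns
 * NULL once the input is exhausted. */
static size_t count_fields(char *s)
{
    size_t n = 0;
    char *cursor = s;

    while (strsep(&cursor, ":") != NULL)
        n++;
    return n;
}
```

Called on a writable buffer holding "a:b::c", count_fields() counts the empty field between the two adjacent colons as well, since strsep() does not merge delimiters.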

Support for file-private POSIX locks has been added to this release. And, finally, as of 2.20, the oldest supported kernel release is 2.6.32. Users staying with older kernels are, most likely, not going to update their C library either.
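These locks are exposed through fcntl() as the F_OFD_GETLK, F_OFD_SETLK, and F_OFD_SETLKW commands, and they need kernel support as well (Linux 3.15 or later). The lock belongs to the open file description rather than to the process, so two separate open() calls in the same process can conflict with each other. What follows is only a minimal sketch of that behavior; the helper names and the scratch path are invented for illustration, and the fallback definition of F_OFD_SETLK assumes the Linux value.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1          /* glibc exposes F_OFD_* under _GNU_SOURCE */
#endif

#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37         /* assumption: Linux value, for older headers */
#endif

/* Try to take a non-blocking write lock on the whole file, owned by the
 * open file description behind fd. Returns 0 on success, -1 on failure. */
static int ofd_try_write_lock(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl); /* l_pid must be 0 for F_OFD_* commands */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;              /* 0 means "through end of file" */
    return fcntl(fd, F_OFD_SETLK, &fl);
}

/* Hypothetical demo: open the same file twice and report whether the
 * second description is refused the lock held by the first (1 = yes). */
static int second_open_is_blocked(const char *path)
{
    int a = open(path, O_RDWR | O_CREAT, 0600);
    int b = open(path, O_RDWR);
    int blocked = 0;

    if (a >= 0 && b >= 0 && ofd_try_write_lock(a) == 0)
        blocked = (ofd_try_write_lock(b) == -1 &&
                   (errno == EAGAIN || errno == EACCES));
    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    unlink(path);
    return blocked;
}
```

With traditional POSIX locks (F_SETLK) the second attempt would succeed, because both descriptors belong to the same process.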

Glibc 2.21 was released on February 6. It had changes from 61 developers, fixing 101 bugs and five security issues. The MIPS architecture has gained support for the O32 FPXX, FP64, and FP64A ABIs. Support for the obsolete sigvec() functions has been removed (though ABI support remains for older binaries). The sigvec() ABI dates back to the 4.2 BSD release and, Roland said, was replaced by sigaction() 25 years ago.
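For code that still carries sigvec() calls, the replacement looks roughly like the sketch below. It is an illustrative example rather than anything from the glibc tree; the handler and helper names are invented here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical helper: install on_signal() for signo using sigaction().
 * Returns 0 on success, -1 on failure. */
static int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);  /* block no additional signals in the handler */
    sa.sa_flags = SA_RESTART;  /* restart interrupted system calls */
    return sigaction(signo, &sa, NULL);
}
```

After install_handler(SIGUSR1) succeeds, a raise(SIGUSR1) in a single-threaded program runs the handler before raise() returns, leaving got_signal set.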

The 2.21 release includes support for a handful of new locales. The i386 memcpy() implementation has been optimized. The PowerPC architecture has gained support for lock elision and optimized string functions as well. Various architecture-specific semaphore implementations have been replaced by a single generic version written in C; that should help the developers to maintain this code and avoid introducing race conditions in the future. Also in this release is support for Altera's NIOS II architecture.

The 2.22 release cycle has just gotten underway; as of Roland's talk it had five bugfixes but not much else. He expects to see a rework of the POSIX thread cancellation code to fix some race conditions there. The process of cleaning up the atomic-operation and futex code will continue as well; it is part of a long-term project to move the library toward the C11 memory model. Once that is done, he said, the developers should be able to have a higher level of confidence in the correctness of the library's synchronization code.
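To give a flavor of what code written against the C11 memory model looks like, here is a tiny try-lock built on <stdatomic.h>. It is a sketch for illustration only, not glibc's internal synchronization code, and the tiny_lock names are invented.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal lock: the flag is set while the lock is held. */
typedef struct {
    atomic_flag held;
} tiny_lock;

#define TINY_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the lock was acquired. Acquire ordering makes writes
 * done by the previous holder visible to the new one. */
static bool tiny_trylock(tiny_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Release ordering publishes this holder's writes to the next acquirer. */
static void tiny_unlock(tiny_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The explicit memory_order arguments are the point: the ordering guarantees are stated in terms the language standard defines, rather than being implied by architecture-specific assembly.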

Glibc in the next five years

Looking forward, Roland said that, five years from now, the glibc developers will have put out ten releases and, probably, have fixed about 1,500 bugs. But it would be nice if the project could accelerate a bit, bringing in more useful features while maintaining the quality of the code. To do that, though, the developers need some more help.

One place where interested people could contribute is in bug triage. It takes a lot of work to go through the bug tracker, assign priorities to bugs, determine whether they might have security relevance, create minimal programs to reproduce bugs, and, even, write patches to fix them. Another area where help is needed is with benchmarks; what is there now is not really adequate to judge the value of performance patches. Benchmarks for the math library are especially needed. Microbenchmarks are valuable, but there is also a need for whole-system benchmarks that can be used for workload-specific tuning. Writing benchmarks, Roland said, is a great way to develop a working knowledge of the API.

Then, there is the matter of testing infrastructure. Glibc is, by its nature, a conservative project that puts a premium on not breaking things. Often the biggest blocker for new patches is the simple lack of adequate testing; a good automated testing framework would help the project to merge new code more quickly. The problem is not an easy one, though, he said.

Among other things, the project needs test suites that can run against an installed version of the library. Otherwise it can be hard to test the resolver and other parts of the library that interact with the rest of the system and the network. Some of the more complex tests probably need to be run within containers. Also needed is ABI comparison testing; the project has some "superficial" tests now that look at symbols, versions, and the size of data objects, but there's no testing for ABI breakage beyond that. The libabigail library out of the GCC project should probably be brought in to help with this task.
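A check on the size and layout of data objects can be expressed quite simply in C11, as the sketch below suggests. The structure is hypothetical and is not a glibc type; real ABI testing would also need to cover symbols, versions, calling conventions, and more.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical public structure whose layout is part of an ABI. */
struct demo_record {
    uint32_t id;
    uint32_t flags;
    uint64_t timestamp;
};

/* Compile-time checks: the build fails if the layout ever changes. */
_Static_assert(sizeof(struct demo_record) == 16,
               "size of struct demo_record changed");
_Static_assert(offsetof(struct demo_record, flags) == 4,
               "offset of flags changed");
_Static_assert(offsetof(struct demo_record, timestamp) == 8,
               "offset of timestamp changed");

/* The same checks at run time, for use from a test harness. */
static int demo_record_layout_ok(void)
{
    return sizeof(struct demo_record) == 16 &&
           offsetof(struct demo_record, flags) == 4 &&
           offsetof(struct demo_record, timestamp) == 8;
}
```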

Finally, the developers could use help supporting the project's infrastructure — the wiki, bug tracker, build systems, etc. Interested people should consult the project's master "todo" list for lots more information.

An audience member asked about the handling of security bugs. Roland answered that the project Bugzilla includes a flag to mark whether any particular bug has been reviewed for security implications. The project could use some help with that kind of review, he said. One of the recent CVE entries for glibc referred to a bug that had been fixed in 2013, but nobody had realized at the time that it was exploitable. It would be nice to catch more of those, but, Roland said, the primary responsibility for identifying and responding to security bugs lies downstream with the distributors.

Another attendee asked about EGLIBC. Roland said that he's not really the right person to talk about that project but that, as far as he can tell, it has mostly gone away. Some of the developers behind the EGLIBC fork are now core glibc developers, and most (but not all) of the EGLIBC changes have been pulled back into glibc. In general, Roland would like to see all of the glibc forks out there merge back into the mainline. But, while distributors have patches that they apply, he is not aware of any distributors shipping forks at this point.

The next question had to do with the contributor-agreement process that developers must go through to get code accepted; it was described as being complicated and an obstacle to contribution. Roland agreed with that assessment, but said that the legal aspects were outside of the project's control. Glibc is a GNU project; that fact is not going to change. It would be nice if the agreement process could be streamlined, but it is up to the GNU project to do that.

The last question had to do with support for building the library with the LLVM compiler. It was noted that there has been pushback against the merging of LLVM support into some GNU projects; would there be difficulties in getting LLVM support into glibc? Roland answered that he didn't see any reason for there to be trouble. In general, the glibc developers have been working toward the use of more standard language features and getting away from GCC extensions. That is a technical goal aimed at making the glibc code better. Technical improvement is where the glibc project's interests lie; the glibc developers, he said, are uninterested in the political issues.

[Your editor would like to thank the Linux Foundation for supporting his travel to the Collaboration Summit.]


A GNU C Library update

Posted Feb 24, 2015 19:26 UTC (Tue) by revmischa (guest, #74786) [Link] (25 responses)

These guys won't even accept a patch to fix a typo

A GNU C Library update

Posted Feb 24, 2015 20:20 UTC (Tue) by madscientist (subscriber, #16861) [Link] (7 responses)

And they're right not to accept that change. The only thing it does is increase the likelihood of non-portable code... in other words, it has net negative value. Why would anyone accept a change like that?

If it really twists someone's knickers to write O_CREAT, then they can easily add

    #define MY_CREATE O_CREAT

to their own code, and be happy (and still portable!)

A GNU C Library update

Posted Feb 25, 2015 4:45 UTC (Wed) by zblaxell (subscriber, #26385) [Link] (6 responses)

or even

#ifndef O_CREATE
#define O_CREATE O_CREAT
#endif

which is portable to any sane platform (i.e. one that hasn't already defined O_CREATE ;).

When I was 15 or so, I was out to fix anything that didn't fit into my view of the world as I had learned it thus far. All perceived defects were of equal importance. I had a library of stuff like:

#define IF (
#define THEN ) {
#define END }
#define AND &&
#define OR ||
#define CALL

because it let me write code like

IF x > 3 AND y < 5 THEN
CALL do_stuff();
END

which looked vaguely like the various languages I'd encountered prior to C, except with painfully wrong results when it was time to find out if x was equal to something.

This didn't help me read any existing code, and it meant nobody else could read mine, so I got over it--especially when it became clear just how important code readability was going to be. I think it's a phase many coders go through.

A GNU C Library update

Posted Feb 25, 2015 18:32 UTC (Wed) by bronson (subscriber, #4806) [Link] (2 responses)

Steve Bourne went through that phase as an adult: http://research.swtch.com/shmacro

A GNU C Library update

Posted Feb 25, 2015 20:05 UTC (Wed) by proski (subscriber, #104) [Link] (1 responses)

It's odd he used "done" in the shell and "OD" in its source. So inconsistent :)

A GNU C Library update

Posted Feb 25, 2015 20:09 UTC (Wed) by zblaxell (subscriber, #26385) [Link]

If the octal dump utility existed prior to the shell, 'od' would have conflicted with it.

“&&” => “and”, “||” => “or”, “!” => “not”

Posted Mar 3, 2015 9:23 UTC (Tue) by ldo (guest, #40946) [Link] (2 responses)

These are standard in C++ and C99. In C, you have to #include <iso646.h>.
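For reference, in C these spellings are macros provided by <iso646.h>, while in C++ they are built into the language. A small sketch, with an invented function name, of the earlier IF/AND example written this way:

```c
#include <iso646.h>   /* defines and, or, not, and friends as macros in C */
#include <stdbool.h>

/* Hypothetical predicate: same test as "x > 3 && y < 5". */
static bool in_window(int x, int y)
{
    return x > 3 and y < 5;
}
```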

“&&” => “and”, “||” => “or”, “!” => “not”

Posted Mar 3, 2015 12:17 UTC (Tue) by mpr22 (subscriber, #60784) [Link] (1 responses)

Putting the primary content of your message exclusively in its subject line is... distinctly suboptimal.

But not as bad as starting a sentence in the subject line

Posted Mar 3, 2015 16:54 UTC (Tue) by jzbiciak (guest, #5246) [Link]

and finishing it in the body.

A GNU C Library update

Posted Feb 24, 2015 21:26 UTC (Tue) by acollins (guest, #94471) [Link] (4 responses)

As far as I can tell, the only thing that patch does is create a portability nightmare?

In your link you use Go as an example of why it's okay to make the change, but that's not really a valid comparison. Go is very young, and essentially a monoculture at this point, so such changes can be quickly propagated to all interested parties. C has a much wider userbase, and forcing a wholesale libc upgrade for everyone on any platform ever, just for the sake of making folks feel better about a trivial missing letter in a name, seems rather silly.

A GNU C Library update

Posted Feb 25, 2015 13:32 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (3 responses)

Go also ships a tool to do updates between versions of the language. It also helps that the runtime is statically linked into each executable (no need for parallel install while upgrading your code).

A GNU C Library update

Posted Feb 25, 2015 13:49 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Go is really awesome in that regard. It does not actually depend even on glibc (unless it's pulled in by some library) so you can simply install Go software by copying it.

A GNU C Library update

Posted Feb 26, 2015 4:07 UTC (Thu) by krakensden (subscriber, #72039) [Link]

Post go 1.3, net/http pulls in pthreads by default[1], so it's not *totally* independent anymore.

[1] you can turn it off, but...

A GNU C Library update

Posted Feb 25, 2015 18:05 UTC (Wed) by lsl (subscriber, #86508) [Link]

> Go also ships a tool to do updates between versions of the language.

It was called gofix, but as the language spec and stdlib are stable nowadays it's not used anymore. Still, it was pretty awesome back then.

Ken's change to O_CREATE was before Go 1.0.

A GNU C Library update

Posted Feb 24, 2015 21:40 UTC (Tue) by nix (subscriber, #2304) [Link]

That "fix" is clearly wrong. By dropping that in there, you're claiming that "O_CREATE" is in all versions of POSIX and all standards that include it by reference. That's simply not true. It might belong under _GNU_SOURCE, but, er, what benefit does it bring? "Fixing a spelling error" isn't a very good reason to change an interface of multiple decades' standing, even if it is a silly one. (And, in any case, O_CREAT is consistent with creat(2), after which it was named: do you plan to rename *that*, too? Good luck if so: it's only as old as Unix itself.)

A GNU C Library update

Posted Feb 24, 2015 22:29 UTC (Tue) by shmget (guest, #58347) [Link] (3 responses)

that is silly...

<sarcasm>
why not

O_CRÉER
O_CREAR
O_SCHAFFEN
O_SKAPPA
O_USTVARITI

while we are at it:

#define O_EXCL 0x0800 /* Fail if file already exists. */
#define O_TRUNC 0x0400 /* Truncate file to zero length. */

are clearly unacceptable either... :
O_EXCLUDE
O_TRUNCATE

and don't get me started on these _t things and other weird collections of letters

time_t => epoch_based_time_type
uint32_t => unsigned_32_bits_integer_type
pthread_t => portable_operating_system_interface_thread_type

A GNU C Library update

Posted Feb 24, 2015 23:54 UTC (Tue) by cesarb (subscriber, #6266) [Link] (2 responses)

> O_EXCLUDE

I thought O_EXCL meant "exclusive".

A GNU C Library update

Posted Feb 25, 2015 0:13 UTC (Wed) by nash (guest, #50334) [Link] (1 responses)

Add both then!

A GNU C Library update

Posted Feb 26, 2015 15:55 UTC (Thu) by ortalo (guest, #4654) [Link]

That could entail a sort of paradox. Aren't you afraid HAL psychotic behaviour could reproduce as soon as C compilers are better at semantic analysis?

A GNU C Library update

Posted Feb 25, 2015 1:54 UTC (Wed) by smurf (subscriber, #17840) [Link] (2 responses)

Clearly. And don't forget the syscall (creat(2)).

Frankly, if that's your only problem with glibc then you must hold it in very high esteem indeed.

A GNU C Library update

Posted Feb 25, 2015 18:25 UTC (Wed) by bronson (subscriber, #4806) [Link] (1 responses)

If we're going that far then umount(1) is fair game.

(That semi-invisible missing letter cost me an hour back when I was learning Linux... On the bright side, I learned more about how PATH works)

A GNU C Library update

Posted Feb 25, 2015 19:16 UTC (Wed) by bronson (subscriber, #4806) [Link]

Huh, just noticed that I'd been using Unix on and off since 1992 but didn't worry about mount points or PATH (beyond adding ~/bin) until 1996.

Those were the days... Sysadmins were revered, Unixes cost $thousands per seat, and all services running on port < 1024 were legit.

A GNU C Library update

Posted Feb 25, 2015 3:55 UTC (Wed) by Jonno (subscriber, #49613) [Link]

That's not fixing a typo, that is changing a well-established programming interface. That using these kinds of abbreviations in the interface has, in hindsight, proven to be quite a bad design decision does not change that fact.

In my experience, trying to paper over minor design mistakes like this after the fact tends to cause more problems in the long term than the original deficiency ever could, and refusing a patch that tries to do that is imho quite clearly the right thing to do.

Avoiding carrying them over to new interfaces (such as the Go API mentioned in the linked page) is obviously another matter, and should be encouraged whenever possible.

A GNU C Library update

Posted Feb 25, 2015 6:44 UTC (Wed) by rsidd (subscriber, #2582) [Link]

"Creat" is the correct spelling. In this context. Defined by decades of use.

Wow, am I the only one here…

Posted Feb 25, 2015 7:57 UTC (Wed) by HelloWorld (guest, #56129) [Link] (1 responses)

… who realises that this is obviously a joke?

Wow, am I the only one here…

Posted Feb 25, 2015 9:20 UTC (Wed) by marcH (subscriber, #57642) [Link]

... or a troll. In which case it worked really well - much better than a joke!

A GNU C Library update

Posted Feb 25, 2015 15:53 UTC (Wed) by cyrus (subscriber, #36858) [Link] (21 responses)

Is there work going on to implement a more scalable memory allocator like jemalloc?

The todo-site at https://sourceware.org/glibc/wiki/Development_Todo/Enhanc... doesn't mention scalability at all.

A GNU C Library update

Posted Feb 25, 2015 16:18 UTC (Wed) by rsidd (subscriber, #2582) [Link] (17 responses)

Couldn't they just import jemalloc? It's BSD-licensed...

A GNU C Library update

Posted Feb 25, 2015 17:13 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (16 responses)

Unfortunately, jemalloc has some subtle behavior differences from glibc.

A GNU C Library update

Posted Feb 25, 2015 19:16 UTC (Wed) by madscientist (subscriber, #16861) [Link] (15 responses)

Also note that jemalloc is incompatible with the Linux kernel's implementation of Transparent Huge Pages, which most distributions set to "always" these days. Of course something about that situation could be changed... but it's not a simple drop-in replacement.

A GNU C Library update

Posted Feb 26, 2015 0:41 UTC (Thu) by rsidd (subscriber, #2582) [Link] (14 responses)

I had the impression Firefox used jemalloc. Has that changed? Or is it only on some other platforms?

A GNU C Library update

Posted Feb 26, 2015 8:23 UTC (Thu) by rsidd (subscriber, #2582) [Link] (1 responses)

Replying to myself: as of 2011, Facebook was using and developing jemalloc on Linux. Here's a detailed Facebook post by Jason Evans (the JE in jemalloc). They used LD_PRELOAD to switch out the default allocator in these benchmarks, so as of Linux 2.6.27, glibc 2.5 and jemalloc 2.1, there was no problem using it on Linux. I can't find newer information.

A GNU C Library update

Posted Feb 26, 2015 13:03 UTC (Thu) by jengelh (guest, #33263) [Link]

"Problem" probably did not mean a readily observable correctness problem, but perhaps a performance problem?

A GNU C Library update

Posted Feb 26, 2015 13:01 UTC (Thu) by madscientist (subscriber, #16861) [Link] (11 responses)

They do use their own slightly hacked version of jemalloc (I've seen some posts recently on the jemalloc mailing list which lead me to believe the Mozilla folks are trying to align themselves with the latest standard jemalloc rather than keeping their own).

It's not every workload and every situation that has a problem, of course. But applications which do lots of alloc/free-ing of memory and expect jemalloc to be able to give back unused memory to the OS will see issues: because pages are much larger they can't be given back as easily and eventually the application will take up too much memory and be "handled" by the kernel's OOM killer. And if you're going to replace glibc's allocator then it has to work reliably for all workloads.

For more information see here and here and here and here (and other places).

A GNU C Library update

Posted Feb 26, 2015 13:34 UTC (Thu) by ibukanov (subscriber, #3942) [Link]

This does not imply that jemalloc is worse than the glibc allocator when huge pages are in use! It is just that an extra fragmentation-prevention trick that jemalloc uses (madvise(…, MADV_DONTNEED)) does not work with huge pages. However, since glibc's fragmentation is in general worse than jemalloc's even with madvise disabled, the same issue applies to glibc.

What happened is that applications like Firefox, Splunk, or TokuDB that do a lot of small allocations switched to jemalloc to fight fragmentation, and it turned out that under huge pages jemalloc's advantages over glibc are much smaller and the fragmentation still hurts.

A GNU C Library update

Posted Feb 27, 2015 4:26 UTC (Fri) by butlerm (subscriber, #13312) [Link] (1 responses)

If those examples are typical, I am amazed that anyone uses transparent huge pages at all. I certainly can't imagine why it should be enabled by default.

A GNU C Library update

Posted Mar 4, 2015 13:36 UTC (Wed) by nix (subscriber, #2304) [Link]

TLB miss reduction. 5% or even more speedup on some workloads, including ones that are quite significant for some people like running VMs and even running big compilations. Anything that does lots of pointer chasing in a bunch of big blocks of memory it allocates for the long haul, basically... anything that stresses the TLB, but doesn't push the system into swap, since THP is even worse when that kicks in.

A GNU C Library update

Posted Feb 27, 2015 6:21 UTC (Fri) by k8to (guest, #15413) [Link] (3 responses)

What I've seen is that the MM subsystem in Linux ends up thrashing and wasting lots of time defragmenting for short-lived high-allocation processes with jemalloc. No oomkiller, just a machine going super slow.

A GNU C Library update

Posted Feb 27, 2015 6:22 UTC (Fri) by k8to (guest, #15413) [Link]

Sorry I should say coalescing not defragmenting.

A GNU C Library update

Posted Feb 27, 2015 13:50 UTC (Fri) by madscientist (subscriber, #16861) [Link] (1 responses)

I haven't seen this, at all. jemalloc is used in all sorts of long-lived, memory-intensive, performance-sensitive applications precisely because it does not exhibit this sort of thrashing behavior.

In any event, the behavior under discussion here is not thrashing, it's a steady, inexorable increase in RSS over time (usually over multiple days of use but you can observe it in shorter amounts of time) which eventually will cause the OOM killer to get involved as the system has run out of memory. It's a real thing, as you can see from the many jemalloc-using applications which have run into it.

When looking into this issue a few years ago I switched back to standard glibc allocator for a baseline: IIRC I didn't see the same RSS increase/OOM issues as with jemalloc. However, I didn't do any further investigation in that direction, just noted it as a data point (the standard allocator has various issues with our system, that's why we switched to jemalloc in the first place). So, there could be any number of reasons for that difference, it's true. Maybe if I'd added more work, or something, to the glibc test it would have OOM'd as well.

A GNU C Library update

Posted Mar 1, 2015 8:16 UTC (Sun) by k8to (guest, #15413) [Link]

I said short-lived, and it's not really a jemalloc problem.

I have the 40 or so customer sites worth of data.

There's a discussion of transparent huge pages and jemalloc. I am mentioning a problem that has occurred in my experience with transparent huge pages and jemalloc. If you think that's off topic, then you're being silly.

There is indeed more than one problem related to transparent huge pages.

A GNU C Library update

Posted Feb 27, 2015 6:48 UTC (Fri) by rsidd (subscriber, #2582) [Link] (3 responses)

Interesting, thanks! But it looks like the problem is fixable in jemalloc, at least for kernel 2.6.38 and later -- basically, jemalloc can explicitly turn off huge page allocations. (There is no later jemalloc release, I don't know if the patch they give has been merged.)

A GNU C Library update

Posted Feb 27, 2015 17:41 UTC (Fri) by madscientist (subscriber, #16861) [Link] (2 responses)

Yeah, I saw that back when it was posted. When we did an experiment with /sys/kernel/mm/transparent_hugepage/enabled value set to "always", which is the default setting on all modern GNU/Linux distros I'm aware of, we still saw AnonHuge pages in the process space even after applying that patch to jemalloc to use MADV_NOHUGEPAGE, unless we set enabled to either madvise or never to disable THP by default.

However, I took a look at the implementation in mm/huge_memory.c and talked with my colleague, and now we wonder if that was just an artifact of some static allocations or something that were not using jemalloc for some reason (honestly I'm not sure how this would happen but...).

I'll try to find some time this weekend to do a deeper test and verify whether RSS vs. jemalloc's HeapActive value is actually diverging (this is the symptom of the problem), rather than just looking at AnonHuge in proc.
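For readers unfamiliar with the MADV_NOHUGEPAGE approach discussed above, a minimal sketch of the call follows. This is not the jemalloc patch itself; the function name is made up, and the madvise() call can fail (with EINVAL, for instance) on kernels built without transparent huge page support, which is why its result is reported separately.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1      /* for MAP_ANONYMOUS and MADV_* in glibc */
#endif

#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map len bytes of anonymous memory and ask the
 * kernel not to back the range with transparent huge pages.
 * Returns NULL if the mapping fails; if advice_rc is non-NULL it
 * receives the return value of madvise() (0 on success, -1 on error). */
static void *map_without_thp(size_t len, int *advice_rc)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int rc;

    if (p == MAP_FAILED)
        return NULL;
    rc = madvise(p, len, MADV_NOHUGEPAGE);
    if (advice_rc != NULL)
        *advice_rc = rc;
    return p;
}
```

An allocator would apply the same advice to each chunk it obtains from the kernel; whether that fully prevents AnonHuge pages in practice is exactly what the comment above is questioning.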

A GNU C Library update

Posted Mar 1, 2015 0:28 UTC (Sun) by mathstuf (subscriber, #69389) [Link]

> the default setting on all modern GNU/Linux distros I'm aware of

FWIW, it is madvise on Fedora (Rawhide).

A GNU C Library update

Posted Mar 2, 2015 23:37 UTC (Mon) by Gerardo (subscriber, #37539) [Link]

Also in Debian, here.

A GNU C Library update

Posted Mar 4, 2015 13:38 UTC (Wed) by nix (subscriber, #2304) [Link] (2 responses)

There *is* work going on on the allocator, but replacing the whole thing is unlikely to happen until a good set of real-world benchmarks is produced (which is also happening). Without that, there's too much chance that any change will lead to massive disruption and a world that's no better than the one we left, only for different use cases.

A GNU C Library update

Posted Mar 9, 2015 3:20 UTC (Mon) by vbabka (subscriber, #91706) [Link] (1 responses)

AFAIK glibc already (since 2.10?) uses per-thread arenas and MADV_DONTNEED optimizations nowadays. Not sure what else is missing from what jemalloc does. It's not without problems, see e.g. https://lkml.org/lkml/2015/2/2/493

A GNU C Library update

Posted Mar 12, 2015 2:09 UTC (Thu) by nix (subscriber, #2304) [Link]

Those optimizations may in fact be pessimizations: certainly in some situations they can cause both cache and VMA/syscall thrashing. Allocator work has been greatly impaired by Emacs: its use of malloc_get/set_state() when dumping means that changing the allocator's internals will cause Emacsen dumped before the change to coredump at startup. The solution is to deprecate both those functions (they have basically no other users) and require things like Emacs and CLISP that dump and undump to either do it differently or to use their own mallocs (like Emacs already can). Sure, as an Emacs user it is annoying to lose a bit of efficiency for this reason, but it's bloody stupid for one editor to be holding back allocator work for an entire OS.

This hasn't been done yet, but has been seriously proposed and seems likely to be implemented as soon as things like a proper bunch of useful benchmarks are in place so that improvements to malloc can be analyzed properly to make sure they're not actually pessimizing it.