By Jake Edge
April 18, 2012
The core library that sits between user space and the kernel, the GNU C
library (or GLIBC), has undergone some changes recently in its governance, at least
partly to
make it a more inclusive project. On the last day of the Linux Foundation
Collaboration Summit, Carlos O'Donell gave an update on the project, the
way it will be governed moving forward, and its
plans for the future. GLIBC founder Roland McGrath was on hand to
contribute his thoughts as well.
Though he wears several hats,
O'Donell introduced himself as an "upstream GLIBC community member", rather
than as a maintainer, because the GLIBC developers have recently been trying to
change the idea of what it means to be involved in the project. He works for
Mentor Graphics—by way of its acquisition of CodeSourcery—on
open-source-based C/C++ development tools. Those tools are targeted at
Linux developers and Mentor is committed to working upstream on things like
GCC and GDB, he said. He personally got involved in the GLIBC project ten
years ago
to support the PA-RISC (hppa) architecture; he now works on GLIBC both as
part of his job and as a volunteer.
Recent changes in the GLIBC community
The changes for GLIBC had been coming for a long
time, O'Donell said. The idea is to transition the project from one that
has a reputation for being a small community that is hard to work with to
one that will work well with the kernel developers as well as other
projects in the
free software world.
As part of that effort, the moderation of the main
GLIBC mailing list (libc-alpha) was removed after four years. The goal of
that moderation had been to steer new contributors to the libc-help mailing
list so that they could learn about the open source (and GLIBC) development
process before
they were exposed to the harsher libc-alpha environment. The mentoring
process that was done on libc-help has continued; it is a place for
"random questions" about GLIBC (both for users and new contributors), while
libc-alpha is for more focused
discussion and
patches once developers have a firm understanding of the process and
culture.
There has also been a lot of "wiki gardening" to make more internal
documentation of GLIBC available, he said.
The most visible recent change was the dissolution of the steering committee in
March. The project is moving to a "self-governed community of developers" that is
consensus driven, he said. There is a wiki page that
describes what the project means by "consensus". Trivial or typo patches can just
be checked into the repository, without waiting for approval. The GLIBC
community is willing to accept all sorts of patches now, he said, which
is a "change from where we were five years ago". All of the changes in the
community have come about as a gradual process over the last four or five
years; there was no "overnight change", he said.
There are around 25-30 committers for GLIBC, O'Donell said in response to a
question from the audience, and they are listed on the
wiki. Ted Ts'o then asked about getting new features into GLIBC, noting
that in the past there was an assumption that trying to do so was not worth
the effort. He pointed out that BSD union mounts got help from the BSD libc,
but that couldn't be done for Linux in GLIBC, partly because it was not in
the POSIX standard. What is the philosophy that
is evolving on things like that, he asked.
O'Donell said that it comes down to a question of "relevance"; if there are
features that users want, the project may be willing to accept things that
are not in POSIX. GLIBC is the layer between programs and the kernel, so
if there are things missing in that interface it may make sense to add
them. If GLIBC fails to provide pieces that are needed, it will eventually
not be relevant for its users. For example, he said, there is a lot of
work going on in tracing these days, but GLIBC has not been approached to
expose the internals of its mutexes so that users are better able to debug
problems in multi-threaded programs; things like that might make good
additions. But,
"we are conservative", he said.
Ts'o then mentioned problems that had occurred in the past in trying to
expose the kernel's thread ID to user space. There has been a huge amount
of work done to get that information, which bypassed GLIBC because of the
assumption that GLIBC would not accept patches to do so. People are
working around GLIBC rather than working with it, he said.
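A minimal sketch of the kind of workaround Ts'o described, assuming a Linux system where SYS_gettid is defined in <sys/syscall.h>. The helper name kernel_thread_id is hypothetical; the point is only that the program reaches the kernel through the generic syscall() entry point instead of a dedicated library function.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_gettid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: return the kernel's thread ID for the calling
 * thread by issuing the raw system call, bypassing any dedicated
 * C library wrapper. */
static pid_t kernel_thread_id(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

Every project that needs the value ends up carrying a small helper like this one, which is the duplication of effort that Ts'o was pointing to.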
There is no overriding philosophy about what changes would be acceptable,
McGrath said. Much like with the kernel, features will be evaluated on a
case-by-case basis. There is a need to balance adding something to every
process that runs all over the world and adding interfaces that will need
to be supported forever against the needs and wishes of users. Things that
have "bounced off" GLIBC in the past should be brought up again to "start
the conversations afresh". But don't assume that it will be easy to get
your pet feature into GLIBC, he said.
With 25-30 committers for the project, how will competing philosophies
among those people be handled, Steven Rostedt asked. That problem has not
been solved yet, O'Donell said. At this point, they are trying to
"bootstrap a community that was a little dysfunctional" and will see how it
works out. If problems crop up, they will be resolved then. McGrath said
that things will be governed by consensus and that there won't be "I do it,
you revert it, over and over" kinds of battles. In addition, O'Donell
said that, in a Git-based world, reverts won't happen in that way because
new features will be developed on branches.
Standards
The most static part of GLIBC is the portion that implements standards,
O'Donell said, moving on to the next part of his talk. Standards support is
important because it allows people and code to move between different
architectures and platforms. The "new-ish" standards support that the
GLIBC community is working on now is the C11 support, which he guesses will
be
available in GLIBC 2.16 or 2.17. One of the more interesting features in
C11 is the C-level atomic operations, he said. Some of the optional annexes
to C11 have not been fully implemented.
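For readers who have not seen the C11 atomics, here is a small sketch of what they look like in use. It assumes a toolchain that ships <stdatomic.h> (which may come from the compiler rather than from the C library), and the variable and function names are invented for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example state: a shared event counter and a running maximum. */
static atomic_int event_count = 0;
static atomic_int max_seen = 0;

/* Atomically add one to the counter and return the new value. */
static int count_event(void)
{
    return atomic_fetch_add(&event_count, 1) + 1;
}

/* Raise max_seen to `value` if it is larger, using a compare-and-swap
 * loop. Returns true if this call updated the maximum. */
static bool record_max(int value)
{
    int current = atomic_load(&max_seen);
    while (value > current) {
        /* On failure, `current` is refreshed with the value actually stored. */
        if (atomic_compare_exchange_weak(&max_seen, &current, value))
            return true;
    }
    return false;
}
```

Both functions can be called from several threads at once without a mutex, which is the attraction of having these operations in the language standard.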
Ulrich Drepper is also working on conformance testing for POSIX 2008 and any
problems that are found with that will need to be addressed, O'Donell said.
There are no plans to add the C11 string bounds-checking interfaces from
one of the annexes as there are questions about their usefulness even within
the standards groups. That doesn't mean that those interfaces couldn't end
up in the glibc_ports tree, which provides a place for optional add-ons that
aren't enabled by default. That would allow distributions or others to
build those functions into their version of GLIBC.
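The bounds-checking interfaces in question take the size of the destination buffer as an explicit argument and report an error rather than overflowing. The sketch below is only loosely modeled on that idea; bounded_strcpy is a hypothetical helper, not the standard's strcpy_s(), and it makes no attempt to implement the constraint-handler machinery that the annex describes.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper in the spirit of the bounds-checking interfaces:
 * the caller passes the destination size, and the copy is refused
 * (not truncated, not overflowed) if the source does not fit.
 * Returns 0 on success or an errno-style code on failure. */
static int bounded_strcpy(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || dest_size == 0)
        return EINVAL;
    if (src == NULL) {
        dest[0] = '\0';
        return EINVAL;
    }

    size_t len = strlen(src);
    if (len >= dest_size) {      /* no room for the terminating NUL */
        dest[0] = '\0';          /* leave a valid empty string behind */
        return ERANGE;
    }
    memcpy(dest, src, len + 1);  /* copy the string and its NUL */
    return 0;
}
```

Part of the usefulness question is visible even here: the check is only as good as the size the caller passes in.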
The math library, libm, is considered "almost
complete" for C11 support; a "handful" of macros for
imaginary numbers are still missing, but Joseph Myers is working on
completing those.
All of the libm bugs that have been reported have been reviewed by Myers;
he and Andreas Jaeger are working on fixing them, O'Donell said. Some
functions are not rounding correctly, but sometimes fixing a function to
make it right
makes it too slow. Every user's requirements are different in terms of
accuracy vs. speed, so something
needs to be done, but it is not clear what that is.
Bugs filed in
bugzilla are being worked on, though, so he asked that users file or
reopen bugs that need to be addressed.
Short-term challenges
O'Donell then moved on to the short-term issues for the project, which he
called "beer soluble" problems because they can be fixed over a weekend
or by someone offering a case of beer to get them solved; "the kind of thing
you can get done quickly". First up is to grow the community by attracting
more developers, reviewers, and testers. The project would also like to
get more involvement from distributions and, to that end, has identified
a contact person for each distribution.
Part of building a larger community is to document various parts of the
development process. So there is information on the wiki about what
constitutes a trivial change, what to do when a patch breaks the build, and
so on. The idea is that the tree can be built reliably so that regression
testing can be run frequently, he said.
The release process has also changed. For a while, the project was not
releasing tarballs, but it has gone back to doing so. It is also making
release branches early on in the process, he said.
GLIBC 2.15 was released on March 21 using the new process. There will be
a 2.15.1 update at the end of April and the bugs that are targeted for
that release are tagged with "glibc_2.15". In addition, they have been
tagging bugs for the 2.16 release and they are shooting for twice-a-year
releases that are synchronized with Fedora releases.
Spinning out the
transport-independent remote procedure call (TIRPC aka Sun RPC) functions
into a separate library is an example of the kinds of coordination and
cooperation that the GLIBC project will need to do with others, he said.
Cooperation with the
distributions and the TIRPC project is needed in order to smooth that
transition.
There have been some "teething problems" with the TIRPC transition, like
some header file overlaps in the installed files. Those
problems underscore the need to coordinate better with other projects.
It's "just work", he said, but cooperating on
who is going to distribute which header and configuration files needs to
happen to make these kinds of changes go more smoothly.
Medium-term challenges
The medium-term problems for the project were called "statistically
significant" by O'Donell because the only way to solve them is to gather a
bunch of the right people together to work on them. A good example is the
merger of EGLIBC and GLIBC. The fork of GLIBC targeted at the embedded
space has followed all of the FSF copyright policies, so any of that code
can be merged into GLIBC. He is "not going to say that all of EGLIBC" will
be merged into GLIBC, but there are parts that should be. In particular,
the cross-building and cross-testing support are likely to be merged.
Another area that might be useful is POSIX profiles, which would allow
building the library with only certain subsets of its functionality, which
would reduce the size of GLIBC by removing unneeded pieces.
In answer to a question from Jon Masters, O'Donell said that new
architecture ports should target GLIBC, rather than EGLIBC. Though if
there is a need for some of the EGLIBC patches, that might be the right
starting point, he said.
The GLIBC testing methodology needs to be enhanced. For one thing, it is
difficult to compare the performance of the library over a long period of
time. The project gets patches to fix the performance of various parts,
but without test cases or benchmarks that could be used down the road to
evaluate new patches. Much of the recent work that has gone into GLIBC is
to increase performance, so it is important to be able to have some
baselines to compare against.
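As a rough idea of what such a baseline measurement might look like, the sketch below times repeated memcpy() calls with clock_gettime(). It is not taken from GLIBC's test suite; the function name, the 4096-byte buffer cap, and the choice of memcpy() are all assumptions made for illustration, and a serious benchmark would also need to guard against the compiler optimizing the loop away.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical micro-benchmark: average nanoseconds per memcpy() of
 * `size` bytes over `iterations` calls. Sizes are capped at 4096 bytes. */
static double memcpy_ns_per_call(size_t size, long iterations)
{
    static char src[4096], dst[4096];
    struct timespec start, end;
    volatile char sink = 0;      /* read the result so the copy is used */

    if (iterations <= 0)
        return 0.0;
    if (size > sizeof src)
        size = sizeof src;
    memset(src, 'x', sizeof src);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++) {
        memcpy(dst, src, size);
        sink = dst[0];
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void)sink;

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Recording numbers like these for each release, on known hardware, is what would let a later performance patch be judged against something other than its author's claims.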
The testing
framework also needs work. It is currently just a test skeleton C file,
though there have been suggestions to use DejaGNU or QMTest. The test
infrastructure in GLIBC is not the "most mature" part of the project. It
should be, though, because if the project is claiming that it is
conservative, it needs tests to ensure that things are not breaking, he said.
More independent testing is needed, perhaps using the Linux
Test Project or the Open POSIX test suite. Right now Fedora
is used to do full user-space rebuild testing, but it would be good to do
that with other distributions as well. Build problems are easy to find
that way, but runtime problems are not.
Long-term challenges
In the next section of the talk, O'Donell looked at ideas for
things that might be coming up to five years out. No one
can really predict
what will happen in that kind of time frame, he said, which is why he dubbed
it "mad
science". One area that is likely to need attention is tracing
support. Exposing the internal state of GLIBC for user-space tracing
(e.g.
SystemTap, LTTng, or other tools) will be needed.
Another idea is
to auto-generate the libm math library from a C code description of "how
libm should work". There is disappointment in the user community because
the libm functions have a wide spectrum between "fast and inaccurate"
and "slow and accurate" functions. Auto-generating the code would allow users
to specify where on that spectrum their math library would reside.
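The spectrum is easy to demonstrate with a toy. The sketch below is not how libm computes anything; it approximates e^x with a truncated Taylor series so that a single parameter, the number of terms, trades speed against accuracy. The function names are invented for illustration.

```c
/* Hypothetical illustration: approximate e^x with the first `terms`
 * terms of its Taylor series, 1 + x + x^2/2! + ...
 * Fewer terms means less work and a larger error. */
static double exp_taylor(double x, int terms)
{
    double sum = 0.0;
    double term = 1.0;            /* x^0 / 0! */

    for (int n = 0; n < terms; n++) {
        sum += term;
        term *= x / (n + 1);      /* next term: x^(n+1) / (n+1)! */
    }
    return sum;
}

/* Absolute difference between an approximation and a reference value. */
static double abs_error(double approx, double exact)
{
    double diff = approx - exact;
    return diff < 0 ? -diff : diff;
}
```

With x = 1, four terms give roughly 2.667, which is off by about 0.05, while eighteen terms agree with e to better than one part in 10^12. A generated libm would presumably make a similar choice on the user's behalf, function by function.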
One last
idea that he "threw in for fun" is something that some researchers have
been talking to the project about: "exception-less system calls".
The idea is to avoid the user to kernel
space transition in GLIBC by having it talk to "some kind of user-space
service API" that would provide an asynchronous kernel interface, rather
than doing a trap into the kernel directly.
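The talk did not go into the design, so the following is only a guess at the general shape: the program writes requests into a shared queue and collects results later, instead of trapping once per call. Every name here is hypothetical, and the service function is a stand-in that simply issues the calls with syscall(); in a real design that side would run somewhere other than the requesting thread.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#define QUEUE_SLOTS 8

/* Hypothetical request record: a system call number, up to three
 * arguments, and room for the result once it has been serviced. */
struct sys_request {
    long number;
    long args[3];
    long result;
    int  done;
};

struct sys_queue {
    struct sys_request slots[QUEUE_SLOTS];
    size_t head;   /* next slot to service */
    size_t tail;   /* next free slot */
};

/* Queue a request without entering the kernel. Returns a pointer the
 * caller can poll later, or NULL if the queue is full. */
static struct sys_request *sys_submit(struct sys_queue *q, long number,
                                      long a0, long a1, long a2)
{
    if (q->tail - q->head == QUEUE_SLOTS)
        return NULL;

    struct sys_request *r = &q->slots[q->tail % QUEUE_SLOTS];
    r->number = number;
    r->args[0] = a0;
    r->args[1] = a1;
    r->args[2] = a2;
    r->result = 0;
    r->done = 0;
    q->tail++;
    return r;
}

/* Stand-in for the service side: drain the queue and fill in results.
 * Returns the number of requests serviced. */
static size_t sys_service(struct sys_queue *q)
{
    size_t serviced = 0;

    while (q->head != q->tail) {
        struct sys_request *r = &q->slots[q->head % QUEUE_SLOTS];
        r->result = syscall(r->number, r->args[0], r->args[1], r->args[2]);
        r->done = 1;
        q->head++;
        serviced++;
    }
    return serviced;
}
```

The sketch is single-threaded and has no synchronization, so it shows only the submit-then-collect calling pattern, not the performance benefit the researchers are after.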
To close out his talk, O'Donell stressed that the project is very welcoming
to new contributors. He suggested that if you had a GLIBC bug closed or
submitted a patch and never heard back, then you should get involved with
the project as it will be more open to working with you than it may have been
in the past. If you have GLIBC wishlist items, please put them on the
wiki; or if you have read a piece of code in GLIBC and "know what it
does", please submit a comment patch, he said.
Questions and answers
With that, he moved on to audience questions, many of which revolved around
the difference between glibc_core and glibc_ports. The first was a
question about whether it made sense to merge ports into the core.
O'Donell said that the two pieces have remained close over the years and
essentially live in the same repository, though they are split into two Git
trees. There is no real need to merge them, he said, but if it was deemed
necessary, it could be done with a purely mechanical merge. Ports is meant
as an experimental playground of sorts, that also allows users to pick add-ons
that they need.
That "experimental" designation would come back to haunt O'Donell a bit.
An audience member noted that the Sparc version of GLIBC lives in the core,
while ARM (and others) lives in ports. McGrath said that was really an
accident of history. Ports helps ensure that the infrastructure for
add-ons doesn't bitrot, he said. "ARM is by no means a second-class
citizen" in GLIBC, O'Donell added. The ports mechanism allows vendors to
add things on top of GLIBC so keeping it working is worthwhile.
But the audience reminded O'Donell of his statement about ports being
experimental, and that it might give the wrong impression about ARM
support. "I'm completely at fault", he responded, noting
that he shouldn't have used "experimental" for ports. With a bit of a
chuckle, McGrath said: "That's the kind of statement GLIBC maintainers now
make".
At the time of the core/ports split, all of the architectures that didn't
have a maintainer were put into ports, McGrath said. Now it is something
of an "artificial distinction" for architectures, O'Donell said.
Ts'o suggested that perhaps all of the architectures should be in
ports, while the core becomes architecture-independent to combat the
perception problem. O'Donell seemed amenable to that approach, as did
McGrath, who said that it really depends on people showing up to do the
work needed to make things like that happen.
Another question was about the "friction" that led to the creation of
EGLIBC; has that all been resolved now? O'Donell said that the issues
haven't been resolved exactly, but that there are people stepping up in the
GLIBC community to address the problems that led to the split. There may
still be some friction as things move forward, but they will be resolved by
technical arguments. If a feature makes sense technically, it will get
merged into GLIBC, he said.
The last question was about whether there are plans to move to the LGPLv3
for GLIBC. McGrath said that there is a problem doing so because of the
complexity of linking with GPLv2-only code. The FSF would like to move the
library to LGPLv3, but it is committed to "not breaking the world". There
have been some discussions on ways to do so, but most GLIBC developers are
"just fine" with things staying the way they are.
The talk clearly showed a project in transition, with high hopes of a
larger community via a shift to a more-inclusive project. GLIBC is an
extremely important part of the Linux ecosystem, and one that has long
suffered from a small, exclusive community. That looks to be changing, and
it will be interesting to see how a larger GLIBC community fares—and
what new features will emerge from these changes.
[A video of this talk has been posted by the Linux Foundation.]