jemalloc
Posted Mar 12, 2008 21:15 UTC (Wed) by jayavarman (guest, #19600)
Parent article: A look at memory usage in Firefox 3
I wonder why something like jemalloc isn't included in glibc for the benefit of all userspace? Are there serious drawbacks?
Posted Mar 12, 2008 22:08 UTC (Wed)
by tialaramex (subscriber, #21167)
Posted Mar 13, 2008 3:05 UTC (Thu)
by pr1268 (guest, #24648)
Fascinating discussion - I never knew Firefox used a different allocator - in fact, I was unaware that any userspace application could use any allocator other than what was provided by GNU libc (on GNU/Linux systems in particular). When you mention "Google", what exactly are you talking about? I thought Google was a Web site/portal with online applications, not a userspace app running on someone's PC. Do you mean Google Earth? Thanks!
Posted Mar 14, 2008 14:27 UTC (Fri)
by Los__D (guest, #15263)
Posted Mar 13, 2008 7:02 UTC (Thu)
by rsidd (subscriber, #2582)
"If your application is important enough to warrant writing or maintaining your own allocator, as has effectively happened with jemalloc in Firefox"
No, that's not what happened: jemalloc was written for FreeBSD (where it is the default in 7.0 and up), and ported to Firefox with the help of the author, Jason Evans.
Posted Mar 13, 2008 14:31 UTC (Thu)
by tialaramex (subscriber, #21167)
Posted Mar 13, 2008 14:52 UTC (Thu)
by tialaramex (subscriber, #21167)
Posted Mar 13, 2008 7:00 UTC (Thu)
by rsidd (subscriber, #2582)
Posted Mar 13, 2008 13:09 UTC (Thu)
by nix (subscriber, #2304)
Posted Mar 13, 2008 15:31 UTC (Thu)
by jasone (subscriber, #2423)
jemalloc
Well, Stuart is careful to say that jemalloc did well for Firefox specifically rather than just
saying that it's better for everyone. He doesn't actually give any GNU libc numbers for
comparison (unlike for Windows). Google and many others have provided alternative allocators;
it wouldn't be fair to characterise any of their differences as "drawbacks" necessarily, but
this is definitely a case where there are tradeoffs to be made.
Some allocators concentrate on the tiny allocations needed for all those variable length HTML
elements, filenames, email addresses, etc. They may be very efficient at that, but in doing so
use a few more CPU cycles when you free() an item, or slowly fragment your address space over
time, so that they're not acceptable in a program which runs for days or weeks at a time.
In some applications the allocator must absolutely be fast, above almost anything. Google, for
example, offer a really fast allocator but it will never return heap memory to the OS.
Allocate some memory once, and you're stuck with it. You can re-use it, split it up, maybe
merge it with another allocation, but the Google allocator won't ever give it back for use by
other applications. If you've just visited myhugeimages.example.com and Firefox won't give
back the 600MB of memory it used rendering the site, this would be unacceptable.
In other applications, as in this Firefox example, fragmentation is determined to be a problem
and the application is long-lived. There it's OK to be a little slower, so long as you don't
waste memory or fragment the address space too much, and so long as large allocations are
returned when no longer needed (avoiding fragmentation actually makes that easier).
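As a rough illustration of what "returning large allocations" can look like at the system-call level, here is a minimal sketch assuming a POSIX system with anonymous mmap. The helper names big_alloc and big_free are made up for this example, and nothing here is taken from jemalloc or Firefox; it only shows that a block mapped directly from the OS can be handed back in full with munmap.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: obtain len bytes directly from the OS.
 * Returns NULL on failure. */
void *big_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: unmap the block so the OS can reuse the memory.
 * The caller must pass the same length it asked for. Returns 0 on success. */
int big_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

The cost is a system call per allocation and per free, which is why this sort of thing only makes sense for large blocks.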
Thread safety is another variable. Of course your allocator will run faster if it doesn't
worry about this at all, but some applications must have a thread safe allocator. Some will
even want to allocate memory in one thread and then later free it from a different thread.
The system / libc provided malloc is intended only to be a good default. If your application
is important enough to warrant writing or maintaining your own allocator, as has effectively
happened with jemalloc in Firefox, then you can stop using the C library's allocator and it's
smiles all round. GNU libc intentionally emits weak symbols for the allocator, making it easy
to choose your own instead.
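For readers who haven't met weak symbols, here is a minimal sketch of the general mechanism, assuming GCC or Clang (the weak attribute is a compiler extension, not standard C). The name app_alloc is hypothetical and is not a glibc symbol, and this is not a copy of how glibc's own source is written; it only shows a default that a program can replace at link time.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Weak default definition. If another object file linked into the program
 * provides a normal (strong) app_alloc, the linker picks that one and this
 * fallback is ignored. app_alloc is a hypothetical name for illustration.
 */
__attribute__((weak)) void *app_alloc(size_t size)
{
    return malloc(size);   /* default: defer to the C library */
}

/* Callers just use app_alloc and never need to know which one was linked. */
char *make_buffer(size_t size)
{
    return app_alloc(size);
}
```

Adding a second source file that defines a non-weak app_alloc would swap the allocator for every caller without touching their code.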
OK, on a decent modern OS "never" means not until the program exits. If your OS doesn't track
heap allocations made by userspace, and so leaks RAM when apps crash, well - here's a nickel,
get yourself a real operating system, kid.
jemalloc
Google has a ton of libraries, frameworks, general tools and other stuff, either developed as
an improvement to some in-house project and then released afterwards, or just created as a
service to other developers.
jemalloc
To quote the article, "It was a huge effort resulting in Jason doubling the number of lines of
code in jemalloc over a 2 month period."
If Stuart is misdescribing the situation, argue with him, not me.
jemalloc
... and for the avoidance of doubt, I didn't intend to hide Jason's credit or imply that
Mozilla.org people are doing all the hard work, but only that this sort of thing doesn't come
free. Firefox developers decided that it was worth it, but for most Free Software projects,
it's going to make sense to use the system allocator unless it's absolutely dire.
jemalloc is the default on FreeBSD since 7.0 (replacing the older phkmalloc, which was also widely used outside FreeBSD, but had scaling issues). As for why it isn't used in glibc, you should ask the maintainers. It could be that it's too new; it could be NIH.
jemalloc
Also, `just rewrite it dammit' is not always the right answer. The malloc() in glibc has a lot
of arcane knowledge and nifty speedups for obscure situations in it: replacing it wholesale
would throw all that knowledge away. A malloc() which works slightly better in most
circumstances but drastically worse in some is perhaps not an improvement (not that this is
necessarily true of jemalloc: I haven't checked, but *consistent* behaviour has always been a
major focus of glibc malloc()).
jemalloc
Indeed, arbitrarily replacing the system malloc implementation is dangerous, because it will
cause memory layout changes which are almost certain to expose latent application bugs. An
allocator has to be substantially better to warrant such a change.
glibc uses ptmalloc, which is a multi-threaded derivative of dlmalloc. It works quite well
for most typical uses, which is rather impressive, given the difficulty of tuning extent-based
allocators. The only weak spots I've observed are its fragmentation behavior for some
allocation patterns (such as lots of variation in allocation sizes, as for many C++
applications), and threading scalability above ~6 CPUs. In practice, I'm not convinced that a
switch to a different allocator is warranted yet for Linux, because the existing allocator
doesn't fail horribly for any common uses. When we all have laptops with 16+ CPUs, glibc's
malloc may need some tender loving care, but we're not quite there yet.
With regard to jemalloc in Firefox 3, the primary reason it was worthwhile was the horrible
fragmentation behavior of the malloc implementations on Windows. The Linux version of Firefox
also benefitted, but not to a degree that would have by itself warranted the serious effort we
had to expend.
Your comment about *consistent* behavior is spot-on. Early versions of jemalloc used
extent-based allocation, and I found it essentially impossible to make the algorithms work
consistently across a wide range of applications. This is why jemalloc now uses what people
typically call zone allocation. The behavior is much easier to predict and reason about, thus
avoiding extreme fragmentation due to complex interactions with allocation patterns.
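To make the size-class idea concrete, here is a minimal sketch of the rounding step such an allocator performs on each request. The power-of-two classes and the 16-byte minimum are assumptions chosen for brevity, not jemalloc's actual class layout, and size_class is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical smallest class; an assumption, not jemalloc's real value. */
#define MIN_CLASS ((size_t)16)

/*
 * Round a request up to the next power-of-two size class.
 * Returns 0 if the request is too large to round without overflow.
 */
size_t size_class(size_t request)
{
    size_t cls = MIN_CLASS;

    if (request > SIZE_MAX / 2 + 1)
        return 0;
    while (cls < request)
        cls <<= 1;
    return cls;
}
```

Because every request in a class gets the same slot size, a freed slot can satisfy any later request in that class, which is what makes the behaviour easier to predict than splitting and merging arbitrary extents.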
