Indeed, arbitrarily replacing the system malloc implementation is dangerous, because it will
cause memory layout changes that are almost certain to expose latent application bugs. An
allocator has to be substantially better to warrant such a change.
glibc uses ptmalloc, which is a multi-threaded derivative of dlmalloc. It works quite well
for most typical uses, which is rather impressive, given the difficulty of tuning extent-based
allocators. The only weak spots I've observed are its fragmentation behavior for some
allocation patterns (such as lots of variation in allocation sizes, as for many C++
applications), and threading scalability above ~6 CPUs. In practice, I'm not convinced that a
switch to a different allocator is warranted yet for Linux, because the existing allocator
doesn't fail horribly for any common uses. When we all have laptops with 16+ CPUs, glibc's
malloc may need some tender loving care, but we're not quite there yet.
With regard to jemalloc in Firefox 3, the primary reason it was worthwhile was the horrible
fragmentation behavior of the malloc implementations on Windows. The Linux version of Firefox
also benefited, but not to a degree that would by itself have warranted the serious effort we
had to expend.
Your comment about *consistent* behavior is spot-on. Early versions of jemalloc used
extent-based allocation, and I found it essentially impossible to make the algorithms work
consistently across a wide range of applications. This is why jemalloc now uses what people
typically call zone allocation. The behavior is much easier to predict and reason about, thus
avoiding extreme fragmentation due to complex interactions with allocation patterns.
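
To make the contrast concrete, here is a minimal sketch of the size-class rounding that zone-style allocation relies on. The class table and the names size_class_index and size_class_round are hypothetical, chosen only for illustration; they are not jemalloc's actual size classes or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size classes, for illustration only.
 * These are NOT jemalloc's real classes. */
static const size_t size_classes[] = {16, 32, 48, 64, 96, 128, 192, 256};
#define NUM_CLASSES (sizeof(size_classes) / sizeof(size_classes[0]))

/* Return the index of the smallest class that can hold `size` bytes,
 * or -1 if the request is too large for this small table. */
static int size_class_index(size_t size)
{
    for (size_t i = 0; i < NUM_CLASSES; i++) {
        if (size <= size_classes[i])
            return (int)i;
    }
    return -1;
}

/* Return the number of bytes actually reserved for a request.
 * The caller must pass a size that fits one of the classes. */
static size_t size_class_round(size_t size)
{
    int idx = size_class_index(size);
    assert(idx >= 0);
    return size_classes[idx];
}
```

In this sketch a 100-byte request always occupies a 128-byte slot, so once it is freed, any later request of 97 to 128 bytes can reuse that slot exactly. The cost is some internal padding per allocation, but freed regions never come in odd sizes that depend on the history of neighboring allocations, which is what makes the behavior easier to predict.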