
The merging of anon_vma and 4G/4G

Immediately prior to releasing 2.6.7-rc1, Linus merged the full remaining set of virtual memory patches from Andrea Arcangeli and Hugh Dickins, including the anon_vma code. This action has raised eyebrows in some quarters; some developers had been under the impression that 2.6 was a stable kernel series. Nobody seems to doubt that the object-based reverse mapping code is a good idea in the long run, but merging it now strikes some developers as unlikely to increase the stability of the 2.6 kernel in the near future.

Linus defends the change in this way:

It's not "fundamental", in that the reverse mapping is still done. It's just done in a slightly different way. Going to rmap was a _fundamental_ change to how we did VM. In contrast, this was just an "implementation detail".

Most "implementation details" fit into rather less than 40 individual patches, do not involve difficult special cases (such as making all uses of mremap() work correctly), and avoid making significant changes to core parts of the virtual memory subsystem. That said, one should note that the core decision-making VM code has not been changed; the algorithm for choosing pages to move into and out of memory is the same as before. It is also notable that there have been almost no VM-related problem reports since 2.6.7-rc1 was released. This particular change may just work out in the short term after all.

A related topic is the 4G/4G patch, which separates kernel and user space entirely so that each can make full use of the 4G virtual address space on 32-bit systems. This patch has been considered for merging for some time, but has never quite found its way in. Most developers see it as an ugly hack (though, perhaps, a necessary one), and there is fear of the (possibly overstated) performance overhead that the 4G/4G mode imposes. Even so, some people wonder when this patch might be merged.
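The address-space arithmetic behind the name can be made concrete with a small sketch. It assumes the conventional layout in which user space and the kernel share a single 4 GiB address space split 3:1, and it treats the 4G/4G case as idealized: a real implementation presumably has to keep some small area mapped in both spaces, so the 4 GiB figures are upper bounds. The function names are made up for this illustration and do not come from the patch.

```python
GIB = 1 << 30
ADDRESS_SPACE = 1 << 32  # 32-bit virtual addresses cover 4 GiB in total


def shared_split(user_gib=3):
    """Conventional layout: user and kernel share one 4 GiB address space.

    The 3 GiB default for user space is an assumption (the usual 3:1 split).
    """
    user = user_gib * GIB
    kernel = ADDRESS_SPACE - user
    return user, kernel


def separate_spaces():
    """4G/4G idea: user and kernel each get an address space of their own.

    Idealized; any region that must stay mapped in both spaces is ignored.
    """
    return ADDRESS_SPACE, ADDRESS_SPACE


if __name__ == "__main__":
    layouts = [("3G/1G", shared_split()), ("4G/4G", separate_spaces())]
    for name, (user, kernel) in layouts:
        print(f"{name}: user {user // GIB} GiB, kernel {kernel // GIB} GiB")
```

The gain is not free: with separate address spaces, every entry into the kernel requires switching between them, which is where the feared overhead comes from.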

The answer seems to be "never, if at all possible." The motivations behind this patch are (1) to make more kernel-space low memory available on large-memory systems, and (2) to provide a larger virtual address space for applications. The first reason may well have just become moot; the anon_vma patch was merged because, among other things, it significantly reduces the amount of low memory used by the VM subsystem. The initial reports suggest that the current VM code handles 32GB of memory nicely on 32-bit systems. Since 32-bit systems rarely come more heavily loaded than that (so far), it is thought that the VM has gotten as good as it needs to be on those systems.

The real hope, however, is that a serious transition to 64-bit systems will happen before too long. The x86 architecture has been stretched much further than anybody would have expected it to go, and x86_64 makes the transition so easy that there is very little reason not to do it. The 4G/4G patch is likely to hang around (and be included by some distributors) for some time; if nothing else, all of the currently-deployed monster x86 systems are likely to go on running for a while yet. But the mainline kernel may just get away with saying "switch to 64-bit" and leaving that particular patch out.



The merging of anon_vma and 4G/4G

Posted May 27, 2004 15:08 UTC (Thu) by seanegan (subscriber, #15672) [Link]

> and there is fear of the (possibly overstated) performance overhead that the 4G/4G mode imposes.

Where I work we have tested the 4G/4G patch in a 2.4 kernel. Our app is a network server running in a poll()-based event loop; it needs a large data structure in memory and receives and sends small message requests about that large data set. Bottom line, our app is VERY syscall dominated. To get more user space memory we looked into the 4G/4G patch.

The performance impact was enormous: a 30-50% loss in requests per second. Swapping memory spaces for every syscall hurt our app so badly that the greater memory space was no longer worth having.

The merging of anon_vma and 4G/4G

Posted May 27, 2004 17:54 UTC (Thu) by ncm (subscriber, #165) [Link]

As a counterpoint, my employer runs farms that are overwhelmingly compute-bound. (Actually, much of their time is spent waiting on memory bus latency, because there is no working set.) 4G/4G would have no effect on performance, but would allow us to continue using the same hosts longer, as the data set continues to grow. Replacing those hundreds of hosts with Opterons will be a big effort and a big expense. It will happen eventually, but certainly not all at once. Anything that stretches the lifetime of the old hosts is a great boon.

The merging of anon_vma and 4G/4G

Posted Jun 3, 2004 22:37 UTC (Thu) by bronson (subscriber, #4806) [Link]

This is a really rare situation. In the interest of reduced complexity, I think it makes sense for Linus to not include it in the main kernel. Those people that need it, such as yourself, probably would not mind applying it manually...?

The merging of anon_vma and 4G/4G

Posted May 30, 2004 19:36 UTC (Sun) by garloff (subscriber, #319) [Link]

Two points:
(a) It is no miracle that nobody complains about the memory management
in 2.6.7-rc1. A large distributor has based their 2.6 kernel
on Andrea's patches and a lot of testing has happened on the
kernel with various desktop and enterprise workloads.
These tests have resulted in a number of bug reports, but they
have been analyzed and addressed quickly.
The good functioning is the result of the good design, the deep
understanding of MM by Andrea and the heavy testing.
(b) If 4:4 can be avoided, it should.
MySQL has shown some numbers at the user conference that showed
75% higher performance for the small working set when comparing
the RHAS smp(3:1) to the hugemem(4:4) kernels. For the large
working set, the 3:1 kernel was still 19% faster.
With the objrmap, anon_vma and prio_tree work, we can now avoid 4:4.

The merging of anon_vma and 4G/4G

Posted Jun 3, 2004 15:59 UTC (Thu) by salex (guest, #4814) [Link]

> The real hope, however, is that a serious transition to 64-bit systems will happen before too long.

I'd suggest looking at the history of the PDP-11 for some idea of what "too long" might mean. Despite the various efforts that DEC made (notably compatibility mode on the VAX), the 11s continued for many years and continued to have to deal with the limitations of a 16-bit address space. I certainly hope Linux doesn't ever have to deal with separate I+D or overlays, but if Linux is going to continue to keep an eye on making old and cheap hardware run fast, I would expect that the developers will be looking at how to shoehorn more into a 32-bit address space for years at least.

Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds