
Other memory management work

Lest one think that tweaking rmap is all that is happening in the memory management world, note that a great deal of code making big changes is currently circulating, and it has been finding its way into Linus's kernel.

For example, 2.5.34 includes Patricia Gaughen's discontiguous memory patch, which is aimed at the needs of large NUMA systems. On such systems, you no longer just have a simple array of memory; instead, the system's RAM is broken up into zones, each of which is attached to a particular NUMA node. Memory accesses within a node are faster than cross-node references, so the kernel needs to know where any given page resides. The physical address space on these systems can also have holes between one node's zone and the next.

The discontiguous memory patch does away with the classic mem_map array, which contained one struct page for each physical page on the system. The memory map is now split into separate, per-node maps, and all references to mem_map in the kernel have been changed. Rather than dealing with simple indexes into mem_map, the kernel now works with page frame numbers; an old reference to mem_map+i is now pfn_to_page(i). For the most part, code which did not access mem_map directly should require no changes in response to the discontiguous memory patches. But there will be exceptions...
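
To make the idea of per-node maps concrete, here is a minimal user-space sketch of a page frame number lookup across two nodes with a hole between them. It is not the kernel's implementation: the names toy_pfn_to_page, toy_pfn_to_nid, and struct node_map, the two-node layout, and the four-page node size are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct page; the real one has many fields. */
struct page {
    unsigned long flags;
};

/* Hypothetical per-node descriptor: a PFN range plus that node's own map. */
struct node_map {
    unsigned long start_pfn;  /* first page frame number on this node */
    unsigned long nr_pages;   /* number of page frames on this node */
    struct page *map;         /* this node's private array of struct page */
};

#define NR_NODES       2
#define PAGES_PER_NODE 4

static struct page node0_pages[PAGES_PER_NODE];
static struct page node1_pages[PAGES_PER_NODE];

/* Node 1 starts at PFN 16, so PFNs 4..15 form a hole with no struct page. */
static const struct node_map nodes[NR_NODES] = {
    { 0,  PAGES_PER_NODE, node0_pages },
    { 16, PAGES_PER_NODE, node1_pages },
};

/* Returns the node owning pfn, or -1 if pfn falls into a hole. */
static int toy_pfn_to_nid(unsigned long pfn)
{
    for (int i = 0; i < NR_NODES; i++) {
        if (pfn >= nodes[i].start_pfn &&
            pfn - nodes[i].start_pfn < nodes[i].nr_pages)
            return i;
    }
    return -1;
}

/* Returns the struct page for pfn, or NULL if pfn falls into a hole. */
static struct page *toy_pfn_to_page(unsigned long pfn)
{
    int nid = toy_pfn_to_nid(pfn);

    if (nid < 0)
        return NULL;
    return &nodes[nid].map[pfn - nodes[nid].start_pfn];
}

/* Small self-check: lookups land in the right node's map. */
static void toy_memmap_demo(void)
{
    assert(toy_pfn_to_page(1) == &node0_pages[1]);
    assert(toy_pfn_to_page(16) == &node1_pages[0]);
    assert(toy_pfn_to_page(8) == NULL);   /* inside the hole */
}
```

The point of the sketch is that a bare index into one global array no longer works once each node carries its own map: the lookup must first find the owning node, and a page frame number inside a hole has no struct page at all.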

Andrew Morton's "-mm" patches have become the staging area for memory management changes. The current patch as of this writing (2.5.34-mm1) contains a long list of other changes, including:

  • Directory indexes for the ext3 filesystem (by Daniel Phillips). Calling this one "memory management" is a bit of a stretch, of course, but it is a definite performance improver when large directories are used.

  • A patch by William Lee Irwin which lets the i386 architecture maintain page tables in high memory.

  • A change to the readv and writev system calls (by Janet Morgan) which submits all segments for I/O in parallel; this patch greatly speeds up direct disk I/O operations.

  • Rohit Seth's large page patch for the i386 architecture (covered here last month).

  • A patch which allows copy_from_user and copy_to_user to be called in atomic (non-blocking) situations. If the copy operation encounters a page fault, it simply fails.

  • ...and many other changes.
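
For readers who have not used the vectored I/O calls mentioned in the list above, the following sketch shows what a "segment" is from user space: one writev call hands the kernel an array of separate buffers. It uses a pipe purely so that the example is self-contained; it does not exercise the direct disk I/O path that the patch speeds up, and the function name gather_write_demo is made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Writes three separate buffers into a pipe with a single writev() call,
 * then reads the result back into out as a NUL-terminated string.
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t gather_write_demo(char *out, size_t outlen)
{
    char a[] = "seg1 ";
    char b[] = "seg2 ";
    char c[] = "seg3";
    /* Each iovec entry is one segment: a base pointer and a length. */
    struct iovec iov[3] = {
        { .iov_base = a, .iov_len = strlen(a) },
        { .iov_base = b, .iov_len = strlen(b) },
        { .iov_base = c, .iov_len = strlen(c) },
    };
    int fds[2];
    ssize_t written, got;

    assert(out != NULL && outlen > 0);

    if (pipe(fds) != 0)
        return -1;

    /* One system call submits all three segments. */
    written = writev(fds[1], iov, 3);
    close(fds[1]);
    if (written < 0) {
        close(fds[0]);
        return -1;
    }

    got = read(fds[0], out, outlen - 1);
    close(fds[0]);
    if (got < 0)
        return -1;
    out[got] = '\0';
    return got;
}
```

Called with a 32-byte buffer, the function fills it with the three segments joined in order. How the kernel schedules the underlying I/O for each segment is invisible at this level, which is why a change like the one above can speed things up without any change to applications.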

One interesting side result from work like the atomic copy_*_user functions and the preemptible kernel is a formalization of just when the kernel is performing an atomic operation. Code in the 2.4 (and prior) kernel could check for certain situations where atomic operation was required, such as when servicing an interrupt. In 2.5, other atomic situations (e.g. holding a spinlock) are tracked as well, and it is easy for code that needs to do so to say "don't interrupt me or sleep now." The result should be more explicit code and fewer bugs.
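
The combination of tracked atomic sections and a copy that fails instead of faulting can be sketched in user space as follows. This is a toy model, not kernel code: the toy_ names, the single global counter, and the "resident" flag standing in for a page fault are all assumptions for illustration. The one convention borrowed from the kernel is that the copy returns the number of bytes it could not copy, as copy_from_user does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical nesting counter: nonzero means "must not sleep now". */
static int toy_atomic_depth;

static void toy_enter_atomic(void)
{
    toy_atomic_depth++;
}

static void toy_leave_atomic(void)
{
    assert(toy_atomic_depth > 0);
    toy_atomic_depth--;
}

static bool toy_in_atomic(void)
{
    return toy_atomic_depth != 0;
}

/* Toy "user buffer"; resident == false models a page that would fault. */
struct toy_user_buf {
    const char *data;
    size_t len;
    bool resident;
};

/* Stand-in for the fault handler, which in a real kernel may sleep. */
static void toy_fault_in(struct toy_user_buf *src)
{
    src->resident = true;
}

/*
 * Copies n bytes from src to dst. Returns the number of bytes NOT copied:
 * 0 on success, n if a fault was hit while in an atomic section.
 */
static size_t toy_copy_from_user(char *dst, struct toy_user_buf *src, size_t n)
{
    assert(n <= src->len);

    if (!src->resident) {
        if (toy_in_atomic())
            return n;       /* cannot sleep here: fail without copying */
        toy_fault_in(src);  /* outside atomic context: handle the fault */
    }
    memcpy(dst, src->data, n);
    return 0;
}
```

A caller holding a (toy) lock would wrap its work in toy_enter_atomic() and toy_leave_atomic(); if the copy reports uncopied bytes, it can drop the lock, retry from a context where sleeping is allowed, and then take the lock again.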



Other memory management work

Posted Sep 12, 2002 4:13 UTC (Thu) by cyanide (guest, #2236) [Link]

It's interesting to see these "big iron" patches still coming in. I seem to recall that LWN speculated a while ago that companies like IBM might end up forking, given that large-scale system design is often in conflict with Linus' push to get the kernel doing interesting things on small devices.

I wonder if anyone can tell me how much of an issue this conflict is now, and whether the threat of a big iron fork is real or not?

Cheers,

Oliver White

Other memory management work

Posted Sep 12, 2002 7:35 UTC (Thu) by Arithon (subscriber, #3647) [Link]

I saw an article which might help on LWN a few weeks ago:
http://lwn.net/Articles/4526/

It talks about how the kernel could die a horrible death if all was sacrificed at the altar of scalability, and the locking was granularised to the Nth degree. And there's a reference to "cache-coherent clusters" which are an interesting possibility:
http://lwn.net/Articles/4536/

Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds