Both transparent huge page support and compaction (without which transparent huge pages cannot be done reliably) are good things. We may be able to merge some of our functionality with those over time. We are certainly large consumers of 2MB pages, so we see compaction as very important, and with it we may be able to provide transparent, reliable dynamic allocation of 2MB pages in our subsystem. Without compaction, pages must be explicitly funded and taken away from the kernel upfront, since fragmentation may make it impossible to get hold of them later.
Due to various needs, though, our current implementation uses a separate physical memory management subsystem. The main, overriding need for this is the avoidance of unnecessary TLB invalidates when physical pages are unmapped and mapped [a bit later] within the same process - this is the most frequent pattern across the pauseless collector and the application mutators, and addressing it pretty much requires process-local free lists, which we use.
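To make the process-local free list idea a bit more concrete, here is a minimal user-space sketch in C. It is a toy simulation, not our kernel code: the names `proc_pages`, `page_unmap`, `page_map` and `LOCAL_CAP` are made up for illustration, and it assumes (as a simplification) that an invalidate only needs to be counted when a page leaves the process for a global pool.

```c
#include <assert.h>
#include <stddef.h>

#define LOCAL_CAP 64  /* hypothetical capacity of the per-process free list */

/* Hypothetical per-process state: a LIFO stack of free page frame numbers. */
struct proc_pages {
    unsigned long free_pfn[LOCAL_CAP];
    size_t nfree;
    unsigned long tlb_invalidates;  /* simulated invalidates, counted only
                                       when a page leaves this process */
};

/*
 * Unmap a page. If the local list has room, the page stays with the
 * process and no invalidate is counted (assumption: it will be remapped
 * by the same process a bit later). Returns 1 if kept locally, 0 if the
 * page was released to the (not modelled) global pool.
 */
int page_unmap(struct proc_pages *p, unsigned long pfn)
{
    assert(p != NULL);
    if (p->nfree < LOCAL_CAP) {
        p->free_pfn[p->nfree++] = pfn;
        return 1;
    }
    p->tlb_invalidates++;  /* page leaves the process: pay for the invalidate */
    return 0;
}

/*
 * Map a page by reusing the most recently freed local page. Returns 1 and
 * stores the frame number in *pfn, or 0 if the local list is empty (a real
 * allocator would then fall back to a global pool).
 */
int page_map(struct proc_pages *p, unsigned long *pfn)
{
    assert(p != NULL && pfn != NULL);
    if (p->nfree == 0)
        return 0;
    *pfn = p->free_pfn[--p->nfree];
    return 1;
}
```

In this toy model, an unmap followed by a map in the same process just pushes and pops a frame number and never touches the invalidate counter; the counter only moves when the local list overflows and a page has to go back to the global pool.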
WRT VMs (e.g. running this kernel as a guest on top of KVM or VMware), we see those as common targets. Because the rate of virtual memory mapping changes in our system is quite substantial (well over 1000x compared to the typical OS and application loads), we care a lot about the cost of those manipulations in virtualized environments. Luckily EPT (Intel) and NPT/RVI (AMD) features in all modern x86-64 machines practically eliminate this cost, taking the hypervisor out of the business of intercepting, tracking, and applying guest-virtual to guest-physical mappings. Without those HW-assisted features, applying changes at the rate we do would certainly "hurt" on virtualized systems.
We'll also be able to play with memory ballooning, and we've got some very good uses for it, but the current implementation we've posted doesn't use it. When we do, we'll be looking to deflate balloons at sustained rates of several GBs per second, which will probably put some serious stress on current host implementations. This is one of the items at the hypervisor level that we're going to be playing with as part of the Managed Runtime Initiative [large page and high sustained rate ballooning]. At first glance, KVM's handling of ballooning seems like it can easily be extended to accommodate what we'll need.