KHB: Transparent support for large pages

Posted Jun 22, 2006 2:58 UTC (Thu) by Thalience (subscriber, #4217)
Parent article: KHB: Transparent support for large pages

I did enjoy reading it, and look forward to the next installment!


KHB: Transparent support for large pages

Posted Jun 23, 2006 21:37 UTC (Fri) by smooth1x (subscriber, #25322)

I work as a DBA for a MAJOR company.

We are allocating multi-GB shared memory segments for our databases
(we recently broke the 4GB mark for a single shared memory segment!). Large pages for large (>1GB?) shared memory allocations are all we need.

And we want these to be what Solaris calls ISM (intimate shared
memory), i.e. shared page tables.

Oh, and pinned into physical memory (non-pageable and NOT looked
at by the paging/swapping code).

Dave.

KHB: Transparent support for large pages

Posted Jun 24, 2006 1:16 UTC (Sat) by dododge (subscriber, #2870)

> (we have broken the 4GB for 1 shared memory segment size recently!).

Just FYI there shouldn't be much trouble with mappings that size. On a system with 96GB of RAM, I regularly do single 80GB shared mappings and I've managed to push it as high as 90GB keeping it all in-core. This system is actually a small configuration for the hardware and it wouldn't surprise me if people with bigger machines are doing mappings in the hundreds of gigabytes.

One limit you can run into is that the POSIX shm_open (and SVR4 shmget?) is typically implemented by using a file in /dev/shm, and the tmpfs mounted there is usually sized to half your RAM. If you want to go larger, you can do things like mount a larger tmpfs, or mmap some other file or block device (for example a striped LVM volume), or use hugetlbfs instead of tmpfs.
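As a rough illustration of the sizing limit above, here is a minimal sketch that asks the filesystem how much room is left before a segment is created. The helper name `tmpfs_room` is made up for this example, and the path is whatever tmpfs you intend to use (typically /dev/shm).

```python
import os

def tmpfs_room(path, wanted_bytes):
    """Return (fits, available_bytes) for the filesystem holding `path`."""
    st = os.statvfs(path)
    # f_bavail counts blocks available to unprivileged users, in f_frsize units.
    available = st.f_bavail * st.f_frsize
    return wanted_bytes <= available, available
```

For instance, `tmpfs_room("/dev/shm", 50 * 2**30)` would report whether 50 GiB still fits right now. It is only a snapshot: other users of the same tmpfs can consume the space afterwards.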

Another thing about /dev/shm is that it won't stop you creating and mapping a sparse file bigger than it can actually hold. I don't know if shm_open checks for this. I found out about it the hard way -- I mapped a new 50GB file in a 48GB tmpfs and had the application bus error when /dev/shm ran out of pages a few hours later.
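One way to make that failure show up at creation time rather than hours later is to reserve the storage up front. The sketch below assumes a kernel and filesystem where `posix_fallocate` really allocates the pages (tmpfs gained fallocate support well after this thread was written). The function name `create_backed_mapping` is hypothetical.

```python
import mmap
import os

def create_backed_mapping(path, size):
    """Create `path`, reserve `size` bytes of real storage, and map it shared."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        # Unlike ftruncate(), posix_fallocate() allocates the blocks now, so a
        # too-small filesystem raises OSError (ENOSPC) here instead of the
        # process getting SIGBUS when a page is first touched.
        os.posix_fallocate(fd, 0, size)
        return mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)
    except OSError:
        # Do not leave a half-allocated file behind.
        os.unlink(path)
        raise
    finally:
        # The mapping stays valid after the descriptor is closed.
        os.close(fd)
```

The trade-off is that the whole segment is charged against the tmpfs immediately, so the file is no longer sparse.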

The biggest issue we have is simply getting the data in and out of RAM, especially if the shared memory is directly backed by disk. Imagine hitting control-C in an application and having to wait 20-30 minutes for the shell prompt to return, as the OS flushes a zillion pages back to the drive(s).

> Large pages for large (>1GB?) shared memory allocations is all we need.
>
> Oh, and pinned into physical memory (non-pageable and NOT looked at by the paging/swapping code).

I think hugetlbfs will do this for you today, if you want it immediately.
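hugetlbfs draws its pages from a pool the administrator reserves in advance, so it is worth checking that the pool is big enough for the segment. A small sketch, assuming the usual `HugePages_*` and `Hugepagesize` lines in /proc/meminfo. Both function names are invented for this example.

```python
def hugepage_pool(meminfo_text):
    """Extract the huge page counters from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            # Values look like "512" or "2048 kB"; keep the leading integer.
            fields[key] = int(rest.split()[0])
    return fields

def pages_needed(segment_bytes, hugepagesize_kb):
    """Round a segment size up to a whole number of huge pages."""
    page_bytes = hugepagesize_kb * 1024
    return -(-segment_bytes // page_bytes)  # ceiling division
```

Feeding it `open("/proc/meminfo").read()` and comparing `pages_needed(...)` against `HugePages_Free` gives a quick sanity check before the database tries to attach.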

You can also use mlock to keep things resident, but be aware that the last time I tried using it (admittedly it was a 2.4 kernel), it instantly dirtied all of the pages in the mapping. So when the mapping (backed by a disk file) was then unlocked, it insisted on flushing the entire thing even if it hadn't been modified, and the flushing was done single-threaded in the kernel. For a large mapping, this can take a long time.
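On 2.6 and later kernels, an unprivileged process can only mlock up to its RLIMIT_MEMLOCK, so a check along these lines can save a failed call. This is only a sketch of the limit check, not of mlock itself, and the helper names are hypothetical.

```python
import resource

def lock_allowed(nbytes, soft_limit):
    """True if locking `nbytes` stays within an RLIMIT_MEMLOCK soft limit."""
    return soft_limit == resource.RLIM_INFINITY or nbytes <= soft_limit

def current_memlock_limit():
    """Soft RLIMIT_MEMLOCK in bytes, or resource.RLIM_INFINITY if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return soft
```

A process with CAP_IPC_LOCK is not bound by this limit, so treat a `False` result as a hint rather than a verdict.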

KHB: Transparent support for large pages

Posted Jun 24, 2006 11:49 UTC (Sat) by nix (subscriber, #2304)

shmget() and SysVIPC are implemented differently and are not constrained by the size of /dev/shm. (This need for a unique shared-with-nothing implementation is *another* reason to hate sysvipc!)
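The limits that do apply to shmget() are the kernel.shmmax (bytes per segment) and kernel.shmall (total pages) sysctls. A hedged sketch of checking a planned segment against them follows. The function names and the 4096-byte default page size are assumptions for illustration.

```python
def read_sysctl_int(path):
    """Read a single integer such as /proc/sys/kernel/shmmax."""
    with open(path) as f:
        return int(f.read().split()[0])

def sysv_segment_fits(nbytes, shmmax, shmall_pages, page_size=4096):
    """Necessary (not sufficient) check against the SysV shm sysctls."""
    pages = -(-nbytes // page_size)  # ceiling division
    # shmall is system-wide, so segments that already exist also count.
    return nbytes <= shmmax and pages <= shmall_pages
```

On a live system the inputs would come from `read_sysctl_int("/proc/sys/kernel/shmmax")` and `read_sysctl_int("/proc/sys/kernel/shmall")`.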
