
Prerequisites for large anonymous folios

Posted Sep 9, 2023 9:38 UTC (Sat) by ibukanov (subscriber, #3942)
In reply to: Prerequisites for large anonymous folios by pbonzini
Parent article: Prerequisites for large anonymous folios

Hm, is it because applications mmap a lot of small regions? Or is the issue the increased Linux file cache usage, which has page-size granularity? If the latter, the Linux file cache has to be fixed not to depend on that.



Prerequisites for large anonymous folios

Posted Sep 9, 2023 11:52 UTC (Sat) by kleptog (subscriber, #1183) [Link] (3 responses)

I imagine a chunk of this is due to the fact that ELF binaries simply mmap() all the libraries they use and let the kernel fault in the pages they actually need. You also have data files like libicudata and locale files. If you're only using small parts of a library, this is fairly efficient with a 4k page size, but you can imagine that with a 64k page size the overhead balloons. The alignment requirement doesn't help either.
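To make the demand-faulting point concrete, here's a minimal sketch (my own illustration, not something from the article) that maps a library read-only roughly the way ld.so does, touches a single byte, and asks mincore() how many of its pages actually became resident. The library path is just an assumed example; point it at whatever is installed on your system.

```c
/* Sketch: demand-faulted library mapping. The library path below is an
 * assumption; substitute any shared object present on your system. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void)
{
    const char *lib = "/usr/lib/x86_64-linux-gnu/libssl.so.3"; /* assumed path */
    long pagesz = sysconf(_SC_PAGESIZE);

    int fd = open(lib, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Like the dynamic loader, map the whole file; nothing is read
     * until pages are actually faulted in. */
    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    /* Touch one byte so one page (plus any readahead) faults in. */
    volatile char c = *(char *)map;
    (void)c;

    size_t npages = (st.st_size + pagesz - 1) / pagesz;
    unsigned char *vec = malloc(npages);
    if (vec && mincore(map, st.st_size, vec) == 0) {
        size_t resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;
        printf("page size %ld: %zu of %zu pages resident\n",
               pagesz, resident, npages);
    }
    free(vec);
    return 0;
}
```

The resident count stays small with 4k pages; the same access pattern on a kernel with a larger page size has to pull in correspondingly bigger chunks.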

This also suggests you could become more memory efficient by actually adjusting the position of functions in libraries. You'd need to collect a large sample of usage at the function level and use that to control the layout of functions in the resulting binaries/executables. For example, figure out which parts of libssl are actually required for TLS 1.3 and arrange them all together. This doesn't buy much at a 4k page size, but it helps a lot with bigger page sizes. And you'd have to do it at the distro level to be effective.

Prerequisites for large anonymous folios

Posted Sep 9, 2023 14:27 UTC (Sat) by walters (subscriber, #7396) [Link] (2 responses)

Prerequisites for large anonymous folios

Posted Sep 10, 2023 21:08 UTC (Sun) by kleptog (subscriber, #1183) [Link] (1 response)

Yes, that would do it. As I suspected, the actual technical part of optimising the binaries is done; the problem would be getting decent samples from all the popular applications. For example, it appears to require specially compiled binaries to work (--emit-relocs), so you can't just ask a random group of people to run a sampling profiler in the background for a day. It also seems aimed at optimising individual binaries, whereas I think shared libraries are where a lot of the gains could be made.

But hey, all it takes is one person with enough will & skills and it might happen.

Prerequisites for large anonymous folios

Posted Sep 11, 2023 14:45 UTC (Mon) by aaupov (guest, #166901) [Link]

> Yes, that would do it. As I suspected, the actual technical part of optimising the binaries is done; the problem would be getting decent samples from all the popular applications. For example, it appears to require specially compiled binaries to work (--emit-relocs), so you can't just ask a random group of people to run a sampling profiler in the background for a day.
`--emit-relocs` is required for function reordering and only at optimization time. It's possible to collect samples from a regular distro binary (stripped, no relocs) and then use that profile to optimize a separately-built binary (not stripped, with relocs preserved).

> It also seems aimed at optimising individual binaries, whereas I think shared libraries are where a lot of the gains could be made.
BOLT can optimize shared libraries.

Prerequisites for large anonymous folios

Posted Sep 9, 2023 14:52 UTC (Sat) by Paf (subscriber, #91811) [Link]

I don’t understand what you think the page cache does or should do. It’s mostly just physical pages today. The only way using folios can increase memory usage is via increased read-in of sparsely-used files - you just want a few bytes and instead of, say, 4K we read in 16K or 64K. That’s it. So we could get better at recognizing sparse access patterns and switching the page size down, I guess.

Prerequisites for large anonymous folios

Posted Sep 9, 2023 18:33 UTC (Sat) by willy (subscriber, #9762) [Link]

Several of the responses in this thread have misunderstood the question.

If you build a kernel with CONFIG_PAGE_SIZE_64K, you literally cannot mmap at a smaller granularity than 64k. The hardware is configured such that each page table entry controls access to a 64kB chunk of memory. This talk of "the page cache needs to ..." is foolish. The page cache must allocate in 64k size chunks or it cannot support mmap [*].
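You can see that granularity constraint from userspace with a small sketch (my own example, using an arbitrary file path): mmap()'s file offset must be a multiple of the runtime page size, so an offset of 4096 is accepted on a 4k-page kernel but rejected with EINVAL on a kernel built with 64k pages.

```c
/* Sketch: mmap() offsets must be multiples of the runtime page size.
 * The mapping is never dereferenced, so the file's size doesn't matter. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    printf("runtime page size: %ld bytes\n", pagesz);

    int fd = open("/etc/os-release", O_RDONLY);   /* any readable file */
    if (fd < 0) { perror("open"); return 1; }

    /* Offset 4096 is page-aligned only if the page size is <= 4k. */
    void *p = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, 4096);
    if (p == MAP_FAILED)
        printf("mmap at offset 4096 failed: %s\n", strerror(errno));
    else
        printf("mmap at offset 4096 succeeded\n");
    return 0;
}
```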

This is why folios are superior. You can keep your 4kB mmap granularity. The kernel decides when to use 64kB (or smaller, or larger) chunks of memory to cache files. You get opportunistic use of features like CONTPTE if the conditions allow.

[*] Since the vast majority of files are never mmaped, we *could* cache files in smaller sizes, then transition to 64kB allocations if somebody calls mmap on this file. This would be a huge increase in complexity and I am far from convinced it would be worthwhile.

