Prerequisites for large anonymous folios
Posted Sep 9, 2023 9:38 UTC (Sat) by ibukanov (subscriber, #3942)
In reply to: Prerequisites for large anonymous folios by pbonzini
Parent article: Prerequisites for large anonymous folios
Posted Sep 9, 2023 11:52 UTC (Sat)
by kleptog (subscriber, #1183)
This also suggests you could become more memory efficient by adjusting the position of functions in libraries. You'd need to collect a large sample of usage at the function level and use it to control the layout of functions in the resulting binaries. For example, figure out which parts of libssl are actually required for TLS 1.3 and place them all together. This doesn't buy much at a 4kB page size, but it helps a lot with bigger page sizes. And you'd have to do it at the distro level to be effective.
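To put rough numbers on that idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the random sizes, the choice of which functions are "hot", and the helper names `resident_bytes`, `hot_first` and `compare`. It models a library as a flat sequence of functions and counts how much memory has to be resident to cover the hot ones, before and after moving them to the front.

```python
import random


def resident_bytes(layout, sizes, hot, page_size):
    """Bytes of memory needed to hold every page containing part of a hot function."""
    touched = set()
    offset = 0
    for name in layout:
        size = sizes[name]
        if name in hot:
            first_page = offset // page_size
            last_page = (offset + size - 1) // page_size
            touched.update(range(first_page, last_page + 1))
        offset += size
    return len(touched) * page_size


def hot_first(layout, hot):
    """Reorder so hot functions are contiguous at the start, keeping relative order."""
    return [f for f in layout if f in hot] + [f for f in layout if f not in hot]


def compare(seed=0, n_funcs=2000, n_hot=100):
    """Return {page_size: (resident_before, resident_after)} for a synthetic library."""
    rng = random.Random(seed)
    layout = [f"fn{i}" for i in range(n_funcs)]          # hypothetical function names
    sizes = {name: rng.randint(64, 2048) for name in layout}  # made-up sizes in bytes
    hot = set(rng.sample(layout, n_hot))                 # made-up "hot" set
    results = {}
    for page_size in (4096, 65536):
        before = resident_bytes(layout, sizes, hot, page_size)
        after = resident_bytes(hot_first(layout, hot), sizes, hot, page_size)
        results[page_size] = (before, after)
    return results


if __name__ == "__main__":
    for page_size, (before, after) in compare().items():
        print(f"{page_size:>6} B pages: {before} B resident before, {after} B after")
```

Packing the hot functions contiguously gives the smallest possible page count for a given page size in this model, so the "after" figure is a lower bound. A real tool would also have to respect call locality, alignment and section boundaries, none of which is modelled here.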
Posted Sep 9, 2023 14:27 UTC (Sat)
by walters (subscriber, #7396)
Posted Sep 10, 2023 21:08 UTC (Sun)
by kleptog (subscriber, #1183)
But hey, all it takes is one person with enough will & skills and it might happen.
Posted Sep 11, 2023 14:45 UTC (Mon)
by aaupov (guest, #166901)
> It also seems aimed at optimising individual binaries, whereas I think shared libraries are where a lot of the gains could be made.
Posted Sep 9, 2023 14:52 UTC (Sat)
by Paf (subscriber, #91811)
Posted Sep 9, 2023 18:33 UTC (Sat)
by willy (subscriber, #9762)
If you build a kernel with CONFIG_PAGE_SIZE_64K, you literally cannot mmap at a smaller granularity than 64kB. The hardware is configured such that each page table entry controls access to a 64kB chunk of memory. This talk of "the page cache needs to ..." is foolish. The page cache must allocate in 64kB chunks or it cannot support mmap [*].
This is why folios are superior. You can keep your 4kB mmap granularity. The kernel decides when to use 64kB (or smaller, or larger) chunks of memory to cache files. You get opportunistic use of features like CONTPTE if the conditions allow.
[*] Since the vast majority of files are never mmaped, we *could* cache files in smaller sizes, then transition to 64kB allocations if somebody calls mmap on this file. This would be a huge increase in complexity and I am far from convinced it would be worthwhile.
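The cost of a fixed allocation unit is simple rounding arithmetic, sketched below in Python. The helper names `cache_footprint` and `wasted` are hypothetical, and the model only covers the fixed-unit case (every cached file rounded up to a whole number of units). It does not attempt to model how the kernel picks folio sizes.

```python
def cache_footprint(file_size, alloc_unit):
    """Bytes of cache used when a file is stored in whole units of alloc_unit bytes."""
    if file_size == 0:
        return 0
    units = -(-file_size // alloc_unit)  # ceiling division
    return units * alloc_unit


def wasted(file_sizes, alloc_unit):
    """Total bytes of cache allocated beyond the actual file contents."""
    return sum(cache_footprint(size, alloc_unit) - size for size in file_sizes)


if __name__ == "__main__":
    example_sizes = [5000, 300, 70000]  # made-up file sizes in bytes
    for unit in (4096, 65536):
        print(f"unit {unit:>6} B: {wasted(example_sizes, unit)} B of slack")
```

Under this model a 5000-byte file occupies two 4kB pages but one whole 64kB page, which is the overhead a fixed 64kB page size imposes on every small cached file.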
`--emit-relocs` is required for function reordering, and only at optimization time, not when profiling. It's possible to collect samples from a regular distro binary (stripped, no relocs) and then use that profile to optimize a separately built binary (not stripped, with relocs preserved).
BOLT can optimize shared libraries.
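A rough sketch of that workflow, written as a Python helper that only assembles the command lines and does not run them. The helper name `bolt_commands`, all file paths and the workload are hypothetical. The `perf`, `perf2bolt` and `llvm-bolt` options are the ones shown in BOLT's own README. The `-j any,u` option assumes the CPU supports branch sampling (LBR), and the sketch assumes both binaries come from the same source and build configuration, so that the profile still matches.

```python
import shlex


def bolt_commands(profiled_binary, relocs_binary, output_binary, workload):
    """Build (but do not execute) the three steps: profile, convert, optimize."""
    perf_data = "perf.data"   # hypothetical intermediate file names
    fdata = "perf.fdata"
    return [
        # 1. Sample the regular (stripped) binary while it runs the workload.
        ["perf", "record", "-e", "cycles:u", "-j", "any,u",
         "-o", perf_data, "--", *workload],
        # 2. Convert the perf samples into BOLT's profile format.
        ["perf2bolt", "-p", perf_data, "-o", fdata, profiled_binary],
        # 3. Optimize the separately built binary that was linked with --emit-relocs.
        ["llvm-bolt", relocs_binary, "-o", output_binary, f"-data={fdata}",
         "-reorder-blocks=ext-tsp", "-reorder-functions=hfsort"],
    ]


if __name__ == "__main__":
    steps = bolt_commands(
        profiled_binary="/usr/bin/app",        # hypothetical distro binary
        relocs_binary="./build/app.relocs",    # hypothetical build with relocs kept
        output_binary="./build/app.bolt",
        workload=["/usr/bin/app", "--benchmark"],
    )
    for argv in steps:
        print(shlex.join(argv))
```

The binary passed to step 3 would be linked with something like `-Wl,--emit-relocs`. Steps 1 and 2 never need relocations.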