
Pulling slabs out of struct page

Posted Oct 9, 2021 16:08 UTC (Sat) by luto (subscriber, #39314)
In reply to: Pulling slabs out of struct page by willy
Parent article: Pulling slabs out of struct page

> - GUP gets back a page and then calls set_page_dirty(). That needs to figure out whether this is a file/anon/ksm/netpool/DEVICE/... page and call the filesystem if required.

Is this done directly in GUP? If so, surely it could work like the fault code and look up the VMA.

> - compaction walks the memmap and needs to figure out what this memory is and whether it can be relocated.

Hmm, this one is legit.

> - memory failure gets a physical address and needs to understand how to handle it

In my dream world, the low-level memory failure / machine check code gets a virtual address and can look up a VMA or vmap area. Making this work with kmap might be interesting.

> There are more, but these should illustrate some of the problems we have to solve.

I wonder if it's possible to reduce the dependency on struct page or equivalent to the point that everything works without it except for some nice-to-have features like compaction. (I'm not saying that the colossal amount of effort involved is worthwhile.)



Pulling slabs out of struct page

Posted Oct 9, 2021 16:59 UTC (Sat) by willy (subscriber, #9762) [Link] (1 responses)

I'm really just trying to avoid the bugs we have where people look at page->mapping and the compiler can't say "this is a tail page, that doesn't do what you think it does". Everybody keeps trying to get me to solve their problems as well.

Please, just let me solve a problem, not rewrite the entire kernel.
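To illustrate the kind of bug described above, here is a minimal user-space sketch. It is a toy model and not the kernel's actual layout or API: `toy_page`, `toy_folio`, `toy_page_folio()` and `toy_folio_mapping()` are hypothetical names invented for this example. The idea is that `mapping` is only read through a type that can refer to nothing but a head page, so handing over a possibly-tail page becomes a compile-time type error rather than a silent misread.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a page descriptor; NOT the kernel's struct page. */
struct toy_page {
    struct toy_page *head;  /* NULL on a head page, else points at the head */
    void *mapping;          /* only meaningful on the head page */
};

/* A distinct type that, by construction, only ever names a head page. */
struct toy_folio {
    struct toy_page *head;
};

/* The single place where a possibly-tail page is resolved to its head. */
static struct toy_folio toy_page_folio(struct toy_page *page)
{
    assert(page != NULL);
    struct toy_folio folio = { page->head ? page->head : page };
    return folio;
}

/*
 * Accepts only a toy_folio. Passing a struct toy_page * here does not
 * compile, which is the check the compiler cannot make when every
 * caller dereferences page->mapping directly.
 */
static void *toy_folio_mapping(struct toy_folio folio)
{
    return folio.head->mapping;
}
```

With this shape, `toy_folio_mapping(toy_page_folio(p))` returns the head page's mapping whether `p` is a head or a tail page, while `toy_folio_mapping(p)` is rejected by the compiler.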

Pulling slabs out of struct page

Posted Oct 9, 2021 17:06 UTC (Sat) by luto (subscriber, #39314) [Link]

I don't want you to rewrite the whole kernel! I'm just contemplating how it _could_ be rewritten if someone were inclined to do so.

(Also, I do care about the KVM mess, and I don't think KVM could have dug itself into quite the hole it's in if there hadn't been a struct page to begin with for most user mappings, but fixing that needs a rewrite and a time machine.)

Pulling slabs out of struct page

Posted Oct 10, 2021 14:39 UTC (Sun) by willy (subscriber, #9762) [Link] (3 responses)

> In my dream world, the low-level memory failure / machine check code gets a virtual address and can look up a VMA or vmap area. Making this work with kmap might be interesting.

I don't think your dream world is possible. It's the same problem the page cache has with errors on writeback -- the producer might not be around any more. We might have unmapped the vmap/kmap; the user process that dirtied the cache line might have exited, or just been switched away from.

But more importantly, unless the cache is writethrough, the CPU no longer knows which virtual address(es) were used to dirty the cache line.

Pulling slabs out of struct page

Posted Oct 10, 2021 14:53 UTC (Sun) by luto (subscriber, #39314) [Link] (2 responses)

As I understand it, on Intel chips that support memory failure recovery, failed writes may not be notified at all. (I’ve at least been told this is true for the TDX style machine checks.)

And Linux’s entry code makes quite weak guarantees about recoverability of machine checks: we make a best (and pretty good) effort to recover from a fault in user code, and we try to recover from kernel code with exception table entries. If normal kernel code without an exception table entry hits a memory failure entry, forget about struct page: we may be 100% dead regardless because we have no idea how to resume execution.

If we hit a machine check with an exception handler, then we know the program counter, and we have a full register file. Figuring out the failed virtual address isn’t much of a problem even if the hardware doesn’t help.
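The exception-table recovery described above can be sketched roughly as follows. This is a simplified user-space model, not the kernel's real table format or lookup code: `toy_extable_entry` and `toy_search_extable()` are hypothetical names, and the use of absolute addresses is an assumption made for readability. The point is only that recovery depends on finding the faulting program counter in a sorted table; if there is no entry, there is nowhere safe to resume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One entry: an instruction that may fault, and where to resume if it does. */
struct toy_extable_entry {
    uintptr_t insn;   /* address of the instruction allowed to fault */
    uintptr_t fixup;  /* address of the recovery code */
};

/* bsearch() comparator: the key is a program counter value. */
static int toy_extable_cmp(const void *key, const void *elem)
{
    uintptr_t pc = *(const uintptr_t *)key;
    const struct toy_extable_entry *entry = elem;

    if (pc < entry->insn)
        return -1;
    if (pc > entry->insn)
        return 1;
    return 0;
}

/*
 * Look up pc in a table sorted by insn address. Returns the fixup
 * address, or 0 when the faulting code has no entry, in which case
 * the fault cannot be recovered in this model.
 */
static uintptr_t toy_search_extable(const struct toy_extable_entry *table,
                                    size_t count, uintptr_t pc)
{
    const struct toy_extable_entry *entry;

    assert(table != NULL);
    entry = bsearch(&pc, table, count, sizeof(*table), toy_extable_cmp);
    return entry ? entry->fixup : 0;
}
```

A fault handler built on this would resume at the returned fixup address when it is nonzero and give up otherwise, which matches the "we may be 100% dead regardless" case for code without an entry.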

Pulling slabs out of struct page

Posted Oct 10, 2021 14:57 UTC (Sun) by willy (subscriber, #9762) [Link] (1 responses)

Having the full register file doesn't matter if the store that dirtied the cache line was 10ms ago. I can't imagine how any CPU vendor would keep the register state around until the cache line moves from L3 to DRAM.

Pulling slabs out of struct page

Posted Oct 10, 2021 15:32 UTC (Sun) by luto (subscriber, #39314) [Link]

You’re assuming that the CPU will notify the OS at all when a store from L3 to DRAM fails and that the OS actually needs to do anything about it. I don’t know all the nasty details, but it may be possible (and even mandatory?) to mark the memory bad when writeback fails and deliver a fault on a subsequent read.

