
LFCS: Preparing Linux for nonvolatile memory devices

Posted Oct 11, 2016 8:08 UTC (Tue) by ecloud (guest, #56624)
In reply to: LFCS: Preparing Linux for nonvolatile memory devices by etienne
Parent article: LFCS: Preparing Linux for nonvolatile memory devices

I think the ultimate point about getting userspace on board is that we need next-generation languages that make memory leaks impossible, that keep data structures compact in memory (avoiding linked lists and the like), and that trade in the "filesystem" APIs for appropriate object-storage APIs. (But yes, some databases are already good places to start with this.) Instead of having APIs that make filesystem access completely different from memory manipulation, we need a way of marking data structures persistent. The language should then translate that into marking pages of memory persistent, and the OS should ensure that persistent pages are stored on the appropriate device. Applications should take care not to write to persistent structures more often than necessary; otherwise, either the language implementation or the OS should provide a way to cache frequently updated persistent structures in volatile memory and checkpoint the changes. (Maybe marking a structure both volatile and persistent would mean that.) I guess the next issue is that structures written synchronously could be temporarily out of sync with those that are cached. Then either all writes need to go to the cache first and be flushed to NVM at the next checkpoint, or else the system needs to be power-failure-proof (not a problem for battery-powered devices; line-powered machines can have at least a capacitor-based UPS sufficient for all writes to complete before power fails).
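As a rough illustration of what "marking a structure persistent" can be approximated with today: the sketch below packs a fixed-layout record into a memory-mapped file and uses an explicit flush as the checkpoint. The record layout, the file name, and the helper names are all hypothetical, and this goes through an ordinary file and the page cache, not byte-addressable NVM.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: a 64-bit update counter and a 64-bit value.
RECORD = struct.Struct("<QQ")


def open_persistent(path):
    """Map a file-backed record into memory, creating it zero-filled if needed."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < RECORD.size:
            os.ftruncate(fd, RECORD.size)
        # The mapping stays valid after the descriptor is closed.
        return mmap.mmap(fd, RECORD.size)
    finally:
        os.close(fd)


def update(region, value):
    """Modify the record in place, like any other in-memory structure."""
    count, _ = RECORD.unpack_from(region, 0)
    RECORD.pack_into(region, 0, count + 1, value)


def checkpoint(region):
    """Ask the OS to write the dirty pages back to the backing file."""
    region.flush()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "state.bin")
    region = open_persistent(path)
    update(region, 42)
    update(region, 43)
    checkpoint(region)
    region.close()
    # "Restart": map the same file again and find the structure intact.
    region = open_persistent(path)
    result = RECORD.unpack_from(region, 0)
    region.close()
    return result
```

Note that nothing here guards against a crash in the middle of update(); a torn record is exactly the kind of inconsistency that the checkpointing discussion above is worried about.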

So, rebooting, or even restarting applications, should become exceedingly rare. It places great demands on all software to be as reliable as the kernel itself: keep running for years with no leaks, no overflows, no bugs of the kind that require restarting the software as a workaround. You couldn't truly restart the application without losing all its stored data too. Using filesystems has made it harder to write software (so much persistence-related code that has to be written), but also allowed us to be too lazy for too long about reliability of the in-memory operations. If we invest as much effort into keeping memory beautifully organized as we have invested into file-based persistence, maybe we can get there?

I doubt that Linux will be the leader here, but there must be some current university research project by now? Anybody know of one? A long time ago there was KeyKOS, which had checkpointing-based persistence; then there was EROS, but its focus shifted more toward capability-based security than checkpointing. (And Linux still doesn't have such advanced capability-based security, either. This is why Sandstorm exists: the OS doesn't do it, so you have to rely on containers and management of them to isolate processes from each other.)

So now we have NVMe devices, like the M.2 flash drives. Can they be configured as memory-mapped, without using mmap()? Because using mmap() implies that all reads and writes will be cached in volatile RAM, right? If the hardware allows us to have RAM for one range of addresses and flash for another range, this work could begin.



LFCS: Preparing Linux for nonvolatile memory devices

Posted Oct 13, 2016 12:46 UTC (Thu) by nix (subscriber, #2304)

Fundamentally it seems to me that we'll still want something like a filesystem: a collection of named, possibly hierarchically or otherwise structured blocks of data that can be used by multiple programs without regard for which program created them. Just arranging for programs to keep their data structures around forever doesn't do that: each program would have to implement some sort of organizational principle, and if they're not all using the same one this is tantamount to every program having its own implementation of half a filesystem, non-interoperable with any of the others or with external tooling. This seems drastically worse than what we have now.

Persistent memory is nice because it might mean that e.g. you could shut down a machine with a long-running computation on it and have it just pick up again when powered back on. Of course, with the CPU caches not persistent, it might have to go back a few seconds to a checkpoint. You can often do that *now* by just checkpointing to disk every few seconds, but with persistent storage you can presumably do that even if there are gigabytes of state (assuming that the persistent memory in question doesn't wear out on writes the way flash does).
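For concreteness, checkpointing to disk and resuming after a failure can be sketched roughly like this. The JSON state format, the function names, and the simulated crash are all made up for illustration; the write-then-rename step is the usual way to make sure a reader never sees a half-written checkpoint.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temporary file, then atomically rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the data to stable storage
    os.replace(tmp, path)     # readers see either the old or the new checkpoint


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next": 0, "total": 0}


def run(path, limit, every=100, crash_at=None):
    """Sum the integers below `limit`, checkpointing every `every` steps."""
    state = load_checkpoint(path)
    while state["next"] < limit:
        if crash_at is not None and state["next"] == crash_at:
            raise RuntimeError("simulated power failure")
        state["total"] += state["next"]
        state["next"] += 1
        if state["next"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]


def demo():
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    try:
        run(path, 1000, crash_at=450)
    except RuntimeError:
        pass
    resumed_from = load_checkpoint(path)["next"]
    return resumed_from, run(path, 1000)
```

In demo() the simulated failure hits at step 450, so the second run goes back to the checkpoint taken at step 400 and redoes the last 50 steps. A fuller version would also fsync the containing directory after the rename.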

But persistent memory will not allow us to do away with filesystems: neither the API nor the allocation layer. The fundamental role of the filesystem API -- naming things and letting users, and disparate programs, access them -- will still be needed, and cannot be replaced by object storage any more than you can replace a filesystem with an inode table and tell users to just access everything by inode number. Equally, the role of filesystems themselves -- the object allocation layer -- is still there: it's just a filesystem for persistent storage, with differently strange tradeoffs than every other filesystem's differently strange tradeoffs. Even having files with no name is not new: unlinked files have given us that for decades, and more recently open(..., O_TMPFILE) has too.
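Both ways of getting a file with no name can be shown in a few lines. This is a minimal sketch with hypothetical helper names: it tries O_TMPFILE first (Linux-only, and not supported by every filesystem) and falls back to the older create-then-unlink approach.

```python
import os
import tempfile


def open_nameless(directory):
    """Return a descriptor for a file that has no name in `directory`."""
    o_tmpfile = getattr(os, "O_TMPFILE", None)  # Linux-only flag
    if o_tmpfile is not None:
        try:
            return os.open(directory, o_tmpfile | os.O_RDWR, 0o600)
        except OSError:
            pass  # filesystem or kernel without O_TMPFILE support
    # Older approach: create a named file and unlink it while it is open.
    fd, path = tempfile.mkstemp(dir=directory)
    os.unlink(path)
    return fd


def demo():
    directory = tempfile.mkdtemp()
    fd = open_nameless(directory)
    try:
        os.write(fd, b"still here")
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)
        names = os.listdir(directory)
    finally:
        os.close(fd)
    return data, names
```

Either way the data stays readable through the descriptor while the directory listing stays empty, and the storage is reclaimed once the descriptor is closed.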

LFCS: Preparing Linux for nonvolatile memory devices

Posted Oct 13, 2016 14:32 UTC (Thu) by raven667 (guest, #5198)

> every program having its own implementation of half a filesystem, non-interoperable with any of the others

Probably not every program, but every major language family that doesn't share low-level compatibility of its data structures, like how today having a C API is the lowest common denominator for a language's compatibility with other languages. Or how JSON has become a medium of exchange for network software.

> naming things and letting users, and disparate programs, access them

With the popularity of application sandboxing, with Flatpak on the desktop and Docker on the server, there are far more defined and regimented ways for applications to share data, so I don't expect arbitrary disparate programs accessing data to be supported in this model.

