
Making use of persistent memory

By Jonathan Corbet
February 10, 2016

linux.conf.au 2016
Persistent memory holds a lot of promise: what's not to like about vast amounts of directly-attached memory that remembers its contents over a power cycle? For some years we have been told that large persistent-memory arrays are coming; now it seems that they are about to arrive. According to Matthew Wilcox, who spoke on the topic at linux.conf.au 2016, Intel wants persistent memory to become a regular platform feature. It will start by shipping server platforms supporting 6TB arrays in 2017. Matthew's question was: how will we make use of that persistent memory?

There are a lot of ideas out there, he said, many of which have been promoted by academics, most of whom are not greatly concerned about practicality. Start, for example, with the idea of total system persistence, where the entire system can be turned off at any time and, when powered back on, will simply pick up where it left off. The problem here is that the CPU caches are not persistent, and there is no easy way to know when all writes to main memory have completed. Whole-system persistence is a "delightful" idea, but it is not something that can be done with today's hardware.

Stepping back a bit, one can try for application persistence — using persistent memory to make cheap application snapshots. Unfortunately, the cache problem exists here as well.

Perhaps what needs to be done is to completely redesign the operating system, creating a new system designed around persistent memory from the beginning. The "new" ideas that proponents of this idea bring up tend to have a familiar ring to them: microkernels, nanokernels, unikernels, etc. Matthew suggested that some people see new technology as an opportunity to push the same ideas they have been promoting for years. He wishes them the best, but is looking for something that will work today.

Some developers at Intel created a new filesystem for persistent memory called pmfs. They are, he said, smart people, but they are not Linux kernel developers. As a result, the work they did is not suitable for production settings.

What can be done today is to package up a persistent-memory array and export it as a fast block device. We have support for that in the kernel now, but, he said, it doesn't feel like a general-purpose solution. To get to that more general solution, he set out to make some small modifications to existing filesystems so that they could make use of persistent memory; he allowed that it was perhaps good that Dave Chinner was unable to attend the conference, since Dave might just quibble with Matthew's notion of "small". In any case, Matthew started with the existing execute-in-place support, rewrote it, and ended up with the subsystem now known as DAX.
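The programming model that DAX enables can be sketched in a few lines: map a file, store into it with ordinary memory writes, and explicitly flush. The sketch below is an illustration only. It uses an ordinary temporary file rather than a file on a DAX-mounted filesystem, so the writes still pass through the page cache, and the helper names (store_and_flush, load, demo) are hypothetical.

```python
import mmap
import os
import tempfile


def store_and_flush(path, payload, offset=0):
    """Map the file, store bytes with a plain memory write, then flush them."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            # An ordinary store into the mapping; no write() system call.
            m[offset:offset + len(payload)] = payload
            # Ask for the dirty range to be written back (msync() on Unix).
            m.flush()


def load(path, length, offset=0):
    """Read bytes back through a fresh read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return bytes(m[offset:offset + length])


def demo():
    fd, path = tempfile.mkstemp()
    try:
        # Size the backing file to one page; an empty file cannot be mapped.
        with os.fdopen(fd, "wb") as f:
            f.truncate(mmap.PAGESIZE)
        store_and_flush(path, b"hello pmem")
        return load(path, 10)
    finally:
        os.unlink(path)
```

On a filesystem mounted with DAX, the same mapping would refer to the persistent memory itself rather than to a page-cache copy; the application code would not need to change.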

Beyond filesystems

There is more to proper persistent-memory support than getting the filesystems to work, though. Down at the CPU level, Intel's designers have created a new set of instructions for use with persistent memory. The intent is to allow developers to initiate the flushing of specific cache lines; it is not possible to know when a write completes, but the operation can at least be started when needed. The CLFLUSH instruction has existed for years; its job is to remove a line of data from the cache. It is not optimal for persistent-memory applications, though, because it serializes the instruction stream, hurting performance.

That shortcoming will be addressed with the CLFLUSHOPT instruction, which does not perform that serialization. In the future, there will also be CLWB, which starts writing out the cache line but does not remove the data from the cache. Finally, the PCOMMIT instruction will ensure that all data written prior to the last store fence is persistent. It may not be written to a specific persistent-memory array, but it will be in persistent storage somewhere.
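The difference between these instructions can be illustrated with a toy model. The sketch below is not a description of real hardware: it assumes three places a cache line can live (a volatile cache, a volatile write-back queue, and the persistent media) and treats each instruction as described in the talk. It does not model store fences or the serialization that separates CLFLUSH from CLFLUSHOPT, and the class and method names are invented for this example.

```python
class ToyMachine:
    """Toy model: volatile cache -> volatile write-back queue -> persistent media."""

    def __init__(self):
        self.cache = {}   # line number -> value; lost on power failure
        self.queue = {}   # written back but not yet durable; also lost
        self.media = {}   # persistent; survives power failure

    def store(self, line, value):
        # A normal store only touches the cache.
        self.cache[line] = value

    def clflush(self, line):
        # Start writing the line back and evict it from the cache.
        if line in self.cache:
            self.queue[line] = self.cache.pop(line)

    def clwb(self, line):
        # Start writing the line back, but keep it cached for later reads.
        if line in self.cache:
            self.queue[line] = self.cache[line]

    def pcommit(self):
        # Everything written back so far becomes persistent.
        self.media.update(self.queue)
        self.queue.clear()

    def power_cycle(self):
        # Only the media survives.
        self.cache.clear()
        self.queue.clear()

    def load(self, line):
        for level in (self.cache, self.queue, self.media):
            if line in level:
                return level[line]
        return None


def demo():
    m = ToyMachine()
    m.store(0, "A")
    m.store(1, "B")
    m.store(2, "C")
    m.clwb(0)        # written back, still cached
    m.clflush(1)     # written back, evicted
    m.pcommit()      # lines 0 and 1 are now durable; line 2 is not
    cached = sorted(m.cache)
    m.power_cycle()
    return cached, m.load(0), m.load(1), m.load(2)
```

In this model, line 2 is lost across the power cycle because it was never written back, while line 0 stays in the cache after CLWB and line 1 does not after CLFLUSH.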

There is still the question of how application programmers should make use of persistent memory. One option would be to create a special-purpose programming language that natively understands persistence, but that, he said, is not particularly interesting. A bit more practical might be to modify the runtime virtual machines for languages like Python to get them to issue the new instructions when needed. That would avoid the need for code changes in general, but is still not entirely interesting to "dinosaurs" like him, who want to program in languages like C.

For such people, there is a whole set of new libraries available. At the lowest level, libpmem simply provides easy access to the new instructions, but adds little otherwise. The libvmmalloc library supplies a replacement for malloc() that can allocate from persistent memory. It can also be used for non-persistent applications; indeed, applications can be linked against this library and be unaware that they are using persistent memory at all. For developers who are willing to code specific persistent-memory awareness into their applications, there is libvmem. It provides better performance on persistent memory, but is still mainly expected to be used with memory used as if it were volatile.

Developers wanting utilities to help with the storage of persistent memory can use libpmemblk, which provides access to atomic blocks of memory. Log-oriented applications, which append data to an existing structure, can use libpmemlog to manage logs easily in persistent memory. But the interesting one, he said, is libpmemobj, which provides a transactional object store on top of persistent memory. It provides locking that persists over system boots, type safety, and data structures like doubly linked lists and a key-value store. It can handle replication of data across files. And, for those who are so inclined, C++ support has been added in recent months.
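To give a feel for what a log-oriented library has to get right, here is a minimal sketch of an append-only log in a memory-mapped file. It is not the libpmemlog API; the ToyLog class, its on-media layout (an 8-byte committed-length header followed by data), and the crash_before_commit flag are all assumptions made for illustration. The point is the ordering: the record is flushed first, and only then is the header that makes it visible updated and flushed.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<Q")  # committed length, little-endian 64-bit


class ToyLog:
    """Hypothetical append-only log; not the libpmemlog interface."""

    def __init__(self, path, capacity=4096):
        self.f = open(path, "r+b")
        size = HEADER.size + capacity
        if os.path.getsize(path) < size:
            self.f.truncate(size)  # new space reads as zeros: empty log
        self.m = mmap.mmap(self.f.fileno(), 0)

    def committed(self):
        return HEADER.unpack_from(self.m, 0)[0]

    def append(self, record, crash_before_commit=False):
        end = self.committed()
        start = HEADER.size + end
        if start + len(record) > len(self.m):
            raise ValueError("log full")
        # Step 1: write the record past the committed region and flush it.
        self.m[start:start + len(record)] = record
        self.m.flush()
        if crash_before_commit:
            return  # simulate losing power before the commit step
        # Step 2: publish the record by advancing the committed length.
        HEADER.pack_into(self.m, 0, end + len(record))
        self.m.flush()

    def read(self):
        return bytes(self.m[HEADER.size:HEADER.size + self.committed()])

    def close(self):
        self.m.close()
        self.f.close()


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        log = ToyLog(path)
        log.append(b"first;")
        log.append(b"torn", crash_before_commit=True)
        log.close()
        log = ToyLog(path)  # reopen, as if after a reboot
        before = log.read()
        log.append(b"second;")
        after = log.read()
        log.close()
        return before, after
    finally:
        os.unlink(path)
```

A record whose commit step never happened is simply invisible after reopening, and the next append overwrites it. A real library must also cope with a header update that is itself torn, which this sketch ignores.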

Quite a bit of functionality has been built on these libraries, he said. There is, for example, a MySQL storage engine that uses it. Interested developers can go to pmem.io, which hosts a blog, pointers to the source, and other information. There is also intel.com/nvm, which has mostly marketing material about the upcoming persistent-memory hardware. It appears that, after years of hype, the hardware will soon be available, and the software support will be there for it as well.

The video of this talk is available for those wanting more information.

[Your editor thanks LCA for assisting with his travel expenses.]

Index entries for this article
Kernel: Memory management/Nonvolatile memory
Conference: linux.conf.au/2016



Making use of persistent memory

Posted Feb 11, 2016 7:36 UTC (Thu) by jezuch (subscriber, #52988) [Link] (2 responses)

If I remember correctly, magnetic core memory was non-volatile, right? How did people make use of this fact back then?

[Too many old ideas are forgotten and then painfully re-invented again, you see.]

Making use of persistent memory

Posted Feb 11, 2016 12:52 UTC (Thu) by zmower (subscriber, #3005) [Link]

Like this? These were the days before operating systems. Persistent storage was punched cards or papertape if you were lucky and keying in programs with buttons if not! Then the much cheaper and capable SRAMs came along which lost their contents when powered off. Microcomputers happened and here we are today.

Making use of persistent memory

Posted Feb 11, 2016 15:04 UTC (Thu) by corbet (editor, #1) [Link]

I worked for a while with a Data General Nova, built into a radar van, that had core memory. We "made use" of that fact by just turning off the power at night, and having it pick up again in the morning when we turned it on. Probably worked about 80% of the time, otherwise we had to boot from the beginning.

Back then, though, it was just memory; persistence was just a side effect of how it worked. Even then, when systems didn't do the sort of memory caching we see today, there was enough that could go wrong between power cycles that nobody depended on it to safely store data. Now, hopefully, we can.

Making use of persistent memory

Posted Feb 11, 2016 7:45 UTC (Thu) by jem (subscriber, #24231) [Link] (1 responses)

> Start, for example, with the idea of total system persistence, where the entire system can be turned off at any time and, when powered back on, will simply pick up where it left off. The problem here is that the CPU caches are not persistent, and there is no easy way to know when all writes to main memory have completed.

The same problem exists with hard disks, and yet they are considered non-volatile. Maybe the computers of today are not able to detect and react quickly enough to a sudden power loss, but surely systems could easily be designed to overcome this limitation?

Making use of persistent memory

Posted Feb 11, 2016 23:11 UTC (Thu) by Jonno (guest, #49613) [Link]

On hard drives the problem is solved by explicit flushes and write barriers, which current CPUs can't do for persistent memory. And even so there are lots of ways for things to go wrong with regard to recently written data, for an undefined value of "recently".

As long as CPU caches, CPU registers, and/or peripheral devices remain volatile, computers will likely never be able to just pick up where they were and go on after a sudden power loss, but with persistent main memory it should be possible to get a system suspend mode combining the simplicity and performance of sleep with the power savings of hibernate. And with a big enough battery or capacitor it should be possible for the kernel to try to enter that mode if main power disappears (though misbehaving peripheral devices or their drivers would still be able to sabotage it).

Making use of persistent memory

Posted Feb 11, 2016 14:33 UTC (Thu) by anton (subscriber, #25547) [Link] (9 responses)

The promise of hardware-persistent memory always makes people think that we now can just use main memory and get rid of dealing with files, file systems, etc. But we have actually had (software-)persistent memory for a long time, e.g., in the EROS single-level store, but it never became mainstream. There is a reason for that.

We need files and file formats for purposes besides just persistence: In particular, for exchanging data and for moving it from one version of a program to the next. These purposes are still needed even with hardware-persistent memory, so we still need to deal with files.

As an example of what happens when you ignore that, consider the pre-XML Microsoft Word format, which was reported to be mostly a memory dump of a part of the Word process. Problems I have heard of with that format were that 1) you could corrupt your Word, and that corruption would be preserved across saving the document, and using it in a new Word process; 2) the file contained data that the users had intended to erase from the document.

BTW, I am an academic:-).

Making use of persistent memory

Posted Feb 11, 2016 23:23 UTC (Thu) by dlang (guest, #313) [Link] (7 responses)

I find the CPU cache issue a red herring. That would only matter if you are planning on removing power at any arbitrary time, including in the middle of a write to memory and still expecting zero data loss (and if you pull power in the middle of a write to memory, which version do you expect to find there when you resume?)

We are not going to be having devices that just loose power at random times. We are going to have devices where the CPU is much more willing to power off parts or all of the system. And the CPU can make sure that it has flushed its caches when it does so, so it just won't be an issue.

The real things that will drive how persistent memory is used are its price, density, speed, and power consumption.

Is it really going to be faster, cheaper, etc. than DDRx RAM? I really would be surprised if that were the case. Frankly, I do not expect that persistent memory is going to be as fast as simpler RAM (the more that has to be done to read/write the data, the harder it will be to ramp up clock speeds).

If it's not better in all ways, then it's unlikely that systems will use persistent memory exclusively (except in some admittedly important corner cases like mobile/IoT devices)

If you have a system that has regular RAM, persistent RAM, and flash/disk, I fully expect to see the persistent RAM used as a staging area before going to disk, as a fast swap, and for queues of various types that databases and other software currently have to jump through a lot of hoops to get working halfway well on disks today.

In the mobile/embedded space, where they frequently only install one chip of RAM, making that chip persistent will have a wonderful effect on battery life. It will mean that when the CPU decides that it can sleep, it doesn't have to go to a lot of effort to preserve state onto permanent storage; it just flushes its CPU caches, enables the wake events, and powers itself off.

Besides, if persistent memory really is that great, how long before the CPUs (at least at the low end where performance isn't the primary factor) start using it in the core?

Making use of persistent memory

Posted Feb 12, 2016 13:11 UTC (Fri) by james (guest, #1325) [Link] (1 responses)

> That would only matter if you are planning on removing power at any arbitrary time, including in the middle of a write to memory and still expecting zero data loss (and if you pull power in the middle of a write to memory, which version do you expect to find there when you resume?)

Well, that's what we expect from databases: that they can survive going down at any point, and after a recovery process, we can re-open them and keep going as normal.

> We are not going to be having devices that just loose power at random times

Good: I have this mental image of a device like that, letting power loose in the form of arcs hitting anything conductive nearby at random times...

But we all know what you meant, and that puts a major limitation on where you can use persistent memory. It rules it out of desktop systems, it probably rules it out of consumer devices with removable batteries, and it even makes server applications much dodgier: UPSes do fail, "high availability" virtual machines will occasionally end up being migrated behind the same UPS, and recovery time is important, if only because you know that the business is going to be on IT's back demanding that they can get on with doing the stuff they couldn't do during the unscheduled downtime.

Backups are good. Needing to use backups is bad.

Making use of persistent memory

Posted Feb 13, 2016 1:47 UTC (Sat) by dlang (guest, #313) [Link]

> Well, that's what we expect from databases: that they can survive going down at any point, and after a recovery process, we can re-open them and keep going as normal.

they do that by doing a very convoluted process so that they can tell that something only got partially written and write it again from a buffer.

You really don't want to have to do that for every write to memory.

> it probably rules it out of consumer devices with removable batteries

no, it just means that such devices have a small capacitor in them so that when they detect the loss of power, they can still do their 'suspend' work.

Making use of persistent memory

Posted Feb 12, 2016 14:35 UTC (Fri) by anton (subscriber, #25547) [Link] (4 responses)

> I find the CPU cache issue a red herring. That would only matter if you are planning on removing power at any arbitrary time, including in the middle of a write to memory and still expecting zero data loss (and if you pull power in the middle of a write to memory, which version do you expect to find there when you resume?)

Power loss is not instantaneous. Voltage drops over time, and you have a little time until your hardware becomes incapable of writing reliably (and by that time it should have done all its writing, otherwise the memory contents may be corrupted; I have experienced hard disks that did not get that right). Anyway, when the system detects that power is failing, it should write all the caches back to permanent storage, and then store the CPU registers (as in a context switch), and the state of I/O devices. When power comes back on, it should wait for the capacitors to be sufficiently full to perform a full shutdown again, then load the CPU and the state of I/O devices, and then continue processing. There would still be some lossage, e.g., due to broken connections, but applications typically know how to deal with that. So I think that the power loss situation is one where persistent memory could be helpful, if implemented well.

Concerning the CPU caches, I expect that they can be dumped to memory already now for suspend-to-memory; or is the CPU kept powered-on in that state?

Making use of persistent memory

Posted Feb 13, 2016 1:50 UTC (Sat) by dlang (guest, #313) [Link] (3 responses)

The point I'm trying to make is that power loss is still a 'suspend' event, just a much faster one since there's far less data to worry about.

The persistent memory cheerleaders try to make it sound as if the system/OS doesn't need to do _anything_ and the memory technology will avoid all problems.

But the reality is that you need to treat it like a suspend problem: stop all ongoing activity (so you don't have a partial write somewhere) and write the volatile stuff (including CPU caches) to non-volatile storage.

Making use of persistent memory

Posted Feb 13, 2016 3:45 UTC (Sat) by neilbrown (subscriber, #359) [Link] (2 responses)

I agree.

If you use persistent memory like we've always used memory, then it isn't importantly different from the battery backed memory that laptops have had for years - maybe a bit bigger.
If you use persistent memory like we've always used disk drives, then it behaves much like disk drives, or at least flash drives - maybe faster.
You can use persistent memory like both memory and disk drives, but virtual memory has allowed that for years too - maybe with a bit more flexibility in sizing.

While there are lots of interesting challenges on the implementation side, I think that for the user - persistent memory just means: a bit bigger, a bit faster, a bit more flexible.

You will still need a suspend/resume discipline to save power - but you save more power. You still need a RAID discipline to avoid data loss due to hardware error, though reliability is probably higher. You still need to lay out long-term data like a filesystem, though the details of the layout can be different.

If persistent memory ushers in a whole new usage pattern, it won't be because memory is suddenly persistent. It will be because someone finds a clever way to use bigger/faster/more-flexible/more-reliable.

Making use of persistent memory

Posted Feb 13, 2016 7:37 UTC (Sat) by dlang (guest, #313) [Link]

again, not a completely new idea, but rather a refinement on existing efforts.

We know that Amazon has talked about wanting kindles to be able to go to sleep between keypresses in the past, but sleep/wake time hasn't been fast enough, would this shrink the amount of stuff needed to allow that?

with persistent memory, you could also eliminate the flash storage, everything would just be stored in memory.

how close are we to being able to build an e-ink reader that can power itself from an embedded solar panel? Without a need to power flash, scan a keyboard, or keep a touchscreen live, a simple e-reader should be able to wake-on-buttonpress, change the display, and go back to sleep. require USB to load stuff onto it so you don't have to power a radio (or use bluetooth low power, only enabled with a keypress)

and what new price threshold would such a device pass that would enable nifty new uses? The Kindle sales really took off when they dropped below $100, are we now talking about devices that could drop below $10??

Integrity and backups of persistent memory?

Posted Feb 13, 2016 18:35 UTC (Sat) by songmaster (subscriber, #1748) [Link]

Is persistent memory being built with ECC or at least parity bits? Permanent storage on rotating media usually has a per-sector CRC to catch the effect of bits getting corrupted, but it's not obvious how to handle this on directly-accessible storage — maybe there could be some kind of CRC check done on a persistent page whenever it gets mapped into memory, and the checksum calculated and saved whenever it gets unmapped again (although that doesn't protect the page's integrity while it's mapped). I hope someone is thinking about these topics, I haven't heard talk about making backups of the contents of PM, or of anything like RAID for higher levels of integrity.

Making use of persistent memory

Posted Feb 18, 2016 10:38 UTC (Thu) by oldtomas (guest, #72579) [Link]

This is a very important observation. If you use persistent memory as "the whole status gets persisted" you persist unwanted corruption with it. No more "switch off, then on and all is well" (sometimes the system is so botched that reboot is not possible).

As a little anecdote to make you smile: long years ago, a friend of mine had an electromechanical calculator (four arithmetics and square root: his father was an engineer). The "mechanical" part in electromechanical provided some sort of persistence.

Obviously this marvel performed division by successive subtraction: on divide-by-zero it went on an endless and quite noisy frenzy -- prompting the user to switch it off.

That didn't help, though: switching it on again just resumed that frenzy...

(Yes, there was some mysterious way to get things to normal again).

Making use of persistent memory

Posted Feb 11, 2016 23:35 UTC (Thu) by jhhaller (guest, #56103) [Link]

The term I've seen used the most is non-volatile byte-addressable memory (NVBM). A couple of proposed use cases that I've run across in my looking at potential uses: The key for NVBM is providing significantly cheaper and denser storage than RAM without being significantly slower. One advantage of NVBM is being able to avoid using both backing store and cache, as those are combined for workloads which fit into RAM, at a lower cost than RAM. Challenges include data integrity, and endurance, for applications where those are important. Other issues relate to data protection and concurrency, and how they may differ from RAM cache and backing store. Data privacy becomes a concern when one can boot a different operating system and have your way with the existing persistent storage. While some of these issues may be similar to those of block storage, the tools to deal with them are likely to be different. While using NVBM as a fast block device may be a good first step, it misses the advantage of having the storage be directly available.

The HP Machine is mostly based on NVBM, with the extension of having the memory widely addressable across multiple CPU sockets. That extension has some different challenges than those of a single server accessing NVBM, but deals with some of the challenges as well.

I would be shocked if there aren't already startups looking to provide new databases based on using NVBM directly, with such companies coming out of stealth mode once the hardware is really available.

Making use of persistent memory

Posted Feb 20, 2016 2:10 UTC (Sat) by welinder (guest, #4699) [Link]

> That shortcoming will be addressed with the CLFLUSHOPT instruction, ...

I don't care what the docs say. That will forever be known as the
flu shot instruction.


Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds