Is there a better way to use flash memory?
Posted Sep 22, 2009 15:55 UTC (Tue) by jzbiciak
In reply to: Is there a better way to use flash memory?
Parent article: Log-structured file systems: There's one in every SSD
My understanding and experience is that flash is rather similar to EPROM. You erase the entire erase block, sending it to all 1s. This is an indivisible operation—the whole block gets clobbered, and there's no way to clobber only a section of it. Then, over whatever period of time is convenient to you, you fill in sections of that erase block with live data. The size of the section you have to fill in at a time is governed by the width of the memory, since a programming pulse has to be applied for all of the bits across the width of the memory, but you only have to program one row. So, erasure erases a group of rows, and then you can fill the rows in at your leisure.
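To make the erase/program asymmetry concrete, here is a toy model of a single erase block. It is only a sketch: the class name, the row count and the row width are made up for illustration and don't correspond to any particular part.

```python
class EraseBlock:
    """Toy model of one flash erase block (sizes are illustrative only)."""

    def __init__(self, rows=4, row_bytes=8):
        self.row_bytes = row_bytes
        self.rows = [bytes([0xFF] * row_bytes) for _ in range(rows)]

    def erase(self):
        # Erasure is all-or-nothing: every row in the block goes back to all 1s.
        self.rows = [bytes([0xFF] * self.row_bytes) for _ in self.rows]

    def program_row(self, index, data):
        # A program pulse covers the full row width and can only clear bits,
        # so the stored result is the bitwise AND of the old and new contents.
        if len(data) != self.row_bytes:
            raise ValueError("a row must be programmed at its full width")
        old = self.rows[index]
        self.rows[index] = bytes(o & n for o, n in zip(old, data))


block = EraseBlock()
block.program_row(0, b"ABCDEFGH")   # fill rows in whatever order is convenient
block.program_row(2, b"12345678")
```

Rows 1 and 3 stay at all 1s until something is programmed into them, and the only way to get any row back to all 1s is `erase()`, which takes the whole block with it.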
If your ECC lives within the same row as your data, then your ECC encoding doesn't really matter. Since row writes are atomic, the fact that ECC bits toggle back and forth as you monotonically clear 1s to 0s in your data bits doesn't matter. You have to present your data and ECC in parallel when you write the row. Typical ECCs such as Reed-Solomon are built around this block principle.
(Now here's where I don't know how similar EPROMs and flash are: You could keep reprogramming the same row as long as you only flip 1s to 0s, which is where your initial idea becomes relevant. At least one flash-based embedded device I've used tells me to never program a row more than twice without an intervening erase, which suggests there may be an issue with storing too much charge on the floating gate, which in turn could physically damage the gate. That charge is what makes a 1 turn into a 0. Old school EPROMs were a bit more durable in this regard. But, then, you also blast them with bright UV for 15-30 minutes to erase them.)
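The "only flip 1s to 0s" condition is easy to state as a bit test. A minimal sketch, with a hypothetical function name; it checks only the bit pattern and says nothing about a device's limit on how many times a row may be programmed between erases:

```python
def can_overwrite_without_erase(old: bytes, new: bytes) -> bool:
    """True if `new` can be reached from `old` purely by clearing bits.

    Every bit that is 1 in `new` must already be 1 in `old`, since only an
    erase can turn a 0 back into a 1.
    """
    return len(old) == len(new) and all((o & n) == n for o, n in zip(old, new))


# Clearing more bits in an already-programmed row is fine...
ok = can_overwrite_without_erase(b"\xff\xf0", b"\x0f\x30")
# ...but setting a bit that has already been cleared needs an erase first.
needs_erase = not can_overwrite_without_erase(b"\xf0", b"\x0f")
```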
If the rows are fine enough granularity, you could in theory encode the data, a version number and an ECC in that row, and do some sort of delta-update. If only a few bytes in a block changed, there's no reason to store an entire new copy of the whole block. Only store the changed rows. This would provide great compression for certain types of updates, such as appending to a file or doing filesystem metadata updates (e.g., ext2 block-bitmap updates, where only some of the bits in the bitmap flip).
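Roughly what I have in mind for the delta-update, as a sketch rather than a real design. The 16-byte row width, the record layout and the function names are all assumptions, CRC32 is only a stand-in for a real ECC (it detects errors but can't correct them), and both blocks are assumed to be the same length and a whole number of rows:

```python
import zlib

ROW_BYTES = 16  # hypothetical row width


def split_rows(block: bytes):
    return [block[i:i + ROW_BYTES] for i in range(0, len(block), ROW_BYTES)]


def delta_rows(old: bytes, new: bytes, version: int):
    """Return one (row index, version, data, check) record per changed row."""
    records = []
    for index, (o, n) in enumerate(zip(split_rows(old), split_rows(new))):
        if o != n:
            # CRC32 stands in for the ECC that would share the row with the data.
            records.append((index, version, n, zlib.crc32(n)))
    return records


def apply_delta(base: bytes, records):
    """Rebuild the updated block from the base copy plus the stored records."""
    rows = split_rows(base)
    for index, _version, data, check in records:
        if zlib.crc32(data) != check:
            raise ValueError(f"row {index} failed its check")
        rows[index] = data
    return b"".join(rows)


old = bytes(64)            # a 4-row block of zeros
changed = bytearray(old)
changed[20] = 0x01         # flip one bit, as a bitmap update might
new = bytes(changed)
records = delta_rows(old, new, version=2)
```

In this example a one-bit change in a 64-byte block costs one 16-byte row plus its bookkeeping instead of a fresh copy of the whole block.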
If you also included an internal map that hashed all the data rows into a reverse map database, you could use that to quickly collapse all of the identical rows across the entire drive into a single row. That is, whenever you decide to go store a particular row of data, find out if that row already exists on the physical media and instead point to that. For typical storage patterns (e.g., lots of similar text across many files due to duplicated files, lots of end-of-block empty fill, etc.), this could result in huge on-disk savings. Those savings would then translate directly into a larger erase block pool for the same apparent loading vs. advertised capacity.
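The reverse map could look something like the following. Again this is just a sketch under assumptions: the class and attribute names are invented, SHA-256 is an arbitrary choice of hash, and it ignores reference counting and reclaiming rows that nothing points to any more, which a real drive would need.

```python
import hashlib


class DedupRowStore:
    """Toy row store that keeps one physical copy of each distinct row."""

    def __init__(self):
        self.physical = []   # rows actually written to the media
        self.by_hash = {}    # reverse map: content hash -> physical row index
        self.logical = {}    # logical row address -> physical row index

    def write(self, logical_addr, data: bytes):
        key = hashlib.sha256(data).digest()
        phys = self.by_hash.get(key)
        if phys is None:
            # First time this content is seen: it costs a real row.
            phys = len(self.physical)
            self.physical.append(data)
            self.by_hash[key] = phys
        # Otherwise just point the logical address at the existing copy.
        self.logical[logical_addr] = phys

    def read(self, logical_addr) -> bytes:
        return self.physical[self.logical[logical_addr]]


store = DedupRowStore()
store.write(0, b"\x00" * 16)        # end-of-block empty fill
store.write(1, b"some file text..")
store.write(2, b"\x00" * 16)        # same fill again, elsewhere on the drive
```

Three logical rows end up occupying two physical ones here; the row that was never written is the capacity that goes back into the erase block pool.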