LWN: Comments on "Facebook releases Flashcache"

Facebook releases Flashcache

Simetrical — Thu, 06 May 2010 17:20:51 +0000

Enterprise disks already have recoverable synchronous write guarantees, by means of battery-backed disk controllers. The advantage of using an SSD as a write-through buffer instead of battery-backed RAM is just that SSDs are much cheaper than RAM. Likewise, the advantage of using SSDs instead of RAM for read caching is that you get a much bigger read cache for the same amount of money. (Plus it's pre-populated on boot, but that's not a big deal for servers.)

So, no, it might not improve performance any more than adding the same amount of RAM, but the same amount of RAM costs a lot more. :) You don't need SSDs for caching if your dataset already fits in memory, but not all datasets fit in memory.

Facebook releases Flashcache

butlerm — Thu, 06 May 2010 04:52:34 +0000

"It depends on the consequences of losing a couple of transactions"

A well-designed persistent writeback cache should be able to store several minutes of writes, assuming that it is required to be present when the system reboots.

If you are just using it as a glorified write-through cache, there are no special requirements. But for most journalled filesystems and fsync-heavy database applications, a write-through cache won't improve performance any more than adding a large amount of RAM (aside from the possibility of reducing cache warmup time on system restart).

To speed up any application that issues synchronous writes, including the filesystem journal itself, reliable writeback capability must be present. Typical journalled filesystems do lose several seconds of user-level transactions on recovery. But when a filesystem wants to complete a metadata transaction it must issue a full barrier operation (generally meaning writing all dirty metadata buffers to disk) before it can continue, or it cannot provide any recovery guarantees _at all_.

Similar thing with databases and other applications that use fsync. If fsync returns before the file data is committed to persistent storage, there is a substantial risk that the database will be completely corrupted. All the database redo logs and so forth are potentially worthless if the database cannot be confident that certain writes have actually been made persistent.
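
To make the pattern concrete, here is a minimal Python sketch of a commit that only reports success after fsync returns. The function name `durable_append` and the record contents are made up for illustration. Whether the data is really on persistent storage once fsync returns still depends on the block device honouring the flush, which is exactly the concern above.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> None:
    """Append a record and return only after fsync has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        offset = 0
        while offset < len(record):
            # os.write may write fewer bytes than requested, so loop.
            offset += os.write(fd, record[offset:])
        # Block until the kernel reports the file data as flushed.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical redo-log file in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "redo.log")
        durable_append(log, b"commit 1\n")
        durable_append(log, b"commit 2\n")
        with open(log, "rb") as f:
            print(f.read())
```

A database would treat the transaction as committed only after `durable_append` returns.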

As in the filesystem case, even if you don't care about losing the last few seconds of user transactions, if you want to recover your database the database itself must be able to either commit writes and commit them now or have a block device that can provide full write barrier guarantees.

Ultimately this is a performance issue: if the flash cache provides the recoverable synchronous write guarantees, the latency for a database commit (or similar fsync-requiring operation) can drop by a couple of orders of magnitude.
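
If you want to see what that commit latency looks like on your own hardware, a rough Python sketch along these lines would do. The function name `mean_fsync_latency`, the 512-byte record and the commit count are arbitrary choices for illustration, not a proper benchmark; results will vary a lot with the device and filesystem.

```python
import os
import tempfile
import time


def mean_fsync_latency(path: str, commits: int = 50, record: bytes = b"x" * 512) -> float:
    """Return the mean seconds per write+fsync pair over `commits` rounds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(commits):
            os.write(fd, record)   # small append, like a log record
            os.fsync(fd)           # wait for it to be reported stable
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / commits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        latency = mean_fsync_latency(os.path.join(tmp, "commit.log"))
        print(f"mean commit latency: {latency * 1000:.3f} ms")
```

Running it once against a file on the plain disk and once against a file on the cached device would give a crude before/after comparison.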

Facebook releases Flashcache

koverstreet — Sat, 01 May 2010 01:48:47 +0000

If you do want safe write behind caching, checksumming and journalling are on the list for bcache:
http://lkml.org/lkml/2010/4/30/496
http://lkml.org/lkml/2010/4/30/497
http://lkml.org/lkml/2010/4/30/498
http://lkml.org/lkml/2010/4/30/499

Not as far along as flashcache, but it's moving quickly.

Facebook releases Flashcache

ESRI — Fri, 30 Apr 2010 23:49:49 +0000

Really would like to see this used to speed up O_SYNC type access (for NFS writes in sync mode).
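
For reference, this is roughly what O_SYNC access looks like from user space, sketched in Python and assuming a Linux/Unix system where `os.O_SYNC` is defined. The helper name `write_sync` is hypothetical. Each write is supposed to return only once the data has been reported as stable, so this is the kind of access that a cache able to acknowledge such writes safely would speed up.

```python
import os
import tempfile


def write_sync(path: str, chunks) -> int:
    """Write chunks to a file opened with O_SYNC; return total bytes written."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    total = 0
    try:
        for chunk in chunks:
            # With O_SYNC, each write behaves like a write followed by a sync.
            total += os.write(fd, chunk)
    finally:
        os.close(fd)
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sync.dat")
        print(write_sync(path, [b"hello", b"world"]), "bytes written")
```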

Facebook releases Flashcache

ThinkRob — Fri, 30 Apr 2010 15:07:37 +0000

Either that, or a clever L2-based monetization ploy...

Facebook releases Flashcache

cowsandmilk — Fri, 30 Apr 2010 13:03:22 +0000

Facebook doesn't make guarantees to users about performing transactions, and that's part of the point. The more 9's you require in reliability, the more expensive it is computationally. Facebook makes a system that gives enough reliability to make users happy. Occasional double posts or lost posts will be viewed by most users as their own error. If the NYSE doubles my order, I'm going to be pissed off. So, this system might not have the reliability for stock exchanges, but that's not what it was written for...

Facebook releases Flashcache

rstreeks — Fri, 30 Apr 2010 12:00:57 +0000

We all instinctively think that lost NYSE transactions are more important than Facebook updates because of the $ value. But to the end user there is no difference. Both groups get upset about lost transactions. With one you can easily put a $ value on it, but with the other you can't.
In both situations you have to do a lot of damage control.

Facebook releases Flashcache

fperrin — Fri, 30 Apr 2010 08:28:23 +0000

It depends on the consequences of losing a couple of transactions. If you're Facebook and you lose a dozen status updates or wall messages, does it really matter a lot? Of course, if you're NYSE and you lose some transactions, your users won't be very happy.

Facebook releases Flashcache

ledow — Fri, 30 Apr 2010 07:40:39 +0000

Looks like you just activate the same cache device again as you would on any normal boot and it would carry on from where it left off. You might lose a block or two of data that was unwritten but that's where things like journalling filesystems / transactional databases make MUCH more difference than what the writeback cache does.
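
A toy model of that "carry on from where it left off" behaviour, sketched in Python. This is not Flashcache's actual code or on-disk format: the class `ToyWritebackCache`, the JSON file standing in for the SSD and the dict standing in for the slow disk are all assumptions made for illustration. The idea is just that the dirty flags live on the cache device itself, so reopening the cache after a crash finds the unwritten blocks and can flush them.

```python
import json
import os
import tempfile


class ToyWritebackCache:
    """Toy model: a JSON file plays the SSD, a dict plays the backing disk."""

    def __init__(self, ssd_path, disk):
        self.ssd_path = ssd_path
        self.disk = disk      # block number -> data on the slow disk
        self.blocks = {}      # block number -> {"data": str, "dirty": bool}
        if os.path.exists(ssd_path):
            # Reactivating an existing cache: reload blocks and dirty flags.
            with open(ssd_path) as f:
                self.blocks = {int(k): v for k, v in json.load(f).items()}

    def _persist(self):
        # Write to a temp file, fsync, then rename so the update is atomic.
        tmp = self.ssd_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.blocks, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.ssd_path)

    def write(self, block, data):
        # The write is acknowledged once the cache copy is persisted;
        # the backing disk is not touched yet.
        self.blocks[block] = {"data": data, "dirty": True}
        self._persist()

    def flush_dirty(self):
        # Copy every dirty block to the backing disk and mark it clean.
        flushed = 0
        for block, entry in self.blocks.items():
            if entry["dirty"]:
                self.disk[block] = entry["data"]
                entry["dirty"] = False
                flushed += 1
        self._persist()
        return flushed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ssd = os.path.join(tmp, "ssd.json")
        disk = {}
        cache = ToyWritebackCache(ssd, disk)
        cache.write(7, "hello")
        del cache                                # simulate a crash before flushing
        recovered = ToyWritebackCache(ssd, disk)  # "activate the same cache device again"
        print(recovered.flush_dirty(), disk)
```

Anything acknowledged by `write` survives the simulated crash; only writes that never reached the cache file would be lost, which is the case left to the journalling filesystem or database.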

Facebook releases Flashcache

butlerm — Fri, 30 Apr 2010 06:48:57 +0000

What I want to know is what recovery mechanisms there are for flushing all the delayed writes from the flash cache to the actual media after a system crash. No one in his or her right mind would use a writeback cache without them, right?

Facebook releases Flashcache

paragw — Fri, 30 Apr 2010 01:20:08 +0000

For a moment I thought that, now that Adobe is done exploiting GPU acceleration, the next step was to explore kernel modules to hide the CPU usage and slow down the fans.

Facebook releases Flashcache

jstultz — Thu, 29 Apr 2010 22:15:13 +0000

Looks like it's based on the dm-cache work. Very cool to see that moving again. http://users.cis.fiu.edu/~zhaom/dmcache/index.html