"It depends on the consequences of losing a couple of transactions"
A well-designed persistent writeback cache should be able to hold several minutes of writes, assuming its contents are required to still be present when the system reboots.
If you are just using it as a glorified write-through cache, there are no special requirements. But for most journalled filesystems and fsync-heavy database applications a write-through cache won't improve performance any more than adding a large amount of RAM (aside from the possibility of reducing cache warmup time on system restart).
To speed up any application that issues synchronous writes, including the filesystem journal itself, reliable writeback capability must be present. Typical journalled filesystems do lose several seconds of user-level transactions on recovery. But when a filesystem wants to complete a metadata transaction, it must issue a full barrier operation (generally meaning writing all dirty metadata buffers to disk) before it can continue, or it cannot provide any recovery guarantees _at all_.
The same applies to databases and other applications that use fsync. If fsync returns before the file data is committed to persistent storage, there is a substantial risk that the database will be completely corrupted. All the database redo logs and so forth are potentially worthless if the database cannot be confident that certain writes have actually been made persistent.
As in the filesystem case, even if you don't care about losing the last few seconds of user transactions, if you want to be able to recover your database, the database must either be able to commit writes immediately or have a block device that can provide full write barrier guarantees.
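To make the fsync requirement concrete, here is a minimal Python sketch of a durable append to a redo-style log. The function name `durable_append` and the log file name are hypothetical, chosen only for illustration; the point is that the function does not return (and so the caller does not acknowledge the commit) until `os.fsync` has returned. Whether the data is then truly on persistent media still depends on the storage stack below honouring the flush, which is exactly the guarantee discussed above.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> int:
    """Append `record` to `path`; return the byte count only after fsync returns."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(record)
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(view):
            written += os.write(fd, view[written:])
        # Ask the kernel to push the file data to the storage device.
        # If the device acknowledges this before the data is persistent,
        # the "commit" below is not actually safe.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "redo.log")  # hypothetical log file
        durable_append(log, b"BEGIN;UPDATE t SET x=1;COMMIT\n")
        print("commit acknowledged only after fsync returned")
```

A real database would also fsync the containing directory after creating a new log file; that step is left out here to keep the sketch short.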
Ultimately this is a performance issue - if the flash cache provides the recoverable synchronous write guarantees, the latency for a database commit (or similar fsync-requiring operation) can drop by a couple of orders of magnitude.
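One rough way to see the commit latency in question is to time small writes with and without an fsync after each one. The Python sketch below does that; the helper name `mean_write_latency` and the record size and count are arbitrary assumptions, and the numbers it prints depend entirely on the hardware and filesystem underneath, so treat it as a measuring aid rather than a benchmark.

```python
import os
import tempfile
import time


def mean_write_latency(path: str, count: int, sync: bool, size: int = 512) -> float:
    """Return mean seconds per record; with sync=True every record is fsync'd."""
    payload = b"x" * size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            if sync:
                # One "commit" per record: wait for the device to acknowledge.
                os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "latency.dat")
        buffered = mean_write_latency(path, 200, sync=False)
        synced = mean_write_latency(path, 200, sync=True)
        print(f"buffered write: {buffered * 1e6:.1f} us per record")
        print(f"write + fsync:  {synced * 1e6:.1f} us per record")
```

Running it against a plain rotating disk and against a device with a persistent write cache in front of it should show how much of the fsync cost such a cache absorbs.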
Posted May 6, 2010 17:20 UTC (Thu) by Simetrical (subscriber, #53439)
Enterprise disks already have recoverable synchronous write guarantees, by means of battery-backed disk controllers. The advantage of using an SSD as a write-through buffer instead of battery-backed RAM is just that SSDs are much cheaper than RAM. Likewise, the advantage of using SSDs instead of RAM for read caching is that you get a much bigger read cache for the same amount of money. (Plus it's pre-populated on boot, but that's not a big deal for servers.)
So, no, it might not improve performance any more than adding the same amount of RAM, but the same amount of RAM costs a lot more. :) You don't need SSDs for caching if your dataset already fits in memory, but not all datasets fit in memory.