Two separate block-level caching solutions, dm-cache and bcache, were the
topic of an LSFMM Summit 2013 discussion led by Mike Snitzer, Kent
Overstreet, Alasdair Kergon, and Darrick Wong. Snitzer started things off
with an overview of dm-cache, which was
included in the 3.9 kernel. It uses the
kernel device mapper framework to implement a writeback or
writethrough cache on a fast device for a slower "origin" device.
Essentially, dm-cache uses the device mapper core and adds a policy layer
on top of
it. The policy layer is "almost like" a plugin interface, where
different kinds of policies can be implemented, Snitzer said. Those
policies (along with the cache contents) determine
whether there is a hit or a miss on the cache and whether a migration (moving
data between the origin and the cache device in either direction) is
required. Various policies have been implemented, including least recently
used (LRU), most frequently used (MFU), and so on, but only the default
"mq" (multiqueue) policy was merged, to limit the number of policies being
tested initially.
There are hints that can be supplied by the filesystem to the policy, such
as blocks that are
dirty or have been discarded. That kind of information can help the policy
make a more informed decision about where to store blocks.
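As a rough sketch of how those pieces fit together, a dm-cache device is
assembled by handing the "cache" target a metadata device, the fast cache
device, and the slow origin device, along with a block size (in 512-byte
sectors), feature arguments such as writeback or writethrough, and the policy
to use. The example below follows the format described in the kernel's cache
target documentation; the device names and sizes are hypothetical:

    # Hypothetical devices; the origin is 41943040 sectors (20GB) long, the
    # cache uses 512-sector (256KB) blocks in writeback mode with the
    # default (mq) policy.
    dmsetup create cached --table '0 41943040 cache /dev/fast/metadata /dev/fast/cacheblocks /dev/slow/origin 512 1 writeback default 0'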
Overstreet then gave a status update for bcache, which is queued for 3.10.
There are "lots of users", he said, and the code has been relatively stable
for a while. He has been concentrating mostly on bug fixes recently.
Unlike dm-cache, which is "tiered storage", Overstreet said, bcache is more
of a conventional cache. It can store arbitrary extents, down to a single
sector, whereas with dm-cache, a block is either entirely cached or it isn't.
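By way of comparison, bcache is set up by formatting a backing device and a
cache device with make-bcache, then attaching the cache set to the resulting
bcache device through sysfs. The commands below are only a sketch, with
hypothetical device names and a placeholder for the cache set's UUID:

    # /dev/sdb is the slow backing disk, /dev/sdc the SSD used as the cache
    make-bcache -B /dev/sdb
    make-bcache -C /dev/sdc
    # attach the cache set to the newly created bcache0 device
    echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach
    # optionally switch from the default writethrough mode to writeback
    echo writeback > /sys/block/bcache0/bcache/cache_mode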
Just before the summit, Wong sent an email
comparing the performance of bcache, dm-cache, and EnhanceIO to several
mailing lists (dm-devel, linux-bcache, linux-kernel). He built a kernel
with each of them enabled and ran some tests. He found that EnhanceIO was the
slowest, bcache had four to six times better performance, and dm-cache had
better performance by a factor of 15, except when it didn't. All were
compared to the same test being run on a regular hard disk, and sometimes, for
reasons unknown, dm-cache performed more or less the same as the disk. He
did note that some tests would cause the inode tables created by
mkfs to be cached, which is not a particularly efficient use of
the cache.
Snitzer is trying to reproduce Wong's results, he said, but currently is
getting poor results for both bcache and dm-cache. He said that he wants
to get together with Overstreet to try to figure it out. For his part, Overstreet
cautioned against reading too much into synthetic benchmarks. They can be
useful, but can also be misleading. Ric Wheeler asked if Snitzer was
"seeing real improvements with real workloads"; Snitzer mentioned that
switching between Git tags was one example where there was a clear win for
dm-cache, but he needs some help identifying more real-world workloads.
An attendee asked whether the solutions always assumed the presence
of a cache device. Snitzer said that dm-cache does make that assumption,
but it is something that needs to change. Overstreet said that bcache does
not require a cache device at all times. Since bcache is being used in
production, it has had the time to hit the corner cases and handle
situations where the cache device is unavailable.
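To illustrate, a bcache device can be detached from its cache set on the fly
through sysfs; I/O then goes straight to the backing device until a cache is
attached again. The paths below assume the hypothetical bcache0 device from
the earlier sketch:

    # detach the cache set; bcache0 keeps working, just uncached
    echo 1 > /sys/block/bcache0/bcache/detach
    # reattach later using the cache set's UUID
    echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach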
Snitzer said that there are still things that need to be done for
dm-cache. Originally, it was doing I/O in parallel between the cache and
the origin, but ultimately had to fall back to sequential I/O. Also, with
non-volatile memory (NVM) devices coming down the pipe, storage
hierarchies are likely. Since dm is "all about stacking", dm-cache will
fit well into that world, though Overstreet pointed out that bcache can
stack as well.
No real conclusions were reached, other than the need to get better
"real-world" performance numbers for both solutions. Figuring out why
various testers are getting wildly different results is part of that as
well.