I'd trust those benchmarks to be indicative of problems in ext4 as far as
I could throw their designer.
E.g., one really obvious case they haven't checked is whether other
filesystems and raw block devices are also affected. If they are, this is
due to changes in the block layer (of which there have been many more in
the target time period than changes to ext4), and Ted is blameless.
TBH their PostgreSQL results are indicative of a misconfiguration of some
kind more than anything else. Do you think no PostgreSQL users anywhere
would have noticed a *factor of four slowdown* between .31 and .32 and
mentioned it on the kernel list?
Posted Jan 24, 2010 14:02 UTC (Sun) by kronos (subscriber, #55879)
> TBH their PostgreSQL results are indicative of a misconfiguration of some
> kind more than anything else.
AFAIK the difference is caused by barriers being enabled by default in .32; the change is deliberate.
Performances of ext4
Posted Jan 24, 2010 14:32 UTC (Sun) by nix (subscriber, #2304)
I can see no sign of that in the ext4 git changelog. AFAIK barriers were
always enabled for ext4... but commit
5f3481e9a80c240f169b36ea886e2325b9aeb745 causes an fdatasync() in the
middle of an already-allocated file to always flush its blocks out (with a
barrier). PostgreSQL would be 'bitten' by this hard (really bitten by the
bug it fixes): almost all its writes are in the middle of
already-allocated files, and before this change the fdatasync() wouldn't
actually have synced anything but the inode, AFAICS.
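For anyone who wants to see the write pattern in question, here is a minimal sketch of "overwrite in the middle of an already-allocated file, then fdatasync()". The helper names (`overwrite_in_place`, `demo`) and the 8 KiB block size are assumptions for illustration only; this is not PostgreSQL's actual code, and it says nothing about what the kernel does underneath.

```python
import os
import tempfile

# Fall back to fsync() on platforms where os.fdatasync is unavailable.
_datasync = getattr(os, "fdatasync", os.fsync)

BLOCK = 8192  # illustrative block size, an assumption


def overwrite_in_place(path, offset, data):
    """Overwrite bytes inside an existing file, then request durability."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, data)   # blocks are already allocated, size is unchanged
        _datasync(fd)        # only data (not timestamps etc.) must be flushed
    finally:
        os.close(fd)


def demo():
    """Preallocate four blocks, rewrite the second, return sizes before/after."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * BLOCK * 4)
        f.flush()
        os.fsync(f.fileno())
        path = f.name
    try:
        before = os.path.getsize(path)
        overwrite_in_place(path, BLOCK, b"x" * BLOCK)
        after = os.path.getsize(path)
        return before, after
    finally:
        os.unlink(path)
```

The file size does not change across the overwrite, which is exactly the case where only the data blocks (and no size-changing metadata) need to reach the disk.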
Performances of ext4
Posted Jan 24, 2010 16:07 UTC (Sun) by patrick_g (subscriber, #44470)
Perhaps... but going from 1069 transactions per second (2.6.31) in pgbench to only 280 (2.6.32), the cost is gigantic!
See this page.
Who knows
Posted Jan 24, 2010 23:17 UTC (Sun) by man_ls (subscriber, #15091)
And who can tell if the change is really worth it after the previous ext4 fiasco?
Who knows
Posted Jan 25, 2010 8:12 UTC (Mon) by nix (subscriber, #2304)
IMNSHO, anything that fscks as fast as ext4 is worth it, no matter what
else has changed :)
Performances of ext4
Posted Jan 26, 2010 8:02 UTC (Tue) by kleptog (subscriber, #1183)
280 transactions per second sounds about right for a system with spinning disks attached. A transaction is committed when the data hits the log and in general you can do this once per revolution of the disk platter. If there are simultaneous transactions they can commit together.
Anyone who gets thousands of transactions per second either has a battery-backed cache on the hard disk controller, or does not have the D in ACID. Or is running on SSDs.
The fact that disks and operating systems have silently been ignoring fsync requests has gotten people used to completely unrealistic numbers.
Performances of ext4
Posted Jan 26, 2010 15:00 UTC (Tue) by ricwheeler (subscriber, #4980)
I agree in general with the comment, but have to point out that the transaction rate depends on a lot of things.
You can get a rough idea of how many transactions your storage can do by timing the fsync() calls per second on a dirty file. On a SATA drive, that number is around 30-40 per second; on an enterprise-class array it can jump up to 700/sec over Fibre Channel; and with something like a PCIe SSD device it can go beyond that.
Note that you can also try to batch multiple transactions into one commit: ext4, for example, supports batching fsync calls from multi-threaded writers.
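A rough way to take that measurement is sketched below. The function name `fsyncs_per_second`, the 4 KiB write size, and the default count are all assumptions for illustration; the loop appends to the file, so each fsync() also has to push out a size change, and results will vary a great deal with the filesystem, mount options, and any write cache in the path.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory=None, count=50, block=b"x" * 4096):
    """Dirty a scratch file and fsync() it `count` times; return calls/sec.

    `directory` should live on the storage being measured; None uses the
    default temp directory, which may be tmpfs and thus meaningless.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)  # make the file dirty again
            os.fsync(fd)         # wait until the kernel reports it flushed
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed


if __name__ == "__main__":
    print(f"{fsyncs_per_second():.0f} fsyncs/sec")
```

If the number comes out in the thousands on a single rotating disk, something between the application and the platter is probably absorbing the flushes.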
Performances of ext4
Posted Jan 26, 2010 15:30 UTC (Tue) by dlang (✭ supporter ✭, #313)
SATA drives can do better than that.
for rotating media, figure the drive can do one fsync per rotation when writing to a sequential file; for 7200 rpm drives this is ~120/sec.
if you are getting thousands of transactions/sec from a database test, you have some buffering going on, and unless that buffering is battery backed, you will lose it in power outages.
the one exception is that if you have multiple transactions going in parallel, you may be able to have different transactions complete their syncs in the same disk rotation, so you may get # threads * (rpm/60) syncs/sec.
enterprise storage arrays have large battery-backed RAM buffers, which do wonders for your transaction rate, up until the point where those buffers are filled (although even then they give you a benefit, as multiple transactions can be batched and written at once, reducing the number of writes to the drives).
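The back-of-the-envelope arithmetic above (one sync per rotation, times the number of transactions that can share a rotation) can be written out as a tiny sketch. The function name `max_syncs_per_second` is made up for illustration, and this is only the idealized ceiling: it ignores seeks, controller caches, and everything else.

```python
def max_syncs_per_second(rpm, parallel_transactions=1):
    """Idealized ceiling: one sync per platter rotation per parallel writer."""
    rotations_per_second = rpm / 60
    return parallel_transactions * rotations_per_second


print(max_syncs_per_second(7200))      # 120.0
print(max_syncs_per_second(15000))     # 250.0
print(max_syncs_per_second(7200, 4))   # 480.0
```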