Posted Oct 6, 2012 10:37 UTC (Sat) by skitching (subscriber, #36856)
Parent article: Samsung's F2FS filesystem
The article mentions that this FS is intended to handle "the snowball effect". Can someone explain what this means?
Is it a reference to the problem with log-structured systems where writing to a leaf block means not only copying that block, but also copying the direct node-block that points to it, and potentially copying the indirect node-block that refers to that direct node-block, and then the inode that refers to the indirect node-block etc?
And if so, how does it "avoid" this effect? Is it somehow a hybrid of log-structured and in-place-modified FS? And if so, does this then lose the ability of log-structured systems to support snapshots etc?
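For anyone who wants the cascade from the question spelled out, here is a minimal sketch. It assumes a pure log-structured layout in which every block stores the address of its child, so relocating a child forces a rewrite of its parent. The names `Block`, `Log` and `update_leaf` are made up for illustration and are not taken from F2FS.

```python
class Block:
    """A block in a hypothetical pointer tree: data -> direct -> indirect -> inode."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # the block that stores this block's address
        self.addr = None      # current location in the log


class Log:
    """Append-only log: every write lands at a fresh address."""

    def __init__(self):
        self.next_addr = 0
        self.writes = []

    def append(self, block):
        block.addr = self.next_addr
        self.next_addr += 1
        self.writes.append(block.name)


def update_leaf(log, leaf):
    """Rewrite a leaf and, because its address changed, every ancestor too."""
    rewritten = []
    node = leaf
    while node is not None:
        log.append(node)
        rewritten.append(node.name)
        node = node.parent
    return rewritten


inode = Block("inode")
indirect = Block("indirect", parent=inode)
direct = Block("direct", parent=indirect)
data = Block("data", parent=direct)

log = Log()
rewritten = update_leaf(log, data)
print(rewritten)  # ['data', 'direct', 'indirect', 'inode']
```

In this toy model, one logical write to a data block turns into four block writes, which is the growth the question describes.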
Posted Oct 7, 2012 2:59 UTC (Sun) by cmccabe (guest, #60281)
I think the snowball effect is a reference to write amplification.
Snowball effect
Posted Oct 7, 2012 21:38 UTC (Sun) by neilbrown (subscriber, #359)
Every time my RSS reader (RSSyl in Claws Mail) reloads the feed, it adds another copy of the above comment. That's a novel form of write amplification!
Now the important question: is that irony, or just coincidence?
Snowball effect
Posted Oct 11, 2012 0:55 UTC (Thu) by neilbrown (subscriber, #359)
No, this isn't a reference to write amplification.
The correct answers to the GP's questions are:
1/ yes
2/ yes
3/ ...
4/ yes
5/ yes
The answer to 3 is a little long to fit in this comment.
Snowball effect
Posted Oct 9, 2012 13:21 UTC (Tue) by the.wrong.christian (guest, #73127)
> And if so, how does it "avoid" this effect? Is it somehow a hybrid of log-structured and in-place-modified FS? And if so, does this then lose the ability of log-structured systems to support snapshots etc?
My guess (only a guess, as I've not looked too deeply into it) is that the FS writes only leaf nodes to the log and keeps changes to non-leaf interior nodes in memory until lots of writes have accumulated. The non-leaf nodes are then written out in one go at checkpoint time, thus amortizing changes to non-leaf nodes across many leaf writes. It can do this because the leaf nodes can be used to reconstruct the non-leaf data by starting from the last known snapshot and rolling forward.
Makes recovery slightly slower, but hey, that's not the predominant use case.
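To put rough numbers on that guess, here is a small sketch. It models only the scheme guessed at above, not F2FS's actual design, and `DeferredLog`, `write_leaf`, `checkpoint` and `eager_cost` are hypothetical names. Interior nodes are tracked as dirty in memory and reach the log once per checkpoint.

```python
class DeferredLog:
    """Toy model: leaves go to the log at once, interior nodes wait for a checkpoint."""

    def __init__(self):
        self.log = []       # names of blocks written to storage, in order
        self.dirty = set()  # interior nodes changed only in memory so far

    def write_leaf(self, leaf, ancestors):
        self.log.append(leaf)          # one block write now
        self.dirty.update(ancestors)   # ancestors are only marked dirty

    def checkpoint(self):
        flushed = sorted(self.dirty)   # each dirty interior node is written once
        self.log.extend(flushed)
        self.dirty.clear()
        return len(flushed)


def eager_cost(leaf_writes, depth):
    """Block writes if every leaf write also rewrote all `depth` ancestors."""
    return leaf_writes * (1 + depth)


fs = DeferredLog()
ancestors = ["direct0", "indirect0", "inode7"]  # assumed shared by all 100 leaves
for i in range(100):
    fs.write_leaf(f"data{i}", ancestors)
flushed = fs.checkpoint()

deferred_total = len(fs.log)       # 100 leaf writes + 3 interior writes
eager_total = eager_cost(100, 3)   # 100 * (1 + 3)
print(deferred_total, eager_total)  # 103 400
```

The saving depends entirely on how many leaf writes share ancestors between checkpoints. With leaves scattered across many files, the dirty set grows and the gap narrows.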