The trouble with discard

Posted Aug 19, 2009 10:33 UTC (Wed) by ewen (subscriber, #4772)
Parent article: The trouble with discard

On the other hand, regularly discarding all of the free space in a filesystem makes it likely that some time will be spent telling the device to discard sectors which it already knows to be free.

At the cost of another block bitmap (storing the kernel's idea of the device's "free bitmap"), this could be reduced to one redundant discard per mount of the file system for each sector that the device already knows to be free. The bitmap would start out assuming that the device considered all blocks in use. The kernel would then discard the blocks that the file system knows aren't in use and update the "device free bitmap" to match the file system one, so that further discards could be limited to newly freed blocks.
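
As a rough sketch of that bookkeeping (Python purely for illustration, not kernel code): the class name `DiscardTracker`, the `issue_discard(start, count)` callback and the `mark_written` hook are all hypothetical names invented here. The sketch also assumes that writing to a block makes the device consider it in use again, so the shadow bit has to be cleared at that point.

```python
class DiscardTracker:
    """Shadow bitmap: the kernel's idea of which blocks the device thinks are free."""

    def __init__(self, nblocks, issue_discard):
        self.nblocks = nblocks
        # Hypothetical callback: issue_discard(start, count) sends one discard request.
        self.issue_discard = issue_discard
        # At mount time, assume the device considers every block to be in use.
        self.device_free = [False] * nblocks

    def sync(self, fs_free):
        """Discard blocks that are free in the file system but not yet known
        to be free on the device, merging neighbours into extents.
        Returns the number of discard requests issued."""
        issued = 0
        start = None
        # Run one step past the end so a trailing extent gets flushed.
        for blk in range(self.nblocks + 1):
            pending = (blk < self.nblocks
                       and fs_free[blk]
                       and not self.device_free[blk])
            if pending and start is None:
                start = blk
            elif not pending and start is not None:
                self.issue_discard(start, blk - start)
                for b in range(start, blk):
                    self.device_free[b] = True
                issued += 1
                start = None
        return issued

    def mark_written(self, blk):
        """Assumption: a write makes the device treat the block as in use."""
        self.device_free[blk] = False
```

Calling `sync()` a second time with an unchanged file system bitmap issues no discards at all, which is the saving being described.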

Whether this makes sense as a time/memory trade-off probably depends on the size of the block bitmap and on the proportion of blocks likely to be free; for very large (more than 1TB?) file systems which are mostly full (more than 90%?) it may be better just to discard repeatedly and save the RAM.
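
To put a rough number on the RAM side of that trade-off: the arithmetic below assumes one bit per block and 4 KiB blocks. The block size is an assumption made for the example, and `bitmap_bytes` is just a helper name invented here.

```python
def bitmap_bytes(fs_bytes, block_size=4096):
    """Size in bytes of a one-bit-per-block bitmap (block_size is an assumption)."""
    nblocks = fs_bytes // block_size
    return (nblocks + 7) // 8  # round up to whole bytes

TIB = 1 << 40
MIB = 1 << 20

print(bitmap_bytes(TIB) // MIB)  # prints 32
```

So under those assumptions the extra bitmap costs about 32 MiB of RAM per TiB of file system, regardless of how full the file system is.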

And of course there's no particular reason why all the discards to "free up space" need to be done at once. Like any garbage collection, it is probably best implemented incrementally, with some sort of mark'n'sweep algorithm in a background thread which runs only when IO on the underlying device is mostly idle. Covering the file system space every few hours would probably be fine for most use cases.
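
A minimal sketch of one step of such a background pass follows. It is only an incremental sweep over the file system's free bitmap, not a full mark-and-sweep collector, and `sweep_step`, `is_idle` and `issue_discard` are hypothetical names. It assumes a non-empty bitmap, and that the caller reschedules the step from a background thread.

```python
def sweep_step(fs_free, cursor, chunk, is_idle, issue_discard):
    """Examine up to `chunk` blocks starting at `cursor`, discarding free
    extents, and return the cursor for the next step."""
    if not is_idle():
        return cursor  # device is busy: do nothing now, retry later
    end = min(cursor + chunk, len(fs_free))
    start = None
    # Run one step past `end` so an extent touching the chunk edge is flushed.
    for blk in range(cursor, end + 1):
        free = blk < end and fs_free[blk]
        if free and start is None:
            start = blk
        elif not free and start is not None:
            issue_discard(start, blk - start)
            start = None
    return end % len(fs_free)  # wrap around to begin the next full pass
```

The chunk size and the delay between steps together set how long a full pass takes, so "every few hours" is a matter of tuning those two values.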

Ewen

PS: I'd guess TRIM was made a flushing operation so that it could be used as a security boundary: on completion one could be sure that the block had been discarded. In which case ATA would benefit from another, advisory operation (RELEASE?) which could be tagged and/or return immediately, and which just queued the block as potentially discardable if that helped the device. This would also make repeated discarding both desirable and much cheaper.