Desmond: Out-Tridging Tridge
Posted Aug 29, 2013 22:54 UTC (Thu) by mathstuf (subscriber, #69389)
where C is a smaller block to "fill gaps between the full ones".
Posted Aug 30, 2013 1:04 UTC (Fri) by khc (guest, #45209)
we will simplify things by saying the max chunk size is 5 bytes instead. With your example, it's easy to see how AAAAA would be a match. Next we get to C, which is not a match. Then we get to CB, CBB, CBBB, CBBBB, none of these would match, and we hit the max chunk size. This normally would cause us to define a new chunk, with a chunk size of 5 bytes.
I guess it may work if the lookup window is twice the max chunk size though.
Posted Aug 30, 2013 9:03 UTC (Fri) by ikm (subscriber, #493)
In a nutshell: after each deduplication matched, the window starts growing from a single byte to its full width. After that, the window starts rolling forward, looking for a match. If a match is found, the data left behind the window is output as a new short chunk (or even in-place if it's too little). If no match is found at the point where the data left behind is enough for a new full-sized chunk, the data left behind is output as a new full-sized chunk. The window then continues to roll.
Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds