User: Password:
|
|
Subscribe / Log in / New account

Desmond: Out-Tridging Tridge

Desmond: Out-Tridging Tridge

Posted Aug 29, 2013 8:05 UTC (Thu) by khc (guest, #45209)
In reply to: Desmond: Out-Tridging Tridge by ikm
Parent article: Desmond: Out-Tridging Tridge

What do you mean by "fill gaps between the full ones"?


(Log in to post comments)

Desmond: Out-Tridging Tridge

Posted Aug 29, 2013 22:54 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

I imagine that it's likely something like this:

Before:
> AAAAABBBBB

After:
> AAAAACBBBBB

where C is a smaller block to "fill gaps between the full ones".

Desmond: Out-Tridging Tridge

Posted Aug 30, 2013 1:04 UTC (Fri) by khc (guest, #45209) [Link]

Right. It's a little strange though. Consider how the rolling hash would work (disclaimer: I actually haven't looked at zbackup's code):

we will simplify things by saying the max chunk size is 5 bytes instead. With your example, it's easy to see how AAAAA would be a match. Next we get to C, which is not a match. Then we get to CB, CBB, CBBB, CBBBB, none of these would match, and we hit the max chunk size. This normally would cause us to define a new chunk, with a chunk size of 5 bytes.

I guess it may work if the lookup window is twice the max chunk size though.

Desmond: Out-Tridging Tridge

Posted Aug 30, 2013 9:03 UTC (Fri) by ikm (subscriber, #493) [Link]

Once you get to CBBBB, the hash starts to roll. The first roll would get it to BBBBB, leaving C between the deduplicated chunks AAAAA and BBBBB (this whole example assumed that both AAAAA and BBBBB were encountered before so they got deduplicated here).

In a nutshell: after each deduplication matched, the window starts growing from a single byte to its full width. After that, the window starts rolling forward, looking for a match. If a match is found, the data left behind the window is output as a new short chunk (or even in-place if it's too little). If no match is found at the point where the data left behind is enough for a new full-sized chunk, the data left behind is output as a new full-sized chunk. The window then continues to roll.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds