I see what khim means:
That thread describes an example where a file has been mostly rewritten, and the new file is almost identical to an unrelated file elsewhere in the tree, because they are both so short that the license header constitutes the majority of the file.
Since git tracks the state of trees, there's no problem when you're using git natively. Sending the changes to svn, though, requires git to generate a diff. The change to the file is so great, and the similarity to an existing file so high, that the diff generated by default describes a copy of the other file followed by a change to the non-license part. That is clearly a failure of git's move/copy detection heuristics, and it will produce a nonsensical change history when imported into svn.
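To make the effect concrete, here is a rough sketch of how a shared license header can dominate a similarity score for short files. It uses Python's difflib on whole lines, which is only a stand-in and not git's actual algorithm, and the file contents are invented placeholders.

```python
import difflib

# Placeholder 15-line license header shared by both files (invented content).
LICENSE = [f"# license line {i}\n" for i in range(1, 16)]


def similarity(a, b):
    """Line-based similarity in [0, 1]; a stand-in for git's own measure."""
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


# Two unrelated files whose only common content is the header.
rewritten = LICENSE + ["def parse(path):\n", "    return open(path).read()\n"]
unrelated = LICENSE + ["def render(doc):\n", "    return str(doc)\n"]

with_header = similarity(rewritten, unrelated)
bodies_only = similarity(rewritten[len(LICENSE):], unrelated[len(LICENSE):])

print(f"with header: {with_header:.2f}, bodies only: {bodies_only:.2f}")
```

With the header included, the score lands well above git's default 50% rename/copy threshold even though the actual code has nothing in common.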
That said, it's arguably not too major a failure, because if you're looking at the diffs you're generating then you can see the problem immediately and can ask git to DTRT by setting the similarity parameter for the heuristic to something more appropriate. I don't know if git-svn makes that easy to notice in the standard workflow though, as I don't use it.
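For plain git, the similarity parameter is exposed through the `--find-renames` and `--find-copies` options of `git diff`. The helper below is a hypothetical convenience wrapper that only builds the command line. Whether git-svn forwards an equivalent setting in its standard workflow is not something this sketch covers.

```python
def git_diff_command(similarity_percent, detect_copies=True):
    """Build a `git diff` argv with a stricter rename/copy threshold.

    `git_diff_command` is a hypothetical helper; the flags themselves are
    standard `git diff` options.
    """
    if not 0 <= similarity_percent <= 100:
        raise ValueError("similarity_percent must be between 0 and 100")
    cmd = ["git", "diff", f"--find-renames={similarity_percent}%"]
    if detect_copies:
        # Also require the same similarity before reporting a copy.
        cmd.append(f"--find-copies={similarity_percent}%")
    return cmd


print(" ".join(git_diff_command(90)))
```

The resulting list can be handed to `subprocess.run` from inside a repository to check whether the bogus copy disappears at the stricter setting.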
There are some fairly dodgy heuristics in git which can cause this kind of thing. While we're looking at them, I might suggest that the similarity threshold could be a) dynamically adjusted according to the length of the file, and b) possibly altered to weight lines more heavily the further down the file they are, on the reasoning that it's typically the beginning of a file (for code, anyway) that has a load of common guff that confuses the similarity detection.
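Both suggestions can be sketched in a few lines. This is purely illustrative and not how git works: the linear weighting ramp, the function names, and the numbers in `threshold_for` are all assumptions picked for the example.

```python
import difflib


def plain_similarity(a, b):
    """Unweighted line similarity, for comparison."""
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


def weighted_similarity(a, b):
    """Similarity in [0, 1] where lines further down a file count for more."""
    def weight(index, length):
        # Assumed linear ramp: small weight at the top, 1.0 at the last line.
        return (index + 1) / length

    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    matched = 0.0
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            matched += weight(block.a + k, len(a)) + weight(block.b + k, len(b))
    total = (sum(weight(i, len(a)) for i in range(len(a)))
             + sum(weight(j, len(b)) for j in range(len(b))))
    return matched / total if total else 1.0


def threshold_for(num_lines, base=0.5, strict=0.9, short=40):
    """Hypothetical rule: demand more similarity from shorter files."""
    if num_lines >= short:
        return base
    # Interpolate from `strict` (empty file) down to `base` (at `short` lines).
    return strict - (strict - base) * num_lines / short


header = [f"# license line {i}\n" for i in range(10)]   # shared boilerplate
file_a = header + [f"alpha {i}\n" for i in range(5)]    # unrelated bodies
file_b = header + [f"beta {i}\n" for i in range(5)]

print(f"plain:     {plain_similarity(file_a, file_b):.2f}")
print(f"weighted:  {weighted_similarity(file_a, file_b):.2f}")
print(f"threshold: {threshold_for(len(file_a)):.2f}")
```

In this toy case the plain score sits above a 50% threshold while the weighted score falls below it, and the length-adjusted threshold for a 15-line file is stricter than 50% in any case.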
Copyright © 2017, Eklektix, Inc.