> At the core of every single VCS sit "diff" and "patch", totally useless for binaries.
You can 'diff' and 'patch' binaries. What you can't usually do is 'merge' them. Nevertheless, the need for versioning them exists, even if they aren't mergeable. By the way, to that end SVN supports locking, so that only one person works on a binary at a time. That would be quite weird for a DVCS, but centralized SVN can afford this.
Posted Apr 4, 2010 14:43 UTC (Sun) by tialaramex (subscriber, #21167)
Agreed that binary files (sometimes) want revision control, and that digging a hole is not the solution.
Perhaps though it's wrong to accept the "can't merge" outcome, particularly in the video game context. It's my understanding that video games almost always have someone in the "toolsmith" role. A toolsmith's job could include providing merge for binary formats where that seems like a meaningful operation.
A really simple example is image merge. Take three layered "source images": the "original", one with an improved stone effect layer made by artist A, and one with the newly agreed "bad guy logo" in the emboss layer from lead designer B. It ought to be possible to take the changes made by A and by B and merge them to produce a new image which has the nicer stone AND the new logo. This is a mechanical operation (load all three images, take each changed layer from whichever version changed it, save) but it requires insight into the format of the data.
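To make the layer idea concrete, here is a minimal sketch of a three-way merge at layer granularity. It assumes a hypothetical in-memory model where an image is just a mapping from layer name to raw layer bytes; a real toolsmith would have to parse the actual image format first, and the names below are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

Layers = Dict[str, bytes]  # hypothetical model: layer name -> raw layer data


def merge_layers(base: Layers, ours: Layers, theirs: Layers) -> Tuple[Layers, List[str]]:
    """Three-way merge at layer granularity.

    Returns the merged layers plus the names of layers that both sides
    changed differently, which still need a human decision.
    """
    merged: Layers = {}
    conflicts: List[str] = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        result: Optional[bytes]
        if o == t:      # both sides agree (including both deleting the layer)
            result = o
        elif o == b:    # only "theirs" touched this layer
            result = t
        elif t == b:    # only "ours" touched this layer
            result = o
        else:           # both touched it differently
            conflicts.append(name)
            result = o  # keep "ours" as a placeholder until someone decides
        if result is not None:
            merged[name] = result
    return merged, conflicts


original = {"stone": b"stone-v1", "emboss": b"logo-v1", "background": b"sky"}
artist_a = {**original, "stone": b"stone-v2"}    # improved stone effect
designer_b = {**original, "emboss": b"logo-v2"}  # new bad guy logo

merged, conflicts = merge_layers(original, artist_a, designer_b)
print(merged["stone"], merged["emboss"], conflicts)
```

The conflict list is the part that matters in practice: layers that only one person touched merge silently, and only the genuinely contested ones go back to a human.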
But even non-mechanical merges are useful. Maybe the two unrelated changes to the intro level can't be merged by a machine, but the level design tool could be tricked out with a feature that can load another level and show it as a copyable ghost on top of the first. That takes merge from a painful and lossy operation worth avoiding at any cost (svn locking) to a relatively mundane occurrence, possible whenever necessary but not to be encouraged.
Support large repositories!
Posted Apr 4, 2010 17:15 UTC (Sun) by RCL (guest, #63264)
I haven't even seen visual diffing, let alone merging, for these formats
(binary or textual, doesn't matter). I'm not talking about images, but about
animations, geometry etc., which are usually stored in proprietary and/or
ad hoc formats because of efficiency requirements.
What you are proposing is a nice idea, but it would take an enormous
amount of work to be generally applicable (merge of two skinned characters
with different numbers of bones, anyone?) and would still be error-prone.
Moreover, merging between two different data sets is not solely a
technical problem, it requires artistic eye because even correctly merged
result of two great-looking changes may still look like shit.
It's so much easier to just lock the files, really!
Support large repositories!
Posted Apr 5, 2010 3:52 UTC (Mon) by martinfick (subscriber, #4455)
"Moreover, merging between two different data sets is not solely a
technical problem, it requires artistic eye because even correctly merged
result of two great-looking changes may still look like shit."
This can very well be true for code also... doesn't mean that a merge tool
to help the process isn't/wouldn't be useful. But, naturally the right way
to support this would be to have your VCS support multiple merge tools via a
plugin mechanism.
Support large repositories!
Posted Apr 5, 2010 9:28 UTC (Mon) by dlang (✭ supporter ✭, #313)
which git does support
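Presumably the mechanism meant here is git's custom merge driver: a `merge.<name>.driver` command in the config, selected per path through a `merge=<name>` attribute in `.gitattributes`. git passes the driver the ancestor, current and other versions (`%O`, `%A`, `%B`), expects the result to be written over the `%A` file, and treats a non-zero exit status as a conflict. A minimal sketch follows, assuming a made-up "asset manifest" format (one asset name per line, order irrelevant); the format, the driver name and the script name are all invented for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a custom git merge driver for a hypothetical manifest format.

Assumed setup (names are made up for illustration):
    .gitattributes:   *.manifest merge=manifest
    git config merge.manifest.driver "python3 manifest_merge.py %O %A %B"
"""
import sys
from pathlib import Path
from typing import List, Set


def read_entries(path: str) -> Set[str]:
    """Read one entry per line, ignoring blank lines."""
    lines = Path(path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def merge_entries(base: Set[str], ours: Set[str], theirs: Set[str]) -> Set[str]:
    # Keep what both sides still have, plus whatever either side added.
    # Entries removed by either side are dropped.
    return (ours & theirs) | (ours - base) | (theirs - base)


def main(argv: List[str]) -> int:
    base_path, ours_path, theirs_path = argv[1:4]
    merged = merge_entries(
        read_entries(base_path), read_entries(ours_path), read_entries(theirs_path)
    )
    # git expects the merge result to be written over the "ours" file (%A).
    Path(ours_path).write_text("".join(name + "\n" for name in sorted(merged)))
    return 0  # zero = clean merge; a non-zero status would report a conflict


if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv))
```

A set-like format was picked because it can never conflict, which keeps the sketch short. A driver for a richer format would return non-zero whenever it cannot decide on its own.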
Support large repositories!
Posted Apr 6, 2010 20:43 UTC (Tue) by vonbrand (subscriber, #4458)
True, as far as it goes. But note that the diff + patch mechanism used to merge isn't infallible either: consider a repo containing a function named foo and several uses of it. Now on one branch rename foo to bar, and on another introduce further uses of foo. When you merge, even if the merge is successful (i.e., no changed chunks intersect), the result is still inconsistent.
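The foo/bar scenario is easy to reproduce in a toy model. The sketch below is an assumption-laden simplification: the "repo" is a dict of file contents, and the merge works at whole-file granularity as a stand-in for chunk granularity. The file names and the crude regex check are invented for illustration.

```python
import re
from typing import Dict

Repo = Dict[str, str]  # file name -> file contents


def merge_repos(base: Repo, ours: Repo, theirs: Repo) -> Repo:
    """File-level three-way merge; raises if both sides changed the same file."""
    merged: Repo = {}
    for name in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o != b and t != b and o != t:
            raise ValueError(f"conflict in {name}")
        result = t if o == b else o  # take whichever side changed the file
        if result is not None:
            merged[name] = result
    return merged


base = {
    "lib.py": "def foo():\n    return 42\n",
    "main.py": "from lib import foo\nprint(foo())\n",
}
# Branch 1 renames foo to bar everywhere it is used.
rename = {name: text.replace("foo", "bar") for name, text in base.items()}
# Branch 2 adds a new caller of foo in a file branch 1 never saw.
new_use = {**base, "report.py": "from lib import foo\nprint(foo() + 1)\n"}

merged = merge_repos(base, rename, new_use)  # no conflict is raised

# Crude consistency check: which names are imported from lib but not defined there?
defined = set(re.findall(r"^def (\w+)", merged["lib.py"], flags=re.M))
imported = {
    match
    for text in merged.values()
    for match in re.findall(r"^from lib import (\w+)", text, flags=re.M)
}
print("clean merge, dangling names:", sorted(imported - defined))
```

The merge goes through without complaint, yet report.py still imports a function that no longer exists. Only something that understands the code's structure can catch that.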
What is really happening is that we use text as a representation for source code, which has a rather richer structure than just "lines" (but not so rich that it makes the above completely useless). We saw that with a student who worked (essentially) on VCS for XML files representing projects. The simple line-based diff/merge turned out not to be enough; a somewhat smarter set of operations was needed.
That takes us again to the source of the (otherwise unreasonable) success of Unix: Use text files, not random binary formats unless strictly required. Binary formats are much harder to handle, and each of them will require its own set of operations. To add to the fun, many binary formats include their own (rudimentary) VCS...
Support large repositories!
Posted Apr 4, 2010 20:35 UTC (Sun) by marcH (subscriber, #57642)
> You can 'diff' and 'patch' binaries
I was not thinking of "diff-the-concept" but of "diff-the-tool".
You can design a tool that will pretend to handle both text and binaries the same way, but it will only pretend to. Inside the box you will actually find two different tools.
Support large repositories!
Posted Apr 4, 2010 21:31 UTC (Sun) by nix (subscriber, #2304)
No you won't. hg's delta algorithm is binary, as is svn's; git can
transform its series of commit snapshots in all sorts of ways without
changing visible behaviour, and can detect commonality between entirely
unrelated files merged from completely different source repositories. The
diff you see when you do 'git diff' is completely unrelated to the
delta-compression algorithm. (IMHO, this is one of git's biggest
architectural strengths: it can change its on-disk representation almost
beyond recognition without changing anything the user sees or breaking
compatibility in any way.)
Support large repositories!
Posted Apr 5, 2010 11:35 UTC (Mon) by marcH (subscriber, #57642)
> hg's delta algorithm is binary, as is svn's; git can transform its series of commit snapshots in all sorts of ways without changing visible behaviour,
Many binary formats are compressed by default. This usually prevents computing deltas. Are these tools clever enough to transparently uncompress revisions before comparing?
Support large repositories!
Posted Apr 5, 2010 16:24 UTC (Mon) by nix (subscriber, #2304)
Of course the source formats being compressed doesn't prevent computing
deltas, but it does mean that the deltas might be larger than they would
otherwise be. (If they end up too large, you'll just end up storing a
sequence of snapshots.)
Support large repositories!
Posted Apr 5, 2010 21:47 UTC (Mon) by marcH (subscriber, #57642)
> Of course the source formats being compressed doesn't prevent computing deltas, but it does mean that the deltas might be larger than they would otherwise be.
Every time I tried this, the delta was almost as big as the file itself. Would you have counter-examples?
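The effect is easy to see in a small experiment. This is only a sketch: it uses zlib as a stand-in for whatever compression a binary format applies, made-up sample data, and a deliberately crude "changed span" measure (everything between the common prefix and the common suffix) rather than the delta algorithm of any actual VCS.

```python
import random
import zlib


def changed_span(a: bytes, b: bytes) -> int:
    """Length of the region between the common prefix and the common suffix.

    A crude stand-in for delta size, not what any VCS actually computes.
    """
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return max(len(a), len(b)) - prefix - suffix


rng = random.Random(0)  # fixed seed so the run is repeatable
words = ["stone", "logo", "level", "bone", "mesh", "texture", "anim", "shader"]
old = " ".join(rng.choice(words) for _ in range(5000)).encode()
new = b"X" + old[1:]  # change a single byte at the very start

old_packed, new_packed = zlib.compress(old), zlib.compress(new)
raw_span = changed_span(old, new)
packed_span = changed_span(old_packed, new_packed)

print(f"raw:        changed span {raw_span} of {len(old)} bytes")
print(f"compressed: changed span {packed_span} of {len(new_packed)} bytes")
```

A one-byte edit stays a one-byte change in the raw data, while the compressed streams diverge over a much larger share of their length. That is consistent with deltas between compressed revisions ending up close to the size of the file.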
Support large repositories!
Posted Apr 5, 2010 23:14 UTC (Mon) by nix (subscriber, #2304)
No. You'll end up with a lot of snapshots rather than deltas, currently.
Support large repositories!
Posted Apr 5, 2010 17:25 UTC (Mon) by dlang (✭ supporter ✭, #313)
the ability is there in git to have the delta algorithm uncompress revisions before comparing them.
This has been discussed several times (especially in the context of handling things like .odf files that are compressed XML). What needs to be done to handle formats like this is well understood. Git even has the mechanism to flag files as being of a specific type and call arbitrary tools (external scripts/programs) to handle different file types.
Unfortunately, nobody has good, simple examples of this that I am aware of. It's possible, but will take some git-fu to get set up.
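As a partial illustration of the "flag files and call external tools" mechanism, here is a sketch of a `textconv` helper. Note the limits: `diff.<name>.textconv` only changes what `git diff` displays for the flagged files; it does not change how revisions are stored or delta-compressed. The sketch assumes OpenDocument text files (`*.odt`), which are zip archives with the document body in `content.xml`; the driver name `odf` and the script name are made up.

```python
#!/usr/bin/env python3
"""Sketch of a git textconv helper for OpenDocument files.

Assumed setup (driver and script names are made up for illustration):
    .gitattributes:   *.odt diff=odf
    git config diff.odf.textconv "python3 odf2text.py"
git runs the command with one argument, the path of the file to convert,
and diffs whatever text the command prints on stdout.
"""
import sys
import xml.dom.minidom
import zipfile


def odf_to_text(path: str) -> str:
    # An OpenDocument file is a zip archive; the body lives in content.xml.
    with zipfile.ZipFile(path) as archive:
        raw = archive.read("content.xml")
    # Pretty-print so a line-based diff gets many short lines instead of one huge one.
    return xml.dom.minidom.parseString(raw).toprettyxml(indent="  ")


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.write(odf_to_text(sys.argv[1]))
```

Getting the uncompressed form into storage itself would be a different mechanism (clean/smudge filters via the `filter` attribute), which this sketch does not attempt.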
Support large repositories!
Posted Apr 6, 2010 17:00 UTC (Tue) by Spudd86 (guest, #51683)
What would be nice is if someone wrote the code to handle the common files of this type and just included it as part of git (or at least posted it somewhere).