Smarter Diff

Posted May 29, 2008 14:52 UTC (Thu) by fmyhr (subscriber, #14803)
In reply to: Getting the right kind of contributions by roc
Parent article: Getting the right kind of contributions

It seems to me that what is needed is a smarter diff tool - one that parses the code the same
way the preprocessor and compiler do, and doesn't bother us poor humans with the noise
generated by simple whitespace changes.


Smarter Diff

Posted May 29, 2008 14:59 UTC (Thu) by nix (subscriber, #2304) [Link]

Yeah, but how do you apply patches in the presence of intervening 
whitespace changes? A `smarter patch' in that sense would need to have a 
language-specific understanding of indentation, at the *very* least...

Smarter Diff

Posted May 29, 2008 15:57 UTC (Thu) by tjc (guest, #137) [Link]

I don't think this would be a problem in C, since indentation doesn't have semantic meaning.

Smarter Diff

Posted May 29, 2008 16:49 UTC (Thu) by nix (subscriber, #2304) [Link]

Try getting in after someone has de-K&Red your project and applying a patch 
generated earlier, and oopsy. Any line-based patch system would be 
confused if { and } had moved from lines with code on them to lines 
without, or vice versa.
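
To make the brace-movement problem concrete, here is a minimal sketch using Python's standard difflib. The function f and the helper changed_lines are made up for illustration; nothing here comes from a real project. The two snippets differ only in where the opening braces sit, yet a line-based diff still reports removed and added lines, which is exactly the context an earlier patch would fail to match.

```python
import difflib

# Hypothetical snippet in K&R brace style (opening brace on the same line).
KNR = """\
int f(int x) {
    if (x > 0) {
        return x;
    }
    return 0;
}
"""

# The same code after a "de-K&R" reformat (opening braces on their own lines).
REFORMATTED = """\
int f(int x)
{
    if (x > 0)
    {
        return x;
    }
    return 0;
}
"""

def changed_lines(old: str, new: str) -> list[str]:
    """Return the '+'/'-' lines that a line-based unified diff reports."""
    diff = list(difflib.unified_diff(
        old.splitlines(), new.splitlines(), lineterm="", n=0))
    # The first two entries are the '---' / '+++' file headers; skip them.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]

changes = changed_lines(KNR, REFORMATTED)
for line in changes:
    print(line)
```

Every line that gained or lost a brace shows up as changed, even though no token of the program is different.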


Smarter Diff

Posted May 29, 2008 16:56 UTC (Thu) by fmyhr (subscriber, #14803) [Link]

Maybe keep stupid patch (line-by-line) but have smart diff that understands the language and
parses by tokens not lines? Patchfiles would still be noisy, but humans would ignore that and
look at output of smart diff to see what had really changed.
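
A rough sketch of what such a token-based diff might look like, in Python. The regex lexer below is a deliberate simplification and an assumption of this example, not a real C lexer: it does not strip comments, knows nothing about preprocessor directives, and splits multi-character operators such as >= into single characters. The names tokenize and token_changes are invented for the illustration.

```python
import difflib
import re

# Simplified C-like lexer (an assumption, not a full C lexer): identifiers,
# integer runs, double-quoted string literals, then any other single
# non-whitespace character. All whitespace between tokens is discarded.
TOKEN_RE = re.compile(r'[A-Za-z_]\w*|\d+|"(?:\\.|[^"\\])*"|\S')

def tokenize(source: str) -> list[str]:
    """Split source text into a flat list of tokens, ignoring whitespace."""
    return TOKEN_RE.findall(source)

def token_changes(old: str, new: str) -> list[tuple[str, list[str], list[str]]]:
    """Diff two sources by token; return only the non-equal opcodes."""
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

# Hypothetical inputs: the brace moves to its own line AND 0 becomes 1.
OLD = "if (x > 0) {\n    return x;\n}\n"
NEW = "if (x > 1)\n{\n    return x;\n}\n"

result = token_changes(OLD, NEW)
print(result)  # [('replace', ['0'], ['1'])]
```

The brace move produces no output at all; only the changed constant is reported, which is the part a human reviewer cares about.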

Smarter Diff

Posted May 29, 2008 16:59 UTC (Thu) by fmyhr (subscriber, #14803) [Link]

But that doesn't fix your pre- and post- de-K&R problem, does it? Patch would have to be smart
(token, not line-based) too.

Smarter Diff

Posted May 29, 2008 19:41 UTC (Thu) by tjc (guest, #137) [Link]

Yeah, a line-based patch system wouldn't work in that case.  But I think lexically scanning
the input would be sufficient -- you wouldn't have to parse it.  If two files generate the
same sequence of tokens, then I think they could be considered equivalent in a "free form"
language.
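
The equivalence test described here can be sketched in a few lines of Python. This assumes a toy regex scanner (the pattern and the names tokenize and same_tokens are invented for this example), so it is an illustration of the idea rather than a usable C lexer: comments are not stripped, and preprocessor directives, which are line-sensitive even in C, are not handled.

```python
import re

# Toy scanner (an assumption, not a real C lexer): identifiers, integer
# runs, double-quoted string literals, then any single non-space character.
TOKEN_RE = re.compile(r'[A-Za-z_]\w*|\d+|"(?:\\.|[^"\\])*"|\S')

def tokenize(source: str) -> list[str]:
    """Return the token sequence of source, with whitespace thrown away."""
    return TOKEN_RE.findall(source)

def same_tokens(a: str, b: str) -> bool:
    """Treat two sources as equivalent if their token sequences match."""
    return tokenize(a) == tokenize(b)

# Hypothetical inputs: same function in two layouts, plus a real edit.
KNR = "int f(int x) {\n    return x + 1;\n}\n"
REFORMATTED = "int f(int x)\n{\n\treturn x+1;\n}\n"
EDITED = "int f(int x)\n{\n\treturn x+2;\n}\n"

print(same_tokens(KNR, REFORMATTED))  # True
print(same_tokens(KNR, EDITED))       # False
# Whitespace inside a string literal is part of the token, so it still counts.
print(same_tokens('puts("a b");', 'puts("a  b");'))  # False
```

Note that whitespace inside string literals has to stay significant, which is why the scanner keeps each literal as a single token instead of splitting on spaces.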


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds