Busy busy busybox

Posted Oct 2, 2006 18:44 UTC (Mon) by tjc (subscriber, #137)
In reply to: Busy busy busybox by sepreece
Parent article: Busy busy busybox

Beyond the licensing issue, I'm not clear on how one could accurately identify and separate code that's been contributed and "assimilated" into the codebase. Most code eventually gets refactored, and after this has been done, what then? There are two possible scenarios:

1) After code has been refactored, it's no longer part of the original contribution, and the contributor's copyright claims are no longer valid.

2) The contributor's copyright remains valid after the code has been refactored, and the codebase is forever tainted, since the original contribution can no longer be accurately identified and extracted.

Most license infractions that are settled in court involve code that's been blatantly copied over from one program to another and remains largely unchanged. It's unsettling to think of what might happen if a court takes up the job of deciding what to do about code that's been refactored.



Busy busy busybox

Posted Oct 2, 2006 19:51 UTC (Mon) by landley (guest, #6789)

3) You use a legal-grade tool like comparator that strips out whitespace
and brackets when checking for similarity, and is not confused by code
reordering.

4) You manually inspect every line of the code to discern refactoring
from actual rewriting, with an eye towards "scènes à faire" and the
smallest copyrightable expression not to fall under fair use within your
jurisdiction (yeah, copyright is federal, but which circuit court?).
Note that book titles are explicitly not copyrightable, and the smallest
legally recognized copyrightable expression in poetry seems to be haiku,
something which Apple has taken advantage of in its protocols:
http://sree.kotay.com/2006/02/legal-security-ocp-and-appl...

Comparator also defaults to "3 lines" as its smallest actionable hit, and
there's a reason for this.
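
To make the idea concrete, here is a minimal sketch of that kind of
comparison. It is not comparator's actual implementation: the function
names, the choice to strip only whitespace and curly braces, and the
sample snippets are all assumptions made for illustration. It reports
every run of 3 consecutive significant lines that appears in both texts,
wherever each run happens to sit in its file.

```python
import re

def normalize(line):
    # Drop all whitespace and curly braces so reformatting does not hide a match.
    return re.sub(r"[\s{}]", "", line)

def shingles(text, size=3):
    # Map each run of `size` consecutive significant lines to the
    # (1-based) line numbers where such a run starts.
    lines = [(n, normalize(l)) for n, l in enumerate(text.splitlines(), 1)]
    lines = [(n, l) for n, l in lines if l]  # skip lines empty after normalizing
    found = {}
    for i in range(len(lines) - size + 1):
        key = tuple(l for _, l in lines[i:i + size])
        found.setdefault(key, []).append(lines[i][0])
    return found

def common_chunks(a, b, size=3):
    # Pairs of (start line in a, start line in b) for every shared run.
    # Runs are looked up by content, so moving code around does not matter.
    sa, sb = shingles(a, size), shingles(b, size)
    return sorted((sa[k][0], sb[k][0]) for k in sa.keys() & sb.keys())

# Hypothetical sample inputs: the same function, reformatted and moved.
ORIGINAL = """\
int add(int a, int b)
{
    int sum = a + b;
    log(sum);
    return sum;
}
"""

SUSPECT = """\
void other(void) { }

int add(int a, int b) {
  int sum = a+b;
  log( sum );
  return sum;
}
"""

print(common_chunks(ORIGINAL, SUSPECT))  # prints [(1, 3), (3, 4)]
```

With a window of 3 lines, a single shared line such as "return sum;"
never counts as a hit on its own, which is the point of having a
minimum size. A hit from a sketch like this is only a candidate for the
by-hand inspection in step 4.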

Anyway, by-hand comparison is much easier when you have a single known
source of contamination that you can compare against, and if it isn't in
that source it isn't what you're looking for. So you don't have to
actually track multiple upstream sources, just eliminate _one_. Plus, a
lot of familiarity with the codebase helps (this file was rewritten from
scratch in this svn checkin, so no earlier claims can apply).

Hence four-stage forensic analysis, which may have been overkill but
should definitely have been _sufficient_. Unfortunately, Bruce was
trying to make moral arguments, not legal ones.

Rob

Busy busy busybox

Posted Oct 2, 2006 20:20 UTC (Mon) by tjc (subscriber, #137)

I read your rationale page on forensic analysis, and the comparator man page, and I think I understand your methodology. But I'm still not sure about the larger legal implications of refactoring code.

My working premise is this: if code is refactored (into "chunks" of whatever size is required by law), then what started out as "expression of concept" has been reduced to the concept itself, and as such is no longer bound by copyright law, but is instead the subject of patent law.

So assuming that I have this right, then in this particular case -- if Bruce's code was correctly identified and sufficiently refactored -- then his copyright claims would no longer be valid. This still leaves open the possibility of patent claims for the underlying concepts, but that does not seem to be an issue in this case.

Yes? No? Maybe...?

Busy busy busybox

Posted Oct 4, 2006 19:45 UTC (Wed) by tjc (subscriber, #137)

4) You manually inspect every line of the code to discern refactoring from actual rewriting, [snip]
What is the difference between refactoring and rewriting?

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds