Preferred form for making modifications

Posted Nov 16, 2018 23:40 UTC (Fri) by JoeBuck (subscriber, #2330)
In reply to: Preferred form for making modifications by matthias
Parent article: Bringing the Android kernel back to the mainline

The reason for the language in the GPL is to prevent people from claiming that shipping obfuscated source code satisfies the license. Lack of access to the git (or subversion/CVS/Perforce/whatever) repository doesn't handicap a developer who is trying to produce a modified version of the program, or understand what it does, but obfuscated source is a huge barrier.



Preferred form for making modifications

Posted Nov 17, 2018 8:51 UTC (Sat) by mjthayer (guest, #39183) [Link] (14 responses)

Then it is probably just my lack of skill or experience (or time) in trying to disentangle what is original kernel (that part at least is relatively easy), what is back-ported from newer kernels and what are vendor additions. Not that it is even hard to identify individual back-ported changes; the sheer volume just makes it (or did for me) impractical. Hence the idea that a git repository would be needed for serious work.

Preferred form for making modifications

Posted Nov 17, 2018 11:40 UTC (Sat) by matthias (subscriber, #94967) [Link] (13 responses)

No, it is not. The problem is just that the companies are not required to publish the history alongside the source code. Even if we argued that the preferred way of publishing source code is git, the companies could just create an empty git repository, unpack the tarball, commit, and publish. Nothing would be gained.

The GPL predates most of today's version control systems. The use case "port 2-3 years of kernel development to an ancient kernel with many out-of-tree changes" was certainly not on the radar back then. The use cases were things like fixing a bug or programming a new feature (not merging a feature from a different branch).

Having the history available would be very nice. But this can only be accomplished by convincing the companies that working together with the community provides some benefits. And without the companies being willing to work together, I think even a git repository with history is not really helpful. How should we integrate code into the mainline that nobody in the community knows and that is not supported? If we just want to backport some security fixes, the tarball is not much worse than a full repository. If we want more, we need someone who knows the code to help with integration.

Preferred form for making modifications

Posted Nov 17, 2018 17:15 UTC (Sat) by pallas (guest, #128204) [Link] (3 responses)

FWIW, I once had a vendor tell me they used git internally; I asked for git access instead of tgz dumps of point releases, and they did exactly that: access to a repo with a single commit of the unpacked tgz.

Preferred form for making modifications

Posted Nov 17, 2018 23:49 UTC (Sat) by JoeBuck (subscriber, #2330) [Link] (2 responses)

The GPL (2 or 3) does not impose an obligation to give people all of the stupid ideas and mistakes that occurred during internal development, or the references to company confidential information that might have appeared in comments but were scrubbed before release, all of which would be exposed by providing the complete git repository. The obligation is to provide source that corresponds to the binary that it shipped, and not more than that.

Preferred form for making modifications

Posted Nov 19, 2018 4:29 UTC (Mon) by jeffm (subscriber, #29341) [Link]

That's a lot of words to obfuscate the value of a release branch.

Preferred form for making modifications

Posted Nov 19, 2018 20:47 UTC (Mon) by tao (subscriber, #17563) [Link]

You are also obliged to make note of any modifications you have made (no requirements for changelogs though) and when you made them.

"You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change."

The best way to do this *is* typically to provide a changelog entry, but if you just distribute patches that should be enough; after all patches by their very nature show what files are modified, and assuming that the patch was created when the modification was made, it'll also have the time of change in the patch header.
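
The point about patch headers can be checked with a small sketch. It assumes a `diff` that emits standard unified-format headers (GNU diff does); the helper name `patch_dates` is made up for illustration. Note that the timestamp in the header is the modification time of the file on disk when the diff was generated, so it is only as trustworthy as those mtimes.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print the "+++" header lines of a
# unified diff. Each such line names a modified file followed by the
# modification time that diff read from the file system.
patch_dates() {
    grep '^+++ ' "$1"
}
```

Run against a patch produced with `diff -u` or `diff -urN`, this prints one line per modified file, which is roughly the "which files, and when" information the quoted clause asks for.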

Preferred form for making modifications

Posted Nov 18, 2018 3:13 UTC (Sun) by marcH (subscriber, #57642) [Link] (1 responses)

> Having the history available would be very nice.

There isn't really any one thing such as "the" history; it's a continuum and any line drawn would be in an arbitrary place. Should it be possible to access every version that was ever tested by some internal Trybot / 0day even before getting approved and merged internally? There's a lot of value in test results after all. Going even further, should it be possible to get every version that is in the developer's reflog? If it was committed at some point then it must have some value. Getting inside the thought process of a good developer is surely a useful learning experience; observing the mistakes made helps avoid repeating them.

BTW *open-source* developers rewrite history, that's part of the public review process. Sometimes some of these git histories get lost!
https://public-inbox.org/git/70ccb8fc-30f2-5b23-a832-9e47...
https://github.com/git-series/git-series
So shouldn't these open-source developers be forbidden to "unpublish" and unshare these? It's GPL code after all </devil's advocate>

> But this can only be accomplished by convincing the companies that working together with the community provides some benefits. And without the companies willing to work together, I think, even a git repository with history is not really helpful.

This is the best summary.

Preferred form for making modifications

Posted Nov 18, 2018 11:15 UTC (Sun) by emj (guest, #14307) [Link]

> BTW *open-source* developers rewrite history [...] So shouldn't these open-source developers be forbidden to "unpublish" and unshare these?

Thanks for the links! At work I've started keeping track of every branch and all the rebases and squashes. It helped me immensely once, and I suspect that if there were tooling for keeping the history for both WIP repos and "official" git-series repos in the same repo, I could solve a lot more problems.

Preferred form for making modifications

Posted Nov 19, 2018 22:16 UTC (Mon) by cyphar (subscriber, #110703) [Link] (5 responses)

> The problem is just that the companies are not required to publish the history alongside with the source code.

This is somewhat incorrect (depending on your interpretation of "history" in this context). §2a of GPLv2 states:

> You may modify your copy or copies of the Program [...] provided that you also [...] must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.

Unfortunately it appears most people have forgotten this part of the GPL. There was a big argument several years ago when Red Hat decided to start providing big patch-blobs rather than individual patches, but it seems the community has settled that this is "okay". But just providing a tarball with a modified kernel isn't full compliance with the GPLv2.

Preferred form for making modifications

Posted Nov 20, 2018 2:16 UTC (Tue) by pizza (subscriber, #46) [Link] (4 responses)

You know, a patch file clearly shows what modifications have been made to which files, so whether or not the patches are broken out is irrelevant for purposes of GPL compliance -- it is sufficient to just show what has changed, and a patch file shows that admirably well.

That said, a simple tarball of modified sources is arguably another matter -- while perhaps a technical violation, IMO using that alone as the basis for accusing someone of GPL violations is ludicrous -- but there exists a modern [1] tool called 'diff' which makes it a fairly trivial exercise to determine what has changed versus the original, unmodified sources.

[1] First released all the way back in 1974
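
As a rough sketch of that exercise, assuming you have both the pristine sources and the vendor tarball unpacked side by side (the function name `changed_files` and its two arguments are hypothetical, not from any existing tool):

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative): report which files
# differ between a pristine source tree and a vendor-modified tree.
changed_files() {
    orig=$1
    vendor=$2
    # -r       recurse into subdirectories
    # -q       only report that files differ, not how
    # -N       treat files missing from one tree as empty
    # -x .git  skip version-control metadata if the pristine tree is a checkout
    # diff exits with status 1 when it finds differences, so mask that status.
    LC_ALL=C diff -rqN -x .git "$orig" "$vendor" || true
}
```

Something like `changed_files linux-orig vendor-kernel` (directory names made up) then prints one "Files ... differ" line per touched file. It says nothing about when or why each change was made, which is the gap the rest of this thread is arguing about.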

Preferred form for making modifications

Posted Nov 20, 2018 6:55 UTC (Tue) by mjthayer (guest, #39183) [Link] (3 responses)

Actually, playing a reluctant devil's advocate, "prominent notices stating that you changed the files and the date of any change" makes no statement about the content of the changes. So the lack of dates of modification (VCS would help here of course) is a GPL violation, but preventing it would not help anyone much.

Preferred form for making modifications

Posted Nov 20, 2018 7:04 UTC (Tue) by mjthayer (guest, #39183) [Link] (1 responses)

All that said, does anyone have experience with forward porting a modified blob to a later kernel? When I have time I was planning to try, by 1) converting it into a patch (or working copy change) against the repository it seems to be based on and 2) trying to manually apply everything which was later committed to the repository, mechanically change by change, fixing when there are conflicts and hoping the patch will shrink until it is cleaned of back-ports. (It won't help me that I know Subversion better than git.) If anyone knows better ways, please tell.

Preferred form for making modifications

Posted Nov 20, 2018 11:54 UTC (Tue) by nix (subscriber, #2304) [Link]

diff -urN -x .git original-kernel-repo blob-tree > blob.diff
cd original-kernel-repo; git apply --index ../blob.diff; git commit -m "Blob patch"; git rebase new-kernel-version

should suffice, more or less -- or, alternatively, instead of the rebase you can git checkout the new kernel version and do a git cherry-pick onto it. (In non-ancient versions of git these end up using exactly the same machinery for application, even if you cherry-pick a range.)

Preferred form for making modifications

Posted Nov 21, 2018 17:39 UTC (Wed) by tao (subscriber, #17563) [Link]

Having one single blob for all changes doesn't document the date of change though, it just documents the date of the blob creation. It's highly unlikely that a 100MB bundle of changes were all made in one go... But hey, who knows. :)

Preferred form for making modifications

Posted Nov 27, 2018 7:54 UTC (Tue) by nhippi (subscriber, #34640) [Link]

Even when companies have git repos internally, they might not be the pinnacle of clarity. Think of commit messages like "foo" or "See ticket-12345" or "As per discussed with Jeff and Jess at watercooler". People say "we need to clean this up before we open source it" for a reason, and not just as a delay tactic.

Chromium OS has excellent Git repos, with relatively well enforced commit messages, and a bug tracker and CI testing results available in the open, and still you will easily be overwhelmed in a forest of trees with numerous branches... Any internal repo would need lots of institutional knowledge to understand what is going on.

Preferred form for making modifications

Posted Nov 17, 2018 19:37 UTC (Sat) by Otus (subscriber, #67685) [Link] (4 responses)

> Lack of access to the git (or subversion/CVS/Perforce/whatever) repository doesn't handicap a developer who is trying to produce a modified version of the program, or understand what it does, but obfuscated source is a huge barrier.

I definitely often rely on git blame and git log to understand what a piece of code is trying to do and why. IMO git history and commit messages are comparable to comments. Though I have no idea whether stripping those before distribution is ok.

Preferred form for making modifications

Posted Nov 17, 2018 21:21 UTC (Sat) by matthias (subscriber, #94967) [Link] (3 responses)

E-mail correspondence between developers could also help in understanding the source code. Are you also unsure whether it is OK to strip that before distribution? Git commit messages are clearly metadata. If distributing metadata were required, this would have to be explicitly mentioned in the GPL.

Preferred form for making modifications

Posted Nov 17, 2018 21:39 UTC (Sat) by Otus (subscriber, #67685) [Link] (2 responses)

I meant I'm not sure if stripping comments from source is ok.

Removing git history is clearly at least tolerated.

Preferred form for making modifications

Posted Nov 17, 2018 21:56 UTC (Sat) by matthias (subscriber, #94967) [Link] (1 responses)

Ah ok. I actually also wondered whether it is ok to strip comments. I am also not sure. For git history I am sure that stripping it is perfectly ok. Commit messages are definitely not part of the code itself. For comments this can be argued in both directions. From the point of view of the compiler they are also not part of the code. That is why they are called comments. From the point of view of the programmer this might look different. Of course, if the added code is without comments there is always the problem of proving that they were removed, i.e., that the code was commented before publication.

Preferred form for making modifications

Posted Nov 19, 2018 18:12 UTC (Mon) by jezuch (subscriber, #52988) [Link]

...assuming of course that the comments are worth anything. I've seen too many repositories where the commit comments were of the variety "today's work" or "implement this" or something like that. It just makes you want to hurt the authors ;)


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds