
Reasoning behind the "preferred form" language in the GPL

Posted Mar 12, 2011 4:52 UTC (Sat) by tomcatsdb (guest, #73351)
In reply to: Reasoning behind the "preferred form" language in the GPL by PaXTeam
Parent article: Commitment to Open (Red Hat News)

"this is where you are wrong. you consider *only one* kind of modification (one where you add your own original work to create a derived work, although in reality even in this case one prefers to see the history leading up to the work you're modifying, but let's not digress), "

The act of adding a new file, removing a file, changing a line of code, regardless of the purpose, is a modification. Viewing extraneous information, historical or otherwise, is not modifying the source, hence it is out of scope of the GPL. The history is data about the work you're modifying, not the work itself. Note that when people distribute a compiled GPL application + source in tarball format on the same media, this has historically satisfied the conditions of the GPL. Your assertions would make every one of these distributions against the GPL. In addition, distributing the changesets or diffs from a previous version is not sufficient; the complete source to the binaries as shipped is. From the GPL v2 FAQ (see: http://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.html):

---

Q: I want to distribute binaries, but distributing complete source is inconvenient. Is it ok if I give users the diffs from the “standard” version along with the binaries?

A: This is a well-meaning request, but this method of providing the source doesn't really do the job.

A user that wants the source a year from now may be unable to get the proper version from another site at that time. The standard distribution site may have a newer version, but the same diffs probably won't work with that version.

So you need to provide complete sources, not just diffs, with the binaries.

----

Other relevant quotes:

----

... No, you must supply the source code that corresponds to the binary. Corresponding source means the source from which users can rebuild the same binary.

... No matter how you distribute the source, the sources you provide must correspond exactly to the binaries. In particular, you must make sure they are for the same version of the program—not an older version and not a newer version.

---

Note that in all cases, historical data is not required, just the source to the binary as shipped. If you still wish to debate this point, pick it up with the FSF. They are the authoritative voice on what the GPL requires and their FAQ spells it out in exactly the terms I originally used.

Now then, this does lead back to the question of your test case of obfuscated code. Others have discussed this in various places (google "gpl obfuscated code"), and the answer most agreed on is what I stated: The purpose of the GPL is to allow recipients to distribute the application (plus source) in original and/or modified form. Going through a GPL work and obfuscating the source to make it useless to a recipient goes against the purpose of the license. The reason why I stated that it does matter how you code is simple: badly written code / poorly documented code is not against the GPL. So if you did work in that fashion, i.e. you aren't intentionally attempting to make the code unreadable to the recipient, you'd be in the clear. Automating an obfuscation step prior to the release is translating it from the preferred form for modifications to a bunch of files useless for the recipient aside from running make on. Thus your example is not GPL compliant.

The rest of your paragraph details the reasons why someone would want to modify a file, but this is irrelevant as far as GPL compliance goes. Again, you are attempting to combine the source code with the history of that code. These are two different things as supported by the wording of the GPL as well as the FSF's GPL FAQ. If you can find an authoritative source to support your position, feel free to post it here.

"great, so you agree that whatever the author of the work uses is prime candidate for 'preferred form for modification'. then you must agree that what RH distributes in the RHEL6 srpm is *not* the preferred form since they're not sending such monolithic tarballs around when their own developers communicate with each other during development"

... Really? Here you equate the internal method for a developer obtaining the source to the source itself. You keep doing these kinds of things and it's still as wrong as it was the first time. Your example hypothetical situation deals with modifications to the source files themselves. It makes no mention of how those files are passed around internally. That is the difference between the real world situation of what RH is doing and your example. Internal communications have absolutely _NOTHING_ to do with GPL compliance.

"and now you're contradicting yourself. you have to make up your mind and decide *whose* preferred form is the one meant by the GPL."

No, I haven't. My position has been clear. It's supported by the FSF's FAQ. The only thing that I could not find with a few seconds of googling was the FSF providing an official position on obfuscated code. There is a difference between poor programming and willful, or in your case, automated obfuscation which I think would be relevant to a GPL violation lawsuit... Regardless, since RH is _NOT_ obfuscating the "source code" (as defined by the FSF), interesting as this side discussion is, it's irrelevant to the situation at hand. The rest of your paragraph builds on the false assertion that I've contradicted myself, fails to take into account the qualifications I've clearly made, and attempts to equate source code obfuscation to not shipping historical data.

"they're obfuscating what i and many other people consider 'preferred form for modification'. you're still trying to cling to some 'common sense' definition of 'source code' whereas for good reasons the GPL gives a definition of it for its use in the license, you cannot arbitrarily reinterpret it."

I'm reinterpreting the GPL? Your idea of what "source code" constitutes goes far beyond anything the GPL states. But don't take my word for it, go back and read the FSF GPLv2 FAQ. Diffs are not sufficient for GPL compliance, but complete source for the binary is. "source code" is defined _explicitly_ as "not an older version and not a newer version" but the exact version used to make the executable. Historical development data is _NOT_ relevant to GPL compliance. Period. If it were, then CDs with just the binary and tarball of the source would be against the GPL. Those forms of distribution have been cleared by the FSF numerous times.

Yes people may be unhappy, as they no longer have a convenience provided by RH. Historically RH went well above and beyond what was required by the GPL. They're cutting that back. In either case, what matters here are the sources / scripts etc used to build the binary, which is what is defined by the GPL as the corresponding source code. Regardless of how the sources were packaged, the recipient got a source tree that they can make, and no, the files are not obfuscated. In the end, both RHEL5 and RHEL6 srpms provided a kernel source tree for the binary kernel RH shipped. If the RHEL 5 srpms had extra info, so be it. You keep trying to stretch the GPL to cover areas beyond the actual source code. That is the failure in your position. The RHEL5 srpms went beyond what the GPL requires, the RHEL6 srpms trimmed it back. Both are in compliance with the GPL.

"i never once mentioned version control."

But you keep saying RH needs to provide data that can only come out of a version control system.

"easy. please extract all the fixes for all the CVEs they have in RHEL6."

From the GPL v2 FAQ, previous or newer versions are not relevant to GPL compliance. Aside from that, you're asking for a specific modification that relates to historical data, on a binary that, when built by you, was never distributed by RH. Hence they have absolutely zero obligation to support the request you're asking for. As an additional thing, you didn't address my point: Prove I can't open a C file from the kernel tree, change it, and build the resulting modified work. Prove I can't add a new module. Is it machine readable code that can be compiled? Why yes it is. Is it the complete source to the version as shipped by RH? Well yes, it is. Do these conditions satisfy the GPL, according to the FSF FAQ? Yes they do.

"you're mixing up the rights of the authors of the original work (Linus has the copyright on the collective work and individuals have copyrights on their respective contributions) vs. those of the derived work (RH's modifications to some state of the Linus tree). authors of the original work can pretty much do whatever they want, authors of the derived work are bound by the GPL."

Now this is an easy one. There is no magical divide here. Guess what: RH is in the same boat as any other contributor or distributor. Anyone who distributes a binary build of the kernel is bound by the GPL. This is the key component. RH is not under any obligation to provide sources for a binary they did not distribute.

RH plays the role of both contributor (the employees that RH pays to do the kernel work are making the contributions on behalf of RH) and distributor. The obligations for a distributor are to ship the source for the binary they distributed. RH does this. Now then, as a developer, RH internally builds, tests, and works on numerous bugs, but unless they distribute those binaries to an external entity, they are not bound by the GPL for those versions. They can expose that information or not, it's their choice, but not a GPL requirement.

"you make no sense here. what is the 'version control itself'? i never said anything related in fact,"

The version control system would be git, or bitkeeper, or whatever. The historical development data for a code base comes from a version control system. You keep stretching the definition of source code to include historical development data based on a completely twisted reading of the "preferred form" phrase in the GPL which contradicts the GPLv2 FAQ, as well as contradicting the simple fact that the GPL is triggered for a specific binary released. The development history behind the code is irrelevant to the terms of the GPL as explicitly stated by the FSF. Sure it's nice to have, but not required.

"that specific point means a set of files/directories obviously. but *that* set of files/directories is *not* distributed in the RHEL6 srpm, only its mangled version, contrary to your own expectation."

I never said RH had to distribute vanilla kernel sources, now did I? The only thing I stated regarding historical data was that I think RH does provide a bulk "here's all the changes" patch, but I could be wrong on that point. It's a moot point though.

"so if i can understand the cfront output then what? am i in the clear? since when does skill level enter the GPL? right, it does not."

No, because that wasn't where you, being the programmer, made your changes. But you're illustrating my point. Obviously delivering the C code version of the C++ wouldn't satisfy the GPL. It's the same reasoning I'm using to justify why your (completely irrelevant to the situation RH is in) hypothetical code obfuscation example fails GPL compliance.

"yeah, clear as mud. 'source code' is not defined that way in the GPL. it's defined as 'preferred form for modification'. "

Omission of the next sentence in the GPL is your downfall: "For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable." By the way, the kernel is an executable, hence this clarification applies. Oddly enough, it's exactly what I've been saying. You can't selectively quote a single sentence out of the GPL and twist it all to heck to fit your personal agenda or worldview. You accuse me of reinterpreting the GPL when your own ideas about what a single sentence means go against the FSF GPL FAQ and the very next sentence in the GPL itself.

"please stick to what the license says, not what you wish it said."

Please read more than one sentence of the license.

Your next paragraph is more of the same flawed logic. Read the license again as it applies to executables. Oh, that's right, it fits exactly what I said. You keep trying to wedge in a division between what the recipient of the code and the author expect based on your non-applicable hypothetical situation. No such division exists. People are crying foul about the RH policy change and trying to argue every possible negative against what RH is doing. I'm fine with that so long as those arguments are based on something real, not an imaginary GPL violation.

"you're wrong, the paragraph about the 'source code' and its 'preferred form for modification' has nothing whatsoever to do with the medium for distribution. what does talk about this are 3.a and 3.b in that they mandate a machine-readable form, but that's not under discussion here as RH is in the clear, they do provide a machine-readable form. "

So you're arguing a PNG screen cap is fine then? It's machine readable, but not the preferred form for modifications. People have tried to pull stunts like that in the past, hence the language in the GPL to prevent such nonsense. Aside from that, source code as defined for an executable is stated directly after the preferred form language, which, again, supports what I've been saying.

"i don't know where this strawman came from, i wasn't talking about version control at all."

Because the data you are asking RH to provide comes out of their revision control system. The source code and its history are two different things. Your entire argument against RH boils down to a lack of that history.

The next few paragraphs equate internal distribution methods to making modifications on the source. But one thing you stated:

"or derived works and distribution it's irrelevant what an individual does in his privacy. "

I agree with this 100%. Guess what: RH developers are part of the same corporate entity. They can pass whatever they want around however they want, and it matters absolutely zero to RH's obligations under the GPL. See the GPL and the GPL FAQ for support of this.

"RH distributes a derived work, kernel.org does not (well, speaking of the Linus tree). not to mention that kernel.org distributes not only a tarball but a git tree with full history of the kernel. what did you try to say here again?"

This is wrong. Linus, kernel.org, or anyone really, is technically distributing a derived work from someone else at this stage, since the copyrights aren't assigned to a central entity. There is no central owner of the Linux source tree's copyrights.

"they're obfuscating their changes in the derived work."

No, they aren't. The C is as readable as the next bit of code in the kernel. You equate no internal revision history to source obfuscation which is yet another failure in your position.

I had read somewhere that they were doing monolithic patches, rather than individual patch sets. If it's the full source to the binary w/o patches in there, then according to the authoritative voice (FSF) RH is still fine. As I said, it was the only thing that may matter, and based on the clarifications provided by the FSF in their FAQ, seems to be a non-issue.

With this post, I'm bowing out of this topic. I think we're both fairly set in our positions on this one. While I don't agree with your logic, the debate did raise an important point that AFAIK isn't directly addressed in FLOSS licensing: obfuscated code in an Open Source project.



Reasoning behind the "preferred form" language in the GPL

Posted Mar 13, 2011 11:40 UTC (Sun) by PaXTeam (subscriber, #24616)

i think the crux of the matter is that when the GPL grants one the right to modify the distributed work, it doesn't say anything about what those modifications may be (as in, what kind of modifications the distributed 'source code' must allow). and clearly, there's a disconnect between what the RHEL6 srpm allows vs. what many people would like to be able to do as 'modification' (and what earlier RHEL kernel srpms allowed). so as you can read it in the followup post from Bradley Kuhn, the language of the GPLv2 is not what it should be, but that's too late now of course. that also means that my one-liner files would satisfy his 'tar x; make -C' criterion and therefore be compliant (and of course many other forms would also be compliant). sad or not, that's what we have to live with.

one last comment, as you seem to keep misreading the GPL: 'source code' != 'complete source code'. the sentence using the second term does *not* define 'source code', it defines 'complete source code' using the definition of 'source code' given in the previous sentence and i didn't mention it because it's irrelevant for the discussion: before we can talk about 'complete source code' we have to know what 'source code' is. apparently anything that compiles and a human can read in a normal text editor is good enough for 'source code'.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds