How they track their changes and how they store the result of those changes does not matter at all.
You modify the source, not the patches.
Red Hat and the GPL
Posted Mar 9, 2011 13:24 UTC (Wed) by ewan (subscriber, #5533)
If people actually preferred a monolithic tarball no-one would be objecting to this, and Red Hat wouldn't be talking about how this change is actually intended to make life harder for downstream users of the code.
Given that this is explicitly about obstructing Oracle et al it's clear that Red Hat are under no illusions that this form is actually preferred by anyone - if it were they wouldn't be doing it.
Posted Mar 9, 2011 14:01 UTC (Wed) by Los__D (guest, #15263)
Again, the preferred form for making modifications is the raw source, not patches. The preferred form for tracking modifications might be patches, but this (again) is irrelevant.
Posted Mar 9, 2011 14:16 UTC (Wed) by ewan (subscriber, #5533)
We're talking about the source code for a binary RPM. The preferred form of the source for the RPM is all that matters.
Again, the preferred form for making modifications is the raw source
That's a non-sequitur. Clearly what is required is 'the source'. The entire intent of the 'preferred form for modification' wording is to address the question of what counts as 'the source'. I think everyone agrees that a tarball of machine-generated pre-processor output is not 'the source', even though it can be compiled to a binary.
If you have a build system that starts with a single tarball and works from there then that single tarball may indeed be 'the source'. If you have a build system that starts from a pristine upstream tarball and a pile of patches, then it's reasonable to say that the pristine upstream tarball and the pile of patches together form 'the source'.
It appears Red Hat's build process does still involve a pristine upstream and a pile of patches since they're making the patches available to customers. If that's the case, then the collapsed single tarball is just as much a machine-generated intermediate stage as pre-processor output would be.
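To make the distinction concrete, here is a minimal sketch (not Red Hat's actual tooling) of the difference between itemized patches and a single collapsed diff. The three-line file, the two changes, and the name example.c are all hypothetical; only Python's standard difflib module is assumed.

```python
import difflib

# Hypothetical three-line "upstream" file and two independent vendor changes.
upstream = ["int a = 1;\n", "int b = 2;\n", "int c = 3;\n"]
after_fix_1 = ["int a = 10;\n", "int b = 2;\n", "int c = 3;\n"]   # change 1 touches line 1
after_fix_2 = ["int a = 10;\n", "int b = 2;\n", "int c = 30;\n"]  # change 2 touches line 3


def make_patch(old, new, name):
    """Return a unified diff between two versions as a list of lines."""
    return list(difflib.unified_diff(old, new, fromfile=f"a/{name}", tofile=f"b/{name}"))


def changed_lines(patch):
    """Keep only the added/removed lines, skipping the ---/+++ file headers."""
    return [l for l in patch if l[0] in "+-" and not l.startswith(("+++", "---"))]


# Itemized form: one patch per logical change, applied in order.
itemized = [
    make_patch(upstream, after_fix_1, "example.c"),
    make_patch(after_fix_1, after_fix_2, "example.c"),
]

# Collapsed form: a single diff from upstream straight to the final tree.
collapsed = make_patch(upstream, after_fix_2, "example.c")

if __name__ == "__main__":
    for number, patch in enumerate(itemized, start=1):
        print(f"--- itemized patch {number} ---")
        print("".join(patch), end="")
    print("--- collapsed diff ---")
    print("".join(collapsed), end="")
```

Both forms produce the same final tree, but the collapsed diff merges the two changes into one hunk, so a reader can no longer tell which lines belonged to which change.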
Posted Mar 9, 2011 14:50 UTC (Wed) by Los__D (guest, #15263)
How they store their changes does not matter.
This seems to be another case of "I don't like it, so I'd better try to twist reality to my preferences".
Posted Mar 9, 2011 15:18 UTC (Wed) by ewan (subscriber, #5533)
Apparently they don't in the case of the RHEL kernel. If what you're suggesting were actually true there would be no list of separate patches to supply to customers. And there is.
Posted Mar 9, 2011 15:52 UTC (Wed) by mjg59 (subscriber, #23239)
Posted Mar 9, 2011 16:00 UTC (Wed) by mjg59 (subscriber, #23239)
diff --git a/redhat/Makefile.common b/redhat/Makefile.common
index 53c2115..f11b488 100644
diff --git a/redhat/kernel.spec.template b/redhat/kernel.spec.template
index c4017cf..b44bcf6 100644
as their sole contents should be a giveaway that the patch generation is automated!
Posted Mar 9, 2011 22:56 UTC (Wed) by jone (guest, #62596)
Posted Mar 9, 2011 16:12 UTC (Wed) by avik (guest, #704)
Posted Mar 9, 2011 17:09 UTC (Wed) by ewan (subscriber, #5533)
Posted Mar 9, 2011 17:11 UTC (Wed) by ewan (subscriber, #5533)
Posted Mar 9, 2011 17:20 UTC (Wed) by mjg59 (subscriber, #23239)
Posted Mar 9, 2011 17:26 UTC (Wed) by ewan (subscriber, #5533)
Posted Mar 10, 2011 10:31 UTC (Thu) by pbonzini (subscriber, #60935)
Posted Mar 9, 2011 17:30 UTC (Wed) by paulj (subscriber, #341)
So then we're back to dwmw2's point: at what point does a change in the build process, one that happens to result in the released sources throwing away a lot of useful information, go from being benign to being a deliberate attempt to obfuscate the source input to that build process so as to evade the GPL? Particularly when the system apparently still has the capability to generate the uncollapsed src.rpm, and that is not done for the express purpose of frustrating rebuilders?
Posted Mar 9, 2011 17:34 UTC (Wed) by mjg59 (subscriber, #23239)
Posted Mar 9, 2011 17:44 UTC (Wed) by paulj (subscriber, #341)
Posted Mar 9, 2011 21:14 UTC (Wed) by branden (subscriber, #7029)
You cannot violate the copyright license in your own work.
Posted Mar 9, 2011 22:47 UTC (Wed) by Los__D (guest, #15263)
Posted Mar 9, 2011 21:46 UTC (Wed) by branden (subscriber, #7029)
Posted Mar 9, 2011 23:21 UTC (Wed) by mjg59 (subscriber, #23239)
Posted Mar 10, 2011 1:22 UTC (Thu) by branden (subscriber, #7029)
At any rate, you are baffled because you are arguing against a straw man.
"The full set of data contained within a revision control system" is not what is being asked for.
What is being asked for is the delta between the upstream source code as Red Hat retrieved it, and each patch made to it at the level of atomicity the engineers working with it find most appropriate.
Because that was Red Hat's standard operating procedure in the kernel SRPMS domain for, literally, years.
(I'd say that a one-liner description of each patch would be de rigueur as well, but I get the impression that most kernel hackers would be content with one of git's inscrutable hexadecimal identifiers. If that really is good enough for the community, then it's good enough for me.)
To tie this in to the other discussion we're having (and for the edification of those eating popcorn while we argue), the .spec approach of specifying a base tarball (possibly more than one, it's been a while since I've built an RPM) and an arbitrary number of patches was one of the RPM package format's few advantages over DEB (particularly in the domain of source management).
Debian developers recognized that as a deficiency many years ago, and took steps to ameliorate it. Doogie's Build System (dbs) was one such effort: the .diff.gz no longer patched any of the original directories, but just created a debian/ directory as it always does, and then included a debian/patches directory containing the itemized patches that had to be applied to the source as part of the package build process.
There were other initiatives along these lines. Ultimately, the problem was solved at the right level, by extending the Debian source package format to allow for multiple .diff.gz files. These may still be kept few in number, but as long as the engineering equivalent of a debian/patches directory exists (many source packages use quilt) identifying the discrete patches applied, Debian packages are meeting the same standard as the Red Hat Linux kernel SRPMS of the recent past.
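As a rough illustration of what "identifying the discrete patches applied" looks like in practice, here is a small sketch that reads a quilt-style series file and returns the patches in application order. It assumes the common layout in which debian/patches/series lists one patch per line, with # starting a comment and optional arguments after the patch name; the patch names in the sample are made up.

```python
def read_series(text):
    """Return patch names from a quilt-style series file, in application order."""
    patches = []
    for raw in text.splitlines():
        # Drop anything after '#' (comment) and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # The first token is the patch name; later tokens (e.g. -p1) are options.
        patches.append(line.split()[0])
    return patches


# Hypothetical contents of debian/patches/series.
sample_series = """\
# fixes carried on top of the upstream tarball
01-fix-build-flags.patch
02-backport-driver-fix.patch -p1

03-vendor-branding.patch  # local change only
"""

if __name__ == "__main__":
    for position, name in enumerate(read_series(sample_series), start=1):
        print(position, name)
```

The point is only that the list of discrete changes, and their order, is itself part of what ships in the source package.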
Sadly, my knowledge of the history of innovations in Debian source package management trails off at about five years ago. I'd much appreciate an active member of the project chiming in to bring the discussion up to date and correct any misstatements I've made.
The fact that our knowledge of developer-friendly ways to package and distribute source code has increased, and those improvements have spread and become common practice, tells you something important: our community evolves. Our expectations evolve. We learn how to do things better. The GNU GPL is vague on this for precisely the right reason: preferences among developers change with time. What was the preferred form for modification 20 years ago might not be good enough today.
When a major figure in the FLOSS community like Red Hat Software takes a deliberate step backward in engineering quality like this, and thumbs its nose at its fellows (even if they only "mean" to inconvenience less sympathetic firms), people notice, and recognize it for what it is--a conscious refusal to abide by current best practices for delivering source code in the preferred form for modification.
That Red Hat Software played a significant role in advancing those very best practices to the high level they are today makes it poignantly sad and ironic that they are betraying their legacy.
Posted Mar 10, 2011 10:38 UTC (Thu) by pbonzini (subscriber, #60935)
Of course some of them are, apparently enough to be willing to leak internal (and occasionally false) information about the change. Others are not, others consider it a sad but justified move. You'll find the whole panorama, as expected in a relatively large population.
Posted Mar 10, 2011 10:00 UTC (Thu) by paulj (subscriber, #341)
The developers of one project can quite legitimately prefer to work on tarballs without history, while those of another may prefer to have the history. The *same* developers may follow two different processes, even working on nominally the same codebase, according to whether they're developing features for upstream or whether they're working on maintaining their employer's supported package. That's certainly been my experience at another vendor, and you may have had the same experience too at your employer.
To be clear, there is a difference between "the sources for the Linux kernel" and "the sources for a vendor kernel package". It's an undeniable fact that, say, a Red Hat kernel RPM was built using files that do not have, and (almost certainly) never will have, equivalents in the stock Linux sources.
Further, I'm not sure there is much direct historical precedent. In the past, distributors built their packages from pristine+patches because SCMs weren't good enough. So, for package sources, for want of an SCM that could keep changes distinguished, the preferred form became pristine sources + patches. That this way of collating sources for packages became established at multiple distributors - including non-Linux ones - strongly suggests it was industry-wide best practice. I'd be amazed at anyone who tried to argue that wasn't the case. For me, such wide practice is somewhat equivalent to a preference, though you'd presumably disagree. However, in recent times SCMs have become much better. Git and Mercurial (Git especially) have changed how we can work and made it viable to move information that was previously kept in explicitly separate files in the sources of the package off into the history data of the SCM.
I have a lot of sympathy for Red Hat. They've done and continue to do great things for free software. I do think it's legitimate to ask though, when a vendor deliberately tries to withhold information that was previously part of the sources they released for a package, at what point they cross a line. Maybe Red Hat have not, but the discussion is still worthwhile.
Posted Mar 10, 2011 11:24 UTC (Thu) by dwmw2 (subscriber, #2063)
"Back when people sent patches directly to Linus and he just released tarballs, was he violating the GPL?"
Tarballs are a perfectly sane way for upstream releases to happen.
But I think that every competent open source developer, when they're not trying to twist things to make a point, will agree: When releasing a modified code base which is based on some upstream project, it is definitely preferable to release that in the form of original + patches, rather than as a monolithic tarball of the whole damn mess.
Posted Mar 10, 2011 12:47 UTC (Thu) by mjg59 (subscriber, #23239)
Posted Mar 10, 2011 14:43 UTC (Thu) by paulj (subscriber, #341)
Posted Mar 10, 2011 18:48 UTC (Thu) by martinfick (subscriber, #4455)
Posted Mar 9, 2011 23:03 UTC (Wed) by rahvin (subscriber, #16953)
This clause exists in the GPL to prevent some company from sending you a printed book of source code. I find it extremely dishonest to start arguing about an individual person's interpretation of this clause without consideration of that history or the reason it exists. You simply can't be willy-nilly redefining intent however you like and examining this in a vacuum.
Any court of law that finds the GPL language ambiguous is going to go back and read the history of the clause and what it was intended to mean. That simple review can be conducted with a Google search that points out RMS defined this clause as trying to prevent the paper copy exploit. Any other description of this clause is disingenuous at best and this debate should have never made it past that cursory review.
Posted Mar 10, 2011 1:23 UTC (Thu) by branden (subscriber, #7029)
Posted Mar 10, 2011 3:03 UTC (Thu) by ewan (subscriber, #5533)
No, it doesn't. The GPL deals with that using the "on a medium customarily used for software interchange" language. The specification that the source be the 'preferred form for modification' is a separate requirement, and is about what does and does not count as 'source', not about the media via which it is delivered.
Posted Mar 18, 2011 18:48 UTC (Fri) by mishad (guest, #69757)
To know for sure we'd have to ask the GPL's original authors, but my reading of it was that the term was added to ensure that the right to produce modified works could be meaningfully exercised. In particular, it was to preclude distribution of "source" that was obfuscated (e.g. variable names changed, no whitespace, no comments, replace control structures with equivalent gotos) or which was already compiled (e.g. as "binary blobs" which form part of the resulting program/work).
I don't think anyone was thinking about VCSen back then.
Posted Mar 9, 2011 16:04 UTC (Wed) by Los__D (guest, #15263)
And again, it doesn't really matter how Red Hat package those changes. The modifications were originally (in all probability) made to the checked-out source.
- In simple cases, someone might have opted to do the change directly to an existing patch, but that is hardly the norm.
Posted Mar 9, 2011 17:22 UTC (Wed) by paulj (subscriber, #341)
OTOH, if you're maintaining a code-base, adding patches temporarily (or not) while tracking an upstream, then it's more convenient to work with a system that allows you to easily distinguish between each change, relative to the upstream. E.g. a pristine tarball + patches, or a git tree.
I've worked on both developing an upstream code-base and maintaining a supported version of that same code-base, at a big distributor/vendor, and we used both methods, as appropriate. The supported, released binaries were built from pristine+patches, and that's what got released as the source (just as Red Hat used to).
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.