Forking instead of fighting

By Jake Edge
August 27, 2014

LinuxCon North America

Bradley Kuhn is widely known for his GPL-enforcement efforts. He has spoken about them at many different conferences along the way, but his talk at LinuxCon North America in Chicago was on a different tack entirely. Instead of trying to enforce the GPL, he and others routed around a violation of the license by writing code—forking the project rather than fighting the violation.


"Sometimes it makes sense not to enforce the GPL", Kuhn said, which many may find a surprising thing for him to say. He normally talks about violations in the embedded Linux space, since they are the most "prevalent and insidious", but violations come in all shapes and sizes. This one was different than many others, which led to a different kind of resolution.

Kuhn then took a short detour to ensure that the audience was up to speed on the GPL. For the purposes of the talk, there were just a few things that the audience should be willing to agree to, at least for 45 minutes or so. The GPL requires that a covered work (the "whole work") be released with the "complete corresponding source" (CCS) for that work. There are differences of opinion about what, exactly, goes into the whole work and he could give an entire talk on just that topic. There is limited guidance from either the laws or courts on what constitutes the whole work, but he believes that will change in his lifetime: we will see a court case that makes a judgment about the reach of "whole work".

There is a fundamental assumption in the software industry, he said, that proprietary software makes more money than free software. The veracity of the assumption is immaterial, as that is the perception which causes companies to try to keep as much of their code proprietary as they can. Developers, on the other hand, would share their code, "all things being equal".

Developers will share code when it is convenient to do so, but even sometimes when it is not so convenient. A Japanese developer once told him about early code-sharing in that country by way of floppies placed on a sushi counter when meeting for lunch. Developers are not all "freedom zealots" like he is, but they tend to err on the side of code-sharing. That is the backdrop for this tale, he said.

The birth of hg-app

In January 2010, a developer noticed that there were no free-software code-sharing sites supporting the Mercurial revision control system. There were a variety of proprietary solutions (e.g. GitHub), but most of those only supported Git. So, the developer scratched an itch and created something called "hg-app" that was released in June 2010 under the MIT license. That immediately led to a flame war about the license, since Mercurial is released under the GPL and hg-app is based on Mercurial (in a GPL sense). Some pointed out that the MIT license is GPL-compatible, thus it might not make any practical difference. But in the end, hg-app's developer switched to the GPLv2+.

Sometime later, the project was renamed to RhodeCode; Mercurial developers also started to contribute to it under the GPL. Some developers were also paid to work on improvements to RhodeCode. This is all pretty standard fare for a free software project, he said, but things were about to change.

The company

The primary author of RhodeCode formed a company, RhodeCode GmbH (which Kuhn said he would refer to as "the company" to avoid confusion with the software of the same name). The company announced a license change and added a 20-user maximum into the Python code for RhodeCode. That led to complaints, threats, and ultimately a patch to remove the 20-user restriction. The company then threatened the author of that patch.

Mercurial is a member of the Software Freedom Conservancy (SFC), which Kuhn is the president of, so SFC got involved in the dispute at that point as a mediator, more or less. Some Mercurial developers and other community members sought aggressive action against the company. But, Kuhn said, SFC's goal was to have a calm conversation about the issue with the company.

That conversation broke down quickly. The company claimed to have 100% of the copyright in RhodeCode, even though patches had been accepted from others under the GPL. There is also the question of whether the whole work includes Mercurial itself, which would then also require a GPL release for RhodeCode.

The company cannot revoke the GPL on earlier releases of the code under that license (and, he stressed, it has not disputed that). It is a question of the future copyrights: can those be licensed under non-GPL terms? Kuhn believes the company does have an obligation to release under the GPL going forward, both because of the GPL patches accepted and due to the whole work question.

But the company did not agree. When friendly negotiations break down, he said, you have to look at the options. In most cases, the only option is litigation. For example, in the embedded space, there is typically no code at all, or it is so far from the CCS that it is not useful. So, a lawsuit has to be filed to force the release of the CCS.

But this case is a bit different. The company is not a completely bad actor; in fact, it spent a "long time as a good actor", he said. The reason that this situation exists at all is because the company did a good thing in the past (released its code). It won't be doing that going forward, which is lamentable, but we can take the code that was released and move on, he said.

The company will still be violating the GPL, he and others believe, but a lawsuit to pursue that will take far too long. The famous USL v. BSDi lawsuit essentially shut down development on free BSDs for 18 months, he said. That is actually a fairly short time frame; it could have been much longer.

Fork

Rather than do that, the developers decided to fork the code base and move forward. The original code is under the GPL, and some of the newer additions are as well. Done carefully, some of that newer code could be pulled into the fork.

The company's license is complex and self-contradictory, Kuhn said. The code is split into two parts, with Python and HTML code being licensed under GPLv3 and everything else, including CSS, images, and design, released under a separate proprietary license.

Under GPLv3, that second part could be considered "non-permissive additional terms" that could be removed based on section 7 of GPLv3. But the company would likely fight that interpretation, so to avoid conflict, SFC used a conservative reading of what the license said and followed it, Kuhn said.

SFC decided not to take any action on behalf of the Mercurial project. It had also gotten the ball rolling on the idea of a fork, but any fork would not automatically be a member of SFC. There was a big debate in the membership committee, Kuhn said, about whether to take on the fork as a member. As part of the decision to do so, the committee came up with a set of conditions that would clarify the provenance of the fork's code, so that the risk to SFC and its other member projects was low.

That led to a four-step process that would be followed before releasing the fork. First, find the last version of the code base without the new license. Second, extract useful patches of Python and HTML code from the post-license-change versions. Third, rebrand the project to a new name. And, finally, ensure "beyond reproach" compliance with the license.

The work was largely done by Kuhn, as part of his SFC work, and by Mads Kiilerich, who is a volunteer, with the assistance of a few others. The first step was fairly easy. Using the Mercurial repository for the project, identifying the changeset where the license change was made was straightforward.

Even with a "hyper-conservative reading" of the license, Python and HTML files are still clearly released under GPLv3. Pulling those kinds of changes out of the post-license-change versions was a bit tricky. In many cases, the changesets also touched other kinds of files. They came up with Mercurial commands to pull out what they wanted and Kuhn vetted all of the changes. Any edge cases were "discussed carefully with legal counsel", he said.
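
The article does not show the Mercurial commands that were used, but the triage it describes can be sketched in a few lines. The following is a minimal illustration only, assuming each changeset is represented by the list of file paths it touches; the function names and the three labels are hypothetical and are not taken from the Kallithea developers' tooling.

```python
from pathlib import PurePosixPath

# Assumption: only these extensions count as clearly GPLv3-licensed,
# following the "Python and HTML" reading of the license in the article.
GPL_SUFFIXES = {".py", ".html"}


def split_changeset(paths):
    """Partition a changeset's file paths into (gpl, other) lists."""
    gpl, other = [], []
    for path in paths:
        suffix = PurePosixPath(path).suffix.lower()
        (gpl if suffix in GPL_SUFFIXES else other).append(path)
    return gpl, other


def classify(paths):
    """Return 'take', 'skip', or 'review' for one changeset (hypothetical labels)."""
    gpl, other = split_changeset(paths)
    if gpl and not other:
        return "take"    # touches only Python/HTML files
    if other and not gpl:
        return "skip"    # touches only files under the other license
    return "review"      # mixed (or empty): needs a human decision
```

A changeset labeled "review" here corresponds to the tricky case in the text, where a single change touched both kinds of files and had to be vetted by hand.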

Rebranding was rather painful, overall. RhodeCode is the company's name and trademark, so the fork could not use that name except in the usual ways that anyone can (i.e. nominative use). They came up with the name "Kallithea", which is a location on the Greek island of Rhodes. But there was more to it than just renaming the project, as the string rhodecode_ was used throughout the code. While it was probably unnecessary to do so, he said, they wrote 300 lines of sed and Perl to replace all of the uses.
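
The actual rename was done with sed and Perl, which the article does not reproduce. As a rough sketch of the same idea in Python, the prefix can be replaced only where it starts an identifier. The replacement prefix `kallithea_` is an assumption made for illustration; the article does not say what the old string was changed to.

```python
import re

OLD_PREFIX = "rhodecode_"
# Hypothetical replacement prefix; not stated in the article.
NEW_PREFIX = "kallithea_"

# Match the prefix only at the start of an identifier, so that a
# name such as "myrhodecode_x" is left alone.
_PATTERN = re.compile(r"(?<![A-Za-z0-9_])" + re.escape(OLD_PREFIX))


def rebrand(text):
    """Return (new_text, number_of_replacements)."""
    return _PATTERN.subn(NEW_PREFIX, text)
```

Returning the replacement count makes it easy to log which files changed, which matters when every modification may later need to be justified.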

JavaScript and the GPL

In order to be "beyond reproach" in its license compliance, Kallithea needed to ensure that it was providing the CCS for its code. Even if the company was violating GPLv3, that doesn't give the project (which is using a large chunk of code that the company holds copyright to) the right to do so. The biggest problem for providing the CCS turned out to be JavaScript under the GPL.

It is a problem that other projects have, he said, but they may not know it. Typically, you publish a .js file at some URL and it gets downloaded as part of an HTTP request. Under the GPLv3, you have distributed the code at that point, so you must provide the CCS. For RhodeCode/Kallithea, though, there is a bunch of JavaScript code from all over; some of it was written for RhodeCode, but lots of it was from elsewhere.

The first problem was tracking down what version of the external code is being used (and what license it is under) so that the license text accompanying Kallithea could be kept up to date. That part was fairly straightforward (if tedious), but the real problem came from "minified" JavaScript. Under the GPL, that is considered to be "object code", so the source JavaScript had to be tracked down to be added to the CCS of Kallithea.

For example, YUI 2.9 is a deprecated Yahoo user interface library written in JavaScript that can be found in many places in minified form. That's fine, since the library is BSD licensed, but it is not fine for a GPL-licensed package to release it that way. It would be hypocritical for Kuhn to release code without the CCS, he said, given that he has spent many years fighting for the CCS for various other programs in court. They were eventually able to figure out how to get the source and to build it into the minified version, so the instructions to do so are now part of the license file for Kallithea.
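
A project facing the same audit might start by flagging files that are probably minified, so that their sources can be tracked down. The sketch below is a crude heuristic and not anything described in the talk: the file-name suffixes and the 200-character threshold are assumptions chosen for illustration.

```python
def looks_minified(name, source, max_avg_line_len=200):
    """Heuristic guess at whether a JavaScript file is minified.

    Both the file-name suffixes and the line-length threshold are
    illustrative assumptions, not a definitive test.
    """
    if name.endswith(".min.js") or name.endswith("-min.js"):
        return True
    lines = [line for line in source.splitlines() if line.strip()]
    if not lines:
        return False
    average = sum(len(line) for line in lines) / len(lines)
    return average > max_avg_line_len
```

Anything this flags still needs a person to find the matching unminified source and confirm its license; the heuristic only narrows the search.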

There are some minutiae to the GPL, he said, but they are normally easily met. The first release of Kallithea was done on the same day (August 22) as Kuhn's talk. A late-breaking problem that they ran into before the release was the license notification that appeared in the HTML of each page of the interface. RhodeCode has an incorrect one, he said, but getting that right is not necessarily easy. He would rather maintain a single page (like "About") rather than something on each page. It is, he said, the first time he has felt burdened by the GPL; it is something he may try to get Richard Stallman to change down the road.

Kuhn cited three lessons for developers that resulted from this episode. Don't just grab JavaScript from anywhere and incorporate it into a web application, he said. Part of the problem is that Python programmers (in this case) don't really take JavaScript very seriously, which can lead to problems as it did here.

When contributing to a new project, immediately check to see who holds the domain name and trademark—if it is only one person, start talking to an organization like SFC. Kuhn "pre-announced" a new "Conservancy-lite" program that SFC is offering to projects who are just looking for a place to park their domain name and trademark.

Lastly, he suggested that developers keep their own copyrights to their code. The code can be contributed to a project under the same license the project releases its code under. Developers should make it clear they expect the license that the contribution was made under to be upheld.

Meanwhile, Kallithea is an early success. It has released version 0.1, which supports both Mercurial and Git, and anyone can run their own instance. And all of it is developed in the open under GPLv3.

[I would like to thank the Linux Foundation for travel assistance to Chicago for LinuxCon North America.]

Index entries for this article
Conference: LinuxCon North America/2014



BitBucket

Posted Aug 28, 2014 11:06 UTC (Thu) by jnareb (subscriber, #46500) [Link]

In January 2010, a developer noticed that there were no free-software code-sharing sites supporting the Mercurial revision control system. There were a variety of proprietary solutions (e.g. GitHub), but most of those only supported Git.

If we are talking about proprietary solutions (like GitHub), there is always Bitbucket. It has been around as long, or almost as long, as GitHub.

If we are talking about code-sharing (or code-hosting) software which is OSS licensed, then GitHub is not a good example; a better one would be Gitorious (or recently GitLab). hg-app / RhodeCode / Kallithea would be the Mercurial equivalent of open-source Git-based Gitorious, not of proprietary GitHub.

Forking instead of fighting

Posted Aug 28, 2014 14:46 UTC (Thu) by jackb (guest, #41909) [Link]

My takeaway from this article is yet another confirmation that IP needs to die.

Just look at the amount of wasted effort that goes into jumping through imaginary hoops to comply with arcane and contrived bullshit that could instead be directed toward accomplishing something useful.

MIT vs GPL

Posted Aug 28, 2014 18:15 UTC (Thu) by man_ls (guest, #15091) [Link] (5 responses)

What a convoluted process. It's things like these that are pushing the kids these days towards the MIT license and equivalents. It's a pity because then they are essentially giving the code for free to any greedy organization that may take advantage of it; but it's easier than tracking down obscure non-minified libraries.

In fact Kuhn might have just released YUI as is, since it is BSD licensed, but he felt the need to go through all the trouble just to avoid being hypocritical. I think that the net result is the opposite: pushing people towards permissive licenses.

One may well ask if the GPL is worth all the trouble. Right now it seems that there are not so many "evil" corporations trying to incorporate everyone's code inside their proprietary, nefarious products; rather a bunch of clueless oriental manufacturers who embed any code they can find into their cheap ephemeral products. While it's good to educate them, it is also a lot of effort for minimal gains.

MIT vs GPL

Posted Aug 29, 2014 10:57 UTC (Fri) by HIGHGuY (subscriber, #62277) [Link] (3 responses)

I don't agree. These 'evil' corporations would just love to go about and copy-paste code into their "proprietary, nefarious products".
The GPL is the thing that is actively stopping many of them from doing so.

As Kuhn said, developers generally like to share (both on the sending and receiving end), but for many of them their salaries come with copyright assignments. When those companies deal with closed-source code this means that they need to be on the safe side when receiving external code. Take that extra care away, and you'll quickly find yourself in the wild-west of copy-paste-ship_as_commercial.

MIT vs GPL

Posted Aug 31, 2014 21:39 UTC (Sun) by man_ls (guest, #15091) [Link] (2 responses)

I tend to agree with you, but nowadays many developers don't care that corporations are integrating their code into their products -- in fact, they love it when it happens!

It all depends if you consider that proprietary software is a morally bankrupt thing, or just a mildly annoying fact of life, I guess.

MIT vs GPL

Posted Sep 1, 2014 14:23 UTC (Mon) by raven667 (subscriber, #5198) [Link] (1 responses)

That's true that many developers don't care but there are downsides other than the plain objection to using the law to prevent working on the software you use if you have the tools and inclination, many people are chafed when they don't have the same opportunity to profit from their own work, where they are being taken advantage of as free labor for someone else to profit from when there is a one-way flow of work. If the company is just using the code as-is this isn't much of a problem (even in the GPL sense) but I think a lot of people would feel it unfair to be driven out of the market by a proprietary fork of their own work, that they aren't legally allowed to work on, rather than being on more equal footing.

MIT vs GPL

Posted Sep 2, 2014 13:25 UTC (Tue) by man_ls (guest, #15091) [Link]

Quite true, but that doesn't seem to happen much nowadays, when free software projects get all the momentum and proprietary forks hardly ever shine at all... Now that many developers are educated about the evils of proprietary software, perhaps.

So, OpenWRT is a "minimal gain"?

Posted Sep 9, 2014 17:43 UTC (Tue) by bkuhn (subscriber, #58642) [Link]

I guess you'd consider OpenWRT and SamyGo "minimal gains"? Both projects exist at all primarily because of GPL enforcement work that I led.

This talk wasn't about the issues of current perceived preferences for non-copyleft licenses. For those who are curious about my thoughts on those subjects, which man_ls is raising here, I recommend my FOSDEM 2014 talk.

Forking instead of fighting

Posted Aug 28, 2014 19:57 UTC (Thu) by Baylink (guest, #755) [Link] (3 responses)

This is pretty simple. Once you have accepted patches with the GPL placed on them by their authors, *and released that code*, then that release has those authors as part of the copyright holder group.

To re-license it, you need releases from those authors.

This is why Sun and FSF require such licenses, though in FSF's case it's probably not to do a commercial release later. :-)

Forking instead of fighting

Posted Aug 28, 2014 23:40 UTC (Thu) by giraffedata (guest, #1954) [Link] (2 responses)

Once you have accepted patches with the GPL placed on them by their authors, *and released that code*, then that release has those authors as part of the copyright holder group.

It's really simpler than that. Once you have accepted patches with any license or no license at all, and regardless of whether you have released the patched code, the patch authors are copyright holders of the patched code.

Because they're copyright holders, you need their permission to distribute the patched code in any way, irrespective of what licenses you give your recipients.

Now if a patch author gave you a GPL license along with the patch, then you already have the patch author's permission to distribute the code if you meet certain conditions, including giving your recipients a GPL (or similar) license. But if you don't meet those conditions, you'll need separate permission from the patch author.

To re-license it, you need releases from those authors.

What you need from the patch author, to distribute the code without giving recipients a GPL license, is a suitable copyright license (Not GPL). Note that license=permission.

... Sun and FSF require such licenses, ...

I believe you're talking about a copyright assignment, which is an alternative way to have the right to distribute a patch. The patch author is no longer a copyright owner if he assigns his copyright to Sun/FSF, so Sun/FSF doesn't require his permission for anything.

Forking instead of fighting

Posted Aug 31, 2014 11:14 UTC (Sun) by niner (subscriber, #26151) [Link] (1 responses)

Not exactly. Not every source code and thus not every patch is even copyrightable. It has to be complex enough to allow the author to express herself. I'd guess a straightforward typo fix for example would not pass this barrier.

The takeaway is: don't ever make blanket legal statements. Reality is always a bit more complicated than one would assume.

Forking instead of fighting

Posted Aug 31, 2014 15:37 UTC (Sun) by giraffedata (guest, #1954) [Link]

You make a good point that law is complex, so simple, blanket statements are almost guaranteed to be incorrect. But if we adopt a rule against making such statements, we can't discuss the law at all, because either every statement gets so full of qualifiers that one can't read it, or people are afraid to say anything because there might be some exception they missed.

So I would turn it around: understand that any blanket legal statement you read probably has numerous exceptions.

I also think it's a wonderful form of discourse for one person to make a basic blanket statement in a forum like this and for others then to refine it with the exceptions.

Other materials related to this talk are available.

Posted Sep 9, 2014 17:56 UTC (Tue) by bkuhn (subscriber, #58642) [Link]

The slides for this talk are available on my website if anyone wants to view them. Also of interest might be my blog post about the subject.

Forking instead of fighting

Posted Mar 26, 2022 16:38 UTC (Sat) by ecm (subscriber, #129897) [Link]

> He would rather maintain a single page (like "About") rather than something on each page. It is, he said, the first time he has felt burdened by the GPL—it is something he may try to get Richard Stallman to change down the road.

I assume this change would require a GPL v4 (or perhaps v3.1)? But then Kallithea would not be able to use that since it appears to be licensed under GPLv3-only.


Copyright © 2014, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds