Copyright notices (or the lack thereof) in kernel code
In early September, a patch series implementing fscrypt integration for the Btrfs filesystem included this patch adding, among other things, a one-line Facebook copyright notice. Btrfs maintainer David Sterba replied with a request to limit copyright information to SPDX tags; he cited a page in the Btrfs wiki, asserting that these tags are a complete replacement for copyright notices. Christoph Hellwig disagreed, pointing out that SPDX describes licensing but not ownership:
It is not a replacement for the copyright notice in any way, and having been involved with Copyright enforcement I can tell you that at least in some jurisdictions Copyright notices absolutely do matter.
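The distinction Hellwig draws can be made concrete. The following is a minimal Python sketch, not taken from the kernel tree or from any Btrfs tooling, that scans a file header and reports the SPDX license tag and any copyright lines separately. The sample header, the holder named in it, and the function name `scan_header` are all hypothetical.

```python
import re

# Matches an SPDX license tag such as "SPDX-License-Identifier: GPL-2.0".
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/)?\s*$")
# Matches a conventional notice such as "Copyright (C) 2022 Some Holder".
COPYRIGHT_RE = re.compile(
    r"Copyright\s*(?:\(C\)|©)?\s*(.+?)\s*(?:\*/)?\s*$", re.IGNORECASE
)


def scan_header(text):
    """Return (license_expression_or_None, list_of_copyright_statements)."""
    license_expr = None
    holders = []
    for line in text.splitlines():
        m = SPDX_RE.search(line)
        if m:
            license_expr = m.group(1)
            continue
        m = COPYRIGHT_RE.search(line)
        if m:
            holders.append(m.group(1))
    return license_expr, holders


# Hypothetical header: the tag states the license, the notice states an owner.
SAMPLE = """\
// SPDX-License-Identifier: GPL-2.0
/*
 * Copyright (C) 2022 Example Holder, Inc.
 */
"""

license_expr, holders = scan_header(SAMPLE)
print("license:", license_expr)
print("copyright:", holders)
```

A header that carries only the SPDX line yields a license expression and an empty list of holders, which is the gap being argued over: the tag says how the code may be used, not who owns it.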
Hellwig, of course, was the initiator of a GPL-infringement lawsuit against VMware that was dismissed due to an inability to prove ownership of the code in question. It is thus unsurprising that he is sensitive to the placement of copyright notices in the code itself. When Hellwig submitted a patch of his own, also in September, that added a copyright notice to a newly created file, Sterba let it be known that he would refuse that change as well. Toward the end of October, in the discussion of yet another patch set, Hellwig eventually withdrew the work, saying:
FYI, I object to merging any of my code into btrfs without a proper copyright notice, and I also need to find some time to remove my previous significant changes given that the btrfs maintainer refuses to take the proper and legally required copyright notice.
Given that the kernel code has no shortage of copyright notices (nearly 79,000 lines contain the word "copyright"), it is natural to wonder why this policy is being applied in the Btrfs subsystem. The Btrfs wiki page describes the reasoning:
The copyright notices are not required and are discouraged for reasons that are practical rather than legal. The files do not track all individual contributors nor companies (this can be found in git), so the inaccurate and incomplete information gives a very skewed if not completely wrong idea about the copyright holders of changes in a given file. The code is usually heavily changed over time in smaller portions, slowly morphing into something that does not resemble the original code anymore though it shares a lot of the core ideas and implemented logic. A copyright notice by a company that does not exist anymore from 10 years ago is a clear example of uselessness for the developers.
The page also states that the Signed-off-by tags found in the kernel's Git history are sufficient to document the copyright status of the code. There are a few difficulties with this position, including the fact that those tags indicate that the submitter has the right to contribute the code to the kernel, but do not necessarily show who the copyright owner is. Another problem was pointed out by Bradley Kuhn: if the Git history serves as the copyright notices for the code, then it will be necessary to ship the entire Git repository to be in compliance with the GPL's source-code requirements. That makes complaints about copyright notices in the code being unwieldy lose some of their weight.
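The gap between a sign-off and an ownership claim shows up when the tags are pulled out mechanically. The following Python sketch, which assumes a made-up commit message and uses the hypothetical helper name `signoffs`, extracts the `Signed-off-by` trailers from a commit message; nothing in the result names a copyright holder.

```python
import re

# A Signed-off-by trailer has the form "Signed-off-by: Name <email>".
SOB_RE = re.compile(r"^Signed-off-by:\s*(.+?)\s*<([^>]+)>\s*$", re.MULTILINE)


def signoffs(message):
    """Return a list of (name, email) pairs, in the order they appear."""
    return SOB_RE.findall(message)


# Hypothetical commit message; the people and addresses are invented.
MESSAGE = """\
example: add a new helper

Explain what the helper does.

Signed-off-by: A Developer <dev@example.com>
Signed-off-by: A Maintainer <maint@example.org>
"""

for name, email in signoffs(MESSAGE):
    print(f"{name} <{email}>")
```

Each trailer identifies a person who vouched for the patch. An employer or other rights holder, if different, would have to be recorded some other way.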
In the most recent discussion, Chris Mason said that "Christoph's request is well within the norms for the kernel". Sterba replied that he would consider changing the policy, but only as part of a wider policy decision by the kernel project:
I've asked for recommendations or best practice similar to the SPDX process. Something that TAB can acknowledge and that is perhaps also consulted with lawyers. And understood within the linux project, not just that some dudes have an argument because it's all clear as mud and people are used to do things differently.
It's not clear who Sterba has asked for recommendations at this point. Chances are that he will find, over time, that the Btrfs subsystem's position on copyright notices is not widely held across the project as a whole. Steve Rostedt arguably described the consensus view: "The policy is simple. If someone requires a copyright notice for their code, you simply add it, or do not take their code". In the absence of a decree from Linus Torvalds, though, the issue of copyright notices may continue to be a source of disagreement. Claiming copyright on a portion of a shared body of code can always be a touchy matter, but it's one that developers can care a lot about.
Posted Oct 27, 2022 15:53 UTC (Thu)
by Wol (subscriber, #4433)
Authored by: J Random Developer: (c) Fred Bloggs Ltd 2022
That way it also shows up if somebody changes employ - the "author" line will change. Does git say "this line came from that patch"? So if they want the author, the copyright would show up at the same time.
Cheers,
Wol
Posted Oct 27, 2022 16:37 UTC (Thu)
by NYKevin (subscriber, #129325)
OTOH, copyright notices are not necessarily going to make that any easier. If the code says it is copyright Fred Bloggs Ltd, that is not necessarily true. It may have started out as a work by that company, and then over time, other individuals may have contributed to such an extent that the original would be unrecognizable. I'm dubious that a court would recognize a claim of copyright in those circumstances, but you'd have to ask a lawyer.
Posted Oct 27, 2022 16:58 UTC (Thu)
by Wol (subscriber, #4433)
Yup. If the lines have been re-written, mangled, diluted etc then there might be an argument over whether copyright has survived, but at least git shows clearly who contributed, what they contributed, and who owned the contribution.
It at least removes some uncertainty - if my employer owns my contributions, then I leave but continue contributing to the project, I can't claim the contributions on my employer's dime as my own (or vice versa - they can't claim mine from before I joined).
The thing is, it makes a clear claim (a) of ownership, and (b) of what is owned. There's still going to be argument over whether it was worthy of copyright, and of whether enough of it survived to keep a valid copyright claim.
But that would always be the case. This just reduces the amount of crap the lawyers can argue over.
Cheers,
Wol
Posted Oct 28, 2022 11:50 UTC (Fri)
by khim (subscriber, #9252)
> OTOH, copyright notices are not necessarily going to make that any easier.
They absolutely do make life much easier in cases like Hellwig's. A copyright notice clearly shows that the code originates from something owned by the company or individual named in the notice, and it then becomes the other party's problem to prove that the notice is a lie. It's not impossible, it has been done, but it's very hard. If there is no copyright notice, on the other hand, then there is no such presumption, and you have to prove that you actually wrote enough of the code to be entitled to copyright protection. Yes, it's basically just a question of “who pays the lawyer”, not a question of what the court would ultimately decide… but it's still a very important distinction in practice.
Posted Oct 29, 2022 3:48 UTC (Sat)
by buck (subscriber, #55985)
In a world where the cost of getting hauled into court is the long and short of some people's business model, you have a fiendishly sly sense of humor. [grin]
I don't know about this question of copyright and who's going to be in a position to defend it, but if there's any possibility it makes the lawyer tax fall more heavily on somebody trying to misappropriate the code or the "embodied" IP, that seems like a pretty persuasive argument against Copyright-comment minimalism. The rest of everything everybody does in this country I live in, anyway, is in large part guided by lawyer-tax-avoidance considerations. Not having that make a noticeable mark on every commitdiff is almost quaint. [wink]
Sorry; just being cynical. Please ignore if you're perturbed by me being so glib about such things, or if you're a lawyer.
Posted Oct 27, 2022 16:41 UTC (Thu)
by dullfire (guest, #111432)
Using git as the sole source of copyright attribution would render it inadvisable (or maybe even illegal) to distribute source tarballs (that did not include the full git history).
Posted Oct 27, 2022 17:08 UTC (Thu)
by Wol (subscriber, #4433)
I hate to say it, but it's perfectly normal practice, when the reams of copyright headers get excessive, for them to be stripped from the live source and a note replaces them saying "look at the previous version for historic copyrights".
Is it a criminal offence to strip notices, or just civil? If it's a civil offence, then you'd have to prove damages, and if it's not done with the intention of breaking copyrights, but only with the intention of making working with the code easier, that would be very hard to do.
Cheers,
Wol
Posted Oct 27, 2022 17:29 UTC (Thu)
by dullfire (guest, #111432)
Assuming all source tarballs come from git sources, shipping of source-only tarballs would be illegal (used loosely) at some point (though maybe not for simply redistributing the tarball you got).
So let me clearly say "I am not a lawyer". With that out of the way, DMCA § 1204 provides for criminal charges (in some cases; I recommend reading it for yourself [1]). DMCA § 1203 provides for civil liabilities.
[1] https://www.law.cornell.edu/uscode/text/17/1204 see also the links to § 1202 b.2, and § 1202 c for definitions of terms
Posted Oct 27, 2022 17:40 UTC (Thu)
by matthias (subscriber, #94967)
Where is the difference if you strip a COPYRIGHT file that contains the copyright information or the .git folder that contains the copyright information?
If I added parts of the source code as git attributes (a strange idea, but possible) and then used a build script to extract and compile them, would you say that it is perfectly valid to distribute only the parts of the source code that are in the files? Or does the mere fact that I put some of the code into git attributes force me to include these as well when I distribute the code?
Is there a difference between code that is hidden in git attributes and copyright notices when it comes to what is allowed to be omitted and what is not allowed to be omitted when creating a source tarball?
Cheers,
Matthias
Posted Oct 27, 2022 17:45 UTC (Thu)
by mpr22 (subscriber, #60784)
I understand that in some jurisdictions, copyright infringement is uniformly a matter of criminal law, while in others, whether copyright infringement constitutes a crime or a tort depends on things like "scale" and "commerciality". (And I dare say there's a jurisdiction somewhere out there where copyright infringement is purely a civil matter.)
Posted Oct 27, 2022 23:30 UTC (Thu)
by NYKevin (subscriber, #129325)
The US comes surprisingly close. Commercial copyright infringement is technically a criminal matter, but in practice the federal government has better things to do, so you would usually only get prosecuted if you make a nuisance of yourself and the (rather substantial) civil remedies are inadequate. See for example Kim Dotcom. But the vast majority of copyright infringement is either handled civilly or informally (i.e. without directly involving the court system, usually in the form of DMCA notices, as well as stuff like ContentID).
Posted Oct 27, 2022 17:48 UTC (Thu)
by stephen.pollei (subscriber, #125364)
I'm not a lawyer, but I think it's criminal and not civil... however key words are "fraudulent intent". Perhaps, if the intent is to declutter the source code and you have a good-faith reason to think git history is sufficient then there might be no issue. Maybe, it is best to not put yourself in situation where you have to explain intent in a court.
Posted Oct 27, 2022 17:52 UTC (Thu)
by stephen.pollei (subscriber, #125364)
Posted Jan 1, 2023 16:37 UTC (Sun)
by agowa338 (guest, #162947)
Posted Oct 27, 2022 18:01 UTC (Thu)
by IanKelling (subscriber, #89418)
Posted Oct 27, 2022 23:12 UTC (Thu)
by mtaht (subscriber, #11087)
I'm increasingly frustrated that any level of GPL enforcement against serial violators, particularly in the embedded market, has faded. Cambium and ubnt both stopped doing GPL drops a few years ago. So many "security" cams, so many other devices, so obviously based on Linux, lacking GPL drops, also.
Lacking GPL enforcement, it would be best for the world if somehow those that are copying and going could be strongly encouraged again to work within open source best practices.
Posted Oct 30, 2022 21:05 UTC (Sun)
by andy_shev (subscriber, #75870)
Posted Oct 31, 2022 6:05 UTC (Mon)
by ssmith32 (subscriber, #72404)
You may help someone going after GPL infringement, though. If I understand the arguments being made here.
Posted Oct 28, 2022 4:56 UTC (Fri)
by scientes (guest, #83068)
Posted Oct 28, 2022 4:59 UTC (Fri)
by scientes (guest, #83068)
Posted Oct 28, 2022 7:31 UTC (Fri)
by LtWorf (subscriber, #124958)
In a world without copyright you'd be absolutely right.
Posted Oct 28, 2022 8:41 UTC (Fri)
by leromarinvit (subscriber, #56850)
Regarding this, I'm hoping something comes from SFC's enforcement suit against Vizio (https://lwn.net/Articles/895405/). They're suing as a buyer of an affected device, not as a copyright owner. If they win this, owners of violating devices would have credible power against manufacturers even in the face of copyright owners who don't care - which, in the case of Linux, seems to be at least a significant minority (or maybe even the majority).
Currently, all you can do as a user is say "pretty please" if the copyright owner doesn't care, and lots of companies get away with ignoring that. This would be a massive improvement in my book.
Posted Oct 30, 2022 19:15 UTC (Sun)
by LtWorf (subscriber, #124958)
Posted Oct 28, 2022 7:43 UTC (Fri)
by rsidd (subscriber, #2582)
Posted Oct 28, 2022 8:17 UTC (Fri)
by hverkuil (subscriber, #41056)
We had that situation at least once in the media subsystem.
Posted Oct 28, 2022 12:17 UTC (Fri)
by karim (subscriber, #114)
FWIW, git offers the "--author" flag for "commit". Maybe that'd be useful here?
Posted Oct 28, 2022 12:59 UTC (Fri)
by geert (subscriber, #98403)
That just overrides user.name/user.email in git's configuration.
Posted Oct 28, 2022 13:14 UTC (Fri)
by karim (subscriber, #114)
From https://git-scm.com/docs/git-commit :
"--author=<author>
Override the commit author. Specify an explicit author using the standard A U Thor <author@example.com> format. Otherwise <author> is assumed to be a pattern and is used to search for an existing commit by that author (i.e. rev-list --all -i --author=<author>); the commit author is then copied from the first such commit found."
Am I misreading what this does? Note: I'm not a regular user of this functionality, so I might be missing the mark.
Posted Oct 28, 2022 21:02 UTC (Fri)
by mathstuf (subscriber, #69389)
Posted Oct 29, 2022 20:20 UTC (Sat)
by Wol (subscriber, #4433)
A LOT of contributors do not own the copyright in their contributions. I've got a feeling I might soon need to sort out that mess in ScarletDME ...
Cheers,
Wol
Posted Oct 30, 2022 5:18 UTC (Sun)
by pabs (subscriber, #43278)
Posted Oct 28, 2022 15:08 UTC (Fri)
by flussence (guest, #85566)
If this guy's causing a chilling effect on people trying to contribute then I'd say the real problem isn't copyright, but that the kernel's CoC is entirely toothless.
Posted Oct 28, 2022 15:17 UTC (Fri)
by corbet (editor, #1)
Posted Oct 28, 2022 16:16 UTC (Fri)
by atnot (subscriber, #124910)
The only place hard drives really live these days is in network storage devices, which do their own redundancy locally, often as some sort of cluster. If you're using local disks they're going to be high performance SSDs, in RAID 10 because the overhead of calculating parity at those speeds would be too high anyway.
So at this point the only one you've got left is enthusiasts building a DIY NAS at home, who have enough time on their hands to just deal with the inconveniences of dealing with ZFS anyway.
So unless some brand new use case for RAID 5/6 appears from somewhere, I don't think this will change even under the friendliest maintainership.
Posted Oct 28, 2022 21:45 UTC (Fri)
by Conan_Kudo (subscriber, #103240)
Posted Oct 30, 2022 20:45 UTC (Sun)
by jhoblitt (subscriber, #77733)
The reality is that neither the user base nor the commercial financial interest is present to mature Reed-Solomon codes for a single node storage solution. Evidence of this is that Red Hat has dropped support in RHEL for btrfs completely. Mid to large scale organizations have either outsourced the problem to "the cloud" or they use a distributed storage system that has erasure-codes and/or replicas spread across multiple nodes. Single host storage solutions are simply too unreliable to be trusted with important data.
Posted Oct 31, 2022 5:51 UTC (Mon)
by mirabilos (subscriber, #84359)
Except, I’d say, for those gazillion ones printk’d during boot. These are just ridiculous. Something-or-the-other was written for SuSE, and all that.
I think the RAID5/6 problems in Btrfs persist because nobody has put in the time to fix them. That is certainly a problem but a different one than the subject of this article; I wouldn't mix the two.
There is work going on to fix RAID 5/6 modes right now.