Leading items

Old projects and the free-software community

By Nathan Willis
May 18, 2016

The Community Leadership Summit (CLS) is an annual event for community managers, developer evangelists, people who work on public-facing forums, and those with a general interest in engagement or community development for free-software projects. The 2016 edition was held in Austin, Texas the weekend before OSCON. Several sessions at CLS 2016 dealt with the differences exhibited between old and new free-software projects where community management is concerned. One of those tackled the problem of how to foster community around an older software project, which poses a distinct set of challenges.

Community revitalization

While there were a few plenary talks each morning at CLS, the majority of the day was reserved for "unconference" sessions: discussions on a topic proposed, on the spot, by a conference attendee. The "old projects" topic was raised in such an unconference session, moderated by Kate Stewart. Specifically, it dealt with how to re-energize a community around a software project that is no longer in the "new shiny thing" category.

Since the terms "old" and "new" are relative, establishing some context at the outset was important. It is often easy to attract code contributors, active volunteers, and enthusiastic members of the user community when a project is undergoing rapid development; doing the same thing for a stable code base or a core infrastructure project in "maintenance mode" is far more difficult. A few example projects were brought up at the beginning: FOSSology, which is well-established and has plenty of users but few developers; bzip2, which is a dependency of scores of other projects but is essentially in maintenance mode; and libexiv2, which is an under-the-hood library many end users may be taking for granted.

Each project's situation is a bit different, but they share common traits, such as a declining influx of new contributors and fewer volunteers actively staffing the communication channels. An Ubuntu developer noted, for example, that the excitement level of the Ubuntu community was quite a bit higher when the project was in "build a new distribution" mode; now that it is an established distribution, it is still popular, but the momentum has shifted to "building something on top of Ubuntu."

A few key facets of the community-engagement gap for older projects were identified in the discussion. First, established projects have a problem with the outside perception that there is little or nothing to do in the project—which is rarely actually the case. In the bzip2 case, for example, the project needs a number of infrastructure updates, including improvements to the build system.

Someone suggested that "creating a crisis" is often a viable approach to closing this awareness gap. Real crises, like the Heartbleed vulnerability for OpenSSL, have had that effect for some projects. But one can remind the free-software community as a whole how important an older project is by asking it to consider what would break if the project disappeared.

Another identified gap is that older projects may not have anyone working in the community-manager role, perhaps because that is a more recent idea. In the early days of the GNU project, one attendee pointed out, the other GNU developers were the user community. Because the broader free-and-open-source software movement has expanded so much since then, developers from those original projects may not feel a connection to the community even though they, as individuals, are not what has changed.

Another common factor is that end users often feel disconnected from stable or under-the-hood projects. In the FOSSology case, the project's users rarely mention its value in public, perhaps since it is generally an internal tool or used as an auxiliary to the user's main activities. It might reap dividends for the project to simply encourage its users to be more vocal about how FOSSology helps them, someone suggested. For library projects like libexiv2, the downstream projects depending on them can help connect library developers to the end users by making it clear how important the upstream project is, and perhaps by sharing community spaces and communications tools, such as forums.

Practical challenges

Communication tools and other community infrastructure can ossify if a project is not careful, which can have the unintended side effect of making it harder for new community members to get involved or, simply, to interact with the project's development team. In some cases, it may be that the project-hosting world has moved on: older projects may be relying on a SourceForge-hosted mailing list, for example, while many new users are familiar only with GitHub and the like. Updating the tools may be a painful, but beneficial, option.

A more difficult question is how to keep the project's communication channels up-to-date in light of the constantly shifting "tool of the month" culture. Several people mentioned the difficulty of maintaining Slack channels for open-source projects; Slack is proprietary, but the service is what new users—at least, at the moment—expect. There are mechanisms for linking a Slack channel to other (free-software) services, like IRC, but they seem to be difficult to use or in a poor state of repair. On the plus side, one attendee mentioned that it is quite easy to export Slack discussions as JSON and archive them online for wider access, a fact that seemed to come as news to many in the session.
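As a rough illustration of the archiving idea (nothing like this was shown in the session), the sketch below turns exported chat messages into plain-text log lines that could be posted online. It assumes the export is a JSON list of message objects with "user", "text", and "ts" (epoch seconds as a string) fields. The function name render_messages and the sample data are hypothetical, and a real export may well need more handling than this.

```python
import json
from datetime import datetime, timezone


def render_messages(messages):
    """Turn a list of exported message dicts into plain-text log lines.

    Assumption: each message carries "user", "text", and "ts" keys,
    where "ts" is a string of epoch seconds. Missing keys fall back
    to placeholder values rather than raising.
    """
    lines = []
    # Sort by timestamp so the archive reads in chronological order.
    for msg in sorted(messages, key=lambda m: float(m.get("ts", 0))):
        when = datetime.fromtimestamp(float(msg.get("ts", 0)), tz=timezone.utc)
        user = msg.get("user", "unknown")
        text = msg.get("text", "")
        lines.append(f"[{when:%Y-%m-%d %H:%M:%S}] <{user}> {text}")
    return lines


# Hypothetical sample data standing in for one exported file.
sample = json.loads("""[
  {"user": "U2", "text": "Patches welcome!", "ts": "1463580060.000003"},
  {"user": "U1", "text": "Does the build system need work?", "ts": "1463580000.000002"}
]""")

for line in render_messages(sample):
    print(line)
```

The output is ordinary text, so it can be published with whatever static hosting a project already uses, which keeps the archive readable without the proprietary service.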

Another issue is that the "maintainer" developers are usually not "creator" developers, who tend to drop away after a project becomes stable. Maintenance is frequently a side effort for someone who does other development; the divided focus and lack of a creator's enthusiasm make it harder for a project to attract new developers. But there are publicity opportunities that are better suited to older projects than to newer ones. "Come to this project and have your work be used by 20 million people" was suggested; a core library could truthfully make such a claim while few newer projects could. On the other hand, some recruitment obstacles are more prevalent in older projects, such as technical debt, complexity, and reliance on older languages.

Yet, perhaps ironically, some of those technical issues can also make older projects a good target for education outreach. Several educators were in the session; they noted that when they talk to computer-science professors about getting students involved in open source, those professors are typically looking for older, stable code bases. Thus, they can have students work on small patches, and the experience can focus on the student's work, rather than wrestling with a shifting, chaotic project.

Those same attendees noted that professors also prefer establishing a long-term relationship with the project in question, which is beneficial to the project, too. The project gets a steady stream of contributions over multiple semesters, rather than a "drive-by contribution" from an individual summer intern who never returns. Few projects, they reported, have a point-of-contact person who can work with interested professors. Although getting college professors in touch with older projects would be mutually beneficial, the room seemed to agree, it ties in to the larger problem that open-source software has making inroads with the academic world.

That said, comparatively few long-established projects participate in efforts like Google Summer of Code or Outreachy. Making an effort to participate could be beneficial to those projects. Several attendees mentioned other outreach efforts that target bringing in new contributors, such as OpenHatch, CodeTriage, and Your First PR.

One attendee noted that perhaps some projects really do deserve to be moved to an "archival" type of project hosting. Older filesystems, for example, may no longer be used on newly set-up machines, but the code certainly needs to persist in the long term, since there will be users needing to access old disks and disk images long after the filesystem has been supplanted by a newer alternative. No one seems to address this project-hosting niche today; someone suggested that the Internet Archive, which performs a wide variety of archiving functions, might be worth approaching.

The session ended, as unconference sessions often do, only when time was up and the attendees ran off to the next round of discussions. But the mood of the room at the end was fairly positive. "Old" projects, it seems, may have to employ different strategies to attract vibrant communities, but it is certainly a challenge that the free-software community can meet. Some notes from the session are available online; those in attendance may continue to add to what is there, as may anyone else interested in the topic.

The open-source generation gap

By Nathan Willis
May 18, 2016

At the 2016 Community Leadership Summit (CLS) in Austin, Texas a pair of sessions discussed the culture gap that exists between a substantial contingent of old and new open-source developers. The first session sought to understand the background of the divide, in particular to understand why a large proportion of new developers seem to have no interest in licensing, simply wanting to work "in the open" and skip over the other details. The second session sought to figure out how the divide could be bridged.

Both of the sessions were of the unconference variety, meaning that they were proposed, organized, and moderated by one of the attendees. In fact, the first of the sessions originally proposed to tackle solutions as well; it was only when time ran out and the discussion was nowhere near a resolution that the second session was proposed to follow up. Unlike the session dealing with older software projects, this discussion centered around what is apparently an age-based divide between different "generations" (for lack of a better term) of open-source developers.

The impetus for interest in the topic was a widely circulated Medium post by Nadia Eghbal, in which she decries the term "open source" as irrelevant. Among other things, Eghbal's post contends that most developers don't know that there is a formal definition of "open source" and that most developers don't care about software licenses (much less whether they happen to align with someone else's definition). What really matters to modern developers is "1) building and 2) collaborating in public." She proposes the term "public software" as a replacement that more accurately captures the important issues: that development is a visible process and that the product (be it code, data, or anything else) is accessible to anyone.

Mind the gap

As one might anticipate, that position frustrated many people who work in the open-source software world, particularly those involved in licensing work. In the first unconference session, moderated by Spencer Krum, an effort was made to suss out where the root of the disconnect lies, since it is clearly a trend among newer developers. Part of the problem, it seems, is that tools built by the open-source (and free-software) community have now become widely available to developers who are not motivated by the ideals of those movements. GitHub is a prime example; although it is built on Git and makes it easy to use workflows that grew out of the free-software movement, anyone can use it for almost any purpose.

Meanwhile, online collaboration tools have become more "frictionless" as time has gone by. Consequently, older projects that place a heavy emphasis on GPG-signed releases and an email-driven patch review process have stuck with one generation of tools, while newer projects have picked up a newer generation of tools like GitHub's web-based issue tracker. So there is a practical disconnect, which is then exacerbated when people say (as one attendee put it) "we do things this way because that's what Linus does."

Others suggested some blame belongs to the prominent organizations. The Linux Foundation, one person said, tries so hard to say "you don't have to worry about licenses and patents if you work with us"—in an effort to attract developers—that the importance of those details gets lost. Another noted that the Open Source Initiative (OSI) coined the term "open source" in the first place and thus holds the responsibility for letting it devolve into a generic term (as Eghbal's post contended).

OSI General Manager Patrick Masson was one of the session's attendees, and he pushed back on that last point. There is too much "open-washing" these days, he said, but it does not come from the OSI. There is still only one Open Source Definition; the dilution of the term comes from others who use "open" to describe organizations, workflows, processes, and other things unrelated to software licensing. "We have open hardware and open data, but also 'open cola' and 'open beer.' That blurs over an important distinction. Not everything fits."

Several others agreed with Masson, noting that misunderstandings about the importance of licensing belong not to the institutions, but to the individuals not giving enough thought to licensing issues. As Deb Nicholson put it, much of the legal framework of free software and open source has developed to fit intellectual property law. "Saying 'we don't care about licensing' is fine—until Oracle v. Google hits you."

A number of people felt that the new-contributor experience of projects was responsible for much of the gap between generations of developers. If a project says "follow these seventeen steps and then we'll accept your contribution," for example, quite a few potential developers will simply walk away. Another attendee added that the concern is not so much that projects have put up barriers to entry, but that there is a big difference between the ease of joining an established open-source project (such as the process for becoming a Debian Developer) and the ease of starting a new project of your own (such as pressing the "new repository" button on GitHub).

That led several to suggest that the OSI should provide guidance on contribution models, contributor agreements, and similar facets of the development process. Masson politely disagreed, as did several others. One noted that just as there is clearly no "one true license," there is no "one true contribution model." Perhaps, though, the community needs to do some work to determine which parts of the contribution process are essential to open-source development and which are not.

Quite a few other discussion threads came and went during the session. For example, it was noted that the majority of the people using GitHub are individuals, not projects. Many seem to use GitHub solely to store personal project files they have no interest in promoting to the public. Many others have a habit of forking existing repositories rather than collaborating. Consequently, the oft-cited statistic that most GitHub repositories have no license is misleading. Another person suggested that part of the divide comes from the fact that so many new projects are web-development efforts and software-as-a-service (SaaS) projects. There is an inherent gap between how those projects approach licensing and how non-web projects approach licensing.

Close the gap

The follow-up session about the open-source generation gap came the next day. Many of the attendees from the previous session returned; Nicholson acted as moderator. To start things off, she noted that most of the people at CLS (and certainly most at the session) fell into the "older" generation, but that it was important to recognize that there was work to be done on both sides. That said, there was a show of hands, and several people in the session self-identified as being part of the "younger" crowd.

A number of people felt that the disconnect over the value of software licensing indicated that the older generation had not been doing a good job of connecting incoming contributors to the necessary background material. The older generation needs to better document and explain the history of the free-software movement for new community members.

Along those same lines, several people felt that individual projects do a poor job of documenting the decisions that led to their current architecture and development practices. Consequently, new developers can get turned off by what they perceive as poor design, only to join newer projects that eventually make the same kinds of decisions as they mature. Explaining the "whys" of a project makes it easier for newcomers to get involved.

As an example, one attendee, who works at a large online retailer, noted that the company had routinely encountered such an attitude from its new hires, who were appalled to learn that the company's system architecture was a convoluted mess and wanted to redesign everything from scratch. So the company added a session to its new-employee orientation process, during which a senior engineer lets the new hires work through their own architectural decisions to design a system that handles all of the company's services. They learn to appreciate that services can start simple, but grow in complexity, and come away with a better understanding of the overall architecture.

Another suggestion was that established projects need to meet new contributors "where they're at," even when that entails leaving the project's comfort zone. Jenny Wong noted that the WordPress community has started holding "new contributor day" events, at which the experienced contributors meet with the newcomers and, together, work through the new-contributor documentation that the experienced folks themselves have written. That lets the two communities work together, and it lets the experienced coders see firsthand what struggles the new contributors encounter—including, notably, where the new-contributor documentation is falling short. Other attendees concurred that having "onboarding" documentation was important, but that it was equally important to encourage bug reports and patches to that documentation from new contributors as they work through it.

Among the other points raised during the session, attendees noted that it was important that the community distinguish between minting new project contributors and minting new free-software activists, and that it was important for projects to put a check on flamewar-style debates—particularly those that focus on dismissing certain technologies. It is easy for experienced developers to become attached to a language or framework, but there will always be new languages and projects popping up that are the entry points for new coders. Project members deriding language Y because it is not language X may only serve to tell newcomers that they are not welcome.

Time ran out in the second session well before the attendees had finished sharing all that they had to say. While there were plenty of thought-provoking points raised, this is sure to be a topic that the community continues to debate for some time to come. It will be particularly interesting to see how different subsets of the open-source community view the issues, since other samples might have different takes than those found among the CLS 2016 attendees.

A discussion on combining CDDL and GPL code

May 18, 2016

This article was contributed by Paul Brown

Within the context of an event dedicated to discussing free and open-source software (FOSS) legalities, such as the Free Software Legal & Licensing Workshop (LLW), the topic of conflicting licenses was bound to come up. The decision by Canonical to start shipping the ZFS filesystem with its Ubuntu server distribution back in February led to a discussion at LLW about distributing the kernel combined with ZFS. Discussions at LLW are held under the Chatham House Rule, which means that names and affiliations of participants are only available for those who have agreed to be identified. This year's LLW was held in Barcelona, April 13-15.

Both the CDDL, which is the license under which ZFS is distributed, and the GPLv2, the license used for the kernel, are considered open-source licenses, but they are, to some extent, incompatible with each other. Executable code that is licensed under the CDDL can be re-licensed, as described in section 3.5 of its text. The source code, however, must be distributed under the CDDL (section 3.1).

This conflicts with GPL section 2(b), which states: "You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License." This means that both the executable and the source code, and any modifications made to either, must be distributed under the GPL. Canonical does distribute ZFS as a Filesystem in Userspace (FUSE) module, in which case the kernel and the module are considered separate entities and no license conflict exists. But it also distributes ZFS as a native kernel module, which gives rise to a clash of licenses with regard to the source code.

The spirit of the licenses

Mishi Choudhary, from the Software Freedom Law Center, explained as much in her talk "ZFS Licensing". She noted that Linus Torvalds himself had been asked way back in 2003 whether the kernel developers considered a module to be a separate entity and, therefore, not subject to the GPL's copyleft clause that governed the rest of the kernel. Torvalds's answer was a resounding "no".

However, Choudhary considers the conflict between the two licenses to be resolvable and, along with her colleague, Eben Moglen, has published a paper explaining why. In her talk she explained that, even though a conflict existed, none of the parties, or even any third party, was being damaged if one or the other license was being infringed upon. The source code was still being freely distributed, regardless of whether it was being done under one license or the other. Therefore, even if the letter of one of the licenses was not being respected, at least "its spirit" was.

Choudhary and Moglen argued that enforcing a strict interpretation of the letter of both licenses would have, in this case, more negative consequences than positive. There are precedents in Western law for interpreting a contract or license in its spirit, rather than literally. In the case of the GPL, the spirit of section 2(b) is that the source code must be distributed under a copyleft license. As that is what happens under the CDDL, the spirit of the GPL's clause is respected.

This led to a rather heated Q&A session, with Moglen himself flying in from New York specifically to field questions on the matter. Before Choudhary's presentation, the attendees had listened to two talks that advocated taking a hard stance against all infringements and described how taking infringers to court had served to establish the GPL and related licenses as valid legal documents (see last week's article on enforcement and compliance of FOSS licenses).

At this point, the Q&A had moved on to a fishbowl format, in which attendees could come up and sit on stage to share their thoughts. Many did.

Multiple attendees felt that interpreting clauses in the GPL "in their spirit" would undo the enforcement work and weaken the license's legal standing, leading to a slippery slope that infringers could take advantage of. "Why," an infringer could argue, "is clause 2(b) not taken literally in the Canonical case, but it is for me?" Attendees also felt that, by not applying licenses literally, developers that had already licensed their work under the likes of the GPLv2 would wonder why they bothered; and those that were considering using the license would now be reluctant, because the move to interpreting the license in its spirit would make it seem vague and ambiguous.

Moglen stated that this was already happening, but for a different reason: companies he worked with were "fleeing the GPL" because of its inflexibility. He argued that this case was an example of how applying a literal interpretation of the clause would cause more damage than good to the collaborative model of development, which is the cornerstone of the free-software movement.

Many attendees saw the argument as a way to appease a prominent player that had violated a license, and as one that opened the door to a slew of "convenience" violations that would weaken strict licenses further. Moglen countered that he could imagine a scenario in which two FOSS-defending organizations entered into a dispute with each other over conflicting FOSS licenses and, if that happened, the legal war would tear the community apart.

That characterization may have been a mistake. Attendees took offense at the notion that, by defending the literal validity of the GPLv2, a license Moglen himself had been instrumental in creating, they would somehow be made responsible for a hypothetical demise of the free-software movement. As the morning sessions drew to a close, there was the prevalent feeling that the CDDL-GPL issue had opened a rift in the FOSS legal community.

Conclusion

The proliferation and complexity of FOSS licenses were bound to lead to a conflict sooner or later. As Choudhary and Moglen's talk and Q&A session showed, the question of whether it is better for the free-software community to always apply a strict interpretation of a license, or to take a more lenient approach when it conflicts with another FOSS license, remains unanswered. We will hopefully learn more as the conflict between ZFS and the kernel plays out.

[ The author would like to thank Red Hat and Intel for assisting with his travel expenses and the Free Software Foundation Europe (FSFE) for help during the event. ]

Page editor: Jonathan Corbet