LWN.net Weekly Edition for March 12, 2015
File sharing, streamlining, and support plans with ownCloud 8
Version 8.0 of the ownCloud web-service platform was released in February. As was the case with previous releases, a basic installation offers a variety of cloud-like services for managing information: shared file storage, contact and calendar synchronization, online document editing, and so forth. The project also supports an API on top of which a variety of third-party web apps can run. The new release brings with it a renewed effort to make installing and managing these add-on apps easier and more reliable, plus several tools to make running one's own, private ownCloud server simpler. Finally, the company that underwrites ownCloud's development has announced that users who run such private server installations will be able to purchase support plans—something that was previously reserved only for enterprise customers.
The 8.0 release comes about eight months after the last major update, 7.0. The project makes builds available in a variety of formats, from source archives to installer bundles intended for use on shared web hosting plans. Packages for a variety of Linux distributions are also available for download. There are desktop applications available for managing shared folders, and an Android app for device synchronization (the app, interestingly enough, is a for-pay offering in Google's Play Store, but is available for free through F-Droid).
Users interested in testing out ownCloud 8 on a publicly reachable server (as opposed to installing it locally on their own machine) also have an opportunity to do that. The project has a three-hour "test drive" program available through a web hosting provider. The trial offers 1GB of storage space and is fairly painless to set up (although one must still walk through the hosting company's full setup process, including frustrating steps like trying to guess at an available subdomain name).
There are a few changes in the project's release practices worth pointing out, though. First, in the past, there were two separate editions of ownCloud: the Community Edition and the Enterprise Edition—the latter being aimed at businesses and coupled with paid support plans from ownCloud, Inc. As of 8.0, the Community Edition has been renamed "ownCloud server" (although not all of the references on the web site have been updated to reflect this).
There are still functional differences between the offerings: the Enterprise version features integration with services likely to be necessary in corporate IT environments (like Microsoft SharePoint and Oracle databases), and it adds support for using some different file-storage back-ends (including Amazon S3 and Ceph) as primary storage. But, as of the 8.0 release, the extra functionality in the Enterprise edition comes via a separate set of Enterprise apps and different default configuration, not from a different server codebase. And non-Enterprise users can still use Amazon S3 and Ceph for storage—they simply do not come configured as the primary back-end storage layers.
The second change is that, starting with version 8, the project is moving to a time-based release schedule with an accompanying version-numbering scheme. Version 8.1 is scheduled to arrive in three months, followed by two more quarterly point releases (8.2 and 8.3), with 9.0 set to arrive one year from now.
Last, but certainly not least, ownCloud Inc. has announced that it will offer commercial support plans for users running the "server" (i.e., non-Enterprise) version of ownCloud 8. The support plans are on the low end compared to the Enterprise offerings—users get email support only, and only during 8-to-5 business hours (those hours being measured from offices in Europe or on the East or West coasts of the US). But that is still, hopefully, a more reliable tech-support avenue than asking questions on a community mailing list or IRC channel, and it may produce another revenue stream to support development.
So far, the company has managed to not build different features into the community edition and enterprise edition of the server, which is reassuring to see. Prior to version 8, there was an additional API in the enterprise edition; as will be discussed later, this has now been merged into the community version, too. There are also community-built substitutes available for several of the enterprise apps (such as logging or Shibboleth authentication).
To the cloud
All in all, the changes found in the 8.0 release fall into a few general categories. A lot of work has gone into making user-interface (UI) improvements, both on the user-visible side and in the administrative interface. There are also a handful of new and updated features. Finally, the new release integrates some changes to the way third-party apps are designed and deployed—changes that may primarily interest app developers at present, but should make for a better user experience in the long run.
On the UI front, there is a new interface for working with shared files. In the web interface, one can open a pop-up dialog for each stored file and folder to change the sharing settings. There is a download link to provide to everyone who needs access to the file, plus straightforward password-protection and time-expiration checkboxes to limit that access when necessary. Any active sharing enabled for a file is also visible in the file browser thanks to an indicator that appears next to the file name.
There is also a "favorites" feature that, at the moment, is fairly limited in scope: the user can star files in the main file browser, then access these "favorite" files in a separate sidebar. But the project indicates that there is more to come here: "favorites" are just the first metadata field tracked by the application. The plan is to roll out additional metadata filters (like "recently used" and "recently changed") in future updates.
The 8.0 release notes also tout an improved search interface, although my tests found this feature to be a mixed bag. It is, indeed, remarkably fast at showing search results (and the search box is available on every screen, which is key). But it only appears to search the contents of the current folder—not including subfolders—which leaves quite a bit to be desired. That is particularly frustrating because the release notes include a screenshot indicating that ownCloud-wide search ought to be supported.
Interface improvements are available on the administrative side as well, which (in a practical sense) is likely to be just as important as UI improvements on the user side, considering how many early ownCloud users run their own server. In particular, the various administrative tasks have been streamlined into a single page with handy links in the sidebar to the important sections. There are also improved tools for managing large numbers of user accounts and user groups, letting administrators search and sort on multiple fields, apply changes to multiple selected users, edit existing group names, and so on; these features were unsupported in the past.
Finally, app installation has been significantly simplified. The available third-party apps are listed in an app-browser reminiscent of Firefox's current add-on browser. Each available app has a single "install" button, version and update information is clearly listed for each app, and there is a one-click tool for restricting access to each app by user group.
Behind the clouds
Under the hood, the revamped app-management system also marks a functional change. In previous ownCloud releases, the download bundle included an entire suite of add-on apps that were not enabled in the default settings. That made activating them rapid, of course, but it also made for a much larger download. Starting in version 8.0, only the basic file-storage and sync apps come built in; all of the others (including standard apps developed by the project, like Calendar and Contacts) are downloaded when they are installed from the web interface.
Another set of less-visible changes affects file sharing. Starting with version 8.0, file sharing supports federation; that is, a folder can be shared directly between two ownCloud instances running on different hosts, not just between one ownCloud instance and a desktop machine. Users set up a federated share by entering otherusername@remoteOwnCloudServer.example.com in the "Share with a user or group" field. At the moment, that relies on the user already knowing the correct username and address of the other ownCloud server, but it is a step in the right direction, and is more secure than emailing a public link to the folder in question.
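As a rough illustration of the address format described above, here is a minimal sketch that splits a federated share target into its user and server parts. The function and type names are hypothetical, and the validation rules (splitting on the last "@", lowercasing the host) are assumptions rather than ownCloud's actual behavior.

```python
from typing import NamedTuple


class FederatedId(NamedTuple):
    user: str
    host: str


def parse_federated_id(value: str) -> FederatedId:
    """Split 'user@server' on the last '@' (assumed rule, not ownCloud's own parser)."""
    user, sep, host = value.strip().rpartition("@")
    if not sep or not user or not host:
        raise ValueError(f"not a user@server address: {value!r}")
    # Host names are case-insensitive, so normalize them; user names are left alone.
    return FederatedId(user=user, host=host.lower())


share_target = parse_federated_id("otherusername@remoteOwnCloudServer.example.com")
print(share_target.user, share_target.host)
```

A real implementation would presumably also need to confirm that the remote host is a reachable ownCloud instance, which this sketch does not attempt.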
The other new file-sharing feature is support for downloading a file directly from its underlying storage (e.g., Dropbox, Amazon's S3, a Gluster server). By bypassing the need to funnel the download through the ownCloud server, this should significantly speed up file access when large groups of people work on the same set of files, or for ownCloud servers that simply have a lot of user accounts.
For third-party app developers, ownCloud 8.0 also includes some changes to app packaging and development. Dependency management is now built into ownCloud server; an app needs to include a list of any dependencies in an XML file, but the ownCloud server will automatically resolve those dependencies (where possible) when a user installs an app. That includes dependencies on underlying system tools (such as a database version or library) and specific PHP extensions, as well as simpler dependency issues like ensuring that the correct version of ownCloud itself is running on the server.
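To make the idea of a declared dependency list concrete, the following sketch checks a small XML manifest against a description of the server. The manifest layout, element names, and the `unmet_dependencies()` helper are illustrative assumptions, not the project's documented schema or code; the sketch only reports unmet requirements and does not try to resolve them.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest; element and attribute names are assumptions for illustration.
APP_INFO = """\
<info>
  <id>example_app</id>
  <dependencies>
    <php min-version="5.4"/>
    <owncloud min-version="8.0" max-version="8.0"/>
    <database>mysql</database>
    <lib>curl</lib>
  </dependencies>
</info>
"""


def as_tuple(version):
    """Turn '8.0.1' into (8, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def unmet_dependencies(xml_text, server):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    deps = ET.fromstring(xml_text).find("dependencies")
    if deps is None:
        return []
    problems = []
    for name in ("php", "owncloud"):
        node = deps.find(name)
        if node is None:
            continue
        have = as_tuple(server[name])
        low, high = node.get("min-version"), node.get("max-version")
        # Compare only as many components as the bound specifies (8.0.1 satisfies max 8.0).
        if low and have[:len(as_tuple(low))] < as_tuple(low):
            problems.append(f"{name} {server[name]} is older than {low}")
        if high and have[:len(as_tuple(high))] > as_tuple(high):
            problems.append(f"{name} {server[name]} is newer than {high}")
    databases = [node.text for node in deps.findall("database")]
    if databases and server["database"] not in databases:
        problems.append(f"database {server['database']} is not one of: {', '.join(databases)}")
    for node in deps.findall("lib"):
        if node.text not in server["libs"]:
            problems.append(f"missing library {node.text}")
    return problems


server = {"php": "5.6.3", "owncloud": "8.0.1", "database": "sqlite", "libs": {"curl"}}
print(unmet_dependencies(APP_INFO, server))
```

Returning a list of problems, rather than failing on the first one, lets an installer show the administrator everything that needs attention in a single pass.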
There have also been a number of cleanups to the app API, with an emphasis on providing a more stable and predictable platform for app developers. Evidently, in previous releases, it was far from uncommon for a third-party app to rely directly on ownCloud's internal PHP classes and methods, leading to obvious stability problems across upgrades. The project has updated its developer documentation and tutorials to reflect this; users may only notice the change when they encounter less breakage in third-party apps.
There is also one entirely new API available in ownCloud 8.0: the user provisioning API, which enables external tools to query and change various user account settings like storage quotas, and to create or modify users and groups. It is most useful from an administrative standpoint, but it is interesting to note that the API was originally an Enterprise-Edition-only feature that has now been added to the non-Enterprise edition.
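For a sense of how an external tool might use such an API to change a storage quota, here is a sketch that only constructs the HTTP request without sending it. The endpoint path, the `key`/`value` form fields, and the use of HTTP basic authentication are assumptions made for illustration and should be checked against the ownCloud documentation; the host name and credentials are placeholders.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def quota_request(base_url, admin, password, user, quota):
    """Build (but do not send) a request that would set a user's storage quota."""
    # Assumed endpoint layout; verify against the server's API documentation.
    url = f"{base_url.rstrip('/')}/ocs/v1.php/cloud/users/{quote(user, safe='')}"
    body = urlencode({"key": "quota", "value": quota}).encode("ascii")
    token = base64.b64encode(f"{admin}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


req = quota_request("https://cloud.example.com", "admin", "secret", "alice", "5GB")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen()` would perform the call; keeping construction separate makes the request easy to inspect before pointing it at a live server.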
Evaluating the changes in ownCloud 8.0 can be a subjective affair. What one gets out of ownCloud depends on how one intends to use it. As a replacement for proprietary cloud services like Google Drive and Google Calendar, the latest version is easy to use and just as powerful. How one feels about all the additional apps might vary somewhat—I found the Documents collaborative-editor app to be a bit more awkward and less integrated, for instance.
But the project is doing well to focus on the core—whatever other apps anyone uses, everyone needs access to files of some sort. It will also be interesting to see how the support plans for non-Enterprise customers fare as a fundraising endeavor. Other free-software web-application projects would, no doubt, like to find a reliable revenue stream that does not hinge on "open core" shenanigans or charging for commodities like file storage. Perhaps lightweight end-user support, if done right, could be just such an opportunity.
A GPL-enforcement suit against VMware
When Karen Sandler, the executive director of the Software Freedom Conservancy, spoke recently at the Linux Foundation's Collaboration Summit, she spent some time on the Linux Compliance Project, an effort to improve compliance with the Linux kernel's licensing rules. This project, launched with some fanfare in 2012, has been relatively quiet ever since. Karen neglected to mention that this situation was about to change; that had to wait for the announcement on March 5 of the filing of a lawsuit against VMware alleging copyright infringement for its use of kernel code. This suit, regardless of its outcome, should help to bring some clarity to the question of what constitutes a derived work of the kernel.
In her talk, Karen said that the Conservancy gets "passionate requests"
for enforcement of the GNU General Public License (GPL) from two distinct
groups: "ideological developers" and corporate general counsels. The
interest from the developers is clear: they released their code under the
GPL for a reason, and they want its terms to be respected. On the other
hand, a typical general counsel releases little code under any license. Their
interest, instead, is in a demonstration that the GPL has teeth so that they
can be taken seriously when they tell management that the company must
comply with the license terms of the code it ships.
The VMware suit should bring some comfort to both groups, in that it targets the primary product of a prominent company that has long been seen in some circles as pushing the boundaries of the GPL. But, beyond that, the suit will be of interest to the larger group of people that would like more clarity on just where the "derived work" line is drawn.
The complaint
The complaint has been filed in Hamburg, Germany, in the name of kernel developer Christoph Hellwig; the Conservancy is helping to fund the case and the lawyer involved is Till Jaeger, who also represented Harald Welte in his series of successful compliance cases. It focuses on the "vmkernel" component of VMware's vSphere ESXi 5.5.0 hypervisor product — one of VMware's primary sources of revenue.
VMware openly uses Linux as part of the ESXi product, and it ships the source for (presumably) all of the open-source components it uses; that code can be downloaded from VMware's web site. But ESXi is not a purely open-source product; it also contains a proprietary component called "vmkernel." The bootstrap process starts with Linux, which loads a module called "vmklinux." That module, in turn, loads the vmkernel code that does the actual work of implementing the hypervisor functionality. [Update: in truth, newer versions of ESXi no longer need the initial Linux bootstrap; in current versions, vmkernel boots directly.]
To many, the mere fact that vmkernel was once loaded into the kernel by a module is enough to conclude that it is a derived product of the kernel and, thus, only distributable under the terms of the GPL. That would make an interesting case in its own right, but there is more to it than that. It would seem that vmkernel loads and uses quite a bit of Linux kernel code, sometimes in heavily modified form. The primary purpose of this use appears to be to gain access to device drivers written for Linux, but supporting those drivers requires bringing in a fair amount of core code as well.
If one downloads the source-release ISO image from the page linked above and untars vmkdrivers-gpl/vmkdrivers-gpl.tgz, one will find these components under vmkdrivers/src_92/vmklinux_92. There is some interesting stuff there. In vmware/linux_rcu.c, for example, is an "adapted" version of an early read-copy-update implementation from Linux. vmware/linux_signal.c contains signal-handling code, vmware/linux_task.c contains process-management code (including an implementation of schedule()), and so on. Of particular interest to this case are linux/lib/radix-tree.c (a copy of the kernel's radix tree implementation) and several files in the vmware directory containing a modified copy of the kernel's SCSI subsystem. Both of these subsystems carry Christoph's copyrights and, thus, give him the standing to pursue an infringement case against VMware.
The picture that emerges suggests that vmkernel is not just another binary-only kernel module making use of the exported interface. Instead, VMware's developers appear to have taken a substantial amount of kernel code, adapted it heavily, and built it directly into vmkernel itself. It seems plausible that, in a situation like this, the case that vmkernel is a derived product of the Linux kernel would be relatively easy to make.
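Unfortunately, we cannot see the complaint itself, because "court proceedings are not public by default in Germany (unlike in the USA)", according to the FAQ maintained by the Conservancy.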
A service to the community
In her talk, Karen stated that litigation is the Conservancy's last resort
after every other approach fails to obtain compliance. Certainly there can
be no accusations of a rush to litigation here; the first indications
of trouble emerged in 2007. The Conservancy raised the issue with
VMware a number of times with no luck.
Christoph approached VMware in August 2014
with his own request for compliance, starting a series of communications
that did
not lead to an agreement. There was a meeting in December where, it is
said, VMware wanted to propose a settlement but only under strict
non-disclosure terms — terms which Christoph refused. So, it seems, going
to court is about the only remaining option.
One might wonder about the choice to file in Germany, a question that
the FAQ addresses.
It is worth adding that Germany's courts seem to be relatively friendly
toward this sort of claim, with the result that previous GPL-enforcement
cases filed there have tended to go well for the plaintiffs. The ability
to pick the battlefield is a powerful advantage in a dispute of this
nature.
Filing an enforcement lawsuit is an intimidating prospect for a number of
reasons. Karen's talk noted that there is a lot of tension around the topic of
GPL enforcement. Some people would rather that it were not done at all,
seeing it as an incentive for companies to avoid GPL-licensed code. There
are not many developers who want to make a stand in an enforcement effort;
the Linux Compliance Project, she said, contains a number of kernel
developers, but almost none of them want to stick their necks out in an
actual enforcement effort.
But, she said, there is value in such efforts. Companies worldwide spend
vast amounts of money to ensure that they are in compliance with
free-software licenses. In the absence of enforcement, some will certainly
question the value and necessity of that expense — and some will decide not
to bother. There are also highly successful projects that have resulted
from enforcement efforts; router distributions like OpenWrt are usually
featured at the top of that list. GPL enforcement, by making it clear that
everybody needs to play by the rules, is, she said, performing a service to
the community as a whole.
How that service plays out in this case is going to be interesting to
watch, which is good, since we are likely to be watching for some time.
Given that ESXi is at the core of VMware's business, VMware seems unlikely to
either release the code or withdraw the product willingly. So the case may
have to go all the way through trial, and perhaps through appeals as well.
But, at the end, perhaps we'll have a clearer idea of what constitutes a
derived product of the kernel; that could be seen to be a useful service
even if the enforcement effort itself fails.
GitHub unveils its Licenses API
Since opening its doors in 2008, GitHub has grown to become the largest
active project-hosting service for open-source software. But it has
also attracted a fair share of criticism for some of its
implementation choices—with one of the leading complaints being
that it takes a lax approach to software licensing. That, in turn,
leads to a glut of repositories bearing little or no licensing
details. The company recently announced a new tool to help combat the
license-confusion issue: a site-wide API for querying and reporting
license information. Whether that API is up to the task, however,
remains to be seen.
None of the above
By way of background information, GitHub does not require users to
choose a license when setting up a new project. An existing project
can also be forked into a new repository with one click, but nothing
subsequently prevents the new repository's owner from changing or
removing the upstream license information (if it exists).
From a legal standpoint, of course, the fork inherits its
license from upstream automatically (unless the upstream project is
public domain or under some other less-common license). But from a
practical standpoint, this provenance is difficult to
trace. Throw in other GitHub users submitting pull requests for
patches that have no license information, and one has a recipe for
confusion.
The bigger problem, however, is that the majority of GitHub repositories
carry no license information at all, because the users who own them
have not chosen to add such information. In 2013, GitHub introduced
its first tool designed to combat that issue, launching ChooseALicense.com, a web site
that explains the features and differences of popular FOSS licenses.
ChooseALicense.com allows GitHub users to select a license, and the GitHub
new-project-configuration page has a license selector, but using it is
not obligatory. In fact, the ChooseALicense.com home page includes
a "no license" choice as its last option.
That "no license" link, incidentally, attempts to explain the downside of selecting no license—most notably, it strongly discourages other
developers (both FOSS and proprietary) from using or redistributing
the code in any fashion, for fear of getting entangled in a copyright
problem. But the page also points out that the GitHub
terms
of service dictate that other users have the right to view and
fork any GitHub repository.
One could probably quibble endlessly over the details of
ChooseALicense.com and its wording. The upshot, though, is that it
did not have a serious impact on the license-confusion problem. A
March 9 post
on the GitHub blog presented some startling statistics: fewer than 20%
of GitHub repositories have a license, and that percentage is declining.
The introduction of the license-selection tool in 2013 produced a
spike in licensed repositories, followed by a downward trend that
continues to the present. The post also included some statistics on license
popularity; the three licenses featured most prominently on the
license-chooser site (MIT, Apache, and GPLv2) are, unsurprisingly, the
most often selected.
This data set, however, is far from complete; as the post
explains, the team only logged licenses that were found in a file
named LICENSE, and only matched that file's contents against
a short set of known licenses. Nevertheless, GitHub did evidently
determine that the problem was real enough to warrant a new attempt at
a solution.
A new interface
The team's answer is a new site-wide API called, fittingly, the Licenses API.
It is currently in preview, which means that interested developers
must supply a special HTTP header with any requests in order to access it.
But the API is, at least currently, a frustratingly limited one,
offering just three functions.
Arguably the biggest limitation is that, as was the case with the statistics
gathered for the blog post, the license of a repository is determined
only by examining the contents of a LICENSE file. On the
plus side, the license information returned by the API conforms to the
Software Package Data Exchange (SPDX) specification, which should make it easy to integrate with
existing software.
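The preview-header requirement mentioned above might look something like the following sketch, which builds a request for a repository's metadata and pulls a license identifier out of a response body. The media type string, the endpoint, and the `license`/`spdx_id` field names are assumptions made for illustration and should be checked against GitHub's API documentation; the sample payload is invented.

```python
import json
from urllib.request import Request

# Assumed preview media type; confirm against GitHub's documentation before relying on it.
PREVIEW_ACCEPT = "application/vnd.github.drax-preview+json"


def license_request(owner, repo):
    """Build (but do not send) a repository request that opts in to the preview."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    return Request(url, headers={"Accept": PREVIEW_ACCEPT})


def spdx_id(repo_json):
    """Return the SPDX identifier from a repository payload, or None if absent."""
    info = json.loads(repo_json).get("license") or {}
    return info.get("spdx_id")


# Invented sample payload, shaped the way this sketch assumes the response looks.
sample = '{"name": "example", "license": {"key": "mit", "spdx_id": "MIT"}}'
req = license_request("example-owner", "example")
print(req.full_url, req.get_header("Accept"), spdx_id(sample))
```

Treating a missing or null `license` field as "no license detected" keeps callers from having to distinguish between the two cases.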
To be sure, determining and counting licenses is not a simple
matter—as many in the community know. In 2013, for example, a
pair of presentations at the Free Software Legal and Licensing
Workshop explored several strategies for
tabulating statistics on FOSS license usage. Both presentations ended
with caveats about the difficulty of the problem—whatever
methodology is used to approach it.
Nevertheless, the GitHub Licenses API does appear to be strangely
naive in its approach. For example, it is well-established that a
significant number of projects place their license in a file named
COPYING, rather than LICENSE, because that has long
been the convention used by the GNU project. Even scanning for that
filename (or other obvious candidates, like GPL.txt) would
enhance the quality of the data available significantly. Far better
would be allowing the repository owner to designate what file contains
the license.
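A slightly less naive lookup along the lines suggested above could be sketched as follows. The list of candidate names, the top-level-only rule, and the `designated` override are all assumptions chosen for illustration; nothing here reflects how GitHub actually locates license files.

```python
from pathlib import PurePosixPath

# Candidate base names in priority order (an assumed list, easily extended).
CANDIDATE_STEMS = ("LICENSE", "COPYING", "GPL")


def stem_of(path):
    """Upper-cased file name with everything after the first dot removed."""
    return PurePosixPath(path).name.split(".", 1)[0].upper()


def find_license_files(paths, designated=None):
    """Pick likely license files from a list of repository paths.

    If the owner has designated a file, trust that choice when the file exists;
    otherwise fall back to well-known names in the top-level directory.
    """
    if designated is not None:
        return [designated] if designated in paths else []
    hits = [
        path for path in paths
        if len(PurePosixPath(path).parts) == 1 and stem_of(path) in CANDIDATE_STEMS
    ]
    # Order by candidate priority, then by name for a stable result.
    return sorted(hits, key=lambda path: (CANDIDATE_STEMS.index(stem_of(path)), path))


print(find_license_files(["README", "COPYING", "src/LICENSE", "GPL.txt"]))
```

Matching on the name alone says nothing about which license the file contains; the contents would still have to be compared against known license texts.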
Furthermore, the Licenses API could be used to accumulate more
meaningful statistics, such as which forks include different license
information than their corresponding upstream repository, but there is
no indication yet that GitHub intends to pursue such a survey. It may
fall on volunteers in the community to undertake that sort of
work. There are, after all, multiple source-code auditing tools that are
compatible with SPDX and can be used to audit license information and
compliance. Regrettably, the GitHub Licenses API does not look like it will
lighten that workload significantly, since the information it returns
is so restricted in scope.
Power to choose
GitHub is right to be concerned about the paucity of license
information in the repositories hosted at its site. But both the
2013 license chooser and the new Licenses API seem to
stem from an assumption on GitHub's part that the reason so many
repositories lack licenses is that license selection is either
confusing or difficult to find information on. Neither effort strikes
at the heart of the problem: that GitHub makes license selection
optional and, thus, makes licensing an afterthought.
SourceForge has long required new projects to select a license while
performing the initial project setup. Later, when Google Code
supplanted SourceForge as the hosting service of choice, it, too,
required the user to select a license during the first step. So too
do Launchpad.net, GNU Savannah, and BerliOS. FedoraHosted and Debian's
Alioth both involve manually requesting access to create a new
project, a process that, presumably, involves discussing whether or
not the project will be released under a license compatible with that distribution.
It is hard to escape the fact that only GitHub and its direct
competitors (like Gitorious and GitLab) fail to raise the licensing
question during project setup, and equally hard to avoid the
conclusion that this is why they are littered with so many
non-licensed and mis-licensed repositories. An API for querying
licenses may be a positive step, but it is not
likely to resolve the problem, since it side-steps the underlying
issue.
Hopefully, the current form of the Licenses API is merely the
beginning, and GitHub will proceed to develop it into a truly useful
tool. There is certainly a need for one, and being the most active
project-hosting provider means that GitHub is best positioned to do
something about it.
Page editor: Jonathan Corbet
