
Development

FOSSology gains SPDX support

By Nathan Willis
July 3, 2013

A new release of the FOSSology source-code analysis tool is out. Although there have been minor updates, this is the first update in 2013 to bring additional functionality. The 2.0 release in 2012 marked a major shift for the project, debuting a new, more modular design and paving the way for faster releases. The newest update, version 2.2.0, includes a new permissions scheme and some usability improvements, but in the long run, the most notable feature in this release may be the improved compatibility with the Software Package Data Exchange (SPDX) standard for tracking software components, licenses, and copyrights.

FOSSology is designed to be a flexible platform for analyzing source code, but it is best known for its ability to scan large collections of files and pick out licenses and copyright statements. The resulting license and copyright information is then used to help an organization stay in compliance with the licensing requirements it inherits from upstream open source projects. However, there are other use cases. For instance, at LinuxCon Japan, Armijn Hemel mentioned using FOSSology to help automate the process of finding license violations in the source code of software shipped in embedded Linux devices. It is not hard to imagine the tool being adapted for other source-scanning tasks, such as assembling a list of the contributors needed to sign off on a license change.

Users can upload source packages to FOSSology, then queue scanning jobs that analyze the packages for various types of information handled by scanning "agents." As new code is added, components are updated, and trees are rearranged, these scans can be run periodically, to help check for problematic license combinations or missing information. The basic agents available include a license recognizer, a copyright recognizer, a MIME-type analyzer, and a package header parser (which looks for the packaging information defined for RPM or .deb files). However, users can write their own agents to scan for arbitrary information.

All of the agents work by matching text patterns, which is a tricky business, considering all of the ways a licensing statement could be phrased, and the wide assortment of licenses that may be encountered. FOSSology defines 600 or so licenses at the moment. Although copyright statements are sometimes less critical from a legal-compliance standpoint, recognizing them is also a pattern-matching game; FOSSology looks for text blocks that resemble copyright statements, as well as for email addresses and URLs.
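As a rough illustration of why this kind of matching is tricky, here is a minimal Python sketch of a pattern-based scanner. The regular expressions and the scan_text() function are hypothetical and written for this example only; they are not FOSSology's actual rules, which are considerably more elaborate.

```python
import re

# Hypothetical patterns for illustration only; not FOSSology's real rules.
# A "copyright-like" line: a copyright marker followed somewhere by a year.
COPYRIGHT_RE = re.compile(
    r"(?:copyright|\(c\)|©)\s.*?\b(?:19|20)\d{2}\b.*", re.IGNORECASE
)
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL_RE = re.compile(r"https?://[^\s<>\"')]+")


def scan_text(text):
    """Collect copyright-like lines, email addresses, and URLs from text."""
    findings = {"copyright": [], "email": [], "url": []}
    for line in text.splitlines():
        match = COPYRIGHT_RE.search(line)
        if match:
            findings["copyright"].append(match.group(0).strip())
        findings["email"].extend(EMAIL_RE.findall(line))
        findings["url"].extend(URL_RE.findall(line))
    return findings


sample = (
    "/* Copyright (C) 2013 Jane Doe <jane@example.org>\n"
    " * See http://example.org/license for details. */"
)
for kind, hits in scan_text(sample).items():
    print(kind, hits)
```

Even this toy version shows the weak spots: a statement split across two lines, or one that omits the year, would slip past the first pattern, which is the sort of variation a real scanner has to anticipate.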

Historically, FOSSology has been deployed on a web server backed by a PostgreSQL database, with multiple users uploading source code bundles and performing scans through the web UI. In October 2012, version 2.1.0 added a pair of command-line utilities, fo_nomos_license_list and fo_copyright_list, with which users could query the FOSSology database for license or copyright information. The command-line utilities free users from the web UI, make the FOSSology repository more accessible to scripting, and are reported to run faster. Execution speed can be a major issue with large repositories, where a scan run in the web UI could time out if it took too long. But in the 2.1.0 release the tools were limited in scope, since both required scanning an entire upload (that is, one package or source archive). The 2.2.0 release updates the utilities to accept a sub-tree as the starting node from which to perform a scan.

Version 2.2.0 also introduces a new permissions scheme that allows administrators to limit access to specific files on a per-file and per-user basis. The system implements its own set of internal user groups (i.e., separate from the Unix groups that may be associated with accounts); each user in a group can be granted read permission, write permission, and user/group-administration permission. The ability to upload source packages to the application is governed by a separate permission table, perm_upload, which grants upload permission for each folder to specific groups; each user gets his or her own group by default, which enables per-user upload restrictions. It is a fairly straightforward system, but it replaces the permission system used in previous releases (which bound permissions to each individual application plugin), so administrators may have to do some work migrating existing installations.
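The group-based model can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the Perm levels, their ordering, the PermissionTable class, and its methods are hypothetical and do not reflect FOSSology's actual schema or the real layout of perm_upload.

```python
from enum import IntEnum


class Perm(IntEnum):
    # Hypothetical ordered levels; the real values may differ.
    NONE = 0
    READ = 1
    WRITE = 2
    ADMIN = 3


class PermissionTable:
    """Toy model: groups contain users, and folders grant a level to groups."""

    def __init__(self):
        self.members = {}  # group name -> set of user names
        self.grants = {}   # (folder, group) -> Perm

    def add_user(self, user):
        # Each user gets a private group of the same name by default.
        self.members.setdefault(user, set()).add(user)

    def add_to_group(self, user, group):
        self.members.setdefault(group, set()).add(user)

    def grant(self, folder, group, perm):
        self.grants[(folder, group)] = perm

    def effective(self, user, folder):
        # The highest level granted through any group the user belongs to.
        best = Perm.NONE
        for (granted_folder, group), perm in self.grants.items():
            if granted_folder == folder and user in self.members.get(group, set()):
                best = max(best, perm)
        return best


table = PermissionTable()
table.add_user("alice")
table.add_user("bob")
table.add_to_group("alice", "legal")
table.grant("kernel", "legal", Perm.READ)
table.grant("kernel", "alice", Perm.WRITE)  # per-user grant via the private group
print(table.effective("alice", "kernel").name)
print(table.effective("bob", "kernel").name)
```

The per-user private group is what lets a single table express both group-wide and individual restrictions without a separate per-user mechanism.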

Licenses galore

There is, naturally, the usual collection of bugfixes and stability improvements in this release, plus the noteworthy addition of the ability to pull up the full text of a software license from within FOSSology itself (useful for those rare users who do not have the differences between GFDL v1.1 and GFDL v1.2 memorized, no doubt).

But the bigger news item on the license-presentation front is the fact that FOSSology has migrated its list of license names to be compatible with the canonical list supported by SPDX. The SPDX project is a relatively new effort (dating back to 2010); it defines a metadata format for describing the "bill of materials" of a software package, including everything from its creator and definitive name to its URL of origin and file checksums. In the list of mandatory items, as one might guess, is the "concluded license" that governs the package as a whole. SPDX is meant to be both human-readable and machine-parsable (RDF is the preferred file format), so the specification includes a list of open source licenses.
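To make the "bill of materials" idea concrete, here is a minimal Python sketch of the kind of record described above. The PackageRecord and FileEntry classes, their field names, and the uses_known_license() helper are hypothetical and invented for this example; they are not the SPDX schema or any tool's API, and the license names are sample values only.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    path: str
    sha1: str  # hex digest of the file contents


@dataclass
class PackageRecord:
    # Hypothetical field names loosely modeled on the items listed above.
    name: str
    creator: str
    origin_url: str
    concluded_license: str
    files: list = field(default_factory=list)

    def add_file(self, path, content):
        """Record a file along with a checksum of its contents."""
        digest = hashlib.sha1(content).hexdigest()
        self.files.append(FileEntry(path, digest))


def uses_known_license(record, known_names):
    """True if the concluded license is on the shared list of names."""
    return record.concluded_license in known_names


record = PackageRecord(
    name="demo",
    creator="Tool: example-scanner",
    origin_url="http://example.org/demo.tar.gz",
    concluded_license="GPL-2.0",
)
record.add_file("README", b"")
print(record.files[0].sha1)
print(uses_known_license(record, {"GPL-2.0", "MIT"}))
```

The membership check hints at why a canonical list of names matters: two tools can only compare their conclusions about a package if they spell the license the same way.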

SPDX is also in use by a few other source analysis tools, such as the Ninka scanner and the commercial tools used by Black Duck Software. The specification is written by a Linux Foundation workgroup, which is currently drafting a new revision.

What SPDX support brings with it is the ability to use FOSSology data in conjunction with other tools based on sharing a common file format. The license-compliance problem is no longer one that organizations can ignore. Last week, Harald Welte won a GPL infringement case in Germany in which the court held that the violator had to ascertain on its own that it was in compliance with the licensing requirements it inherited from upstream suppliers. In other words, even if a device maker contracts out the software to a third party, it is still required to verify that the source code it offers in compliance with the GPL actually corresponds to the software on the device. For a device maker that does not do development itself, that could be a tricky undertaking. But with independent tools able to report licensing information in a compatible format, the problem becomes easier (although still not trivial) to solve.

For its part, FOSSology has adopted SPDX's names for the licenses already on its list of recognized licenses, and the 2.2.0 release notes comment that the application also added support for a few SPDX licenses not previously recognized by its license agent. FOSSology is most certainly a specialist's tool at this stage, but the refactoring that went into the 2.x series may make it useful for a wider variety of applications, if developers write scanning modules of their own to look for interesting nuggets buried in the source code. There was a one-year wait between version 1.4 and 2.0, but in the year since, the project has picked up the pace and delivered two stable releases with functional additions. Hopefully, that signals a platform that more developers will wish to contribute to. After all, the free software community is (justifiably) nitpicky where licenses and copyrights are concerned, but there are far more potentially useful bits of information to glean from a corpus of source code, given the proper tool to find them.


Brief items

Quote of the week

A friend asked yesterday if I knew of a tool to print a web page as a single-page PDF, i.e., making the PDF page as tall as necessary to keep everything on one page.

As a result, I know that Obnam's bug list is 4915 mm tall.

Lars Wirzenius

We want to thank all our loyal fans.
Google, after shutting down Google Reader.


Qt 5.1 released

Version 5.1 of the Qt toolkit has been announced. "We have added many new modules that largely extend the functionality offered in 5.0. The new Qt Quick Controls and Qt Quick Layouts modules finally offer ‘widgets’ for Qt Quick. They contain a set of fully functional controls and layout items that greatly simplify the creation of Qt Quick based user interfaces."


Upstart 1.9 released

Version 1.9 of the Upstart init-replacement has been released. This version adds support for AppArmor through two new stanzas, adds a stateful re-exec, and allows inherited environment variables to be un-set for Session inits. In addition, a new D-Bus signal bridge has been added, as has a client library (libupstart) through which applications can communicate with Upstart.


systemd 205 available

A new version of the systemd init-replacement has been released. Version 205 includes "a number of major new concepts, such as transient units, scopes and slices, which turn systemd into something that is far more dynamic than it ever was," adds a new systemd-run tool, and is the first release in which systemd assumes management of control groups (cgroups).


GNUstep Objective-C Runtime 1.7 available

Version 1.7 of the GNUstep Objective-C runtime has been released. Changes include the move to a CMake-based build system, a CTest-based test suite, and significant improvements in property introspection. The test suite itself has also been improved, as has integration with libdispatch and with foreign exceptions (e.g., exceptions from C++). Finally, MIPS64 is now supported in the assembly routines.


Rust 0.7 released

Version 0.7 of the Rust language is out. "This release had a markedly different focus from previous releases, with fewer language changes and many improvements to the standard library. The highlights this time include a rewrite of the borrow checker that makes working with borrowed pointers significantly easier and a comprehensive new iterator module (std::iterator) that will eventually replace the previous closure-based iterators." See the release notes for details.


Newsletters and articles

Development newsletters from the past week


Akademy 2013 Keynote: Jolla's Vesa-Matti Hartikainen (KDE.News)

KDE.News has an interview with Jolla engineer Vesa-Matti Hartikainen who will be giving a keynote at KDE's Akademy conference in mid-July. The interview covers various topics, from the history of Jolla (and how it came out of MeeGo and the N9 Nokia phone efforts) to the use of Qt in Jolla's Sailfish OS. "For Jolla, Qt is a first class citizen. For developing apps using QML, we put a lot of effort into making them as good as possible. We have an awesome team working on the Sailfish Silica component set. It includes many of the original core developers of the QML language and runtime. And we have really experienced app developers from N9 and other Nokia projects. On the middleware level, a lot of the lower level APIs now have quite good Qt bindings for C++ developers."


Swift: The Easy Scripting Language for Parallel Computing (Linux.com)

Linux.com introduces the Swift parallel scripting language. It may offer some assistance in solving the parallel programming problems noted by Andreas Olofsson in his keynote at this year's Linux Foundation Collaboration Summit. "Swift plays a simple but 'pervasively parallel' coordination role to create the upper level logic of more complex applications, [Argonne National Laboratory and the University of Chicago's Michael] Wilde said. 'It makes it very easy to parallelize what we often call the "outer loops".' Highly parallel applications can thus be composed by gluing together serial algorithms because Swift creates the parallelism automatically at runtime, without explicit direction from the programmer. It does this by first encapsulating the applications that are called within a script as 'functions' with uniform interfaces, and then applying automatic data flow, he said."


Nemeth: How Google pulled the plug on the public Jabber Network

At his blog, Adam Nemeth has harsh words to share about Google's recent decision to move away from the XMPP instant messaging protocol. Specifically, he criticizes XMPP itself: "Jabber failed to provide good enough spam protection, failed to provide a scalable protocol, failed to provide easy transfer of accounts between providers (if I change e-mail address, I don't have to re-add all my friends, it's enough to set a simple forward or inbox pulling - that's not true for Jabber IDs!)." The result, he argues, was that client application developers never found the protocol all that compelling.


Heilmann: The Fox is out of the bag #FirefoxOS

On his blog, FirefoxOS developer Christian Heilmann reflects on why he is excited about the phone operating system in light of the announcement of the first FirefoxOS smartphones. One of five things he highlights: "FirefoxOS does not assume a fast, stable and always available connection. When traveling I start hating my Android phone which I love to bits otherwise. Having dozens of megabyte updates over roaming is out of the question and neither is using flaky and slow wireless connections. Firefox OS has no native apps – all of them, including the system apps are written in HTML, CSS and JavaScript. Thus they are much smaller and can have atomic updates instead of having to be replaced as a unit every single time."


Page editor: Nathan Willis

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds