
Leading items

Estimating the costs of open-source development

By Nathan Willis
October 7, 2015

ELCE

At the 2015 Embedded Linux Conference Europe in Dublin, Paul Sherwood from Codethink presented an intriguing take on the common problem of estimating the length of time that a development project will take. In particular, he called out the widespread Constructive Cost Model (COCOMO) as being demonstrably unscientific and, thus, useless at predicting the future, then presented a far simpler alternative metric based on Git activity. The proposed new metric, he said, can be shown to be useful by examining well-known open-source software projects, including the Linux kernel.

At the beginning, Sherwood acknowledged that he was "biting the hand that feeds him" by challenging some statistics frequently cited by, among others, the Linux Foundation (LF). He has seen it claimed [note: signup required at link], he said, that it costs an average of $250,000 per year to maintain an out-of-tree kernel patch, and that the total amount of developer time [PDF] invested in creating core Linux projects is the equivalent of 1,356 developers working for 30 years. But the $250,000 number sounds dubious and the 1,356 sounds far too specific to be an "estimate"—and neither value comes with any supporting documentation.

[Paul Sherwood]

Similarly, estimates of the value of Linux or even the number of lines of code (LOC) in the kernel are often quoted without hard evidence to back them up, a practice that clearly ruffles Sherwood's feathers. For example, the LF said in 2013 that the kernel contained about 17 million lines of code, Sherwood said. But when he ran David Wheeler's SLOCCount against it, he got a significantly different number: 12 million lines of code, a discrepancy of roughly 30%. The numbers reported by OpenHub are larger still, and no one has been able to explain the differences when he has asked.

In the absence of real statistics, he said, the software industry relies on vague and suspicious estimates when assessing the size and complexity of projects. And the problem of precision is not limited to pull quotes in promotional material. Companies regularly rely on estimation tools to predict the time and budget that a new project will require, and those estimates, by and large, seem to be founded on unsupported numbers. Sometimes such corporate estimates are based on past internal projects, but it is increasingly common for companies to compare their projects to an open-source project of similar complexity and scope. For example, a company embarking on an effort to write a new Linux graphics driver may look at several recent graphics-driver projects to approximate the time and engineering resources it will take. Finding a valid metric to serve in such situations is a hard problem, Sherwood said, but he has at least worked out a plausible alternative by analyzing several open-source software projects.

The prediction problem

To predict project costs, people have attempted to count all manner of indicators: LOC, number of commits, time sheet hours, even "WTFs per week." LOC is a naive measurement, although it is an easy one to understand. Commits and time sheet hours are also simple, but they are difficult to compare across projects and companies, and can be gamed (such as by breaking changes up into an excessive number of commits). The most common estimation tool in the industry, though, is the complicated COCOMO formula, he said—a method that routinely generates "enormous numbers" that no one can support with data. COCOMO dates back to the 1980s, but it has continued to be discussed, written about, and used in business environments up through the present day. It is the source of OpenHub's estimates about the value of open-source software, for example.
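
For reference, basic COCOMO boils down to a single power-law formula, effort = a * KLOC^b person-months, with the fuller variants multiplying in a long list of "cost driver" factors. A quick sketch of the arithmetic, using the textbook "organic mode" coefficients and the 12-million-line SLOCCount figure mentioned above (the coefficients and line count here are illustrative, not numbers Sherwood presented):

    # Basic COCOMO, "organic mode": effort = 2.4 * KLOC^1.05 person-months.
    # 12,000 KLOC is roughly the SLOCCount figure for the kernel cited earlier.
    awk 'BEGIN { e = 2.4 * 12000 ^ 1.05;
                 printf "%.0f person-months (~%.0f person-years)\n", e, e / 12 }'

On a code base that size, the formula produces a figure in the thousands of person-years, which gives some feel for where the "enormous numbers" come from.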

COCOMO has obvious flaws, he said, such as counting only positive increases in LOC as its measure of progress. Blank lines and comments thus get taken into account, while removing dead code or rewriting existing code does not count at all. But the trouble is worse than that: COCOMO is inherently problematic because it is a function of multiple unreliable input variables. Many of these variables require predictive powers to fill in (such as knowing the hardware attributes that will be available on the deployed system) or are subjective (such as developer efficiency).

In reality, he said, project managers can tweak all of these inputs enough to fully control the output of the COCOMO function, so it is meaningless. Engineers and managers have learned to over-estimate their COCOMO numbers so that they appear more productive when they beat the estimate. Just as importantly, he said, looking at COCOMO estimates from past projects invariably shows them to be unreliable. But the method still gets used, in large part because there is no proven alternative. One can complain about COCOMO, and its users will shrug and say "what else do we have?"

Moreover, even those organizations that do not subscribe to COCOMO estimates often fall victim to them because the method is so widespread. A widely cited statistic, for example, is that the "Agile" development methodology reduces many of the costs of fixing bugs the "old-fashioned way"—in which, supposedly, fixing a bug in the planning phase is ten times cheaper than fixing it in the development phase. But that 10x factor originates in COCOMO estimates, he said, so comparing a new method to it has no value. "I think there was no scientific basis for COCOMO," Sherwood said in conclusion, "and there is no proof that it has worked historically. Until someone can prove otherwise, that's what I'm calling it."

Sherwood pointed the audience to several external resources on the unreliability of business prediction models, such as Nassim Taleb's The Black Swan: The Impact of the Highly Improbable and Dan Gardner's Future Babble. But, he continued, decrying bad numbers from COCOMO is not enough: the industry clearly needs to find a way to measure development costs. Without an alternative, "guesswork" like COCOMO will persist.

By GAD

In response to the shortage of hard evidence, Sherwood has been researching alternative methods of counting the effort required to complete software projects. His research looked at internal projects at Codethink as well as at well-known open-source software like the kernel, QEMU, systemd, GStreamer, OpenSSH, and various GENIVI projects (to which Codethink is a contributor).

Ultimately, he said, the one metric that he has found to most accurately reflect the time it takes to complete a software project is what he calls 2GAD. "GAD" stands for "Git active days" and is simply the number of days on which a developer makes a Git commit—of any kind. Summing up all of the GADs for every project participant and multiplying by two produces a relatively accurate count of how long it takes to move a project to completion.
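
The raw numbers are easy to pull out of any repository. A minimal sketch (it counts each distinct author/calendar-day pair that has at least one commit as one Git active day, then doubles the sum):

    # Count distinct (author, day) pairs across the whole history,
    # then double the total to get the 2GAD estimate.
    git log --date=short --pretty=format:'%ae %ad' | sort -u |
        awk 'END { print "GAD:", NR; print "2GAD:", 2 * NR }'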

[Paul Sherwood's 2GAD results]

He admitted that there are factors that could make 2GAD numbers too high—for instance, when some developers commit to multiple repositories or branches in one day, they may be counted twice. And it would be possible for developers to "game" the system to a degree by making meaningless commits. But the historical numbers he surveyed were surely not being so gamed, and they hold up well across the projects examined. Furthermore, there are also factors that could make 2GAD numbers too low, such as a project culture that favors large patches over small commits, or tools that squash many commits together. In the end, he said, those factors seem to balance each other out.

Sherwood's calculations showed that 2GAD was within 10% of the time sheet hours for Codethink's internal projects. When compared to COCOMO numbers for open-source projects, 2GAD produces somewhat smaller numbers for the largest software projects (like the kernel) and somewhat larger numbers for the smallest projects. But at least it does not rely on fudge-able input factors and it is based on hard data. Furthermore, he said, 2GAD counts activity independent of the language and type of content involved (e.g., commits to project documentation are counted). And it can easily be calculated on any subset of a project: a certain time frame, a particular branch, or a subset of contributors. It can also be used to measure the effort required to maintain an existing code base.
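
Those subsets map directly onto ordinary git log filters; for instance (the branch name, date range, and author address below are purely illustrative):

    # GAD for one contributor, on one branch, over a single year:
    git log somebranch --since=2014-10-01 --until=2015-10-01 \
        --author='developer@example.com' --date=short \
        --pretty=format:'%ad' | sort -u | wc -l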

None of those qualities are true of COCOMO, Sherwood said. 2GAD may not be perfect, but he invited everyone present to give it a try against their own projects and see how it compares to other estimation tools. All one needs to calculate Git active days is access to simple tools like GitStats or git summary.

Where to from here

The session ended with a spirited round of questions from the audience. Most asked about factors that do not seem to be captured by the 2GAD metric—such as development styles. For example, one commenter noted that when working on a kernel project, he tends to make lots of small commits along the way in his own private branch, then rebase and merge his commits into a few larger patches. Just looking at the contributions he makes to the mainline kernel would necessarily overlook his "Git Active Days" spent on the private branch.

Sherwood replied that he was aware of the effect and that it perhaps explains why 2GAD numbers are lower than expected for the kernel. But he emphasized again that, flawed though it may be, it is a far more defensible measurement of project effort than COCOMO. There may be plenty of room for improvement, but at least the method can provide a solid baseline. In an industry where budgets and timelines are often established by comparing a proposal against some similar-sounding open-source project, Sherwood argued, any improvement over the unreliable "state of the art" is a welcome change.

[The author would like to thank the Linux Foundation for travel assistance to attend Embedded Linux Conference Europe.]


Debugging tools for input devices

By Jake Edge
October 7, 2015

X.Org Developers Conference

Most users don't know how to help diagnose a broken input device (e.g. mouse, touchpad, keyboard), Benjamin Tissoires said to start his X.Org Developers Conference talk (slides, YouTube video). Generally, users can't say much more than "my touchpad is broken", but he hopes to change that. There are tools that can be used to do better.

There are a number of stages in the input process. An input device communicates with the kernel over some transport. That information goes to libevdev, then to libinput, and on to the X.Org input driver. From there it goes to the X server or compositor and then to the toolkit (e.g. Qt or GTK+). Each of those layers might be broken, so there is a need to figure out which is the culprit when an input problem is discovered.

[Benjamin Tissoires]

There are three main problem areas in input handling. The first is the user, but "we cannot fix it". The other two are the kernel and libinput (which was the subject of an earlier talk). In order to narrow the problem area down, the first step is to use evemu to record events and other information. It records the output of the kernel as seen by libinput and it allows others to replay the sequence of events to reproduce the problem. There is information on the evemu web page that will help users submit useful recordings, Tissoires said.

The evemu-record program has a straightforward interface that lists the input devices on the system and allows the user to choose which is of interest. It will then dump information about the kernel, the device and its capabilities, and the events generated by the device, in a form that can be used to replay them elsewhere. Based on the recording, if it looks like the kernel is doing the right thing, "talk to Peter [Hutterer]", otherwise the problem is probably something that Tissoires needs to look at.
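
A recording session might look something like the following (the event node and file name are assumptions, and the evemu-play invocation varies between evemu versions):

    # Find the event node for the device of interest:
    cat /proc/bus/input/devices

    # Record kernel events from that node into a file:
    sudo evemu-record /dev/input/event4 > touchpad.evemu

    # Replay the recording; older evemu versions replay onto an existing
    # device node and read the recording from standard input:
    sudo evemu-play /dev/input/event4 < touchpad.evemu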

There are other tools for diagnosing input problems, including mtdiag-qt, which he demonstrated. It intercepts events at the same level as evemu, but has a GUI to display the events graphically. He uses mtdiag-qt for kernel debugging, while Hutterer uses mtview for libinput. There are separate tools because developers of different pieces want to be able to examine different things.

If evemu, mtdiag-qt, mtview, or something else shows that the kernel is doing the right thing, libinput-debug-events can be used to show the events as they have been processed by libinput. It has an interface similar to evemu, with a list of devices to choose from to determine which to capture events for. It now ships with libinput. Once you have the output from libinput-debug-events, then it is time to talk to Hutterer (or to the wayland-devel mailing list, presumably).
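
A session might look roughly like this (the --device option and event node are assumptions; option names have shifted a bit between libinput releases):

    # Show events for one device after libinput has processed them:
    sudo libinput-debug-events --device /dev/input/event4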

On the other hand, if evemu and friends show that the problem lies in the kernel, the process gets more complex, because that means the culprit could be any of the layers from the kernel down. There are multiple reasons that the kernel might "send garbage". It could be that the device is broken in some fashion that needs to be worked around in the kernel. There is also room for interpretation in the protocol—Tissoires and Hutterer don't necessarily agree on some of the gray areas.

But driver bugs are usually the source of garbage from the kernel. For example, if the device maker changes the device and its protocol in some way, the driver will not reflect that change. Another recent example was a bug when placing three fingers on the touchpad one at a time, then lifting one (and leaving two). That caused a spurious event from the driver that was interpreted by libinput as if all the fingers had been removed, which led to several other bugs caused by that one underlying bug in the kernel driver.

In order to find where the problem exists, you "need to know what is going on down below", which includes both the physical layers (i.e. transports) and the protocols. For physical layers, there are USB, which is used widely, PS/2, which is mostly for keyboards, and I2C, which is quite new (for input devices) and mostly used on phones. Beyond that, there are SMBus, Bluetooth, serial, and so on. "Whatever you can think of, basically somebody made it", he said.

There are three classes of protocols for input devices. There are the public protocols, like the human interface device (HID) and psmouse protocols. There are the semi-public protocols, which are documented but only used by one manufacturer, such as RMI4 (Synaptics) or HID++ 2.0 (Logitech). Proprietary protocols, which have either been reverse engineered or are documented only under NDA, round out the list.

The physical layers and protocols can be combined in various creative ways by the device makers. For example, Synaptics has devices that use RMI4 over HID over I2C. Some versions of the Thinkpad trackstick use PS/2 over RMI4 over SMBus. Logitech also gets "quite creative" with devices that use HID++ in a multi-step combination that converts to and from HID twice. His final example was one that he would not explain: "WTF over GTFO over SNAFU".

In order to figure out which physical layer and protocol are being used by a device, kernel messages should be consulted:

    $ dmesg | grep input

This will show the devices, how they are connected, and the protocol(s) they are using.

If the device uses the HID protocol, the hid-replay tool can be used to show the raw events from the device before processing by the kernel. It works with any transport and can replay the events for debugging purposes. The hid-recorder tool is used to record the events. It has an interface much like the other tools, with a list of devices to choose from. If the kernel is doing the wrong thing with the raw events, it is time to blame him, Tissoires said.
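
Using the pair might look roughly like this (the hidraw node and file name are assumptions; running hid-recorder without arguments produces the device list described above):

    # Record raw HID reports coming from the device:
    sudo hid-recorder /dev/hidraw0 > touchpad.hid

    # Replay them later through a virtual HID device for debugging:
    sudo hid-replay touchpad.hid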

For non-HID devices, the usbmon kernel facility can be used to capture the raw events from the USB transport. There is no provision for replaying events, however. For PS/2 devices, ps2emu (the subject of another XDC talk) can be used to record and play back raw events. Even though PS/2 is a rather old transport, 99% of laptops still use it internally, he said. Other transport layers will require specialized tools or hacks to the kernel to get raw event data.
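
Capturing with usbmon amounts to reading the right debugfs file (bus number 1 below is an assumption; lsusb shows which bus the device actually sits on):

    # Load the usbmon module and find the bus number of the input device:
    sudo modprobe usbmon
    lsusb

    # Capture raw traffic on bus 1 (the 0u file would capture all buses):
    sudo cat /sys/kernel/debug/usb/usbmon/1u > usb-trace.txt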

One question that is often asked is whether there are regression tests for the input stack. The answer is "yes and no". Libinput does have regression tests. Testing X without libinput, though, relies on the X.Org integration test suite (XIT), which starts a new X server for each test, so it is not very efficient. That limits the amount of testing that is done. The kernel HID drivers have a basic wrapper for running regression tests, hid-test, but it is not run much any more, Tissoires said.

When reporting an input bug, there are a few guidelines that should be followed. First off, provide the full dmesg output. Second, provide a recording using evemu that shows the bug triggering (and, if possible, not much more). Lastly, "do not be afraid" if the input developers ask for additional information and recordings, as it is all part of the process for tracking down these kinds of bugs.

[I would like to thank the X.Org Foundation for travel assistance to Toronto for XDC.]


Status updates for three graphics drivers

By Jake Edge
October 7, 2015

X.Org Developers Conference

Drivers for graphics hardware are an important part of the graphics stack, so it was not unexpected that the 2015 X.Org Developers Conference had several status updates for free graphics drivers. Three projects had talks: the Nouveau driver for NVIDIA devices, the amdgpu driver for AMD hardware, and the Etnaviv driver for Vivante GPUs. Each presented an update on its progress and plans. Something of a summary for each presentation follows; those interested in more detail can consult the program page for links to the slides and videos from each of the talks.

Nouveau

Alexandre Courbot of NVIDIA and Martin Peres of Intel started their talk by clarifying the role their companies play in the project. For Peres, the Nouveau work is strictly done in his spare time and has no connection to Intel at all. Courbot is paid by NVIDIA to work on Nouveau, but that work is mostly focused on supporting Tegra devices, so NVIDIA has "not taken over Nouveau development", he said.

They noted that the last status update was at FOSDEM in 2014 (more than a year and a half earlier) and that there have been many improvements since then. A big refactoring of the kernel driver core architecture, which had been started back in the Linux 3.7 days and was led by Ben Skeggs, will be complete as of the upcoming Linux 4.3 kernel.

In addition, support for the NVIDIA virtualization interface has been added. The goal is to allow GPU virtualization with low performance impact. Samuel Pitoiset has added support for performance counters to the driver (which was the subject of another talk). Reclocking support, which allows for different performance levels of the hardware, has been added for more GPUs. That has mostly been for Kepler GPUs, but Maxwell reclocking had been added that morning, Peres said.

There are some proposals from NVIDIA that have been merged or are in progress. Explicit handling of coherent objects between the CPU and GPU has been added to the driver. Objects can be marked so that the driver will keep them cache-coherent between the two processors even on buses that are not guaranteed to be coherent. There is a new submit ioctl() that allows user space to handle synchronization, which is not yet merged but would bring performance improvements, Courbot said.

Peres noted that NVIDIA releases a lot of graphics cards, which makes it hard to keep up on the user-space (Mesa) side. Maxwell support was added to Mesa back in mid-2014. Beyond that, support for OpenGL 3.3 for NVIDIA hardware came in Mesa 10.1, and OpenGL 4.1 support came in Mesa 11. Upcoming work includes more graphics-related performance counter support, including an API to expose the counters to other programs.

On the device-dependent X (DDX) side of things, xf86-video-nouveau has dropped support for the Glamor 2D driver. Those who want that support should use the xf86-video-modesetting driver instead.

Courbot said that support for the Tegra K1 (GK20A) was released in January 2014. That code came from NVIDIA and surprised many people at the time. By October 2014, there was "out of the box" Mesa support for the hardware and the patches for the kernel driver are now upstream. Basic kernel support for the Tegra X1 (GM20B) was merged for 4.3 and more features are planned.

Applications that use kernel mode setting (KMS), such as Weston and X, assume that the display components (which send graphical data to the screen) and render components (which produce off-screen data from the graphics commands) are the same device. That is generally true for discrete GPUs (dGPUs), but is not true for mobile devices like Tegra. That means that there is still work to do before applications will display properly on those devices. Courbot suggested that render nodes, which were added a few years back, should be used, though that doesn't completely solve the problems.

First-generation Maxwell GPUs (GM107) were supported initially back in March 2014 using NVIDIA's firmware. By April 2015, open-source firmware had been released and was supported by the driver. But support for the second generation (GM204+) has been stalled for all of 2015, waiting for the release of signed firmware by NVIDIA. Those GPUs will not load firmware unless it is signed with an NVIDIA key.

The problem is wider than just Maxwell, as newer Tegra GPUs also have this requirement. Courbot said that NVIDIA will be releasing signed firmware but that it hasn't happened yet. The code to load signed firmware is mostly working at this point, but there is an internal workflow issue at NVIDIA that needs to be resolved in order to release the firmware. It is normally linked into the binary driver, but there needs to be a separate release of the firmware for Nouveau.

There has been a lot more cooperation between NVIDIA and Nouveau over the last few years. Beyond official support for Tegra GPUs, there is ongoing work to provide generated header files with proper register names and descriptions from the NVIDIA documentation. In addition, there is some open documentation [FTP] available, though it is "still pretty scarce", Courbot said. There is a (non-public) mailing list where Nouveau developers can ask questions and get answers, which sometimes results in additional documentation being written and released. There is still room for improvement, but the relationship between NVIDIA and Nouveau has gotten better and better over the years.

amdgpu

Alex Deucher and Jammy Zhou from AMD gave an overview of the status of the amdgpu project, which is meant to unify AMD's Linux driver offerings. The driver is taking advantage of the existing open-source infrastructure, such as the TTM memory manager, direct rendering manager (DRM) subsystem, Glamor, and so on.

The amdgpu driver is based on the current upstream Radeon driver. There will effectively be two versions of the driver, one that is all open source and a "pro" version that contains some closed-source components. The closed components are an OpenGL user-mode driver (UMD) and two pieces that will eventually become open source: OpenCL and Vulkan support. There is already Gallium3D-based OpenCL support for the open-source driver.

The amdgpu driver has ioctl() interfaces based on those in the Radeon driver for command submission and memory management. It uses the common mode-setting ioctl() interface. There is a libdrm_amdgpu library that provides a common interface for both the open and closed versions of the driver. The FirePro add-on, which adds "workstation class" features, will be open source (though it may not be accepted upstream) and will only be used by the driver if "absolutely necessary". There is a Mesa user-space driver for OpenGL support.

The initial driver was merged into Linux 4.2, supporting Volcanic Islands GPUs and with experimental support for Sea Islands hardware. Support for Fiji GPUs was added in Linux 4.3. Initial OpenGL support for Volcanic Islands hardware was merged for Mesa 11.0. In addition, initial support for libdrm_amdgpu was merged into libdrm 2.4.63.

Plans for the future include enabling the software GPU scheduler by default, adding a new display component, and adding a new power component called PowerPlay. Support for more graphics hardware is also planned. Both the currently closed OpenCL and Vulkan components will be turned into open-source projects that will be run by AMD. They will share their code bases with the closed-source versions for other operating systems, so they will likely remain as standalone projects.

Etnaviv

Lucas Stach, a kernel and graphics developer at Pengutronix, presented on the Etnaviv project, which supports the Vivante GPU IP core used in a number of different systems-on-chip (SoCs). Several hardware vendors use the Vivante core, but probably the most interesting are Marvell and Freescale, both of which have multiple SoC lines using those GPUs.

The project started as a reverse-engineering effort by Wladimir J. van der Laan with contributions from Christian Gmeiner and others. A lot of the commands and the instruction set are known at this point, so Stach did not have to do any reverse engineering of his own.

Vivante hardware has separate cores for 2D, 3D, and vector graphics, which can be combined in various ways on a particular SoC. The 3D core is straightforward; it is modeled after the DirectX 9 pipeline with the addition of unified shaders. Newer and bigger models of the hardware add the ability to run programs using the OpenCL embedded profile.

The different ways that the cores can be arranged affect how the kernel driver is implemented. One configuration has a single fetch engine (FE), which is just a "fancy DMA engine", that feeds all three cores. That single FE is exposed to user space as a single channel for rendering.

Another configuration has three FEs, one per core, so 3D acceleration could be handled in parallel with 2D or vector graphics rendering. Each FE is exposed as a separate channel, but that makes synchronization trickier. There may also be multiple "pixel pipes"—the component that runs shader programs and writes out the data. That allows for parallelism and better performance in rendering; it is somewhat akin to the scalable link interface multi-GPU support from NVIDIA. Stach has never seen hardware that has that capability, but the obfuscated GPL driver that Vivante has released supports it.

There are a number of reasons to want a FOSS driver for Vivante GPUs beyond the obvious "FOSS drivers are awesome" reason, Stach said. Integrating vendor drivers is a serious pain point for his customers (and others). The obfuscated kernel driver is huge and only works with Linux 3.14; it also requires closed-source user-space libraries. No security audit is possible for the code and fixes do not necessarily come in a timely fashion.

For these reasons, his customers demand open drivers where they can fix bugs on their own. The Freescale i.MX6, for example, is used in a lot of automotive and industrial applications. It has a fifteen-year guaranteed availability, so the last newly-built devices using the chip may ship in 2027, as it was introduced in 2012. Running the vendor driver may well be impossible by then.

The kernel driver work was started by Gmeiner in 2014 as a clone of the freedreno driver adapted to the Vivante hardware. Stach cleaned it up and sent it out as an RFC in April 2015. That posting drew comments, which have since been addressed, and the driver is now out for another round of review.

Since the first version, the user-space API has been significantly reworked to avoid a problem where the command stream could be changed after the driver had validated it. The cache handling for non-cache-coherent architectures has been fixed. GPU suspend and resume are now working and there have been lots of stability improvements. Etnaviv can now replace the "fat and obfuscated" Vivante kernel driver with one that has readable code and is much smaller—instead of 60,000 lines of code, Etnaviv is around 6,500.

There is still work to be done on the kernel side, including using the dynamic voltage and frequency scaling (DVFS) available in the cores. The command-stream validation needs to be improved and support for per-client MMU contexts needs to be added; both have security implications. If support for the MMUv2 interface on some hardware can be added, it would remove the need for the command-stream validation on those platforms. Exposing the performance counters to user space is needed as well.

Russell King has gotten the xf86-video-armada driver working using libetnaviv on top of the Vivante kernel driver. That uses the 2D GPU and provides acceleration for some common operations. Gmeiner started a libdrm for Etnaviv (etna-drm) as another freedreno clone. It has been updated for the new user-space API and some cleanups have been done, so it is ready for review. There is also a Mesa driver that is able to run simple applications—including Quake 3.

As can be seen, there has been quite a bit of progress in the world of free drivers. It is not all that long ago that open-source graphics drivers were essentially non-existent, but that has changed substantially—and that process appears to be accelerating. Some surprising vendors are participating and even the world of mobile graphics is seeing major progress these days. It is all rather heartening to see.

[I would like to thank the X.Org Foundation for travel assistance to Toronto for XDC.]

