Leading items
Welcome to the LWN.net Weekly Edition for October 5, 2017
This edition contains the following feature content:
- Business accounting with Odoo: the quest for a new accounting system continues with a look at what is said to be the most popular free accounting software.
- Strategies for offline PGP key storage: where should one keep one's PGP key to ensure both its security and its availability?
- More from the testing and fuzzing microconference: fuzzers, ktest, unit testing, KMSAN, and more.
- Improvements in the block layer: a lot of work has been done to improve the block layer; maintainer Jens Axboe gives a summary at Kernel Recipes.
- The NumWorks graphing calculator: an open-hardware calculator with open (but not truly free) software.
- Catching up with RawTherapee 5.x: recent developments in this raw-photo editing tool.
This week's edition also includes these inner pages:
- Brief items: Brief news items from throughout the community.
- Announcements: Newsletters, conferences, security updates, patches, and more.
Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.
Business accounting with Odoo
Odoo is, according to Wikipedia, "the most popular open source ERP system". Thus, any survey of open-source accounting systems must certainly take a look in that direction. This episode in the ongoing search for a suitable accounting system for LWN examines the accounting features of Odoo; unfortunately, it comes up a bit short.
Odoo is the current incarnation of the system formerly known as OpenERP; it claims to have over two million users. It is primarily implemented in Python, and carries the LGPLv3 license. Or, at least, the free part of Odoo is so licensed; Odoo is an open-core product with many features reserved for its online or "Enterprise" offerings. The enterprise version comes with source code, but it carries a proprietary license and an end-user license agreement forbidding users from disabling the "phone home" mechanism that, among other things, enforces limits on the number of users. Online offerings are not of interest for this series, and neither is proprietary software (the whole point is to get away from proprietary systems), so this review is focused on the community edition.
Installing and running Odoo
Getting started with the community edition can be a little rough. Odoo
offers Debian and RPM packages and instructions
on how to install them. The version available through the indicated
repository was old, though — version 9, even though version 10
was released in late 2016. The instructions say that PostgreSQL must be
installed while failing to mention that one must create an odoo user
— and that said user, needed for production use, must be a PostgreSQL superuser.
There is a systemd unit file to start the system, but one has to dig to
realize that the way to actually access the system is by pointing a web
browser at local port 8069. While a determined user can get past these little
obstacles, they may leave said user a bit concerned about
the accuracy and completeness of the documentation.
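Filling in the gaps, the missing setup steps look something like the following sketch. It assumes a Debian-style package install; the exact user and service names are assumptions that should be checked against the packages actually installed:

```shell
# Create the "odoo" PostgreSQL user the instructions fail to mention;
# for production use it must be a PostgreSQL superuser.
sudo -u postgres createuser --superuser odoo

# Start the service through its systemd unit, then confirm the web
# interface is answering on local port 8069.
sudo systemctl start odoo
curl -sI http://localhost:8069 | head -n 1
```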
The main screen offers an extensive set of functions, each of which must be "installed" before becoming available; your editor found "accounting" a fair way down the list. The system offers to install a set of demonstration data, an offer which was accepted to test the system until LWN's data could be imported — a step that never took place, as it turns out, for reasons that will be discussed shortly.
The accounting "dashboard" screen shows a menu of possible functions and a few
line plots. It looks slick enough, but its limitations quickly become
clear. There is, for example, no straightforward way to just get a list of
transactions on an account. One can pull up a "statement" and see the
transactions that were listed there, but that will work poorly on accounts
that lack statements. There must be a way to, for example, obtain a list
of transactions on a given expense account, but it's not obvious.
There is no "register" view that makes it easy to enter transactions. One can enter a transaction through what is essentially an HTML form; there is no support for useful features like autocompletion. It seems clear that the Odoo developers expect that the bulk of transactions will be imported from some other source. But that turns out to be a bit of a problem.
Getting data into the new system is a key part of switching away from
QuickBooks. That, alas, is a place where the open-core nature of Odoo makes itself
felt. A look at the "settings" screen makes the problem clear: the
ability to import data directly from banks via the Plaid interface,
along with the ability to import data from QIF and OFX files, is
reserved for the enterprise edition. Version 10 even added CSV
support to the "enterprise-only" list.
Users of the community edition are not without choices, naturally; they can
type their data in by hand or figure out the database schema and write
their own import code.
Odoo is open source, after all, so the job can always be done.
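For the determined, that exploration might begin with something like the following. The table and column names are guesses based on Odoo's accounting module and must be verified against the actual schema before writing any import code:

```shell
# Poke at the Odoo PostgreSQL database to locate the accounting tables.
# "account_move" (journal entries) is an assumed name; confirm it with
# the \dt listing first.
psql -d odoo -c '\dt account_*'
psql -d odoo -c 'SELECT id, date, ref FROM account_move LIMIT 5;'
```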
But the deliberate refusal to include such basic functionality in the free version makes it clear that this version is not intended for anybody wanting to put it to serious use. The list of reserved features also includes check printing — necessary for any real-world user, at least in the US — and, according to this list, "full accounting". This is discouraging for anybody looking for a free-software solution to the accounting problem.
The Odoo chart of accounts works more-or-less as expected, if one doesn't mind the lack of a list of transactions. The chart is flat, though, with no support for account hierarchies. There is a basic set of accounting reports (profit and loss, balance sheet) available, but they are primitive relative to QuickBooks or GnuCash. The customization options are minimal. There is no ability to click on, for example, an expense category in the profit-and-loss chart and see how one managed to spend so much money on beer. Anybody hoping for advanced features like pie charts will be sorely disappointed.
The use of PostgreSQL should allow Odoo to scale to relatively large accounting data sets, though that was not tested here. The system can also handle multiple users working simultaneously, something that neither QuickBooks (in the small-business edition) nor GnuCash can do. One thing any user should be aware of is that Odoo does not attempt to maintain database compatibility across major releases of the system. The company will migrate databases for a fee, naturally enough; there are also projects like OpenUpgrade trying to fill the gap. Companies tend to stay with an accounting system for a long time, so the upgrade issue is one that should be kept in mind.
Development community
The source for the Odoo community edition is maintained on GitHub. A quick look suggests that it is an active project with a non-trivial amount of community involvement. Your editor looked in vain for pull requests adding an "enterprise" feature like QIF import to see what the response would be, but there do not appear to be a lot of external developers trying to add larger features. Odoo requires contributors to sign a contributor license agreement allowing the code to be distributed under a proprietary license.
There seems to be a fair selection of books out there for those wanting to learn more about working with (or developing for) Odoo. One can also find active support forums and such. There does indeed appear to be a significant user community for this system. Odoo seems unlikely to disappear overnight or fade away in the near future.
Closing notes
As can be seen from the above text, your editor has found Odoo wanting when it comes to the task of basic company accounting. One thing should be pointed out here, though: this review was focused on accounting functionality, but Odoo aims to be a full enterprise system with features far beyond accounting. It offers customer-relationship management, project management, inventory control, sales support, a point-of-sale interface, time tracking, issue tracking, expense tracking, and more. It has add-on modules to "build your own enterprise web site", create and run surveys, conduct mass-mailing campaigns, manage calendars, manage fleets of vehicles, and coordinate lunch orders. Businesses in need of all that functionality may well be willing to overlook the shortcomings in the accounting module.
For a company that is focused on gaining control over its own accounting data, and that wants to keep said data locally and managed with free software, Odoo does not seem like an ideal choice. The open-core model will put Odoo-the-company in a conflict-of-interest position with regard to its free users; it seems certain that the community version will always lack some important features. But even the enterprise version comes up short relative to some of the alternatives. Odoo might yet develop into a full-featured free accounting system, but it is not there now. The search for a suitable QuickBooks replacement will continue.
Strategies for offline PGP key storage
While the adoption of OpenPGP by the general population is marginal at best, it is a critical component for the security community and particularly for Linux distributions. For example, every package uploaded into Debian is verified by the central repository using the maintainer's OpenPGP keys and the repository itself is, in turn, signed using a separate key. If upstream packages also use such signatures, this creates a complete trust path from the original upstream developer to users. Beyond that, pull requests for the Linux kernel are verified using signatures as well. Therefore, the stakes are high: a compromise of the release key, or even of a single maintainer's key, could enable devastating attacks against many machines.
That has led the Debian community to develop a good grasp of best practices for cryptographic signatures (which are typically handled using GNU Privacy Guard, also known as GnuPG or GPG). For example, weak (less than 2048 bits) and vulnerable PGPv3 keys were removed from the keyring in 2015, and there is a strong culture of cross-signing keys between Debian members at in-person meetings. Yet even Debian developers (DDs) do not seem to have established practices on how to actually store critical private key material, as we can see in this discussion on the debian-project mailing list. That email boiled down to a simple request: can I have a "key dongles for dummies" tutorial? Key dongles, or keycards as we'll call them here, are small devices that allow users to store keys on an offline device and provide one possible solution for protecting private key material. In this article, I hope to use my experience in this domain to clarify the issue of how to store those precious private keys that, if compromised, could enable arbitrary code execution on millions of machines all over the world.
Why store keys offline?
Before we go into details about storing keys offline, it may be
useful to do a small reminder of how the OpenPGP standard works.
OpenPGP keys are made of a main public/private key pair, the
certification key, used to sign user identifiers and subkeys. My
public key, shown below, has the usual main certification/signature key (marked
SC) but also an encryption subkey (marked E), a separate signature
key (S), and two authentication keys (marked A)
which I use as RSA
keys to log into servers using SSH, thanks to the Monkeysphere
project.
pub rsa4096/792152527B75921E 2009-05-29 [SC] [expires: 2018-04-19]
8DC901CE64146C048AD50FBB792152527B75921E
uid [ultimate] Antoine Beaupré <anarcat@anarc.at>
uid [ultimate] Antoine Beaupré <anarcat@koumbit.org>
uid [ultimate] Antoine Beaupré <anarcat@orangeseeds.org>
uid [ultimate] Antoine Beaupré <anarcat@debian.org>
sub rsa2048/B7F648FED2DF2587 2012-07-18 [A]
sub rsa2048/604E4B3EEE02855A 2012-07-20 [A]
sub rsa4096/A51D5B109C5A5581 2009-05-29 [E]
sub rsa2048/3EA1DDDDB261D97B 2017-08-23 [S]
All the subkeys (sub) and identities (uid) are
bound by the main
certification key using cryptographic self-signatures. So while an
attacker stealing a private subkey can spoof signatures in my name or
authenticate to other servers, that key can always be revoked by the
main certification key. But if the certification key gets stolen, all
bets are off: the attacker can create or revoke identities or subkeys
as they wish. In a catastrophic scenario, an attacker could even steal
the key and remove your copies, taking complete control of the key,
without any possibility of recovery. Incidentally, this is why it is
so important to generate a revocation certificate and store it
offline.
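Generating that certificate is a single (interactive) command; KEYID below is a placeholder for your own key's fingerprint:

```shell
# Generate a revocation certificate and move it to offline storage.
# Anyone holding this file can revoke the key, so treat it as a secret.
gpg --output revoke.asc --gen-revoke KEYID
```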
So by moving the certification key offline, we reduce the attack surface on the OpenPGP trust chain: day-to-day keys (e.g. email encryption or signature) can stay online but if they get stolen, the certification key can revoke those keys without having to revoke the main certification key as well. Note that a stolen encryption key is a different problem: even if we revoke the encryption subkey, this will only affect future encrypted messages. Previous messages will be readable by the attacker with the stolen subkey even if that subkey gets revoked, so the benefits of revoking encryption certificates are more limited.
Common strategies for offline key storage
Considering the security tradeoffs, some propose storing those critical keys offline to reduce those threats. But where exactly? In an attempt to answer that question, Jonathan McDowell, a member of the Debian keyring maintenance team, said that there are three options: use an external LUKS-encrypted volume, an air-gapped system, or a keycard.
Full-disk encryption like LUKS adds an extra layer of security by hiding
the content of the key from an attacker. Even though private keyrings are
usually protected by a passphrase, they are easily
identifiable as a keyring. But when a volume is fully encrypted, it's not
immediately
obvious to an attacker there is private key material on the device. According
to Sean Whitton, another advantage of LUKS over plain GnuPG keyring
encryption is that you can pass the --iter-time argument when
creating a LUKS partition to increase key-derivation delay, which makes
brute-forcing much harder.
Indeed, GnuPG 2.x doesn't have a run-time option to
configure the
key-derivation algorithm, although a patch was introduced recently to
make
the delay configurable at compile
time in gpg-agent, which is now responsible for all secret key
operations.
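As a sketch, that key-derivation delay is set when the encrypted volume is created; /dev/sdX is a placeholder device and five seconds is an arbitrary example value:

```shell
# Create a LUKS volume with a deliberately slow key derivation
# (--iter-time is in milliseconds); each passphrase guess then costs
# the attacker about five seconds of computation.
cryptsetup luksFormat --iter-time 5000 /dev/sdX
```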
The downside of external volumes is complexity: GnuPG makes it difficult to extract secrets out of its keyring, which makes the first setup tricky and error-prone. This is easier in the 2.x series thanks to the new storage system and the associated keygrip files, but it still requires arcane knowledge of GPG internals. It is also inconvenient to use secret keys stored outside your main keyring when you actually do need to use them, as GPG doesn't know where to find those keys anymore.
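With a recent GnuPG 2.x, mapping keys to the files that hold their secret material is at least possible; the keygrip ties the two together:

```shell
# Show the keygrip for each key; the secret material for a (sub)key
# lives in a file named <keygrip>.key under private-keys-v1.d.
gpg --list-secret-keys --with-keygrip
ls ~/.gnupg/private-keys-v1.d/
```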
Another option is to set up a separate air-gapped system to perform certification operations. An example is the PGP clean room project, which is a live system based on Debian and designed by DD Daniel Pocock to operate an OpenPGP and X.509 certificate authority using commodity hardware. The basic principle is to store the secrets on a different machine that is never connected to the network and, therefore, not exposed to attacks, at least in theory. I have personally discarded that approach because I feel air-gapped systems provide a false sense of security: data eventually does need to come in and out of the system, somehow, even if only to propagate signatures out of the system, which exposes the system to attacks.
System updates are similarly problematic: to keep the system secure, timely security updates need to be deployed to the air-gapped system. A common use pattern is to share data through USB keys, which introduce a vulnerability where attacks like BadUSB can infect the air-gapped system. From there, there is a multitude of exotic ways of exfiltrating the data using LEDs, infrared cameras, or the good old TEMPEST attack. I therefore concluded the complexity tradeoffs of an air-gapped system are not worth it. Furthermore, the workflow for air-gapped systems is complex: even though PGP clean room went a long way, it's still lacking even simple scripts that allow signing or transferring keys, which is a problem shared by the external LUKS storage approach.
Keycard advantages
The approach I have chosen is to use a cryptographic keycard: an external device, usually connected through the USB port, that stores the private key material and performs critical cryptographic operations on the behalf of the host. For example, the FST-01 keycard can perform RSA and ECC public-key decryption without ever exposing the private key material to the host. In effect, a keycard is a miniature computer that performs restricted computations for another host. Keycards usually support multiple "slots" to store subkeys. The OpenPGP standard specifies there are three subkeys available by default: for signature, authentication, and encryption. Finally, keycards can have an actual physical keypad to enter passwords so a potential keylogger cannot capture them, although the keycards I have access to do not feature such a keypad.
We could easily draw a parallel between keycards and an air-gapped system; in effect, a keycard is a miniaturized air-gapped computer and suffers from similar problems. An attacker can intercept data on the host system and attack the device in the same way, if not more easily, because a keycard is actually "online" (i.e. clearly not air-gapped) when connected. The advantage over a fully-fledged air-gapped computer, however, is that the keycard implements only a restricted set of operations. So it is easier to create an open hardware and software design that is audited and verified, which is much harder to accomplish for a general-purpose computer.
Like air-gapped systems, keycards address the scenario where an
attacker wants to get the private key material. While an
attacker could fool the keycard into signing or decrypting some
data, this is possible only while the key is physically connected,
and the keycard software will prompt the user for a password before
doing the operation, though the keycard can cache the password for some time. In effect, it thwarts offline attacks: to
brute-force the key's password, the attacker needs to be on the target
system and try to guess the keycard's password, which will lock itself after a
limited number of tries. It also provides for a clean and standard
interface to store keys offline: a single GnuPG command moves private
key material to a keycard (the keytocard command in the
--edit-key
interface), whereas moving private key material to a LUKS-encrypted
device or air-gapped computer is more complex.
Keycards are also useful if you operate on multiple computers. A
common problem when using GnuPG on multiple machines is how to safely
copy and synchronize private key material among different devices, which
introduces new security problems. Indeed, a "good rule of thumb in a
forensics lab", according to Robert J. Hansen on the GnuPG mailing
list, is to "store the minimum personal data possible on your
systems". Keycards provide the best of both worlds here: you can use
your private key on multiple computers without actually storing it in
multiple places. In fact, Mike Gerwitz went as far as saying:
For users that need their GPG key on multiple boxes, I consider a smartcard to be essential. Otherwise, the user is just furthering her risk of compromise.
Keycard tradeoffs
As Gerwitz hinted, there are multiple downsides to using a keycard, however. Another DD, Wouter Verhelst, clearly expressed the tradeoffs:
Smartcards are useful. They ensure that the private half of your key is never on any hard disk or other general storage device, and therefore that it cannot possibly be stolen (because there's only one possible copy of it).
Smartcards are a pain in the ass. They ensure that the private half of your key is never on any hard disk or other general storage device but instead sits in your wallet, so whenever you need to access it, you need to grab your wallet to be able to do so, which takes more effort than just firing up GnuPG. If your laptop doesn't have a builtin cardreader, you also need to fish the reader from your backpack or wherever, etc.
"Smartcards" here refer to older OpenPGP cards that relied on the IEC 7816 smartcard connectors and therefore needed a specially-built smartcard reader. Newer keycards simply use a standard USB connector. In any case, it's true that having an external device introduces new issues: attackers can steal your keycard, you can simply lose it, or wash it with your dirty laundry. A laptop or a computer can also be lost, of course, but it is much easier to lose a small USB keycard than a full laptop — and I have yet to hear of someone shoving a full laptop into a washing machine. When you lose your keycard, unless a separate revocation certificate is available somewhere, you lose complete control of the key, which is catastrophic. But, even if you revoke the lost key, you need to create a new one, which involves rebuilding the web of trust for the key — a rather expensive operation as it usually requires meeting other OpenPGP users in person to exchange fingerprints.
You should therefore think about how to back up the certification key, which is a problem that already exists for online keys; of course, everyone has a revocation certificate and backups of their OpenPGP keys... right? In the keycard scenario, backups may be multiple keycards distributed geographically.
Note that, contrary to an air-gapped system, a key generated on a keycard cannot be backed up, by design. For subkeys, this is not a problem as they do not need to be backed up (except encryption keys). But, for a certification key, this means users need to generate the key on the host and transfer it to the keycard, which means the host is expected to have enough entropy to generate cryptographic-strength random numbers, for example. Also consider the possibility of combining different approaches: you could, for example, use a keycard for day-to-day operation, but keep a backup of the certification key on a LUKS-encrypted offline volume.
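That combined approach might look like the following sketch; the key ID is a placeholder and /mnt/offline is assumed to be an already-mounted LUKS-encrypted device:

```shell
# Back up the secret certification key to the encrypted offline volume
# before moving it to the keycard with keytocard.
gpg --export-secret-keys --armor KEYID > /mnt/offline/certify-key.asc
sync && umount /mnt/offline
```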
Keycards introduce a new element into the trust chain: you need to trust the keycard manufacturer to not have any hostile code in the key's firmware or hardware. In addition, you need to trust that the implementation is correct. Keycards are harder to update: the firmware may be deliberately inaccessible to the host for security reasons or may require special software to manipulate. Keycards may be slower than the CPU in performing certain operations because they are small embedded microcontrollers with limited computing power.
Finally, keycards may encourage users to trust multiple machines with their secrets, which works against the "minimum personal data" principle. A completely different approach called the trusted physical console (TPC) does the opposite: instead of trying to get private key material onto all of those machines, just have them on a single machine that is used for everything. Unlike a keycard, the TPC is an actual computer, say a laptop, which has the advantage of needing no special procedure to manage keys. The downside is, of course, that you actually need to carry that laptop everywhere you go, which may be problematic, especially in some corporate environments that restrict bringing your own devices.
Quick keycard "howto"
Getting keys onto a keycard is easy enough:
1. Start with a temporary key to test the procedure:

       export GNUPGHOME=$(mktemp -d)
       gpg --generate-key

2. Edit the key using its user ID (UID):

       gpg --edit-key UID

3. Use the key command to select the first subkey, then copy it to
   the keycard (you can also use the addcardkey command to just
   generate a new subkey directly on the keycard):

       gpg> key 1
       gpg> keytocard

4. If you want to move the subkey, use the save command, which will
   remove the local copy of the private key, so the keycard will be
   the only copy of the secret key. Otherwise use the quit command to
   save the key on the keycard, but keep the secret key in your normal
   keyring; answer "n" to "save changes?" and "y" to "quit without
   saving?". This way the keycard is a backup of your secret key.

5. Once you are satisfied with the results, repeat steps 1 through 4
   with your normal keyring (unset $GNUPGHOME).
When a key is moved to a keycard, --list-secret-keys will show it as
sec> (or ssb> for subkeys) instead of the usual sec keyword. If
the key is completely missing (for example, if you moved it to a LUKS
container), the # sign is used instead. If you need to use a key
from a keycard backup, you simply do gpg --card-edit with
the key plugged in,
then type the fetch command at the prompt to fetch the public key
that corresponds to the private key on the
keycard (which stays on the keycard). This is the same procedure as the one
to use the secret key on another computer.
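For illustration, a listing for a key whose material lives on a keycard might look like this; the output shown in comments is a sketch of the markers described above, not verbatim GnuPG output:

```shell
gpg --list-secret-keys
# sec>  rsa4096/792152527B75921E ...   ">" : secret is on the keycard
# ssb>  rsa2048/B7F648FED2DF2587 ...
# sec#  ...                            "#" : secret is absent entirely

# On a new machine, with the keycard plugged in:
gpg --card-edit
# gpg/card> fetch    retrieves the matching public key; the private
#                    part never leaves the card
```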
Conclusion
There are already informal OpenPGP best-practices guides out there and some recommend storing keys offline, but they rarely explain what exactly that means. Storing your primary secret key offline is important in dealing with possible compromises and we examined the main ways of doing so: either with an air-gapped system, LUKS-encrypted keyring, or by using keycards. Each approach has its own tradeoffs, but I recommend getting familiar with keycards if you use multiple computers and want a standardized interface with minimal configuration trouble.
And of course, those approaches can be combined. This tutorial, for example, uses a keycard on an air-gapped computer, which neatly resolves the question of how to transmit signatures between the air-gapped system and the world. It is definitely not for the faint of heart, however.
Once one has decided to use a keycard, the next order of business is to choose a specific device. That choice will be addressed in a followup article, where I will look at performance, physical design, and other considerations.
More from the testing and fuzzing microconference
A lot was discussed and presented in the three hours allotted to the Testing and Fuzzing microconference at this year's Linux Plumbers Conference (LPC), but some spilled out of that slot. We have already looked at some discussions on kernel testing that occurred both before and during the microconference. Much of the rest of the discussion will be summarized below. As it turns out, a discussion on the efforts by Intel to do continuous-integration (CI) testing of graphics hardware and drivers continued several hundred miles north the following week at the X.Org Developers Conference (XDC); that will be covered in a separate article.
Fuzzers
Two fuzzer developers, Dave Jones and Alexander Potapenko, discussed the fuzzers they work on and plans for the future. It was, in some sense, a continuation of the fuzzer discussion at last year's LPC.
Potapenko represented the syzkaller project, which is a coverage-guided fuzzer for the kernel. The project does more than simply fuzzing though, as it includes code to generate programs that reproduce crashes it finds as well as scripts to set up machines and send email for failures. It "plays well" with the Kernel Address Sanitizer (KASAN), runs on 32 and 64-bit x86 and ARM systems, and has support for Android devices, he said.
Jones noted that his Trinity project is a system-call fuzzer for Linux that is "dumber than syzkaller". It does not use coverage to guide its operation; in some ways it is "amazing that it still finds new bugs." Over the last year, logging over UDP has been added, as has support for the MIPS architecture ("someone got excited"). There have been lots of contributions from others in the community over the last year, he said.
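Running Trinity is deliberately simple; the flags in this sketch exist in recent releases, but defaults and exact spellings should be checked against the built-in help:

```shell
# Fuzz a single system call with 16 child processes, stopping after
# one million calls.
./trinity -c splice -C 16 -N 1000000
```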
Sasha Levin, one of the microconference leads, asked both developers what the next big feature for their fuzzers would be. Potapenko said that tracking the origin of values that are used in comparisons in functions is being worked on. The idea is to allow syzkaller to find ways to further the coverage by reversing the sense of the comparisons to take new paths.
For Trinity, Jones plans to explore BPF programs more. He wants to feed in "mangled BPF programs" to see what happens. There is limited support in Trinity for BPF fuzzing currently; it has only found two bugs, he said. Steven Rostedt suggested stressing the BPF verifier, which will require something more than simply random programs. Jones said that Trinity uses Markov chains to create the programs, but that it is "still a little too dumb".
Rostedt asked about the reproducer programs for problems that the fuzzers find; he wondered if those should be sent to the maintainers to be added to the tests they run. Or perhaps they should get added to the kernel self-tests, he said. Greg Kroah-Hartman agreed that the programs could be useful, but some of them cause a "splat" in the logs, which might make them difficult to integrate into the failure checking of the self-tests or the Linux test project.
Something that is missing, Jones said, is for the fuzzers to be run regularly. If that is not done, various problems will sneak through and end up in kernel releases. "We still see really dumb stuff", like not checking for null pointers, ending up in the mainline. Those kinds of bugs "should not hit the tree", he said, but should be caught far earlier. Running Trinity and syzkaller on the linux-next tree could be done, but it is difficult to run the kernel using that tree. That tree is "not really testable", Daniel Vetter said, because it tends to be broken fairly often. The Intel CI system for graphics can use the linux-next tree, but only because there are a bunch of "fixup patches" that get applied.
Levin asked about getting distributions involved in fuzzing. He wondered if there were ways to make it easier for distribution kernel maintainers to run the fuzzers on their kernels. He suggested a disk image that could be run in a virtual machine (VM); that would help getting more people running the fuzzers, he said. Potapenko said that there is infrastructure available to set up a few VMs to run syzkaller, but that at least two physical machines are needed. The fuzzer causes crashes, so it is best to have a separate master machine that supplies parameters to the workers.
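A minimal setup along those lines might look like the following; the configuration keys follow syzkaller's documented JSON format, but the paths and values are illustrative and should be checked against the project's documentation:

```shell
# syz-manager boots worker VMs from the given image, feeds them fuzzing
# work, and collects crashes; my.cfg is a JSON file along these lines:
#   { "target": "linux/amd64",
#     "workdir": "./workdir",
#     "kernel_obj": "./linux",
#     "image": "./stretch.img",
#     "sshkey": "./stretch.id_rsa",
#     "syzkaller": ".",
#     "type": "qemu",
#     "vm": { "count": 4, "cpu": 2, "mem": 2048,
#             "kernel": "./linux/arch/x86/boot/bzImage" } }
./bin/syz-manager -config my.cfg
```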
With a grin Jones said that he "got out of the distro building game" and was not planning to get back in. Trinity is currently run as part of the Fedora test suite, but it is somewhat destructive so it is the last thing that gets run. There are going to be Fedora kernel test days for each Linux release, he said; ISO files are generated for those tests. He had not thought about adding fuzzers into that image, but it would be good to do so.
At the end, Levin asked how to get more people and companies working on fuzzers. Potapenko said that it is simple to contribute to syzkaller and he would like to see more subsystem maintainers help with the code to exercise their system calls. Jones said there is plenty of low-hanging fruit for things to be added to Trinity; as an experiment, he did not add support for certain system calls to see if someone else would, but so far that has not happened.
ktest
The ktest tool that has been in the kernel source tree for some time now was the subject of the next, fairly brief talk. Steven Rostedt, who wrote the Perl script, wanted to get information about it into more hands; he is often surprised how many people have not heard of it. It is meant to automate testing of kernels, but it does a fair amount more than that. Rostedt said that these days he rarely uses the make command for kernel builds and installs; he uses ktest to handle all of that for him.
One of the main ways he uses it is to check a patch series from a developer in the subsystems he maintains. He wants to ensure that the series is bisectable, among other things. Before developing ktest, he would apply a patch series, one patch at a time; he would then build, boot, and test the kernels built. That was a time-consuming process.
His test setup consists of systems with a remote power switch capability, as well as a means to read the output of the boot. That can all be controlled by ktest to build, install, boot, and run tests remotely. His test suite takes 13 hours to run on a single system as it uses multiple kernel configurations. Once he could do that, he started adding more features to ktest, including bisection, reverse bisection, configuration bisection, and more.
Dhaval Giani, the other microconference lead, noted that he has found ktest to be a good test harness. He uses it to run fuzzers on various test systems and configurations. Rostedt concluded his talk by saying that he mostly just wanted more people to be aware of the tool: "ktest exists", he said with a chuckle.
Kernel unit testing
Knut Omang wanted to look at ways to add more unit testing to the kernel. He has created the Kernel Test Framework (KTF) to that end. It is a kernel module that adds some basic functionality for unit testing; he would like to see the kernel have the same unit-testing capabilities that user space has. He has integrated KTF with Google Test (gtest) as the user-space side of the framework. It communicates with the kernel using netlink to query for available tests and then to selectively run one or more tests and collect their output.
Omang wants developers to "get hooked on testing". So he tried to come up with a test suite that developers will want to use. Testing costs less the closer it is done to the developers writing the code. He is an advocate of test-driven development (TDD), but acknowledged that it is not universally popular. The basic idea behind TDD is to write tests before writing the code, but there are a number of arguments that opponents make about it. Among the complaints are that writing good tests takes a lot of time, developers do not think of themselves as testers, and that writing test code is boring.
Behan Webster said that good testers have a different mindset than developers; testers are trying to break things, while developers are trying to make something work. Rostedt added that it is better if a developer doesn't write the tests because they have too much knowledge of how the code is supposed to work, so they will overlook things. Another attendee pointed out that "if you don't have tests, you don't have working code"; it may seem to be working, but it will break at some point.
There was also some discussion about how to do unit testing for components like drivers. A lot of code infrastructure is needed before anything in a driver works at all, which limits the testing that can be done earlier. Omang and others believe that the problem can be decomposed into smaller pieces that can be individually and separately tested, though some in the audience seemed skeptical of that approach.
Kernel memory sanitizer
Finding places where uninitialized memory is used is a potent way to find bugs; finding those places is what the KernelMemorySanitizer (KMSAN) aims to do. Alexander Potapenko described the tool, which has found a lot of bugs in both upstream kernels and in the downstream Google kernels. It stems from the user-space MemorySanitizer tool that came about in 2012.
The idea is to detect the use of uninitialized values in memory, not simply uninitialized variables. Those values could be used in a branch or as an index, dereferenced, copied to user space, or written to a hardware device. KMSAN has found 13 bugs so far, though he thought another might have been found earlier in the day of the microconference (September 15). KMSAN requires building the kernel with Clang.
To track the state of kernel memory, KMSAN uses shadow memory that is the same size as the memory used by the kernel. That allows KMSAN to track initialization of memory at the bit level (i.e. it can detect that a single bit has been used but not initialized). KASAN uses a similar technique, but tracks memory at the byte level, so its shadow memory uses one byte for each eight bytes of kernel memory.
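The bit-granular shadow idea can be illustrated with a toy model; this is a sketch only (the real KMSAN instruments compiled code via Clang, and the class and method names below are invented for illustration):

```python
# Toy model of bit-granular shadow memory, in the spirit of KMSAN.
# Each byte of tracked memory has a shadow byte of the same size; a set
# shadow bit means "this bit was never initialized". Writes clear the
# shadow bits they cover, and reads complain if any requested bit is
# still marked uninitialized.

class ShadowMemory:
    """Tracks, per bit, whether memory has been initialized."""

    def __init__(self, size_bytes):
        self.data = bytearray(size_bytes)
        # Shadow is the same size as the memory it tracks.
        self.shadow = bytearray(b"\xff" * size_bytes)

    def write(self, addr, value, mask=0xff):
        """Store the bits selected by mask; mark only those initialized."""
        self.data[addr] = (self.data[addr] & ~mask & 0xff) | (value & mask)
        self.shadow[addr] &= ~mask & 0xff

    def read(self, addr, mask=0xff):
        """Report any use of bits that were never written."""
        uninit = self.shadow[addr] & mask
        if uninit:
            raise RuntimeError(
                f"use of uninitialized bits {uninit:#04x} at {addr:#x}")
        return self.data[addr] & mask

mem = ShadowMemory(16)
mem.write(0, 0b0000_0001, mask=0b0000_0001)    # initialize only bit 0
print(mem.read(0, mask=0b0000_0001))           # 1: bit 0 was written
try:
    mem.read(0, mask=0b0000_0010)              # bit 1 never initialized
except RuntimeError as e:
    print("caught:", e)
```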
KMSAN obviously requires a lot of memory, so Levin wondered if there could be options that used less. Potapenko said that doing so creates a lot of false positives, so it is not worth it. He also noted that kmemcheck has found five bugs in the last five years, but that KMSAN runs 5-10 times faster so it finds more bugs.
In the future, KMSAN could be used for taint analysis by using the shadow mapping to track data coming from untrusted sources. It could also help fuzzers determine which function arguments are more useful to change and to track the origin of those values back to places where they enter the kernel. He pointed to CVE-2017-1000380, which was found by KMSAN, and wondered if there is a way to kill off all of the uninitialized-memory bugs of that sort. Simply replacing calls to kmalloc() with kzalloc() may be tempting, but could be problematic.
KMSAN requires patches to Clang (and the ability to build the kernel with Clang). He hopes to see KMSAN added to the upstream kernel by the end of the year.
Conclusion
The kernel testing story is clearly getting better. There is still plenty to do, of course, but more varied and larger quantities of testing are being done—much of it automatically. That is finding more bugs; with luck, it may eventually outrun the kernel's development pace so that it finds more bugs than are being added every day. Kernel development proceeds apace; it is important that testing gets out ahead of it and stays there.
[I would like to thank LWN's travel sponsor, The Linux Foundation, for assistance in traveling to Los Angeles for LPC.]
Improvements in the block layer
Jens Axboe is the maintainer of the block layer of the kernel. In this capacity, he spoke at Kernel Recipes 2017 on what's new in the storage world for Linux, with a particular focus on the new block-multiqueue subsystem: the degree to which it's been adopted, a number of optimizations that have recently been made, and a bit of speculation about how it will further improve in the future.
Back in 2011, Intel published a Linux driver for NVM Express (or NVMe, where NVM stands for non-volatile memory), its new interface for accessing solid-state storage devices (SSDs) over PCI Express. This driver was incorporated into the mainline kernel in 2012, first appearing in 3.3. It allowed new, fast SSD devices to be run at speed, but that gave no improvement if the block subsystem continued to treat them as pedestrian hard drives. So a new, scalable block layer known as blk-mq (for block multiqueue) was developed to take better advantage of these fast devices; it was merged for 3.13 in 2014. It was introduced with the understanding that all of the old drivers would be ported to blk-mq over time; this work continues, even though most of the mainstream block storage devices have by now been successfully ported. Axboe's first focus was a status update on this process.
Some old, outstanding flash drivers have recently been converted, as has NBD, the network block device driver. The scsi-mq mechanism, which allows SCSI drivers to use blk-mq, has been in the kernel as an option for some time, but recently Axboe made it the default. This change had to be backed out because of some performance and scalability issues that arose for rotating storage. Axboe feels those issues have now been pretty much dealt with, and hopes that scsi-mq will be back to being the default soon.
The old cciss drivers have now moved to SCSI, which will please anyone who's had to work with HP Smart Array controllers; it also pleases Axboe, since that means that cciss will be implicitly converted to blk-mq when scsi-mq becomes the default. That leaves about 15 drivers that have to be converted; work will continue until the job is finished, a state Axboe speculated (to some amusement) will be reached with the conversion of floppy.c — for which he is willing to award a small prize.
The main new features added recently relate to I/O scheduling, because its absence was one of the main sources of resistance to converting drivers to the new multiqueue framework. Various design decisions that were made earlier for blk-mq have made this less easy than it might have been but, despite this, blk-mq-sched was added in 4.11, with (initially) two scheduling disciplines: none and mq-deadline. The former makes no change to the default behavior, which Axboe describes as "roughly FIFO", and the latter is a re-implementation of the old deadline scheduler. In 4.12 two more were added: Budget Fair Queuing (BFQ), a scheduler based on CFQ that had been around for years but had never been integrated into the kernel, and Kyber, a fully multiqueue-aware scheduler that supports such things as reader-versus-writer fairness.
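For reference, the scheduler for a blk-mq device is selected through sysfs: reading /sys/block/&lt;dev&gt;/queue/scheduler yields a line such as "[none] mq-deadline kyber bfq", with the active scheduler shown in brackets, and writing a scheduler name back to the same file (as root) selects it. A small helper to parse that format (the helper function and the sample string below are illustrative):

```python
# Parse the sysfs scheduler file format used by the block layer, where
# the currently active scheduler appears in square brackets, e.g.
#   "[none] mq-deadline kyber bfq"

def parse_scheduler_line(line):
    """Return (active_scheduler, all_schedulers) from a sysfs line."""
    active = None
    names = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]       # strip the brackets
            active = name
        names.append(name)
    return active, names

active, available = parse_scheduler_line("[none] mq-deadline kyber bfq")
print(active)      # none
print(available)   # ['none', 'mq-deadline', 'kyber', 'bfq']
```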
Another new feature is writeback throttling. This is an attempt to better deal with the kernel's periodic desire to write out dirty pages to disk (or whatever flavor of storage backs them), which currently causes heavy load on the I/O backend that Axboe compared to Homer Simpson consuming a continuous but non-uniform stream of donuts. With writeback throttling, blk-mq attempts to get maximum performance without excessive I/O latency using a strategy borrowed from the CoDel network scheduler. CoDel tracks the observed minimum latency of network packets and, if that exceeds a threshold value, it starts dropping packets. Dropping writes is frowned upon in the I/O subsystem, but a similar strategy is followed in that the kernel monitors the minimum latency of both reads and writes and, if that exceeds a threshold value, it starts to turn down the amount of background writeback that's being done. This behavior was added in 4.10; Axboe said that pretty good results have been seen.
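The CoDel-style feedback loop can be sketched as follows; the window size, target latency, and scaling steps here are invented for illustration and are not the kernel's actual parameters:

```python
# A loose sketch of the CoDel-inspired idea behind writeback throttling:
# track the minimum I/O latency over a sampling window and, when even
# the *minimum* exceeds a target, shrink the number of in-flight
# background writeback requests; when latency is healthy, allow more.

class WritebackThrottle:
    def __init__(self, target_latency_ms=10.0, max_depth=64):
        self.target = target_latency_ms
        self.max_depth = max_depth
        self.depth = max_depth      # allowed in-flight writeback I/Os
        self.window = []

    def record_latency(self, latency_ms):
        self.window.append(latency_ms)
        if len(self.window) < 8:    # sample a small window first
            return
        if min(self.window) > self.target:
            # Even the best-case latency is too high: back off hard.
            self.depth = max(1, self.depth // 2)
        else:
            # Latency is fine: cautiously allow more writeback.
            self.depth = min(self.max_depth, self.depth + 1)
        self.window.clear()

t = WritebackThrottle()
for _ in range(8):
    t.record_latency(50.0)          # sustained high latency
print(t.depth)                      # 32: halved from 64
for _ in range(8):
    t.record_latency(1.0)           # latency recovered
print(t.depth)                      # 33: creeping back up
```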
The work came out of complaints from Facebook developers who saw high latencies when images were being updated. To quantify this, Axboe created his writeback challenge: two RPMs that can be installed on a system, one of which creates many small files, the other of which creates a small number of large files, the idea being that you monitor the performance of your services while installing these RPMs. In the context of a small but predictable service (a test application called io.go) Axboe saw (on both SSD and rotating storage) maximum request times that decreased by around a factor of ten with writeback throttling; worst-case on rotating storage with throttling was an I/O request that took 478ms to complete, while without throttling the worst case took 6.5 seconds.
Yet another improvement is I/O polling. Traditionally, when an application wishes to perform I/O, execution passes into kernel space and down to the device driver; then, while it's waiting for that request to complete, it either gets on with something else or goes to sleep. Completion is made known to it by an interrupt; the receipt of this completion notification wakes the application to get on with its job. During this sleep phase, on a non-overloaded system, the CPU on which the sleeping application was running is now likely to go to sleep itself, and the overhead of going to sleep then waking up becomes a significant part of the completion time. So, as a strategy to avoid this overhead, the kernel code may engage in continuous polling: it passes the I/O request to the hardware then immediately starts repeatedly asking it whether it's done yet — a behavior that will be familiar to any parent of small children on car journeys. This minimizes the sleeping and waking-up overhead, but wastes CPU in the process of polling; there are also power implications to such behavior. A middle ground is desirable.
Axboe and others working on this subsystem came up with a solution called hybrid polling, which relies on the fact that fast devices tend also to be deterministic. With hybrid polling, the kernel tracks the completion times of I/O requests as a function of their size. Then when the kernel sends any given I/O request down to the hardware it sets a timer for half the mean completion time of comparably-sized requests. That will wake the application (running within the kernel) while it's likely that the request has not yet completed and switch to continuous polling behavior so that the completion of the request is detected as promptly as possible. Thus, hopefully about half of the cost of continuous polling is avoided, while hopefully most of the cost of waking up is paid before the actual I/O request has been completed, so that latency is not increased.
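The timing strategy can be sketched as follows; the "device" here is just a deadline on the clock, and the numbers are illustrative (the real implementation lives in the kernel's block layer):

```python
import time

# A sketch of hybrid polling: sleep for half of the mean completion
# time observed for similar requests, then busy-poll until the request
# completes. This trades roughly half of the polling CPU cost for, at
# most, a short additional wait.

def hybrid_wait(completes_at, mean_completion_s):
    """Wait for a request expected to finish at time `completes_at`."""
    time.sleep(mean_completion_s / 2)        # first half: cheap sleep
    polls = 0
    while time.monotonic() < completes_at:   # second half: busy-poll
        polls += 1
    return polls

mean = 0.05                                  # assume ~50ms completions
start = time.monotonic()
polls = hybrid_wait(start + mean, mean)
elapsed = time.monotonic() - start
print(elapsed >= mean)                       # True: waited the full time
```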
In Axboe's tests, this strategy produced latencies that were indistinguishable from continuous polling. There are new system calls, pwritev2() and preadv2(), for those who wish to enable this behavior now (certain flags must be set). There are also associated sysfs controls: io_poll, which enables or disables the behavior, and io_poll_delay, which defaults to -1, meaning no polling. If the latter is set to zero, hybrid polling is used as described. If it's set to a positive value, a specific delay latency (in microseconds) is set. Enthusiastic knob-twiddlers should be aware that Axboe's tests show that it's hard to beat the hybrid strategy, and easy to do badly: not just worse than hybrid, but worse even than traditional interrupt-based performance.
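Python wraps these system calls as os.preadv() and os.pwritev(), so the flags-based interface can be exercised from a script. In this sketch the flags argument is left at zero, because os.RWF_HIPRI (the Linux polled-I/O flag) only takes effect with O_DIRECT on a device whose queue has polling enabled; on an ordinary file, the call below behaves like plain positioned scatter-gather I/O:

```python
import os
import tempfile

# Demonstrate the preadv2() interface via os.preadv(), which takes a
# flags argument. os.RWF_HIPRI requests polled I/O on supporting
# devices; here flags=0, so this is ordinary scatter-gather reading.

fd, path = tempfile.mkstemp()
try:
    os.pwrite(fd, b"hello, block layer", 0)
    buf1, buf2 = bytearray(6), bytearray(12)
    n = os.preadv(fd, [buf1, buf2], 0, 0)    # offset 0, no flags
    print(n)                                  # 18 bytes read in total
    print(bytes(buf1))                        # b'hello,'
    print(bytes(buf2))                        # b' block layer'
finally:
    os.close(fd)
    os.unlink(path)
```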
Improved direct I/O (i.e. O_DIRECT) handling, which treats large and small requests differently, and corresponding improvements in fs/iomap.c, have shaved a further 6% from I/O times. This improvement is in 4.10. The I/O accounting subsystem was observed to be using 1-2% of CPU, which, for a subsystem that just tracks performance, is a high overhead. Changes merged in 4.14, which were easy as a result of the design of blk-mq, have noticeably reduced that cost.
Support has also been added for write lifetime hints, which is a feature introduced in hardware in NVMe 1.3. This allows the flash device controller to be given knowledge of the expected lifetime of data that's been queued for writing. Flash devices group writes into structures known as erase blocks, which can be multi-gigabyte sized in modern devices. If a write is later invalidated by an overwrite sent down from the application, the erase block it was in has to be copied to a new, modified one internally, and this is expensive. If a device controller knows the expected lifetime of data, then it can improve its own performance by constructing erase blocks of writes that are all expected to have comparable lifetimes and thus might all be invalidated together. Current kernel support allows lifetime hints of short, medium, long, and extreme to be provided, though these quantities don't have absolute values as it is acknowledged that they will differ from application to application. Nevertheless, with these changes, reductions of 25-30% in physical writes to the storage device have been achieved in the context of database workloads, which Axboe rightly describes as "huge", and are accompanied by corresponding improvements in latency.
Axboe concluded his talk by showing his list of desirable improvements from 2015, and noting (to applause) that every single one had been achieved. His list for 2017 is therefore much shorter: I/O determinism and efficiency improvements. The former is a way to guarantee I/O latency for a given application and thus avoid the "noisy neighbor" problem, where two applications use the same back-end storage and one's I/O unduly reduces the other's performance. The other is a safe bet because it's a wide umbrella; history suggests he'll find something to put beneath it next year.
Unless you're running a computer that never remembers anything it does, you, personally, have an interest in the I/O subsystem. This sort of news, then, is good news for all of us.
[We would like to thank LWN's travel sponsor, The Linux Foundation, for assistance with travel funding for Kernel Recipes.]
The NumWorks graphing calculator
As the Internet of Things (IoT) becomes ever more populous, there is no shortage of people warning us that the continual infusion into our lives of hard-to-patch proprietary devices running hard-to-maintain proprietary code is a bit of a problem. It is an act of faith for some, myself included, that open devices running free software (whether IoT devices or not) are easier to maintain than proprietary, closed ones. So it's always of interest when freedom (or something close to it) makes its way into a class of devices that were not previously so blessed.
In this case, the device is the humble scientific calculator. Many people now use their smartphones when they need to do sums, but others still find a calculator a useful thing to have at hand. Recently, NumWorks, a new scientific graphing calculator with an open-design ethos was released. Although it is far from fully free at this point, it is a major step forward from the user-hostile position most calculator manufacturers have taken, and it is interesting to see to what extent it fulfills its promise.
NumWorks was founded about two years ago. Romain Goyet, the CEO, comes from a software engineering background. He was looking to start his next company when he wandered into a supermarket and saw his old college calculator on the shelf — unimproved in over a decade and more expensive than it used to be. Recalling how he'd had to spend hours poring over a manual to persuade his old calculator to do anything useful, and perceiving calculators in general to be user-hostile, unintuitive, and stuck in a rut, he decided to make a calculator that was easier to use and, crucially, easier to improve. The NumWorks device (currently the N100) and Epsilon, the OS that the N100 runs, are the result.
My current calculator, the device against which I'm judging the NumWorks, is a TI-84 Plus. As an industrial product, the NumWorks compares well. It's about the same width, an inch shorter, about half the thickness and slightly over half the mass of the TI and, at €80 or $100, competitively priced. The NumWorks's screen is much better than the TI's, offering color, a lot more pixels (320×240 versus 96×64), and backlighting. The general design feel is much more video game than brick, though the gold-on-white print used for the alphabet functions on the keys is almost illegible in poor light (Goyet says NumWorks may darken the ink used, but that it wanted to avoid labels relating to advanced features crowding out the basic function set). π, the exponent function (×10^x), and Ans (which recalls the result of the previous calculation) are all on unshifted keys, which does make the workflow simpler than the TI's.
It's a pretty intuitive device to use. Basic arithmetical operations are all on the main keys. Less basic operations (trigonometric functions, exponentials, and complex numbers) are on the smaller keys above. Sophisticated operations (integrals, sums and products, permutations and combinations, and the like) are accessed through the "toolbox" key, which I took to be a print key for some hours. Other modes (such as graphing and linear regression) are accessed via the high-level navigation controls at the top of the keypad. I was able to get up and running with graphs within two hours, something I still can't do reliably on my TI-84 Plus after two years. That said, it's really handy that the NumWorks is so intuitive, because the online manual is so short as to be virtually useless.
On the openness front, the device does fairly well. The STL files that describe the calculator's plastic parts are all available on GitHub, as is the Epsilon operating system. Epsilon is flashed onto the device via the USB Device Firmware Upgrade protocol, which is a well-established and well-supported method of flashing new OSes.
That said, all of the GitHub content is available only under Creative Commons (CC) Attribution-NonCommercial-NoDerivatives (BY-NC-ND), a license even more restrictive than CC Attribution-NonCommercial (BY-NC, which itself is not regarded as free). Asked about this, Goyet ascribes the choice of license to NumWorks's fear of clone manufacturers undercutting it, but recognizes that the license is a disincentive for both contributors and enthusiasts. NumWorks requires a CLA from contributors that includes a broad patent grant and permits relicensing; even if commercial considerations require the NC clause for some time, the ND restriction is likely to be removed in the near future. Going to a CC BY-NC model would at least permit the community to continue to improve the device if the company were to fold or take a new direction. While I wish the device were fully free, it is worth contrasting how TI has tried to impede the user community's efforts to run custom software on their TI-84 Plus calculators; NumWorks does seem to be trying to be as free as it thinks it can be.
When it comes to mathematics, the NumWorks didn't come off so well. It supports complex numbers, but when I asked it to calculate Euler's identity it told me the answer was on the order of 10⁻⁸i, instead of zero. Asked for the calculator standard of 69!, it told me the answer was infinite; experimentation revealed that the largest factorial it could manage was 34!, and that results over about 3.4×10³⁸ were out of range.
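That cutoff is consistent with results being stored in IEEE-754 single precision, whose largest finite value is about 3.4028×10³⁸; attributing the limit to 32-bit floats is my inference from the observed behavior, not anything NumWorks documents, but the arithmetic lines up:

```python
import math

# The largest finite IEEE-754 single-precision value is
# (2 - 2**-23) * 2**127, about 3.4028235e38. 34! just fits below it,
# while 35! overflows — matching the calculator's observed behavior.

FLT_MAX = (2 - 2**-23) * 2**127
print(f"{FLT_MAX:.4e}")                   # 3.4028e+38
print(math.factorial(34) < FLT_MAX)       # True: 34! ≈ 2.95e38 fits
print(math.factorial(35) > FLT_MAX)       # True: 35! ≈ 1.03e40 overflows
```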
However, this was with the installed version of Epsilon, which on my device was 1.0.3; a look at the GitHub repository revealed that the current version was 1.1.2. The instructions for building the software are pretty simple; git clone downloads a copy, and repeated cycles of make clean && make (to find out what I was missing) and dnf whatprovides (or your distribution's equivalent to find out what package you need to provide a missing command, library, or cross-compiler) allowed me to compile my first new calculator OS in about half an hour. make app_flash turns out to require privilege under Fedora 26, but once I'd installed dfu-util and used sudo to become root I was prompted to connect the calculator via USB and press its (recessed) reset button. Doing so started the flashing, which completed in about 70 seconds, after which the calculator rebooted automatically and presented me with the new OS.
Version 1.1.2 did quite a lot better than its predecessor. Euler's identity evaluated to zero; 69! evaluated correctly, as (to my surprise) did 70! — the new OS copes with numbers considerably larger than 10¹⁰⁰. Integration still had problems; asked to integrate the normal distribution function from -5 to +5, it incorrectly told me the result was undefined. After I logged this as a bug, a community member had diagnosed the underlying problem within a day, and provided a patch within another day. The patch has since been pulled, and I've downloaded, built, and flashed the new OS. Not only does the integral now evaluate correctly, but it does so in a time too short for me to measure; my TI-84 Plus takes over three seconds. For those who like some programmability in their calculators, a pre-release, restricted-functionality version of Python is now provided, though code entry on the NumWorks's keyboard is fairly painful, and only one program can be stored.
There are other issues: when answers are carried forward to new calculations, no more numerical accuracy is retained than is displayed on the screen, so taking the square root of 2 then squaring the result yields 2.000001. There are a number of other small grumbles I could mention, but the response to my first bug has been sufficiently positive that I'm prepared to assume that they'll get fixed in the fullness of time. There are also complaints about the Linux support; while Windows and macOS users can automatically flash the latest OS through their browsers, Linux users must compile and flash their own. Goyet is sympathetic to the idea that providing links to the binary images would let Linux users flash their NumWorks devices using dfu-util, without having to download all the tools needed to build their own images. It also would not require NumWorks to try to make the in-browser support work on all the browsers that people use on their many Linux distributions; so Linux support may get better soon. For readers who want to get up and running now, the toolchain isn't all that painful to assemble.
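The square-root round trip described above is exactly what carrying forward only the displayed digits predicts; assuming the screen shows roughly seven significant figures:

```python
import math

# If only the displayed digits of an answer are carried into the next
# calculation, squaring the displayed square root of 2 reproduces the
# calculator's 2.000001 result. The seven-significant-figure display
# width is an assumption consistent with the observed error.

shown = math.sqrt(2)                 # 1.4142135623730951 internally
shown = round(shown, 6)              # 1.414214, as displayed
result = shown ** 2
print(round(result, 6))              # 2.000001, matching the calculator
```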
NumWorks itself seems open to feedback. To get to where it is, it has had to make some choices, such as the illegible color scheme and the odd license, but Goyet asserts that it's open to reviewing those choices in the face of community pressure. It has email and a Reddit forum, and it seems to monitor the GitHub issue queues closely.
In summary, NumWorks is an elegant and promising device. I would certainly buy one for any sixth-form (i.e., high school) student or undergraduate of my acquaintance, though I'd make sure it was running the latest Epsilon before I gave it to them. I'd buy one for myself (in fact, I have). Yes, it has imperfections, but it also has mechanisms for dealing with them. Maybe it's a little more work than just buying a calculator from the supermarket, but let's face it, if we ask for our devices to be open so that we can understand and fix them ourselves, we perhaps should not be surprised when our relationship to them isn't that of consumer to product, but one of engineer to tool.
Catching up with RawTherapee 5.x
Free-software raw photo editor RawTherapee released a major new revision earlier this year, followed by a string of incremental updates. The 5.x series, released at a rapid pace, marks a significant improvement in RawTherapee's development tempo — the project's preceding update had landed in 2014. Regardless of the speed of the releases themselves, however, the improved RawTherapee offers users a lot of added functionality and may shake up the raw-photo-processing workflow for many photographers.
It has been quite some time since we last examined the program during the run-up to the 3.0 series in 2010. In the intervening years, the scope of the project has grown considerably: macOS is now supported in addition to Windows and various flavors of Linux, and the application has seen substantial additions to the tool set it provides.
The competitive landscape that RawTherapee inhabits has also changed; 2010-era competitors Rawstudio and UFRaw are not seeing much active development these days (not to mention the death of proprietary competitors like Apple's Aperture), while darktable has amassed a significant following — particularly among photographers interested in a rich set of effects and retouching tools. At the other end of the spectrum, raw-file support improved in the "consumer" desktop photo-management tools (such as Shotwell) in the same time period, thus offering casual users some options with a less intimidating learning curve than darktable's. Where RawTherapee sits amid all of the current offerings can be a bit hard to define.
The 5.0 release landed on January 22, 5.1 then arrived on May 15, and 5.2 was unleashed (in the words of the announcement) on July 23. The project also migrated its source-code repository and issue tracking to GitHub, launched a new discussion forum, and has assembled a wiki-style documentation site called RawPedia.
Core functionality
There are several new features that serious photographers might consider "infrastructural" (which is to say that they were sorely missed in the 4.x series). Chief among these is color management; this ensures that pixel data is correctly transformed for display and output devices. A properly calibrated setup is required, but RawTherapee now supports monitor profiles, rendering intent, and even "soft proofing" to simulate print output on screen.
In the same vein are some user-interface and workflow improvements. For instance, image-sample points in the navigator window can show pixel values in absolute terms, as percentages, or in a [0,1] range. Which option is best will vary from situation to situation; having the ability to choose is the real benefit. Another nicety is that users must now hold down the shift key in order to change any settings with the mouse wheel. This prevents accidentally trashing one's fine-tuned slider settings when trying to simply scroll through the list of available image operations. Accidents of that nature are all too common with some of RawTherapee's competitors and among free-software graphics applications in general. Dragging points in the curves editor now has modifier-key support: holding down Shift lets the user snap curve points to useful positions like the 45-degree diagonal, while holding down Control enters fine-granularity mode, letting the user slowly adjust curve values.
Also of note are improvements to the noise-reduction tool set. Chroma noise (that is, color noise) can now be reduced automatically, while there are new controls available to manually reduce luminance (that is, brightness) noise. The accepted wisdom is that the human eye is more sensitive to small differences in luminance, so it is arguably better to leave that feature in the hands of the user. Chroma noise can be erased more safely, with less chance that the result will offend the eye.
New tools and features
There are two brand-new image-editing tools worth discussing as well: the Retinex and Wavelet tools.
Retinex attempts to restore color accuracy in situations where digital sensors perform poorly, using an algorithm that mimics how the human brain interprets color in similar situations. For example, in dim lighting, the brain uses what it has already observed about the colors of objects in a scene to "fill in the gaps" — hence, your eye still interprets a banana as yellow, even when it is seen in the strong blue tones of moonlight. This neurological behavior is called color constancy, and it also allows the brain to "see through" haze and fog. The downside is that it can also confuse the brain, as observable in the optical illusions on Wikipedia's color-constancy page (and as was made famous by the dress in 2015).
RawTherapee's Retinex tool is, at this stage, a complex beast. There are four basic presets: "Low" to improve the appearance of darker image areas, "High" to improve the appearance of brighter areas, "Uniform" to attempt to balance the image for the midtones, and "Highlight," which provides an alternate take on the "High" preset that is tailored for improving ultra-bright image areas like reflections. But the tool also has eleven other sliders, four other optional parameters, and up to four transformation maps to worry about. In my experiments, I found it easy to brighten up some overly dim images but next to impossible to remove any haze from foggy images. It did not help that the only documentation of the tool in English is a rough translation from a French wiki page. Regardless, one hopes that this tool will see some more refinement in future releases to make it into something genuinely usable.
The Wavelets tool is simpler to understand (anyone who is still unconvinced that the Retinex tool is overly complicated in its current form should let that sink in for a moment). Wavelet transformations split an image up into multiple frequency levels; each level, in essence, isolates the image features of a given approximate size (as measured in pixels). This can make it easier to clean up artifacts that appear at one detail level without corrupting other detail levels. Pat David has written a detailed tutorial on using wavelets to touch up skin imperfections without smearing a portrait photograph — as methods like the traditional "clone" tool can do.
RawTherapee's Wavelet tool lets you activate a wavelet decomposition of the current image and adjust the contrast, sharpness, denoising, and several other features of every wavelet detail level separately. Functionally, it works just as if the wavelet levels were built-in layers in the image. In contrast, an editor like the GIMP requires the user to split the image up into separate wavelet detail layers, then recombine them once editing is complete; RawTherapee's approach is far more intuitive.
Although it is not an editing tool, a special mention also goes to RawTherapee's new dual-illuminant DCP (DNG camera profile) support. This, too, is an entirely new feature that has only recently landed in any photo editor. A dual-illuminant camera profile is essentially just a pair of color-management profiles, each corresponding to one lighting source in a scene with two sources. For example, a room with a fluorescent light in one corner and sunlight beaming in from the other side has two considerably different colors of light. With one profile available for each light source, the editing application can interpolate the color values between the endpoints; it chooses how to interpolate by measuring the apparent color temperature of a white point in the image being edited. The technique can save a lot of time when working on large image sets because the same profile can be used for every shot: as the subject moves around the room, each shot is automatically tuned depending on whether more light is coming from the window or the lamp.
Making use of this feature in RawTherapee requires the photographer to shoot a pair of reference images (one for each light source) and combine them into a dual-illuminant profile using external software, so it is not trivial. But it is still nice to see the feature available.
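The interpolation step can be sketched briefly. Following the DNG convention, the blend weight is linear in inverse color temperature rather than in temperature itself; the matrices below are hypothetical placeholders for the ones a real dual-illuminant DCP file would supply.

```python
import numpy as np

# Hypothetical color matrices for the two calibration illuminants; in
# practice these come from the dual-illuminant DCP file itself.
MATRIX_STD_A = np.diag([1.20, 1.00, 0.75])   # ~2850 K (tungsten-like)
MATRIX_D65   = np.diag([0.85, 1.00, 1.30])   # ~6504 K (daylight)

def interpolate_profile(shot_cct, cct_low=2850.0, cct_high=6504.0):
    """Blend the two calibration matrices for a shot's measured white point.

    Per the DNG convention, weights are linear in inverse correlated
    color temperature, clamped to the calibrated range.
    """
    inv = 1.0 / np.clip(shot_cct, cct_low, cct_high)
    w = (inv - 1.0 / cct_high) / (1.0 / cct_low - 1.0 / cct_high)
    return w * MATRIX_STD_A + (1.0 - w) * MATRIX_D65

# A shot lit mostly by the window (higher CCT) leans toward the D65 matrix.
m = interpolate_profile(5500.0)
```

Shots measured at either calibration temperature simply get that illuminant's matrix; everything in between is a weighted mix, which is what makes the per-shot automatic tuning possible.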
Other changes
As usual, a host of other updates and improvements are implemented in the new release. There is a new lockable color picker, with which you can select sample positions in the image and measure how various settings and adjustments affect their color as you edit. There are two new modes for the curve tool: luminance and perceptual. Luminance mode weights the red, green, and blue curves by each channel's relative luminance, so adjusting the image's curve keeps the RGB values more or less synchronized. Perceptual mode attempts to preserve hue and saturation while letting the user freely skew the contrast and brightness.
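The luminance-weighting idea can be illustrated with a short sketch. This is a simplification using Rec. 709 weights, not RawTherapee's actual curve code: scaling all three channels of a pixel by the same curved-to-original luminance ratio keeps the R:G:B proportions fixed while brightness changes.

```python
import numpy as np

# Rec. 709 relative-luminance weights (an illustrative choice; the real
# tool works in its own color space).
LUMA = np.array([0.2126, 0.7152, 0.0722])

def apply_curve_by_luminance(rgb, curve):
    """Apply a tone curve through per-pixel luminance: every channel is
    scaled by the same curve(y)/y ratio, so the channel proportions
    stay synchronized while overall brightness follows the curve."""
    y = rgb @ LUMA                       # per-pixel relative luminance
    safe = np.maximum(y, 1e-12)          # avoid division by zero on black
    ratio = curve(safe) / safe
    return rgb * ratio[..., None]

# A simple brightening curve (gamma 0.8) applied to one greenish pixel:
pixel = np.array([[[0.2, 0.4, 0.1]]])
brightened = apply_curve_by_luminance(pixel, lambda y: y ** 0.8)
```

After the adjustment the pixel is brighter, but the red channel is still exactly half the green channel, which is the "keeps the RGB values more or less synchronized" behavior described above.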
Several existing tools gained nice new options. For example, the contrast-by-detail-levels tool can now be set to run before or after the image is converted to black and white. Obviously, that knob only makes a difference on an image where both the contrast-by-detail-levels tool and a convert-to-black-and-white operation are enabled, but changing the order of those operations can produce radically different output, and in prior releases, the user had no control. Along the same lines, the resize tool now has its own built-in "sharpen" option, because in the standard raw-editing pipeline, sharpening the image after other processing steps might make unwelcome changes to their results.
It is certainly fair to say that this hardcoding of the order of operations causes raw-photo editing to be an uncomfortably rigid process at times. But the tradition stems from some inescapable realities, like the fact that demosaicing and white balancing have to occur before everything else. Essentially all raw editors are bound by this same model; RawTherapee is actually doing better than most by offering a bit of flexibility in situations where the effect is significant.
Also on the menu in 5.0 and 5.1 are support for new camera models, support for 32-bit pixel depth TIFF images, and support for grayscale JPEG and TIFF files. 5.2 added a GIMP plugin to open raw files, as well as a remote option (triggered by calling rawtherapee with the -R switch) that opens additional files in the same running instance of RawTherapee. Finally, developers and packagers may be happy to hear that the project is now actively working on its GTK+3 branch.
Raw photo editors tend to be designed to fully take over a user's workflow, handling every part of the image-processing pipeline from the raw demosaicing to the finishing touches. As such, it can be hard to persuade a regular user to jump ship and try an alternative. But it is possible to switch back and forth. At a personal level, I have always found darktable's user interface to be more awkward than it is worth, and that project's emphasis tends to be on special-effects processing, rather than straightforward correction and retouching.
For anyone of a similar bent, RawTherapee 5 is a significant improvement over what had been a capable but bare-bones editor in years past. For anyone who is committed to darktable or has never used RawTherapee before, the benefits may not be a sufficient sales pitch. But the project is worth tracking nonetheless for some of its newer features. For everyone who simply does not care, suffice it to say that it is an encouraging sign that the RawTherapee team is now raising the bar on new technical features; all free-software image-editing projects benefit from that work, in the long term.
Page editor: Jonathan Corbet
