LWN.net Weekly Edition for January 14, 2010
When does a bug turn into a feature?
Sometimes bugs are in the eye of the beholder as a recent PHP bug report illustrates. That report also illustrates how quickly discussions in bug reports can spiral out of control, turning to anger and insults. There are some comical aspects to the thread, but the underlying issue, maintaining compatibility with existing bugs, is one that many projects struggle with.
A PHP user ("endosquid") reported that the number_format() function had changed behavior in PHP 5.3: when number_format("",0) is called, it no longer returns "0" and instead returns an empty string. Given that the first argument to the function is supposed to be a number, in particular a floating point number that is to be formatted based on the rest of the arguments, an empty string might seem like the right thing to return. On the other hand, all earlier versions of the function returned a string containing "0".
It turns out that part of the work that went into version 5.3 was to clean up the parameter parsing code in PHP, and to use one routine, zend_parse_parameters(), internally. As PHP creator Rasmus Lerdorf related in the thread: "Most of PHP was using this already, but there were still some stragglers like number_format()." Lerdorf also suggested casting the first argument to a float (i.e. number_format((float)"",0)) as a solution to the problem.
As one would guess, endosquid's application wasn't calling number_format() directly with an empty string, but was instead passing a variable that may or may not have been initialized. In general that is a bad programming practice, but it is quite common in PHP code where the language has often tried to "do the right thing" with uninitialized variables. But if the "right thing" changes, lots of code that relied on it can break.
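The cast suggested in the thread can be applied in one place rather than at every call site. What follows is a minimal sketch, not anything taken from the bug report: the wrapper name format_number_compat() and the sample variable are hypothetical, and the sketch relies only on the fact that PHP's (float) cast turns an empty or non-numeric string into 0.0.

```php
<?php
// Hypothetical wrapper (the name is illustrative, not from the thread).
// It applies the (float) cast before handing the value to number_format().
function format_number_compat($value, $decimals = 0)
{
    // (float) "" and (float) "a" both evaluate to 0.0, so the
    // formatted result for such input is "0".
    return number_format((float) $value, $decimals);
}

$total = "";  // stands in for a variable that never received a number

echo format_number_compat($total), "\n";       // prints 0
echo format_number_compat("1234.5", 1), "\n";  // prints 1,234.5
```

A wrapper like this keeps the old "0" result for empty input, at the cost of continuing to hide the fact that the variable was never set.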
The argument that endosquid makes about what number_format() should return is not entirely without merit. The function is supposed to return a formatted number, and the empty string is hardly that, so endosquid believes that it should return "0". But, as Lerdorf points out, what would one expect number_format("a",0) to return? The unfortunate answer is that pre-5.3 versions did return "0" in that case. So, in tightening up the PHP parameter parsing code, a substantial difference in the behavior of number_format() was introduced.
The documentation for number_format() is not terribly helpful as it doesn't address error conditions at all. It does specify that the first parameter is a float, but PHP will happily take strings like "9" or "3.14159" for that parameter, converting as needed. Given all that, programmers have to rely on what the language actually does, and since at least PHP 3, number_format() has always returned "0" when handed random strings.
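Code that does not want to depend on what the language happens to do with non-numeric strings can make the fallback explicit. The sketch below is one way to do that, under the assumption that a caller-chosen default is acceptable; format_or_default() and its parameters are hypothetical names, while is_numeric() and number_format() are standard PHP functions.

```php
<?php
// Hypothetical helper: numeric input is formatted, and anything else
// yields a default chosen by the caller rather than by conversion rules.
function format_or_default($value, $decimals = 0, $default = "0")
{
    if (!is_numeric($value)) {
        // Covers "", "a", and null alike.
        return $default;
    }
    return number_format((float) $value, $decimals);
}

echo format_or_default("9"), "\n";            // prints 9
echo format_or_default("3.14159", 2), "\n";   // prints 3.14
echo format_or_default(""), "\n";             // prints 0
echo format_or_default("a", 0, "n/a"), "\n";  // prints n/a
```

With the default spelled out in the application, a later change to the language's parameter parsing no longer alters what the application displays.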
It doesn't take long for the bug report thread to descend into flames. Evidently endosquid works in a tightly controlled environment that requires a raft of paperwork to accompany code changes, but that still doesn't justify a claim of "MONTHS [of] fixing code for no real benefit". It seems clear that endosquid didn't quite understand who was responding to the bug report when asking Lerdorf to "escalate this to someone who can answer the question as to why this was changed". Lerdorf responds: "Escalate? Oh how I wish I had someone to escalate to."
Lerdorf also explained that the change was first made public as part of the first 5.3 release candidate in March 2009. He said that interested folks had until July to make a case that any particular change shouldn't go into the release. While endosquid complained that 5.3 had only recently become available on the platform he was using, Lerdorf pointed out that users have some responsibility to keep up with their tools.
There is certainly some truth to Lerdorf's admonishment, but it didn't sit well with endosquid, who plans to change the C code back to the old behavior. Patching the language source—rather than making a fairly simple textual substitution to the number_format() call sites—seems a bit extreme, but is evidently easier in that environment. Unlike some proprietary alternatives, though, free software allows just that kind of change.
But free software developers should not have to deal with insulting comments from bug reporters. There are multiple alternatives for endosquid, including staying with the 5.1.x version of PHP, patching the 5.3.x source, or fixing the actual calls, so getting angry and lashing out in the bug report is not likely to help anyone. It is, as Lerdorf points out, "a classic case of how not to treat unpaid volunteers who provide critical pieces of your money-making infrastructure".
There is always the question, though, of when a "bug" has lived long enough that it becomes something that needs to be carried forward. Once applications start depending on buggy behavior, there will always be annoyed users when the bug gets fixed. The Linux kernel has run into this problem numerous times, generally opting to maintain the "insanity" (in the words of Al Viro) for compatibility's sake.
It is a difficult balance to strike. PHP developers cannot possibly know all of the different corner-cases and quirks that PHP applications depend on. When fixing what they see as a bug, they have to rely on users testing betas and release candidates to find places where the "bug" label may not be appropriate—or at least requires some discussion. But users are often busy with other things, so we are likely to see this kind of situation play out for various projects in the future.
The Grumpy Editor's Tomato review
Your editor has just completed an important transition: moving his Internet connectivity from one evil branch of the local telecom duopoly to the other, equally evil branch. This change required the acquisition of a new router; that, in turn, provided the opportunity to play with Linux-based router software, and Tomato in particular. Read on for your editor's impressions of this impressive bit of (mostly) free software.
Tomato has its roots in the original Linksys WRT54G firmware. This
firmware was first distributed as if it were proprietary software, but
Linksys, under heavy GPL-enforcement pressure, eventually made the source
available under the GPL. The existence of this source, along with the ease
with which the Linksys routers could have new firmware installed, led to the
creation of a number of firmware distributions, all of which added new
features and otherwise improved on the original Linksys offering. Over
time, Linksys (Cisco) has incorporated some of these improvements; the
company also continues to offer a special version of its basic household
router (the WRT54GL) which is explicitly designed to allow firmware
replacement.
If a company is going to make a competitively-priced, Linux-based, user-hackable router, your editor feels an obligation to buy it. That choice is easy, but the choice of which replacement firmware to use is harder. There's a wide variety of offerings, including OpenWrt, DD-WRT, FreeWRT, and Tomato. There appears to be no easy way to pick one in particular; your editor started with Tomato because the screen shots looked nice and the installation instructions were straightforward. On the other hand, OpenWrt's installation instructions are simply missing (though some information is available on the OpenWrt wiki), and those for DD-WRT are lengthy and intimidating, making the process look similar to installing Gentoo.
The funny thing, of course, is that installing replacement firmware on a WRT54GL router is a trivial task: download firmware, go to the router's "upgrade firmware" screen, and upload the new blob. Two minutes later the job is done.
Your editor's first impression of Tomato is that it is great stuff - though reflection yields some concerns which will be discussed below. Tomato brings a whole range of new functionality to a cheap consumer device, yielding a degree of visibility into and control over the network which your editor has never had before. The web-based interface is slick - if JavaScript heavy - and mostly easy to use. It would have been nice to bring this device into the house some time ago, even if Evil Telecom #1's network did not require its presence.
One nice feature is simple bandwidth monitoring and display; there are a
number of plots which can be brought up and watched in real time. The
router is also able to store network statistics for a long period of time
and produce plots on daily, weekly, or monthly scales. The only problem
there is that the hardware lacks the storage for this amount of data;
Tomato can work around that little limitation by using a built-in CIFS
client to use storage found elsewhere on the net.
The Linux kernel has the facilities to exercise a great deal of control over the processing of network traffic. There is simple firewalling, of course, with the ability to decide which traffic is worthy of passage and which should be denied. But there is also an extensive traffic control subsystem allowing the user to prioritize the use of the available bandwidth. That feature is arguably underused because it takes a while to figure out how to configure it with the available command-line clients. Tomato provides a relatively straightforward mechanism for the creation of both access control and quality-of-service rules.
On the access control side, Tomato has a screen which allows the creation of rules for specific addresses and port numbers. Rules can be global, or they can apply only to traffic from specific machines on the local network. Rules can have a schedule attached so that, say, distracting web sites can be blocked during the day - encouraging accomplishment - while serious sites can be blocked at night - encouraging relaxation. Specific systems can be blocked from the net entirely on a schedule, a potentially useful feature for parents who have long since given up on trying to keep wireless-enabled devices out of the kids' rooms late at night.
Interestingly, Tomato does not stop with port-based restrictions; it also incorporates the L7-filter and IPP2P classifiers. Both modules are essentially deep packet inspection implementations, allowing the classification (and, thus, control) of traffic based on a look at the actual bits passing through. With L7-filter, for example, an administrator can block specific role-playing games, regardless of whether the official servers or ports are being used. There's a vast set of canned rules, enabling control of various instant messaging protocols, file formats, and more. It is now possible to block the downloading of Perl scripts - something which, while tempting, is probably unwise to actually do. IPP2P, instead, is more directly focused on the detection of peer-to-peer protocols. Together, they are a control freak's dream; network neutrality stops at the local router.
Even if a network administrator does not wish to ban, say, role-playing games outright, there is value in saying that such uses of the network should not interfere with real work like reading XKCD. That's where the quality of service (QOS) screens come in. QOS is a two-step process: dividing the available bandwidth among various classes of traffic, and assigning specific types of traffic to those classes. Tomato provides ten different classifications, each of which has a priority and a guaranteed bandwidth portion - all of which can be changed, of course. By default, only outbound (to the wide-area network) traffic is subject to control; it is possible to control inbound traffic, but, since that traffic has already passed over the WAN link by the time the router can work with it, there's usually little point. Classification rules look a lot like access control rules, allowing the use of addresses, port numbers, or classification by IPP2P or L7-filter.
With all this, the administrator can decree that, say, a certain
proprietary role-playing game favored by the children is a very low
priority stream - but it still gets a few percent of the available
bandwidth so the kids do not suffer permanent trauma as a result of
lag-induced fragging. Tomato can also generate pie charts showing (by
classification) how bandwidth is being used currently; clicking on a
classification yields a list of current connections. All told, it's a
capable and easy-to-use way of ensuring that the network functions well
even under heavy use.
Other features abound. There is a DHCP server, of course, along with a nice screen for doing static DHCP assignments without ever having to type a MAC address. The router can report its globally-visible address to a wide variety of dynamic DNS services. Incoming connections can be forwarded to internal machines in a flexible way. There is a "triggering" mechanism which automatically opens specific incoming ports in response to specific outgoing connections. Old-timers will see triggering as a way to support the full FTP protocol; everybody else will use it to enable incoming BitTorrent connections. And so on. It is, to say the least, a highly capable system.
The biggest operational problem your editor has experienced is the occasional dropping of long-lived SSH connections. A bit of research led to the tweaking of a few of the rather intimidating array of connection tracking parameters, and things would appear to have improved.
There are a couple of more general concerns, though. Like many of its peers, Tomato appears to be well past its active development phase; there were a few releases in 2009, but they did not make a great many changes. Meanwhile, its 2.4.20 kernel is rather far back from the leading edge, and both L7-filter and IPP2P are explicitly unmaintained. Given the steady stream of security updates for protocol dissectors in Wireshark, your editor has a hard time believing that these other classifiers can be completely free of security issues. But there is nobody maintaining them, and Tomato has no apparent means for the monitoring of security problems or the distribution of updates. Given that these routers are directly exposed to the net and are the first line of defense for many networks, the combination of ancient software and no security support is worrying.
Tomato is also not 100% free software. The core Linux system is, of course, free, but the user interface code carries a "for use with Tomato only" copyright notice. There is also the issue of the proprietary Broadcom network driver, but that's a problem any 2.4-based firmware for this router will have.
These concerns are strong enough that, despite Tomato's many qualities, your editor is not yet sure that he has found the final distribution for his router. In particular, OpenWrt - which offers a 2.6 kernel, a seemingly larger and more active development team, release notes with CVE numbers included, and a packaging system allowing others to add features to the router - seems worth a detailed look. The good news is that this choice exists and is easy to execute. That, in turn, is the result of the GPL and the developers who made an effort to enforce it.
RawTherapee: the newest open source raw photo editor
Gábor Horváth has been developing the raw photo converter RawTherapee single-handedly, on Linux and Windows, since 2006. The application has been freeware the entire time, with Horváth accepting PayPal donations through the project's web site. Consequently, although there are significant changes in the 3.0 alpha release announced on January 4th, it was arguably bigger news that the project was switching to the GPLv3.
RawTherapee is a raw image conversion and editing utility that (like most raw converters) supports the native file formats of virtually all digital cameras courtesy of the dcraw project. It offers exposure control, highlight and shadow recovery, color and tint balancing and adjustments, sharpening and noise reduction, and basic crop/rotation tools. On the workflow side, it supports color management, Exif and IPTC tagging, quality ratings, batch processing, saved snapshots, and sending images to an external editor for detailed work.
Getting started
Builds for 3.0 alpha 1 are available for Linux and Windows, and for the first time, source tarballs as well. The Linux builds are provided as 32-bit and 64-bit standalone binaries; simply extract the package and run ./rtstart from a shell prompt to get started. There is no dependency checking, but RawTherapee is compiled against standard GTK+ and GNOME libraries. A more complete list of dependencies is found in a forum thread about compiling the source on Linux; the only special-purpose libraries are libtiff and libiptcdata, which should already be pulled in by other modern image editing packages.
In use, RawTherapee behaves like most comparable raw converters, sporting a three-pane window with a file browser in the left-hand column, an image viewer in the center, and a tabbed image-adjustment toolbox on the right. The vast majority of raw converters take this approach, exposing the image adjustment controls as a vertical stack of sliders and checkboxes. Novices may need to familiarize themselves with the terminology before feeling comfortable tweaking the myriad of settings, but on the positive side, RawTherapee is non-destructive — it saves adjustments not by changing the original image, but by storing an auxiliary "sidecar" file in the same directory.
As raw converters go, RawTherapee offers a full palette of controls, with multiple user-selectable sharpening algorithms, separate luminance- and color-noise reduction sliders, an RGB channel mixer, and multiple demosaicing algorithms. Nevertheless, the tool layout is organized, providing a sensible division of the potentially overwhelming controls into four main tabs (Exposure, Detail, Color, and Transform), and sub-dividing each tab into groups. Batch operations are easy to queue, offering the choice of a specified output folder or a user-defined template, with which you can rename and store output files based on their original name and directory.
RawTherapee does diverge from other converters in a few areas, such as its use of tabbed windows. Starting with 3.0, opening an image to edit opens it in a separate tab. This allows the user to keep multiple editing sessions open at once without exporting, and is definitely a nice feature. There is also no "filmstrip" window pane displaying other image thumbnails in the current directory; the only way to open a new image for editing is through the file browser, a difference that some users might find less convenient. RawTherapee also provides floating "magnify" windows to zoom in on particular parts of the current image without zooming the entire image view, something not every editor supports.
Linux users will find several oddities in the user interface, though, such as the lack of any menus (standard or otherwise); the closest things are the "Preferences" and "Exit" text-buttons in the bottom right-hand corner. And those users with a scroll mouse must take care when scrolling the vertical toolbox; it is easy to accidentally throw off an adjustment slider if the cursor happens to land on one of the controls. This release also lacks tooltips for many of the settings, which would be a boon to new users.
For real-world work, it is also critical to take the "alpha" status of this release seriously. 3.0 alpha 1 is crash-prone, and the adjustment sidecar files it creates automatically are not compatible with the 2.x-series. Those who use the current, stable release of RawTherapee (2.4.1) must be sure to back up their work before testing 3.0.
Open source and further development
Horváth cited three factors behind his decision to change the licensing of RawTherapee: personal lack of time, the difficulty of reproducing and fixing reported bugs, and interest in focusing his own time on the core image-processing features of the program rather than the GUI and other components. He set up a RawTherapee project on Google Code, including Subversion access to the source, build instructions, and an issue tracker. He has also opened developer discussion forums on the main RawTherapee site.
The RawTherapee code breaks into three parts: the image processing library, an Exif support library, and the GUI application itself. Bug reports and enhancement requests have already begun to appear at the Google Code site; Horváth has stated that his top priority for the moment is working out the kinks in the CMake build system.
Moving forward, Horváth's intent to focus on the image processing core is a key component of the 3.x roadmap. Part of the rewrite that led up to 3.0 alpha 1 — although not yet visible to end users — is a separation of the editor component to make it easier to add more algorithms, such as additional demosaicing and noise-reduction choices and new tools to correct fringing and perspective distortion.
Looking at the state of RawTherapee and its user base, the decision to move the code to an open source license is undoubtedly a good one. The application already has an active community, including many Linux users and language translators. But as Horváth discovered while maintaining the project in a closed-source state, handling that user community's bug reports and support requests became more and more time consuming as the project grew in popularity, a fact many solo software developers may not consider when starting a new project.
Furthermore, Horváth wants to focus on the part of the code he finds most interesting, the image adjustment algorithms. By adopting a free software license, RawTherapee might be able to slim down by swapping out some other components for existing open libraries (such as libexiv2, rather than its own separate Exif library).
There is clearly room for what Horváth wants to do with RawTherapee in the open source graphics space. Arguably the most similar raw converter, Rawstudio, takes a different approach, aiming to make raw image editing accessible for the average non-technical photographer. RawTherapee's decision to make multiple user-selectable algorithms available for so many controls will make it appealing to a different crowd: those who like to experiment or who have very specific opinions about their image editing. There are other raw-capable editors and applications, such as digiKam, that place more emphasis on image collection management, raster editing, or other functions.
All in all, RawTherapee has been a consistently good performer on Linux and Windows for years. As one of the few free choices in a space dominated by high-priced applications, it was a standout. Considering that most of the underpinnings of raw image editing — dcraw, Exif and IPTC, and the various mathematical algorithms — are not proprietary, it only makes sense that good, open source solutions would emerge. With the upcoming 3.0 release, it is excellent to see that RawTherapee will be among them.
Page editor: Jonathan Corbet
