By Jake Edge
January 13, 2010
Sometimes bugs are in the eye of the beholder, as a recent PHP bug report
illustrates. That report also illustrates how quickly discussions in bug
reports can spiral out of control, turning to anger and insults. There are
some comical aspects to the thread, but the underlying issue, maintaining
compatibility with existing bugs, is one that many projects struggle with.
A PHP user ("endosquid") reported that the
number_format() function had changed behavior in PHP 5.3; that is,
when number_format("",0) is called, it no longer returns "0";
instead, it returns an empty string. Given that the first argument to the
function is supposed to be a number, in particular a floating point number
that is to be formatted based on the rest of the arguments, an empty string
might seem like the right thing to return. On the other hand, all earlier
versions of the function returned a string containing "0".
It turns out that part of the work that went into version 5.3 was to clean
up the parameter parsing code in PHP, and to use one routine,
zend_parse_parameters(), internally. As PHP creator
Rasmus Lerdorf related in the thread: "Most
of PHP was using this already, but there were still some stragglers
like number_format()." Lerdorf also suggested casting the
first argument to a float (i.e. number_format((float)"",0)) as a
solution to the problem.
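For code with many call sites, the cast that Lerdorf suggests could be applied in one place rather than at every call. What follows is only a minimal sketch of that idea: the wrapper name number_format_compat() is hypothetical (it appears neither in PHP nor in the bug report), and it assumes that treating every non-numeric input as zero, as the pre-5.3 function did, is really what the application wants.

```php
<?php
// Hypothetical helper, not part of PHP: applies the (float) cast suggested
// in the bug report so that non-numeric input is formatted as zero again.
function number_format_compat($value, $decimals = 0)
{
    // An explicit (float) cast turns "", "a", and null into 0.0, so
    // number_format() always receives the float it expects.
    return number_format((float) $value, $decimals);
}

echo number_format_compat("", 0), "\n";        // 0
echo number_format_compat("a", 0), "\n";       // 0
echo number_format_compat("3.14159", 2), "\n"; // 3.14
```

The price of such a wrapper is that it hides bad input in exactly the way the stricter parameter parsing in 5.3 was meant to expose, so it is better seen as a migration aid than as a long-term fix.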
As one would guess, endosquid's application wasn't calling
number_format() directly with an empty string, but was instead
passing a variable that may or may not have been initialized. In general
that is a bad programming practice, but it is quite common in
PHP code where the language has often tried to "do the right thing" with
uninitialized variables. But if the "right thing" changes, lots of code that
relied on it can break.
The argument that endosquid makes about what number_format()
should return is not entirely without merit. The function is supposed to
return a formatted number, and the empty string is hardly that, so
endosquid believes that it should return "0". But, as
Lerdorf points out, what would one expect number_format("a",0) to
return? The unfortunate answer is that pre-5.3 versions did return
"0" in that case. So, in tightening up the PHP parameter parsing code, a
substantial difference in the behavior of number_format() was
introduced.
The documentation for number_format()
is not terribly helpful as it doesn't address error conditions at all. It
does specify that the first parameter is a float, but PHP will
happily take strings like "9" or "3.14159" for that parameter, converting
as needed. Given all
that, programmers have to rely on what the language actually does, and
since at least PHP 3, number_format() has always
returned "0" when handed random strings.
It doesn't take long for the bug report thread to descend into flames.
Evidently endosquid works in a tightly controlled environment that requires
a raft of paperwork to accompany code changes, but that still doesn't
justify a claim of "MONTHS [of] fixing code for no real
benefit". It seems clear that endosquid didn't quite understand who
was responding to the bug report when asking Lerdorf to "escalate
this to someone who can answer the question as to why
this was changed". Lerdorf responds:
"Escalate? Oh how I wish I had someone to escalate to."
Lerdorf also explained that the change was first made public as part of the
first 5.3 release candidate in March 2009. He said that interested folks
had until July to make a case that any particular change shouldn't go into
the release. While endosquid complained that 5.3 had only recently become
available on the platform he was using, Lerdorf pointed out
that users have some responsibility to keep up with their tools:
Part
of your responsibility in your position is to keep track of your tools
and the changes coming down the pipeline. 5.3 was available to you as
a release candidate in March of last year, and even earlier directly
from our revision control system. Many things have changed and there
are many many people out there affected by these changes, we recognize
that. That is also why we are not likely to reverse a change like this
that others in your situation have now accounted for, tested and
deployed in production for many months simply because it is
inconvenient for you.
There is certainly some truth to Lerdorf's admonishment, but it didn't sit
well with endosquid, who plans to change the C code back to the old
behavior. Patching the language source—rather than making a fairly
simple textual substitution to the number_format() call
sites—seems a bit extreme, but is evidently
easier in that environment. Unlike some proprietary alternatives, though,
free software allows just that kind of change.
But free software developers should not have to deal with insulting
comments from bug reporters. There are multiple alternatives for
endosquid, including staying with the 5.1.x version of PHP, patching the
5.3.x source, or fixing the actual calls, so getting angry and lashing out
in the bug report is not likely to help anyone. It is, as Lerdorf
points out, "a classic case of how not to treat unpaid volunteers who
provide
critical pieces of your money-making infrastructure".
There is always the question, though, of when a "bug" has lived long enough
that it becomes something that needs to be carried forward. Once
applications start depending on buggy behavior, there will always be
annoyed users when the bug gets fixed. The Linux kernel has run into this
problem numerous times, generally opting to maintain the
"insanity" (in the words
of Al Viro) for compatibility's sake.
It is a difficult balance to strike. PHP developers cannot possibly know
all of the different corner-cases and quirks that PHP applications depend
on. When fixing what they see as a bug, they have to rely on users testing
betas and release candidates to find places where the "bug" label may not
be appropriate—or at least requires some discussion. But users are
often busy with other things, so we are likely to see this kind of
situation play out for various projects in the future.
By Jonathan Corbet
January 11, 2010
Your editor has just completed an important transition: moving his Internet
connectivity from one evil branch of the local telecom duopoly to the
other, equally
evil branch. This change required the acquisition of a new router; that,
in turn, provided the opportunity to play with Linux-based router
software, and
Tomato in
particular. Read on for your editor's impressions of this impressive bit
of (mostly) free software.
Tomato has its roots in the original Linksys WRT54G firmware. This
firmware was first distributed as if it were proprietary software, but
Linksys, under heavy GPL-enforcement pressure, eventually made the source
available under the GPL. The existence of this source, along with the ease
by which the Linksys routers could have new firmware installed, led to the
creation of a number of firmware distributions, all of which added new
features and otherwise improved on the original Linksys offering. Over
time, Linksys (Cisco) has incorporated some of these improvements; the
company also continues to offer a special version of its basic household
router (the WRT54GL) which is explicitly designed to allow firmware
replacement.
If a company is going to make a competitively-priced, Linux-based,
user-hackable router, your editor feels an obligation to buy it. That
choice is easy, but the choice of which replacement firmware to use
is harder. There's a wide variety of offerings, including OpenWrt, DD-WRT, FreeWRT, and Tomato. There appears to be no
easy way to pick one in particular; your editor started with Tomato because
the screen shots looked nice and the installation instructions were
straightforward. On the other hand, OpenWRT's
installation instructions are simply missing (though some information
is available on the
OpenWRT wiki), and those for
DD-WRT are lengthy and intimidating, making the process look similar to
installing Gentoo.
The funny thing, of course, is that installing replacement firmware on a
WRT54GL router is a trivial task: download firmware, go to the router's
"upgrade firmware" screen, and upload the new blob. Two minutes later the
job is done.
Your editor's first impression of Tomato is that it is great stuff - though
reflection yields some concerns which will be discussed below. Tomato
brings a whole range of new functionality to a cheap consumer device,
yielding a degree of visibility into and control over the network which
your editor has never had before. The web-based interface is slick - if
JavaScript heavy - and mostly easy to use. It would have been nice to
bring this device into the house some time ago, even if Evil Telecom #1's
network did not require its presence.
One nice feature is simple bandwidth monitoring and display; there are a
number of plots which can be brought up and watched in real time. The
router is also able to store network statistics for a long period of time
and produce plots on daily, weekly, or monthly scales. The only problem
there is that the hardware lacks the storage for this amount of data;
Tomato can work around that little limitation by using a built-in CIFS
client to use storage found elsewhere on the net.
The Linux kernel has the facilities to exercise a great deal of control
over the processing of network traffic. There is simple firewalling, of
course, with the ability to decide which traffic is worthy of passage and
which should be denied. But there is also an extensive traffic control
subsystem allowing the user to prioritize the use of the available
bandwidth. That feature is arguably underused because it takes a while to
figure out how to configure it with the available command-line clients.
Tomato provides a relatively straightforward mechanism for the creation of
both access control and quality-of-service rules.
On the access control side, Tomato has a screen which allows the creation
of rules for specific addresses and port numbers. Rules can be global, or
they can apply only to traffic from specific machines on the local network.
Rules can have a schedule attached so that, say, distracting web sites can
be blocked during the day - encouraging accomplishment - while serious
sites can be blocked at night - encouraging relaxation. Specific systems
can be blocked from the net entirely on a schedule, a potentially useful
feature for parents who have long since given up on trying to keep
wireless-enabled devices out of the kids' rooms late at night.
Interestingly, Tomato does not stop with port-based restrictions; it also
incorporates the L7-filter
and IPP2P classifiers. Both modules are
essentially deep packet inspection implementations, allowing the
classification (and, thus, control) of traffic based on a look at the
actual bits passing through. With L7-filter, for example, an administrator
can block specific role-playing games, regardless of whether the official
servers or ports are being used. There's a vast set of canned rules,
enabling control of various instant messaging protocols, file formats, and
more. It is now possible to block the downloading of Perl scripts -
something which, while tempting, is probably unwise to actually do. IPP2P, instead,
is more directly focused on the detection of peer-to-peer
protocols. Together, they are a control freak's dream; network neutrality
stops at the local router.
Even if a network administrator does not wish to ban, say, role-playing
games outright, there is value in saying that such uses of the network
should not interfere with real work like reading XKCD. That's where the
quality of service (QOS) screens come in. QOS is a two-step process:
dividing the available bandwidth among various classes of traffic, and
assigning specific types of traffic to those classes. Tomato provides ten
different classifications, each of which has a priority and a guaranteed
bandwidth portion - all of which can be changed, of course. By default,
only outbound (to the wide-area network) traffic is subject to control; it
is possible to control inbound traffic, but, since that traffic has already passed
over the WAN link by the time the router can work with it, there's usually
little point. Classification rules look a lot like access control rules,
allowing the use of addresses, port numbers, or classification by IPP2P or
L7-filter.
With all this, the administrator can decree that, say, a certain
proprietary role-playing game favored by the children is a very low
priority stream - but it still gets a few percent of the available
bandwidth so the kids do not suffer permanent trauma as a result of
lag-induced fragging. Tomato can also generate pie charts showing (by
classification) how bandwidth is being used currently; clicking on a
classification yields a list of current connections. All told, it's a
capable and easy-to-use way of ensuring that the network functions well
even under heavy use.
Other features abound. There is a DHCP server, of course, along with a
nice screen for doing static DHCP assignments without ever having to type a
MAC address. The router can report its globally-visible address to a wide
variety of dynamic DNS services. Incoming connections can be forwarded to
internal machines in a flexible way. There is a "triggering" mechanism
which automatically opens specific incoming ports in response to specific
outgoing connections. Old-timers will see triggering as a way to support the full
FTP protocol; everybody else will use it to enable incoming BitTorrent
connections. And so on. It is, to say the least, a highly capable system.
The biggest operational problem your editor has experienced is the
occasional dropping of long-lived SSH connections. A bit of research led
to the tweaking of a few of the rather intimidating array of connection
tracking parameters, and things would appear to have improved.
There are a couple of more general concerns, though. Like many of its
peers, Tomato appears to be well past its active development phase; there
were a few releases in 2009, but they did not make a great many changes.
Meanwhile, its 2.4.20 kernel is rather far back from the leading edge, and
both L7-Filter and IPP2P are explicitly unmaintained. Given the steady
stream of security updates for protocol dissectors in Wireshark, your
editor has a hard time believing that these other classifiers can be
completely free of security issues. But there is nobody maintaining them,
and Tomato has no apparent means for the monitoring of security problems or
the distribution of updates. Given that these routers are directly exposed
to the net and are the first line of defense for many networks, the
combination of ancient software and no security support is worrying.
Tomato is also not 100% free software. The core Linux system is, of
course, free, but the user interface code carries a "for use with Tomato
only" copyright notice. There is also the issue of the proprietary
Broadcom network driver, but that's a problem any 2.4-based firmware for
this router will have.
These concerns are strong enough that, despite Tomato's many qualities,
your editor is not yet sure that he has found the final distribution for
his router. In particular, OpenWRT - which offers a 2.6 kernel, a seemingly
larger and more active development team, release notes with CVE numbers
included, and a packaging system allowing others to add features to the
router - seems worth a detailed look. The good news is that this choice
exists and is easy to execute. That, in turn, is the result of the GPL and
the developers who made an effort to enforce it.
January 13, 2010
This article was contributed by Nathan Willis
Gábor Horváth has been developing the raw photo converter
RawTherapee single-handedly, on
Linux and Windows, since 2006. The application has been freeware the
entire time, with Horváth accepting Paypal donations through the
project's web site. Consequently, although there are significant changes
in the 3.0 alpha release announced on
January 4th, it was arguably bigger news that the project was switching to
the GPLv3.
RawTherapee is a raw image conversion and editing utility that (like most raw converters) supports the native file formats of virtually all digital cameras courtesy of the dcraw project. It offers exposure control, highlight and shadow recovery, color and tint balancing and adjustments, sharpening and noise reduction, and basic crop/rotation tools. On the workflow side, it supports color management, Exif and IPTC tagging, quality ratings, batch processing, saved snapshots, and sending images to an external editor for detailed work.
Getting started
Builds for 3.0 alpha 1 are available for Linux
and Windows, and for the first time, source tarballs as well. The Linux builds are provided as 32-bit and 64-bit standalone binaries; simply extract the package and run ./rtstart from a shell prompt to get started. There is no dependency checking, but RawTherapee is compiled against standard GTK+ and GNOME libraries. A more complete list of dependencies is found in a forum thread about compiling the source on Linux; the only special-purpose libraries are libtiff and libiptcdata, which should already be pulled in by other modern image editing packages.
In use, RawTherapee behaves like most comparable raw converters, sporting a three-pane window with a file browser in the left-hand column, an image viewer in the center, and a tabbed image-adjustment toolbox on the right. The vast majority of raw converters take this approach, exposing the image adjustment controls as a vertical stack of sliders and checkboxes. Novices may need to familiarize themselves with the terminology before feeling comfortable tweaking the myriad of settings, but on the positive side, RawTherapee is non-destructive — it saves adjustments not by changing the original image, but by storing an auxiliary "sidecar" file in the same directory.
As raw converters go, RawTherapee offers a full palette of controls, with multiple user-selectable sharpening algorithms, separate luminance- and color-noise reduction sliders, an RGB channel mixer, and multiple demosaicing algorithms. Nevertheless, the tool layout is organized, providing a sensible division of the potentially overwhelming controls into four main tabs (Exposure, Detail, Color, and Transform), and sub-dividing each tab into groups. Batch operations are easy to queue, offering the choice of a specified output folder or a user-defined template, with which you can rename and store output files based on their original name and directory.
RawTherapee does diverge from other converters in a few areas, such as its use of tabbed windows. Starting with 3.0, opening an image to edit opens it in a separate tab. This allows the user to keep multiple editing sessions open at once without exporting, and is definitely a nice feature. There is also no "filmstrip" window pane displaying other image thumbnails in the current directory; the only way to open a new image for editing is through the file browser, a difference that some users might find less convenient. It also provides floating "magnify" windows to zoom in on particular parts of the current image without zooming the entire image view, something not every editor supports.
Linux users will find several oddities in the user interface, though, such as the lack of any menus (standard or otherwise); the closest things are the "Preferences" and "Exit" text-buttons in the bottom right-hand corner. And those users with a scroll mouse must take care when scrolling the vertical toolbox; it is easy to accidentally throw off an adjustment slider if the cursor happens to land hovering over one of the controls. This release also lacks tooltips for many of the settings, which would be a boon to new users.
For real-world work, it is also critical to take the "alpha" status of this release seriously. 3.0 alpha 1 is crash-prone, and the adjustment sidecar files it creates automatically are not compatible with the 2.x-series. Those who use the current, stable release of RawTherapee (2.4.1) must be sure to back up their work before testing 3.0.
Open source and further development
Horváth cited three factors behind his decision to change the
licensing of RawTherapee: personal lack of time, the difficulty of
reproducing and fixing reported bugs, and interest in focusing his own time
on the core image-processing features of the program rather than the GUI
and other components. He set up a RawTherapee project on Google Code,
including Subversion access to the source, build
instructions, and an issue tracker. He has also opened developer discussion forums on the main RawTherapee site.
The RawTherapee code breaks into three parts: the image processing library, an Exif support library, and the GUI application itself. Bug reports and enhancement requests have already begun to appear at the Google Code site; Horváth has stated that his top priority for the moment is working out the kinks in the CMake build system.
Moving forward, Horváth's intent to focus on the image processing core is a key component of the 3.x roadmap. Part of the rewrite that led up to 3.0 alpha 1 — although not yet visible to end users — is a separation of the editor component to make it easier to add more algorithms, such as additional demosaicing and noise-reduction choices and new tools to correct fringing and perspective distortion.
Looking at the state of RawTherapee and its user base, the decision to move the code to an open source license is undoubtedly a good one. The application already has an active community, including many Linux users and language translators. But as Horváth discovered while maintaining the project in a closed source state, handling that user community's bug reports and support requests became more and more time consuming as the project grew in popularity, a fact many solo software developers may not consider when starting a new project.
Furthermore, Horváth wants to focus on the part of the code he
finds most interesting, the image adjustment algorithms. By adopting a
free software license, RawTherapee might be able to slim down by swapping out some other components for existing open libraries (such as libexiv2, rather than its own separate Exif library).
There is clearly room for what Horváth wants to do with RawTherapee in the open source graphics space. Arguably the most similar raw converter, Rawstudio, takes a different approach, aiming to make raw image editing accessible for the average non-technical photographer. RawTherapee's decision to make multiple user-selectable algorithms available for so many controls will make it appealing to a different crowd, those that like to experiment or who have very specific opinions about their image editing. There are other raw-capable editors and applications, such as Digikam, that emphasize more image collection management, raster editing, or other functions.
All in all, RawTherapee has been a consistently good performer on Linux and Windows for years. As one of the few free choices in a space dominated by high-priced applications, it was a standout. Considering that most of the underpinnings of raw image editing — dcraw, Exif and IPTC, and the various mathematical algorithms — are not proprietary, it only makes sense that good, open source solutions would emerge. With the upcoming 3.0 release, it is excellent to see that RawTherapee will be among them.
Page editor: Jonathan Corbet