
When does a bug turn into a feature?

By Jake Edge
January 13, 2010

Sometimes bugs are in the eye of the beholder, as a recent PHP bug report illustrates. That report also illustrates how quickly discussions in bug reports can spiral out of control, turning to anger and insults. There are some comical aspects to the thread, but the underlying issue, maintaining compatibility with existing bugs, is one that many projects struggle with.

A PHP user ("endosquid") reported that the number_format() function had changed behavior in PHP 5.3; that is, when number_format("",0) is called, it no longer returns "0"; instead, it returns an empty string. Given that the first argument to the function is supposed to be a number, in particular a floating point number that is to be formatted based on the rest of the arguments, an empty string might seem like the right thing to return. On the other hand, all earlier versions of the function returned a string containing "0".

It turns out that part of the work that went into version 5.3 was to clean up the parameter parsing code in PHP, and to use one routine, zend_parse_parameters(), internally. As PHP creator Rasmus Lerdorf related in the thread: "Most of PHP was using this already, but there were still some stragglers like number_format()." Lerdorf also suggested casting the first argument to a float (i.e. number_format((float)"",0)) as a solution to the problem.
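
As a rough sketch of the workaround Lerdorf describes, the cast can be applied at the call site. The variable name $amount is made up for illustration, and since the handling of an uncast empty string differs between PHP releases, that call is deliberately left out here.

```php
<?php
// Illustrative only: $amount stands in for a value that may arrive empty.
$amount = "";

// Casting first turns "" (or any non-numeric string) into 0.0,
// so number_format() always receives a float.
echo number_format((float) $amount, 0), "\n";   // prints 0

// Numeric strings still convert to the value they spell out.
echo number_format((float) "3.14159", 2), "\n"; // prints 3.14
```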

As one would guess, endosquid's application wasn't calling number_format() directly with an empty string, but was instead passing a variable that may or may not have been initialized. In general that is a bad programming practice, but it is quite common in PHP code where the language has often tried to "do the right thing" with uninitialized variables. But if the "right thing" changes, lots of code that relied on it can break.

The argument that endosquid makes about what number_format() should return is not entirely without merit. The function is supposed to return a formatted number, and the empty string is hardly that, so endosquid believes that it should return "0". But, as Lerdorf points out, what would one expect number_format("a",0) to return? The unfortunate answer is that pre-5.3 versions did return "0" in that case. So, in tightening up the PHP parameter parsing code, a substantial difference in the behavior of number_format() was introduced.

The documentation for number_format() is not terribly helpful as it doesn't address error conditions at all. It does specify that the first parameter is a float, but PHP will happily take strings like "9" or "3.14159" for that parameter, converting as needed. Given all that, programmers have to rely on what the language actually does, and since at least PHP 3, number_format() has always returned "0" when handed random strings.

It doesn't take long for the bug report thread to descend into flames. Evidently endosquid works in a tightly controlled environment that requires a raft of paperwork to accompany code changes, but that still doesn't justify a claim of "MONTHS [of] fixing code for no real benefit". It seems clear that endosquid didn't quite understand who was responding to the bug report when asking Lerdorf to "escalate this to someone who can answer the question as to why this was changed". Lerdorf responds: "Escalate? Oh how I wish I had someone to escalate to."

Lerdorf also explained that the change was first made public as part of the first 5.3 release candidate in March 2009. He said that interested folks had until July to make a case that any particular change shouldn't go into the release. While endosquid complained that 5.3 had only recently become available on the platform he was using, Lerdorf pointed out that users have some responsibility to keep up with their tools:

"Part of your responsibility in your position is to keep track of your tools and the changes coming down the pipeline. 5.3 was available to you as a release candidate in March of last year, and even earlier directly from our revision control system. Many things have changed and there are many many people out there affected by these changes, we recognize that. That is also why we are not likely to reverse a change like this that others in your situation have now accounted for, tested and deployed in production for many months simply because it is inconvenient for you."

There is certainly some truth to Lerdorf's admonishment, but it didn't sit well with endosquid, who plans to change the C code back to the old behavior. Patching the language source—rather than making a fairly simple textual substitution to the number_format() call sites—seems a bit extreme, but is evidently easier in that environment. Unlike some proprietary alternatives, though, free software allows just that kind of change.
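
One way to picture the "textual substitution" route is a small wrapper that the call sites are switched to. This is only a sketch: legacy_number_format() is a hypothetical name, not a PHP built-in, and it assumes the old behavior worth preserving is simply "anything non-numeric formats as zero".

```php
<?php
// Hypothetical wrapper; not part of PHP. It coerces the first argument to
// float before delegating, so "", null, and "a" all format as zero.
function legacy_number_format($number, $decimals = 0, $dec_point = '.', $thousands_sep = ',')
{
    return number_format((float) $number, $decimals, $dec_point, $thousands_sep);
}

echo legacy_number_format(""), "\n";          // prints 0
echo legacy_number_format("a"), "\n";         // prints 0
echo legacy_number_format("1234.5", 2), "\n"; // prints 1,234.50
```

Because the wrapper only adds a cast, it should behave the same way on older and newer interpreters, which keeps the option of moving back to an earlier PHP open.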

But free software developers should not have to deal with insulting comments from bug reporters. There are multiple alternatives for endosquid, including staying with the 5.1.x version of PHP, patching the 5.3.x source, or fixing the actual calls, so getting angry and lashing out in the bug report is not likely to help anyone. It is, as Lerdorf points out, "a classic case of how not to treat unpaid volunteers who provide critical pieces of your money-making infrastructure".

There is always the question, though, of when a "bug" has lived long enough that it becomes something that needs to be carried forward. Once applications start depending on buggy behavior, there will always be annoyed users when the bug gets fixed. The Linux kernel has run into this problem numerous times, generally opting to maintain the "insanity" (in the words of Al Viro) for compatibility's sake.

It is a difficult balance to strike. PHP developers cannot possibly know all of the different corner-cases and quirks that PHP applications depend on. When fixing what they see as a bug, they have to rely on users testing betas and release candidates to find places where the "bug" label may not be appropriate—or at least requires some discussion. But users are often busy with other things, so we are likely to see this kind of situation play out for various projects in the future.



When does a bug turn into a feature?

Posted Jan 14, 2010 2:20 UTC (Thu) by malor (subscriber, #2973) [Link]

Insults aside, I think I'm in endosquid's camp on this one; if the function is supposed to return a number, it should always return a number or indicate an error in some way. It should never, ever return an empty string. It may be that expecting rigor from the PHP team is unrealistic, given the constant stream of security problems originating from that package, but one could hope.

I suspect quite a number of installations may have been affected by this, but don't know it yet. That kind of bug often takes quite a while to crop up.

When does a bug turn into a feature?

Posted Jan 14, 2010 4:30 UTC (Thu) by flewellyn (subscriber, #5047) [Link]

As someone who codes PHP for a living, I WOULD be in endosquid's camp, except for the fact that the codebase in question is passing around uninitialized variables.

"Tries to do the right thing" aside, I still consider this execrable practice. It's just asking for trouble.

That said, number_format's docs should be updated to reflect the change.

When does a bug turn into a feature?

Posted Jan 14, 2010 13:52 UTC (Thu) by malor (subscriber, #2973) [Link]

Well, that codebase is obviously poorly done, but that doesn't excuse the fact that a function that's supposed to return a number isn't returning a number.

I would also argue that adjusting the documentation is even worse; it's called number_format, not number_format_sometimes_as_a_string_if_we_feel_like_it. Every other language I've personally worked with that did conversions to numbers, when given an alpha value, would either indicate an error, or return zero. Emitting strings from functions that should be working with numbers is, purely and simply, bad form, and should be fixed.

This remains true no matter how bad the code is that's using the function.

When does a bug turn into a feature?

Posted Jan 14, 2010 15:05 UTC (Thu) by flewellyn (subscriber, #5047) [Link]

I don't disagree. I'm just saying, uninitialized variables = bad.

When does a bug turn into a feature?

Posted Jan 14, 2010 18:17 UTC (Thu) by butlerm (subscriber, #13312) [Link]

Languages that support null values often return nulls from functions that
otherwise return numbers, when one or more arguments are null. Most SQL
implementations are a case in point.

For example if you call TO_NUMBER('') in Oracle, you get a null / empty
string in return. Most other databases are similar.

The other rational choice is to throw an exception. That way people don't
write huge programs that depend on language underspecification and
implementation bugs.

When does a bug turn into a feature?

Posted Jan 14, 2010 17:41 UTC (Thu) by rsidd (subscriber, #2582) [Link]

I think you misunderstood: number_format() returns a string. When given a null string as input, it used to return the string "0" (not the number 0). Now it returns the null string. Which, I would suggest, is the more sensible behaviour.

When does a bug turn into a feature?

Posted Jan 14, 2010 17:59 UTC (Thu) by malor (subscriber, #2973) [Link]

You're right, I did misunderstand, and I retract my complaint. A null string is, indeed, a more sensible return value.

When does a bug turn into a feature?

Posted Jan 14, 2010 5:00 UTC (Thu) by elanthis (guest, #6227) [Link]

The entire PHP codebase and language is a bug, and this is from someone who
worked with it professionally since 2001. I cannot put into words the
number of absolutely stupid behaviors and design decisions that have gone
into PHP and _still_ go into PHP (there was a new batch of them in 5.3 and
there's another new batch of them in 6.0 I've already identified). The PHP
developers are, for the most part, incompetent and clueless. The same is
true for the vast majority of downstream developers that use PHP to write
their applications. I cannot recall seeing a clean, well-written PHP
codebase in the Open Source world, ever. The poster-projects of PHP are
amongst the worst, in fact, including WordPress, Smarty, and osCommerce
(before it died off).

If you're a PHP programmer, you just need to learn to roll with the crap.
endosquid's environment sounds like a problem; that kind of setup might
work for stand-alone C++/Java apps that are fairly self-contained, but in a
setup where your code depends on the behavior of a ton of underlying crazy
runtimes and libraries, you just need to be ready to deal with the
inevitable breakages that occur when the runtime simultaneously fixes old
broken design screw-ups that you may have depended on, especially since
you're going to end up depending on new design screw-ups that they're
constantly introducing to work around the ones they're removing.

I'm stuck working with PHP on many projects because that's what the client
wants, but in general, I just recommend using something else. Sadly, that
"something else" can be hard to find. Python is a popular alternative, and
it tends to be more competently developed, but it still does break
compatibility in at least a few corner cases with every release (usually for
no compelling reasons other than just to force people to do things "the new
way"). I've little experience with Ruby to know if it is any better in
this regard, but the language itself looks nice. In the end, the only
super-stable options fall back to the big bloated professionally-developed
languages like Java or C#, which just aren't a (palatable) option for most
people. Overall, the entire Web development situation just plain out
sucks. Which is precisely why I decided to stop doing it professionally
and get into a programming field that is far, far more rewarding and less
migraine-inducing.

When does a bug turn into a feature?

Posted Jan 14, 2010 8:51 UTC (Thu) by epa (subscriber, #39769) [Link]

Isn't MediaWiki the biggest showcase app for PHP? I don't know about its code quality under the covers, but certainly it has to cope with high loads and lots of would-be attackers.

When does a bug turn into a feature?

Posted Jan 14, 2010 13:08 UTC (Thu) by nye (guest, #51576) [Link]

I've done some customisation of MediaWiki for my organisation's website, since it was 'close enough' to what we wanted. It's not *too* bad - certainly miles better than any other PHP application I've ever dug into. I still despise PHP as a language but MW is probably no more headache-inducing than any other codebase of similar size.

OT: Still hoping they switch to git.

When does a bug turn into a feature?

Posted Jan 14, 2010 11:38 UTC (Thu) by kruemelmo (subscriber, #8279) [Link]

Well, at least it appears to me that for selling 50-60 applications with business-critical tax etc. functions, having thousands of calls to number_format(), PHP would be a weak choice. After all, it was designed as a hypertext preprocessor.

When does a bug turn into a feature?

Posted Jan 21, 2010 14:47 UTC (Thu) by jch (guest, #51929) [Link]

"PHP is a minor evil perpetrated and created by incompetent amateurs, whereas Perl is a great and insidious evil, perpetrated by skilled but perverted professionals." — Jon Ribbens

When does a bug turn into a feature?

Posted Jan 14, 2010 11:07 UTC (Thu) by martin.langhoff (subscriber, #61417) [Link]

As others have noted, PHP isn't the most tightly designed language (and I suffer that first hand a lot when working on Moodle and other PHP projects). It has, however, done a lot of other things right, especially Just Working on many platforms for the last 10+ years.

When I wanted to use mod_perl + EmbPerl it was so fragile in terms of what Apache/Perl/modperl/gcc combinations worked and which ones bombed out that I had to abandon it. May have been a fault of the rpm/deb packagers but it was a nightmare of fragility.

In the case of this bug, I don't think PHP devs are at fault. The bug reporter is upset because the webapps he works on are for tax calculations, so they *have* to be right. And yet, the root cause of all this is that there are lots of uninitialized variables being passed around.

Oops. Sounds like a case for TheDailyWTF.

And it's an easy fix, even if the "fix" is to sed the code to a custom function that emulates the old behaviour.

When does a bug turn into a feature?

Posted Jan 14, 2010 15:45 UTC (Thu) by mrshiny (subscriber, #4266) [Link]

I agree. I feel that the PHP devs have made a cleanup that breaks programs, perhaps unnecessarily, but at the end of the day the real problem is endosquid's program and business practices.

He has conflicting requirements: 1. Upgrade PHP, 2. Have nothing in PHP change from 5.1 to 5.3. You can't have perfect stability AND upgrades at the same time. This is why we test our upgrades before we use them, right?

Fixing the application should be so simple, and the fix so deserving of attention, that I can't understand why they don't just do it. Patching the runtime? Please! That way lies madness. The irony is that they could fix the code so that it works, and it would still work just as well on the old PHP if they ever needed to go back for some other reason.

When does a bug turn into a feature?

Posted Jan 14, 2010 17:58 UTC (Thu) by nye (guest, #51576) [Link]

What I don't get is why he has to get approval and sign-offs to change the application, but not to change the platform. That just makes a mockery of the whole process.

When does a bug turn into a feature?

Posted Jan 14, 2010 18:03 UTC (Thu) by butlerm (subscriber, #13312) [Link]

The obvious solution to the problem is to make a new function that has the
old behavior and do a search and replace. What is so hard about that?

When does a bug turn into a feature?

Posted Jan 16, 2010 1:11 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

"The obvious solution to the problem is to make a new function that has the old behavior and do a search and replace. What is so hard about that?"

He's working inside a business process that, for quality control reasons, makes it very costly to modify his code in any way.

When does a bug turn into a feature?

Posted Jan 15, 2010 21:05 UTC (Fri) by flewellyn (subscriber, #5047) [Link]

There's also the fact that endosquid was doing two things which no programmer should ever do, in any language, at any time, ever.

First, passing uninitialized variables into functions. Every variable should always be initialized to a value that is sensible for what it will be used for! It's true that PHP does not enforce this as strictly as some other dynamic languages do (Python and Common Lisp come to mind), but it's still terrible practice.

But second, and even worse, endosquid was apparently relying on undefined behavior! The documentation for number_format() does not specify what happens if you pass in a non-numeric value as its first argument, and while we could wish that said docs explicitly said "this behavior is undefined", the fact remains that it DOES say the first argument must be a float. Yes, in the past, passing in some other value resulted in "0", but the whole point of undefined behavior is that it cannot be relied upon.

When does a bug turn into a feature?

Posted Jan 16, 2010 11:04 UTC (Sat) by addw (guest, #1771) [Link]

"while we could wish that said docs explicitly said 'this behavior is undefined'"

It was things like this that turned the C language definition from a slim K & R into something the size of a telephone directory. The trouble is that everyone read K & R but few read the full standard.

I suppose that one can at least assume that "if the documentation doesn't say what happens, then assume that it is undefined -- don't rely on it".

But at least PHP does have reasonably complete end user documentation, which is better than can be said (unfortunately) for PEAR.

Uninitialized variables

Posted Jan 18, 2010 14:52 UTC (Mon) by pdc (guest, #1353) [Link]

The problem is that PHP encourages the use of uninitialized and wrongly typed
variables through its default type coercions and default values and so on.

In some other languages (e.g., Python) these practices are quickly weeded out
because uninitialized variables cause an exception.

When does a bug turn into a feature?

Posted Jan 14, 2010 20:02 UTC (Thu) by shredwheat (guest, #4188) [Link]

Interesting that switching operating system platforms and system library versions does not seem to be a problem for the bug reporter's environment. But casting an uninitialized value through large volumes of a code base is a deal breaker.

PHP changes

Posted Jan 14, 2010 23:00 UTC (Thu) by rfunk (subscriber, #4054) [Link]

As someone who has hated PHP's many deficiencies and inconsistencies, I
applaud their improvements in 5.3. I hope it quickly makes its way to all
the various production environments. But it might've been nice to hold back
on the ones that break compatibility until 6.0.

When does a bug turn into a feature?

Posted Jan 15, 2010 5:02 UTC (Fri) by johnflux (subscriber, #58833) [Link]

And this is why libraries should have extensive unit tests.

Not a solution in this case

Posted Jan 17, 2010 22:57 UTC (Sun) by man_ls (subscriber, #15091) [Link]

The test suite would have to be unrealistically complete to include undocumented pathological conditions such as passing an uninitialized variable to a function.

Not a solution in this case

Posted Jan 18, 2010 2:21 UTC (Mon) by johnflux (subscriber, #58833) [Link]

I disagree - that would be one of the first tests that you'd write. Get a defined behaviour for passing an undefined variable, an empty string, a string with letters, etc.

Not a solution in this case

Posted Jan 18, 2010 7:27 UTC (Mon) by man_ls (subscriber, #15091) [Link]

Have you ever written a test case? It is one thing to test for boundary cases (empty string, 0, -1 and 1 for numbers), and another to test for undefined behaviors such as an undefined variable. With these, often the best thing is to let your program throw an exception.

Of course the language design that allows such a thing (use of an undefined variable) can be questioned.

New versions of interpreters must be validated.

Posted Jan 22, 2010 15:44 UTC (Fri) by rnesius (guest, #63156) [Link]

As someone who provisioned scripting languages into a world-wide design and validation environment for over ten years (perl, python, ruby, tcl-tk, and php), I've learned that it is all too easy for a program to become wed to a specific implementation of the language for exactly the reasons this article describes.

There's no excuse for not validating functionality over new versions of an interpreter before deploying into production.

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds