
Cracks in the Foundation (PHP Advent)


Posted Dec 18, 2011 21:54 UTC (Sun) by Richard_J_Neill (subscriber, #23093)
In reply to: Cracks in the Foundation (PHP Advent) by imgx64
Parent article: Cracks in the Foundation (PHP Advent)

Actually, being able to quickly swap from PHP to HTML (especially using short-tags <? ... ?> ) can be useful. One way to create, imho, legible code is to do all the processing first, and then write out the bulk of the HTML, inserting <?=$variable;?> as needed. Like any language feature, this can be abused.

Also, the distinction between == and === is a really useful one. Again, it requires proper understanding, but it can save a lot of time. Consider strpos() which normally returns an integer (perhaps zero), but false on error. This is compact, clear, and avoids the problem of error-handling in C's atoi() or the horrible workaround in strtol().

The key advantage of PHP is the documentation, which is excellent.

BTW, ironically, you *didn't* disable your rant. String "0" IS equal (==) to false, though it isn't identically equal (===). If you wanted to complain, you'd have to complain that "0.0" is considered true; "0.0" is considered equal to 0.0, and that 0.0 is considered false. Sadly there is no perfect way to write automatic-casting rules.
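The three facts above can be checked against a small model. This is a minimal Python sketch of PHP's scalar-to-boolean rule and of numeric-string comparison, not PHP itself; `php_bool` and `php_loose_eq_str_num` are hypothetical names invented for illustration, and the second helper assumes its string argument is numeric.

```python
def php_bool(value):
    """Approximate PHP's conversion of a scalar to boolean (a model only)."""
    if value is None or value is False:
        return False
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        # Only the empty string and the exact string "0" convert to false.
        return value not in ("", "0")
    return True


def php_loose_eq_str_num(text, number):
    """Model of `numeric string == number`: the string is compared numerically."""
    return float(text) == number


print(php_bool("0"))                     # False
print(php_bool("0.0"))                   # True
print(php_loose_eq_str_num("0.0", 0.0))  # True
print(php_bool(0.0))                     # False
```

So "0.0" is true, equals 0.0, and 0.0 is false: the loose rules are not transitive.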



Cracks in the Foundation (PHP Advent)

Posted Dec 18, 2011 22:05 UTC (Sun) by alankila (subscriber, #47141) [Link]

So you are saying that 0 == strpos($x, $y) will do the wrong thing when the string is not found at beginning. Other languages show remarkably more sense by returning -1 or something. Sorry, but this specific example you picked is full of fail.
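The pitfall can be reproduced outside PHP. Python also treats False as equal to 0, so a rough sketch of the same convention shows the same trap; `strpos_like` is a hypothetical helper written for this illustration, not a real library function.

```python
def strpos_like(haystack, needle):
    """Mimic the 'index, or False if absent' convention (illustrative only)."""
    index = haystack.find(needle)  # str.find returns -1 when needle is absent
    return False if index == -1 else index


missing = strpos_like("hello", "z")
at_start = strpos_like("hello", "h")

print(0 == missing)        # True: False compares equal to 0
print(0 == at_start)       # True: genuinely found at index 0
print(missing is False)    # True: an identity check separates the two cases
print(at_start is False)   # False
print("hello".find("z"))   # -1, the sentinel other languages use
```

The loose comparison cannot tell "found at position 0" from "not found"; only the strict check can.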

Cracks in the Foundation (PHP Advent)

Posted Dec 21, 2011 2:37 UTC (Wed) by Richard_J_Neill (subscriber, #23093) [Link]

Perhaps using strpos() wasn't quite such a good example, because -1 could never be a valid answer, and is therefore potentially OK as an error-flag.
This is in the same spirit as, for example, C's write() .

BUT, what do you do with an integer-function that normally returns an integer (negative, positive, or zero) when it needs to return an error?

There are several ways to do it; of which I think that C's strtol() is the worst possible. Some might suggest returning an object, but that's logically equivalent. PHP has adopted the general convention that any function that fails will return false; I think this is actually quite sensible once one knows to expect it.

As for the casting rules of "0.0" vs 0.0, it's rather a perverse example, which shouldn't happen in real-life.

Cracks in the Foundation (PHP Advent)

Posted Dec 21, 2011 4:26 UTC (Wed) by nybble41 (subscriber, #55106) [Link]

> BUT, what do you do with an integer-function that normally returns an integer (negative, positive, or zero) when it needs to return an error?

Return success or failure (or a more specific error code), and store the result in a reference parameter? That's the standard C approach. If the possibility of failure exists, you'll need to check for that first anyway before using the result. For uncommon failure modes, in languages which support it, alternate continuations (including, but not limited to, C++-style exception handling) offer a more efficient solution.

> PHP has adopted the general convention that any function that fails will return false; I think this is actually quite sensible once one knows to expect it.

That's great so long as you don't need to return false in a non-failure situation... The Common Lisp solution to this is rather elegant: return two values, the first being the result or false, and the second (which you can ignore) indicating success or failure. If you know that successful results can't be false you can use the normal return value, but the extended status is there if you need it.
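A rough Python sketch of the two-value idea, similar in spirit to what Common Lisp's gethash does; `lookup` and `settings` are made-up names for illustration only.

```python
def lookup(table, key):
    """Return (value, found); `found` disambiguates a stored False, None or 0."""
    if key in table:
        return table[key], True
    return None, False


settings = {"verbose": False}

value, found = lookup(settings, "verbose")
print(value, found)   # False True: a legitimate false result

value, found = lookup(settings, "colour")
print(value, found)   # None False: a real failure
```

A caller who knows that valid results can never be false may ignore the second value; everyone else has the status available.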

Cracks in the Foundation (PHP Advent)

Posted Dec 18, 2011 22:49 UTC (Sun) by HelloWorld (guest, #56129) [Link]

> Consider strpos() which normally returns an integer (perhaps zero), but false on error.
That is the kind of idiotic type mess that dynamic typing encourages. Just say no to that kind of crap.

Cracks in the Foundation (PHP Advent)

Posted Dec 19, 2011 3:52 UTC (Mon) by imgx64 (guest, #78590) [Link]

I'd attribute this more to the lack of proper exceptions and/or multiple return values than to dynamic typing. For example, the C standard library is terrible when it comes to returning errors, and it's statically-typed.

Of course it can be improved (assuming PHP has proper exceptions), but it requires a change in the culture of PHP programmers. Something I highly doubt is possible.
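For comparison, a minimal sketch of the exception-based style in Python, where a failed conversion raises instead of returning a sentinel. `parse_port` and `try_parse` are hypothetical helpers; only `int()` raising ValueError on non-numeric text is standard behaviour.

```python
def parse_port(text):
    """Parse a TCP port number, raising ValueError on bad input."""
    port = int(text)  # int() itself raises ValueError for non-numeric text
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def try_parse(text):
    """Show how a caller separates the success path from the failure path."""
    try:
        return f"ok: {parse_port(text)}"
    except ValueError as exc:
        return f"error: {exc}"


print(try_parse("0"))      # ok: 0
print(try_parse("70000"))  # error: port out of range: 70000
```

A valid result of 0 can never be mistaken for a failure here, because failures never travel through the return value.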

Cracks in the Foundation (PHP Advent)

Posted Dec 22, 2011 11:20 UTC (Thu) by justincormack (subscriber, #70439) [Link]

Multiple return values are the real answer here. Exceptions are probably overkill. Don't suppose PHP will get them though.

Cracks in the Foundation (PHP Advent)

Posted Dec 22, 2011 11:59 UTC (Thu) by etienne (subscriber, #25256) [Link]

> Multiple return values

Sometimes, in C, I would like a function to return a value (like now) *and* the (processor) flags - for instance the "zero" flag would mean no error occurred. Very fast and simple...

Cracks in the Foundation (PHP Advent)

Posted Dec 23, 2011 10:09 UTC (Fri) by jezuch (subscriber, #52988) [Link]

> Sometimes, in C, I would like a function to return a value (like now) *and* the (processor) flags - for instance the "zero" flag would mean no error occurred. Very fast and simple...

You mean like the errno variable?

Cracks in the Foundation (PHP Advent)

Posted Jan 3, 2012 12:41 UTC (Tue) by etienne (subscriber, #25256) [Link]

>> returning flags from functions
> You mean like the errno variable?

Not really, errno is a global variable - I was thinking of something very fast at the assembly level, for the millions of use cases.
Something like (doesn't really work in C):

    unsigned mybitfield, bitindex, bitno;
    ifzeroset(bitno = ffs(mybitfield)) // ffs = find first bit set
        error("no bits are set");
    else
        bitindex = bitno;

Translated in ia32 assembler by:
    ffs eax,edx
    jz call_error
    mov edx,bitindex

You can replace ffs by strchr or any basic function which may not be inlined.
Maybe it could also be implemented in GCC by:
    register struct flags {
        unsigned zero : 1;
        unsigned carry : 1;
        ...
    } flags asm ("cc");
but the modification of flags by so many assembly instructions is probably a problem (instruction-reordering optimisations).

Why is it a mess?

Posted Dec 19, 2011 7:06 UTC (Mon) by khim (subscriber, #9252) [Link]

Actually this is a time-honored way of doing things (think SQL and nullable types in C#).

Of course in a language where the most natural operator will happily declare 0 identical to false it's a disaster, but in LISP-derived languages (where “eq?”, “eqv?”, and “equal?” all agree that “0” and “#f” are different, and “if” treats both “0” and “1” as kind-of-true) it works just fine.

In languages like perl, php and python (which try to "guess" what you meant and "help" you) it's a disaster, obviously.

Why is it a mess?

Posted Dec 19, 2011 10:11 UTC (Mon) by HelloWorld (guest, #56129) [Link]

I don't see the point in allowing anything that isn't either true or false in an if statement/expression.
I happen to like the way the STL solved this problem in std::find. If an element is found, return an iterator to it. If it's not found, return a past-the-end iterator. In Haskell, one would typically return a 'Maybe' value, which is either "Nothing" or "Just x", where x is the number you're looking for. One then uses pattern matching to distinguish the cases.
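The "Maybe" style has a rough Python analogue in Optional, where None plays the role of "Nothing". The sketch below is an illustration only: `find_index` and `describe` are invented names, and Python does not enforce the check the way Haskell's pattern matching does.

```python
from typing import Optional, Sequence


def find_index(items: Sequence[int], target: int) -> Optional[int]:
    """Return the index of target, or None when it is absent."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return None


def describe(result: Optional[int]) -> str:
    """Distinguish the two cases explicitly, as pattern matching would."""
    if result is None:
        return "not found"
    return f"found at {result}"


print(describe(find_index([5, 7, 9], 5)))  # found at 0
print(describe(find_index([5, 7, 9], 4)))  # not found
```

Because the test is `is None` rather than a truthiness check, a hit at index 0 is not confused with a miss.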

Hmm...

Posted Dec 19, 2011 10:32 UTC (Mon) by khim (subscriber, #9252) [Link]

> I don't see the point in allowing anything that isn't either true or false in an if statement/expression.

The point is expressiveness, as usual. You can do anything you want without such things (see Turing tarpit), but this kind of misses the point.

> I happen to like the way the STL solved this problem in std::find. If an element is found, return an iterator to it. If it's not found, return a past-the-end iterator.

This is a good kludge for a statically typed language with a lot of limitations. They could not use just plain NULL because they wanted to make sure iterators may be anything, so they invented this scheme. Still not sure why you think it's better than simple "iterator or “#f”": in the C++ case you often need to process the iterator somehow (if the caller which needs the iterator does not want to know about your map), while in the case of LISP you can just return the result “as is”.

> In Haskell, one would typically return a 'Maybe' value, which is either "Nothing" or "Just x", where x is the number you're looking for. One then uses pattern matching to distinguish the cases.

And all these additional unneeded manipulations are good… exactly why?

I dislike languages without static typing because they leave too much to the runtime, but then if we are already in the realm of dynamic languages it's stupid not to use the fact that you can return objects of different types to your advantage: why do you use a language with dynamic typing or duck typing if you only use things available in statically typed languages?

Hmm...

Posted Dec 19, 2011 12:15 UTC (Mon) by alankila (subscriber, #47141) [Link]

In general it's acceptable to return different types, I guess, as long as it is driven by some kind of compelling necessity or resulting convenience. This FALSE is just not convenient because of the confusion with 0; I think it just shows remarkably poor taste. When designing an API, the designer should take full responsibility about the anguish the API's hapless user must endure.

Hmm...

Posted Dec 19, 2011 12:45 UTC (Mon) by HelloWorld (guest, #56129) [Link]

> The point is expressiveness, as usual.[...] And all these additional unneeded manipulations are good… exactly why?
I don't see any gain in expressiveness here, nor do I see any "unneeded manipulations". PHP:
$z = strpos($x, $y);
if ($z === false)
  do_something();
else
  do_something_else($z);
C++:
auto z = boost::search(x, y);
if (z == x.end())
  do_something();
else
  do_something_else(z);
> I dislike languages without static typing because they leave too much to the runtime, but then if we are already in the realm of dynamic languages it's stupid not to use the fact that you can return objects of different types to your advantage: why do you use a language with dynamic typing or duck typing if you only use things available in statically typed languages?
Well, you're raising a good point, and the answer is that one just shouldn't use a dynamically typed language for anything but throw-away stuff. Doing stuff like returning either an int or a boolean spoils the way people think. They start to think that in order to obtain a convenient solution to this problem they need dynamic typing, even though that isn't so.

Hmm...

Posted Dec 19, 2011 12:49 UTC (Mon) by HelloWorld (guest, #56129) [Link]

Oh by the way, that reminds me of an old quote by Dijkstra:
> It is practically impossible to teach good programming to students that have had a prior exposure to BASIC: as potential programmers they are mentally mutilated beyond hope of regeneration.
Needless to say, the same applies to PHP.

Hmm...

Posted Dec 19, 2011 13:00 UTC (Mon) by mpr22 (subscriber, #60784) [Link]

I'm inclined to think that that particular Dijkstra quote probably says at least as much about Dijkstra's ability to teach as it does about BASIC.

Hmm...

Posted Dec 19, 2011 13:27 UTC (Mon) by nix (subscriber, #2304) [Link]

I think it has more to say about what Dijkstra wanted to teach. Once you've been exposed to a language that just lets you get simple stuff done without much effort, the preferred Dijkstra way of formally proving as much as possible starts to seem incredibly long-winded, even if in theory it does eventually produce better results. (It is, of course, a good idea in some domains -- just not everywhere, as Dijkstra sometimes seems to have wished.)

For very large programs, particularly in safety-critical domains, formal proof, particularly of core components, starts to seem reasonable -- but when you're teaching you're going to be using small examples, because you have to. And those small examples are typically too small to need formal methods of any kind. Expose someone to BASIC, or another similar language which is good for quickly whipping up something small that works but that falls apart on larger scales, and they are likely to think 'why bother with formal methods?' when exposed to little teaching examples of their use, for which, to be blunt, any random language would often suffice with no use of formal methods at all.

Why is it a mess?

Posted Dec 19, 2011 13:18 UTC (Mon) by ekj (guest, #1524) [Link]

If the language is object-oriented and allows objects to override operators, it makes sense to also let them answer the question of being true or false.

This is essentially what happens in Python: an empty string is considered false while all other strings are considered true. That's just sugar, of course; you could write if len(mystring) > 0 instead of if mystring and get the same result, but it's a useful shortcut.

Considering integers different from zero to be true, and zero to be false, is also very common, although the same tradeoff applies: if myint could always be rewritten as if myint <> 0, which does have the advantage of being more explicit.

Automagic type conversion like PHP has is nuts, though: they consider "0" to be false because the integer 0 is false (but "0.0" is not false, despite the fact that 0.0 *is* false).

Why is it a mess?

Posted Dec 25, 2011 14:09 UTC (Sun) by juliank (subscriber, #45896) [Link]

> myint <> 0

Use myint != 0. That other form is deprecated in Python 2, and removed in Python 3.

Why is it a mess?

Posted Dec 20, 2011 23:32 UTC (Tue) by pboddie (subscriber, #50784) [Link]

> In languages like perl, php and python (which try to "guess" what you meant and "help" you) it's a disaster, obviously.

Python doesn't "guess" anything. There's a protocol that all classes support which indicates whether an instance is considered true or false.

Why is it a mess?

Posted Dec 21, 2011 7:37 UTC (Wed) by ekj (guest, #1524) [Link]

Yeah. But even python considers boolean false and integer zero to be equivalent, and all other integers to be true. Indeed the method to override to define your own behaviour for if object is object.__nonzero__ or object.__len__

From the naming alone it's clear that being "true" by convention means "having length" or "not being zero". (Why is the method __nonzero__ and not __zero__ (with opposite semantics), by the way? It seems an odd kind of superfluous negation. Returning false here means that the object is *not* *nonzero*, i.e. that it's zero.)
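A short sketch of that protocol as it looks in Python 3, where __nonzero__ was renamed __bool__ and __len__ still serves as the fallback. The `Basket` and `Temperature` classes are hypothetical examples, not part of any library.

```python
class Basket:
    """Truthiness falls back to __len__ when __bool__ is not defined."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


class Temperature:
    """Defines truthiness explicitly via __bool__ (__nonzero__ in Python 2)."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __bool__(self):
        return self.degrees != 0


print(bool(Basket([])))        # False: length is zero
print(bool(Basket(["egg"])))   # True
print(bool(Temperature(0)))    # False
print(bool(Temperature(21)))   # True
print(False == 0)              # True: bool is a subclass of int
```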

Cracks in the Foundation (PHP Advent)

Posted Dec 19, 2011 3:30 UTC (Mon) by imgx64 (guest, #78590) [Link]

> Actually, being able to quickly swap from PHP to HTML (especially using short-tags <? ... ?> ) can be useful. One way to create, imho, legible code is to do all the processing first, and then write out the bulk of the HTML, inserting <?=$variable;?> as needed. Like any language feature, this can be abused.

Yes, this is called a template engine. Every other web language has one, and it's by no means PHP specific. Even statically-typed languages like Java and Go have template engines (Go even has two in the standard library, although the older one will be deleted before Go1 is released).

On the other hand, making the whole language a template engine is just wrong IMO.

> BTW, ironically, you *didn't* disable your rant. String "0" IS equal (==) to false, though it isn't identically equal (===).

Yes, that was the intention.

> If you wanted to complain, you'd have to complain that "0.0" is considered true; "0.0" is considered equal to 0.0, and that 0.0 is considered false. Sadly there is no perfect way to write automatic-casting rules.

Wow. PHP is even worse than I remember.

Cracks in the Foundation (PHP Advent)

Posted Dec 19, 2011 13:21 UTC (Mon) by nix (subscriber, #2304) [Link]

> If you wanted to complain, you'd have to complain that "0.0" is considered true; "0.0" is considered equal to 0.0, and that 0.0 is considered false.
Who the hell thought that was sane? It makes C++'s loop-avoidance rules on user-defined implicit casting ('no more than one') look sensible.

I know, I know, nobody thought it was sane -- the string-equals-number rule was added because it might be useful without considering its effect on the semantics of the language.

"Pile in useful features without much global consideration" works for a lot of software. It really really does not work for languages. Does PHP have nothing like the PEP process, where the badly-thought-out ideas can quietly die?

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds