Posted Nov 25, 2010 21:52 UTC (Thu) by jengelh (subscriber, #33263)
In reply to: On breaking things by mrshiny
Parent article: On breaking things
Then I don't understand your comment about having to update all programs. Only the ones that use memcpy incorrectly need updating.
> user, who knows nothing about memcpy, now has a broken app.
But false attribution of fault is nothing new. When a program/driver did a stupid thing in Windows 9x and led to a bluescreen, few would consider it to be a program/driver issue; they blamed Windows instead.
Posted Nov 26, 2010 4:27 UTC (Fri) by mrshiny (subscriber, #4266)
Nobody is saying the apps aren't at fault.
The problem is that the glibc change turned a hypothetical bug into an actual bug.
And due to the nature of the bug it's impossible for the user to diagnose it.
And because this change isn't hidden from older binaries by version symbols, upgrading the library breaks the apps and the user may have no way of getting a fixed app.
My point is that users are being held hostage so that the glibc maintainers can say "meh, those stupid programmers at <wherever> should read the C99 standard". Thanks, that doesn't help me with my problem at all.
The Windows developers have many features in place to provide backwards compatibility for broken apps. Yes, they need it more because source isn't available for most Windows apps, but still.
At least in Linux I can roll my own LD_PRELOAD hack to fix this. Except, it's a pain in the butt to use, and I only know of one app that needs it right now (flash). Maybe there is another one, somewhere on my system, which is misbehaving in a way that will cause me to lose important data in a few days. I have no way of knowing.
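For anyone wanting to try the LD_PRELOAD route, a minimal sketch might look like the one below. It assumes a Linux/glibc system and a dynamically linked app. The helper name `copy_tolerating_overlap` and the macro `BUILD_AS_PRELOAD_SHIM` are made up for illustration, and this is not what any distribution actually ships.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes and tolerate overlapping regions
 * by delegating to memmove(), which ISO C defines for overlap. */
void *copy_tolerating_overlap(void *dest, const void *src, size_t n)
{
    return memmove(dest, src, n);
}

#ifdef BUILD_AS_PRELOAD_SHIM   /* hypothetical macro, defined only for the shim build */
/* Exported under the name memcpy so that, when the shared object is
 * preloaded, the dynamic linker resolves the app's memcpy() calls here. */
void *memcpy(void *dest, const void *src, size_t n)
{
    return copy_tolerating_overlap(dest, src, n);
}
#endif
```

Built with something like `gcc -shared -fPIC -DBUILD_AS_PRELOAD_SHIM -o memcpy_shim.so shim.c` (file names are placeholders) and run as `LD_PRELOAD=./memcpy_shim.so someapp`. It only affects calls that actually go through the dynamic symbol; copies the compiler inlined into the app are untouched.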
Also this is the 2nd time in recent years that a change to memcpy broke apps on my system. Maybe the glibc people should change glibc so that it subtly breaks ALL apps that violate the C standard? So that instead of hundreds of hypothetical bugs we'd have hundreds of real bugs, happily munching the data on your hard disk? It's within their rights, I suppose.
On breaking things
Posted Nov 26, 2010 9:57 UTC (Fri) by mpr22 (subscriber, #60784)
I believe that Ulrich Drepper's position is roughly this: if a change to glibc's internal implementation alters some aspect of an ISO C function's behaviour that compliant ISO C programs are explicitly forbidden to rely on (e.g. whether memcpy() copies forward, backward, or oscillating outward from the middle; what isalpha() etc. do if you pass them out-of-range values), and that change breaks an application, then it's the application developer's fault and officially Not His Problem. If you try to make it his problem he will tell you exactly where to get off. Especially if it's a closed source application.
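To make the copy-direction point concrete, here is a small sketch with two byte-wise copy loops. Both would be acceptable as the guts of a memcpy() under the standard, since the standard leaves overlapping calls undefined. The function names are invented for the example and neither is glibc's actual code.

```c
#include <stddef.h>

/* Copy n bytes starting from the lowest address. */
void forward_copy(char *dest, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dest[i] = src[i];
}

/* Copy n bytes starting from the highest address. */
void backward_copy(char *dest, const char *src, size_t n)
{
    while (n-- > 0)
        dest[n] = src[n];
}
```

With `char buf[] = "abcdef";`, copying 4 bytes from `buf` to `buf + 2` leaves "ababab" with the forward loop and "ababcd" with the backward one. An app that only ever ran against one of them can look perfectly healthy until the library switches to the other.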
Posted Nov 26, 2010 13:40 UTC (Fri) by mrshiny (subscriber, #4266)
I think you are correct about his position. I just feel that it's not the right position for a library maintainer to take, especially the single most important library in the whole system.
The thing that bugs me is that there is a way to implement this change such that all newly-compiled apps get the improvement while older apps get the older behaviour. Sure, for Fedora that means that every single app which might have this bug is now vulnerable, but anything else will be fine. Lots of people have apps that they can't easily change. Many of those apps are even Free Software. Those users cannot reliably upgrade glibc, it seems.
Posted Nov 26, 2010 19:09 UTC (Fri) by giraffedata (subscriber, #1954)
> I just feel that it's not the right position for a library maintainer to take, especially the single most important library in the whole system.
But let's not attribute more responsibility to Drepper than he really has. One of the reasons a distributor of free software has the privilege of defining what is Not His Problem is that anyone who disagrees is free to do better. The article makes this point in noting that it is Fedora, not the Glibc project, that is distributing a problematic library, and Fedora is accepting that responsibility and discussing whether to distribute the old or new memcpy behavior.
And, according to the article, the people arguing in favor of distributing the new memcpy behavior aren't doing so based on principle, like Drepper, but based on the belief that giving better performance to a wide range of users over the long term is better than making Flash work for some users in the short term.
Posted Nov 26, 2010 19:27 UTC (Fri) by dlang (✭ supporter ✭, #313)
it's not just Flash that was broken.
there's also the problem that the breakage can easily go unnoticed, and can corrupt the user's data.
Posted Nov 26, 2010 19:48 UTC (Fri) by mrshiny (subscriber, #4266)
If they used symbol versioning (or whatever it's called) they could have working Flash AND better performance in the long run.
Posted Nov 26, 2010 20:06 UTC (Fri) by oak (subscriber, #2786)
> My point is that users are being held hostage so that the glibc maintainers can say "meh, those stupid programmers at <wherever> should read the C99 standard".
This was already specified in ANSI C in the '80s, i.e. last century.
Memory debugging tools like Valgrind, Duma etc. have been giving warnings about memcpy() calls with overlapping addresses for at least a decade.
If 10-20 years isn't enough for e.g. Adobe to test with freely available (or commercial) tools that their software is robust, portable and correctly implemented, I don't have very high hopes of it ever being what I (and apparently Steve Jobs) call "product quality" SW.
Posted Nov 26, 2010 20:22 UTC (Fri) by dlang (✭ supporter ✭, #313)
Valgrind will only report the problem if the particular run of the program happens to produce overlapping regions.
If the work is something like defragmenting memory by moving things around, it's very possible for one particular run to not have overlaps, but another run to have overlaps. Any time you have the pointers calculated in the program, you have a case where they may or may not overlap on a particular run.
and while you are chastising Adobe for not having tested with Valgrind, make sure you chastise everyone else (including glibc maintainers) for the same thing.
this is unfortunately a very easy mistake to make, and unless a change like this is made, it's unlikely to come to light.
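As a rough sketch of why this is run-dependent: whether two computed pointers overlap is a property of their values at the moment of the call, so a checker can only flag the calls it actually observes. The function below is an invented illustration of that kind of check, not Valgrind's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if [a, a+n) and [b, b+n) share at least one byte, else 0.
 * Addresses are compared as integers, because relational comparison of
 * pointers into different objects is not defined by ISO C. */
int regions_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t lo = (uintptr_t)a;
    uintptr_t hi = (uintptr_t)b;

    if (lo > hi) {              /* order the two start addresses */
        uintptr_t tmp = lo;
        lo = hi;
        hi = tmp;
    }
    return n != 0 && hi - lo < n;
}
```

A program that computes `dest` and `src` from, say, free-list positions will pass this check on every run where the blocks happen to land far enough apart, and fail it only on the runs where they don't.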
Posted Nov 26, 2010 20:46 UTC (Fri) by jengelh (subscriber, #33263)
Well, if as many users run into the problem as have, surely the chance of Adobe employees encountering it at least once would be reasonably high too.
Posted Nov 26, 2010 21:01 UTC (Fri) by mrshiny (subscriber, #4266)
Naturally, now that glibc has changed this bug from hypothetical to actual, the Adobe maintainers will have no problem at all reproducing it. But prior to this change nobody experienced the bug.
Posted Nov 26, 2010 21:43 UTC (Fri) by jengelh (subscriber, #33263)
Which would hint towards Adobe not having run Valgrind. Because, seriously, YouTube is neither new nor a small, unimportant site.
Posted Nov 26, 2010 22:16 UTC (Fri) by oak (subscriber, #2786)
> If the work is something like defragmenting memory by moving things around, it's very possible for one particular run to not have overlaps, but another run to have overlaps.
One should of course understand that tools can give only positive proof of bugs' existence, not proof of their absence. Things like running Valgrind (and static checkers) should be part of the development process, so that over the SW's lifetime & changes one gets better coverage. It's not some one-off, instantly forgotten thing.
> and while you are chastising Adobe for not having tested with Valgrind, make sure you chastise everyone else (including glibc maintainers) for the same thing.
Sure.
Btw., if calls to a function are made within the same library where the function is implemented, function wrappers don't catch them unless the calls also go through the .plt. And as to Glibc's FORTIFY utility, I'm not sure whether Glibc enables that for itself...?
With Duma I've also seen another "issue" in Glibc: memmove() calls memcpy() because it has inherent knowledge of the direction in which memcpy() works. Because Duma doesn't have that info, it complains (Duma now has a variable for this)...
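The pattern looks roughly like the sketch below: a memmove() that hands off to a forward-copying routine whenever a forward copy is safe, which is legitimate only because the library knows its own copy direction. All names here are hypothetical stand-ins, and this is not a copy of Glibc's source.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a memcpy() that is known, privately, to copy forward. */
static void *forward_only_copy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dest;
}

void *sketch_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Forward copying is safe unless dest starts inside (src, src+n).
     * The unsigned subtraction wraps to a huge value when dest < src. */
    if ((uintptr_t)d - (uintptr_t)s >= n)
        return forward_only_copy(dest, src, n);

    /* dest lies inside the source range: copy from the top down. */
    while (n-- > 0)
        d[n] = s[n];
    return dest;
}
```

A checker that intercepts the inner call sees overlapping arguments whenever `dest` is below `src` and the ranges intersect, and has no way to know that the callee's direction makes that harmless.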