On breaking things
Posted Nov 25, 2010 21:47 UTC (Thu) by mrshiny (subscriber, #4266)
If you wrote a program which accidentally did a memcpy when it should have been memmove, and (due to implementation details in memcpy) this worked just fine for years, you might not find the bug. A user who installed your application, which worked fine for years, and never upgraded the app but upgraded glibc, will suddenly find that your app is broken. Maybe you found and fixed the bug in a later version of the app. Maybe not. The point is that the user, who knows nothing about memcpy, now has a broken app. And the app might only break under certain conditions. And it might result in any kind of error: crash, corrupted data, silent corruption of data, audio or video glitches, who knows. These users are being punished so that someone else potentially might have a slightly faster memcpy.
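To make the memcpy/memmove distinction concrete, here is a minimal sketch of a copy whose source and destination overlap. The helper name slide_right is hypothetical and not taken from any application discussed here. ISO C leaves memcpy() undefined for overlapping regions, while memmove() is required to behave as if the bytes went through a temporary buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: slide the first `len` bytes of `buf` toward the
 * end by `gap` bytes. Whenever gap < len the source and destination
 * ranges overlap, so memmove() is the correct call. Using memcpy() here
 * would be undefined behaviour, even if it happens to work with one
 * particular library implementation. */
void slide_right(char *buf, size_t len, size_t gap)
{
    assert(buf != NULL);
    memmove(buf + gap, buf, len);
}
```

For example, with char buf[16] = "abcdef", calling slide_right(buf, 7, 2) leaves the string "abcdef" starting at buf + 2. The same call written with memcpy() may or may not give that result, depending on the direction and width of the copy loop the library happens to use.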
Posted Nov 25, 2010 21:52 UTC (Thu) by jengelh (subscriber, #33263)
>user, who knows nothing about memcpy, now has a broken app.
But false attribution of fault is nothing new. When a program/driver did a stupid thing in Windows 9x and led to a bluescreen, few would consider it to be a program/driver issue, and instead blamed Windows.
Posted Nov 26, 2010 4:27 UTC (Fri) by mrshiny (subscriber, #4266)
The problem is that glibc changed the situation from a hypothetical bug to an actual bug.
And due to the nature of the bug it's impossible for the user to diagnose it.
And because this change isn't hidden from older binaries by version symbols, upgrading the library breaks the apps and the user may have no way of getting a fixed app.
My point is that users are being held hostage so that the glibc maintainers can say "meh, those stupid programmers at <wherever> should read the C99 standard". Thanks, that doesn't help me with my problem at all.
The Windows developers have many features in place to provide backwards compatibility for broken apps. Yes, they need it more because source isn't available for most Windows apps, but still.
At least in Linux I can roll my own LD_PRELOAD hack to fix this. Except, it's a pain in the butt to use, and I only know of one app that needs it right now (flash). Maybe there is another one, somewhere on my system, which is misbehaving in a way that will cause me to lose important data in a few days. I have no way of knowing.
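A rough sketch of what such an LD_PRELOAD shim could look like is below. It is an illustration under assumptions, not the workaround anyone in this thread actually used: it defines its own memcpy symbol that forwards to memmove(), so overlapping copies in the preloaded process get memmove semantics. The file names in the build commands are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Interposed memcpy: when this object is loaded ahead of libc, calls
 * that go through the dynamic symbol `memcpy` land here and get the
 * overlap-safe behaviour of memmove(). The parameters are deliberately
 * not restrict-qualified. */
void *memcpy(void *dest, const void *src, size_t n)
{
    return memmove(dest, src, n);
}
```

Assuming the file is saved as memcpy_shim.c, it could be built and used roughly like this:

    gcc -shared -fPIC -fno-builtin -o memcpy_shim.so memcpy_shim.c
    LD_PRELOAD=./memcpy_shim.so ./some-app

The shim only affects calls that are resolved through the dynamic linker. Copies that the compiler inlined into the application, or calls bound inside a library without going through the PLT, are not intercepted.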
Also this is the 2nd time in recent years that a change to memcpy broke apps on my system. Maybe the glibc people should change glibc so that it subtly breaks ALL apps that violate the C standard? So that instead of hundreds of hypothetical bugs we'd have hundreds of real bugs, happily munching the data on your hard disk? It's within their rights, I suppose.
Posted Nov 26, 2010 9:57 UTC (Fri) by mpr22 (subscriber, #60784)
I believe that Ulrich Drepper's position is roughly that if a change to glibc's internal implementation of aspects of an ISO C function's behaviour that compliant ISO C programs are explicitly forbidden to rely on (e.g. whether memcpy() copies forward, backward, or oscillating outward from the middle; what isalpha() etc. do if you pass them OOB values) breaks an application, it's the application developer's fault and officially Not His Problem and if you try to make it his problem he will tell you exactly where to get off. Especially if it's a closed source application.
Posted Nov 26, 2010 13:40 UTC (Fri) by mrshiny (subscriber, #4266)
The thing that bugs me is that there is a way to implement this change such that all newly-compiled apps get the improvement while older apps get the older behaviour. Sure, for Fedora that means that every single app which might have this bug is now vulnerable, but anything else will be fine. Lots of people have apps that they can't easily change. Many of those apps are even Free Software. Those users cannot reliably upgrade glibc, it seems.
Posted Nov 26, 2010 19:09 UTC (Fri) by giraffedata (subscriber, #1954)
I just feel that it's not the right position for a library maintainer to take, especially the single most important library in the whole system.
But let's not attribute more responsibility to Drepper than he really has. One of the reasons a distributor of free software has the privilege of defining what is Not His Problem is that anyone who disagrees is free to do better. The article makes this point in noting that it is Fedora, not the Glibc project, that is distributing a problematic library, and Fedora is accepting that responsibility and discussing whether to distribute the old or new memcpy behavior.
And, according to the article, the people arguing in favor of distributing the new memcpy behavior aren't doing so based on principle, like Drepper, but based on the belief that giving better performance to a wide range of users over the long term is better than making Flash work for some users in the short term.
Posted Nov 26, 2010 19:27 UTC (Fri) by dlang (✭ supporter ✭, #313)
there's also the problem that the breakage can easily go unnoticed, and can corrupt the user's data.
Posted Nov 26, 2010 19:48 UTC (Fri) by mrshiny (subscriber, #4266)
Posted Nov 26, 2010 20:06 UTC (Fri) by oak (subscriber, #2786)
This was already specified in ANSI C in the '80s, i.e. last century.
Memory debugging tools like Valgrind, duma etc. have been giving warnings about memcpy() calls with overlapping addresses at least for a decade.
If 10-20 years isn't enough for e.g. Adobe to test with freely available (or commercial) tools that their software is robust, portable and correctly implemented, I don't have very high hopes of it ever being what I (and apparently Steve Jobs) call "product quality" SW.
Posted Nov 26, 2010 20:22 UTC (Fri) by dlang (✭ supporter ✭, #313)
If the work is something like defragmenting memory by moving things around, it's very possible for one particular run to not have overlaps, but another run to have overlaps. Any time you have the pointers calculated in the program, you have a case where they may or may not overlap on a particular run.
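Since the overlap depends on pointer values computed at run time, one way to make the rare overlapping run fail loudly instead of silently is a debug-time check in front of the copy. The sketch below is only an illustration; regions_overlap and checked_memcpy are hypothetical names, not part of glibc or of any tool mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return nonzero if the n-byte regions starting at a and b share at
 * least one byte. Addresses are compared as integers so the check does
 * not depend on both pointers belonging to the same object. */
int regions_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    if (n == 0)
        return 0;
    return pa < pb ? (pb - pa) < n : (pa - pb) < n;
}

/* memcpy() wrapper that aborts in debug builds (NDEBUG not defined)
 * on the run where the computed pointers do overlap. */
void *checked_memcpy(void *dst, const void *src, size_t n)
{
    assert(!regions_overlap(dst, src, n));
    return memcpy(dst, src, n);
}
```

A check like this only reports a problem on a run where the pointers actually overlap, so it complements tools such as Valgrind rather than replacing them.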
and while you are chastising Adobe for not having tested with Valgrind, make sure you chastise everyone else (including glibc maintainers) for the same thing.
this is unfortunately a very easy mistake to make, and unless a change like this is made, it's unlikely to come to light.
Posted Nov 26, 2010 20:46 UTC (Fri) by jengelh (subscriber, #33263)
Posted Nov 26, 2010 21:01 UTC (Fri) by mrshiny (subscriber, #4266)
Posted Nov 26, 2010 21:43 UTC (Fri) by jengelh (subscriber, #33263)
Posted Nov 26, 2010 22:16 UTC (Fri) by oak (subscriber, #2786)
One should of course understand that tools can only give positive proof of a bug's existence, not proof that bugs don't exist. Things like running Valgrind (and static checkers) should be part of the development process, so that over the SW's lifetime and changes one gets better coverage. It's not some one-off, instantly forgotten thing.
> and while you are chastising Adobe for not having tested with Valgrind, make sure you chastise everyone else (including glibc maintainers) for the same thing.
Btw. If calls to a function are within the same library where the function is implemented, function wrappers don't catch that unless it also goes through .plt. And as to Glibc FORTIFY utility, I'm not sure whether Glibc enables that for itself...?
With Duma I've seen also another "issue" in Glibc, memmove() calling memcpy() because it has inherent knowledge about in which direction memcpy() works. Because Duma doesn't have that info, it complains (Duma has now a variable for this)...
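The idea of a memmove() that relies on knowing the direction of an underlying forward copy can be sketched as follows. This is a toy illustration with hypothetical names (copy_forward, copy_backward, toy_memmove), not glibc's actual code: when the destination starts at or below the source, a strictly front-to-back copy never overwrites bytes it has yet to read, so the forward routine is safe even though the regions overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Front-to-back byte copy; stands in for a memcpy() whose direction
 * the caller happens to know. */
void copy_forward(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Back-to-front byte copy for the case where dst lies above src. */
void copy_backward(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n-- > 0)
        dst[n] = src[n];
}

/* Toy memmove: choose the direction from the relative addresses. */
void *toy_memmove(void *dst, const void *src, size_t n)
{
    if ((uintptr_t)dst <= (uintptr_t)src)
        copy_forward(dst, src, n);
    else
        copy_backward(dst, src, n);
    return dst;
}
```

A checker that only sees the inner forward copy being called with overlapping arguments has no way to know that the caller already established the safe direction, which would explain a warning of the kind described above.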
Posted Nov 26, 2010 11:44 UTC (Fri) by cortana (subscriber, #24596)
Even ones that make use of NSS modules?
Posted Nov 26, 2010 13:34 UTC (Fri) by mrshiny (subscriber, #4266)
Posted Nov 26, 2010 20:00 UTC (Fri) by oak (subscriber, #2786)
With e.g. C-libraries intended for embedded devices like uClibc, it's a bit easier.
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds