On breaking things

Posted Nov 24, 2010 23:01 UTC (Wed) by jengelh (subscriber, #33263)
Parent article: On breaking things

>On the other hand, one could argue that breaking Flash is a good way to demonstrate to users that they should be using a different distribution - or another operating system entirely.

Ha, those users came to Linux hoping to run proprietary stuff? They will be taught better!


On breaking things

Posted Nov 25, 2010 15:10 UTC (Thu) by mrshiny (subscriber, #4266)

Damn straight. And they better not upgrade glibc without also upgrading every other package on their system too, since those older packages may have been developed against an older glibc; even a recompile doesn't fix this problem, you need to change every affected program, and update those to the latest, non-buggy version.

On breaking things

Posted Nov 25, 2010 19:10 UTC (Thu) by jengelh (subscriber, #33263)

Only for statically linked programs. Thankfully, most vendors of proprietary programs have realized that dynamic linking against libc is a good idea.

On breaking things

Posted Nov 25, 2010 21:47 UTC (Thu) by mrshiny (subscriber, #4266)

No, only statically linked programs are NOT affected by the change in Glibc.

If you wrote a program which accidentally did a memcpy when it should have been memmove, and (due to implementation details in memcpy) this worked just fine for years, you might not find the bug. A user who installed your application, which worked fine for years, and never upgraded the app but upgraded glibc, will suddenly find that your app is broken. Maybe you found and fixed the bug in a later version of the app. Maybe not. The point is that the user, who knows nothing about memcpy, now has a broken app. And the app might only break under certain conditions. And it might result in any kind of error: crash, corrupted data, silent corruption of data, audio or video glitches, who knows. These users are being punished so that someone else potentially might have a slightly faster memcpy.
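To make the memcpy/memmove distinction concrete, here is a minimal sketch in C. The helper name `shift_right` is hypothetical and only for illustration. It shifts bytes within a single buffer, so source and destination overlap; ISO C leaves memcpy() undefined in that case, while memmove() is specified to handle it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: move the first `len` bytes of `buf` to the right
 * by `shift` bytes. Source and destination overlap whenever shift < len,
 * so memmove() is required; calling memcpy() here would be undefined
 * behaviour, even if it happened to work with one particular libc.
 * The caller must guarantee that buf holds at least len + shift bytes. */
static void shift_right(char *buf, size_t len, size_t shift)
{
    assert(buf != NULL);
    memmove(buf + shift, buf, len);
}
```

A buggy program would look identical except for calling memcpy(), and it would keep working for as long as the library happened to copy in a direction that tolerated the overlap.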

On breaking things

Posted Nov 25, 2010 21:52 UTC (Thu) by jengelh (subscriber, #33263)

Then I don't understand your comment about why you'd have to update all programs. Surely it is just the ones that use memcpy incorrectly.

>user, who knows nothing about memcpy, now has a broken app.

But false attribution of fault is nothing new. When a program/driver did a stupid thing in Windows 9x and led to a bluescreen, few would consider it to be a program/driver issue; they blamed Windows instead.

On breaking things

Posted Nov 26, 2010 4:27 UTC (Fri) by mrshiny (subscriber, #4266)

Nobody is saying the apps aren't at fault.

The problem is that the glibc change turned a hypothetical bug into an actual bug.

And due to the nature of the bug it's impossible for the user to diagnose it.

And because this change isn't hidden from older binaries by version symbols, upgrading the library breaks the apps and the user may have no way of getting a fixed app.

My point is that users are being held hostage so that the glibc maintainers can say "meh, those stupid programmers at <wherever> should read the C99 standard". Thanks, that doesn't help me with my problem at all.

The Windows developers have many features in place to provide backwards compatibility for broken apps. Yes, they need it more because source isn't available for most Windows apps, but still.

At least in Linux I can roll my own LD_PRELOAD hack to fix this. Except, it's a pain in the butt to use, and I only know of one app that needs it right now (flash). Maybe there is another one, somewhere on my system, which is misbehaving in a way that will cause me to lose important data in a few days. I have no way of knowing.
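For readers wondering what such an LD_PRELOAD hack might look like, here is a rough sketch; it is not the commenter's actual code. The file name `memcpy_shim.c`, the library name, and the build commands in the comment are assumptions. The idea is to provide a memcpy() that forwards to memmove(), so that overlapping copies in a dynamically linked application behave predictably again.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical shim (memcpy_shim.c). Assumed build and use:
 *
 *   gcc -shared -fPIC -fno-builtin -o libmemcpy_shim.so memcpy_shim.c
 *   LD_PRELOAD=./libmemcpy_shim.so some-application
 *
 * The parameters deliberately omit `restrict`, so the compiler has no
 * grounds to assume the regions are disjoint and turn the memmove()
 * call back into memcpy(). */
void *memcpy(void *dest, const void *src, size_t n)
{
    /* memmove() is defined for overlapping regions; memcpy() is not. */
    return memmove(dest, src, n);
}
```

A shim like this only catches calls that the dynamic linker resolves at run time. Copies that the compiler inlined, or calls inside a statically linked binary, are out of its reach, which is part of why it is a workaround rather than a fix.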

Also this is the 2nd time in recent years that a change to memcpy broke apps on my system. Maybe the glibc people should change glibc so that it subtly breaks ALL apps that violate the C standard? So that instead of hundreds of hypothetical bugs we'd have hundreds of real bugs, happily munching the data on your hard disk? It's within their rights, I suppose.

On breaking things

Posted Nov 26, 2010 9:57 UTC (Fri) by mpr22 (subscriber, #60784)

I believe that Ulrich Drepper's position is roughly that if a change to glibc's internal implementation of aspects of an ISO C function's behaviour that compliant ISO C programs are explicitly forbidden to rely on (e.g. whether memcpy() copies forward, backward, or oscillating outward from the middle; what isalpha() etc. do if you pass them OOB values) breaks an application, it's the application developer's fault and officially Not His Problem and if you try to make it his problem he will tell you exactly where to get off. Especially if it's a closed source application.

On breaking things

Posted Nov 26, 2010 13:40 UTC (Fri) by mrshiny (subscriber, #4266)

I think you are correct about his position. I just feel that it's not the right position for a library maintainer to take, especially the single most important library in the whole system.

The thing that bugs me is that there is a way to implement this change such that all newly-compiled apps get the improvement while older apps get the older behaviour. Sure, for Fedora that means that every single app which might have this bug is now vulnerable, but anything else will be fine. Lots of people have apps that they can't easily change. Many of those apps are even Free Software. Those users cannot reliably upgrade glibc, it seems.

On breaking things

Posted Nov 26, 2010 19:09 UTC (Fri) by giraffedata (subscriber, #1954)

> I just feel that it's not the right position for a library maintainer to take, especially the single most important library in the whole system.

But let's not attribute more responsibility to Drepper than he really has. One of the reasons a distributor of free software has the privilege of defining what is Not His Problem is that anyone who disagrees is free to do better. The article makes this point in noting that it is Fedora, not the Glibc project, that is distributing a problematic library, and Fedora is accepting that responsibility and discussing whether to distribute the old or new memcpy behavior.

And, according to the article, the people arguing in favor of distributing the new memcpy behavior aren't doing so based on principle, like Drepper, but based on the belief that giving better performance to a wide range of users over the long term is better than making Flash work for some users in the short term.

On breaking things

Posted Nov 26, 2010 19:27 UTC (Fri) by dlang (✭ supporter ✭, #313)

it's not just flash that was broken.

there's also the problem that the breakage can easily go unnoticed, and can corrupt the user's data.

On breaking things

Posted Nov 26, 2010 19:48 UTC (Fri) by mrshiny (subscriber, #4266)

If they used symbol versioning (or whatever it's called) they could have working Flash AND better performance in the long run.

On breaking things

Posted Nov 26, 2010 20:06 UTC (Fri) by oak (guest, #2786)

> My point is that users are being held hostage so that the glibc maintainers can say "meh, those stupid programmers at <wherever> should read the C99 standard".

This was already specified in ANSI C in the '80s, i.e. last century.

Memory debugging tools like Valgrind, duma etc. have been giving warnings about memcpy() calls with overlapping addresses at least for a decade.

If 10-20 years isn't enough for e.g. Adobe to test with freely available (or commercial) tools that their software is robust, portable and correctly implemented, I don't have very high hopes of it ever being what I (and apparently Steve Jobs) call "product quality" SW.

On breaking things

Posted Nov 26, 2010 20:22 UTC (Fri) by dlang (✭ supporter ✭, #313)

Valgrind will only report the problem if the particular run of the program happens to produce overlapping regions.

If the work is something like defragmenting memory by moving things around, it's very possible for one particular run to not have overlaps, but another run to have overlaps. Any time you have the pointers calculated in the program, you have a case where they may or may not overlap on a particular run.
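A small C sketch of that situation, with hypothetical names (`regions_overlap`, `compact_block`): whether the two regions overlap depends entirely on values computed at run time, so a checker only sees the problem on runs where the block is moved by less than its own length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if [a, a+n) and [b, b+n) share at least one byte. */
static bool regions_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    if (n == 0)
        return false;
    return pa < pb ? (pb - pa < n) : (pa - pb < n);
}

/* Hypothetical compaction step: slide a live block of `len` bytes down
 * to `dst`. If the gap between dst and src is smaller than len, the
 * regions overlap, so memmove() is the only safe choice here. */
static void compact_block(char *dst, const char *src, size_t len)
{
    assert(dst <= src);
    memmove(dst, src, len);
}
```

With an 8-byte block, moving it down by 8 or more bytes never overlaps, while moving it down by 4 bytes does; a test run that only exercises the first case would report nothing.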

and while you are chastising Adobe for not having tested with Valgrind, make sure you chastise everyone else (including glibc maintainers) for the same thing.

this is unfortunately a very easy mistake to make, and unless a change like this is made, it's unlikely to come to light.

On breaking things

Posted Nov 26, 2010 20:46 UTC (Fri) by jengelh (subscriber, #33263)

Well, if as many users run into the problem as have, surely the chance of Adobe employees encountering it at least once would be reasonably high too.

On breaking things

Posted Nov 26, 2010 21:01 UTC (Fri) by mrshiny (subscriber, #4266)

Naturally, now that glibc has changed this bug from hypothetical to actual, the Adobe maintainers will have no problem at all reproducing it. But prior to this change nobody experienced the bug.

On breaking things

Posted Nov 26, 2010 21:43 UTC (Fri) by jengelh (subscriber, #33263)

Which would hint towards Adobe not having run Valgrind. Because, seriously, YouTube is neither new nor a small, unimportant site.

On breaking things

Posted Nov 26, 2010 22:16 UTC (Fri) by oak (guest, #2786)

> If the work is something like defragmenting memory by moving things around, it's very possible for one particular run to not have overlaps, but another run to have overlaps.

One should of course understand that tools can only give positive proof of a bug's existence, not proof that no bugs exist. Things like running Valgrind (and static checkers) should be part of the development process, so that one gets better coverage over the software's lifetime and changes. It's not some one-off, instantly forgotten thing.

> and while you are chastising Adobe for not having tested with Valgrind, make sure you chastise everyone else (including glibc maintainers) for the same thing.

Sure.

Btw, if calls to a function are made within the same library where the function is implemented, function wrappers don't catch them unless the calls also go through the .plt. And as for Glibc's FORTIFY feature, I'm not sure whether Glibc enables that for itself...?

With Duma I've seen also another "issue" in Glibc, memmove() calling memcpy() because it has inherent knowledge about in which direction memcpy() works. Because Duma doesn't have that info, it complains (Duma has now a variable for this)...

On breaking things

Posted Nov 26, 2010 11:44 UTC (Fri) by cortana (subscriber, #24596)

> No, only statically linked programs are NOT affected by the change in Glibc.

Even ones that make use of NSS modules?

On breaking things

Posted Nov 26, 2010 13:34 UTC (Fri) by mrshiny (subscriber, #4266)

I couldn't answer that question, I don't know enough about it. But by definition, statically-linked libraries don't get upgraded, so your old statically-linked apps with hidden bugs won't suddenly find that those bugs are now active once glibc is upgraded for the rest of the system.

On breaking things

Posted Nov 26, 2010 20:00 UTC (Fri) by oak (guest, #2786)

You can't do completely statically linked programs with Glibc unless they're really simple. There are several parts in Glibc (like NSS) which load code dynamically.

With e.g. C-libraries intended for embedded devices like uClibc, it's a bit easier.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds