Python 3 adoption

Posted Apr 15, 2015 23:02 UTC (Wed) by dlang (guest, #313)
In reply to: Python 3 adoption by fandingo
Parent article: Python 3 adoption

> You (and/or your coworkers) have no one to blame but yourself. 1) Didn't pay attention to upstream; yet, you use their updated code. 2) Didn't pay attention to your distro update policy or read the update notes. 3) Apparently, apply upgrades to production that cause major problems without even basic testing.

The third item is the only one that is reasonably their fault. The first two are the reason for using distros in the first place. Nobody is able to follow every upstream to the level of detail needed to catch this ahead of time.

As to the third, you make the assertion that they did the upgrades "without even basic testing". You probably have a different definition of "basic testing" than I do, but I don't find it possible to test every single function of every single tool for every single upgrade.

Trying to do that leads to paralysis.

Python 3 adoption

Posted Apr 16, 2015 0:12 UTC (Thu) by fandingo (guest, #67019)

> The third item is the only one that is reasonably their fault. The first two are the reason for using distros in the first place. Nobody is able to follow every upstream to the level of detail needed to catch this ahead of time.

Cyberax specifically knew about this 3 entire months before it was published! This isn't even remotely a tale about someone not knowing this change was coming. He obviously read and commented on an article that said the version number and release schedule. It's kind of a different situation when there is documented proof that he did know about this change.

Furthermore, he says they were using apt-get, and that almost certainly means Ubuntu or Debian. None of the Ubuntu LTS releases have anything newer than 2.7.5. Debian testing and unstable do, but Testing only got it on March 16th. The point being that there aren't too many places one could've picked up python 2.7.9 with apt-get.

I've worked at places that use Gentoo, Fedora, contemporary EL, or the most ancient EL distros imaginable. Different strokes for different folks, and I think that they all have their places. But, if you want to use something that changes quicker, you *have* to pay closer attention.

> You probably have a different definition of "basic testing" than I do, but I don't find it possible to test every single function of every single tool for every single upgrade.

Automated testing, TDD? I thought that was commonplace a decade ago. I can't imagine deploying code where every method and class isn't testable, especially for something like interfacing with AWS. This problem would've been caught by any Boto interaction with AWS. Any!

Furthermore, the mention of Boto and the extensive amount of work to fix the problem indicate that this functionality was critical to the application, which means it should've been covered.

The other point worth mentioning is that there is an easily searchable issue on the gevent GitHub page, and a very easy workaround was posted there on September 21. It also works for Boto.

Basically, Cyberax's story is complete nonsense. He just wanted to bitch about Python, yet again, so he concocted a story. The dates don't make any sense. There was already a workaround nearly 3 months before the Python release for the specific libraries he mentions. Then, the distro doesn't make sense (Debian Testing/Unstable or Ubuntu 14.10 in an enterprise that does blind updates?), and the timings are crazy (Python 2.7.9 released in Debian Testing and Ubuntu 14.10 on March 16, 2015).

Maybe there was a problem with their application, maybe they are running a newer Apt-based distro, and maybe the update slipped past them. Okay. The parts that seem totally unreasonable are 1) that he knew about this change in September by commenting on an LWN article about the compatibility break (and complained about it) and 2) that he would spend "several 20-hour days" trying to fix this problem and not type in "boto python 2.7.9 ssl" in Google and find the same solution that I did, which was posted in September. Hell, that solution is damn simple anyways, and any programmer should be able to recreate a missing method. Or, you know, just downgrade the Python package.

Getting back to testing, don't you have multiple environments? I don't know what his application does, but it is hard to imagine that, with any sort of staggered upgrades across environments, not even one of the developers would notice that every call to AWS was failing spectacularly.

Python 3 adoption

Posted Apr 16, 2015 3:41 UTC (Thu) by dlang (guest, #313)

> Automated testing, TDD? I thought that was commonplace a decade ago. I can't imagine deploying code where every method and class isn't testable, especially for something like interfacing with AWS. This problem would've been caught by any Boto interaction with AWS. Any!

What you are missing is that they didn't deploy their code. They applied system patches.

Every month when Windows patches are applied in your network, does every function of every application (including ones that you didn't write) get tested before being deployed to anything important?

If so, you are a 1%er (if not rarer).

Python 3 adoption

Posted Apr 16, 2015 6:03 UTC (Thu) by fandingo (guest, #67019)

> Every month when Windows patches are applied in your network, does every function of every application (including ones that you didn't write) get tested before being deployed to anything important?

First, that's not my job, and second, I don't use Windows at work, so I don't have the first clue. Additionally, I wouldn't really consider workstation updates anywhere near critical, but perhaps I simply don't work at a company with enough employees to fill a stadium. (I don't see how this relates to the issue at hand anyways.)

I feel like I'm in crazy town talking to you. Do you not promote system updates to your dev, test, and integration environments before pushing them to production? This isn't some sort of sneaky problem that only triggers under rare circumstances. If boto is unpatched and talks to AWS in any manner, it's error city. Cursory test coverage would hit it, and developers just doing their normal work would uncover it in no time.

You seem to want to turn this into some sort of hypothetical or abstract discussion. That's not the situation. It's a very specific set of circumstances that affect a defined set of libraries (boto and gevent -- really just gevent). I don't really care about talking about abstract stuff when there's already a concrete discussion to have.

The facts are that this error would be apparent to any developer or system administrator who has segregated environments and staggers updates through them because the errors would occur so often and obviously that they could not be overlooked. It was either sloppy engineering that allowed this patch to be deployed to production without testing or lazy management who didn't realize they were dependent on dead code (gevent) and didn't bother to follow changes that might affect that dead library.
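As a rough illustration of the sort of cursory check being argued for here, the sketch below verifies after a system update that the attributes an application depends on still exist. The helper `missing_attributes` and the `REQUIRED` mapping are hypothetical names made up for this example; this is not the gevent or boto workaround, and a real check would list whatever your own code and its libraries actually call.

```python
import importlib


def missing_attributes(required):
    """Return the names from `required` that are absent in this interpreter.

    `required` maps a module name to the attribute names the application
    relies on. Both this helper and the mapping are hypothetical.
    """
    missing = []
    for module_name, attrs in required.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The whole module is gone or fails to import.
            missing.append(module_name)
            continue
        for attr in attrs:
            if not hasattr(module, attr):
                missing.append("%s.%s" % (module_name, attr))
    return missing


# Illustrative entries only; replace with what your stack really uses.
REQUIRED = {
    "ssl": ["SSLContext", "create_default_context"],
    "socket": ["create_connection"],
}

if __name__ == "__main__":
    problems = missing_attributes(REQUIRED)
    if problems:
        raise SystemExit("missing after upgrade: " + ", ".join(problems))
    print("smoke check passed")
```

Run in a dev or test environment right after the package update, a check like this only catches names that vanished; it says nothing about changed behavior, which still needs a real call against the service.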

Python 3 adoption

Posted Apr 17, 2015 9:52 UTC (Fri) by asaz989 (guest, #67798)

At my previous work, apt-get dist-upgrade was not something you just ran routinely - in fact, it is an Even Scarier Thing To Do than a code deploy, since often unit test environments don't test the base system. Hence, very very slow staged deployments.

Python 3 adoption

Posted Apr 17, 2015 23:58 UTC (Fri) by zlynx (guest, #2285)

This is why unit tests are good but not sufficient. They only show that code works the way that the testers think it does.

I especially hate mock objects, because now the code is being tested against a pretend version of the real thing and the mock probably has bugs.

So without integration tests on real libraries, services and hardware, the code has not really been tested at all.

Python 3 adoption

Posted Apr 18, 2015 7:25 UTC (Sat) by cortana (subscriber, #24596)

Why not both? Unit tests are still useful for testing how your code will react when a collaborator behaves in a way that is documented, but is difficult to reproduce in a test environment.
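A minimal sketch of that use of a mock, assuming a hypothetical `fetch_status` function and a hypothetical collaborator with a `get_status` method (neither comes from boto or any real library). The mock stands in for a documented failure mode, a socket timeout, that is awkward to trigger on demand against a real service.

```python
import socket
import unittest
from unittest import mock


def fetch_status(client):
    """Return the service status, or "unavailable" if the call times out.

    `client` and its `get_status` method are hypothetical stand-ins for a
    wrapper around some remote service.
    """
    try:
        return client.get_status()
    except socket.timeout:
        return "unavailable"


class FetchStatusTest(unittest.TestCase):
    def test_timeout_is_handled(self):
        client = mock.Mock()
        # Simulate the documented but hard-to-reproduce failure.
        client.get_status.side_effect = socket.timeout("simulated timeout")
        self.assertEqual(fetch_status(client), "unavailable")

    def test_normal_reply_is_passed_through(self):
        client = mock.Mock()
        client.get_status.return_value = "ok"
        self.assertEqual(fetch_status(client), "ok")
```

The tests can be run with `python -m unittest`. They only show how `fetch_status` reacts to what the mock pretends to do; whether the real collaborator still behaves that way after an upgrade is exactly what an integration test has to cover.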