Fallout from the Python certificate verification change

By Jake Edge
December 2, 2015

Up until fairly recently, the Python standard library did not do any checking of HTTPS certificates, which allows man-in-the-middle attacks, but that all changed with PEP 476. The ssl module changed to check those certificates by default in both Python 2.7.x and 3.4.x, but that has led to compatibility problems—especially for distributors of enterprise software. Smoothing the transition for enterprise customers (and others who are change-averse) is the subject of PEP 493. But the decision on whether to accept the PEP or not seems to have gotten lost in the shuffle a bit—which proponents are now trying to rectify.

The basic problem is that some Python users are relying on the earlier behavior—knowingly or not—so changing it can come as a great unpleasant surprise. If an organization is using self-signed certificates or those signed by an entity that is not in all of the certificate root stores, it will work just "fine" for Python versions that don't implement PEP 476. It is, of course, rather insecure, as man-in-the-middle attacks can readily be performed on the connections, but there will be no errors.

Python distributors are either releasing Python 2.7.9, which has the change, or backporting that feature into earlier Python versions that they distribute. That sets their customers up for errors if they are using "bad" certificates. Since part of what those customers pay for is protection against unpleasant surprises, it makes sense that two developers from Red Hat would be joined by Marc-André Lemburg of eGenix (a Python-focused company) in creating the PEP. Their customers will need ways to override the default behavior in some cases, and the entire community will benefit if the solutions across multiple distributions are the same.

There are two separate mechanisms proposed in the version of the PEP posted by Nick Coghlan on November 11. For those wishing to provide a per-application opt out of the certificate checking, the recommendation is to add a PYTHONHTTPSVERIFY environment variable; when set to zero, it would disable certificate checking in the ssl module. For distributions that want to provide a system-wide setting, the PEP recommends using a configuration file (/etc/python/cert-verification.cfg on Unix-like systems) that can enable or disable certificate checking; it also has a setting that explicitly defers the decision to the distributor ("platform_default").

PEP 493 was originally proposed as an informational PEP that just made recommendations for distributors, but that has evolved in the last few weeks. In posting the PEP, Coghlan was asking for a pronouncement (decision) on it, but it quickly became clear that finding someone to make that decision could be troublesome. Guido van Rossum did not feel he was the right one to review it, so he asked that a benevolent dictator for life (BDFL) delegate be found. That turned out to be tricky, as some of the more likely candidates also work for Red Hat, which might make the process seem to be just a rubber-stamping of the company's decision, as Van Rossum pointed out:

Hm, making Christian [Heimes] the BDFL-delegate would mean two out of three authors *and* the BDFL-delegate all working for Red Hat, which clearly has a stake (and IIUC has already committed to this approach ahead of PEP approval). SO then it would look like this is just rubber-stamping Red Hat's internal decision process (if it's a process -- sounds more like an accident :-).

The accident Van Rossum refers to is that Coghlan had admittedly dropped the ball on getting the PEP accepted or rejected and that Red Hat was close to releasing RHEL 7.2, which adopted parts of the PEP. In fact, as Coghlan noted, 7.2 was released on November 19 and it implemented the file-based configuration mechanism from the PEP.

Finding a BDFL-delegate is hard since there are not that many qualified developers and several were not able to do it for various reasons, so that part has not been resolved. But the discussion soon turned to other distributions and it started to become clear that an informational PEP, with patches that must be carried by each distribution, is not really what was wanted. Barry Warsaw, who works on Python for Debian and Ubuntu, thought that the suggestions in PEP 493 should instead be added to the standard library as distributed by python.org:

I would like to see the sample code in PEP 493 implemented in the stdlib. It would then be a matter of cherry picking that should a distro decide to backport it. It means one less non-trivial delta from upstream which has to be managed separately.

Robert Collins agreed: "a PEP telling distributors to patch the standard library is really distasteful to me". While Coghlan is sympathetic to that view, he pointed out that the distributors have other ideas:

Different redistributors have different customer bases, which also differ from the audience for upstream python.org releases, so saying "don't patch Python" is denying the reality of the obligations commercial vendors have to their customers, while "when you patch Python, please ensure you abide by these guidelines" is something redistributors can realistically do.

But Warsaw noted that the presence or absence of code in the standard library really doesn't change anything with respect to what the distributors do:

ISTM [It seems to me] that the same forces are in play regardless of whether the change is in code or in an informational PEP. Best to get consensus where possible, and manifest those decisions in code, but if competing goals are the outcome of a code change or informational PEP, downstream consumers will still make what they judge to be the best decision in the interest of their users, balanced against their own competing constraints.

Those arguments seems to have won the day, as Coghlan changed the PEP from "informational" to "standards track" and posted a summary of the changes in his new draft of the PEP. It targets 2.7.12 for implementing the changes; that release will likely come sometime in mid-2016.

At the time that PEP 476 changed the ssl module to check certificates by default, there was talk of adding a way to opt out, but it was controversial. That PEP eventually went out with a suggestion that those who needed to globally opt out perform a "monkeypatch" on the ssl module. That is not a particularly popular option and is what led to the need, at least in the eyes of some, for PEP 493. Security purists tend to believe that certificate validation should always be done, but more pragmatic developers recognize there are environments where that simply may not be possible. Whether the purists or pragmatists prevail on this topic for the language as a whole is still up in the air.

Index entries for this article
Security	Python
Security	TLS certificates

Fallout from the Python certificate verification change

Posted Dec 3, 2015 3:48 UTC (Thu) by noxxi (subscriber, #4994) [Link] (9 responses)

I don't think it is a good idea to have a simple environment variable to switch off validation, especially if it has an innocent name like PYTHONHTTPSVERIFY. It should at least be something scary like "I_FULLY_UNDERSTAND_THAT_DISABLING_VALIDATION_IS_BAD" so users will think twice before using it.

It is not that I have a really good answer what should be done, but I can share what worked within Perl and what not: In Perl IO::Socket::SSL (where I'm the maintainer) has traditionally also defaulted to no validation. In 11/2012 a fat warning was added if the default of no validation was used (i.e. no explicit enabling or disabling) and in 07/2013 the default was changed to enable validation all the time. It was also added to use the usual CA path on UNIX and in 03/2014 some way to have a usable CA path on Windows too was added.

Unfortunately the result was that several code was upated to explicitly disable validation since the warnings were just annoying. And when looking at stackoverflow this is often one of the first things people suggest when dealing with SSL problems, i.e. there are lots of NullHostnameVerifier for Java and similar suggestions.

Thus I consider making it easy for users to disable validation a very bad thing, because they will just do it to get rid of problems and then forget about the impact, if they ever understood it. And even worse than the proposed environment variable is the idea of having a global file to switch validation off for all python code.

Incidentally IO::Socket::SSL has an official hook to replace any SSL settings done by the calling code. Thus by simply loading a specific module using this hook all validation could be switched off. But it is also used in practice to enable validation again for code which wrongly disabled it because the author did not fully understand the implications.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 8:37 UTC (Fri) by kleptog (subscriber, #1183) [Link] (6 responses)

Just yesterday we ran into related problems with SSL validation. As a developer you're testing everything locally and with Docker you're essentially running an identical configuration to production, except of course you don't have valid certificates for all the internal components, they all have randomly generated self-signed certificates for testing. So every where you end up making flags on all the components to disable the certificate checking to you can test everything.

An environment variable is a very easy way to handle this, but you have to be very very careful to not have it sneak into a production build. What you need is a global flag that says "we're in production, no shortcuts!" but with all the virtualisation these days machines/images have no idea where they're running any more...

I'm open to better suggestions though.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 15:58 UTC (Fri) by jhhaller (guest, #56103) [Link] (5 responses)

The essence of the problem here is that there are certain common use cases which cause problem when certificate checking is enabled. Randomly generated self-signed certificates are one case, dealing with expired certificates where one doesn't easily control renewing said certificate is another. Unfortunately, the tools are our disposal to deal with this are binary, either turn off all certificates, or turn them on. Having a more limited exception mechanism would be helpful. One example of limited exceptions is how browsers allow support of deprecated SSL authentication, by configuring the browser with site or domain exceptions. A more limited exception mechanism which only allows certain IP addresses or domains to be exempted from certificate checking would allow minimizing the damage of globally disabling certificate settings to a subset of addresses. This would allow configuring only access to test systems to bypass certificate validation, without worrying that the setting would leak into production.

For test systems, it may be better to update the procedures to allow pulling the upstream signed certificate into the local test environment. This has the possibility of that certificate and it's signing certificate to leak into production, so must still be carefully handled, so as to not repeat Dell's experience. You are generating both a certificate authority certificate and host certificates, all signed with the signing certificate, right?

Of course, Stack Exchange will still be full of descriptions on how to globally disable certificate checking, but hopefully someone will give a better answer.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 18:04 UTC (Fri) by noxxi (subscriber, #4994) [Link] (4 responses)

> Randomly generated self-signed certificates are one case

I don't really accept this as a valid use case. TLS is used to make the connection secure and not just to let it look secure. This requires validation of the certificate, because otherwise a man in the middle attack or similar is possible. Blindly accepting random generated certificates is for me the wrong way to deal with the problem caused by automatical installations - instead it should be that the installation should setup the certificates properly, i.e. create a CA, add it to the trust store, issue various certificates from the CA etc.

And even though it is only used for testing - I would not recommend to ship anything with TLS enabled where the TLS part is not tested properly.

> dealing with expired certificates ... either turn off all certificates, or turn them on. Having a more limited exception mechanism would be helpful.

At least with Perl you can simple use SSL_fingerprint to accept specific certificates. This works similar to the exceptions you can make inside the browser. The underlying idea is not hard and it should be easy to add something like this to python and other languages too. This way you can trust specific certificates, no matter if they are expired, revoked, self-signed or for the wrong hostname. It is much more safe than to simply switch off verification.

I agree with you that most environments make it too hard to make *secure* adjustments to the validation. But the current way to solve this problem by making it easy to switch validation off is wrong. Instead it should be made easy to adjust validation in a secure way.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 18:27 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> I don't really accept this as a valid use case.
Security is not binary and even unvalidated self-signed certificates are way better than cleartext. You can't just sniff TLS using a purely passive listening device (an infected phone on a corporate WiFi) and an active MITM attack requires more resources to mount and is infinitely more conspicuous.

And of course, in some cases you can not even generate a real certificate - literal IP addresses can't be signed!

Forcing half-baked TLS validation enforcement crap was one of the most ass-headed decisions of Python developers. I had stopped using or advocating Python for anything more than a 10-line script after that.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 20:56 UTC (Fri) by noxxi (subscriber, #4994) [Link]

> Security is not binary and even unvalidated self-signed certificates are way better than cleartext.

I agree. But if there is the option to do it properly one should do it. I think it can be done better than just randomly generating certificates and completely switching off certificate validation. For example one could create a list of all generated certificates and then explicitly mark these as trusted.

Nobody would (hopefully) suggest that the browsers should disable all validation just because there are some self-signed certificates out there. Instead browsers offer a way to add exceptions just for these special cases. But in other cases it should be ok to switch all validation for the application (environment variable) or even for the full system (special file)?

> .. literal IP addresses can't be signed!

It is true that public CA's will no longer do it. But private CA's can still create such certificates and they are accepted by browsers and in the programming languages. And if you use an IP address instead of hostname you will probably use a private CA anyway.

> Forcing half-baked TLS validation enforcement crap was one of the most ass-headed decisions of Python developers.

In my opinion it was actually fairly well done. Compared to PHP (no usable default CA store on windows) or Java (own CA store with comparable few root CA) they actually managed the transition with most applications still working. Of course, if one was implicitly expecting that no validation was done instead of explicitly disabling validation then the program broke. But when just connecting to a public site like the browser does it mostly worked because they also added support for SNI at the same time.

I think the main problems were actually caused by programmers which did not understand the concept of validation and when it was needed and when not and why it should be explicitly disabled when not needed. There is not really a good way to get secure defaults if the users of the software don't understand the concepts of security. And without the secure defaults we have know we had much bigger problems, because certificates were not checked even though most users implicitly expected it.

Unfortunately security has the big problem that it costs and is in a way without showing any obvious benefit. It is the same as with backups - you only realize how important they were when your data are lost. But then it is too late.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 21:02 UTC (Fri) by raven667 (subscriber, #5198) [Link] (1 responses)

> TLS is used to make the connection secure

Encryption and authentication can be two separate issues, self-signed certificates which are pinned, which is the SSH key management model, are still useful, even completely unauthenticated encryption is useful to detect errors and prevent passive monitoring even if it doesn't prevent an active attack. For far too long the idea that Authorities should Certify identity and that anything else was invalid, insecure and should generate big scary errors has retarded the deployment of our available tools. If anything unencrypted should generate warnings, encrypted but not authenticated by an authority should be normal and encrypted plus authenticated should flag the higher level of confidence.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 21:31 UTC (Fri) by noxxi (subscriber, #4994) [Link]

> self-signed certificates which are pinned,...

I'm not against self-signed as long as they are pinned or otherwise explicitly marked trusted. I'm also not against private CA. I don't favor the current model of public CA's where you have to buy certificates. But is is the only one which actually scales because without it everybody would need to explicitly trust each of the https sites without having any kind of idea what this actually means.

Writing secure programs is not possible if you expect each programmer to know every possible attack vector. Just look at the web with all the Cross Site Scripting and CSRF attacks which all can be in theory fixed if the programmers would understand the main problem and also all the small browser incompabilities. Instead the design and the defaults should provide security and robustness by default. That's why it is important to enable certificate validation by default and also make validation just work by default using the system of public CA's. There will be cases were the defaults are not appropriate and need to be adapated, but it is still much better than to expect all developers to switch on the security themselves.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 15:50 UTC (Fri) by epa (subscriber, #39769) [Link] (1 responses)

Does Python have a cultural equivalent to Perl's "warnings"? I've always felt this is something that should be handled better, with standard error from programs going into some kind of log which the system administrator can then review. Currently, for desktop applications, the stderr is usually ignored, and the lack of attention usually causes it to fill up with junk messages if it contains anything at all.

Fallout from the Python certificate verification change

Posted Dec 4, 2015 19:53 UTC (Fri) by jwilk (subscriber, #63328) [Link]

https://docs.python.org/2/library/warnings.html

Fallout from the Python certificate verification change

Posted Dec 11, 2015 3:12 UTC (Fri) by toyotabedzrock (guest, #88005) [Link] (1 responses)

If there is no cert verification the encryption is worthless because anyone can intercept it.

They should be creating a way to whitelist the self signed certificates they want to use. That is a valid way to authenticate it within a it shop.

Fallout from the Python certificate verification change

Posted Dec 11, 2015 3:36 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

> If there is no cert verification the encryption is worthless because anyone can intercept it.
Not intercept, but actually do a full-scale MITM. And it's much harder to do it covertly inside a local network than you think.