Python, SSL/TLS certificates and default validation

By Jake Edge
January 29, 2014

Since the beginning of time—Python time anyway—there has been no checking of SSL/TLS certificates in Python's standard library; neither the urllib nor the urllib2 library performs this checking. As a result, when a Python client connects to a site using HTTPS, any certificate can be offered by the server and the connection will be established. That is probably not what most Python programmers expect, but the documentation does warn those who read it. There are alternatives, of course, but not in the standard library—until now. Python 3.4 makes things a lot better but still does no verification by default, which is a major concern to some Python developers.

To address that concern, Donald Stufft proposed that a backward-incompatible change be made to Python 3 so that SSL/TLS certificates are checked by default when HTTPS is used. While Python 3.4 has made it much easier to turn on certificate checking (by way of a default SSLContext object in the standard library), it does not do so by default. Making certificate checking on by default would break lots of applications that are—knowingly or unknowingly—relying on the existing behavior. For example, applications that connect to sites with self-signed certificates or those signed by certificate authorities (CAs) that are not in the system-wide root store (e.g. CAcert) work just fine—until certificate checking is turned on.
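As a rough illustration of what opting in looks like with the Python 3.4 API, the sketch below builds the default context and hands it to an http.client.HTTPSConnection. The helper name make_verified_connection and the host example.org are placeholders chosen for this example, not anything proposed in the thread; nothing touches the network until a request is actually issued.

```python
import http.client
import ssl


def make_verified_connection(host, port=443):
    """Return (connection, context) for a certificate-checking HTTPS client.

    Hypothetical helper for illustration; no connection is opened here.
    """
    # Loads the system's trusted CA certificates and enables both chain
    # verification and hostname matching.
    context = ssl.create_default_context()
    connection = http.client.HTTPSConnection(host, port, context=context)
    return connection, context


conn, context = make_verified_connection("example.org")
```

A later conn.request("GET", "/") would then fail with an ssl.SSLError if the server's certificate does not validate, which is exactly the behavior change under discussion.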

At first blush, it seems like an obvious change to make. Clearly anyone making a connection using HTTPS would want to ensure that the certificate is valid at the other end. But it is not quite that simple. There are many sites out there with certificates that were not signed by one of the "approved" CAs. For any of a number of reasons—cost being the most obvious—a web site may decide to sign its own certificate or to use ones signed by alternative CAs, possibly their own mini-CA that was set up to sign multiple company-specific certificates.

It is really up to the user to determine what to do when there are certificates that are not signed by the approved CAs; applications need to provide some way for them to choose (à la browser certificate warnings). So, flipping a switch in the standard library will just break applications when they connect to sites whose certificates don't validate for any reason (a man in the middle, or just a certificate signed by a CA not in the root store), but users will have no way to fix the problem. It would require a code change that typical users are not able to make. It all adds up to something of a dilemma.

While most agreed with Stufft in the abstract—that certificate checking should default to on—there was strong sentiment that a change like that couldn't be made quickly. Marc-Andre Lemburg suggested using the usual deprecation mechanism. He also noted that some python.org sites use CAcert certificates, which would be directly affected by the changes.

Nick Coghlan was even more specific, laying out a possible transition plan that would deprecate the feature in a Python 3.6 or 3.7 time frame (2017 or later). Changing things quickly is not an option, he said:

Securing the web is a "boil the ocean" type task - Python 3.4 takes us a step closer by making it possible for people to easily use the system certs via ssl.create_default_context() (http://docs.python.org/dev/library/ssl.html#ssl.create_default_context), but "move fast and break things" isn't going to work on this one any more than it does for proper Unicode support or the IPv4 to IPv6 transition. Security concerns are too abstract for most people for them to accept it as an excuse when you tell them you broke their software for their own good.

But Jesse Noller agreed with Stufft:

I have to concur with Donald here - in the case of security, especially language security which directly impacts the implicit security of downstream applications, I should not have to opt in to the most secure defaults.

Noller continued that the default behavior makes it "trivial to MITM [man in the middle] an application". But, overall, support for a quick change was hard to find in the thread. Most were concerned that applications will break and that Python will be blamed. Stephen J. Turnbull pointed out that it is more than just interactive applications that will be affected:

This is quite different from web browsers and other interactive applications. It has the potential to break "secure" mail and news and other automatic data transfers. Breaking people's software that should run silently in the background just because they upgrade Python shouldn't happen, and people here will blame Python, not their broken websites and network apps.

I don't know what the right answer is, but this needs careful discussion and amelioration, not just "you're broken, so take the consequences!"

The right answer will eventually have to come in the form of a Python enhancement proposal (PEP), though none has been started. There is plenty of time as Python 3.4 is in feature freeze (due to be released in March) and 3.5 will come in the latter half of 2015. Stufft made another suggestion that might be incorporated into a transition plan in the PEP: add an environment variable that allows users to revert to not checking certificates. That "would act as a global sort of --insecure flag for applications that don't provide one", he said. Another possibility that did not get mentioned would be to have an environment variable that turned on the checking, which would make for an easy way to look for broken code.
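One way such an escape hatch might look is sketched below. The variable name PYTHON_SSL_NO_VERIFY and the helper context_from_environment are invented for this example; the thread did not settle on a name or on any particular mechanism.

```python
import os
import ssl

# Hypothetical variable name; the discussion did not specify one.
INSECURE_ENV_VAR = "PYTHON_SSL_NO_VERIFY"


def context_from_environment(environ=os.environ):
    """Build a verifying context unless the user has opted out globally."""
    context = ssl.create_default_context()
    if environ.get(INSECURE_ENV_VAR) == "1":
        # Order matters: hostname checking must be switched off before
        # verify_mode can be lowered to CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

The inverse idea mentioned above, a variable that turns checking on, would simply flip the condition and the default.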

The lack of certificate validation in the Python standard library has been known for a long time. There are scary warnings about it in various places in the Python documentation. We looked at the problem (in many more places than just Python) in 2012. There is even the alternative Requests library that by default does certificate validation. For Python 2.x, Requests is one of the few ways to actually get certificate validation at all—there is nothing in the Python 2 standard library that does it.

Things are clearly getting better. With Python 3.4, it will be fairly straightforward for developers to use ssl.create_default_context() to turn on certificate checking, which is a big step in the right direction. But, regardless of how much sense it seems to make to do it by default, the amount of legacy code out there makes it too risky to do without a good deal of warning. The next few years will hopefully provide that warning and Python will eventually be hardened by default against man-in-the-middle attacks on SSL/TLS.


Python, SSL/TLS certificates and default validation

Posted Jan 29, 2014 17:16 UTC (Wed) by lkundrak (subscriber, #43452) [Link] (2 responses)

This has been the case with Perl for a long time too.
The situation is already fixed now, though.

Python, SSL/TLS certificates and default validation

Posted Jan 29, 2014 23:58 UTC (Wed) by noxxi (subscriber, #4994) [Link] (1 responses)

I have been the maintainer of the Perl module IO::Socket::SSL for some years, and in 2010 I started to move gradually away from the insecure defaults the module had before. But the situation is different from Python's, because Perl never had an SSL module in the core. This has drawbacks, but on the other hand it allowed me to increase the default security of IO::Socket::SSL (which uses Net::SSLeay as the backend) without needing permission from anybody else.

The first step in increasing security was to issue a big warning if certificate verification was left at the default value of no verification (it did not complain if the value was explicitly set to not verify). This helped to enable verification in some users of IO::Socket::SSL, but probably caused others to explicitly disable verification just to get rid of the warning. Then, after 3 years of loud complaining, the default was changed to enable verification, but I still got plenty of reports about modules breaking, often modules that had been unmaintained for years. So another step was to offer an easy way to work around modules which used IO::Socket::SSL but did not care about certificate verification.

Then there are lots of established modules which don't support SSL at all, even though the protocols they implement often use SSL today. Instead of proper SSL support being added to core modules like Net::SMTP or Net::FTP, several modules were created which sometimes implemented only partial support for SSL (e.g. Net::SMTP::TLS implemented the use of STARTTLS, while Net::SMTP::SSL implemented direct SSL, i.e. smtps) and usually did not care about certificate checking. So Net::SSLGlue was created to help with this mess.

And there is more to verifying the certificate than just enabling it. The module needs to know where the trusted root CAs are, so since 07/2013 it uses the system's default CA store as compiled into OpenSSL. But then not only the chain of certificates needs to be verified, but also the contents of the certificate itself, especially the hostname. How this is done depends on the protocol, e.g. LDAP, HTTP and SMTP each have slightly different rules. For now IO::Socket::SSL has no default, but important modules like LWP for HTTP use the correct settings.
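For comparison, Python's ssl module keeps the same two checks separate: verify_mode covers the certificate chain and check_hostname covers the name in the certificate. The sketch below only shows that a context with hostname checking enabled refuses to wrap a client socket unless it is told which name to expect. The helper names are made up for this example, and Python applies one generic matching rule rather than the per-protocol rules described above.

```python
import socket
import ssl

context = ssl.create_default_context()


def wrap_for_host(sock, hostname):
    """Wrap a client socket; hostname is what the certificate must match."""
    return context.wrap_socket(sock, server_hostname=hostname)


def refuses_without_hostname():
    """Return True if wrapping without a hostname is rejected up front."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as raw:
        try:
            context.wrap_socket(raw)  # no server_hostname given
        except ValueError:
            return True
    return False
```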

And there are the SSL version and the ciphers. Years ago some modules tried to be more secure by setting the version to TLSv1, with the result that this now causes compatibility problems and limits support to TLS 1.0 only, skipping TLS 1.1 and TLS 1.2. Or they set ciphers to 'ALL', not realizing that this also includes aNULL (no authentication) and eNULL (no encryption).
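The same two pitfalls can be avoided in Python along the following lines. This is a sketch rather than a recommendation: the cipher string is only an example of an OpenSSL cipher list that excludes the aNULL and eNULL suites, the helper name is invented, and ssl.TLSVersion assumes Python 3.7 or later.

```python
import ssl


def hardened_context():
    """Return a client context with a version floor and a filtered cipher list."""
    context = ssl.create_default_context()
    # Set a floor rather than pinning a single version, so that newer
    # protocol versions remain available as they appear.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Strong ciphers only, explicitly excluding unauthenticated (aNULL)
    # and unencrypted (eNULL) suites.
    context.set_ciphers("HIGH:!aNULL:!eNULL")
    return context
```

The list that results can be inspected with context.get_ciphers(), which is a quick way to see what a given cipher string really enables.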

Lots of these problems are caused by the inherent complexity of SSL, but also by documentation in OpenSSL and IO::Socket::SSL that is insufficient and not easy to understand. For my part, I continuously try to enhance the documentation, but also to provide useful defaults so that the user does not try to "increase" the security. So recently forward secrecy was enabled by default, and it uses a secure cipher set which also works around known problems with older F5 BIG-IP load balancers.

Still open is proper handling of revocations. There is support for CRLs, which is cumbersome to use (e.g. the user has to get CRLs from somewhere), and there is no support for OCSP. Also, current browsers have some certificates blocked where not even a CRL is available (http://googleonlinesecurity.blogspot.de/2013/12/further-i...).

BTW, there is a good paper from 2012 about the impact of insecure defaults and badly designed APIs in SSL libraries at https://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf.

And of course SSL as currently used is broken anyway. The browsers currently ship about 100 "trusted" CAs, each of which can sign certificates for any domain or create any number of intermediate CAs with the same rights.

Python, SSL/TLS certificates and default validation

Posted Jan 30, 2014 11:20 UTC (Thu) by TRS-80 (guest, #1804) [Link]

Ruby has had a bit of back and forth about default OpenSSL ciphers and whether they should deal with the fact that some versions of OpenSSL have insecure defaults.

Python, SSL/TLS certificates and default validation

Posted Jan 29, 2014 18:24 UTC (Wed) by dskoll (subscriber, #1630) [Link] (4 responses)

Given the NSA shenanigans, rumours of CAs being paid off or threatened by governments, and fiascos like DigiNotar, I am of the cynical opinion that checking SSL certificates against well-known CAs by default really gives you a false sense of security rather than real security. Nevertheless, default-on for checking probably is better than default-off.

For anything important, I would generate my own certificates and make sure any client software trusts only my CA certificate.
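In Python, that approach could be sketched roughly as follows. The helper name context_trusting_only is invented for this example, and cafile stands for the path to a private CA certificate in PEM format.

```python
import ssl


def context_trusting_only(cafile=None):
    """Return a client context whose trust store starts out empty.

    Hypothetical helper: cafile would be the path to a private CA
    certificate in PEM format.
    """
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks but,
    # unlike create_default_context(), loads no system CA certificates.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if cafile is not None:
        context.load_verify_locations(cafile=cafile)
    return context
```

With no cafile loaded the context trusts nothing at all, so every handshake would fail; once a private CA is loaded, only certificates chaining to it are accepted.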

Python, SSL/TLS certificates and default validation

Posted Jan 29, 2014 18:59 UTC (Wed) by jhoblitt (subscriber, #77733) [Link]

Agreed -- there's significant reason to believe that any number of 'public' CAs are or can be subverted by sophisticated attackers. However, giving the option of CA signature validation does prevent trivial MITM attacks, which does have merit.

Python, SSL/TLS certificates and default validation

Posted Jan 29, 2014 19:38 UTC (Wed) by drag (guest, #31333) [Link] (1 responses)

> For anything important, I would generate my own certificates and make sure any client software trusts only my CA certificate.

This.

I used to think that it was irritating that many, if not most, Linux software needed to be configured individually to trust my certs even though my signer was part of the standard mozilla suite of CAs that get shipped with the distro.

Now I think this is a good thing. Most software should be configured to trust a specific CA rather than trust the default CA bundles.

Python, SSL/TLS certificates and default validation

Posted Jan 29, 2014 19:45 UTC (Wed) by raven667 (subscriber, #5198) [Link]

> Most software should be configured to trust a specific CA rather than trust the default CA bundles.

Before this is relevant, though, urllib needs to have the capability to check _any_ list of CAs; then you can start worrying about what public keys are in that list.

Maybe the way forward for Python is to define a new CA bundle file that urllib always checks by default. If the file isn't there or is empty, it falls back to the existing behavior of not caring what the remote side looks like; if the CA file is populated, it does check. The easiest thing for an admin to do would be to symlink the Python CA bundle to the system one if they want that, or to hand-maintain their own file if that's what they want, but this shouldn't require code changes and shouldn't break on update.
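A minimal sketch of that fallback policy, written as application-level code rather than as a change to urllib, might look like this. The function name and the idea of passing the bundle path as an argument are assumptions made for the example; the comment does not say where such a file would live.

```python
import os
import ssl


def context_for_bundle(bundle_path):
    """Verify against bundle_path if it has content; otherwise do not verify."""
    try:
        populated = os.path.getsize(bundle_path) > 0
    except OSError:
        populated = False  # missing or unreadable file

    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if populated:
        # Trust exactly the CAs listed in the bundle.
        context.load_verify_locations(cafile=bundle_path)
    else:
        # Legacy behavior: accept whatever certificate the server offers.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Note that the fallback branch deliberately reproduces the insecure status quo, which is the point of the proposal: nothing changes until an administrator populates the file.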

Python, SSL/TLS certificates and default validation

Posted Feb 4, 2014 18:01 UTC (Tue) by plugwash (subscriber, #29694) [Link]

If you use plain unencrypted connections then anyone can sniff your traffic with minimal effort and at basically zero risk to themselves.

Using ssl without verification raises the bar. Passive sniffing is no longer enough. MITM is still possible but requires both more effort from the attacker and significantly increases the risk of the attacker being exposed.

Using ssl with CA verification raises the bar again. The attacker now has to bully or trick a CA into giving them a suitable cert. I'm sure this is possible for government agents and possibly some advanced criminals but it's likely more than enough to discourage mass abuse.

Using ssl with an application-specific verification system raises the bar again: they now have to bully or trick the root of trust for said verification system (which is likely to be harder than bullying or tricking a random CA).

The questions are:

1: As an application developer, how high do you need to raise the bar given the sensitivity of your application?
2: As a library or browser developer, how do you avoid inadvertently lowering the bar? If insisting on ssl cert verification drives people from ssl without certificate verification to no encryption at all, you have just lowered the bar for attackers.

Python, SSL/TLS certificates and default validation

Posted Jan 30, 2014 10:00 UTC (Thu) by DG (subscriber, #16978) [Link] (3 responses)

PHP appears to have only just moved to performing peer certificate checks by default -

See : https://wiki.php.net/rfc/tls-peer-verification

And : https://twitter.com/rdlowrey/status/428239825347424257

"#php just got more secure :) I merged the implementation for the TLS Peer Verification RFC: https://github.com/php/php-src/commit/7a90254231eb419d2d7... … More TLS++ on the way. "

Python, SSL/TLS certificates and default validation

Posted Jan 30, 2014 11:31 UTC (Thu) by Richard_J_Neill (subscriber, #23093) [Link] (2 responses)

Surely a bare minimum would be:
* always check the cert
* if (fail){ print a warning to stderr }
* continue anyway.

That way, at least people would know there's a problem, even if we don't break backward compatibility.
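Those three steps could be sketched as below. The connect argument is a placeholder for whatever actually opens the connection (a callable taking a host and an SSLContext), and all of the function names are invented for this example. ssl.SSLCertVerificationError assumes Python 3.7 or later.

```python
import ssl
import sys


def verified_context():
    return ssl.create_default_context()


def unverified_context():
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before CERT_NONE
    context.verify_mode = ssl.CERT_NONE
    return context


def connect_warn_and_continue(connect, host, log=sys.stderr):
    """Try a verified connection; on a certificate failure, warn and retry.

    connect is a caller-supplied callable taking (host, context).
    """
    try:
        return connect(host, verified_context())
    except ssl.SSLCertVerificationError as exc:
        print(f"warning: certificate check failed for {host}: {exc}", file=log)
        return connect(host, unverified_context())
```

The retry still goes ahead without verification, so this only makes the problem visible; it does not protect the connection.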

Python, SSL/TLS certificates and default validation

Posted Jan 31, 2014 6:06 UTC (Fri) by noxxi (subscriber, #4994) [Link] (1 responses)

> "..print a warning to stderr...continue anyway.."

One would hope that this helps, but experience from moving the Perl module IO::Socket::SSL away from insecure defaults (see another comment here) shows that even after 3 years of printing a fat warning covering multiple lines, and then finally making verification the default, there were still lots of people who were surprised by the new default and had not cared at all before.
It's just too easy to ignore a warning.

Python, SSL/TLS certificates and default validation

Posted Jan 31, 2014 23:08 UTC (Fri) by hkario (subscriber, #94864) [Link]

Maybe it should not be a warning, but a "set environment variable <modulename>SSL to 'I_want_my_connections_as_robust_as_wet_paper_napkin' if you want this program to run or complain to application author to fix it" message that just aborts applications.

Also, it's dead scary that we have such discussions in the first place. After Snowden revelations I don't have the appropriate words to describe it.

Python, SSL/TLS certificates and default validation

Posted Feb 1, 2014 8:59 UTC (Sat) by triebefx (guest, #72361) [Link] (1 responses)

PYCURL seems to be the only Python library that allows you to enable certificate validation, but it is kind of a low level API, so security is hard to get right.

Also, certificate validation is only the start; what about weak ciphers or known SSL/TLS attacks? Have you ever tried to run your Python client against https://www.howsmyssl.com/ ? Then try to limit the list of available ciphers to known good ones; this will depend on the SSL library being used in PYCURL. This is again very hard to get right.

Python, SSL/TLS certificates and default validation

Posted Oct 22, 2015 8:54 UTC (Thu) by ceplm (subscriber, #41334) [Link]

I know this discussion is way too old for anybody to care, but for the sake of anybody who reads it from the archive, I just want to say that this is very wrong. There are many third-party alternatives for making certificates work properly: PyOpenSSL, M2Crypto (of which I am a maintainer), Requests, Python-cryptography, and perhaps some others.

Python, SSL/TLS certificates and default validation

Posted Feb 3, 2014 9:30 UTC (Mon) by faassen (guest, #1676) [Link] (1 responses)

Not a word about Python 2.x, which is definitely behind most running Python code, and probably behind most new Python code being written today?

Python, SSL/TLS certificates and default validation

Posted Feb 3, 2014 9:32 UTC (Mon) by faassen (guest, #1676) [Link]

Ah, I take that back, the article does mention requests in Python 2.x. But Python 2.x is just not a concern of the maintainers of the CPython implementation. Makes total sense.

Python, SSL/TLS certificates and default validation

Posted Feb 13, 2014 8:09 UTC (Thu) by gmatht (subscriber, #58961) [Link] (1 responses)

What if python by default accepts and stores a new self-signed cert iff:
1) There is no competing cert from a trusted CA for the domain.
2) There is no stored cert for the domain.

This would seem to greatly increase the difficulty of MITMing the https connection without anything suddenly breaking. Things would break if the self-signed cert changes, but then the breakage would be the "fault" of the person who changed the cert, not the Python upgrade; a self-signed cert that changes without warning isn't particularly useful.

My main concern is that this would now mean that we keep a log of every domain the python script accesses, which could be a privacy issue. However I expect that normally those domains would be included in the python script itself or its configuration files so that may not be a problem in practice.

A random idea: I think it would be nice if we could also embed signatures of certs in URLs. If the user is going to a new website they don't recognize, then knowing the target matches the link is perhaps more useful to them than knowing the target matches whatever some CA mapped the name they don't recognize to.

Python, SSL/TLS certificates and default validation

Posted Feb 15, 2014 19:37 UTC (Sat) by kleptog (subscriber, #1183) [Link]

The "list of domains" issue can be solved the same way as for SSH: store a hash of the domain. I think introducing "store on first connect" semantics would improve security without breaking randomly all over the place.
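A rough sketch of such a store is shown below. The PinStore class is invented for this example, it keeps everything in memory, and it uses a salted HMAC-SHA256 of the host name as a stand-in for whatever hashing scheme a real implementation would choose. It does not model the "no competing cert from a trusted CA" condition from the parent comment. In real use, the DER bytes could come from SSLSocket.getpeercert(binary_form=True).

```python
import hashlib
import hmac
import os


class PinStore:
    """In-memory trust-on-first-use store keyed by a salted host hash."""

    def __init__(self, salt=None):
        self._salt = salt if salt is not None else os.urandom(16)
        self._pins = {}

    def _host_key(self, host):
        # Keyed hash, so the store does not reveal which hosts were visited.
        data = host.lower().encode("utf-8")
        return hmac.new(self._salt, data, hashlib.sha256).hexdigest()

    def check(self, host, cert_der):
        """Return 'new' on first sight, 'match' if unchanged; raise otherwise."""
        fingerprint = hashlib.sha256(cert_der).hexdigest()
        key = self._host_key(host)
        known = self._pins.get(key)
        if known is None:
            self._pins[key] = fingerprint
            return "new"
        if hmac.compare_digest(known, fingerprint):
            return "match"
        raise ValueError(f"certificate for {host} changed since first use")
```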

Why not take requests into stdlib?

Posted Feb 14, 2014 18:44 UTC (Fri) by Velmont (guest, #46433) [Link]

In recent times it seems that *everyone* is using requests instead of urllib and urllib2 directly.

If requests would be integrated into the standard library, new code would be written using that right away, and you'd have it secure by default.

Follow-up article on LWN: "Python decides for certificate validation"

Posted Jan 5, 2015 2:22 UTC (Mon) by jmehnle (guest, #100452) [Link]

There's a follow-up article on LWN on this issue, dated September 10, 2014:

"Python decides for certificate validation"
http://lwn.net/Articles/611243/