Python, SSL/TLS certificates and default validation
Since the beginning of time—Python time anyway—there has been no checking of SSL/TLS certificates in Python's standard library; neither the urllib nor the urllib2 library performs this checking. As a result, when a Python client connects to a site using HTTPS, any certificate can be offered by the server and the connection will be established. That is probably not what most Python programmers expect, but the documentation does warn those who read it. There are alternatives, of course, but not in the standard library—until now. Python 3.4 makes things a lot better but still does no verification by default, which is a major concern to some Python developers.
To address that concern, Donald Stufft proposed that a backward-incompatible change be made to Python 3 so that SSL/TLS certificates are checked by default when HTTPS is used. While Python 3.4 has made it much easier to turn on certificate checking (by way of a default SSLContext object in the standard library), it does not do so by default. Making certificate checking on by default would break lots of applications that are—knowingly or unknowingly—relying on the existing behavior. For example, applications that connect to sites with self-signed certificates or those signed by certificate authorities (CAs) that are not in the system-wide root store (e.g. CAcert) work just fine—until certificate checking is turned on.
At first blush, it seems like an obvious change to make. Clearly anyone making a connection using HTTPS would want to ensure that the certificate is valid at the other end. But it is not quite that simple. There are many sites out there with certificates that were not signed by one of the "approved" CAs. For any of a number of reasons—cost being the most obvious—a web site may decide to sign its own certificate or to use ones signed by alternative CAs, possibly their own mini-CA that was set up to sign multiple company-specific certificates.
It is really up to the user to determine what to do when there are certificates that are not signed by the approved CAs; applications need to provide some way for them to choose (à la browser certificate warnings). So, flipping a switch in the standard library will just break applications when they connect to sites whose certificates don't validate for any reason (a man in the middle, or just a certificate that is signed by a CA not in the root store), but users will have no way to fix the problem. It would require a code change that typical users are not able to make. It all adds up to something of a dilemma.
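As a rough sketch of what "some way for them to choose" might look like in a command-line tool, the example below maps two options onto an SSLContext. The function name build_context and the option names --cafile and --insecure are assumptions made for illustration, not something proposed in the thread; only ssl.create_default_context() and the SSLContext attributes come from the standard library (Python 3.4 or later).

```python
import argparse
import ssl


def build_context(argv):
    """Build an SSLContext from hypothetical command-line options."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cafile",
                        help="PEM file of CA certificates to trust")
    parser.add_argument("--insecure", action="store_true",
                        help="skip certificate validation entirely")
    args = parser.parse_args(argv)

    # With no --cafile, the system root store is used. When --cafile is
    # given, only the CAs in that file are trusted.
    context = ssl.create_default_context(cafile=args.cafile)

    if args.insecure:
        # Order matters: host name checking must be switched off before
        # verify_mode can be set to CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


strict = build_context([])
lenient = build_context(["--insecure"])
```

With something like this in place, a user facing a self-signed or privately signed certificate could point the tool at the right CA file, or knowingly turn validation off, without editing any code.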
While most agreed with Stufft in the abstract—that certificate checking should default to on—there was strong sentiment that a change like that couldn't be made quickly. Marc-Andre Lemburg suggested using the usual deprecation mechanism. He also noted that some python.org sites use CAcert certificates, which would be directly affected by the changes.
Nick Coghlan was even more specific, laying out a possible transition plan that would deprecate the feature in a Python 3.6 or 3.7 time frame (2017 or later). Changing things quickly is not an option, he said:
But Jesse Noller agreed with Stufft:
Noller continued that the default behavior makes it "trivial to MITM [man in the middle] an application". But, overall, support for a quick change was hard to find in the thread. Most were concerned that applications will break and that Python will be blamed. Stephen J. Turnbull pointed out that it is more than just interactive applications that will be affected:
I don't know what the right answer is, but this needs careful discussion and amelioration, not just "you're broken, so take the consequences!"
The right answer will eventually have to come in the form of a Python enhancement proposal (PEP), though none has been started. There is plenty of time as Python 3.4 is in feature freeze (due to be released in March) and 3.5 will come in the latter half of 2015. Stufft made another suggestion that might be incorporated into a transition plan in the PEP: add an environment variable that allows users to revert to not checking certificates. That "would act as a global sort of --insecure flag for applications that don't provide one", he said. Another possibility that did not get mentioned would be to have an environment variable that turned on the checking, which would make for an easy way to look for broken code.
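To make the environment-variable idea concrete, here is a minimal sketch of how such a switch could behave if implemented at the application level. The variable name EXAMPLE_HTTPS_INSECURE and the helper context_from_environment are invented for this example; the thread did not settle on a name or on any particular semantics.

```python
import os
import ssl

# Hypothetical variable name, chosen only for this example.
INSECURE_ENV_VAR = "EXAMPLE_HTTPS_INSECURE"


def context_from_environment(environ=os.environ):
    """Return a verifying context unless the user has opted out."""
    context = ssl.create_default_context()
    if environ.get(INSECURE_ENV_VAR) == "1":
        # Disable host name matching first; verify_mode cannot be set to
        # CERT_NONE while check_hostname is still enabled.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


checked = context_from_environment({})
unchecked = context_from_environment({INSECURE_ENV_VAR: "1"})
```

The opposite variant mentioned above, a variable that turns checking on, would simply invert the test: start from an unverified context and switch to the verifying one when the variable is set.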
The lack of certificate validation in the Python standard library has been known for a long time. There are scary warnings about it in various places in the Python documentation. We looked at the problem (in many more places than just Python) in 2012. There is even the alternative Requests library that by default does certificate validation. For Python 2.x, Requests is one of the few ways to actually get certificate validation at all—there is nothing in the Python 2 standard library that does it.
Things are clearly getting better. With Python 3.4, it will be fairly straightforward for developers to use ssl.create_default_context() to turn on certificate checking, which is a big step in the right direction. But, regardless of how much sense it seems to make to do it by default, the amount of legacy code out there makes it too risky to do without a good deal of warning. The next few years will hopefully provide that warning, and Python will eventually be hardened by default against man-in-the-middle attacks on SSL/TLS.
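For reference, opting in with ssl.create_default_context() might look roughly like the following on Python 3.4 or later. The host name is only an example, and nothing is sent over the network until a request is actually made on the connection.

```python
import http.client
import ssl

# The default context trusts the system's CA certificates and enables both
# certificate chain validation and host name matching.
context = ssl.create_default_context()

# Constructing the connection object does not open a socket. The TLS
# handshake, and with it the certificate check, happens on the first request
# and fails with an error if the certificate does not validate.
conn = http.client.HTTPSConnection("www.python.org", context=context)

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

The key point is that the checking is requested explicitly by passing the context; code that omits it keeps the old, unverified behavior in Python 3.4.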
