Python, SSL/TLS certificates and default validation
Since the beginning of time—Python time anyway—there has been no checking of SSL/TLS certificates in Python's standard library; neither the urllib nor the urllib2 library performs this checking. As a result, when a Python client connects to a site using HTTPS, any certificate can be offered by the server and the connection will be established. That is probably not what most Python programmers expect, but the documentation does warn those who read it. There are alternatives, of course, but not in the standard library—until now. Python 3.4 makes things a lot better but still does no verification by default, which is a major concern to some Python developers.
To address that concern, Donald Stufft proposed that a backward-incompatible change be made to Python 3 so that SSL/TLS certificates are checked by default when HTTPS is used. While Python 3.4 has made it much easier to turn on certificate checking (by way of a default SSLContext object in the standard library), it does not do so by default. Turning certificate checking on by default would break lots of applications that are, knowingly or unknowingly, relying on the existing behavior. For example, applications that connect to sites with self-signed certificates, or with certificates signed by certificate authorities (CAs) that are not in the system-wide root store (e.g. CAcert), work just fine until certificate checking is turned on.
At first blush, it seems like an obvious change to make. Clearly anyone making a connection using HTTPS would want to ensure that the certificate is valid at the other end. But it is not quite that simple. There are many sites out there with certificates that were not signed by one of the "approved" CAs. For any of a number of reasons—cost being the most obvious—a web site may decide to sign its own certificate or to use ones signed by alternative CAs, possibly their own mini-CA that was set up to sign multiple company-specific certificates.
It is really up to the user to determine what to do when there are certificates that are not signed by the approved CAs; applications need to provide some way for them to choose (à la browser certificate warnings). So, flipping a switch in the standard library will just break applications when they connect to sites whose certificates don't validate for any reason (a man in the middle, or just a certificate that is signed by a CA not in the root store), but users will have no way to fix the problem. It would require a code change that typical users are not able to make. It all adds up to something of a dilemma.
While most agreed with Stufft in the abstract—that certificate checking should default to on—there was strong sentiment that a change like that couldn't be made quickly. Marc-Andre Lemburg suggested using the usual deprecation mechanism. He also noted that some python.org sites use CAcert certificates, which would be directly affected by the changes.
Nick Coghlan was even more specific, laying out a possible transition plan that would deprecate the feature in a Python 3.6 or 3.7 time frame (2017 or later). Changing things quickly is not an option, he said:
But Jesse Noller agreed with Stufft:
Noller continued that the default behavior makes it "trivial to MITM [man in the middle] an application". But, overall, support for a quick change was hard to find in the thread. Most were concerned that applications will break and that Python will be blamed. Stephen J. Turnbull pointed out that it is more than just interactive applications that will be affected:
I don't know what the right answer is, but this needs careful discussion and amelioration, not just "you're broken, so take the consequences!"
The right answer will eventually have to come in the form of a Python enhancement proposal (PEP), though none has been started. There is plenty of time as Python 3.4 is in feature freeze (due to be released in March) and 3.5 will come in the latter half of 2015. Stufft made another suggestion that might be incorporated into a transition plan in the PEP: add an environment variable that allows users to revert to not checking certificates. That "would act as a global sort of --insecure flag for applications that don't provide one", he said. Another possibility that did not get mentioned would be to have an environment variable that turned on the checking, which would make for an easy way to look for broken code.
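The environment-variable escape hatch described above might look roughly like the following sketch. The variable name `EXAMPLE_HTTPS_INSECURE` is invented purely for illustration; it is not something Stufft or the thread specified, and the helper function is likewise hypothetical.

```python
import os
import ssl

# Hypothetical variable name, chosen for illustration only.
INSECURE_ENV_VAR = "EXAMPLE_HTTPS_INSECURE"


def context_from_environment(environ=None):
    """Return a verifying SSLContext unless the user opted out via the environment."""
    environ = os.environ if environ is None else environ
    context = ssl.create_default_context()  # verifies chain and hostname
    if environ.get(INSECURE_ENV_VAR) == "1":
        # check_hostname has to be switched off before verify_mode
        # can be lowered to CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

The reverse idea, an opt-in variable that turns checking on, would only need the condition and the default swapped.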
The lack of certificate validation in the Python standard library has been known for a long time. There are scary warnings about it in various places in the Python documentation. We looked at the problem (in many more places than just Python) in 2012. There is even the alternative Requests library that by default does certificate validation. For Python 2.x, Requests is one of the few ways to actually get certificate validation at all—there is nothing in the Python 2 standard library that does it.
Things are clearly getting better. With Python 3.4, it will be fairly straightforward for developers to use ssl.create_default_context() to turn on certificate checking, which is a big step in the right direction. But, regardless of how much sense it seems to make to do it by default, the amount of legacy code out there makes it too risky to do without a good deal of warning. The next few years will hopefully provide that warning and Python will eventually be hardened by default against man-in-the-middle attacks on SSL/TLS.
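As a minimal sketch of what opting in looks like, the following uses ssl.create_default_context() with urllib.request. The helper name `fetch_verified` is an assumption made for this example, and the `context` argument to urlopen() is assumed to be available, which is the case in current Python 3 releases.

```python
import ssl
import urllib.request


def fetch_verified(url, timeout=10):
    """Fetch a URL over HTTPS, verifying the server certificate and hostname."""
    # create_default_context() loads the system CA store and enables
    # both chain verification and hostname checking.
    context = ssl.create_default_context()
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

With this context, a self-signed certificate or one signed by a CA outside the system store makes the connection fail instead of being silently accepted.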
| Index entries for this article | |
|---|---|
| Security | Python | 
| Security | Secure Sockets Layer (SSL) | 
      Posted Jan 29, 2014 17:16 UTC (Wed)
                               by lkundrak (subscriber, #43452)
                              [Link] (2 responses)
       
     
    
      Posted Jan 29, 2014 23:58 UTC (Wed)
                               by noxxi (subscriber, #4994)
                              [Link] (1 responses)
       
The first step in increasing security was to issue a big warning if certificate verification was left at the default value of no verification (it did not complain if the value was explicitly set to not verify). This helped to enable verification in some users of IO::Socket::SSL but probably caused others to explicitly disable the verification just to get rid of the warning. Then, after 3 years of loudly complaining, the default was changed to enable verification, but I still got enough reports about modules breaking, often when modules were used which had been unmaintained for years. So another step was to offer an easy way to work around modules which used IO::Socket::SSL but did not care about certificate verification. 
Then there are lots of established modules which don't support SSL at all, even if the protocols they implement often use SSL today. Instead of adding proper SSL support to core modules like Net::SMTP or Net::FTP, several modules were created which sometimes only implemented partial support for SSL (e.g. Net::SMTP::TLS implemented the use of STARTTLS while Net::SMTP::SSL implemented direct SSL, e.g. smtps) and usually did not care about certificate checking. So Net::SSLGlue was created to help with this mess. 
And there is more to verifying the certificate than just enabling it. It needs to know where the trusted root CAs are, so since 07/2013 it uses the system's default CA store as compiled into OpenSSL. But then not only the chain of certificates needs to be verified, but also the contents of the certificate itself, especially the hostname. How this is done depends on the protocol, e.g. LDAP, HTTP and SMTP each have slightly different rules. For now IO::Socket::SSL has no default, but important modules like LWP for HTTP use the correct settings.  
And there are the SSL version and the ciphers. Years ago some modules tried to be more secure by setting the version to TLSv1, with the result that this leads to compatibility problems and limits the support today to TLS 1.0 only, skipping TLS 1.1 and TLS 1.2. Or they set ciphers to 'ALL', not realizing that this also included aNULL (no authentication) and eNULL (no encryption).  
Lots of these problems are caused by the inherent complexity of SSL, but also by the lack of sufficient and easy-to-understand documentation in OpenSSL and IO::Socket::SSL. For my part I continuously try to enhance the documentation, but also to provide useful defaults so that the user does not try to "increase" the security. So recently forward secrecy was enabled by default and it uses a secure cipher set which also works around known problems with older F5 BIG-IP load balancers. 
Still open is proper handling of revocations. There is support for CRLs, which is cumbersome to use (e.g. the user has to get CRLs from somewhere), and there is no support for OCSP. Also, current browsers have some certificates blocked where not even a CRL is available (http://googleonlinesecurity.blogspot.de/2013/12/further-i...). 
BTW, there is a good paper from 2012 about the impact of insecure defaults and badly designed APIs in SSL libraries at https://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf. 
And of course SSL like currently used is broken anyway. The browsers have currently about 100 "trusted" CAs inside, where each can sign certificates for any domain or could create any number of intermediate CAs with the same rights. 
     
    
      Posted Jan 30, 2014 11:20 UTC (Thu)
                               by TRS-80 (guest, #1804)
                              [Link] 
       
     
      Posted Jan 29, 2014 18:24 UTC (Wed)
                               by dskoll (subscriber, #1630)
                              [Link] (4 responses)
       Given the NSA shenanigans, rumours of CAs being paid off or threatened by governments, and fiascos like DigiNotar, I am of the cynical opinion that checking SSL certificates against well-known CAs by default really gives you a false sense of security rather than real security.  Nevertheless, default-on for checking probably is better than default-off.
 For anything important, I would generate my own certificates and make sure any client software trusts only my CA certificate.
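In Python terms, trusting only one's own CA could be sketched as below. This is an illustration under assumptions rather than anything the commenter posted: the function name and the `cafile` path are hypothetical, and the sketch relies on the fact that a bare `ssl.SSLContext` does not load the system CA store on its own.

```python
import ssl


def private_ca_context(cafile=None):
    """Build a client context that trusts only the CA certificates in `cafile`."""
    # PROTOCOL_TLS_CLIENT enables CERT_REQUIRED and hostname checking,
    # but starts with an empty trust store (no system CAs).
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if cafile is not None:
        # Hypothetical path to a PEM file holding the private CA certificate.
        context.load_verify_locations(cafile=cafile)
    # With cafile=None the context trusts nothing, so every handshake fails.
    return context
```

A call such as `private_ca_context("my-ca.pem")` would then reject certificates issued by any of the well-known public CAs.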
      
           
     
    
      Posted Jan 29, 2014 18:59 UTC (Wed)
                               by jhoblitt (subscriber, #77733)
                              [Link] 
       
     
      Posted Jan 29, 2014 19:38 UTC (Wed)
                               by drag (guest, #31333)
                              [Link] (1 responses)
       
This.  
I used to think that it was irritating that much, if not most, Linux software needed to be configured individually to trust my certs even though my signer was part of the standard mozilla suite of CAs that get shipped with the distro.  
Now I think this is a good thing. Most software should be configured to trust a specific CA rather than trust the default CA bundles. 
     
    
      Posted Jan 29, 2014 19:45 UTC (Wed)
                               by raven667 (subscriber, #5198)
                              [Link] 
       
Although before this is relevant urllib needs to have the capability to start checking _any_ list of CAs, then you can start worrying about what public keys are in that list. 
Maybe the way forward for Python is to define a new CA bundle file that urllib always checks by default. If the file isn't there or is empty, urllib falls back to the existing behavior of not caring what the remote side looks like; if the CA file is populated, it does check. The easiest thing for an admin to do would be to symlink the Python CA bundle to the system one if they want that, or to hand-maintain their own file if that's what they want, but this shouldn't require code changes and shouldn't break on update. 
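The bundle-file fallback proposed here could be sketched as follows. The function name and the idea of passing the bundle path as an argument are assumptions for illustration; the comment does not say where such a file would live.

```python
import os
import ssl


def context_from_bundle(bundle_path):
    """Verify against `bundle_path` if it is populated, else do not verify."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if os.path.isfile(bundle_path) and os.path.getsize(bundle_path) > 0:
        # Populated bundle: keep CERT_REQUIRED and trust only these CAs.
        context.load_verify_locations(cafile=bundle_path)
        return context
    # Missing or empty bundle: fall back to the old, unverified behavior.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context
```

An admin who wants the system CAs would point (or symlink) `bundle_path` at the distribution's bundle; no application code changes either way.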
     
      Posted Feb 4, 2014 18:01 UTC (Tue)
                               by plugwash (subscriber, #29694)
                              [Link] 
       
Using ssl without verification raises the bar. Passive sniffing is no longer enough. MITM is still possible but requires both more effort from the attacker and significantly increases the risk of the attacker being exposed. 
Using ssl with CA verification raises the bar again. The attacker now has to bully or trick a CA into giving them a suitable cert. I'm sure this is possible for government agents and possibly some advanced criminals but it's likely more than enough to discourage mass abuse. 
Using ssl with an application-specific verification system raises the bar again: attackers now have to bully or trick the root of trust for said verification system (which is likely to be harder than bullying or tricking a random CA). 
The questions are: 
1: as an application developer how high do you need to raise the bar given the sensitivity of your application. 
     
      Posted Jan 30, 2014 10:00 UTC (Thu)
                               by DG (subscriber, #16978)
                              [Link] (3 responses)
       
See : https://wiki.php.net/rfc/tls-peer-verification 
And : https://twitter.com/rdlowrey/status/428239825347424257 
"#php just got more secure :) I merged the implementation for the TLS Peer Verification RFC: https://github.com/php/php-src/commit/7a90254231eb419d2d7... … More TLS++ on the way. " 
     
    
      Posted Jan 30, 2014 11:31 UTC (Thu)
                               by Richard_J_Neill (subscriber, #23093)
                              [Link] (2 responses)
       
That way, at least people would know there's a problem, even if we don't break backward compatibility.  
 
     
    
      Posted Jan 31, 2014 6:06 UTC (Fri)
                               by noxxi (subscriber, #4994)
                              [Link] (1 responses)
       
One would hope that this helps, but experience from moving the Perl module IO::Socket::SSL away from insecure defaults (see another comment here) shows that even after 3 years of printing a fat warning covering multiple lines, and then finally making verification the default, there were still lots of people who were surprised by the new default and had not cared the whole time before. 
     
    
      Posted Jan 31, 2014 23:08 UTC (Fri)
                               by hkario (subscriber, #94864)
                              [Link] 
       
Also, it's dead scary that we have such discussions in the first place. After Snowden revelations I don't have the appropriate words to describe it. 
     
      Posted Feb 1, 2014 8:59 UTC (Sat)
                               by triebefx (guest, #72361)
                              [Link] (1 responses)
       
Also, certificate validation is only the start; what about weak ciphers or known SSL/TLS attacks? Have you ever tried to run your Python client against https://www.howsmyssl.com/ ? Then try to limit the list of available ciphers to known good ones; this will depend on the SSL library being used in PycURL. This is again very hard to get right. 
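For comparison, here is a rough sketch of restricting ciphers with the standard library's ssl module rather than PycURL. The cipher string is an assumed example policy, not a recommendation taken from this thread, and the function name is hypothetical.

```python
import ssl

# Assumed example policy: forward-secret AEAD suites, no anonymous or
# unencrypted suites. Adjust to whatever your own policy requires.
CIPHER_STRING = "ECDHE+AESGCM:ECDHE+CHACHA20:!aNULL:!eNULL"


def restricted_context():
    """Return a verifying context limited to TLS 1.2+ and a short cipher list."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # set_ciphers() applies to TLS 1.2 and below; TLS 1.3 suites are
    # managed separately by OpenSSL and are not affected by this string.
    context.set_ciphers(CIPHER_STRING)
    return context
```

Calling `restricted_context().get_ciphers()` lists what the linked OpenSSL build actually enabled, which is a quick way to check the result on a given system.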
     
    
      Posted Oct 22, 2015 8:54 UTC (Thu)
                               by ceplm (subscriber, #41334)
                              [Link] 
       
     
      Posted Feb 3, 2014 9:30 UTC (Mon)
                               by faassen (guest, #1676)
                              [Link] (1 responses)
       
 
 
     
    
      Posted Feb 3, 2014 9:32 UTC (Mon)
                               by faassen (guest, #1676)
                              [Link] 
       
 
     
      Posted Feb 13, 2014 8:09 UTC (Thu)
                               by gmatht (subscriber, #58961)
                              [Link] (1 responses)
       
This would seem to greatly increase the difficulty of a MITM attack on the https connection without anything suddenly breaking. Things would break if the self-signed cert changes, but then the breakage would be the "fault" of the person who changed the cert, not the Python upgrade; a self-signed cert that changes without warning isn't particularly useful. 
My main concern is that this would now mean that we keep a log of every domain the python script accesses, which could be a privacy issue. However I expect that normally those domains would be included in the python script itself or its configuration files so that may not be a problem in practice. 
A random idea: I think it would be nice if we could also embed signatures of certs in URLs. If the user is going to a new website they don't recognize, then knowing the target matches the link is perhaps more useful to them than knowing the target matches whatever some CA mapped the name they don't recognize to. 
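The stored-certificate idea in this comment resembles trust on first use. A minimal sketch, assuming an in-memory dict as the store and SHA-256 fingerprints of the DER-encoded certificate (both choices, and the function names, are illustrative assumptions, not part of the comment):

```python
import hashlib


def fingerprint(der_cert):
    """Return the SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def check_pinned(store, hostname, der_cert):
    """Compare a certificate with the one first seen for `hostname`.

    Returns "new" (stored now), "match", or "changed".
    """
    current = fingerprint(der_cert)
    seen = store.get(hostname)
    if seen is None:
        store[hostname] = current  # first contact: remember this certificate
        return "new"
    return "match" if seen == current else "changed"
```

The DER bytes could come from `SSLSocket.getpeercert(binary_form=True)`. A "changed" result is the case where the connection should break, and the store itself is the per-domain log that raises the privacy concern mentioned above.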
     
    
      Posted Feb 15, 2014 19:37 UTC (Sat)
                               by kleptog (subscriber, #1183)
                              [Link] 
       
     
      Posted Feb 14, 2014 18:44 UTC (Fri)
                               by Velmont (guest, #46433)
                              [Link] 
       
If requests would be integrated into the standard library, new code would be written using that right away, and you'd have it secure by default. 
     
      Posted Jan 5, 2015 2:22 UTC (Mon)
                               by jmehnle (guest, #100452)
                              [Link] 
       
"Python decides for certificate validation" 
 
     
      
Situation is fixed now already though.
      
      Ruby has had a bit of back and forth about default OpenSSL ciphers and whether they should deal with the fact that some versions of OpenSSL have insecure defaults.
      
      
2: as a library or browser developer how do you avoid inadvertently lowering the bar? If insisting on ssl cert verification drives people from ssl without certificate verification to no encryption at all, you have just lowered the bar for attackers.
      
 * always check the cert
 * if (fail){ print a warning to stderr }
 * continue anyway.
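The three steps above can be sketched in Python as follows. The `connect` callable is a hypothetical, caller-supplied function taking a hostname and an SSLContext; it stands in for whatever actually opens the connection, so the sketch stays independent of any particular client library.

```python
import ssl
import sys


def connect_warn_only(connect, hostname, stream=sys.stderr):
    """Try a verified connection; on certificate failure, warn and retry unverified."""
    verifying = ssl.create_default_context()
    try:
        return connect(hostname, verifying)
    except ssl.SSLCertVerificationError as exc:
        print(f"warning: certificate check failed for {hostname}: {exc}",
              file=stream)
    # Continue anyway, as the proposal suggests, without verification.
    insecure = ssl.create_default_context()
    insecure.check_hostname = False
    insecure.verify_mode = ssl.CERT_NONE
    return connect(hostname, insecure)
```

Note that this keeps backward compatibility at the cost of still completing the connection to a possibly hostile peer.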
      
It's just too easy to ignore a warning.
      
1) There is no competing cert from a trusted CA for the domain.
2) There is no stored cert for the domain.
      
Why not take requests into stdlib?
      
Follow-up article on LWN: "Python decides for certificate validation"
      
http://lwn.net/Articles/611243/
           