
hostname matching

Posted Jun 5, 2017 0:09 UTC (Mon) by njs (subscriber, #40338)
In reply to: hostname matching by tialaramex
Parent article: Python ssl module update

It does happen that Python functions that work with hostnames generally accept U-labels and do the A-label conversion automatically, but this isn't the problem: Python's getaddrinfo, SNI, and hostname checking code all use the same routine for this, so they stay consistent. (I agree that it seems a bit fragile, but I'm not aware of it having caused any problems in practice yet.)

The problem is that Python's U-label -> A-label code has gotten stuck on IDNA 2003, so if the user asks for "faß.de" then getaddrinfo helpfully gives them the IP address for fass.de, and then the hostname checking helpfully confirms that the server does have a valid certificate for fass.de, etc., and there's no indication that they're not actually talking to xn--fa-hia.de like they should be. For this it doesn't really matter whether the A-label conversion happens once at the boundary or multiple times inside, because it gives the same wrong answer either way :-).

If there were some standard well-maintained library for doing hostname checking that also took care of IDN encoding and Python delegated this stuff to it, then it would at least catch that fass.de does *not* have a valid certificate for faß.de. But really the solution is just to upgrade to IDNA 2008. (Possibly breaking everyone's code in the process, which I guess is why it hasn't happened yet.)
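
The IDNA 2003 behaviour is easy to see with nothing but the standard library's built-in "idna" and "punycode" codecs. This is only a sketch: the IDNA 2008-style A-label below is assembled by hand from the raw punycode form for illustration, which is not what a real IDNA 2008 encoder does (it would also validate and normalise the label).

```python
# "fa\u00df.de" is "faß.de", written with an escape to stay encoding-safe.
hostname = "fa\u00df.de"

# The stdlib "idna" codec implements IDNA 2003; its nameprep step
# maps the sharp s to "ss" before any encoding takes place.
idna2003 = hostname.encode("idna")

# Illustration only: prefix the raw punycode form of the label with "xn--".
label, tld = hostname.split(".")
idna2008_style = b"xn--" + label.encode("punycode") + b"." + tld.encode("ascii")

print(idna2003)        # the name getaddrinfo and friends end up using
print(idna2008_style)  # the name the user presumably meant
```

The two results are different domains, which is the whole problem: every stdlib layer agrees with every other layer, and they all agree on the wrong name.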

Some security-conscious libraries like requests do already do their own IDN encoding, so that the stdlib functions only see the A-label.
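
A rough sketch of that "encode it yourself first" approach, assuming a hypothetical helper called to_a_label. This is not requests' actual code, and the helper is deliberately naive: it skips all IDNA 2008 validation and mapping, so real code should use a proper IDNA 2008 implementation instead.

```python
def to_a_label(hostname: str) -> str:
    """Hypothetical helper: naive U-label -> A-label conversion.

    Lowercases the name and punycode-encodes any non-ASCII label.
    No IDNA 2008 validation is performed; illustration only.
    """
    out = []
    for label in hostname.lower().split("."):
        if label.isascii():
            out.append(label)
        else:
            out.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(out)


a_label = to_a_label("fa\u00df.de")  # "faß.de"

# An all-ASCII name passes through the stdlib's IDNA 2003 codec unchanged,
# so the stdlib never gets the chance to remap the sharp s.
passed_through = a_label.encode("idna")

print(a_label, passed_through)
```

The point of the design is that once the name is already an A-label, the stdlib's own conversion has nothing left to do to it.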

> Over in m.d.s.policy we had discussions with Cory Benfield about the other end of this stuff - Cory sees that the CA trust relationships packaged up with a Linux distro, or with Python requests are only a crude partial summary of the actual CA trust exhibited by the browsers

Yeah, this is also unfortunate. Cory's currently engaged in a herculean effort to define a new TLS API for Python that can delegate to the platform TLS implementations on Windows and macOS. I'm not sure that they're actually any better at this in practice, but at least it would reduce the number of distinct trust databases, and shift the responsibility away from the Python devs. Of course on Linux we can't even agree on where the list of trusted CAs gets put on disk, never mind any kind of more sophisticated policy decisions...



hostname matching

Posted Jun 5, 2017 1:10 UTC (Mon) by tialaramex (subscriber, #21167) [Link] (3 responses)

"If there were some standard well-maintained library for doing hostname checking that also took care of IDN encoding and Python delegated this stuff to it, then it would at least catch that fass.de does *not* have a valid certificate for faß.de"

Arguably there is no such thing as "a valid certificate for faß.de": the certificate would be for xn--fa-hia.de, and it's purely a presentation-layer decision to render this as faß.de. It certainly isn't correct to say "Oh, the user can type faß.de, we'll connect to the wrong machine, then give them a certificate error". That's not even a halfway acceptable solution.

There absolutely are CAs which will issue a certificate with a dnsName SAN for xn--fa-hia.de and then in CN they'll write faß.de because they can (the dnsName is deliberately defined with one of ASN.1's far too numerous sort-of ASCII encodings, so you can't write ß there, but CN is just arbitrary human-readable text...). However, checking CN for a Unicode version of the name is just compounding the original error, so please don't do that either!

hostname matching

Posted Jun 8, 2017 7:33 UTC (Thu) by njs (subscriber, #40338) [Link] (2 responses)

I was going to say oh it's not that bad, but it turns out that was based on a misreading of the source... it's not just that they have the wrong IDNA standard implemented :-(. In fact the ssl module's hostname verification will convert whatever hostname you gave it to a U-label (even if you forcibly pass in an A-label yourself), and then it will compare that against the raw subjectAltNames and CN. So currently the *only* situation in which the stdlib ssl module will successfully connect to an IDN over TLS is when the CN has the U-label in it.
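
To make the mismatch concrete, here is a simplified model of the comparison described above. It is not the ssl module's actual code, and both function names are made up for illustration. The example uses "bücher.de" because its A-label is the same under IDNA 2003 and IDNA 2008, so the stdlib codec can round-trip it.

```python
def u_label_compare(server_hostname, cert_names):
    # Model of the described behaviour: turn the hostname into a U-label
    # (even if an A-label was passed in), then compare it against the
    # raw names found in the certificate.
    u_label = server_hostname.encode("idna").decode("idna")
    return u_label in cert_names


def a_label_compare(server_hostname, cert_names):
    # What one would expect instead: compare in A-label form, which is
    # the form a dnsName subjectAltName carries.
    a_label = server_hostname.encode("idna").decode("ascii")
    return a_label in cert_names


san = ["xn--bcher-kva.de"]          # A-label, as found in a dnsName SAN
cn_with_u_label = ["b\u00fccher.de"]  # "bücher.de" written into the CN

print(u_label_compare("xn--bcher-kva.de", san))              # no match
print(u_label_compare("xn--bcher-kva.de", cn_with_u_label))  # match
print(a_label_compare("b\u00fccher.de", san))                # match
```

In this model the U-label comparison can only succeed against a certificate that has the Unicode name in it, which is exactly the CN case.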

In conclusion, TLS is hard and software is hard and everything is terrible.

hostname matching

Posted Jun 8, 2017 22:32 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (1 responses)

While I appreciate that the "and everything is terrible" line seems appropriate here, might we at least raise this as a clear bug? Can I do that somewhere? Or if it already exists, can I be told where the bug report is, so I can ensure it gets tended to by others who grok this stuff and will try to "gently" direct people towards actually doing what the spec says?

From the Web PKI side, bugs like this mean that when we say to CAs "Don't do X" they point at the bug and say "We have to because of this bug". And so another year or six goes by without the problem being fixed. Python being part of the problem, not the solution, is disappointing.

hostname matching

Posted Jun 11, 2017 8:26 UTC (Sun) by njs (subscriber, #40338) [Link]

The determinedly broken hostname matching is: https://bugs.python.org/issue28414
The lack of IDNA 2008 is: https://bugs.python.org/issue17305

I also just alerted Cory to the issue in the hope that his new TLS library will avoid this problem... The Python ssl maintainer(s) is (are) certainly aware of it, but the stdlib ssl module is (like everything) pretty under-resourced, and with the Python release cycle and the py2/py3 split, getting this kind of complex change done can be really slow :-/


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds