hostname matching
Posted Jun 4, 2017 9:37 UTC (Sun)
by tialaramex (subscriber, #21167)
In reply to: hostname matching by njs
Parent article: Python ssl module update
But why? You can't use this name _for_ anything here. I absolutely understand that users want the name shown as they expect it, but the user isn't feeding the name into the hostname matching code; almost always the user doesn't care about matching at all, and this all needs to happen behind the scenes when they connect. If you are able to connect to the host (otherwise what are you trying to "match" against?), then somewhere you have successfully figured out the punycode DNS name for this host, and _that_ is the thing you ought to be matching against the SAN dnsName inside the certificate. [[ If you connected by IP address, you should only be matching SAN ipAddress names, NOT trying to contemplate dnsNames; do not repeat Microsoft's bug here. ]] Doing the conversion separately in each place that it occurs just increases the chance that things will break.
If the reason is just "it looks like text, so we accept Unicode", well, I don't know enough about Python style to recommend the correct way forward; in Java I would suggest labelling the Unicode API @deprecated and explaining why in the documentation. It's not useless to offer to do Punycode translation here; it just makes the API needlessly fragile to rely on it for the usual case, when we should already know the exact name we're trying to match. Given that people _shouldn't_ be calling in here with Unicode, it's probably safer to actively reject it than to try to muddle along; that way people who do need to work directly with the U-name form (e.g. maybe a test tool) will be aware of the sharp edge they're invoking, because they'll need to do the encode/decode step themselves.
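To make that concrete, here is a minimal sketch of the kind of strict matcher I mean, assuming the cert dict shape that ssl.getpeercert() returns; the strict_match name, the ASCII check and the lack of wildcard handling are illustrative simplifications, not any existing API:

    # Sketch only: refuse Unicode reference identifiers and never consult
    # dnsName entries when the caller connected by IP address.
    import ipaddress

    def strict_match(cert, reference):
        # `cert` is the dict returned by ssl.SSLSocket.getpeercert();
        # `reference` is the name or address the caller actually connected to.
        try:
            reference.encode("ascii")
        except UnicodeEncodeError:
            # Callers must hand over the A-label (punycode) form already.
            raise ValueError("reference identifier must be an A-label, not a U-label")

        san = cert.get("subjectAltName", ())
        try:
            ip = ipaddress.ip_address(reference)
        except ValueError:
            ip = None

        if ip is not None:
            # Connected by IP address: only SAN ipAddress entries count.
            return any(kind == "IP Address" and ipaddress.ip_address(value) == ip
                       for kind, value in san)
        # Connected by DNS name: compare the A-label against SAN dnsName entries.
        return any(kind == "DNS" and value.lower() == reference.lower()
                   for kind, value in san)

(Real code would also need the RFC 6125 wildcard rules, of course.)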
Over in m.d.s.policy we had discussions with Cory Benfield about the other end of this stuff. Cory sees that the CA trust relationships packaged up with a Linux distro, or with Python requests, are only a crude partial summary of the actual CA trust exhibited by the browsers (in this case Mozilla's Firefox), some of which is implemented in software. In particular, browsers often impose what we might call "sanctions" short of distrust through such code: e.g. a poorly managed French government CA is not actually trusted by Firefox to issue for TLDs that aren't controlled by the French state, and the incompetent/deceitful WoSign CA is not actually trusted to issue new certificates. However, the simple list of trusted CAs exported to software like Cory's does not reflect these nuances; in both cases the CA is simply "trusted", because the alternative is "not trusted". Alas, we did not come to much of a conclusion. There is understandable reluctance on the Mozilla side to do more (they already do more than their fair share), and the sanctions imposed are a bit "ad hoc", so there's not much realistic chance of consistently exposing them as data that other tools could consume.
Posted Jun 5, 2017 0:09 UTC (Mon)
by njs (subscriber, #40338)
[Link] (4 responses)
Some security-conscious libraries like requests do already do their own IDN encoding, so that the stdlib functions only see the A-label.
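Roughly, that pre-encoding step looks like this; a sketch using the third-party idna package (an IDNA 2008 implementation), with illustrative host and variable names:

    import socket, ssl
    import idna  # third-party IDNA 2008 implementation

    host = "faß.de"                                # what the user typed (U-label)
    alabel = idna.encode(host).decode("ascii")     # "xn--fa-hia.de" (A-label)

    ctx = ssl.create_default_context()
    with socket.create_connection((alabel, 443)) as sock:
        # The stdlib only ever sees the ASCII A-label, both for SNI and for
        # certificate hostname matching.
        with ctx.wrap_socket(sock, server_hostname=alabel) as tls:
            print(tls.getpeercert()["subjectAltName"])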
> Over in m.d.s.policy we had discussions with Cory Benfield about the other end of this stuff - Cory sees that the CA trust relationships packaged up with a Linux distro, or with Python requests are only a crude partial summary of the actual CA trust exhibited by the browsers
Yeah, this is also unfortunate. Cory's currently engaged in a herculean effort to define a new TLS API for Python that can delegate to the platform TLS implementations on Windows and MacOS. I'm not sure that they're actually any better at this in practice, but at least it would reduce the number of distinct trust databases, and shift the responsibility away from the Python devs. Of course on Linux we can't even agree on where the list of trusted CAs gets put on disk, never mind any kind of more sophisticated policy decisions...
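For what it's worth, the stdlib does at least expose where OpenSSL expects the system's CA bundle to live, which already shows how much this varies from distro to distro; a quick probe (the output depends entirely on how the local OpenSSL was built and packaged):

    import ssl

    # Where OpenSSL (and hence the ssl module) looks for trusted CAs on this system.
    print(ssl.get_default_verify_paths())

    # create_default_context() loads those defaults via load_default_certs().
    ctx = ssl.create_default_context()
    print(len(ctx.get_ca_certs()), "CA certificates loaded")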
Posted Jun 5, 2017 1:10 UTC (Mon)
by tialaramex (subscriber, #21167)
[Link] (3 responses)
Arguably there is no such thing as "a valid certificate for faß.de"; the certificate would be for xn--fa-hia.de, and it's purely a presentation-layer decision to render this as faß.de. It certainly isn't correct to say "Oh, the user can type faß.de, we'll connect to the wrong machine, then give them a certificate error". That's not even a halfway acceptable solution.
There absolutely are CAs which will issue a certificate with a dnsName SAN for xn--fa-hia.de and then in CN they'll write faß.de, because they can (the dnsName is deliberately defined with one of ASN.1's far too numerous sort-of ASCII encodings, so you can't write ß there, but CN is just arbitrary human-readable text...). However, checking CN for a Unicode version of the name just compounds the original error; please don't do that either!
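If you want to see what a certificate actually names, pull the SAN entries out and ignore CN entirely; a sketch using the third-party cryptography package, where "server.pem" is a placeholder file name:

    from cryptography import x509
    from cryptography.hazmat.backends import default_backend

    with open("server.pem", "rb") as f:
        cert = x509.load_pem_x509_certificate(f.read(), default_backend())

    san = cert.extensions.get_extension_for_class(x509.SubjectAlternativeName).value
    print("dnsName SANs:  ", san.get_values_for_type(x509.DNSName))   # e.g. ['xn--fa-hia.de']
    print("ipAddress SANs:", san.get_values_for_type(x509.IPAddress))
    # cert.subject may contain a CN of "faß.de", but that is not a reference
    # identifier and should play no part in hostname matching.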
Posted Jun 8, 2017 7:33 UTC (Thu)
by njs (subscriber, #40338)
[Link] (2 responses)
In conclusion, TLS is hard and software is hard and everything is terrible.
Posted Jun 8, 2017 22:32 UTC (Thu)
by tialaramex (subscriber, #21167)
[Link] (1 responses)
From the Web PKI side, bugs like this mean when we say to CAs "Don't do X" they point at the bug and say "We have to because of this bug". And so another year or six goes by without the problem fixed. Python being part of the problem not the solution is disappointing.
Posted Jun 11, 2017 8:26 UTC (Sun)
by njs (subscriber, #40338)
[Link]
I also just alerted Cory to the issue in the hope that his new TLS library will avoid this problem... the Python ssl maintainer(s) is (are) certainly aware of it, but the stdlib ssl module is (like everything) pretty under-resourced, and with the Python release cycle and the py2/py3 split, getting this kind of complex change done can be really slow :-/
Posted Jun 6, 2017 3:00 UTC (Tue)
by flussence (guest, #85566)
[Link] (1 responses)
It caused me some mild grief, e.g. Pidgin wouldn't connect to AIM any more because its entire SSL chain was rotten. Some workaround must be in place since it still uses Symantec certs.
Posted Jun 6, 2017 16:17 UTC (Tue)
by tialaramex (subscriber, #21167)
[Link]
I appreciate that CACert's processes may feel robust if you happen to know the core CACert people, but most of us don't and never will, so what we see is just another flailing volunteer group. Ten years ago CACert looked like a reasonable way forward, but today it does not. Maybe if CACert had been in the game much earlier, say in 1998 rather than 2003, they'd already have been included in key stores prior to Honest Achmed and the CA/B, and so they'd be _inside_ the tent making rules for newcomers, not outside desperately playing catch-up.
In terms of competence, I see basically the same sort of errors made by CACert as at Symantec, and I feel the same way. Yes, in principle you can take a bunch of tools and know-how and do whatever you want, issue whatever you want, and it will all work out fine. But you will very likely make lots of mistakes if you do that, so I _strongly_ recommend you instead put the effort into having machines do just a handful of things very well, and then sit on your hands. At one point Symantec tried to create a custom tbsCertificate and in doing so erroneously signed it, even though the _whole point_ of the exercise was not to sign anything; when you read transcripts of CACert trying to follow simple instructions for a non-standard procedure, it looks much the same.
The bug for the lack of IDNA 2008 support is: https://bugs.python.org/issue17305
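The difference is easy to demonstrate: the stdlib's built-in "idna" codec implements IDNA 2003, which folds ß to "ss", while the third-party idna package implements IDNA 2008 and produces the xn-- form discussed above:

    import idna  # third-party package

    print("faß.de".encode("idna"))   # b'fass.de'        (IDNA 2003 maps ß to ss)
    print(idna.encode("faß.de"))     # b'xn--fa-hia.de'  (IDNA 2008 keeps ß)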
Gentoo's packaging of Mozilla's CA bundle is surprisingly opinionated: not only have they given the option to trust CACert (the only root that has OV/EV practices worth a damn), but they also blacklisted the evil Symantec/WoSign/StartCom certs far earlier than the browsers did.