bytes vs. characters
Posted Apr 17, 2015 7:15 UTC (Fri) by zyga (subscriber, #81533)
In reply to: bytes vs. characters by Cyberax
Parent article: Report from the Python Language Summit
HTTP headers are a perfect example of binary data. Handling them as Unicode text is broken, IMHO. You can just use byte processing for everything there and Python 3.4, AFAIR, fixed some of the last gripes about the lack of formatting support for edge cases like that.
Posted Apr 17, 2015 7:22 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)
I'd love to fix these data sources, but they're out of my control. The vendor knows about it and they plan to base64-encode binary data in the future, but for now I have to work with what I have.
> You can just use byte processing for everything there and Python 3.4
Not exactly. Most of the built-in library can be used with byte sequences, but third-party libraries are often too careless.
I've fixed tons of code like this:
>def blah(p):
>    if fail_to_do_something(p):
>        raise SomeException(u"Failed to frobnicate %s!" % p)
It mostly works as is, but occasionally it doesn't.
bytes vs. characters
I think it will still be broken. There's a workaround that simply stores binary bytes in the lower byte of UCS-4 codepoints and it sorta works.
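For anyone who wants to poke at the blah() pattern from this thread, here is a minimal Python 3 sketch of what happens when a byte string ends up in a text format string, together with one reading of the "lower byte of each code point" workaround, namely decoding as Latin-1. The sample value `raw` and the variable names are made up for illustration, and I'm assuming Latin-1 is what the workaround amounts to; the thread doesn't say so explicitly.

```python
# Hypothetical header value containing one non-ASCII byte (0xE9).
raw = b"caf\xe9"

# Python 3: %-formatting bytes into a str does not raise, but it embeds
# the repr of the bytes object instead of the text.
# (On Python 2, u"..." % raw would attempt an implicit ASCII decode and
# raise UnicodeDecodeError for this value, while pure-ASCII input works.)
message = "Failed to frobnicate %s!" % raw

# Assumed workaround: Latin-1 maps byte value N to code point N, so
# decoding never fails and encoding gives back the original bytes.
text = raw.decode("latin-1")
code_points = [ord(ch) for ch in text]
round_tripped = text.encode("latin-1")

# Formatting the decoded text produces a readable message.
safe_message = "Failed to frobnicate %s!" % text

if __name__ == "__main__":
    print(message)
    print(safe_message)
    print(code_points == list(raw), round_tripped == raw)
```

Note the text you get this way is only a transport container for the bytes. It says nothing about what encoding the data was really in, which is probably why it only "sorta works".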