The Grumpy Editor's Python 3 experience
Posted Jul 31, 2018 22:28 UTC (Tue) by k8to (guest, #15413)
Parent article: The Grumpy Editor's Python 3 experience
* For the base problem, I've always favored shipping a lib dir with my project that the bootstrap explicitly adds to sys.path before (most) imports begin. Some people think it's hacky, but it works with any deployment model, and is simple. The only downside is if someone hacking on it doesn't realize this, so I usually put it in a README.
* import loops are nasty, and it unfortunately takes a long time in python before your project is big enough or you do something unusual enough that they bite you. The net result is that you can end up with a lot of them before you realize you don't want them. Maybe pyflakes or whatever will auto-find them. Python is kind of best run with a lot of checkers, but sadly a lot of the checkers come with too many opinions and too much work to turn off the dumb ones.
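For the lib-dir approach in the first bullet, a minimal sketch of what the bootstrap might do. The helper name `add_bundled_lib` and the directory name `lib` are assumptions for illustration, not anything from a particular project:

```python
import os
import sys


def add_bundled_lib(entry_file, dirname="lib"):
    """Put <directory of entry_file>/<dirname> at the front of sys.path.

    Returns the absolute path that was added (or was already present).
    """
    base = os.path.dirname(os.path.abspath(entry_file))
    lib_dir = os.path.join(base, dirname)
    if lib_dir not in sys.path:
        # Index 0 so the bundled copies win over system-wide installs.
        sys.path.insert(0, lib_dir)
    return lib_dir
```

The bootstrap script would call `add_bundled_lib(__file__)` near the top, before importing anything that lives in the bundled directory.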
-----
* For the division, it's been possible to get "new-style" division in python2 for many years now, so I've been able to make the switch gradually module by module. That doesn't help you of course. I guess this change makes python more similar to other languages, but I don't think it makes it more internally consistent, and I don't really think it was worth it. It's one of the things that was a pain in the butt when writing polyglot code to run on both.
* I hope you don't run into situations where you really want to use a utility that wants str when you have bytes. Often it's wrong to go bytes->str->util->str->bytes. Sometimes I have to just re-implement it, often copy-pasting from the python code. Probably I should write patches, but sometimes a whole module has the idea it only wants str, so it would be a large (and maybe controversial) patch.
* six is a little magical, agreed. I think it was a godsend when maintaining code with significant numbers of developers in a time when you had to target both pythons. If you don't though, it isn't worth it, and I think the window for that time is passing.
* The octal thing seems like the right idea, though I would have vastly preferred the error be opt-in, or opt-out. I find 0o777 truly bizarre, though I guess it's worth making octal numbers hard to do by accident.
* The most worrying part of this is the discussion of pickles. It sounds like you were transmitting pickles over the network, potentially in an insecure fashion. I've always felt pickling was acceptable for persisting to disk in scenarios where accessing the disk was already game over. However, putting them in emails smells like a remote code execution open door, even if you think you control the email store.
Obviously, if you load an object and run it, it's a remote code execution, but you may not be aware that the load action can take ownership of the process immediately without ever running your code from that point on.
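A harmless sketch of that last point. The class name `Payload` is made up, and `print` stands in for whatever callable an attacker would actually choose:

```python
import pickle


class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call print(...) with
        # this argument". Any importable callable could be named here.
        return (print, ("this line was printed from inside pickle.loads()",))


blob = pickle.dumps(Payload())

# The call happens during loading; no method on the result is ever
# invoked by the receiving code.
result = pickle.loads(blob)
```

Note that `result` is not even a `Payload`; it is whatever the named callable returned, which for `print` is `None`.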
I view pickling of python objects as truly magical. You can stash executable logic in a set of bytes and run it again later, which can be extremely powerful. You can have ephemeral plugins over the network and other crazy ideas. It's based on the python bytecode loader (that's essentially all it is), though, so it can't work across versions.
If you just want to store *data*, then something like json.dumps is probably better, though it's not necessarily safe by default (depending on python version, it is willing to deserialize executable junk to objects by default, which is truly unfortunate).
Even if you're just sending the data out to a system that is differently controlled, and not an executable object, I recommend being paranoid: https://pythonhosted.org/itsdangerous/
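To make the "be paranoid" advice concrete, here is a rough stdlib-only sketch of the general idea of signing data before it leaves your control and checking the signature when it comes back. This is not the itsdangerous API; the names `sign`, `unsign`, and `SECRET_KEY` are assumptions for the example:

```python
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key for this sketch


def sign(data, key=SECRET_KEY):
    """Serialize data as JSON and append an HMAC-SHA256 tag."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return payload + b"." + tag


def unsign(token, key=SECRET_KEY):
    """Verify the tag, then (and only then) parse the JSON payload."""
    # The hex tag never contains ".", so split on the last one.
    payload, sep, tag = token.rpartition(b".")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    # compare_digest avoids leaking where the mismatch is via timing.
    if not sep or not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch")
    return json.loads(payload.decode("utf-8"))
```

The point of the design is that nothing gets deserialized unless the tag checks out, so tampered or foreign input is rejected before any parser sees it.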
Posted Jul 31, 2018 22:39 UTC (Tue)
by jake (editor, #205)
No, our pickles are stored in the database, not taken from (or sent to) the network.
The email module woes were unrelated, mostly concerning ingesting emails to turn them into "articles".
jake
Posted Aug 1, 2018 3:10 UTC (Wed)
by k8to (guest, #15413)
Posted Aug 1, 2018 6:50 UTC (Wed)
by Darkmere (subscriber, #53695)
The data in a database was put there by someone who doesn't have the same bugs and validation patterns as you do today, so you know from the beginning that it's not validated properly.
The dev in the past is always to be considered both untrustworthy and incompetent to the point of being malicious. Just look at how much extra work they've caused you by not doing the things that you now know are right and good. Clearly you can't trust that dev.
This is something that's likely to continue. No one has caused me as much work as past me.
Posted Aug 13, 2018 7:58 UTC (Mon)
by ber (subscriber, #2142)
Can you elaborate on this? (A quick search did not turn up anything discussing this problem.)
