The Grumpy Editor's Python 3 experience
One often-heard excuse for delaying this work is that one or more dependencies have not yet been ported to Python 3. For almost everybody, that excuse ran out of steam some time ago; if a module has not been forward-ported by now, it probably never will be and other plans need to be made. In our case, the final dependency was the venerable Quixote web framework which, due to the much appreciated work of Neil Schemenauer, was forward-ported at the end of 2017. Quixote never really took the world by storm, but it makes the task of creating a code-backed site easy; we would have been sad to have to leave it behind.
Much of the anxiety around moving to Python 3 is focused on how that language handles strings. The ability to work with Unicode was kind of bolted onto Python 2, but it was designed into Python 3 from the beginning. The result is a strict separation between the string type (str), which holds text as Unicode code points, and bytes, which contains arbitrary data — including text in a specific encoding. Python 2 made it easy to be lazy and ignore that distinction much of the time; Python 3 requires a constant awareness of which kind of data is being dealt with.
In practice, for LWN at least, Unicode is not where the problems arose. The standard advice is to use bytes for encoded strings originating from (or exiting to) the world outside a program, while converting to (or from) str at the boundary, thus using only str internally. That forces a focus on how one is communicating with the environment — a focus that really needs to be there anyway. It is not a hard discipline to acquire, and it leads to more robust code overall.
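The "decode on the way in, encode on the way out" discipline can be sketched as follows. This is only an illustration: the helper names `read_text` and `write_text` are hypothetical, the UTF-8 default is an assumption, and `io.BytesIO` stands in for whatever file or socket the bytes really come from.

```python
import io


def read_text(stream, encoding="utf-8"):
    """Decode bytes arriving from outside the program into str."""
    raw = stream.read()            # bytes, in some external encoding
    return raw.decode(encoding)    # str (Unicode code points) from here on


def write_text(stream, text, encoding="utf-8"):
    """Encode str back into bytes only at the point where it leaves."""
    stream.write(text.encode(encoding))


# Simulated input: "cafe au lait" with an accented e, encoded as UTF-8.
incoming = io.BytesIO("caf\u00e9 au lait".encode("utf-8"))
headline = read_text(incoming)

# Internal processing sees only str.
shouted = headline.upper()

outgoing = io.BytesIO()
write_text(outgoing, shouted)
```

Note that the text is 12 characters long but 13 bytes once encoded, which is exactly the kind of difference that Python 2 let one ignore.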
So text encodings aren't a big challenge except — in your editor's experience — for a couple of places, one of which is the email module, which has proved to be the reason for the most version-dependent code in this particular project. Much of that is due to API changes in that module, most of which are probably justified for proper email handling even if they are annoying in the short term. But there is also the simple problem that one cannot hide the text-encoding issue when dealing with email. It's not just that a message can arrive in an arbitrary encoding: a single message can contain text in multiple encodings — in a single header line. Properly processing such email is arguably easier and more correct in Python 3, but it's different from Python 2 in subtle ways that took a while to figure out.
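As an illustration of the multiple-encodings-in-one-header problem, here is a minimal sketch using the standard library's `email.header` helpers under Python 3. The subject line is invented for the example, and this is not a description of how the LWN code handles mail.

```python
from email.header import decode_header, make_header

# A made-up Subject value holding two RFC 2047 encoded words,
# one in UTF-8 and one in ISO-8859-1.
raw_subject = "=?utf-8?q?Caf=C3=A9?= =?iso-8859-1?q?_cr=E8me?="

# decode_header() yields (bytes, charset) pairs, one per run of text
# that shares a charset.
parts = decode_header(raw_subject)
charsets = [charset for _, charset in parts]

# make_header() reassembles the pieces; str() gives decoded text.
subject = str(make_header(parts))
```

Each piece has to be decoded with its own charset before the header can be treated as a single str, which is why the encoding question cannot be hidden here.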
Another problem has put your editor in a pickle — literally. The Python pickle module is a convenient way to serialize objects, but it has always been loaded with traps for the unwary. Pickle in Python 2 could be relied upon to generate pickles that could be treated as strings, especially if the oldest "protocol" was used. In Python 3, pickles are bytes, and they are not friendly toward any attempt to treat them as strings. Even the "human readable" protocol=0 mode will produce distinctly non-readable output for some types; these include things like NUL bytes that trip up even the relatively oblivious Latin-1 decoder. The datetime type is prone to this kind of problem, for example.
One solution is to paint "PICKLES ARE NOT STRINGS" on one's monitor and to resolve never to be so sloppy again. But pickles have other problems, including sometimes surprising behavior when one pickles an object under Python 2, then tries to unpickle it under Python 3, where the definition of the object's class may have changed considerably. Your editor has concluded that pickles are an attractive way to avoid defining a proper persistence mechanism for Python objects, but that taking that shortcut leads to problems in the long run.
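The pickles-are-bytes point can be sketched as below. The base64 wrapping and the JSON alternative are assumptions added for illustration; the article does not say which persistence mechanism should replace pickles.

```python
import base64
import datetime
import json
import pickle

stamp = datetime.datetime(2018, 7, 31, 20, 18)

# Under Python 3 the result is bytes, even with the oldest protocol.
blob = pickle.dumps(stamp, protocol=0)

# If a pickle must travel through a text-only channel, wrap the bytes
# explicitly instead of pretending they are a string.
text_safe = base64.b64encode(blob).decode("ascii")
restored = pickle.loads(base64.b64decode(text_safe))

# An explicit, version-independent representation avoids pickle entirely.
record = json.dumps({"posted": stamp.isoformat()})
parsed = datetime.datetime.strptime(
    json.loads(record)["posted"], "%Y-%m-%dT%H:%M:%S"
)
```

The JSON form is more work to write, but it does not depend on the class definitions that happen to be importable when the data is read back.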
Yet another inspiration for high levels of grumpiness is the change in how module importing works. In Python 2, a line like:
import mydamnmodule
would find mydamnmodule.py in the same directory as the module doing the import. That behavior was evidently too convenient to survive into Python 3, so it was taken out. The documentation gives some lame excuse about confusion between modules located this way and standard-library modules, but your editor knows that a more mean-spirited motive must have driven such a change.
Now, one can try to fix such code with an explicit relative import:
from . import mydamnmodule
In many situations, though, that will lead to the dreaded "attempted relative import in non-package" exception that has been the cause of a seemingly infinite series of Stack Overflow postings. Once again, the rules must make sense to somebody, but they make this kind of relative import nearly impossible to use.
So there was nothing for it but to actually get a handle on the namespaces in use and change all the import statements into proper absolute form. Doing so revealed some interesting things. The lazy way in which we had set up our hierarchy was silently causing modules to be imported multiple times — as foo, lwn.foo, and even lwn.lwn.foo, for example — unnecessarily bloating the size of the running program. Such imports can also create difficult-to-debug havoc if any modules maintain module-level state that will also be duplicated and, naturally, become inconsistent.
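The duplicate-import effect can be reproduced in miniature. The sketch below builds a throwaway package in a temporary directory; the names `dupdemo_pkg` and `dupdemo_state` are hypothetical, and the sloppy `sys.path` setup is an assumption about how such a hierarchy might end up being reachable under two names.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package: <root>/dupdemo_pkg/dupdemo_state.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "dupdemo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "dupdemo_state.py"), "w") as f:
    f.write("registry = []\n")

# Sloppy setup: both the project root and the package directory itself
# are on sys.path, so the same file is reachable under two names.
sys.path[:0] = [root, pkg_dir]
importlib.invalidate_caches()

short = importlib.import_module("dupdemo_state")
full = importlib.import_module("dupdemo_pkg.dupdemo_state")

# Module-level state now exists twice and can drift apart.
short.registry.append("registered via the short name")
same_file = os.path.samefile(short.__file__, full.__file__)
```

Both names load the same source file, yet Python treats them as two unrelated modules, each with its own copy of `registry`.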
Moving to well-defined absolute imports fixed those issues, but revealed another that had been hidden: the presence of a number of import loops in the code. These loops, where module A imports B which, in turn (and possibly through several layers of indirection) tries to import A, lead to a "can't import" exception. They are almost always an indication of code structure that, to put it charitably, could use a little more thought. Fixing those required a fair amount of refactoring, profanity, and slanderous thoughts about the Python developers.
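A minimal import loop looks something like the following sketch. The module names `loopdemo_a` and `loopdemo_b` are hypothetical and are written to a temporary directory purely so that the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
sources = {
    # A needs a name from B before it has defined its own names...
    "loopdemo_a.py": "from loopdemo_b import b_value\na_value = 1\n",
    # ...and B, in turn, needs a name from the half-initialized A.
    "loopdemo_b.py": "from loopdemo_a import a_value\nb_value = 2\n",
}
for name, body in sources.items():
    with open(os.path.join(root, name), "w") as f:
        f.write(body)

sys.path.insert(0, root)
importlib.invalidate_caches()

try:
    importlib.import_module("loopdemo_a")
    loop_error = None
except ImportError as exc:
    loop_error = exc
```

The usual cure is structural: move the shared names into a third module that both sides import, so that neither depends on the other.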
The truth, though, is that these issues should have been fixed long ago; the end result of the import change is a much improved code structure here.
Some of the more annoying language changes really do seem like gratuitous attacks on people who have to maintain code over the long term, though. Python 2 did the Right Thing with source files containing both spaces and tabs, for example, while Python 3 throws a fit. The problem is easily fixed, but it seems like it didn't need to be a problem in the first place. Since time immemorial, octal constants have been written with a preceding zero — 0777, for example. Python 3 requires one to write 0o777 instead, for reasons that are not particularly clear. But JavaScript made that change too, so it must be the right thing to do.
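For reference, the octal change looks like this under Python 3. Passing the old-style literal to `compile()` as a string is just a way to show the syntax error without breaking the example itself.

```python
# The Python 3 spelling; 0o777 is also accepted by Python 2.6 and 2.7.
mode = 0o777

# The old spelling is rejected at compile time rather than being
# silently read as decimal.
try:
    compile("0777", "<demo>", "eval")
    legacy_accepted = True
except SyntaxError:
    legacy_accepted = False

# Parsing a string with an explicit base works on any version.
from_string = int("0777", 8)

# oct() produces the new notation.
shown = oct(mode)
```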
At least old-style octal constants will generate a syntax error in Python 3, so there is no chance of subtle problems resulting from those constants being interpreted as decimal. The same is not true of integer division. Python 2 defined integer division as originally intended by $DEITY and implemented by almost every processor: the result is a rounded-downward integer value. So 3/2 == 1. In Python 3, instead, dividing integers yields a floating-point result: 3/2 == 1.5. That is a change that could silently create subtle problems. In the LWN code, integer division is used for tasks like subscription management and money calculations; these are not places where mistakes can be afforded.
The fix is easy enough on its face: use // for integer (floor) division. But that requires finding every place that needs to be fixed. Grepping for "/" in a large code base is not particularly fun, especially if said code base also includes a lot of HTML. This work has been done, but it is going to take a lot of testing before your editor is confident with the results.
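The difference between the two operators, and one way of keeping money calculations in integers, can be sketched as follows. The `split_evenly` helper is hypothetical and is not taken from the LWN code.

```python
def split_evenly(total_cents, parts):
    """Split an amount in cents into integer shares that sum to the total."""
    share, leftover = divmod(total_cents, parts)
    # Hand the leftover cents out one at a time to the first shares.
    return [share + 1 if i < leftover else share for i in range(parts)]


true_quotient = 3 / 2       # Python 3: a float
floor_quotient = 3 // 2     # an int on both Python 2 and Python 3
negative_floor = -3 // 2    # floor division rounds toward negative infinity

shares = split_evenly(1000, 3)
```

Under Python 2, putting `from __future__ import division` at the top of a module gives `/` the Python 3 behavior there, which allows the conversion to be done one module at a time.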
There are numerous other little incompatibilities that one stumbles across, naturally. Some library modules have changed or are no longer present. The syntax of the except statement is different. Dictionaries no longer have has_key(). And so on. Most of these are relatively easy to catch and fix, though — just part of a day's work.
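Two of those small changes, shown in a form that runs under both Python 2.7 and Python 3; the dictionary contents are made up for the example.

```python
settings = {"theme": "dark"}

# Python 2 allowed settings.has_key("theme"); the "in" operator works
# on both versions.
has_theme = "theme" in settings

# Python 2 also accepted "except ValueError, exc"; the "as" form is
# the only one Python 3 understands.
try:
    int("not a number")
except ValueError as exc:
    message = str(exc)
```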
One might wonder about the various tools that are available to help with this transition. The 2to3 tool can be useful for finding some issues, but it wants to translate the code outright, generating a result that no longer runs under Python 2. That is a bigger jump than your editor would like to take; the strategy has very much been to get the code working under both versions of the language before making the big switch. 2to3 also chokes on the Quixote template syntax that is used by much of LWN's Python code. So it was of limited use overall.
An alternative is the six compatibility library, which can be useful for writing code that works under both Python versions. Your editor steered away from six instinctively, though, due to a kernel programmer's inherent dislike for low-level, behind-the-scenes magic. It reworks the module namespace, overrides functionality in surprising places, and requires coding in a version of the language that is neither 2 nor 3. Various versions of six bundled with dependencies have already led to problems even in the Python 2 version of the code. It is better, in your editor's opinion, to have the transitional compatibility code be in one's face, where it can be left behind once the changeover is complete. The increasing number of Python 3 features added to 2.7 make it easier to write portable code, in any case.
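As an illustration of what "in one's face" compatibility code might look like, here is a small sketch. The names `PY3`, `text_type`, `binary_type`, and `to_text` are hypothetical; this is not the approach actually used in the LWN code, which the article does not show.

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str
    binary_type = bytes
else:
    # Only evaluated under Python 2, where "unicode" exists.
    text_type = unicode  # noqa: F821
    binary_type = str


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is encoded bytes."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return value


decoded = to_text(b"caf\xc3\xa9")
unchanged = to_text(decoded)
```

Everything version-dependent sits in one visible block that can simply be deleted once Python 2 support is dropped.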
All told, the Python 3 transition has been an adventure, one that is not yet complete. It has taken a lot of time that was already in short supply. The end result, though, is cleaner code written in a better version of the language, or so your editor believes, anyway. The Python 2 code base put in over 16 years of service; hopefully the next version will be good for at least that long.
| Index entries for this article | |
|---|---|
| Python | Python 3 |
Posted Jul 31, 2018 20:18 UTC (Tue)
by mrshiny (guest, #4266)
[Link] (49 responses)
Posted Jul 31, 2018 20:23 UTC (Tue)
by lsl (subscriber, #86508)
[Link] (45 responses)
Posted Jul 31, 2018 20:32 UTC (Tue)
by mrshiny (guest, #4266)
[Link] (23 responses)
Posted Jul 31, 2018 23:38 UTC (Tue)
by gerdesj (subscriber, #5446)
[Link] (1 responses)
... except when you did ...
Posted Aug 1, 2018 14:44 UTC (Wed)
by mrshiny (guest, #4266)
[Link]
I'm firmly in the camp of "computer languages shouldn't surprise people". 011 being unequal to 11 is just bizarre. If Octal were used constantly, all the time, every day, by many many programmers, it would be one of those weird things like x = x + 1. How can x be equal to itself plus a number? Oh, the equals sign means assignment. Fair, we do lots of assignment so we get used to it, although people make mistakes, like if (x = 1) when they meant if (x == 1). (Aside: that's another surprise: the language allows assignment in a test expression, and many languages don't require a boolean expression).
If we're improving languages, we can prevent problems by removing features that aren't necessary for clean, readable, easily-typed code, and still having a language that makes sense and is understandable.
Posted Aug 1, 2018 8:43 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (10 responses)
Posted Aug 1, 2018 20:06 UTC (Wed)
by cortana (subscriber, #24596)
[Link] (9 responses)
Hmm...
Or am I missing something?
Posted Aug 1, 2018 20:31 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (5 responses)
Given a file with an unknown mode, how do you set user readable, remove other writeable and add execute to all without changing the rest of the permission bits?
chmod can't apply logical operations to the disk mode expressed in octal form, but chmod u+r,o-w,a+x will do that operation.
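For comparison, the same relative change can be expressed in code with bit operations on the mode. This is a sketch in Python using the standard `stat` constants; `adjust_mode` is a hypothetical helper, and 0o206 is used only as an example starting mode.

```python
import stat


def adjust_mode(mode):
    """Apply the equivalent of chmod u+r,o-w,a+x to a mode value."""
    mode |= stat.S_IRUSR                                  # u+r
    mode &= ~stat.S_IWOTH                                 # o-w
    mode |= stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH    # a+x
    return mode


adjusted = adjust_mode(0o206)
```

To apply this to a real file, one would read the current bits with `stat.S_IMODE(os.stat(path).st_mode)` and pass the result of `adjust_mode()` to `os.chmod()`.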
Posted Aug 2, 2018 12:33 UTC (Thu)
by cortana (subscriber, #24596)
[Link]
Posted Aug 2, 2018 15:07 UTC (Thu)
by rweikusat2 (subscriber, #117920)
[Link] (3 responses)
There are two much more interesting questions here, though.
1) Why specify a translation depending on an existing value in order to change that to one you want instead of just using 'the one you want'?
2) 0206? Seriously? Maybe better to fix the program ...
Posted Aug 2, 2018 20:40 UTC (Thu)
by rweikusat2 (subscriber, #117920)
[Link] (2 responses)
Assuming that's called pchmod, it becomes something like pchmod +511,-2.
Posted Aug 2, 2018 21:57 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (1 responses)
Posted Aug 2, 2018 22:10 UTC (Thu)
by rweikusat2 (subscriber, #117920)
[Link]
The features (or lack of features) of chmod are not relevant to the discussion. Apparently, nobody ever needed relative octal mode specifications so badly that this got implemented. As demonstrated above, this is trivial (I wrote this while waiting for a 'git gc' to finish).
Octal mode specifications are convenient in code, especially C code, because the replacement macro names are lengthy sequences of unpronounceable gibberish. They're easy enough to remember that they're also convenient for specifying absolute modes for the chmod command. I found it useful to overcome my original "numbers ... "-prejudice and would thus encourage others to try the same.
Posted Aug 1, 2018 20:38 UTC (Wed)
by marcH (subscriber, #57642)
[Link]
Yes (and you're in incredibly large company, never understood why)
Let me give you a more real world example:
chmod -R g+rX friends_can_look/
Good luck octal.
Posted Aug 2, 2018 13:27 UTC (Thu)
by virtex (subscriber, #3019)
[Link] (1 responses)
Posted Aug 6, 2018 12:59 UTC (Mon)
by cortana (subscriber, #24596)
[Link]
Posted Aug 2, 2018 9:14 UTC (Thu)
by madhatter (subscriber, #4665)
[Link] (9 responses)
I've been sysadminning for 25 years and have almost never specified them any other way. I say this not to show you're wrong, but to show that we're different.
Posted Aug 2, 2018 11:58 UTC (Thu)
by mrshiny (guest, #4266)
[Link] (8 responses)
Posted Aug 2, 2018 13:26 UTC (Thu)
by madhatter (subscriber, #4665)
[Link] (7 responses)
To follow your argument, as I understood it, from the top: you said that octal-with-a-leading-zero-in-python shouldn't be needed because no-one uses octal any more; lsl pointed out that file creation is a time when it is used; you said that you don't generally specify file permissions at all, and when you do you don't do them that way, and you were assuming that was more generally true.
I was merely pointing out that last bit of the argument is fallacious, and that some people do specify absolute file permissions. Do by all means argue that my case is a corner case and doesn't justify inclusion of octal in python. But please don't argue that my case doesn't exist.
Posted Aug 2, 2018 14:53 UTC (Thu)
by mrshiny (guest, #4266)
[Link] (6 responses)
I'm not saying octal literals should be impossible because literally nobody uses them. I'm saying octal literals with a leading zero are a trap, a misfeature, a bug waiting to happen in every general purpose programming language. Furthermore, I'm saying that the use of octal literals is relatively tiny compared to the use of programming languages. Probably the only time I've ever used octal at all was on a command line, where I didn't even need to use the leading-zero syntax because the command-line tool already interpreted numeric values as octal. chmod only accepts octal permissions and thus lets you use an unprefixed octal number for specifying permissions, so a large swath of octal use-cases are unaffected by this conversation.
The point is that yes, some people do need or prefer to use octal. But those use-cases are so insignificant compared to the general use of languages that I support making breaking changes to a language to prevent the use of leading-zero octal and instead require 0o123 notation or some other notation that is not so badly designed. Programming languages should not surprise people, and octal with a leading zero is one of those things most people never need, never use, and probably forget about. At least with the new notation there's basically no chance they'll accidentally use an octal number.
Posted Aug 2, 2018 15:05 UTC (Thu)
by madhatter (subscriber, #4665)
[Link] (3 responses)
I don't think so. I wrote:
> Do by all means argue that my case is a corner case and doesn't justify inclusion of octal in python.
You write:
> But those use-cases are so insignificant compared to the general use of languages that I support making breaking changes to a language to prevent the use of leading-zero octal
It seems to me that what I wrote is exactly what you're doing, which is fine. I think you're wrong in your count of the valid use-cases, but we'd both have to actually count them to know.
I agree with you that leading-zero-to-denote-octal is pretty odd, and I'm personally fine for you to get rid of it in any given language *as long as you replace it with another way to specify octal* - 0o777 would be fine - because some people, including me, need it.
Posted Aug 2, 2018 17:10 UTC (Thu)
by mrshiny (guest, #4266)
[Link] (2 responses)
Posted Aug 3, 2018 5:46 UTC (Fri)
by madhatter (subscriber, #4665)
[Link] (1 responses)
Posted Aug 3, 2018 17:14 UTC (Fri)
by mrshiny (guest, #4266)
[Link]
Not to mention that the danger of falling into the "Accidentally used an octal literal when I meant a decimal" trap is much reduced if octal is so commonly-used a feature that everyone is aware of it.
Most programmers simply aren't specifying unix file permissions on a constant basis. Just think of every Android or MacOS or iOS or Windows or Javascript programmer, or even web programmers on Unix. File creation is rare, and file creation with specific permissions is rarer, and file creation with specific permissions on a Unix-like filesystem is rarest. I don't need to take a census to know this is true.
Posted Aug 9, 2018 9:47 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (1 responses)
Or learnt on a language/system that encouraged people to use it (like C encourages hex).
I really can't remember that far back but I've never had a problem with octal because it was used quite extensively on Pr1me for PL/1 and FORTRAN when I first started programming what, 35 years ago now?
And seeing as Pr1mos is a Multics derivative, I guess that's where Unix got it from, too :-)
Cheers,
Posted Aug 9, 2018 11:52 UTC (Thu)
by anselm (subscriber, #2796)
[Link]
There are few visible places where Unix uses octal, and the most prominent of those is probably file access permissions. Since these come in convenient packages of three bits, octal makes a lot more sense for them than, say, hexadecimal. Incidentally, file permissions are one aspect of Unix that doesn't seem to be influenced by Multics to any great extent.
Posted Jul 31, 2018 20:32 UTC (Tue)
by roc (subscriber, #30627)
[Link] (13 responses)
Posted Jul 31, 2018 20:54 UTC (Tue)
by lsl (subscriber, #86508)
[Link] (12 responses)
Posted Jul 31, 2018 22:03 UTC (Tue)
by k8to (guest, #15413)
[Link] (9 responses)
chmod u=rwx,go=rx
At least it's super clear compared to the C version.
But I still tend to use the C constants in code as opposed to octal numbers. I think it's a mix of expecting other programmers to come across it who are not UNIX nerds, and the fear of mangling the octal, and a bit of dogmatic fear over magic inline numbers.
Posted Aug 1, 2018 0:19 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (4 responses)
So, it could be "u" means the owner of the file and "o" means other users, or it could be that "o" means the owner of the file and "u" means other users.
Really, it's hard to imagine a worse pair of letters for sowing confusion. In fact the way it makes the most sense to me is exactly the opposite of reality: "o" should be "owner" and "u" should be general users.
That's why I prefer the numerical codes and consider them simpler to get right. Every time I need the text syntax (if I need to do something more sophisticated such as remove the w bit without touching other values) I have to go look up the man page to make sure I have it right. You definitely don't want to mess it up!!
Posted Aug 1, 2018 3:08 UTC (Wed)
by k8to (guest, #15413)
[Link]
Posted Aug 1, 2018 11:57 UTC (Wed)
by tao (subscriber, #17563)
[Link] (2 responses)
But I guess we all have different ways of remembering things.
Posted Aug 1, 2018 13:03 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (1 responses)
I already clearly (I think) explained the specific issue I had. I don't object to the text form, and as mentioned I do use it when I need to use the "+" or "-" forms of chmod for example. However I think poor design choices make it harder to use correctly and so I prefer the numeric system on the command line when possible. There's little possibility of mixing up the order of three numbers.
Please note I'm speaking here specifically of the "chmod" command line syntax.
Posted Aug 2, 2018 12:10 UTC (Thu)
by tao (subscriber, #17563)
[Link]
Posted Aug 1, 2018 15:50 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link] (3 responses)
chmod 0755
u=rwx,go=rx has no inherent meaning. It's an arbitrary encoding of an integer using 'letters' and 'funny symbols' instead of numbers.
I stopped using the 'letters and funny symbols' encoding once I got over my prejudice that 'letters and funny symbols' are somehow 'inherently better' than numbers. For the octal encoding, 1 means execute, 2 means write and 4 means read (also 1 means sticky, 2 setgid and 4 setuid for the fourth set).
Posted Aug 1, 2018 20:47 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (2 responses)
Those damn, so-called "letters"... I wish email addresses had allowed digits only, they would have been as easy to remember as phone numbers.
Posted Aug 1, 2018 21:42 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link] (1 responses)
It's certainly not more complicated than remembering PINs or passcodes, something many people apparently do without problems.
It does take a conscious decision to do so, though.
Posted Aug 2, 2018 4:22 UTC (Thu)
by marcH (subscriber, #57642)
[Link]
Posted Aug 1, 2018 1:43 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Aug 3, 2018 16:44 UTC (Fri)
by mm7323 (subscriber, #87386)
[Link]
The reasons are simple and many.
1) giving things a name often help understanding
Posted Aug 1, 2018 4:29 UTC (Wed)
by warrax (subscriber, #103205)
[Link]
(Even if it were the case, octal by prepending just a 0 is utterly stupid because it goes against conventional decimal notation for no good reason. While '0o' as a prefix is heaps better, it's still unfortunate that the characters look so similar in many fonts. Hopefully programmers would use fonts that clearly distinguish between 0 and o and O, though.)
Posted Aug 1, 2018 6:20 UTC (Wed)
by jubal (subscriber, #67202)
[Link] (1 responses)
Posted Aug 1, 2018 11:36 UTC (Wed)
by dskoll (subscriber, #1630)
[Link]
Posted Aug 1, 2018 6:22 UTC (Wed)
by rsidd (subscriber, #2582)
[Link]
Posted Aug 2, 2018 16:22 UTC (Thu)
by MatyasSelmeci (guest, #86151)
[Link] (1 responses)
When I wrote code that I wanted to make compatible with both Python 3 and Python 2.4 (which did not support the 0o syntax), I had to write "int('0755', 8)".
Posted Aug 2, 2018 16:46 UTC (Thu)
by jwilk (subscriber, #63328)
[Link]
Posted Aug 2, 2018 17:01 UTC (Thu)
by farnz (subscriber, #17727)
[Link]
Most of the time, when I create a file, I want the permission bits set to 0o0666 & ~umask, and directories set to 0o0777 & ~umask. Coincidentally, these are the default permissions I will get with a plain syscall and no effort to set permission bits.
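The arithmetic can be sketched in Python as below. The 0o022 umask is an assumed, commonly seen value, and `default_mode` is a hypothetical helper; the `os.umask()` dance is there because that call can only report the old mask by setting a new one.

```python
import os


def default_mode(requested, umask):
    """Permission bits left after the umask has been applied."""
    return requested & ~umask


# os.umask() sets a new mask and returns the previous one, so reading
# the current value means setting it twice.
current_umask = os.umask(0)
os.umask(current_umask)

file_mode = default_mode(0o666, 0o022)
dir_mode = default_mode(0o777, 0o022)
mine_now = default_mode(0o666, current_umask)
```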
Posted Jul 31, 2018 21:33 UTC (Tue)
by ejr (subscriber, #51652)
[Link] (2 responses)
This ish needs to go away.
Posted Jul 31, 2018 21:34 UTC (Tue)
by JFlorian (guest, #49650)
[Link] (1 responses)
The task of grep'ing for division sounds ... awful. That reminds me of when I was trying to decipher some Ruby that did something like bar=%x'foo' and it was entirely non-obvious that foo was an executable. Googling for the answer seemed impossible and I kept mumbling "code is read more than written." I'm sure glad the author didn't have to type "exec" all the way out.
It seems all migrations (languages, apps, whatever) have one thing in common, however. Things wind up healthier on the flip side. Maybe buggier in the immediate short term, but better overall. It's like moving, a chance to revisit all the clutter that we surround ourselves with and become accustomed to. It doesn't mean it's fun though.
As always, your grumpiness makes *me* feel better, so thanks! :-)
Posted Aug 15, 2018 20:04 UTC (Wed)
by Wol (subscriber, #4433)
[Link]
Depends whether it's driven by tech, or by PHBs fed rubbish by sales people.
I can think of several disaster-story ports - Oxford Health Care is a well-known case story in my industry where the migration basically sent the company bankrupt, because the new system was far more resource hungry and less capable than its predecessor.
And my favourite story - consultants announcing to management (after SIX MONTHS hard work) that their new system was 10% faster than the old one. Only for the dinosaur in charge of the old system to overhear, and say "10%!, 10%!!!, you're PROUD that your twin Xeon 800 is ONLY ten percent faster than a PENTIUM NINETY!!!".
Cheers,
Posted Jul 31, 2018 22:07 UTC (Tue)
by RooTer (guest, #91640)
[Link]
Posted Jul 31, 2018 22:28 UTC (Tue)
by k8to (guest, #15413)
[Link] (4 responses)
* For the base problem, I've always favored shipping a lib dir with my project that the bootstrap explicitly adds to sys.path before (most) imports begin. Some people think it's hacky, but it works with any deployment model, and is simple. The only downside is if someone hacking on it doesn't realize this, so I usually put it in a README.
* import loops are nasty, and it unfortunately takes a long time in python before your project is big enough or you do something unusual enough that they bite you. The net result is that you can end up with a lot of them before you realize you don't want them. Maybe pyflakes or whatever will auto-find them. Python is kind of best run with a lot of checkers, but sadly a lot of the checkers come with too many opinions and too much work to turn off the dumb ones.
-----
* For the division, it's been possible to get "new-style" division in python2 for many years now, so I've been able to make the switch gradually module by module. That doesn't help you of course. I guess this change makes python more similar to other languages, but I don't think it makes it more internally consistent, and I don't really think it was worth it. It's one of the things that was a pain in the butt when writing polyglot code to run on both.
* I hope you don't run into situations where you really want to use a utility thing that wants str when you have bytes. Often it's wrong to go bytes->str->util->str->bytes. Sometimes I have to just re-implement it, often copy-pasting from the python code. Probably I should write patches, but sometimes a whole module has the idea it only wants bytes, so it would be a large (and maybe controversial) patch.
* six is a little magical, agreed. I think it was a godsend when maintaining code with significant numbers of developers in a time when you had to target both pythons. If you don't though, it isn't worth it, and I think the window for that time is passing.
* The octal thing seems like the right idea, though I would have vastly preferred the error be opt-in, or opt-out. I find 0o777 truly bizarre, though I guess it's worth making octal numbers hard to do by accident.
* The most worrying part of this is the discussion of pickles. It sounds like you were transmitting pickles over the network, potentially in an insecure fashion. I've always felt pickling was acceptable for persisting to disk in scenarios where accessing the disk was already game over. However, putting them in emails smells like a remote code execution open door, even if you think you control the email store.
Obviously, if you load an object and run it, it's a remote code execution, but you may not be aware that the load action can take ownership of the process immediately without ever running your code from that point on.
I view pickling of python objects as truly magical. You can stash executable logic in a set of bytes and run it again later, which can be extremely powerful. You can have ephemeral plugins over the network and other crazy ideas. It's based on the python bytecode loader (that's essentially all it is), though so it can't work across versions.
If you just want to store *data*, then something like json.dumps is probably better, though it's not necessarily safe by default (depending on python version, it is willing to deserialize executable junk to objects by default, which is truly unfortunate).
Even if you're just sending the data out to a system that is differently controlled, and not an executable object, I recommend being paranoid: https://pythonhosted.org/itsdangerous/
Posted Jul 31, 2018 22:39 UTC (Tue)
by jake (editor, #205)
[Link] (2 responses)
No, our pickles are stored in the database, not taken from (or sent to) the network.
The email module woes were unrelated, mostly concerning ingesting emails to turn them into "articles".
jake
Posted Aug 1, 2018 3:10 UTC (Wed)
by k8to (guest, #15413)
[Link]
Posted Aug 1, 2018 6:50 UTC (Wed)
by Darkmere (subscriber, #53695)
[Link]
The data in a database was put there by someone who doesn't have the same bugs and validation patterns as you do today, thus you know from the beginning that it's not validated properly.
The dev in the past is always to be considered both untrustworthy and malicious on the level of incompetent. Just look at how much extra work they've caused you by not doing things that you now know are right and good. Clearly you can't trust that dev.
This is something that's likely to continue. No one has caused me so much work as past me.
Posted Aug 13, 2018 7:58 UTC (Mon)
by ber (subscriber, #2142)
[Link]
Can you elaborate on this? (A quick search did not turn out anything discussing this problem.)
Posted Jul 31, 2018 23:35 UTC (Tue)
by luto (guest, #39314)
[Link] (3 responses)
$DEITY indeed intended for division to round down. Alas, almost every CPU rounds toward zero instead, which is an abomination unto mathematics.
Posted Aug 1, 2018 6:40 UTC (Wed)
by Homer512 (subscriber, #85295)
[Link]
I've gotta say: All concerns about mathematical purity aside, I really hate it when Python needlessly deviates from well-established C semantics (even if they are just pseudo-standards because they work on x86). It just creates tons of new gotchas and makes code conversion harder. Just like the bitshift. Why is 16 >> -2 an error? The common argument I've heard is that it's not well defined in C, either. Okay, but then why is it okay to change the semantics of division? And how does forbidding negative bitshifts result in better code overall when we now have to sprinkle our code with if's just to get the bitshift working reliably?
Posted Aug 1, 2018 13:58 UTC (Wed)
by k3ninho (subscriber, #50375)
[Link] (1 responses)
Forgive them, for their direction sign is divorced from their magnitude symbols.
(I'm pretty sure I don't like -4/3 = -2. At least you can reason about the magnitude of 4/3 and -4/3 being the same thing and so having the same ratio of threes if our calculation rounds both toward zero.)
K3n.
Posted Aug 1, 2018 15:34 UTC (Wed)
by nybble41 (subscriber, #55106)
[Link]
If -4//3 = -(4//3) = -1 then the remainder -4%3 must be negative (-1) to preserve the relation that quotient * divisor + remainder = dividend. However, the standard representation for numbers modulo N is in the range 0 to N-1. To get a standard non-negative remainder (for positive divisors), the division operation must round toward negative infinity rather than zero.
Note that Python does produce negative remainders in the case where the divisor is negative (4 % -3). Correcting this would require the rounding direction to depend on the sign of the divisor. I assume that negative divisors were deemed too rare to justify the extra complexity.
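The relation can be checked directly in Python 3. In this sketch `math.trunc` and `math.fmod` stand in for the round-toward-zero behavior found in C and in most hardware.

```python
import math

# Python's floor division pairs with a non-negative remainder
# when the divisor is positive.
floor_q, floor_r = divmod(-4, 3)

# Round-toward-zero division pairs with a remainder that takes
# the sign of the dividend.
trunc_q = math.trunc(-4 / 3)
trunc_r = math.fmod(-4, 3)

# With a negative divisor, Python's remainder is negative.
neg_divisor_r = 4 % -3
```

In both conventions, quotient * divisor + remainder gives back the dividend; they differ only in which direction the quotient is rounded.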
Posted Jul 31, 2018 23:59 UTC (Tue)
by ewen (subscriber, #4772)
The other useful trick I found was using a Python 3.6+ venv as a way of testing compatibility, without having to change the bang path (#!) explicitly to Python 3:
python3 -m venv SOMEDIR
then activate that venv, and within it "/usr/bin/env python" will run python3, but outside the venv "/usr/bin/env python" can still run python2. That makes it easier to test both Python 2.7 and Python 3.7 side by side on the same machine, without having to edit files or manually run python FILE. (FTR, that venv creation syntax also needs Python 3.6+.)
Ewen
Posted Aug 1, 2018 9:56 UTC (Wed)
by Kamilion (guest, #42576)
Thanks for the venv tip; has already come in handy.
Posted Aug 1, 2018 10:02 UTC (Wed)
by tekNico (subscriber, #22)
This seems to refer to the manual kind of testing. Are there automated tests in the code base?
Posted Aug 1, 2018 19:36 UTC (Wed)
by ceplm (subscriber, #41334)
Posted Aug 1, 2018 15:07 UTC (Wed)
by david.a.wheeler (subscriber, #72896)
I find it very helpful to incrementally improve code over time to run on BOTH 2 and 3. "Futurize" can help automatically do some of this work: http://python-future.org/automatic_conversion.html
That way, instead of trying to convert everything, you can do things a piece at a time. Tweaking code to use print functions and python3 division, while still running under Python2, is easier to handle if you do it gradually.
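As an illustration of the gradual approach, here is a small sketch of a module that should behave the same under Python 2.7 and Python 3. The function names average and pages are made up for the example; the __future__ import is the standard mechanism and has to be the first statement in the file.

```python
from __future__ import division, print_function


def average(values):
    """With the division import, / is true division on Python 2 as well."""
    return sum(values) / len(values)


def pages(items, per_page):
    """Spell out // wherever integer (floor) division is really intended."""
    return (items + per_page - 1) // per_page


print("average:", average([1, 2]))  # average: 1.5
print("pages:", pages(10, 3))       # pages: 4
```

Each file can be converted this way on its own, which is what makes the piece-at-a-time strategy workable.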
Here's what the Python developers suggest, though as noted it omits much: https://docs.python.org/3/howto/pyporting.html
Posted Aug 6, 2018 6:30 UTC (Mon)
by salimma (subscriber, #34460)
Posted Aug 1, 2018 18:00 UTC (Wed)
by filbranden (guest, #87848)
Got me to write an experiment and see that I actually don't understand that as well as I thought I did...
Description of the experiment and questions about it posted here: https://stackoverflow.com/questions/51639547/python-circu...
Python experts, your answer there would be appreciated :-)
Posted Aug 2, 2018 17:58 UTC (Thu)
by kpfleming (subscriber, #23250)
Posted Aug 2, 2018 18:00 UTC (Thu)
by corbet (editor, #1)
Now if I'd said we were doing those calculations in floating point, then you would have reason to be concerned...
Posted Aug 9, 2018 17:50 UTC (Thu)
by gartim (guest, #10123)
Posted Aug 10, 2018 15:39 UTC (Fri)
by sumanah (guest, #59891)
Posted Aug 24, 2018 9:20 UTC (Fri)
by nim-nim (subscriber, #34454)
It is quite sad that most dev environments do not detect and warn about them by default (expecting humans to be disciplined without tooling is quite hopeless). That should be much easier than some of the refactoring help that's expected nowadays.
The Grumpy Editor's Python 3 experience
>>> 0o400 + 0o200 == 0o600
True
The Grumpy Editor's Python 3 experience
BTW,
Octal mode translations
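#!/usr/bin/perl
#
# pchmod: chmod with comma-separated octal mode operations.
# A leading "+" sets bits, a leading "-" clears them, and a bare
# mode replaces the permissions outright.

sub usage
{
    print STDERR ("Usage: pchmod <mode arg> <path>+\n");
    exit(1);
}

# Each operation takes (current mode, operand) and returns the new mode.
sub add { $_[0] | $_[1] }
sub clear { $_[0] & ~$_[1] }
sub set { $_[1] }

sub valid_mode
{
    $_[0] =~ /^([+-])?0?[0-7]{1,3}$/;
}

# Turn the mode argument into a list of [operation, octal value] pairs.
sub parse_mode
{
    my (@ops, $op, $v);

    for (split(/,/, $_[0])) {
        die("invalid mode $_") unless valid_mode($_);

        if (/^([+-])(.*)/) {
            $op = $1 eq '+' ? \&add : \&clear;
            $v = $2;
        } else {
            $op = \&set;
            $v = $_;
        }

        push(@ops, [$op, oct($v)]);
    }

    return @ops;
}

my (@ops, @stat, $m, $rc);

@ARGV > 1 || usage();

@ops = parse_mode(shift);

# Apply every operation, in order, to each file's current mode.
for (@ARGV) {
    @stat = stat;
    @stat or warn("stat '$_': $!"), next;

    $m = $stat[2];
    $m = $_->[0]($m, $_->[1]) for @ops;

    $rc = chmod($m, $_);
    $rc or warn("chmod '$_': $!");
}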
Octal mode translations
The Grumpy Editor's Python 3 experience
When setting bits you usually don't want to just add the numbers together like you're doing because if the bit is already set the answer won't be what you want:
>>> oct(0o400 + 0o200)
'0o600' (Looks good)
>>> oct(0o600 + 0o200)
'0o1000' (Likely not what you wanted)
It's better to use the bitwise OR operation which will set the bit if it's unset, or leave it alone if it's already set:
>>> oct(0o400 | 0o200)
'0o600' (Looks good)
>>> oct(0o600 | 0o200)
'0o600' (The bit is already set, so nothing changes)
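The same reasoning applies when clearing a bit: subtraction only works if the bit happens to be set, while AND with the complement is safe either way. A minimal sketch, with set_bits and clear_bits as hypothetical helper names:

```python
def set_bits(mode, bits):
    """Turn the given bits on; bits that are already on stay on."""
    return mode | bits


def clear_bits(mode, bits):
    """Turn the given bits off; bits that are already off stay off."""
    return mode & ~bits


mode = set_bits(0o400, 0o200)
mode = set_bits(mode, 0o200)          # repeating the operation is harmless
print(oct(mode))                      # 0o600

print(oct(clear_bits(0o600, 0o200)))  # 0o400
print(oct(clear_bits(0o400, 0o200)))  # 0o400 (bit was not set, nothing changes)
print(oct(0o400 - 0o200))             # 0o200 (subtraction gets it wrong)
```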
The Grumpy Editor's Python 3 experience
Wol
The Grumpy Editor's Python 3 experience
If you're familiar with the encoding, it's "clear" what it means; if you're not, you won't understand it. One could even call it misleading, as "go" is an English verb which doesn't mean "group and other" (a phrase which doesn't mean anything in itself, either).
The Grumpy Editor's Python 3 experience
2) macros or constants provide a convenient place to hang documentation
3) you can easily grep for them to audit use
4) they provide a sort of type hint or type safety depending on language
5) you can Google them much more easily
6) macros can hide expressions or reliance on other macros which helps explain their derivation
7) you can change their definition at a later date without having to search or change lots of code sites
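In Python, the named constants in question live in the standard stat module. The sketch below builds the familiar 0o755 and the "group and other" mask from them; MODE_PUBLIC_EXEC and GROUP_AND_OTHER are names invented for this example.

```python
import stat

# rwx for the owner, r-x for group and other.
MODE_PUBLIC_EXEC = (
    stat.S_IRWXU
    | stat.S_IRGRP | stat.S_IXGRP
    | stat.S_IROTH | stat.S_IXOTH
)

# All permission bits for group and other.
GROUP_AND_OTHER = stat.S_IRWXG | stat.S_IRWXO

print(oct(MODE_PUBLIC_EXEC))                             # 0o755
print(oct(GROUP_AND_OTHER))                              # 0o77
print(stat.filemode(stat.S_IFREG | MODE_PUBLIC_EXEC))    # -rwxr-xr-x
print(oct(MODE_PUBLIC_EXEC & ~GROUP_AND_OTHER))          # 0o700
```

Whether this reads better than the bare octal value is exactly the disagreement in this thread; the sketch only shows that the two spellings are equivalent.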
The Grumpy Editor's Python 3 experience
Oh, but “0” in 0755 is not denoting octal-ness, it's literally 0. Cf. the difference between 1755 and 0755.
The Grumpy Editor's Python 3 experience
The constant 1755 in C is 03333, so I hope you don't use it in a C program to specify file permissions....
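Python 3 removes this particular trap by rejecting the C-style spelling outright. A short sketch, assuming a Python 3 interpreter (Python 2 still accepts 0755 as octal); is_valid_literal is a hypothetical helper that only exists to catch the SyntaxError:

```python
def is_valid_literal(source):
    """Return True if the interpreter accepts source as an expression."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


print(is_valid_literal("0755"))   # False on Python 3
print(is_valid_literal("0o755"))  # True
print(0o755)                      # 493
print(int("755", 8))              # 493
print(oct(1755))                  # 0o3333, the decimal constant from above
```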
The Grumpy Editor's Python 3 experience
Wol
The Grumpy Editor's Python 3 experience
Automated tests?
Supporting Python 2 and 3
http://python-future.org/automatic_conversion.html
https://docs.python.org/3/howto/pyporting.html
Supporting Python 2 and 3
Imports and circular dependencies
https://stackoverflow.com/questions/51639547/python-circu...
The Grumpy Editor's Python 3 experience
How do you think we acquired that massive LWN yacht?
Fractional pennies
The Grumpy Editor's Python 3 experience
Thanks
The Grumpy Editor's Python 3 experience
