
Python support for regular expressions

Posted Feb 22, 2022 23:38 UTC (Tue) by iabervon (subscriber, #722)
Parent article: Python support for regular expressions

I really wish Python releases came bundled with particular versions of some well-managed PyPI packages that you could then upgrade further. You should be able to rely on the fact that "regex" is present, and that it conforms to the documentation frozen when 3.9 came out, but not that it doesn't have undocumented features from the future or that it still has bugs that hadn't been discovered.

AFAICT, "requests" hasn't removed any documented features since 2015; on the other hand, "construct" has not been as stable, even over a shorter period of time.



Python support for regular expressions

Posted Feb 23, 2022 0:03 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (5 responses)

I feel like almost every language does packaging and dependencies poorly, and they all just find different ways of being terrible (albeit most also have some redeeming qualities). In the case of Python:

* On the one hand, import statements use a relatively straightforward, easy to understand set of semantics (i.e. "just stick a bunch of py files in a directory structure, and you're done!"). This is good for scripting purposes, because you don't have to faff about with something like CMake just to build an entirely self-contained app.
* On the other, those semantics are perhaps *too* simple, because there is no way to specify "I need version X or greater" within the import statement itself, nor where the module actually comes from. So now that information needs to live in metadata somewhere, and get tracked and managed separately by a tool like Pip.
* To add insult to injury, you can't have two different versions of the same module in the same process, without doing all sorts of nasty hacks that may or may not break something depending on how the underlying module works (e.g. Does it check __name__? Does it fiddle around with sys.modules? etc.).
* And, of course, you have the common beginner mistake of accidentally naming a Python script after a stdlib module (Python will prefer to re-import the script a second time, rather than using the stdlib module, and then everything breaks because it probably won't implement the stdlib module's API). This can break backcompat if a new stdlib module is introduced with the same name as one of your existing modules, but for some reason nobody seems to care about that failure mode.
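To illustrate the second bullet: since the import statement can't carry a version, the closest you get at runtime is asking the installed metadata yourself. Here is a minimal sketch of that, using the standard `importlib.metadata` module. The names `parse_version` and `require` and the `lookup` parameter are made up for this example, and the version parsing is deliberately naive (it is not PEP 440; a real tool would use something like the `packaging` library).

```python
from importlib import metadata


def parse_version(text):
    """Turn '3.11.2' into (3, 11, 2), keeping only the leading digits of each dotted piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def require(dist_name, minimum, lookup=metadata.version):
    """Raise ImportError unless dist_name is installed at version >= minimum.

    `lookup` is a hypothetical hook (defaulting to importlib.metadata.version)
    so the check can be exercised without touching the real environment.
    """
    try:
        installed = lookup(dist_name)
    except metadata.PackageNotFoundError as exc:
        raise ImportError(f"{dist_name} is not installed") from exc
    if parse_version(installed) < parse_version(minimum):
        raise ImportError(f"{dist_name} {installed} is older than {minimum}")
    return installed
```

You would call something like `require("regex", "2022.1.18")` before the actual `import regex`, which is exactly the "tracked and managed separately" part: the version lives next to the import, not in it.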

Perhaps it would've been less bad if Python had used a slightly more elaborate syntax instead:

from python import re: 3.11+ # Must include Python version for standard library modules.
from pypi import regex: 3.9+ # Also need version for PyPI modules - if it's not installed or too old, then error out.
from local import regex # "local" means "don't search sys.path, just check the __main__ module's containing directory for regex.py." No version.
from my_custom_namespace import regex: 1.2.3+ # You can install custom hooks to handle imports in whatever crazy way you want.

Unfortunately, that would be egregiously incompatible with existing usage, so it's probably too late now. Oh well.

Python support for regular expressions

Posted Feb 23, 2022 8:14 UTC (Wed) by marcH (subscriber, #57642) [Link]

Python support for regular expressions

Posted Feb 23, 2022 8:28 UTC (Wed) by interalia (subscriber, #26615) [Link] (1 responses)

It does seem crazy that Python didn't at some stage namespace libraries better, both stdlib and external ones...

But anyway, do you think what you said about every language doing packaging/dependencies poorly is partly due to every approach having advantages/disadvantages, that being the nature of software development?

Python support for regular expressions

Posted Feb 24, 2022 18:45 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

Maybe a little, but I think the bigger issue here is that packaging and dependencies are a pathologically hard problem to solve. Nevertheless, I would expect Python to do a better job than it actually does, because Python's import system is almost entirely made up of first-class objects that exist at runtime. If you *wanted* to have some sort of fancy dependency resolution system requiring elaborate runtime support, Python has already built all of the complicated infrastructure which you need in order to make that possible (see e.g. the importlib documentation: https://docs.python.org/3/library/importlib.html). And yet Python's out-of-the-box packaging and dependency system is just as bad as everyone else's.
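As a rough sketch of what "the infrastructure is already there" means: a meta path finder is an ordinary object that gets consulted on every import of a module that is not already cached, so it can veto an import before the regular finders run. The class `MinimumVersionFinder`, the `install` helper, and the `lookup` callable below are all hypothetical names for illustration, not anything Python ships.

```python
import importlib.abc
import sys


class MinimumVersionFinder(importlib.abc.MetaPathFinder):
    """Refuse to import listed modules whose reported version is too old."""

    def __init__(self, minimums, lookup):
        self.minimums = minimums  # e.g. {"module_name": (3, 9)}
        self.lookup = lookup      # hypothetical callable: module name -> version tuple

    def find_spec(self, fullname, path=None, target=None):
        minimum = self.minimums.get(fullname)
        if minimum is not None:
            installed = self.lookup(fullname)
            if installed < minimum:
                raise ImportError(
                    f"{fullname} needs version {minimum}, found {installed}"
                )
        # Returning None hands the import over to the regular finders.
        return None


def install(minimums, lookup):
    """Put the finder first on sys.meta_path and return it so it can be removed later."""
    finder = MinimumVersionFinder(minimums, lookup)
    sys.meta_path.insert(0, finder)
    return finder
```

Note that this only fires for modules not yet in `sys.modules`, and it does nothing about fetching or selecting between versions; it just shows that the hook point for such a policy exists at runtime.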

Python support for regular expressions

Posted Feb 23, 2022 10:24 UTC (Wed) by MrWim (subscriber, #47432) [Link]

Rust (with Cargo) does this better. I've written a bit about this here: https://blog.williammanley.net/2022/02/23/pip-and-cargo-a... .

I realise this is a bit of a tangent (the subject is Python, not Rust), but here goes anyway:

> On the one hand, import statements use a relatively straightforward, easy to understand set of semantics (i.e. "just stick a bunch of py files in a directory structure, and you're done!").

This is where rust isn't so good. It can be confusing how `mod` and `use` and `Cargo.toml` interrelate.

> This is good for scripting purposes, because you don't have to faff about with something like CMake just to build an entirely self-contained app.

Cargo helps here as it's the single blessed build system, and comes bundled with rust.

> On the other, those semantics are perhaps *too* simple, because there is no way to specify "I need version X or greater" within the import statement itself,

With Cargo these versions are specified in `Cargo.toml` - so still external.

> nor where the module actually comes from.

Cargo does restrict where the module comes from. You'll get the version you specified in your Cargo.toml, and you can't `use` things that you haven't specified there - even if they're included in a transitive dependency.

> To add insult to injury, you can't have two different versions of the same module in the same process

Cargo and Rust fix this with clever symbol mangling. It could still be an issue with C dependencies, but they are relatively rare in Rust land vs. Python.

> And, of course, you have the common beginner mistake of accidentally naming a Python script after a stdlib module

This isn't a problem with Rust - you can import from your local crate with `use crate::mymod`. And anyway - if you get it wrong you'll get a compile error, it's not going to silently give you the wrong behaviour at runtime.

Thanks for your comment, it encouraged me to finally publish that blog post. It had been sitting almost finished for about a year now.

Python support for regular expressions

Posted Feb 23, 2022 18:17 UTC (Wed) by smoogen (subscriber, #97) [Link]

> I feel like almost every language does packaging and dependencies poorly, and they all just find different ways of being terrible (albeit most also have some redeeming qualities). In the case of Python:

I remember all the OS packaging and dependency fights about how they all did it terribly (and thus resulting in a new OS which would use some new method which would be fine for a simple case and then fall over horribly when faced with 'the real world'). Then it seemed like those conversations went dormant, but instead they had all moved to ${SCRIPT_LANGUAGE}, which would again seem to come up with some method that people thought would be better/faster/cleaner than, say, .deb/.rpm/.etc. First the line would be 'this is better', then it would go to 'it does a good enough job', and finally 'it is horrible but we have too large of an ecosystem to change it.'

