Python support for "irregular" expressions
Posted Feb 23, 2022 13:16 UTC (Wed) by fman (subscriber, #121579)
In reply to: Python support for "irregular" expressions by brenns10
Parent article: Python support for regular expressions
> [1]: https://swtch.com/~rsc/regexp/regexp1.html
Thanks for that link. That is certainly an enlightening read.
Makes me wonder whether what is *really* needed is a "simple-re" module with a "Thompson NFA" regex engine. A six-digit speedup should be worth aiming for, after all.
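To make the difference concrete, here is a minimal sketch of the two matching strategies the linked article contrasts, using its pathological case (n optional "a"s followed by n mandatory "a"s, matched against n "a"s). It is a toy, not the engine from the article and not how `re` is implemented: patterns are just lists of hypothetical `(char, optional)` pairs, and the helper names are made up for illustration. Both matchers count their own steps so the gap is visible without timing anything.

```python
def backtracking_match(items, text):
    """Try one alternative at a time, backing up on failure.

    items is a list of (char, optional) pairs; returns (matched, steps).
    """
    steps = 0

    def go(i, j):
        nonlocal steps
        steps += 1
        if i == len(items):
            return j == len(text)
        char, optional = items[i]
        # Greedy: try to consume a character first, then fall back to skipping.
        if j < len(text) and text[j] == char and go(i + 1, j + 1):
            return True
        return optional and go(i + 1, j)

    return go(0, 0), steps


def state_set_match(items, text):
    """Track every reachable pattern position at once (Thompson-style).

    Returns (matched, steps), where steps counts state visits.
    """
    steps = 0

    def closure(states):
        # Follow the "skip" edge out of every optional item.
        seen = set()
        stack = list(states)
        while stack:
            i = stack.pop()
            if i not in seen:
                seen.add(i)
                if i < len(items) and items[i][1]:
                    stack.append(i + 1)
        return seen

    current = closure({0})
    for c in text:
        following = set()
        for i in current:
            steps += 1
            if i < len(items) and items[i][0] == c:
                following.add(i + 1)
        current = closure(following)
    return len(items) in current, steps


def pathological(n):
    """Build the a?^n a^n pattern and the input a^n."""
    return [("a", True)] * n + [("a", False)] * n, "a" * n


if __name__ == "__main__":
    for n in (8, 12, 16):
        items, text = pathological(n)
        print(n, backtracking_match(items, text)[1], state_set_match(items, text)[1])
```

The backtracking count grows exponentially with n because every combination of "consume or skip" is tried before the all-skip combination succeeds, while the state-set count grows roughly with n squared.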
Posted Mar 1, 2022 0:22 UTC (Tue) by NYKevin (subscriber, #129325)
1. Modify one or both of re/regex to use an NFA/DFA implementation if possible (i.e. if there are no lookarounds/possessives/atomic groups in the expression).
This is 100% backwards compatible and would significantly improve the performance of existing regular expressions that avoid those features; the only downside is a more complicated implementation.
2. Add a flag to re/regex.compile() that throws if it sees any of those features in the expression to be compiled. Off by default, must be explicitly passed.
2½. For bonus points, make a locked_down_regex module (preferably with a better name) that is exactly like re/regex, except the flag in (2) is always passed for you automatically and cannot be turned off by any means. This is analogous to the use of the secrets module in lieu of random. Since it's a whole new module, it won't break anything and must be opted into, but OTOH it's easy to audit whether you are using the "right" module if your org cares.
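As a rough illustration of (2) and (2½), the check could look something like the sketch below. No such flag exists in `re` today; `compile_regular`, `find_irregular_feature` and `IrregularPatternError` are hypothetical names, and the scanner is a deliberately conservative approximation that works on the pattern text rather than hooking into the real parser. It rejects only the constructs listed in (1), ignores `(?#...)` comments and verbose mode, and may flag a literal `}` followed by `+`.

```python
import re


class IrregularPatternError(ValueError):
    """Raised when a pattern uses a construct outside the allowed subset."""


# Group openers to reject. The list follows the comment above; a real
# implementation would likely need more (backreferences, for example).
_FORBIDDEN_GROUPS = {
    "(?=": "lookahead",
    "(?!": "negative lookahead",
    "(?<=": "lookbehind",
    "(?<!": "negative lookbehind",
    "(?>": "atomic group",
}


def find_irregular_feature(pattern):
    """Return the name of the first rejected construct, or None."""
    i, n = 0, len(pattern)
    in_class = False
    while i < n:
        c = pattern[i]
        if c == "\\":
            i += 2  # skip the escaped character, inside or outside a class
            continue
        if in_class:
            if c == "]":
                in_class = False
            i += 1
            continue
        if c == "[":
            in_class = True
            i += 1
            # A ']' right after '[' or '[^' is a literal, not the terminator.
            if i < n and pattern[i] == "^":
                i += 1
            if i < n and pattern[i] == "]":
                i += 1
            continue
        if c == "(":
            for opener, name in _FORBIDDEN_GROUPS.items():
                if pattern.startswith(opener, i):
                    return name
        # Conservative: a quantifier immediately followed by '+' is treated
        # as possessive (this also flags a literal '}' followed by '+').
        if c in "*+?}" and i + 1 < n and pattern[i + 1] == "+":
            return "possessive quantifier"
        i += 1
    return None


def compile_regular(pattern, flags=0):
    """Like re.compile(), but refuse patterns that use rejected constructs."""
    feature = find_irregular_feature(pattern)
    if feature is not None:
        raise IrregularPatternError(f"pattern uses {feature}: {pattern!r}")
    return re.compile(pattern, flags)
```

A module in the spirit of (2½) would then expose only wrappers like `compile_regular()` (plus matching `search`, `match`, and so on), so that an audit only has to check which module is imported.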