|
|
Log in / Subscribe / Register

Development quotes of the week - regular expression edition

Development quotes of the week - regular expression edition

Posted Feb 22, 2022 23:30 UTC (Tue) by NYKevin (subscriber, #129325)
In reply to: Development quotes of the week - regular expression edition by ras
Parent article: Development quotes of the week - regular expression edition

I believe you're mixing up NFAs with Turing machines; every NFA can be translated into an equivalent DFA (possibly at the cost of having a larger number of states), and NFAs and DFAs can match exactly the same sets of languages (neither can handle lookaround assertions). Other than that, I entirely agree with your assessment of the situation. But at least they added atomic/possessive matches to band-aid over the situation - which is still a lot worse than just using a properly-designed engine in the first place.


to post comments

Development quotes of the week - regular expression edition

Posted Feb 23, 2022 2:13 UTC (Wed) by ras (subscriber, #33059) [Link]

> I believe you're mixing up NFAs with Turing machines

Well, perhaps. In fact in language theory terms I definitely am. I guess I was loose with my language. Current regexp recognises started life as Thompson's NFA's, and are still called NFA engines today.

Consider this "regular expression":

^.?|(..+?)\\1+$

That regexp can't be represented as an NFA, which I guess is your point. But it *is* handled by regexp libraries in found most languages today and we generally call them NFA engines. That was how I was using the term NFA.

I don't think these "NFA engines" are as powerful as Turning machines, but they do have some extra features like prime number recognition. Those features seem to have a lot of cheerleaders who apparently don't mind that they come at the cost non-deterministic execution time. cf. quote from today's LWN front page story:

> Jonathan Slenders replied that the regex module on the Python Package Index (PyPI) has support for timeouts.

So the fix is to replace one form of non-deterministic behaviour with another. Words fail me, as in my attempt to find polite words to express my opinion of the "fix" failed.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds