
Surprisingly relevant?

Posted May 20, 2020 16:30 UTC (Wed) by scientes (guest, #83068)
In reply to: Surprisingly relevant? by NYKevin
Parent article: The state of the AWK

sed is not a horrible idea, but whenever I use it I run into the fact that it cannot parse arbitrary regular languages because of the lack of non-greedy matching (i.e. a decent regex implementation).



Surprisingly relevant?

Posted May 20, 2020 20:05 UTC (Wed) by NYKevin (subscriber, #129325)

In my experience, replacing dot with [^x] (for some suitable x) is often Good Enough. This is certainly true when parsing something like a path name into its constituent components. True non-greedy matching is more powerful than that, of course, but eventually you may want to reach for a Real Parser (TM).
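
As a rough sketch of the [^x] trick, here is the path-name case in Python. Python's re module is used only because it supports all three styles side by side; the sample path and variable names are assumptions made up for illustration, not anything from sed itself.

```python
import re

# Hypothetical sample input, chosen only for illustration.
path = "/usr/local/bin/awk"

# Greedy dot: runs on to the LAST slash, so it captures too much.
greedy = re.match(r"/(.*)/", path).group(1)

# Negated character class: cannot cross a slash, so it stops at the first one.
negated = re.match(r"/([^/]*)/", path).group(1)

# Non-greedy dot, shown for comparison; POSIX regular expressions
# (and therefore sed) have no equivalent quantifier.
lazy = re.match(r"/(.*?)/", path).group(1)

print(greedy)   # usr/local/bin
print(negated)  # usr
print(lazy)     # usr
```

The negated class gives the same answer as the non-greedy version here because the delimiter is a single character. With a multi-character delimiter the substitution is no longer this simple.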

(Strictly speaking, it is not correct to claim that non-greedy matching is required for parsing arbitrary regular languages. Formally, a regular language can be parsed entirely in terms of literal characters, parentheses, alternation, and the Kleene star, plus anchoring if you assume that regexes are not implicitly anchored. But this might require a very long and unwieldy regex in practice, so a lack of non-greedy matching is certainly a valid complaint.)
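
To illustrate how unwieldy that can get, here is a sketch using C-style comments, where the closing delimiter is two characters long. This is my own example rather than anything from sed, and the sample string is made up. The second pattern uses only literals, parentheses, alternation, and star, treating character classes as shorthand for alternation and x+ as shorthand for xx*.

```python
import re

# Hypothetical sample input, chosen only for illustration.
source = "x = 1; /* first */ y = 2; /* second **/ z = 3;"

# With non-greedy matching the pattern is short and obvious.
lazy = re.compile(r"/\*.*?\*/", re.DOTALL)

# Without it, the body must be spelled out: either a non-star character,
# or a run of stars followed by something that is neither a star nor a slash.
explicit = re.compile(r"/\*([^*]|\*+[^*/])*\*+/")

lazy_hits = [m.group(0) for m in lazy.finditer(source)]
explicit_hits = [m.group(0) for m in explicit.finditer(source)]

print(lazy_hits)
print(explicit_hits)
```

Both patterns find the same two comments in this input, but the second is already hard to read, and it only gets worse as the delimiter grows.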

Alternatively, I suppose you could use ex(1) noninteractively (with -c {command} or +{command}).


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds