LWN.net Logo

RE2: a principled approach to regular expression matching

Google has released a project called RE2, an alternative regular expression matching engine that it describes as a "mostly drop-in replacement for PCRE's C++ bindings." RE2 implements regular expression matching without a backtracking search, the approach used by most other implementations that can have an exponential run time in worst-case scenarios.


(Log in to post comments)

RE2: a principled approach to regular expression matching

Posted Dec 13, 2012 3:34 UTC (Thu) by Comet (subscriber, #11646) [Link]

Er, this was released in 2010.

RE2: a principled approach to regular expression matching

Posted Dec 13, 2012 4:28 UTC (Thu) by dtlin (✭ supporter ✭, #36537) [Link]

Russ Cox has written a bit about regular expressions, including a high-level overview of RE2 and how Google Codesearch can quickly match regular expressions against enormous corpora.  They’re interesting reads, but not news…

RE2: a principled approach to regular expression matching

Posted Dec 13, 2012 11:25 UTC (Thu) by epa (subscriber, #39769) [Link]

You can also use RE2 as a replacement for Perl's built-in regexp engine using re::engine::RE2.

RE2: a principled approach to regular expression matching

Posted Dec 14, 2012 11:50 UTC (Fri) by geertj (subscriber, #4116) [Link]

No backreferences.... It would have been interesting to see a regular expression matching library that does not do backtracking but does support backreferences. Is there a proof that this is mathematically impossible?

RE2: a principled approach to regular expression matching

Posted Dec 14, 2012 19:20 UTC (Fri) by nybble41 (subscriber, #55106) [Link]

It may be possible to do that sort of match without backtracking, but it certainly wouldn't be a regular expression. The formal definition of a regular expression (concatenation /ab/, repetition /a*/, alternation /a|b/) precludes back-references.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds