Google's RE2 regular expression library

Posted Mar 15, 2010 9:47 UTC (Mon) by PO8 (guest, #41661)
In reply to: Google's RE2 regular expression library by droundy
Parent article: Google's RE2 regular expression library

I guess my point was just that this work seems to be presented as a significant improvement on the state of the art in RE-matching performance. As far as I can tell, it is an improvement only over RE implementations that have never been held up as the state of the art for performance. Further, its performance seems to have been measured mostly on a particularly pathological microbenchmark.

Just grabbed the source code and stared at it for a bit. It's big: 14K lines of really well-commented C++. It does do lazy DFA compilation with a state cache, which is nice. I couldn't find any evidence of Boyer-Moore string searching, though, which is a major omission if performance is the goal.
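For readers who haven't met Boyer-Moore, here is a minimal sketch of the idea, using the simplified Horspool variant (bad-character rule only) rather than full Boyer-Moore. The function name `horspool_find` is made up for illustration, and nothing here is taken from the RE2 source. The point is that a literal search can skip ahead by up to the pattern length instead of examining every character:

```python
def horspool_find(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Boyer-Moore-Horspool: a simplified Boyer-Moore that uses only the
    bad-character rule. Illustrative sketch, not RE2 code.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0

    # For each character in the pattern (except the last position), record
    # how far the pattern may shift when that character is aligned with the
    # pattern's final position. Later occurrences overwrite earlier ones,
    # so the rightmost occurrence wins.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        # Characters that do not appear in the pattern allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

A regex engine can apply the same trick to any literal prefix (or required literal substring) of the expression, and only run the automaton at positions where the literal was found.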

I don't really have time right now to do more benchmarking of this code. There are some benchmarks there already, but I haven't figured out how to play with them yet. More if and when I get some time.


Google's RE2 regular expression library

Posted Mar 15, 2010 17:31 UTC (Mon) by droundy (subscriber, #4559)

I guess I saw the articles more as a denouncement of the poor
implementations in common use than as a claim of a drastically new and
better implementation. The choice of a pathological regular expression made
sense as such, since the point is that such a beast exists with backtracking
implementations and doesn't exist with good implementations.
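
To make the "beast" concrete, here is a rough sketch assuming the well-known pathological case of n optional a's followed by n required a's, matched against a string of n a's. Python's `re` module is a backtracking matcher, so it stands in for the "poor implementations"; the function `nfa_match` is a hypothetical hand-rolled state-set simulation for this one pattern shape only, not RE2's algorithm or code:

```python
import re
import time


def backtracking_match(n: int) -> bool:
    """Match a?^n a^n against a^n using Python's backtracking re module."""
    pattern = "a?" * n + "a" * n
    return re.fullmatch(pattern, "a" * n) is not None


def nfa_match(n: int) -> bool:
    """Match the same pattern by tracking the set of reachable NFA states.

    State i means "the first i pattern items have been consumed or skipped".
    """
    optional = [True] * n + [False] * n  # is the i-th 'a' optional?
    total = len(optional)

    def closure(states):
        # Follow the "skip this optional item" edges without reading input.
        seen = set(states)
        stack = list(states)
        while stack:
            i = stack.pop()
            if i < total and optional[i] and i + 1 not in seen:
                seen.add(i + 1)
                stack.append(i + 1)
        return seen

    current = closure({0})
    for _ in range(n):
        # Every input character and every pattern item is 'a', so each
        # unfinished state simply advances by one item.
        current = closure({i + 1 for i in current if i < total})
    return total in current


def timed(fn, n: int) -> float:
    """Return the wall-clock seconds taken by fn(n)."""
    start = time.perf_counter()
    fn(n)
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (10, 14, 18):
        print(n, timed(backtracking_match, n), timed(nfa_match, n))
```

The backtracking time should be expected to grow roughly exponentially in n, while the state-set simulation does at most about 2n work per input character, so keep n small when trying the first function.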

Google's RE2 regular expression library

Posted Mar 17, 2010 21:39 UTC (Wed) by dthurston (subscriber, #4603)

I think it's more that they're giving the standard, fast performance of a proper RegExp engine to almost all of the various convenient extensions that Perl, Python, etc. have. So it's an advance in speed over Perl et al., and an advance in convenience over awk, etc.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds