
KS2011: Structured error logging

Posted Oct 25, 2011 7:56 UTC (Tue) by liljencrantz (guest, #28458)
In reply to: KS2011: Structured error logging by Cyberax
Parent article: KS2011: Structured error logging

Minor nit: You can easily parse quoted strings using something like:

"([^"]|\\.)*"
This will work since regexps choose the longest matching string. Or am I missing something?




Posted Oct 25, 2011 8:21 UTC (Tue) by l0b0 (subscriber, #80670) [Link]

You also need to account for the fact that you might have an even or odd number of backslashes before the quote:
printf '%s\n' '"foo \"bar\" baz"' | grep -E '"([^"]|\\.)*"' # Succeeds
printf '%s\n' '"foo \"bar\\" baz"' | grep -E '"([^"]|\\.)*"' # Ouch, this matches the whole line too, but that's a literal backslash, not an escaped quote!
To fix it, we would need to check that any quotes are preceded by an *odd* number of backslashes:
"([^"]|(?<=\\(\\\\)*)")*"
Unfortunately this doesn't work with grep -P ("lookbehind assertion is not fixed length"). I don't know if any other regex engines support this.
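
To make the over-match visible rather than just a success exit status, here is a minimal sketch using grep -o, which prints only the matched text. It assumes a grep that supports -o (GNU and BSD grep do) and uses POSIX leftmost-longest matching; the variable name naive is just for illustration.

```shell
# Pattern from the parent comment. [^"] also matches a backslash, so the
# engine is free to pair up backslashes whichever way gives the longest match.
naive='"([^"]|\\.)*"'

# The string literal here really ends right after the two backslashes,
# but a leftmost-longest engine should print the entire line as one match.
printf '%s\n' '"foo \"bar\\" baz"' | grep -oE "$naive"
```

If the pattern were correct, only the part up to the quote following the double backslash would be printed.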


Posted Oct 25, 2011 8:53 UTC (Tue) by iq-0 (subscriber, #36655) [Link]

Try this one: "(\\\\|\\[^\\]|[^\\"])*"


Posted Oct 25, 2011 9:35 UTC (Tue) by liljencrantz (guest, #28458) [Link]

Or "(\\.|[^\\"])*"
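
A quick way to check this pattern against the two test strings from earlier in the thread, as a sketch: it assumes a grep with -o support, and the variable name strict is only illustrative.

```shell
# Pattern from the comment above. A backslash can only be consumed together
# with the character after it, so there is exactly one way to pair them up.
strict='"(\\.|[^\\"])*"'

# Escaped quotes inside the literal: the whole line is one string.
printf '%s\n' '"foo \"bar\" baz"' | grep -oE "$strict"

# Literal backslash before the closing quote: the match should stop right
# after the double backslash and its quote, leaving the trailing text unmatched.
printf '%s\n' '"foo \"bar\\" baz"' | grep -oE "$strict"
```

The key difference from the first attempt is that the "ordinary character" class excludes the backslash, so an escape pair can never be split.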


Posted Oct 25, 2011 10:02 UTC (Tue) by nix (subscriber, #2304) [Link]

This I think is proof that you need a proper parser rather than just regex matching. Regexps are not a parser, they are (the core of) a tokenizer.
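
For comparison, a hand-written scanner for this one token might look like the sketch below. It is not a full parser, only an illustration of handling the escape state explicitly instead of encoding it in a pattern. The function name scan_quoted is hypothetical and not from the thread; it is plain POSIX sh and uses global variables, since POSIX has no local.

```shell
# Hypothetical helper: print the double-quoted literal at the start of $1,
# or return 1 if there is none or it is unterminated.
scan_quoted() {
    rest=$1
    case $rest in
        '"'*) ;;                     # must start with an opening quote
        *) return 1 ;;
    esac
    token='"'
    rest=${rest#?}                   # drop the opening quote
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}       # first character of what is left
        rest=${rest#?}
        token=$token$ch
        case $ch in
            '"')                     # unescaped quote: the literal ends here
                printf '%s\n' "$token"
                return 0 ;;
            '\')                     # escape: copy the next character verbatim
                [ -n "$rest" ] || return 1
                token=$token${rest%"${rest#?}"}
                rest=${rest#?} ;;
        esac
    done
    return 1                         # ran out of input before the closing quote
}

# Prints the literal up to and including the quote after the double backslash.
scan_quoted '"foo \"bar\\" baz"'
```

Because the escape is consumed as a unit in one place, the even/odd backslash question never comes up.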


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds