
globs vs regexps

Posted Aug 5, 2004 5:31 UTC (Thu) by rfunk (subscriber, #4054)
Parent article: Bash 3.0 released

The "failglob" option will probably be of interest to many users. When set, this option will cause an error when a regular expression fails to match any files.

Filename globs are not regular expressions.

Filename globs are those expressions where ? matches a single character, * matches any number of characters, and the beginning and end are always anchored (e.g. a*d matches abcd but not qabcdz).
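To make the anchoring concrete, here is a minimal Python sketch. It uses the standard-library fnmatch module as a stand-in for shell globbing, which is an assumption: fnmatch does not treat leading dots or slashes specially the way a shell does, but it anchors patterns at both ends in the same way. The helper name glob_matches is made up for illustration.

```python
from fnmatch import fnmatchcase

def glob_matches(pattern, name):
    """Return True if the whole name matches the glob pattern (case-sensitive)."""
    return fnmatchcase(name, pattern)

# The glob must cover the entire name, so a*d accepts abcd but not qabcdz.
print(glob_matches("a*d", "abcd"))    # True
print(glob_matches("a*d", "qabcdz"))  # False
# ? stands for exactly one character.
print(glob_matches("a?d", "abd"))     # True
print(glob_matches("a?d", "abcd"))    # False
```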

Regular expressions have a precise meaning I don't remember at the moment, but commonly refer to expressions where a period matches a single character, a * modifies a previous bit of the expression to match any number of that bit, and beginning and end anchors must be explicitly specified if they are desired. (a.*d matches both abcd and qabcdz.)
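The same two strings can be tried against a regular expression. This is only a sketch using Python's re module, which is one regular expression dialect among several; the helper name regex_finds is hypothetical. An unanchored search finds a.*d inside both strings, and adding ^ and $ by hand gives the glob-like behavior.

```python
import re

def regex_finds(pattern, text):
    """Return True if the regex matches anywhere in text (unanchored search)."""
    return re.search(pattern, text) is not None

# Without anchors, the pattern may match a substring.
print(regex_finds("a.*d", "abcd"))      # True
print(regex_finds("a.*d", "qabcdz"))    # True
# With explicit anchors, the whole string must match.
print(regex_finds("^a.*d$", "abcd"))    # True
print(regex_finds("^a.*d$", "qabcdz"))  # False
```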

The single-star glob "*" is equivalent to the four-character regular expression "^.*$".

Confusion between these two systems trips up a lot of people, so it's important not to refer to one when you mean the other.



RE: globs vs regexps

Posted Aug 5, 2004 6:42 UTC (Thu) by stuart_hc (guest, #9737)

Yes, this is an important point to stress. Filename pattern matching was historically related to regular expressions but the two have always been distinct. Filename patterns are described under the Pattern Matching section of the Bash man page, or better still in the POSIX standard on Pattern Matching Notation. The Wikipedia page on Regular Expressions gives both a gentle introduction and a formal description.

regexps vs regexps

Posted Aug 5, 2004 8:22 UTC (Thu) by pkolloch (subscriber, #21709)

Regular expressions are first of all a theoretical concept. In that sense, a regular expression specifies an (often infinite) language of accepted words. Fileglobs are usually not quite as powerful, but have a simplified syntax. There are some very frequently used regular expression syntaxes, as nicely described in the referenced Wikipedia page.

If we start nit picking, we should not mistake one of those syntaxes as the one and only real regular expression syntax.

regexps vs regexps

Posted Aug 5, 2004 14:26 UTC (Thu) by rfunk (subscriber, #4054)

Regular expressions are first of all a theoretical concept

Yes, I tried to account for that without getting into the theoretical details that I don't remember.

If we start nit picking, we should not mistake one of those syntaxes as the one and only real regular expression syntax.

Except that globs do not have the power required by a true (theoretical) regular expression. They can't express everything that a regular expression can.
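One way to see the gap in power is a language built from repetition of a group. The sketch below is an illustration under assumptions, not a proof: it uses Python's re and fnmatch modules, and it only considers plain globs made of ?, * and bracket expressions (the extended pattern operators some shells offer are a separate matter). The regular expression (ab)* accepts exactly the strings made of repeated "ab", while the closest plain glob, ab*, also accepts strings that are not in that language.

```python
import re
from fnmatch import fnmatchcase

# Zero or more repetitions of the two-character group "ab".
REPEATED_AB = re.compile(r"(ab)*")

def is_repeated_ab(word):
    """Return True if word consists only of 'ab' repeated (regex, fully anchored)."""
    return REPEATED_AB.fullmatch(word) is not None

def glob_attempt(word):
    """A plain-glob approximation: starts with 'ab', then anything."""
    return fnmatchcase(word, "ab*")

for word in ["abab", "abba"]:
    print(word, is_repeated_ab(word), glob_attempt(word))
# abab True True
# abba False True
```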

globs vs regexps

Posted Aug 5, 2004 11:34 UTC (Thu) by Lasse99 (guest, #1899)

Actually, the single-star glob '*' is equivalent to the rather complicated regular expression '^[^.].*$' or even '^[^./][^/]$'. On the other hand, the simple regular expression '.*' must be expressed by a set of two filename globs, namely ( '.*' '*' ). So, it is debatable which syntax is more complicated...

globs vs regexps

Posted Aug 5, 2004 11:35 UTC (Thu) by Lasse99 (guest, #1899)

Ooops! That should have been '^[^./][^/]*$'
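The dotfile behavior can be checked with a small experiment. This is a sketch that assumes Python's glob module is a fair stand-in for the shell here (by default it, too, skips names starting with a dot unless the pattern starts with one); the file names and the helper names glob_names and demo are invented for the example.

```python
import glob
import os
import re
import tempfile

CORRECTED = re.compile(r"^[^./][^/]*$")  # the corrected regex for the glob '*'
NAIVE = re.compile(r"^.*$")              # the regex originally claimed equivalent

def glob_names(directory, pattern):
    """Expand a glob inside directory and return the sorted base names."""
    full = os.path.join(glob.escape(directory), pattern)
    return sorted(os.path.basename(p) for p in glob.glob(full))

def demo():
    names = [".bashrc", "README", "notes.txt"]
    with tempfile.TemporaryDirectory() as d:
        for name in names:
            open(os.path.join(d, name), "w").close()
        star = glob_names(d, "*")
        dot_star = glob_names(d, ".*")
    return {
        "glob *": star,
        "glob .*": dot_star,
        "regex corrected": sorted(n for n in names if CORRECTED.match(n)),
        "regex naive": sorted(n for n in names if NAIVE.match(n)),
    }

for label, matched in demo().items():
    print(label, matched)
```

In this setup the glob '*' and the corrected regular expression pick the same two visible files, while '^.*$' also picks up .bashrc, which the globs only reach through the second pattern '.*'.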

