

An Introduction to Regexps

In the simplest cases, a regexp is just a literal string that must match exactly. For example, the pattern:

regexp

matches the string "regexp" and no others.
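As a quick check, the literal case can be tried in Python. Python's `re` module is a different regexp dialect from the one this chapter describes, but it treats plain letters the same way. The sketch below assumes that "matches" means the whole string must match, which `re.fullmatch` models; the candidate strings are made up for illustration.

```python
import re

# A pattern with no special characters matches only itself.
pattern = re.compile("regexp")

# Hypothetical test strings: one exact copy and three near misses.
candidates = ["regexp", "regexps", "Regexp", "regex"]
matched = [s for s in candidates if pattern.fullmatch(s)]
print(matched)  # ['regexp']
```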

Some characters have a special meaning when they occur in a regexp. They aren't matched literally as in the previous example, but instead denote a more general pattern. For example, the character * is used to indicate that the preceding element of a regexp may be repeated 0, 1, or more times. In the pattern:

smooo*th

the * indicates that the preceding o can be repeated 0 or more times. So the pattern matches:

smooth
smoooth
smooooth
smoooooth
...
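The list above can be reproduced with a short Python sketch. It assumes whole-string matching (`re.fullmatch`) and relies on Python's `re` module agreeing with this chapter on the meaning of *, even though the two dialects differ elsewhere.

```python
import re

pattern = re.compile("smooo*th")

# Build "smoo" + n extra o's + "th" for n = 0..3: smooth, smoooth, ...
candidates = ["smoo" + "o" * n + "th" for n in range(4)]
assert all(pattern.fullmatch(s) for s in candidates)

# Only the last o is affected by *, so the first two o's remain mandatory.
print(pattern.fullmatch("smoth"))  # None
```

Note that * applies only to the single element before it (the third o), not to the whole run of o's.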

Suppose you want to write a pattern that literally matches a special character like * -- in other words, you don't want * to indicate a permissible repetition, but to match * literally. This is accomplished by quoting the special character with a backslash. The pattern:

smoo\*th

matches the string:

smoo*th

and no other strings.
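The effect of quoting can be seen by comparing the quoted and unquoted patterns in Python. This is only a sketch: it assumes whole-string matching via `re.fullmatch`, and it uses a raw string (`r"..."`) so that Python itself does not interpret the backslash before the regexp compiler sees it.

```python
import re

quoted = re.compile(r"smoo\*th")   # \* matches a literal asterisk
unquoted = re.compile("smoo*th")   # * repeats the preceding o

print(bool(quoted.fullmatch("smoo*th")))    # True
print(bool(quoted.fullmatch("smooth")))     # False
print(bool(unquoted.fullmatch("smoo*th")))  # False
print(bool(unquoted.fullmatch("smooth")))   # True
```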

In seven cases, the pattern is reversed -- a backslash makes the character special instead of making a special character normal. The characters +, ?, |, (, ), {, and } are normal, but the sequences \+, \?, \|, \(, \), \{, and \} are special (their meaning is described later).

The remaining sections of this chapter introduce and explain the various special characters that can occur in regexps.

