In the simplest cases, a regexp is just a literal string that must match exactly. For example, the pattern:
regexp
matches the string "regexp" and no others.
Some characters have a special meaning when they occur in a regexp.
They aren't matched literally as in the previous example, but instead
denote a more general pattern. For example, the character *
is used to indicate that the preceding element of a regexp may be
repeated 0, 1, or more times. In the pattern:
smooo*th
the * indicates that the preceding o can be repeated 0 or
more times. So the pattern matches:
smooth smoooth smooooth smoooooth ...
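As a quick check of the repetition rule, here is a minimal sketch using
Python's re module. It assumes Python's syntax agrees with the syntax
described here for literal characters and *, which holds for this
pattern; the helper name matches is made up for illustration.

```python
import re

# The pattern from the text: "smoo", then zero or more further o's, then "th".
PATTERN = re.compile(r"smooo*th")

def matches(s):
    """Return True if the whole string s matches the pattern."""
    return PATTERN.fullmatch(s) is not None

candidates = ["smoth", "smooth", "smoooth", "smooooth", "smoo*th"]
accepted = [s for s in candidates if matches(s)]
print(accepted)
```

Only the candidates with two or more o's are accepted. "smoth" fails
because the two literal o's before the starred one are still required.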
Suppose you want to write a pattern that literally matches a special
character like * -- in other words, you don't want the * to
indicate a permissible repetition, but to match * literally. This
is accomplished by quoting the special character with a backslash.
The pattern:
smoo\*th
matches the string:
smoo*th
and no other strings.
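The same quoting rule can be tried out in Python, whose re module also
treats \* as a literal asterisk. This is only a sketch. The re.escape
call is a Python convenience that adds the backslash for you; it is not
part of the regexp syntax described here.

```python
import re

# A backslash before * removes its special meaning.
quoted = re.compile(r"smoo\*th")

literal_hit = quoted.fullmatch("smoo*th") is not None    # the literal string
repeat_miss = quoted.fullmatch("smooth") is None         # * no longer repeats o
longer_miss = quoted.fullmatch("smoooth") is None

# re.escape builds the quoted pattern from a plain string.
escaped = re.escape("smoo*th")
print(literal_hit, repeat_miss, longer_miss, escaped)
```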
In seven cases, the pattern is reversed -- a backslash makes the
character special instead of making a special character normal. The
characters +, ?, |, (, ), {, and } are normal but the sequences
\+, \?, \|, \(, \), \{, and \} are special (their meaning is
described later).
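Many other regexp flavors, Python's re module among them, use the
opposite convention for these seven characters. The sketch below makes
the reversal concrete by translating a pattern written in the syntax
described here into Python's syntax. The function to_python_syntax is a
hypothetical helper, not part of any library. It ignores bracket
expressions and every other difference between the two flavors, and the
usage lines assume that \+ means "one or more".

```python
import re

# The seven characters whose backslash behaviour is reversed.
REVERSED = set("+?|(){}")

def to_python_syntax(pattern):
    """Hypothetical helper: swap the escaped and bare forms of the seven characters."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \+ is special here, so Python needs a bare +;
            # any other escape is copied through unchanged.
            out.append(nxt if nxt in REVERSED else ch + nxt)
            i += 2
        elif ch in REVERSED:
            # A bare + is an ordinary character here, so quote it for Python.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

special = to_python_syntax(r"smo\+th")   # \+ is special
literal = to_python_syntax(r"smo+th")    # + is a normal character
print(special, literal)
```

With this translation, the pattern smo\+th accepts smooth and smoooth,
while smo+th accepts only the five-character string containing a
literal plus sign.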
The remaining sections of this chapter introduce and explain the various special characters that can occur in regexps.