This section introduces the special characters '.' and '['.
'.' matches any single character except the NUL character. For example:
p.ck
matches
pick pack puck pbck pcck p.ck ...
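The behaviour of '.' can be tried out with a short sketch. It uses Python's re module, which is a different engine from the one documented here: in Python, '.' by default excludes the newline character rather than NUL. The word list is made up for illustration.

```python
import re

# Candidate words; the last two are not four characters long.
words = ["pick", "pack", "puck", "pbck", "p.ck", "pck", "peack"]

# fullmatch() requires the whole string to match, so '.' must
# consume exactly one character between "p" and "ck".
matched = [w for w in words if re.fullmatch(r"p.ck", w)]
print(matched)  # ['pick', 'pack', 'puck', 'pbck', 'p.ck']
```

Note that "p.ck" itself matches, because '.' also matches a literal period.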
'[' begins a character set. A character set is similar to '.' in that it matches not a single, literal character, but any of a set of characters. '[' differs from '.' in that with '[' you define the set of characters explicitly.
There are three basic forms a character set can take.
In the first form, the character set is spelled out:
[<cset-spec>] -- every character in <cset-spec> is in the set.
In the second form, the character set is the negation of an explicitly spelled-out set:
[^<cset-spec>] -- every character *not* in <cset-spec> is in the set.
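A minimal sketch of the first two forms, again using Python's re module as a stand-in for the engine described here. The sample text is arbitrary.

```python
import re

text = "regular expression"

# First form: every character listed is in the set.
vowels = re.findall(r"[aeiou]", text)

# Second form: every character NOT listed is in the set.
# This includes the space between the two words.
non_vowels = re.findall(r"[^aeiou]", text)

print(vowels)  # ['e', 'u', 'a', 'e', 'e', 'i', 'o']
```

Every character of the text falls into exactly one of the two lists, since the second set is the complement of the first.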
A <cset-spec> is more or less an explicit enumeration of a set of characters. It can be written as a string of individual characters:
[aeiou]
or as a range of characters:
[0-9]
These two forms can be mixed:
[A-Za-z0-9_$]
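The sketch below, written for Python's re module, tests a mixed set of letter ranges, a digit range, and two individual characters. It also shows a pitfall with ranges: they follow character-code order, so in ASCII the range A-z covers the punctuation that sits between 'Z' and 'a' as well as the letters.

```python
import re

# Letters, digits, underscore and dollar sign.
identifier_char = re.compile(r"[A-Za-z0-9_$]")

accepted = [c for c in "Qz7_$-^ " if identifier_char.fullmatch(c)]
print(accepted)  # ['Q', 'z', '7', '_', '$']

# A single range from 'A' to 'z' is wider than it looks:
# '^' (0x5E) lies between 'Z' (0x5A) and 'a' (0x61).
wide_range = re.compile(r"[A-z]")
print(bool(wide_range.fullmatch("^")))  # True
```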
Note that special regexp characters (such as '*') are not special within a character set. '-', as illustrated above, is special, except, as illustrated below, when it is the first character mentioned.
This is a four-character set:
[-+*/]
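As a quick check of the four-character set, the following sketch (Python's re module, with an arbitrary input string) finds each arithmetic operator. The leading '-' is taken literally, and '*' and '+' lose their usual repetition meaning inside the brackets.

```python
import re

# '-' comes first, so it is a literal hyphen rather than a range operator.
operators = re.compile(r"[-+*/]")

found = operators.findall("a-b + c*d / e")
print(found)  # ['-', '+', '*', '/']
```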
The third form of a character set makes use of a pre-defined "character class":
[[:class-name:]] -- every character described by class-name is in the set.
The supported character classes are:
alnum - the set of alpha-numeric characters
alpha - the set of alphabetic characters
blank - tab and space
cntrl - the control characters
digit - decimal digits
graph - all printable characters except space
lower - lower case letters
print - the "printable" characters
punct - punctuation
space - whitespace characters
upper - upper case letters
xdigit - hexadecimal digits
Finally, character class sets can also be inverted:
[^[:space:]] - all non-whitespace characters
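Python's re module does not understand the [:class-name:] syntax, so the sketch below emulates it by textual substitution before compiling. The table of ASCII equivalents and the helper expand_classes are assumptions made for this illustration (they correspond to the usual meaning of these classes in the C locale); they are not part of the library described here.

```python
import re

# Assumed ASCII (C locale) equivalents, written as the inside of a
# Python character set. Illustrative only.
POSIX_CLASSES = {
    "alnum": r"0-9A-Za-z",
    "alpha": r"A-Za-z",
    "blank": r" \t",
    "cntrl": r"\x00-\x1f\x7f",
    "digit": r"0-9",
    "graph": r"\x21-\x7e",
    "lower": r"a-z",
    "print": r"\x20-\x7e",
    "punct": r"!-/:-@\[-`{-~",
    "space": r" \t\n\r\f\v",
    "upper": r"A-Z",
    "xdigit": r"0-9A-Fa-f",
}

def expand_classes(pattern):
    """Replace each [:name:] with its ASCII set body (hypothetical helper)."""
    return re.sub(r"\[:([a-z]+):\]",
                  lambda m: POSIX_CLASSES[m.group(1)],
                  pattern)

# [^[:space:]]+ becomes a negated set of the six whitespace characters.
non_space = re.compile(expand_classes("[^[:space:]]+"))
words = non_space.findall("tab\tand space\n")
print(words)  # ['tab', 'and', 'space']
```

An engine with native support for character classes needs no such translation, and may also take the current locale into account, which this table does not.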
Character sets can be used in a regular expression anywhere a literal character can.