A backreference is written \n, where n is some single digit other
than 0. To be a valid backreference, there must be at least n
parenthesized subexpressions in the pattern prior to the
backreference.
A backreference matches a literal copy of whatever was matched by the corresponding subexpression. For example,
\(.*\)-\1
matches:
go-go ha-ha wakka-wakka ...
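The same behavior can be tried out in Python. This is only a sketch, not part of the regexp syntax described here: Python's re module writes grouping parentheses without backslashes, so the pattern becomes (.*)-\1, and the helper name is_doubled is invented for illustration.

```python
import re

# Python groups with ( ... ) rather than \( ... \); \1 still refers
# to whatever the first parenthesized subexpression matched.
DOUBLED = re.compile(r"(.*)-\1")

def is_doubled(word):
    """Return True if the whole word has the form X-X for some string X."""
    return DOUBLED.fullmatch(word) is not None

for word in ["go-go", "ha-ha", "wakka-wakka", "go-ha"]:
    print(word, is_doubled(word))
```

The last word, go-ha, is rejected because the text after the hyphen is not a literal copy of what the subexpression matched.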
In some applications, subexpressions are used to extract substrings.
For example, Emacs has the functions match-beginning and match-end,
which report the positions of strings matched by
subexpressions. These functions use the same numbering scheme for
subexpressions as backreferences, with the additional rule that
subexpression 0 is defined to be the whole regexp.
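As a rough analogue (an assumption for illustration, not the Emacs functions themselves), Python's match objects expose start(n) and end(n) methods that follow the same numbering, with 0 standing for the whole match. The sample string and variable names below are made up.

```python
import re

text = "say wakka-wakka now"

# Python syntax for the pattern \(.*\)-\1 used earlier.
m = re.search(r"(.*)-\1", text)

# Subexpression 0 is the whole match; subexpression 1 is the first group.
whole_span = (m.start(0), m.end(0))
group_span = (m.start(1), m.end(1))

whole_text = text[whole_span[0]:whole_span[1]]
group_text = text[group_span[0]:group_span[1]]

print(whole_span, whole_text)
print(group_span, group_text)
```

Slicing the original string with the reported positions recovers the matched substrings, which is how position-reporting functions are typically used for extraction.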
In some applications, subexpressions are used in string substitution. This again uses the backreference numbering scheme. For example, this sed command:
s/From:.*<\(.*\)>/To: \1/
first matches the line:
From: Joe Schmoe <schmoe@uspringfield.edu>
When it does, subexpression 1 matches "schmoe@uspringfield.edu". The command then replaces the matched text with "To: \1", substituting the text matched by subexpression 1 for \1, to get:
To: schmoe@uspringfield.edu
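For comparison, the same substitution can be sketched with Python's re.sub. This is an illustration only: the function name from_to is hypothetical, and Python again uses unescaped parentheses for the subexpression.

```python
import re

def from_to(line):
    """Rewrite a From: header line as a To: line holding only the address."""
    # \1 in the replacement stands for the text matched by subexpression 1.
    # count=1 mirrors a sed s command given without the g flag.
    return re.sub(r"From:.*<(.*)>", r"To: \1", line, count=1)

print(from_to("From: Joe Schmoe <schmoe@uspringfield.edu>"))
```

A line that does not match the pattern is returned unchanged, just as sed leaves non-matching lines alone.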