Matching a single character |
Characters that otherwise have special regexp meanings |
\ |
Escapes the character that follows, removing its special meaning: \. \+ \* \? \| \{ \( \[ \^ \$ |
Characters that need to be written in a special way |
\t |
The tab character |
\n |
The newline (line feed) character |
\r |
The carriage-return character |
\f |
The form-feed character |
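A minimal sketch of escaping from Java source code, assuming the java.util.regex package. Inside a Java string literal each regex backslash must itself be doubled, so the regex \. is written "\\.". The class and method names (EscapeDemo, isVersion, splitOnPipe) are hypothetical and chosen only for illustration.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // True if the whole input looks like "digits.digits", e.g. "1.5".
    // "\\." in Java source is the regex \. (a literal dot, not "any character").
    static boolean isVersion(String s) {
        return Pattern.matches("\\d+\\.\\d+", s);
    }

    // Splits on a literal '|'. Pattern.quote escapes special characters,
    // so the bar is not treated as alternation.
    static String[] splitOnPipe(String s) {
        return s.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        System.out.println(isVersion("1.5"));             // true
        System.out.println(isVersion("1x5"));             // false
        System.out.println(splitOnPipe("a|b|c").length);  // 3
    }
}
```

Pattern.quote is convenient when the text to match comes from user input and may contain any of the special characters listed above.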
Matching a single character with a predefined character class |
. |
Any character (matches line terminators only when the DOTALL flag is set) |
\d |
A digit: [0-9] |
\D |
A non-digit: [^0-9] |
\s |
A whitespace character: [ \t\n\x0B\f\r] |
\S |
A non-whitespace character: [^\s] |
\w |
A word character: [a-zA-Z_0-9] |
\W |
A non-word character: [^\w] |
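The predefined classes above can be tried out with the regex methods built into String. This is only a sketch; the class and method names (PredefinedClassDemo, digitsOnly, words, isIdentifier) are hypothetical.

```java
class PredefinedClassDemo {
    // Removes every non-digit (\D), leaving only the digits.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // Splits on runs of whitespace (\s+) after trimming the ends.
    static String[] words(String s) {
        return s.trim().split("\\s+");
    }

    // True if the whole string consists of word characters (\w).
    static boolean isIdentifier(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("(555) 123-4567"));      // 5551234567
        System.out.println(words("  to be\tor not ").length);  // 4
        System.out.println(isIdentifier("max_value2"));        // true
        System.out.println(isIdentifier("max-value"));         // false
    }
}
```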
Defining Character classes (match one character) |
Character classes provide a way to specify a set of characters.
The class specification is enclosed in square brackets, [].
The set can also be defined by what must
not be in it, by beginning the set with a caret, "^".
A hyphen, "-", indicates
a range of character values. Although a character class matches only one character,
a quantifier following it can be used to match multiple characters. |
[abc] |
a, b, or c (simple class) |
[^abc] |
Any character except a, b, or c (negation) |
[a-zA-Z] |
a through z
or A through Z, inclusive (range) |
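As a rough illustration of simple, negated, and range classes, the following sketch uses them from Java. The class and method names (CharClassDemo, isHexColor, stripVowels, keepLetters) are made up for this example.

```java
class CharClassDemo {
    // Ranges: '#' followed by exactly six hexadecimal digits.
    static boolean isHexColor(String s) {
        return s.matches("#[0-9a-fA-F]{6}");
    }

    // Simple class: deletes every vowel.
    static String stripVowels(String s) {
        return s.replaceAll("[aeiouAEIOU]", "");
    }

    // Negated class: deletes everything that is not an ASCII letter.
    static String keepLetters(String s) {
        return s.replaceAll("[^a-zA-Z]", "");
    }

    public static void main(String[] args) {
        System.out.println(isHexColor("#1a2B3c"));   // true
        System.out.println(isHexColor("#12345g"));   // false
        System.out.println(stripVowels("Regular"));  // Rglr
        System.out.println(keepLetters("a1-b2"));    // ab
    }
}
```

Note how the quantifier {6} after the class in isHexColor makes the one-character class match six characters in a row.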
Position and Boundary patterns (match zero characters) |
^ |
The beginning of a line. Very useful. |
$ |
The end of a line. Very useful. ^$ matches all empty lines. |
\b |
A word boundary |
\B |
A non-word boundary |
\A |
The beginning of the input |
\G |
The end of the previous match |
\Z |
The end of the input but for the final
terminator, if any |
\z |
The end of the input |
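A small sketch of the zero-width patterns in Java. One detail worth knowing: in java.util.regex, ^ and $ match at line boundaries only when the MULTILINE flag is given; without it they match only at the start and end of the input. The class and method names (BoundaryDemo, countEmptyLines, replaceWord) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Counts empty lines: positions where a line starts (^) and ends ($) at once.
    static int countEmptyLines(String text) {
        Matcher m = Pattern.compile("^$", Pattern.MULTILINE).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Replaces "cat" only where it stands alone as a word (\b on both sides).
    static String replaceWord(String text) {
        return text.replaceAll("\\bcat\\b", "dog");
    }

    public static void main(String[] args) {
        System.out.println(countEmptyLines("a\n\nb\n\nc"));      // 2
        System.out.println(replaceWord("cat concat cats cat"));  // dog concat cats dog
    }
}
```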
Quantifiers (repeating the previous element) |
|
Greedy quantifiers - Expand as much as possible |
X? |
X, once or not at all |
X* |
X, zero or more times |
X+ |
X, one or more times |
X{n} |
X, exactly n times |
X{n,} |
X, at least n times |
X{n,m} |
X, at least n but not more than m times |
|
Reluctant quantifiers - Expand only if forced by later failure to match |
X?? |
X, once or not at all |
X*? |
X, zero or more times |
X+? |
X, one or more times |
X{n}? |
X, exactly n times |
X{n,}? |
X, at least n times |
X{n,m}? |
X, at least n but not more than m times |
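The difference between greedy and reluctant quantifiers is easiest to see by running both against the same input. The sketch below does that; the helper firstMatch and the class name QuantifierDemo are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns the text of the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>bold</b>";
        // Greedy: .+ expands as far as it can, then backs off to the last '>'.
        System.out.println(firstMatch("<.+>", html));          // <b>bold</b>
        // Reluctant: .+? expands only until the first '>' allows a match.
        System.out.println(firstMatch("<.+?>", html));         // <b>
        // The same applies to counted repetition.
        System.out.println(firstMatch("\\d{2,4}", "123456"));  // 1234
        System.out.println(firstMatch("\\d{2,4}?", "123456")); // 12
    }
}
```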
Other |
|
Alternation |
X|Y |
Tries matching X first; if that fails, tries Y |
|
Grouping - Parentheses both group and create a numbered element that can be used later. |
(X) |
X. This capturing group is remembered so it can be referenced later. Groups are numbered starting at 1. |
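Alternation and capturing groups are often used together. The following is a sketch only, with hypothetical names (GroupDemo, swapNames, imageExtension); it shows groups referenced as $1 and $2 in a replacement string and retrieved with Matcher.group(1).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Turns "last, first" into "first last" by referring back to groups 1 and 2.
    static String swapNames(String s) {
        return s.replaceAll("(\\w+), (\\w+)", "$2 $1");
    }

    // Returns the extension if the name ends in .jpg, .png, or .gif; otherwise null.
    // The parentheses both limit the alternation and capture what matched.
    static String imageExtension(String filename) {
        Matcher m = Pattern.compile("\\.(jpg|png|gif)$").matcher(filename);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(swapNames("Lovelace, Ada"));   // Ada Lovelace
        System.out.println(imageExtension("photo.png"));  // png
        System.out.println(imageExtension("notes.txt"));  // null
    }
}
```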