Types of regular expressions

Anchors

^ Start of string, or start of line in multi-line pattern
\A Start of string
$ End of string, or end of line in multi-line pattern
\Z End of string
\b Word boundary
\B Not word boundary
\< Start of word
\> End of word
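
The anchors above can be tried out with a minimal sketch in Python's `re` module. The sample strings are made up for illustration. Python's `re` does not support `\<` and `\>`, so `\b` stands in for both here.

```python
import re

text = "first line\nsecond line"

# Without re.MULTILINE, ^ matches only at the start of the string.
starts_default = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each newline.
starts_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)

# \A means start of string even in multi-line mode.
starts_absolute = re.findall(r"\A\w+", text, flags=re.MULTILINE)

# \b is a word boundary, so "cat" inside "concat" is skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cat's")

print(starts_default, starts_multiline, starts_absolute, whole_words)
```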

Character classes

[abc] A single character of: a, b or c
[^abc] A character except: a, b or c. The caret ^ negates character classes
[a-z] A character in the range: a-z
[^a-z] A character not in the range: a-z
[0-9] A digit in the range: 0-9
[a-zA-Z] A character in the range: a-z or A-Z
[a-zA-Z0-9] A character in the range: a-z, A-Z or 0-9
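
As a rough illustration of the bracket classes above, here is a short Python sketch. The sample text is invented for the example.

```python
import re

sample = "Room 42b, Floor 7"

# [0-9] matches one digit at a time.
digits = re.findall(r"[0-9]", sample)

# A leading ^ inside the brackets negates the class.
non_letters = re.findall(r"[^a-zA-Z]", "a1-b")

# Ranges can be combined; + repeats the whole class.
alnum_runs = re.findall(r"[a-zA-Z0-9]+", sample)

print(digits, non_letters, alnum_runs)
```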

Shorthand character classes

\c Control character
\s White space
\S Not white space
\d Digit
\D Not digit
\w Word character (letter, digit or underscore)
\W Not word character
\x Hexadecimal digit
\O Octal digit
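
A small Python sketch of the most portable shorthands (`\d`, `\w`, `\s`, `\S`) follows. Support for the others varies by engine. Python's `re`, for instance, has no shorthand for hexadecimal or octal digits, so an explicit class such as `[0-9A-Fa-f]` is used instead. The sample string is an assumption for illustration.

```python
import re

log = "id=7 name=ada_l\tok"

# \d matches digits, \w matches letters, digits and underscore.
digit_chars = re.findall(r"\d", log)
words = re.findall(r"\w+", log)

# \s matches any whitespace, \S is its complement.
fields = re.findall(r"\S+", log)
pieces = re.split(r"\s+", "a  b\tc\nd")

# Explicit class used in place of a hex-digit shorthand.
hex_digits = re.findall(r"[0-9A-Fa-f]", "0xBEEF")

print(digit_chars, words, fields, pieces, hex_digits)
```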

POSIX BRE Special Character Classes

Class Same as Description
[[:alnum:]] [0-9A-Za-z] Matches any alphanumeric character 0–9, A–Z, or a–z
[[:alpha:]] [A-Za-z] Matches any alphabetical character, either upper or lower case
[[:blank:]] [\t ] Matches a space or Tab character
[[:digit:]] [0-9] Matches a numerical digit from 0 through 9
[[:graph:]] [[:alnum:][:punct:]] Visible characters (not space)
[[:lower:]] [a-z] Matches any lowercase alphabetical character a–z
[[:print:]] [ -~] Matches any printable character: the visible characters of [[:graph:]] plus space
[[:punct:]] [!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~] Matches any punctuation character
[[:space:]] [\t\n\v\f\r ] Matches any whitespace character: space, Tab, NL(newline), FF (formfeed), VT (vertical tab), CR (carriage return)
[[:upper:]] [A-Z] Matches any uppercase alphabetical character A–Z
[[:word:]] [0-9A-Za-z_] Matches all word characters
[[:xdigit:]] [0-9A-Fa-f] Matches all hexadecimal digits
[[:<:]] \b(?=\w) Matches start of word
[[:>:]] \b(?<=\w) Matches end of word
[[:ascii:]] [\x00-\x7F] ASCII codes 0-127
[[:cntrl:]] [\x00-\x1F\x7F] Control characters
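
Python's `re` module does not understand POSIX bracket classes, so the "Same as" column above is what you would write there. The sketch below uses a hypothetical helper, `expand_posix`, that swaps a few class names for their equivalents. It is a plain string replacement for illustration, not a full POSIX parser.

```python
import re

# Equivalents taken from the "Same as" column above (subset only).
POSIX_CLASSES = {
    "alnum": "0-9A-Za-z",
    "alpha": "A-Za-z",
    "blank": r" \t",
    "digit": "0-9",
    "lower": "a-z",
    "space": r" \t\n\v\f\r",
    "upper": "A-Z",
    "xdigit": "0-9A-Fa-f",
}


def expand_posix(pattern: str) -> str:
    """Replace [:name:] inside a bracket expression with its range."""
    for name, chars in POSIX_CLASSES.items():
        pattern = pattern.replace(f"[:{name}:]", chars)
    return pattern


identifier = expand_posix("[[:alpha:]_][[:alnum:]_]*")
is_identifier = re.fullmatch(identifier, "snake_case1") is not None
starts_with_digit = re.fullmatch(identifier, "1abc") is not None
whitespace_runs = re.findall(expand_posix("[[:space:]]+"), "a \t\nb")

print(identifier, is_identifier, starts_with_digit, whitespace_runs)
```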

Quantifiers

? Match an Element Zero or One Time
* Match an Element Zero or More Times
+ Match an Element One or More Times
[0-9]+ Any digit from 0-9 must appear 1 or more times.
{} Match an Element a Specific Number of Times
a{m} The character a must appear exactly m times.
a{m,n} The character a must appear at least m times, but no more than n times.
a{m,} The character a must appear m or more times.
a* Greedy quantifier
a*? Lazy quantifier
a*+ Possessive quantifier
{,n} Match the preceding element if it occurs no more than n times.
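
The difference between greedy and lazy matching, and the counted forms, can be sketched in Python as below. The sample strings are invented. Possessive quantifiers such as `a*+` are left out because Python's `re` only accepts them from version 3.11 onward.

```python
import re

html = "<b>one</b><b>two</b>"

# Greedy: .* grabs as much as possible, so both tags end up in one match.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the first closing tag it can.
lazy = re.findall(r"<b>.*?</b>", html)

# a{2,3} accepts two or three repetitions only.
bounded = [bool(re.fullmatch(r"a{2,3}", s)) for s in ("a", "aa", "aaa", "aaaa")]

# a{,2} accepts zero, one or two repetitions.
at_most_two = [bool(re.fullmatch(r"a{,2}", s)) for s in ("", "aa", "aaa")]

print(greedy, lazy, bounded, at_most_two)
```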

<aside> 💡 Common meta-characters: ^ ( { | * < $ ) ? > . + [ \
Escape these special characters with a backslash (\)

</aside>

Groups and ranges

. Any character except new line (\n)
(a|b) a or b
(...) Group
(?:...) Passive (non-capturing) group
[abc] Range (a or b or c)
[^abc] Not (a or b or c)
[a-q] Lower case letter from a to q
[A-Q] Upper case letter from A to Q
[0-7] Digit from 0 to 7
\x Backreference to group/subpattern number x (for example \1)
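
A brief Python sketch of capturing groups, non-capturing groups, alternation and a numbered backreference is shown below. The input strings are assumptions chosen for the example.

```python
import re

# Capturing groups are returned in order by groups().
date_parts = re.match(r"(\d{4})-(\d{2})", "2024-05").groups()

# Alternation inside a non-capturing group: no extra group is created,
# so findall returns the whole match.
spellings = re.findall(r"gr(?:a|e)y", "gray grey groy")

# (?:ab)+ repeats the pair "ab" as a unit.
repeats = re.findall(r"(?:ab)+", "ababab ab")

# \1 refers back to whatever group 1 matched, here a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "it is is fine").group(1)

print(date_parts, spellings, repeats, doubled)
```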

Special Characters

\n Newline
\r Carriage return
\t Horizontal Tab
\v Vertical Tab
\f Form feed
\xxx Octal character xxx
\xhh Hex character hh
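
The escapes above can be checked quickly in Python, as in this sketch. Raw strings (`r"..."`) are used so that the regex engine, not the Python string parser, interprets each escape.

```python
import re

# \x41 (hex) and \101 (octal) both denote the letter "A".
hex_hits = re.findall(r"\x41", "ABA")
octal_hits = re.findall(r"\101", "ABA")

# Split on Windows (\r\n) or Unix (\n) line endings.
lines = re.split(r"\r\n|\n", "a\r\nb\nc")

# Replace each tab with a single space.
untabbed = re.sub(r"\t", " ", "a\tb")

print(hex_hits, octal_hits, lines, untabbed)
```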

Pattern Modifiers

g Global match
i * Case-insensitive
m * Multiple lines
s * Treat string as single line
x * Allow comments and whitespace in pattern
e * Evaluate replacement
U * Ungreedy pattern
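
Modifier letters differ between engines. As a hedged illustration, Python's `re` exposes `i`, `m`, `s` and `x` as the flags `re.IGNORECASE`, `re.MULTILINE`, `re.DOTALL` and `re.VERBOSE`. A global match is roughly what `re.findall` does, and `e` and `U` have no direct counterpart there. The phone-number pattern is made up for the example.

```python
import re

# i: ignore case.
case_hits = re.findall(r"python", "Python PYTHON python", flags=re.IGNORECASE)

# s: let . match a newline as well.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", flags=re.DOTALL)

# x: whitespace and # comments inside the pattern are ignored.
phone = re.compile(
    r"""
    (\d{3})   # prefix
    -         # literal separator
    (\d{4})   # line number
    """,
    re.VERBOSE,
)
phone_parts = phone.match("555-1234").groups()

print(case_hits, dot_default, dot_all is not None, phone_parts)
```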

Escape sequences

\ Escape following character
\Q Begin literal sequence
\E End literal sequence

<aside> 💡 What is Escaping?

Escaping is a method of treating characters with special meanings in regular expressions literally rather than as special characters.

</aside>
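
To make escaping concrete, here is a small Python sketch. Python's `re` does not support `\Q...\E`, so `re.escape` is used as a stand-in for quoting a whole literal. The sample strings are illustrative.

```python
import re

literal = "1+1=2 (really?)"

# Unescaped, + means "one or more", so the pattern does not match itself.
unescaped_hit = re.search(r"1+1=2", "1+1=2")

# re.escape backslash-escapes every meta-character in the literal.
escaped_hit = re.search(re.escape(literal), "proof: 1+1=2 (really?)")

# A single backslash makes the dot literal instead of "any character".
dots = re.findall(r"\.", "a.b.c")

print(unescaped_hit, escaped_hit is not None, dots)
```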

Assertions

?= Lookahead assertion
?! Negative lookahead
?<= Lookbehind assertion
?<! Negative lookbehind
?> Once-only subexpression
?() Condition [if then]
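
The assertions above match a position without consuming text. A minimal Python sketch follows, using an invented price list. Once-only (atomic) groups are omitted because Python's `re` only accepts `(?>...)` from version 3.11 onward.

```python
import re

prices = "apples $5, pears 7 EUR, plums $12"

# Lookbehind: digits that directly follow a dollar sign.
dollar_amounts = re.findall(r"(?<=\$)\d+", prices)

# Lookahead: digits that are directly followed by " EUR".
euro_amounts = re.findall(r"\d+(?= EUR)", prices)

# Negative lookahead: whole numbers not followed by " EUR".
not_euro = re.findall(r"\b\d+\b(?! EUR)", prices)

# Negative lookbehind: numbers not preceded by a dollar sign.
not_dollar = re.findall(r"(?<!\$)\b\d+", prices)

# Conditional: require ">" only if the optional "<" (group 1) matched.
conditional = r"(<)?\w+(?(1)>)"
balanced = [bool(re.fullmatch(conditional, s)) for s in ("<tag>", "tag", "<tag")]

print(dollar_amounts, euro_amounts, not_euro, not_dollar, balanced)
```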