4.24. Regex Cheatsheet
Also known as: "Regular Expressions", "Regular Expr", "regexp", "regex" or "re"
a
- exact match
a|b
- alternative
[abc]
- enumerated character class
[a-z]
- range character class
.
- any character except a newline (changes meaning with re.DOTALL)
^
- start of text; with re.MULTILINE, also the start of each line
$
- end of text; with re.MULTILINE, also the end of each line
\A
- start of text (doesn't change meaning with re.MULTILINE)
\Z
- end of text (doesn't change meaning with re.MULTILINE)
[^abc]
- negated character class (any character except those listed)
\d
- digit (alias to [0-9] with re.ASCII; by default also matches other Unicode decimal digits)
\D
- anything but a digit (alias to [^0-9] with re.ASCII)
\s
- whitespace (space, tab, newline, non-breaking space)
\S
- anything but whitespace
\b
- word boundary
\B
- anything but a word boundary
\w
- any word character: Unicode letters (lower or upper case, including letters with diacritics, e.g. ąćęłńóśżź), digits and underscore
\W
- anything but a word character (e.g. whitespace, dots, commas, dashes)
{n}
- exactly n repetitions
{,n}
- maximum n repetitions, greedy (prefer longest)
{n,}
- minimum n repetitions, greedy (prefer longest)
{n,m}
- minimum n repetitions, maximum m repetitions, greedy (prefer longest)
*
- minimum 0 repetitions, no maximum, greedy (prefer longest), alias to {0,}
+
- minimum 1 repetition, no maximum, greedy (prefer longest), alias to {1,}
?
- minimum 0 repetitions, maximum 1 repetition, greedy (prefer longest), alias to {0,1}
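The character classes, anchors and greedy quantifiers listed so far can be tried out with Python's re module. The following is a minimal sketch; the sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample data, two lines of text
text = "Order 66 shipped on 2024-01-15\nOrder 7 shipped on 2024-02-03"

# \d+ : one or more digits, greedy (takes as many digits as possible)
numbers = re.findall(r"\d+", text)

# {n} : exact repetition counts
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# ? : the preceding character is optional
spellings = re.findall(r"colou?r", "color colour")

# a|b : alternative
kinds = re.findall(r"cat|dog", "cat, dog, cow")

# [^...] : negated character class (here: anything but vowels and space)
non_vowels = re.findall(r"[^aeiou ]+", "regular expr")

# ^ matches only at the start of text unless re.MULTILINE is set
first_only = re.findall(r"^\w+", text)
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# \A always means start of text, even with re.MULTILINE
text_start = re.findall(r"\A\w+", text, flags=re.MULTILINE)

# . stops at a newline unless re.DOTALL is set
per_line = re.findall(r"Order.+", text)
whole_text = re.findall(r"Order.+", text, flags=re.DOTALL)

print(numbers)
print(dates)
print(line_starts, text_start)
print(len(per_line), len(whole_text))
```

Without re.DOTALL the last pattern yields one match per line; with it, a single match spans both lines.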
{,n}?
- maximum n repetitions, lazy (prefer shortest)
{n,}?
- minimum n repetitions, lazy (prefer shortest)
{n,m}?
- minimum n repetitions, maximum m repetitions, lazy (prefer shortest)
*?
- minimum 0 repetitions, no maximum, lazy (prefer shortest), alias to {0,}?
+?
- minimum 1 repetition, no maximum, lazy (prefer shortest), alias to {1,}?
??
- minimum 0 repetitions, maximum 1 repetition, lazy (prefer shortest), alias to {0,1}?
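The difference between greedy and lazy quantifiers is easiest to see side by side. A small sketch, with sample strings chosen only for illustration:

```python
import re

# Hypothetical sample data
html = "<b>bold</b> and <i>italic</i>"
digits = "123456"

# Greedy .+ runs to the last '>' in the string
greedy_tags = re.findall(r"<.+>", html)

# Lazy .+? stops at the first '>' it can
lazy_tags = re.findall(r"<.+?>", html)

# {n,m} takes up to m repetitions; {n,m}? settles for n
greedy_range = re.match(r"\d{2,4}", digits).group()
lazy_range = re.match(r"\d{2,4}?", digits).group()

# ? takes one repetition if available; ?? prefers zero
greedy_optional = re.match(r"\d?", digits).group()
lazy_optional = re.match(r"\d??", digits).group()

print(greedy_tags)    # the whole string as a single match
print(lazy_tags)      # each tag separately
print(greedy_range, lazy_range)
print(repr(greedy_optional), repr(lazy_optional))
```

Lazy quantifiers still match as much as is needed for the rest of the pattern to succeed; they only change which match is tried first.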
()
- matches whatever regular expression is inside the parentheses, and indicates the start and end of a group
(...)
- unnamed group (positional)
(?P<mygroup>...)
- named group mygroup
(?:...)
- non-capturing group
(?#...)
- comment
(?P=name)
- backreference by group name (inside the pattern)
\g<number>
- backreference by group number (in the replacement string of re.sub())
\g<name>
- backreference by group name (in the replacement string of re.sub())
re.ASCII
- perform ASCII-only matching instead of full Unicode matching
re.IGNORECASE
- case-insensitive matching
re.LOCALE
- make \w, \W, \b, \B and case-insensitive matching dependent on the current locale (bytes patterns only, discouraged)
re.MULTILINE
- ^ and $ also match at the start and end of each line
re.DOTALL
- dot (.) also matches newline characters
re.UNICODE
- turns on Unicode character support for \w (already the default for str patterns in Python 3)
re.VERBOSE
- ignores whitespace in the pattern (except in a character class or when escaped, e.g. \s) and allows # comments in the pattern passed to re.compile()
re.DEBUG
- display debugging information during pattern compilation
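Groups, backreferences and flags are often combined in practice. The sketch below shows one possible use of each; all sample strings, group names and variable names are hypothetical.

```python
import re

# Positional group (1) and named groups (month, day)
m = re.search(r"(\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", "Due: 2024-01-15")
year = m.group(1)
month = m.group("month")
named = m.groupdict()

# (?:...) groups the alternative without creating a numbered group
title_match = re.search(r"(?:Mr|Ms)\. (\w+)", "Ms. Watney")
surname = title_match.group(1)

# (?P=name) refers back to a named group inside the pattern
doubled = re.search(r"\b(?P<word>\w+) (?P=word)\b", "this is is a test")
repeated_word = doubled.group("word")

# \g<name> and \g<number> refer to groups in the re.sub() replacement string
iso = re.sub(r"(?P<d>\d{2})/(?P<m>\d{2})/(?P<y>\d{4})",
             r"\g<y>-\g<m>-\g<d>", "15/01/2024")
swapped = re.sub(r"(\w+) (\w+)", r"\g<2> \g<1>", "Mark Watney")

# re.VERBOSE allows layout and comments; flags are combined with |
pattern = re.compile(r"""
    ^(?P<user>[a-z0-9._]+)    # local part
    @
    (?P<domain>[a-z0-9.-]+)   # domain
    $
""", flags=re.VERBOSE | re.IGNORECASE | re.MULTILINE)

emails = "Mark@NASA.gov\nnot an email\nmelissa.lewis@nasa.gov"
users = [match.group("user") for match in pattern.finditer(emails)]

print(year, month, named)
print(surname, repeated_word)
print(iso, swapped)
print(users)
```

The email pattern is deliberately simplified and is not a full address validator; it only illustrates how re.MULTILINE lets ^ and $ match on every line while re.IGNORECASE accepts upper-case letters.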