regex cheat sheet
holzkohlengrill edited this page Dec 15, 2023 · 5 revisions
- Escape special characters with a preceding `\`
- Greediness
  - Regexes are greedy by default: a quantifier matches as many characters as possible while still satisfying the pattern
  - Appending `?` to a quantifier makes it non-greedy (as few characters as possible)
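
The difference between greedy and non-greedy matching is easiest to see on one input. The following is a minimal sketch using Python's `re` module; the sample string is made up for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* grabs as much as possible, so the match runs to the last '>'
greedy = re.search(r"<.*>", text).group()

# Non-greedy: .*? stops at the first '>' that satisfies the pattern
lazy = re.search(r"<.*?>", text).group()

print(greedy)  # <b>bold</b> and <i>italic</i>
print(lazy)    # <b>
```
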
Pattern | Description |
---|---|
`.` | Any character (except a newline, by default) |
`^` | Beginning of line |
`$` | End of line |
`[a-c8]` | One of the characters `a`, `b`, `c` or `8` |
`[^chars]` | Any character except `c`, `h`, `a`, `r`, `s` |
`( )` | Capture group |
`(a(b)c)` | Nested capture group >> `\1` = `abc`; `\2` = `b` |
`(a)?` | Optional capture group; `Abc(a)?` matches `Abc` and `Abca` |
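
Group numbering and optional groups can be checked quickly in Python. This is only a sketch with made-up sample strings; other engines may report an unmatched optional group differently (Python returns `None`).

```python
import re

# Nested groups are numbered by the position of their opening parenthesis
nested = re.search(r"(a(b)c)", "xabcx")
print(nested.group(1))  # abc
print(nested.group(2))  # b

# An optional group that took no part in the match yields None
optional = re.compile(r"Abc(a)?")
print(optional.fullmatch("Abca").group(1))  # a
print(optional.fullmatch("Abc").group(1))   # None

# A negated character class matches any single character that is not listed
leftover = re.findall(r"[^chars]", "charts")
print(leftover)  # ['t']
```
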
Pattern | Description |
---|---|
`*` | 0 or more |
`+` | 1 or more |
`{N}` | Exactly N occurrences |
`{N,M}` | N to M occurrences. If omitted: N = 0; M = inf. |
`{N,M}?` | N to M occurrences, as few as possible |
`?` | 0 or 1 |
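
The counted quantifiers can be compared side by side. A small sketch in Python's `re`; the digit string is arbitrary sample data, and the last line shows a Python-specific behavior that may differ in other engines.

```python
import re

digits = "1234567"

exact = re.match(r"\d{3}", digits).group()      # exactly 3
greedy = re.match(r"\d{2,4}", digits).group()   # 2 to 4, as many as possible
lazy = re.match(r"\d{2,4}?", digits).group()    # 2 to 4, as few as possible
no_lower = re.match(r"\d{,3}", digits).group()  # lower bound omitted: 0 to 3
no_upper = re.match(r"\d{5,}", digits).group()  # upper bound omitted: 5 or more
print(exact, greedy, lazy, no_lower, no_upper)  # 123 1234 12 123 1234567

# Python's re treats braces containing a space as literal text,
# not as a quantifier, so this pattern does not mean "2 to 3 times"
spaced = re.search(r"a{2, 3}", "aaa")
print(spaced)  # None
```
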
Pattern | Description |
---|---|
`\w` | `[a-zA-Z0-9_]` (alphanumeric or underscore) |
`\W` | `[^a-zA-Z0-9_]` (non-alphanumeric) |
`\d` | `[0-9]` (digit) |
`\D` | `[^0-9]` (non-digit) |
`\b` | Empty string at a word boundary (between `\w` and `\W`) |
`\B` | Empty string not at a word boundary |
`\s` | `[ \t\n\r\f\v]` (whitespace) |
`\S` | `[^ \t\n\r\f\v]` (non-whitespace) |
`\A` | Beginning of string |
`\Z` | End of string |
`\g<id>` | Previously defined group (in Python's `re`, used in replacement strings) |
`R\|S` | Regex R or S |
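
Word boundaries are the least intuitive entries in this table, so here is a short sketch of `\b`, `\B`, alternation and `\s` in Python's `re`. The sample strings are made up.

```python
import re

text = "cat concat cat_1 cat"

# \b anchors at word boundaries, so "concat" and "cat_1" are skipped
whole_words = re.findall(r"\bcat\b", text)
print(whole_words)  # ['cat', 'cat']

# \B requires the position NOT to be a word boundary
inside = re.findall(r"\Bcat", text)
print(inside)  # ['cat']  (the one inside "concat")

# R|S: at each position R is tried first, then S
tokens = re.findall(r"\d+|[a-z]+", "ab12cd")
print(tokens)  # ['ab', '12', 'cd']

# \s covers the plain space as well as tab and newline
parts = re.split(r"\s+", "a b\tc\nd")
print(parts)  # ['a', 'b', 'c', 'd']
```
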
Pattern | Description |
---|---|
`(?:...)` | Non-capturing group (match but do not capture) |
`(?<name>A)` | Define named group; A = regex, `name` = name used to refer to the group |
`(?P<name>A)` | Same as above; the `(?<name>A)` form is not supported by every engine (Python's `re` requires `(?P<name>A)`) |
`(?P...)` | Python-style named-group extensions, e.g. `(?P=name)` matches the same text as the named group `name` again |
`(?#...)` | Comment (use for documentation) |
`(?=...)` | Lookahead; asserts a match without consuming |
`(?!...)` | Negative lookahead |
`(?<=...)` | Lookbehind; asserts a match without consuming |
`(?<!...)` | Negative lookbehind |
`(?(A)B\|C)` | Match `B` if group `A` matched, else `C` |
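
The lookaround and conditional constructs are easier to follow with concrete input. The following is a hedged sketch in Python's `re`; all sample strings and variable names are made up, and support for conditionals and lookbehind varies between engines.

```python
import re

# Lookahead: digits followed by "px"; "px" itself is not part of the match
px_values = re.findall(r"\d+(?=px)", "10px 20em 30px")
print(px_values)  # ['10', '30']

# Lookbehind / negative lookbehind: digits with / without a leading "$"
prices = re.findall(r"(?<=\$)\d+", "costs $30 or 25 units")
others = re.findall(r"(?<!\$)\b\d+", "costs $30 or 25 units")
print(prices, others)  # ['30'] ['25']

# Named backreference: the closing quote must equal the opening quote
quoted = re.compile(r"(?P<quote>['\"]).*?(?P=quote)")
found = [m.group(0) for m in quoted.finditer("say \"hi\" or 'yo'")]
print(found)  # ['"hi"', "'yo'"]

# Conditional: require ")" only if group 1 matched "(", else require ";"
cond = re.compile(r"(\()?\d+(?(1)\)|;)")
print(bool(cond.fullmatch("(42)")))  # True
print(bool(cond.fullmatch("42;")))   # True
print(bool(cond.fullmatch("(42")))   # False
```
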
Pattern | Description |
---|---|
`\1`, `\2`, ... `\n` | Backreference; gets the match of the n-th capturing group |
Capture groups from the find pattern can also be referenced in the replace field. Some editors use a different backreference syntax:
- PyCharm: `$n` instead of `\n`
- Notepad++: `\n`
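
In Python, the same find-and-replace idea is available through `re.sub`. This is a minimal sketch with made-up sample strings; `\g<n>` is Python's replacement-string syntax and is not necessarily available in other tools.

```python
import re

# Swap "last, first" into "first last" using numbered backreferences
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Doe, Jane")
print(swapped)  # Jane Doe

# \g<n> avoids ambiguity when a digit follows the reference
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")
print(padded)  # a10b20

# Inside a pattern, \1 matches the same text the group captured
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")
print(doubled)  # ['is', 'test']
```
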
- `re.compile()`
- `re.search()`
- `match.groups()` or `match.group(<group_name>)`
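```python
import re

# "Normal" syntax
pattModuleSummary = re.compile(r"[0-9a-f]{8}")  # Matches 8-character hex numbers

# Example input (placeholder data); replace with your own lines
lines = ["0800a1f4 main.o", "no hex number here"]

# Find and print matches
for line in lines:
    match = pattModuleSummary.search(line)
    # Check whether the pattern matched anywhere in the line
    if match:
        # The pattern has no capture groups, so print the whole match
        print(match.group())
```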
Comment + multiline syntax (ignores whitespace and Python-style comments inside the pattern):
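```python
import re

pattModuleSummary = re.compile(r"""
    ([0-9a-f]{8})      # Origin
    \+([0-9a-f]{8})    # Literal "+" separator, then size
    """, re.X)  # <-- re.X is important!!

# Example input (placeholder data); replace with your own lines
lines = ["0800a1f4+00000040", "no match here"]

# Find and print matches
for line in lines:
    match = pattModuleSummary.search(line)
    # Check whether the pattern matched anywhere in the line
    if match:
        # Print the captured groups (origin, size)
        print(match.groups())
```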
`re.X` (short for `re.VERBOSE`) is necessary if you want to use the multiline `re.compile` syntax.
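```python
import re

# Raw string so that "\+" reaches the regex engine unchanged
pattern1 = re.compile(r'^(?P<addr>[0-9a-f]{8,16})\+(?P<size>[0-9a-f]{8,})$')

line = "0800a1f4+00000040"  # Example input (placeholder data)
match = pattern1.search(line)
if match:
    print(match.group('addr'))  # References only the group `addr`
```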
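
When a pattern defines several named groups, `match.groupdict()` returns all of them at once. A small sketch, reusing the pattern above with a made-up input string; treating the captured values as hexadecimal numbers is an assumption made for the example.

```python
import re

pattern = re.compile(r"^(?P<addr>[0-9a-f]{8,16})\+(?P<size>[0-9a-f]{8,})$")

# Made-up sample input in the "<addr>+<size>" form the pattern expects
match = pattern.search("0800a1f4+00000040")
if match:
    # Dictionary mapping each group name to its matched text
    fields = match.groupdict()
    print(fields)  # {'addr': '0800a1f4', 'size': '00000040'}

    # Assumption: both fields are hexadecimal, so convert with base 16
    size = int(fields["size"], 16)
    print(size)  # 64
```
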
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License *.
Code (snippets) are licensed under a MIT License *.
* Unless stated otherwise