Python Regular Expressions #53
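Some regex syntax

(?i) => Inline flag; makes the whole pattern case-insensitive.
re.match(r'(?i)ab|AB', 'aB')

(?P<name>...) and (?P=name) => A named group, and a backreference that must match the same text the named group captured.
re.match(r'(?P<underscore>_{1,2})abc(?P=underscore)', '__abc__')

(?=...) => Positive lookahead; succeeds only if the text that follows matches, without consuming it.
re.match(r'abc(?=def)', 'abcdef')

(?!...) => Negative lookahead; succeeds only if the text that follows does not match.
re.match(r'abc(?!d)', 'abce')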
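The syntax examples above can be checked with a short script. This is a minimal sketch: the failing inputs (`'__abc_'`, `'abcxyz'`, `'abcd'`) and the variable names are added for illustration and are not from the original notes.

```python
import re

# (?i) is an inline flag: the whole pattern becomes case-insensitive.
flag_result = re.match(r'(?i)ab|AB', 'aB').group(0)

# (?P=underscore) must repeat exactly what (?P<underscore>...) captured.
backref = re.compile(r'(?P<underscore>_{1,2})abc(?P=underscore)')
backref_ok = backref.match('__abc__').group('underscore')
backref_fail = backref.match('__abc_')  # trailing part is too short

# (?=def) checks what follows but does not consume it.
ahead_ok = re.match(r'abc(?=def)', 'abcdef').group(0)
ahead_fail = re.match(r'abc(?=def)', 'abcxyz')

# (?!d) succeeds only when the next character is not 'd'.
not_ahead_ok = re.match(r'abc(?!d)', 'abce').group(0)
not_ahead_fail = re.match(r'abc(?!d)', 'abcd')

print(flag_result, backref_ok, ahead_ok, not_ahead_ok)
print(backref_fail, ahead_fail, not_ahead_fail)
```

Note that both lookahead matches return only `'abc'`: the lookahead text is tested but never becomes part of the match.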
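(?(id/name)yes|no) => Conditional pattern, as in this example:
>>> re.match(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)', '[email protected]').group(0)
<<< '[email protected]'
>>> re.match(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)', '<[email protected]>').group(0)
<<< '<[email protected]>'
Here, if the preceding (<) group matched, the pattern goes on to require >; otherwise it requires the end of the string.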
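To see the conditional `(?(1)>|$)` at work, the sketch below runs the same pattern against a placeholder address. The address `user@example.com` and the helper name `check` are hypothetical, chosen only for illustration.

```python
import re

# Same pattern as in the notes: group 1 is the optional '<'.
# (?(1)>|$) means: if group 1 matched, require '>', else require end of string.
pattern = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')

def check(text):
    """Return the matched text, or None if the brackets are unbalanced."""
    m = pattern.match(text)
    return m.group(0) if m else None

bare = check('user@example.com')            # no '<', so '$' is required
bracketed = check('<user@example.com>')     # '<' matched, so '>' is required
missing_close = check('<user@example.com')  # '<' without '>'
stray_close = check('user@example.com>')    # '>' without '<'

print(bare, bracketed, missing_close, stray_close)
```

The two unbalanced inputs return `None`, which is the point of the conditional: the closing bracket is accepted only when the opening one was present.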
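The re module

re.compile => Returns a regular expression object, which offers many of the functions described below as methods.
re.compile(pattern, flags=0)
>>> r = re.compile(r'<(\d+)>')
>>> r.match('<323>').groups()
<<< ('323',)

re.search => Searches the string for a match of pattern; returns a match object if one is found, otherwise None.
match = re.search(pattern, string)
if match:
    process(match)

re.match => If the beginning of the string matches pattern, returns a match object; otherwise returns None.
re.match(pattern, string, flags=0)

re.fullmatch => If the whole string matches pattern, returns a match object; otherwise returns None.
re.fullmatch(pattern, string, flags=0)

re.split => Splits the string at each match of pattern.
re.split(pattern, string, maxsplit=0, flags=0)
If the pattern used for splitting contains groups, the text captured by those groups is returned as well.
>>> re.split(r'<->', '123<->456<->789')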
<<< ['123', '456', '789']
>>> re.split(r'<(-)>', '123<->456<->789')
<<< ['123', '-', '456', '-', '789']

re.findall => Returns all matching substrings.
>>> re.findall(r'\d{3}', '123<->456<->789')
<<< ['123', '456', '789']
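# With multiple groups, the result is a list of tuples
>>> re.findall(r'(\d)\d(\d)', '123<->456<->789')
<<< [('1', '3'), ('4', '6'), ('7', '9')]

re.finditer => Returns an iterator of match objects that yields every match; useful for processing long texts.

re.sub, re.subn => Match parts of the source string and build a new string from a template.
re.subn(pattern, repl, string, count=0, flags=0)
For example, the template can refer to named groups:
>>> re.sub(r'(?P<from>\d{3})->(?P<to>\d{3})', r'\g<to><-\g<from>', '123->456')
<<< '456<-123'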
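Several of the functions above are described without a worked example. The sketch below contrasts `re.search`, `re.match` and `re.fullmatch`, then shows `re.finditer` and `re.subn`. The sample strings and variable names are assumptions made for illustration.

```python
import re

text = '123<->456<->789'

# re.search scans the whole string, re.match only tries position 0,
# and re.fullmatch requires the entire string to match.
search_hit = re.search(r'\d{3}', 'id: 123')
match_miss = re.match(r'\d{3}', 'id: 123')
full_miss = re.fullmatch(r'\d{3}', '1234')
full_hit = re.fullmatch(r'\d{3}', '123')

# re.finditer yields match objects one at a time, so positions are available too.
found = [(m.group(0), m.span()) for m in re.finditer(r'\d{3}', text)]

# re.subn behaves like re.sub but also reports how many replacements were made.
replaced, count = re.subn(r'<->', ',', text)

print(search_hit.group(0), match_miss, full_miss, full_hit.group(0))
print(found)
print(replaced, count)
```

Because `re.finditer` produces matches lazily, it avoids building the full result list that `re.findall` returns, which is why it suits long inputs.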
Match objects

match.expand => Builds a string from a template, substituting the groups of the match.
>>> match = re.search(r'<(?P<name>\w+)@(\w+)\.(\w+)>', '<wangyu@163.com>')
>>> match.expand(r'mailto:\g<name>[at]\g<2>[dot]\g<3>')
<<< 'mailto:wangyu[at]163[dot]com'

match.group => Returns one or more groups of the match; group 0 is the whole match.
>>> match = re.search(r'<(?P<name>\w+)@(\w+)\.(\w+)>', '<wangyu@163.com>')
>>> match.group(0) # or match.group()
<<< '<wangyu@163.com>'
>>> match.group(0,1,2,3)
<<< ('<wangyu@163.com>', 'wangyu', '163', 'com')

match.groups => Returns all groups as a tuple.
>>> match.groups()
<<< ('wangyu', '163', 'com')

match.groupdict => Returns all named groups as a dictionary; unnamed groups are ignored.
>>> match = re.search(r'<(?P<name>\w+)@(?P<host>\w+\.\w+)>', '<wangyu@163.com>')
>>> match.groupdict()
<<< {'name': 'wangyu', 'host': '163.com'}

match.start, match.end => Return the start and end positions of the matched group.
>>> match = re.search(r'<(?P<name>\w+)@(?P<host>\w+\.\w+)>', '<wangyu@163.com>')
>>> match.start(),match.start(1),match.start(2)
<<< (0, 1, 8)
>>> match.end(),match.end(1),match.end(2)
<<< (16, 7, 15)

match.span => Returns the start and end positions of the specified group.
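The last entry has no example. Assuming it refers to `Match.span()`, the sketch below shows that `span(group)` returns the same pair as `(start(group), end(group))` and can be used to slice the source string. The sample text and the address `user@example.com` are hypothetical.

```python
import re

text = 'mail <user@example.com> now'
m = re.search(r'<(?P<name>\w+)@(?P<host>\w+\.\w+)>', text)

# span() with no argument covers the whole match (group 0).
whole = m.span()

# Groups can be addressed by number or by name.
name_span = m.span('name')
host_span = m.span(2)

# span(g) is equivalent to (start(g), end(g)).
same_as_start_end = name_span == (m.start('name'), m.end('name'))

# The pair can be used directly as slice bounds on the original string.
host_text = text[host_span[0]:host_span[1]]

print(whole, name_span, host_span, same_as_start_end, host_text)
```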