The output is as follows:
Candidate: cai junsheng <cai.junsheng@example.com>
Match name : cai junsheng
Match email: cai.junsheng@example.com
Candidate: Different Name <cai.junsheng@example.com>
No match
Candidate: Cai Middle junsheng <cai.junsheng@example.com>
Match name : Cai junsheng
Match email: cai.junsheng@example.com
Candidate: Cai M. junsheng <cai.junsheng@example.com>
Match name : Cai junsheng
Match email: cai.junsheng@example.com
In this example, the back-reference to the earlier group is written as (?P=first_name).
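As a smaller illustration of the same named back-reference syntax, the sketch below looks for a word that is immediately repeated. It is not part of the original example: the pattern `repeated`, the group name `word`, and the helper `find_repeated_word` are hypothetical names chosen for illustration.

```python
import re

# Hypothetical pattern: a word, some whitespace, then the same word again.
# (?P<word>...) names the group; (?P=word) must match the identical text
# that the group captured earlier in the same match.
repeated = re.compile(r'\b(?P<word>\w+)\s+(?P=word)\b')


def find_repeated_word(text):
    """Return the first immediately repeated word in text, or None."""
    match = repeated.search(text)
    return match.group('word') if match else None


print(find_repeated_word('this is is a test'))      # is
print(find_repeated_word('nothing repeated here'))  # None
```

The \b anchors keep the pattern from treating the "is" at the end of "this" as the first copy of the repeated word.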
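3. Choosing a pattern in Python regular expressions based on whether a group matched
Earlier we learned how to refer back to a group within the same regular expression by name or by number, which lets a later part of the pattern be required to equal an earlier part. Going one step further, we can choose one pattern when an earlier group matched and a different pattern when it did not, much like an if...else statement. The syntax for this is (?(id)yes-expression|no-expression), where id is the group name or group number, yes-expression is the pattern used when the group matched, and no-expression is the pattern used when it did not. For example: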
#python 3.6
#
import re

# A raw string keeps sequences such as \w and \s intact for the regex engine.
address = re.compile(
    r'''
    ^
    # A name is made up of letters, and may include "."
    # for title abbreviations and middle initials.
    (?P<name>
      ([\w.]+\s+)*[\w.]+
    )?
    \s*
    # Email addresses are wrapped in angle brackets, but
    # only if a name is found.
    (?(name)
      # remainder wrapped in angle brackets because
      # there is a name
      (?P<brackets>(?=(<.*>$)))
      |
      # remainder does not include angle brackets without name
      (?=([^<].*[^>]$))
    )
    # Look for a bracket only if the look-ahead assertion
    # found both of them.
    (?(brackets)<|\s*)
    # The address itself: username@domain.tld
    (?P<email>
      [\w\d.+-]+       # username
      @
      ([\w\d.]+\.)+    # domain name prefix
      (com|org|edu)    # limit the allowed top-level domains
    )
    # Look for a bracket only if the look-ahead assertion
    # found both of them.
    (?(brackets)>|\s*)
    $
    ''',
    re.VERBOSE)

candidates = [
    'Cai junsheng <Cai.junsheng@example.com>',
    'No Brackets first.last@example.com',
    'Open Bracket <first.last@example.com',
    'Close Bracket first.last@example.com>',
    'no.brackets@example.com',
]

for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print(' Match name :', match.groupdict()['name'])
        print(' Match email:', match.groupdict()['email'])
    else:
        print(' No match')
The output is as follows:
Candidate: Cai junsheng <Cai.junsheng@example.com>
 Match name : Cai junsheng
 Match email: Cai.junsheng@example.com
Candidate: No Brackets first.last@example.com
 No match
Candidate: Open Bracket <first.last@example.com
 No match
Candidate: Close Bracket first.last@example.com>
 No match
Candidate: no.brackets@example.com
 Match name : None
 Match email: no.brackets@example.com
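Here, the angle brackets < > are looked for only when the name group matched, and if they are not paired the match fails. When the name group did not match, no brackets are required, so the other branch of the conditional expression is used.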
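The same conditional syntax may be easier to see in a smaller case. The sketch below is not from the original example: the patterns `by_number` and `by_name`, the group name `open`, and the helper `fully_matches` are hypothetical names chosen for illustration. It shows the id given once as a group number and once as a group name, with the no-expression omitted (Python's re module allows leaving it out).

```python
import re

# Group 1 optionally captures an opening parenthesis.
# (?(1)\)) requires a closing parenthesis only if group 1 matched.
by_number = re.compile(r'^(\()?\d+(?(1)\))$')

# The same idea with a named group: a closing ">" is required
# only when the optional opening "<" was matched.
by_name = re.compile(r'^(?P<open><)?\w+@\w+\.com(?(open)>)$')


def fully_matches(pattern, text):
    """Return True if the compiled pattern matches text, else False."""
    return pattern.match(text) is not None


for text in ['(123)', '123', '(123', '123)']:
    print(text, fully_matches(by_number, text))

for text in ['<a@b.com>', 'a@b.com', '<a@b.com', 'a@b.com>']:
    print(text, fully_matches(by_name, text))
```

In both loops only the first two inputs match: the balanced form and the form with no brackets at all. An unpaired opening or closing bracket is rejected, which is the same behaviour the address pattern above gets from its brackets group.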
4. Replacing matched groups with regular expressions in Python