Python Regular Expressions: From Beginner to Advanced (3)


The output is as follows:
Candidate: cai junsheng <cai.junsheng@example.com>
  Match name : cai junsheng
  Match email: cai.junsheng@example.com
Candidate: Different Name <cai.junsheng@example.com>
  No match
Candidate: Cai Middle junsheng <cai.junsheng@example.com>
  Match name : Cai junsheng
  Match email: cai.junsheng@example.com
Candidate: Cai M. junsheng <cai.junsheng@example.com>
  Match name : Cai junsheng
  Match email: cai.junsheng@example.com

In this example, the back-reference is written as (?P=first_name).
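
As a minimal sketch of the same idea (a simplified pattern written for illustration, not the original example), the code below requires the first name captured before the angle brackets to reappear as the first part of the address. The function name first_name_matches and the fixed "first last <first.last@domain.com>" layout are assumptions.

```python
import re

# Simplified, hypothetical layout: "<first> <last> <first.something@domain.com>".
# (?P=first_name) must match the same text that the first_name group captured.
pattern = re.compile(
    r'''
    ^
    (?P<first_name>\w+)      # first name
    \s+
    \w+                      # last name
    \s+
    <
    (?P=first_name)          # back-reference to the first name
    \.
    \w+
    @
    \w+\.com
    >
    $
    ''',
    re.VERBOSE | re.IGNORECASE)


def first_name_matches(candidate):
    """Return True if the address starts with the person's first name."""
    return pattern.search(candidate) is not None


for candidate in ['cai junsheng <cai.junsheng@example.com>',
                  'Different Name <cai.junsheng@example.com>']:
    print(candidate, '->', first_name_matches(candidate))
```

Because re.IGNORECASE is set, the back-reference also accepts a different letter case, so "Cai" in the name and "cai" in the address are treated as the same text.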

3. Conditional matching in Python regular expressions based on whether a group matched

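Earlier sections showed how to refer back to a group within the same regular expression by name or by number, which makes it possible to require that a later part of the input equal an earlier part. Going one step further, a pattern can choose between two sub-patterns depending on whether an earlier group matched, much like an if...else statement. The syntax is (?(id)yes-expression|no-expression), where id is a group name or group number, yes-expression is the pattern used when that group matched, and no-expression is the pattern used when it did not. For example: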
#python 3.6
#
import re
 
address = re.compile(
    r'''
    ^
    # A name is made up of letters, and may include "."
    # for title abbreviations and middle initials.
    (?P<name>
      ([\w.]+\s+)*[\w.]+
    )?
    \s*
    # Email addresses are wrapped in angle brackets, but
    # only if a name is found.
    (?(name)
      # remainder wrapped in angle brackets because
      # there is a name
      (?P<brackets>(?=(<.*>$)))
      |
      # remainder does not include angle brackets without name
      (?=([^<].*[^>]$))
    )
    # Look for a bracket only if the look-ahead assertion
    # found both of them.
    (?(brackets)<|\s*)
    # The address itself: username@domain.tld
    (?P<email>
      [\w\d.+-]+      # username
      @
      ([\w\d.]+\.)+    # domain name prefix
      (com|org|edu)    # limit the allowed top-level domains
    )
    # Look for a bracket only if the look-ahead assertion
    # found both of them.
    (?(brackets)>|\s*)
    $
    ''',
    re.VERBOSE)
 
candidates = [
    'Cai junsheng <Cai.junsheng@example.com>',
    'No Brackets first.last@example.com',
    'Open Bracket <first.last@example.com',
    'Close Bracket first.last@example.com>',
    'no.brackets@example.com',
]
 
for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print('  Match name :', match.groupdict()['name'])
        print('  Match email:', match.groupdict()['email'])
    else:
        print('  No match')


The output is as follows:
Candidate: Cai junsheng <Cai.junsheng@example.com>
  Match name : Cai junsheng
  Match email: Cai.junsheng@example.com
Candidate: No Brackets first.last@example.com
  No match
Candidate: Open Bracket <first.last@example.com
  No match
Candidate: Close Bracket first.last@example.com>
  No match
Candidate: no.brackets@example.com
  Match name : None
  Match email: no.brackets@example.com


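Here the angle brackets < > are looked for only when the name group matched, and the match fails if they are not paired. When the name group did not match, no brackets are expected, so the other branch of the conditional is used.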
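
The conditional syntax is easier to see in a much smaller pattern. The sketch below is a reduced illustration, not a replacement for the example above: an optional opening bracket is captured in a group, and the conditional then demands a closing bracket only if the opening one was present. The group name open and the helper extract_email are assumptions introduced here.

```python
import re

# If the "open" group matched "<", the address must be followed by ">";
# otherwise the address must run to the end of the string.
bracketed = re.compile(
    r'^(?P<open><)?(?P<email>\w+@\w+(?:\.\w+)+)(?(open)>|$)')


def extract_email(text):
    """Return the address if the brackets are balanced, else None."""
    match = bracketed.match(text)
    return match.group('email') if match else None


for text in ['<user@host.com>', 'user@host.com',
             '<user@host.com', 'user@host.com>']:
    print(repr(text), '->', extract_email(text))
```

The first two inputs yield the address and the last two yield None, because exactly one of the two brackets is missing in each.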

4. Replacing matched groups with regular expressions in Python
