A Clever Use of re.sub in Python to Analyze the Regular Expression Matching Process


mob604756eedb0b 2019-08-24 15:20:00


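Statement: the method used in this article was researched and coded by LaoYuan himself, and the copyright of the related code belongs to LaoYuan. Reposting this article is prohibited, and the code may not be used for commercial purposes!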

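The article "Section 11.23: Search and Replace in Python's re Module: the sub and subn Functions" introduced the re.sub function, whose replacement argument can be a function. re.sub calls that function once for every match and passes it the match object, so we can use it to show the order in which target substrings are matched, the content of each matched text, and the position of each matched text within the searched text. The implementation is as follows: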

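import re

matchcount = 0  # number of matches reported so far in the current run

def parsematch(patstr, text):
    """Run patstr against text and print the details of every match."""
    global matchcount
    matchcount = 0
    # re.sub calls matchrsult once per match; the substituted string is discarded
    re.sub(patstr, matchrsult, text)

def matchrsult(m):
    """Callback for re.sub: print every group of match m together with its span."""
    global matchcount
    matchcount += 1
    print(f"Match {matchcount}, details:")
    # m.re.groups is the number of capturing groups in the pattern (0 if there are none).
    # m.lastindex is not used here because it reports 1 for nested groups such as ((a)(b)).
    # m.re and m.string hold the pattern and the searched text if they are needed as well.
    for i in range(m.re.groups + 1):
        print(f"    Matched substring group({i}): {m.group(i)}, position: {m.span(i)}")
    return m.group(0)  # replace the match with itself, leaving the text unchanged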
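
The callback above reports its findings through print and a module-level counter. As a minimal sketch of the same idea without global state, the hypothetical helper collect_matches below (a name introduced here for illustration, not part of the original code) records each match in a list. It also returns the string produced by re.sub, which makes it easy to check that returning m.group(0) from the callback leaves the text unchanged.

```python
import re

def collect_matches(patstr, text):
    """Hypothetical helper: gather (group index, text, span) for every match."""
    records = []

    def callback(m):
        # One entry per match: a list of (index, matched text, span) tuples
        records.append([(i, m.group(i), m.span(i)) for i in range(m.re.groups + 1)])
        return m.group(0)  # substitute the match with itself

    result = re.sub(patstr, callback, text)
    return records, result

sample = 'pages 10-12 and 40-45'
records, result = collect_matches(r'(\d+)-(\d+)', sample)
for n, groups in enumerate(records, 1):
    print(n, groups)
print(result == sample)
```

Returning the records instead of printing them is only one possible design. It keeps the callback reusable in tests, at the cost of holding all match details in memory.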

Example calls:

>>> parsematch(r'(?i)(?P<lab>py\w*)','Python?PYTHON!Learning python with LaoYuan! ')
Match 1, details:
    Matched substring group(0): Python, position: (0, 6)
    Matched substring group(1): Python, position: (0, 6)
Match 2, details:
    Matched substring group(0): PYTHON, position: (7, 13)
    Matched substring group(1): PYTHON, position: (7, 13)
Match 3, details:
    Matched substring group(0): python, position: (23, 29)
    Matched substring group(1): python, position: (23, 29)
>>>
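
For comparison, the matches in the first call can also be walked with re.finditer, which needs no replacement callback. The sketch below is an alternative, not the method of this article. It reads the named group lab by name rather than by index, and the list spans is introduced here only to collect the positions.

```python
import re

text = 'Python?PYTHON!Learning python with LaoYuan! '
spans = []
# re.finditer yields one match object per non-overlapping match, left to right
for n, m in enumerate(re.finditer(r'(?i)(?P<lab>py\w*)', text), 1):
    spans.append(m.span('lab'))
    print(f"Match {n}: {m.group('lab')!r} at {m.span('lab')}")
```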
>>> parsematch('(.?)*',"abc")
Match 1, details:
    Matched substring group(0): abc, position: (0, 3)
    Matched substring group(1): , position: (3, 3)
Match 2, details:
    Matched substring group(0): , position: (3, 3)
    Matched substring group(1): , position: (3, 3)
>>> 
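
The second match in this call is an empty string at position 3: the pattern (.?)* can match zero characters, so it matches once more at the end of the text. A small sketch that makes such empty matches visible by wrapping every match in brackets is given below. It assumes Python 3.7 or later, where the re documentation states that empty matches adjacent to a previous non-empty match are also replaced. The callback name mark is hypothetical.

```python
import re

def mark(m):
    # Wrap each match in brackets so that empty matches become visible
    return f"[{m.group(0)}]"

marked = re.sub('(.?)*', mark, 'abc')
print(marked)
```

If the interpreter behaves as in the transcript above, the two matches appear as a bracketed abc followed by an empty pair of brackets.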
>>> parsematch(r'(?P<l1>Lao)(?P<l2>\w+)(Python)','LaoYuanPython')
Match 1, details:
    Matched substring group(0): LaoYuanPython, position: (0, 13)
    Matched substring group(1): Lao, position: (0, 3)
    Matched substring group(2): Yuan, position: (3, 7)
    Matched substring group(3): Python, position: (7, 13)
>>>
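
In the last call two of the three groups are named (l1 and l2), but the output identifies groups only by index. As a hedged sketch, the hypothetical helper describe_groups below shows how the names could be recovered from the compiled pattern's groupindex mapping. Unnamed groups are reported with None.

```python
import re

def describe_groups(m):
    """Hypothetical helper: label each group of m with its name, if it has one."""
    # groupindex maps group names to indices; invert it to look up names by index
    names = {index: name for name, index in m.re.groupindex.items()}
    return [(i, names.get(i), m.group(i), m.span(i))
            for i in range(1, m.re.groups + 1)]

m = re.search(r'(?P<l1>Lao)(?P<l2>\w+)(Python)', 'LaoYuanPython')
for row in describe_groups(m):
    print(row)
```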