Java Regular Expressions: Capturing Groups
Reposted
Capturing groups come in two kinds:
- Ordinary capturing groups: (Expression)
- Named capturing groups: (?<name>Expression)
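- Ordinary capturing groups
Scanning the regular expression from the left, each left parenthesis "(" opens one group. Group numbers start at 1, and 0 stands for the whole expression.
For the date string 2017-04-25, the expression (written as a Java string literal, hence the doubled backslashes) is:
(\\d{4})-((\\d{2})-(\\d{2}))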
There are 4 left parentheses, so there are 4 groups:
| No. | Capturing group | Match |
|---|---|---|
| 0 | (\d{4})-((\d{2})-(\d{2})) | 2017-04-25 |
| 1 | (\d{4}) | 2017 |
| 2 | ((\d{2})-(\d{2})) | 04-25 |
| 3 | (\d{2}) | 04 |
| 4 | (\d{2}) | 25 |
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptureGroupDemo {
    public static final String DATE_STRING = "2017-04-25";
    public static final String P_COMM = "(\\d{4})-((\\d{2})-(\\d{2}))";
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(P_COMM);
        Matcher matcher = pattern.matcher(DATE_STRING);
        // find() must succeed first; calling group() before a match throws IllegalStateException
        if (matcher.find()) {
            System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
            System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
            System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
            System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
            System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
        }
    }
}
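The numbering rule can also be checked programmatically instead of counting parentheses by hand. The following is a minimal sketch, assuming a hypothetical helper `allGroups` (not part of the original article) that collects group 0 and every numbered group of the first match. It relies only on `Matcher.groupCount()`, which reports the number of capturing groups in the pattern and does not count group 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingSketch {
    // Hypothetical helper: returns group 0 followed by every numbered group
    // of the first match, or an empty list when nothing matches.
    static List<String> allGroups(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (matcher.find()) {
            // groupCount() excludes group 0, so the upper bound is inclusive.
            for (int i = 0; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> groups = allGroups("(\\d{4})-((\\d{2})-(\\d{2}))", "2017-04-25");
        for (int i = 0; i < groups.size(); i++) {
            System.out.printf("group(%d) = %s%n", i, groups.get(i));
        }
    }
}
```

For the date pattern above, the list has five entries: the whole match followed by groups 1 to 4, in the same order as the table.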
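- Named capturing groups
In a named capturing group, the left parenthesis is immediately followed by ?<name>, where name is the group's name, and only then by the regular expression.
For the date string 2017-04-25, the expression is:
(?<year>\\d{4})-(?<md>(?<month>\\d{2})-(?<day>\\d{2}))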
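There are 4 named capturing groups. Group 0, the whole match, has no name:
| No. | Name | Capturing group | Match |
|---|---|---|---|
| 0 | (none) | (?<year>\d{4})-(?<md>(?<month>\d{2})-(?<day>\d{2})) | 2017-04-25 |
| 1 | year | (?<year>\d{4}) | 2017 |
| 2 | md | (?<md>(?<month>\d{2})-(?<day>\d{2})) | 04-25 |
| 3 | month | (?<month>\d{2}) | 04 |
| 4 | day | (?<day>\d{2}) | 25 |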
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupDemo {
    public static final String P_NAMED = "(?<year>\\d{4})-(?<md>(?<month>\\d{2})-(?<day>\\d{2}))";
    public static final String DATE_STRING = "2017-04-25";
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(P_NAMED);
        Matcher matcher = pattern.matcher(DATE_STRING);
        if (!matcher.find()) {
            return; // group() may only be called after a successful match
        }
        System.out.printf("\n===========Access by name=============");
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group('year') value:%s", matcher.group("year"));
        System.out.printf("\nmatcher.group('md') value:%s", matcher.group("md"));
        System.out.printf("\nmatcher.group('month') value:%s", matcher.group("month"));
        System.out.printf("\nmatcher.group('day') value:%s", matcher.group("day"));
        matcher.reset(); // discard the match state and start again from the beginning
        System.out.printf("\n===========Access by number=============");
        matcher.find();
        // Named groups are still numbered, so they can also be read by index
        System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
        System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
        System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
        System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
        System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
    }
}
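Group names are most useful when the captured parts need to be rearranged, because the code no longer depends on counting parentheses. Below is a minimal sketch that turns a yyyy-MM-dd string into dd/MM/yyyy in two ways: by reading groups by name, and by using the `${name}` reference that `Matcher.replaceAll` accepts in a replacement string. The class and method names are hypothetical, and the group name `day` is an assumption taken from the table above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupSketch {
    // Group names follow the table above; "day" is an assumed name.
    private static final Pattern DATE =
            Pattern.compile("(?<year>\\d{4})-(?<md>(?<month>\\d{2})-(?<day>\\d{2}))");

    // Hypothetical helper: reads the groups by name and reorders them.
    // Returns null when the whole input is not a yyyy-MM-dd date.
    static String toDayFirst(String input) {
        Matcher matcher = DATE.matcher(input);
        if (!matcher.matches()) {
            return null;
        }
        return matcher.group("day") + "/" + matcher.group("month") + "/" + matcher.group("year");
    }

    // Same reordering, using ${name} references in the replacement string.
    static String toDayFirstByReplace(String input) {
        return DATE.matcher(input).replaceAll("${day}/${month}/${year}");
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("2017-04-25"));          // 25/04/2017
        System.out.println(toDayFirstByReplace("2017-04-25")); // 25/04/2017
    }
}
```

Note that `toDayFirstByReplace` leaves non-matching input unchanged, whereas `toDayFirst` returns null for it; which behavior is preferable depends on the caller.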
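- Non-capturing groups
A left parenthesis immediately followed by ?:, and then the regular expression, forms a non-capturing group: (?:Expression).
For the date string 2017-04-25, the expression is:
(?:\\d{4})-((\\d{2})-(\\d{2}))
Although this regular expression has four left parentheses, it has only 3 capturing groups: the first group, (?:\d{4}), still takes part in matching but is neither numbered nor captured. Calling matcher.group(4) therefore throws an IndexOutOfBoundsException.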
| No. | Capturing group | Match |
|---|---|---|
| 0 | (?:\d{4})-((\d{2})-(\d{2})) | 2017-04-25 |
| 1 | ((\d{2})-(\d{2})) | 04-25 |
| 2 | (\d{2}) | 04 |
| 3 | (\d{2}) | 25 |
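The effect of the non-capturing group can be confirmed with a short test. This is a minimal sketch, assuming a hypothetical helper `groupIsOutOfRange` that reports whether asking for a given group number throws `IndexOutOfBoundsException`. It also prints `groupCount()`, which counts only capturing groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonCapturingSketch {
    static final String P_NON_CAPTURING = "(?:\\d{4})-((\\d{2})-(\\d{2}))";

    // Hypothetical helper: true when the pattern has no group with this number.
    static boolean groupIsOutOfRange(String regex, String input, int group) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (!matcher.find()) {
            throw new IllegalArgumentException("input does not match: " + input);
        }
        try {
            matcher.group(group);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Matcher matcher = Pattern.compile(P_NON_CAPTURING).matcher("2017-04-25");
        // groupCount() depends only on the pattern, so no find() is needed here.
        System.out.println(matcher.groupCount());                                  // 3
        System.out.println(groupIsOutOfRange(P_NON_CAPTURING, "2017-04-25", 3));   // false
        System.out.println(groupIsOutOfRange(P_NON_CAPTURING, "2017-04-25", 4));   // true
    }
}
```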