Regular Expressions: Capturing Groups and Backreferences
阿新 • Published: 2021-06-30
I. Capturing Groups
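A capturing group is the regex notion of grouping. To repeat a sequence of characters you have to group it first; a group is written with parentheses "()", and once defined it can be quantified or referenced again later in the pattern.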
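To see why grouping matters for repetition, compare a quantifier applied to a group with the same quantifier applied to a single character. This is only a minimal sketch: the class name GroupRepetitionDemo and its two helper methods are made up for illustration and do not come from the original demos.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class GroupRepetitionDemo {

    // With parentheses, {3} repeats the whole unit "ab".
    static boolean matchesGrouped(String input) {
        return Pattern.matches("(ab){3}", input);
    }

    // Without parentheses, {3} applies only to the preceding character 'b'.
    static boolean matchesUngrouped(String input) {
        return Pattern.matches("ab{3}", input);
    }

    public static void main(String[] args) {
        System.out.println(matchesGrouped("ababab"));   // true
        System.out.println(matchesGrouped("abbb"));     // false
        System.out.println(matchesUngrouped("abbb"));   // true
        System.out.println(matchesUngrouped("ababab")); // false
    }
}
```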
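Capturing groups come in two kinds: ordinary (numbered) capturing groups and named capturing groups.
The simple demos below illustrate both kinds, followed by non-capturing groups. The snippets assume that java.util.regex.Pattern and java.util.regex.Matcher are imported.
1. Ordinary capturing groups
Reading the regular expression from the left, each opening parenthesis "(" starts a new group. Group numbers start at 1; group 0 stands for the entire expression.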
public void test8() {
    String DATE_STRING = "2021-07-01";
    String P_COMM = "(\\d{4})-((\\d{2})-(\\d{2}))";
    Pattern pattern = Pattern.compile(P_COMM);
    Matcher matcher = pattern.matcher(DATE_STRING);
    // Required: group() throws IllegalStateException if no match has been attempted.
    matcher.find();
    System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
    System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
    System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
    System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
    System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
}
Output:
matcher.group(0) value:2021-07-01
matcher.group(1) value:2021
matcher.group(2) value:07-01
matcher.group(3) value:07
matcher.group(4) value:01
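Instead of calling group(n) once per index, the groups can be walked in a loop bounded by groupCount(), which reports the number of capturing groups in the pattern and does not count group 0. The sketch below assumes that only the first match is of interest; GroupListingDemo and listGroups are illustrative names, not part of the original demos.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class GroupListingDemo {

    // Returns "index=text" for every group of the first match,
    // or an empty list when the input does not match at all.
    static List<String> listGroups(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        if (matcher.find()) {
            // Group 0 is the whole match; groups 1..groupCount() are the captures.
            for (int i = 0; i <= matcher.groupCount(); i++) {
                result.add(i + "=" + matcher.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : listGroups("(\\d{4})-((\\d{2})-(\\d{2}))", "2021-07-01")) {
            System.out.println(line);
        }
    }
}
```

Checking find() before reading groups also avoids the IllegalStateException that group() raises when the input does not match.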
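2. Named capturing groups
In a named capturing group, the opening parenthesis is immediately followed by "?<name>" and then the expression itself: (?<name>Expression). Named groups are still numbered, so they can be read by name or by index.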
public void test9() {
    String P_NAMED = "(?<year>\\d{4})-(?<md>(?<month>\\d{2})-(?<date>\\d{2}))";
    String DATE_STRING = "2021-07-01";
    Pattern pattern = Pattern.compile(P_NAMED);
    Matcher matcher = pattern.matcher(DATE_STRING);
    matcher.find();
    System.out.printf("\n===========Access by name=============");
    System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
    System.out.printf("\nmatcher.group('year') value:%s", matcher.group("year"));
    System.out.printf("\nmatcher.group('md') value:%s", matcher.group("md"));
    System.out.printf("\nmatcher.group('month') value:%s", matcher.group("month"));
    System.out.printf("\nmatcher.group('date') value:%s", matcher.group("date"));
    // Reset the matcher and match again to read the same groups by index.
    matcher.reset();
    System.out.printf("\n===========Access by number=============");
    matcher.find();
    System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
    System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
    System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
    System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
    System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
}
Output:
===========Access by name=============
matcher.group(0) value:2021-07-01
matcher.group('year') value:2021
matcher.group('md') value:07-01
matcher.group('month') value:07
matcher.group('date') value:01
===========Access by number=============
matcher.group(0) value:2021-07-01
matcher.group(1) value:2021
matcher.group(2) value:07-01
matcher.group(3) value:07
matcher.group(4) value:01
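Named groups can also be referenced from a replacement string, where Java's Matcher.replaceAll accepts the ${name} syntax. As a hedged sketch, the date from the demo above can be reordered without touching group indexes; DateReorderDemo and toDayMonthYear are hypothetical names, and the pattern is a simplified variant of P_NAMED without the nested "md" group.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class DateReorderDemo {

    // Simplified variant of the article's pattern (assumption: no nested "md" group needed).
    private static final Pattern ISO_DATE =
            Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<date>\\d{2})");

    // Rewrites every yyyy-MM-dd occurrence in the text as dd/MM/yyyy.
    static String toDayMonthYear(String text) {
        return ISO_DATE.matcher(text).replaceAll("${date}/${month}/${year}");
    }

    public static void main(String[] args) {
        System.out.println(toDayMonthYear("2021-07-01")); // 01/07/2021
    }
}
```

Referring to groups by name keeps the replacement string valid even if more parentheses are later added to the pattern and the numbering shifts.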
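3. Non-capturing groups
Placing "?:" immediately after the opening parenthesis, followed by the expression, forms a non-capturing group: (?:Expression). Such a group still groups its contents but receives no number, so the capturing groups after it shift down by one.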
public void test10() {
    String P_UNCAP = "(?:\\d{4})-((\\d{2})-(\\d{2}))";
    String DATE_STRING = "2021-07-01";
    Pattern pattern = Pattern.compile(P_UNCAP);
    Matcher matcher = pattern.matcher(DATE_STRING);
    matcher.find();
    System.out.printf("\nmatcher.group(0) value:%s", matcher.group(0));
    System.out.printf("\nmatcher.group(1) value:%s", matcher.group(1));
    System.out.printf("\nmatcher.group(2) value:%s", matcher.group(2));
    System.out.printf("\nmatcher.group(3) value:%s", matcher.group(3));
    // Only three capturing groups remain, so the next call throws:
    // Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 4
    System.out.printf("\nmatcher.group(4) value:%s", matcher.group(4));
}
Output:
matcher.group(0) value:2021-07-01
matcher.group(1) value:07-01
matcher.group(2) value:07
matcher.group(3) value:01
java.lang.IndexOutOfBoundsException: No group 4
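Non-capturing groups are handy when parentheses are needed only for alternation or quantifiers and the matched text itself is of no interest. The sketch below is an illustration under assumptions: the title/surname pattern and the names NonCapturingDemo, surname and capturingGroupCount are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class NonCapturingDemo {

    // (?:Mr|Ms|Dr) groups the alternation but is not numbered,
    // so the surname is group 1 rather than group 2.
    private static final Pattern TITLE_AND_NAME =
            Pattern.compile("(?:Mr|Ms|Dr)\\. (\\w+)");

    // Returns the surname following a title, or null when nothing matches.
    static String surname(String input) {
        Matcher matcher = TITLE_AND_NAME.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    // Number of capturing groups in the pattern (group 0 is not counted).
    static int capturingGroupCount() {
        return TITLE_AND_NAME.matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(surname("Dr. Smith"));    // Smith
        System.out.println(capturingGroupCount());   // 1
    }
}
```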
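II. Backreferences
1. A backreference requires a group. A group treats everything enclosed in () as a single unit, and groups are numbered from the outside in and from left to right.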
2. Backreferences are written \1, \2, and so on.
\1 refers to the text matched by the first parenthesized group.
\2 refers to the text matched by the second parenthesized group.

Example: String regex = "^(\\d)\\1$";
This pattern matches exactly two characters: (\d) matches one digit, and \1 then requires one more character identical to whatever (\d) captured, because (\d) is group 1.
For str = "22", (\d) captures 2, so \1 also requires 2, and "22" matches.
For str = "23", (\d) captures 2, so \1 requires 2, but the next character is 3, and "23" does not match.

The demos below show backreferences in practice:

@Test
public void test1() {
    String reg = "([a-z]{3}[1-9]{3})[a-z]{3}[1-9]{3}";
    String str = "asd123asd123";
    // Plain version without a backreference: true
    System.out.println(Pattern.matches(reg, str));
}

@Test
public void test2() {
    String reg = "([a-z]{3}[1-9]{3})\\1";
    String str = "asd123asd123";
    // Version with a backreference: true
    System.out.println(Pattern.matches(reg, str));
}

@Test
public void test3() {
    String str = "1234567123123123";
    // Only "123123" can match
    Pattern p = Pattern.compile("(\\d\\d\\d)\\1");
    Matcher m = p.matcher(str);
    // 1
    System.out.println(m.groupCount());
    while (m.find()) {
        String word = m.group();
        // 123123 7 13
        System.out.println(word + " " + m.start() + " " + m.end());
    }
}

@Test
public void test4() {
    // Finds a word that occurs again later in the text, ignoring case.
    String pattern = "\\b(\\w+)\\b[\\w\\W]*\\b\\1\\b";
    Pattern p = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
    String phrase = "unique is not duplicate but unique, Duplicate is duplicate.";
    Matcher m = p.matcher(phrase);
    while (m.find()) {
        String val = m.group();
        System.out.println("Matching subsequence is \"" + val + "\"");
        System.out.println("Duplicate word: " + m.group(1) + "\n");
    }
}

@Test
public void test5() {
    String reg = "(\\w)(\\w)\\2\\1";
    String str = "abba";
    // true
    System.out.println(Pattern.matches(reg, str));
}

@Test
public void test7() {
    String reg = "([a-z]{3})([1-9]{3})\\1\\2";
    String str = "asd123asd123";
    // true
    System.out.println(Pattern.matches(reg, str));
    String reg1 = "([a-z]{3})([1-9]{3})\\2\\1";
    String str1 = "asd123123asd";
    // true
    System.out.println(Pattern.matches(reg1, str1));
}
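Backreferences also work with named groups: in Java, \k<name> plays the same role as \1 but refers to the group by name. The sketch below uses it to detect and collapse immediately repeated words. It is an illustration only; BackreferenceDemo, hasDoubledWord and collapseDoubledWords are hypothetical names, and the comparison is case-sensitive by assumption.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original article.
class BackreferenceDemo {

    // \k<word> must match exactly the text that group "word" captured;
    // the non-capturing group lets the repetition occur one or more times.
    private static final Pattern DOUBLED_WORD =
            Pattern.compile("\\b(?<word>\\w+)(?:\\s+\\k<word>\\b)+");

    // True when some word is immediately followed by itself.
    static boolean hasDoubledWord(String text) {
        return DOUBLED_WORD.matcher(text).find();
    }

    // Replaces each run of a repeated word with a single occurrence.
    static String collapseDoubledWords(String text) {
        return DOUBLED_WORD.matcher(text).replaceAll("${word}");
    }

    public static void main(String[] args) {
        System.out.println(hasDoubledWord("this is is a test"));        // true
        System.out.println(collapseDoubledWords("this is is a test"));  // this is a test
    }
}
```

Inside the pattern the group is referenced with \k<word>, while inside the replacement string the same group is referenced with ${word}.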