
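ABAP Learning (20): Using Regular Expressions

ABAP Regular Expressions

ABAP supports regular expressions.

Regular expressions can be used in:

1. The FIND and REPLACE statements;

2. Functions: count, count_xxx, contains, find, find_xxx, match, matches, replace, substring, substring_xxx;

3. Classes: CL_ABAP_REGEX, CL_ABAP_MATCHER;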

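1. Regular expression syntax rules

1.1 Single Character Patterns

Ordinary single characters: A-Z, 0-9 and so on, plus special characters that a backslash (\) escapes into ordinary characters;

Special characters: ., [, ], -, ^ act as operators; - and ^ are special only inside [];

Example:

"1.Single Character Patterns
  "Examples:
  "regex:A  string:a  result: no match
  "regex:AB string:A  result: no match
  IF cl_abap_matcher=>matches( pattern = 'A' text = 'A' ) = abap_true.
    WRITE:/ '1.true'.
  ENDIF.

  "Special operator characters: . [ ] - ^
  ". stands for any single character;
  "\ turns a special character into an ordinary character;
  "\ followed by certain letters denotes a character set (cannot be used inside []):
  "1.\C: any single character;
  "2.\d: any digit;
  "3.\D: any non-digit;
  "4.\l: any lowercase letter;
  "5.\L: any character that is not a lowercase letter;
  "6.\s: any whitespace character;
  "7.\S: any non-whitespace character;
  "8.\u: any uppercase letter;
  "9.\U: any character that is not an uppercase letter;
  "10.\w: any letter, digit or underscore;
  "11.\W: any character that is not a letter, digit or underscore;
  "[] defines a character set; the pattern matches if one character of the set matches;
  "[^x] negates the set; the pattern matches any character that is not in the set;
  "[x-x] defines a range: A-Z, a-z, 0-9 and so on;
  "Character classes predefined by ABAP:
  "1.[:alnum:] letters and digits;
  "2.[:alpha:] letters;
  "3.[:digit:] digits;
  "4.[:blank:] blanks and horizontal tabs;
  "5.[:cntrl:] all control characters;
  "6.[:graph:] displayable characters, except blanks and horizontal tabs;
  "7.[:lower:] lowercase letters;
  "8.[:print:] all displayable characters (the union of [:graph:] and [:blank:]);
  "9.[:punct:] all punctuation characters;
  "10.[:space:] all blanks, tabs and carriage returns;
  "11.[:unicode:] all characters with a code above 255 (Unicode systems only);
  "12.[:upper:] uppercase letters;
  "13.[:word:] all letters and digits plus the underscore _;
  "14.[:xdigit:] all hexadecimal digits ("0"-"9", "A"-"F" and "a"-"f");
  "Examples:
  "regex:\.       string:.   result: match
  "regex:\C       string:A   result: match
  "regex:..       string:AB  result: match
  "regex:[ABC]    string:A   result: match
  "regex:[AB][CD] string:AD  result: match
  "regex:[^A-Z]   string:1   result: match
  "regex:[A-Z-]   string:-   result: match
  IF cl_abap_matcher=>matches( pattern = '[A-Z-]' text = 'A' ) = abap_true.
    WRITE:/ '2.true'.
  ENDIF.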
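The predefined classes are only valid inside a bracket expression, which is why they appear with doubled brackets in a pattern. A minimal sketch, assuming an illustrative variable lv_text and sample value that are not part of the original listing:

```abap
DATA lv_text TYPE string.

lv_text = `Order42`.

" [[:alpha:]]+ covers the letters, \d+ the digits that follow them.
IF cl_abap_matcher=>matches( pattern = `[[:alpha:]]+\d+` text = lv_text ) = abap_true.
  WRITE / 'letters followed by digits'.
ENDIF.

" A negated set: the whole text must be free of whitespace characters.
IF cl_abap_matcher=>matches( pattern = `[^[:space:]]+` text = lv_text ) = abap_true.
  WRITE / 'no whitespace'.
ENDIF.
```

Note that \d is used outside the brackets here, in line with the rule that the backslash sets cannot be used inside [].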

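1.2 Character String Patterns

Single-character patterns can be concatenated and combined into patterns for whole strings.

Special characters: {, }, *, +, ?, (, ), |, \

Example:

 "2.Character string patterns
  "Examples:
  "regex:h[ae]llo  string:hello result: match;
  "regex:h[ae]llo  string:hallo result: match;
  IF cl_abap_matcher=>matches( pattern = 'h[ae]llo' text = 'hello' ) = abap_true.
    WRITE:/ '3.true'.
  ENDIF.

  "Special characters: { } * + ? ( ) | \
  "x{n}: x occurs exactly n times;
  "x{n,m}: x occurs n to m times;
  "x*: x occurs {0,} times;
  "x+: x occurs {1,} times;
  "x?: x occurs {0,1} times;
  "a|b: matches a or b;
  "(): groups a subexpression and stores its match in a register
  "(?:xxx): groups xxx without storing it in a register
  "\1, \2 refer back to the stored groups, numbered from left to right
  "Special characters between \Q and \E are treated as ordinary characters
  "Examples:
  "regex:hi{2}o       string:hiio   result: match
  "regex:hi{1,3}o     string:hiiio  result: match
  "regex:hi?o         string:ho     result: match
  "regex:hi*o         string:ho     result: match
  "regex:hi+o         string:hio    result: match
  "regex:.{0,4}       string: any string of 0 to 4 characters  result: match
  "regex:a|bb|c       string:bb     result: match
  "regex:h(a|b)o      string:hao    result: match
  "regex:(a|b)(?:ac)  string:bac    result: match
  "regex:(").*\1      string:"hi"   result: match
  IF cl_abap_matcher=>matches( pattern = '(a|b)(?:ac)' text = 'bac' ) = abap_true.
    WRITE:/ '4.true'.
  ENDIF.
  IF cl_abap_matcher=>matches( pattern = '(").*\1' text = '"hi"' ) = abap_true.
    WRITE:/ '5.true'.
  ENDIF.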

  DATA:TEXT type STRING.
  DATA:result_tab TYPE match_result_tab.
  DATA:wa_result_tab TYPE match_result.
  text = 'aaaaaabaaaaaaacaaaa'.
  FIND ALL OCCURRENCES OF REGEX '(a+)(a)' IN text RESULTS result_tab.
  WRITE:/ text.
  LOOP AT result_tab INTO wa_result_tab.
    WRITE:/ wa_result_tab-line,wa_result_tab-offset,wa_result_tab-length.
  ENDLOOP.

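1.3 Search Pattern

Matching at the start and end of lines, strings and words.

Example:

 "3.Search Pattern
  "Special characters: ^, $, \, (, ), =, !
  "Example 1: Start and end of a line
  "^ and $ anchor the match at the start and at the end of each line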
  text = |Line1\nLine2\nLine3|.
  FIND ALL OCCURRENCES OF REGEX '^'
     IN text RESULTS result_tab.
  WRITE:/ text.
  LOOP AT result_tab INTO wa_result_tab.
    WRITE:/ wa_result_tab-line,wa_result_tab-offset,wa_result_tab-length.
  ENDLOOP.
  FIND ALL OCCURRENCES OF REGEX '$'
     IN text RESULTS result_tab.
  WRITE:/ text.
  LOOP AT result_tab INTO wa_result_tab.
    WRITE:/ wa_result_tab-line,wa_result_tab-offset,wa_result_tab-length.
  ENDLOOP.

  "Example 2: Start and end of a character string
  "\A and \z anchor the match at the start and at the end of the whole string;
  "in an internal table that is the start of the first row and the end of the last row
  DATA:t_text(10) TYPE c.
  DATA:t_text_tab LIKE TABLE OF text.
  APPEND '     Smile' TO t_text_tab.
  APPEND '     Smile' TO t_text_tab.
  APPEND '     Smile' TO t_text_tab.
  APPEND '     Smile' TO t_text_tab.
  APPEND '     Smile' TO t_text_tab.
  APPEND '     Smile' TO t_text_tab.
  FIND ALL OCCURRENCES OF regex '\A(?:Smile)|(?:Smile)\z'
       IN TABLE t_text_tab RESULTS result_tab.
  WRITE:/ 'Smile matches'.
  LOOP AT result_tab INTO wa_result_tab.
    WRITE:/ wa_result_tab-line,wa_result_tab-offset,wa_result_tab-length.
  ENDLOOP.

  "Example 3
  "\z matches only at the very end of the string; \Z ignores trailing line breaks
  text = |... this is the end\n\n\n|.
  FIND REGEX 'end\z' IN text.
  IF sy-subrc <> 0.
    WRITE  / `There's no end.`.
  ENDIF.
  FIND  REGEX 'end\Z' IN text.
  IF sy-subrc = 0.
    WRITE / `The end is near the end.`.
  ENDIF.

  "Example 4: Start and End of Word
  "\< and \> match the start and the end of a word
  "\b matches at both the start and the end of a word
  "Find every s at the start of a word
  text = `Sometimes snow seems so soft.`.
  FIND ALL OCCURRENCES OF regex '\<s'
       IN text IGNORING CASE
       RESULTS result_tab.
  WRITE:/ 'Starts with s',text.
  LOOP AT result_tab INTO wa_result_tab.
    WRITE:/ wa_result_tab-line,wa_result_tab-offset,wa_result_tab-length.
  ENDLOOP.
  "Find every s at the end of a word
  FIND ALL OCCURRENCES OF regex 's\b'
       IN text IGNORING CASE
       RESULTS result_tab.
  WRITE:/ 'Ends with s',text.
  LOOP AT result_tab INTO wa_result_tab.
    WRITE:/ wa_result_tab-line,wa_result_tab-offset,wa_result_tab-length.
  ENDLOOP.

  "Example 5: Preview Condition
  "The text matched by a preview condition is not part of the match result
  "(?=x): the match must be followed by x
  "(?!x): the match must not be followed by x
  text = `Shalalala!`.
  FIND ALL OCCURRENCES OF REGEX '(?:la)(?=!)'
       IN text RESULTS result_tab.
  WRITE:/ text.
  "Only the last 'la' matches; the '!' is not part of the match
  LOOP AT result_tab INTO wa_result_tab.
    WRITE:/ wa_result_tab-line,wa_result_tab-offset,wa_result_tab-length.
  ENDLOOP.

  "Example 6: Cut operator
  DATA:s_text TYPE string.
  DATA:moff TYPE i.
  DATA:mlen TYPE i.
  "(?>x) is the cut operator: once x has matched, the engine does not backtrack into it
  s_text = `xxaabbaaaaxx`.

  FIND REGEX 'a+b+|[ab]+' IN s_text
    MATCH OFFSET moff
    MATCH LENGTH mlen.
  WRITE:/ s_text.
  IF sy-subrc = 0.
    WRITE:/ moff.
    WRITE:/ mlen.
    WRITE:/ s_text+moff(mlen).
  ENDIF.

  FIND REGEX '(?>a+b+|[ab]+)' IN s_text
    MATCH OFFSET moff
    MATCH LENGTH mlen.
  WRITE:/ s_text.
  IF sy-subrc = 0.
    WRITE:/ moff.
    WRITE:/ mlen.
    WRITE:/ s_text+moff(mlen).
  ENDIF.

  "(?>a+|a) consumes every a and gives none back, so the trailing a cannot match
  FIND REGEX '(?>a+|a)a' IN s_text
    MATCH OFFSET moff
    MATCH LENGTH mlen.
  WRITE:/ s_text.
  IF sy-subrc <> 0.
    WRITE:/ 'Nothing found'.
  ENDIF.
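Example 5 above only exercises the positive preview condition (?=x). As a small sketch of the negative form (?!x), reusing the same sample string (the variable names lv_text, lt_results and ls_result are assumptions for illustration):

```abap
DATA lv_text    TYPE string.
DATA lt_results TYPE match_result_tab.
DATA ls_result  TYPE match_result.

lv_text = `Shalalala!`.

" (?!!) is a negative preview condition: 'la' only matches when the
" next character is not '!'. The '!' itself is never part of a match.
FIND ALL OCCURRENCES OF REGEX `la(?!!)` IN lv_text RESULTS lt_results.

LOOP AT lt_results INTO ls_result.
  WRITE: / ls_result-offset, ls_result-length.
ENDLOOP.
```

Compared with example 5, this should report the 'la' occurrences that the positive condition skipped and leave out the final one.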

1.4 Replace Patterns

Replacement patterns for the REPLACE statement.

Example:

"4.Replace Patterns
  "The REPLACE statement replaces the matched text
  "Special characters: $, &, `, '
  "Example 1: Addressing the Full Occurrence
  text = `Yeah!`.
  REPLACE REGEX `\w+` IN text WITH `$0,$&`.
  "Result: Yeah,Yeah!
  WRITE:/ text.

  "Example 2: Addressing the Registers of Subgroups
  "Swaps the first and third group; result: CBA'n'ABC
  text = `ABC'n'CBA`.
  REPLACE REGEX `(\w+)(\W\w\W)(\w+)` IN text WITH `$3$2$1`.
  WRITE:/ text.

  "Example 3: Addressing the Text Before the Occurrence
  text = `ABC and BCD`.
  REPLACE REGEX 'and' IN text WITH '$0 $`'.
  "Result: ABC and ABC  BCD
  WRITE:/ text.
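The replacement pattern $' (text after the occurrence) has no example of its own in this listing. A minimal sketch that mirrors example 3, using an illustrative variable lv_text that is not part of the original:

```abap
DATA lv_text TYPE string.

lv_text = `ABC and BCD`.

" $0 is the occurrence itself, $' is the text that follows the occurrence.
" A backquote literal is used so the quote in $' needs no escaping.
REPLACE REGEX `and` IN lv_text WITH `$0 $'`.
WRITE / lv_text.
```

Here the text following 'and' is copied in directly after it, just as $` copies in the text before the occurrence in example 3.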

1.5 Simplified Regular Expressions

A reduced syntax in which some operators are unavailable or written differently.

Example:

 "5.Simplified Regular Expressions
  "Only the class CL_ABAP_REGEX supports the simplified syntax (simple_regex = abap_true)
  "The simplified syntax does not support +, |, (?=), (?!), (?:);
  "{} is written as \{\}
  "() is written as \(\)
  "Example 1
  DATA:lo_regex TYPE REF TO cl_abap_regex.
  DATA:t_res   TYPE match_result_tab.
  DATA:wa_res  TYPE match_result.
  "Without the simplified syntax, + means {1,} occurrences of the preceding character
  CREATE OBJECT lo_regex
    EXPORTING
      pattern      = 'a+'
      ignore_case  = abap_true "ignore case
      simple_regex = abap_false.
  FIND ALL OCCURRENCES OF REGEX lo_regex IN 'aaa+bbb' RESULTS t_res.
  LOOP AT t_res INTO wa_res.
     WRITE:/ wa_res-line,wa_res-offset,wa_res-length.
  ENDLOOP.
  "With the simplified syntax, + is a literal +
  CREATE OBJECT lo_regex
    EXPORTING
      pattern      = 'a+'
      simple_regex = abap_true.
  FIND ALL OCCURRENCES OF REGEX lo_regex IN 'aaa+bbb' RESULTS t_res.
  LOOP AT t_res INTO wa_res.
     WRITE:/ wa_res-line,wa_res-offset,wa_res-length.
  ENDLOOP.

1.6 Special Characters in Regular Expressions

Example:

  "6.Special Characters in Regular Expressions
  "Special characters in regular expressions and replacement patterns
  "\ Escape character for special characters
  "$0, $& Placeholder for the whole found location
  "$1, $2, $3... Placeholder for the registration of subgroups
  "$` Placeholder for the text before the found location
  "$' Placeholder for the text after the found location

2. Using regular expressions

2.1 The FIND and REPLACE statements

Example:

 "Using the FIND and REPLACE statements
  "FIND
  "Syntax: FIND [{FIRST OCCURRENCE}|{ALL OCCURRENCES} OF] pattern
  "  IN [section_of] dobj
  "  [IN {CHARACTER|BYTE} MODE]
  "  [find_options].
  "pattern = {[SUBSTRING] substring} | {REGEX regex}
  "Searches for a substring or for a regex match
  "section_of = SECTION [OFFSET off] [LENGTH len] OF
  "Restricts the search to a section of dobj: off is the start offset, len the section length
  "find_options = [{RESPECTING|IGNORING} CASE]
  "   [MATCH COUNT  mcnt]
  "   { {[MATCH OFFSET moff]
  "   [MATCH LENGTH mlen]}
  "  | [RESULTS result_tab|result_wa] }
  "  [SUBMATCHES s1 s2 ...]
  "mcnt: number of matches; with FIRST OCCURRENCE it is always 1 when a match is found
  "moff: offset of the last match; with FIRST OCCURRENCE, of the first match
  "mlen: length of the last match; with FIRST OCCURRENCE, of the first match
  "submatches: the substrings matched by the groups
  "Example 1
  DATA:s1   TYPE string.
  DATA:s2   TYPE string.
  text = `Hey hey, my my, Rock and roll can never die`.
  FIND REGEX `(\w+)\W+\1\W+(\w+)\W+\2` IN text
       IGNORING CASE
       MATCH OFFSET moff
       MATCH LENGTH mlen
       SUBMATCHES s1 s2.
  WRITE:/ moff,mlen,s1,s2.

  "REPLACE
  "Syntax:
  "1. REPLACE [{FIRST OCCURRENCE}|{ALL OCCURRENCES} OF] pattern
  "  IN [section_of] dobj WITH new
  "  [IN {CHARACTER|BYTE} MODE]
  "  [replace_options].
  "replace_options = [{RESPECTING|IGNORING} CASE]
  "  [REPLACEMENT COUNT  rcnt]
  "  {{[REPLACEMENT OFFSET roff][REPLACEMENT LENGTH rlen]}
  " |[RESULTS result_tab|result_wa]}

  "2. REPLACE SECTION [OFFSET off] [LENGTH len] OF dobj WITH new
  " [IN {CHARACTER|BYTE} MODE].
  text = 'hello1 world!22'.
  "Pattern-based replacement, restricted to the first 10 characters
  REPLACE
    ALL OCCURRENCES OF
    REGEX '[0-9]'
    IN SECTION OFFSET 0 LENGTH 10 OF text
    WITH '!'.
  WRITE:/ text.
  "Position-based replacement of a section
  REPLACE SECTION OFFSET 10 LENGTH 5 OF text WITH '!'.
  WRITE:/ text.
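The REPLACEMENT COUNT addition from the syntax above is not used in the listing. A minimal sketch of how it might be combined with a regex (lv_text, lv_count and the sample value are assumptions for illustration):

```abap
DATA lv_text  TYPE string.
DATA lv_count TYPE i.

lv_text = `a1b22c333`.

" Replace every run of digits and record how many replacements were made.
REPLACE ALL OCCURRENCES OF REGEX `\d+` IN lv_text WITH `#`
  REPLACEMENT COUNT lv_count.

" sy-subrc is 0 when at least one replacement took place.
IF sy-subrc = 0.
  WRITE: / lv_text, lv_count.
ENDIF.
```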

2.2 Using functions

Built-in functions that accept regular expressions include find, count, match and others.

Example:

 "Using functions
  "find
  "Returns the offset of the match
  "Syntax:
  "1.find( val = text  {sub = substring}|{regex = regex}[case = case][off = off] [len = len] [occ = occ] )
  "2.find_end( val = text regex = regex [case = case][off = off] [len = len] [occ = occ] )
  "3.find_any_of( val = text  sub = substring [off = off] [len = len] [occ = occ] )
  "4.find_any_not_of( val = text  sub = substring [off = off] [len = len] [occ = occ] )
  "occ selects which occurrence is returned: positive counts from the left, negative from the right
  "Example
  DATA:mocc TYPE I VALUE 1.
  DATA:result TYPE I.
  text = 'hello world world'.
  "Search the whole string; off and len must lie within text
  moff = 0.
  mlen = strlen( text ).
  result = find( val = text sub = 'wo' case = abap_true off = moff len = mlen occ = mocc ).
  WRITE:/ text,result,moff,mlen,mocc.

  "count
  "Returns the number of matches
  "Syntax:
  "1.count( val = text  {sub = substring}|{regex = regex} [case = case][off = off] [len = len] )
  "2.count_any_of( val = text  sub = substring [off = off] [len = len] )
  "3.count_any_not_of( val = text  sub = substring [off = off] [len = len] )
  result = count( val = text sub = 'wo' case = abap_true off = moff len = mlen ).
  WRITE:/ text,result,moff,mlen.

  "match
  "Returns the matched substring
  "Syntax:
  "match( val = text regex = regex [case = case] [occ = occ] )
  DATA:s_result TYPE string.
  s_result = match( val = text regex = 'wor' case = abap_true occ = 1 ).
  WRITE:/ s_result.

  "contains
  "Returns whether the string contains the substring (boolean)
  "1.contains( val = text  sub|start|end = substring [case = case][off = off] [len = len] [occ = occ] )
  "2.contains( val = text regex = regex [case = case][off = off] [len = len] [occ = occ] )
  "3.contains_any_of( val = text sub|start|end = substring [off = off] [len = len] [occ = occ] )
  "4.contains_any_not_of( val = text sub|start|end = substring [off = off] [len = len] [occ = occ] )
  "off: start offset of the search
  "len: length of the search range, counted from off
  "occ: required number of occurrences; returns false if the substring occurs fewer than occ times
  "case: case-sensitive search
  text = 'abcdef egg help'.
  IF contains( val = text sub = 'e' case = abap_true off = 0 len = 15 occ = 2 ).
    WRITE:/ 'contains: match found'.
  ENDIF.

  "matches
  "Returns whether the string matches the regex (boolean)
  "Syntax: matches( val = text regex = regex [case = case] [off = off] [len = len] ) ...
  "Example:
  text = '[email protected]'.
  "Validate an e-mail address
  IF matches( val   = text
              regex = `\w+(\.\w+)*@(\w+\.)+((\l|\u){2,4})` ).
    MESSAGE 'Format OK' TYPE 'S'.
  ELSEIF matches(
           val   = text
           regex = `[[:alnum:],!#\$%&'\*\+/=\?\^_``\{\|}~-]+`     &
                  `(\.[[:alnum:],!#\$%&'\*\+/=\?\^_``\{\|}~-]+)*` &
                  `@[[:alnum:]-]+(\.[[:alnum:]-]+)*`              &
                  `\.([[:alpha:]]{2,})` ).
    MESSAGE 'Syntax OK but unusual' TYPE 'S' DISPLAY LIKE 'W'.
  ELSE.
    MESSAGE 'Wrong Format' TYPE 'S' DISPLAY LIKE 'E'.
  ENDIF.

  "replace
  "1.replace( val = text [off = off] [len = len] with = new )
  "Replaces the section given by off and len
  "If only off is given (len = 0), new is inserted at offset off;
  "If only len is given (off = 0), the first len characters are replaced;
  "If off equals the string length (len = 0), new is appended to the string;
  "2.replace( val = text {sub = substring}|{regex = regex} with = new [case = case] [occ = occ] )
  "Replaces the matched substring
  "occ selects which occurrence is replaced (0 replaces all occurrences)
  "Example:
  text = 'hello world! welcome china!'.
  text = replace( val = text off = 0 len = 5 with = 'hi' ).
  WRITE:/ 'replace:',text.
  "Only the first '!' is replaced here
  text = replace( val = text sub = '!' with = '.' case = abap_true occ = 1 ).
  WRITE:/ 'replace:',text.

  "substring
  "Returns a substring
  "1.substring( val = text [off = off] [len = len] )
  "2.substring_from( val = text {sub = substring}|{regex = regex}[case = case] [occ = occ] [len = len]  )
  "3.substring_after( val = text {sub = substring}|{regex = regex}[case = case] [occ = occ] [len = len] )
  "4.substring_before( val = text {sub = substring}|{regex = regex}[case = case] [occ = occ] [len = len]  )
  "5.substring_to( val = text {sub = substring}|{regex = regex}[case = case] [occ = occ] [len = len]  )
  text = 'ABCDEFGHJKLMN'.
  text = substring( val = text off = 0 len = 10 ).
  WRITE:/ 'substring:',text.
  "Returns ABCDE: the substring starting at the match, limited to len characters
  text = 'ABCDEFGHJKLMN'.
  text = substring_from( val = text sub = 'ABCDEF' case = abap_true occ = 1 len = 5 ).
  WRITE:/ 'substring:',text.
  "Returns DEFGH: the len characters after the match
  text = 'ABCDEFGHJKLMN'.
  text = substring_after( val = text sub = 'ABC' case = abap_true occ = 1 len = 5 ).
  WRITE:/ 'substring:',text.
  "Returns DEFGH: the len characters before the match
  text = 'ABCDEFGHJKLMN'.
  text = substring_before( val = text sub = 'JKL' case = abap_true occ = 1 len = 5 ).
  WRITE:/ 'substring:',text.
  "Returns GHJKL: the len characters up to and including the match
  text = 'ABCDEFGHJKLMN'.
  text = substring_to( val = text sub = 'JKL' case = abap_true occ = 1 len = 5 ).
  WRITE:/ 'substring:',text.
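The substring examples above all use sub; the same functions also accept regex, as their syntax lines show. A short sketch under that assumption (lv_text, lv_result and the sample text are illustrative, not from the original):

```abap
DATA lv_text   TYPE string.
DATA lv_result TYPE string.

lv_text = `Invoice 4711: open`.

" Everything after the first run of digits, the colon and the blank.
lv_result = substring_after( val = lv_text regex = `\d+:\s` ).
WRITE / lv_result.

" The first run of digits itself, extracted with match( ).
lv_result = match( val = lv_text regex = `\d+` ).
WRITE / lv_result.
```

Using regex instead of sub is mainly useful when the delimiter is not a fixed string, as with the variable-length number here.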

2.3 Using cl_abap_regex and cl_abap_matcher

The class cl_abap_regex creates a regular expression; cl_abap_matcher performs matching, searching, replacing and similar operations.

Example:

 "Using the classes
  "CL_ABAP_REGEX
  "CL_ABAP_MATCHER
  DATA:lo_matcher TYPE REF TO cl_abap_matcher.
  DATA:ls_match TYPE match_result.
  DATA:lv_match TYPE C LENGTH 1.
  "Call the static method matches of cl_abap_matcher directly
  IF cl_abap_matcher=>matches( pattern = 'ABC.*' text = 'ABCDABCE' ) = abap_true.
    "Returns the matcher object of the last static call
    lo_matcher = cl_abap_matcher=>get_object( ).
    "Get the match result
    ls_match = lo_matcher->get_match( ).
    "Attributes of cl_abap_matcher
    "text: the string being matched
    "table: the table being matched
    "regex: the regular expression used
    WRITE:/ 'cl_abap_matcher:',lo_matcher->text,ls_match-offset,ls_match-length.
  ENDIF.

  "Create a matcher object, then match
  lo_matcher = cl_abap_matcher=>create( pattern = 'A.*'
    ignore_case = abap_true
    text        = 'ABC' ).
  "Match result: 'X' if the text matches, blank otherwise
  lv_match = lo_matcher->match( ).
  WRITE:/ 'cl_abap_matcher:',lv_match.

  "Create a cl_abap_regex regular expression object
  "and create the matcher from the regex object
  "(lo_regex is already declared in section 1.5)
  "DATA: lo_regex TYPE REF TO cl_abap_regex.
  CREATE OBJECT lo_regex EXPORTING pattern = '^add.*' ignore_case = abap_true.
  lo_matcher = lo_regex->create_matcher( text = 'addition' ).
  lv_match = lo_matcher->match( ).
  WRITE:/ 'cl_abap_matcher:',lv_match.


  "Create the matcher object with its constructor
  DATA:t_result_tab TYPE MATCH_RESULT_TAB.
  DATA:s_result_tab TYPE MATCH_RESULT.
  CREATE OBJECT lo_regex EXPORTING pattern = 'A'.
  CREATE OBJECT lo_matcher EXPORTING REGEX = lo_regex TEXT = 'ABCDABCD'.
  t_result_tab = lo_matcher->find_all( ).
  LOOP AT t_result_tab INTO s_result_tab.
    WRITE:/ 'find_all:',s_result_tab-offset,s_result_tab-length.
  ENDLOOP.
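Besides find_all( ), a matcher can be stepped through one occurrence at a time, which makes it easy to read the subgroups of each match. A minimal sketch, assuming the find_next( ) and get_submatch( ) methods of cl_abap_matcher; the variable names and the sample text are hypothetical:

```abap
DATA lo_m     TYPE REF TO cl_abap_matcher.
DATA lv_key   TYPE string.
DATA lv_value TYPE string.

lo_m = cl_abap_matcher=>create( pattern = `(\w+)=(\d+)`
                                text    = `a=1, b=22` ).

" find_next( ) moves to the next occurrence and returns abap_true
" as long as one is found.
WHILE lo_m->find_next( ) = abap_true.
  " get_submatch( n ) returns the text of the n-th group of the current match.
  lv_key   = lo_m->get_submatch( 1 ).
  lv_value = lo_m->get_submatch( 2 ).
  WRITE: / lv_key, lv_value.
ENDWHILE.
```

The submatches are copied into variables first so that the WRITE statement only receives plain data objects, which keeps the sketch usable on older releases.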