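ABAP Learning (20): Using Regular Expressions
阿新 • Published: 2020-10-16
ABAP Regular Expressions
ABAP supports regular expressions.
Regular expressions can be used in:
1. The FIND and REPLACE statements;
2. Functions: count, count_xxx, contains, find, find_xxx, match, matches, replace, substring, substring_xxx;
3. Classes: CL_ABAP_REGEX, CL_ABAP_MATCHER.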
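As a rough illustration of the three entry points listed above, the following sketch applies regular expressions through a statement, a built-in function, and a class method. The report name, variable names, and sample text are hypothetical and chosen only for this illustration.

```abap
REPORT zregex_three_ways. " hypothetical report name

DATA lv_text  TYPE string.
DATA lv_count TYPE i.
DATA lv_ok    TYPE abap_bool.

lv_text = `a1b22c333`.

" 1. Statement: count all runs of digits with FIND.
FIND ALL OCCURRENCES OF REGEX `\d+` IN lv_text MATCH COUNT lv_count.
WRITE: / 'FIND statement:', lv_count.

" 2. Built-in function: the same count, written as an expression.
lv_count = count( val = lv_text regex = `\d+` ).
WRITE: / 'count function:', lv_count.

" 3. Class: check whether the whole text is a sequence of
"    "one lowercase letter followed by digits" groups.
lv_ok = cl_abap_matcher=>matches( pattern = `([a-z]\d+)+`
                                  text    = lv_text ).
WRITE: / 'cl_abap_matcher=>matches:', lv_ok.
```

The statement form suits procedural code that needs offsets and lengths, the function form can be used inside expressions, and the class form is convenient when the same pattern or matcher is reused.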
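1. Regular Expression Syntax Rules
1.1 Single Character Patterns
Ordinary single characters: individual characters such as A-Z and 0-9, plus special characters that have been turned into literals by escaping them with a backslash (\).
Special characters: ., [, ], -, and ^ act as operators; - and ^ have a special meaning only inside [].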
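Example:
"1. Single Character Patterns
"Examples:
"regex: A     string: a    result: no match
"regex: AB    string: A    result: no match
IF cl_abap_matcher=>matches( pattern = 'A' text = 'A' ) = abap_true.
  WRITE:/ '1.true'.
ENDIF.

"Special operator characters: . [ ] - ^
". stands for any single character;
"\ turns the special character that follows it into a literal;
"\ followed by certain letters stands for a set of characters
"(the negated forms \D, \L, \S, \U, \W are not allowed inside []):
"1.  \C: any single character;
"2.  \d: a digit;
"3.  \D: a non-digit;
"4.  \l: a lowercase letter;
"5.  \L: a non-lowercase character;
"6.  \s: a blank character;
"7.  \S: a non-blank character;
"8.  \u: an uppercase letter;
"9.  \U: a non-uppercase character;
"10. \w: a letter, digit, or underscore;
"11. \W: any character other than a letter, digit, or underscore;
"[] defines a character set; the pattern matches if one character of the set matches;
"[^x] negates the set; the pattern matches any character that is not in the set;
"[x-x] defines a range inside a set, such as A-Z, a-z, 0-9;

"Character classes defined by ABAP
"1.  [:alnum:]   letters and digits;
"2.  [:alpha:]   letters;
"3.  [:digit:]   digits;
"4.  [:blank:]   blanks and horizontal tabs;
"5.  [:cntrl:]   all control characters;
"6.  [:graph:]   displayable characters, except blanks and horizontal tabs;
"7.  [:lower:]   lowercase letters;
"8.  [:print:]   all displayable characters (union of [:graph:] and [:blank:]);
"9.  [:punct:]   all punctuation characters;
"10. [:space:]   all blanks, tabs, and carriage returns;
"11. [:unicode:] all characters with a code greater than 255 (Unicode systems only);
"12. [:upper:]   all uppercase letters;
"13. [:word:]    all letters and digits, including the underscore _;
"14. [:xdigit:]  all hexadecimal digits ("0"-"9", "A"-"F", and "a"-"f");

"Examples:
"regex: \.         string: .    result: match
"regex: \C         string: A    result: match
"regex: ..         string: AB   result: match
"regex: [ABC]      string: A    result: match
"regex: [AB][CD]   string: AD   result: match
"regex: [^A-Z]     string: 1    result: match
"regex: [A-Z-]     string: -    result: match
IF cl_abap_matcher=>matches( pattern = '[A-Z-]' text = 'A' ) = abap_true.
  WRITE:/ '2.true'.
ENDIF.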
1.2 Character String Patterns
Several patterns are concatenated and matched as one sequence.
Special characters: {, }, *, +, ?, (, ), |, \
Example:
"2. Character string patterns
"Examples:
"regex: h[ae]llo   string: hello   result: match
"regex: h[ae]llo   string: hallo   result: match
IF cl_abap_matcher=>matches( pattern = 'h[ae]llo' text = 'hello' ) = abap_true.
  WRITE:/ '3.true'.
ENDIF.

"Special characters: { } * + ? ( ) | \
"x{n}:     x occurs exactly n times;
"x{n,m}:   x occurs n to m times;
"x*:       x occurs {0,} times;
"x+:       x occurs {1,} times;
"x?:       x occurs {0,1} times;
"a|b:      matches a or b;
"():       groups a subexpression and captures what it matches;
"(?:xxx):  groups xxx without capturing it;
"\1, \2:   refer back to the captured groups, numbered from left to right;
"\Q...\E:  special characters between \Q and \E are treated as literals;
"Examples:
"regex: hi{2}o        string: hiio    result: match
"regex: hi{1,3}o      string: hiiio   result: match
"regex: hi?o          string: ho      result: match
"regex: hi*o          string: ho      result: match
"regex: hi+o          string: hio     result: match
"regex: .{0,4}        matches any 0 to 4 characters
"regex: a|bb|c        string: bb      result: match
"regex: h(a|b)o       string: hao     result: match
"regex: (a|b)(?:ac)   string: bac     result: match
"regex: (").*\1       string: "hi"    result: match
IF cl_abap_matcher=>matches( pattern = '(a|b)(?:ac)' text = 'bac' ) = abap_true.
  WRITE:/ '4.true'.
ENDIF.
IF cl_abap_matcher=>matches( pattern = '(").*\1' text = '"hi"' ) = abap_true.
  WRITE:/ '5.true'.
ENDIF.

DATA: text TYPE string.
DATA: result_tab TYPE match_result_tab.
DATA: wa_result_tab TYPE match_result.
text = 'aaaaaabaaaaaaacaaaa'.
FIND ALL OCCURRENCES OF REGEX '(a+)(a)' IN text RESULTS result_tab.
WRITE:/ text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.
1.3 Search Patterns
Anchors that match the start and end of lines, strings, and words.
Example:
"3. Search Pattern
"Special characters: ^ $ \ ( ) = !

"Example 1: Start and end of a line
"^ and $ match at the start and at the end of every line
text = |Line1\nLine2\nLine3|.
FIND ALL OCCURRENCES OF REGEX '^' IN text RESULTS result_tab.
WRITE:/ text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.
FIND ALL OCCURRENCES OF REGEX '$' IN text RESULTS result_tab.
WRITE:/ text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.

"Example 2: Start and end of a character string
"\A and \z match only at the very start and the very end of a string;
"each table row is searched as a string of its own
DATA: t_text(10) TYPE c.
DATA: t_text_tab LIKE TABLE OF t_text.
"The blanks shift Smile across the 10-character row
APPEND '     Smile' TO t_text_tab.
APPEND '    Smile ' TO t_text_tab.
APPEND '   Smile  ' TO t_text_tab.
APPEND '  Smile   ' TO t_text_tab.
APPEND ' Smile    ' TO t_text_tab.
APPEND 'Smile     ' TO t_text_tab.
FIND ALL OCCURRENCES OF REGEX '\A(?:Smile)|(?:Smile)\z'
  IN TABLE t_text_tab RESULTS result_tab.
WRITE:/ 'Smile matches'.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.

"Example 3
"\z matches only at the very end of the string;
"\Z also matches in front of trailing line breaks
text = |... this is the end\n\n\n|.
FIND REGEX 'end\z' IN text.
IF sy-subrc <> 0.
  WRITE / `There's no end.`.
ENDIF.
FIND REGEX 'end\Z' IN text.
IF sy-subrc = 0.
  WRITE / `The end is near the end.`.
ENDIF.

"Example 4: Start and End of Word
"\< matches at the start of a word, \> at the end of a word;
"\b matches at either word boundary
"Find words that start with s
text = `Sometimes snow seems so soft.`.
FIND ALL OCCURRENCES OF REGEX '\<s' IN text IGNORING CASE RESULTS result_tab.
WRITE:/ 'starts with s', text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.
"s followed by a word boundary, that is, words that end with s
FIND ALL OCCURRENCES OF REGEX 's\b' IN text IGNORING CASE RESULTS result_tab.
WRITE:/ 'ends with s', text.
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.

"Example 5: Preview Condition
"The previewed text is checked but is not part of the match
"(?=x): matches only if x follows
"(?!x): matches only if x does not follow
text = `Shalalala!`.
FIND ALL OCCURRENCES OF REGEX '(?:la)(?=!)' IN text RESULTS result_tab.
WRITE:/ text.
"Only the last 'la' is found; the '!' is not part of the match
LOOP AT result_tab INTO wa_result_tab.
  WRITE:/ wa_result_tab-line, wa_result_tab-offset, wa_result_tab-length.
ENDLOOP.

"Example 6: Cut operator
"Inside (?>...) the first alternative that matches is kept,
"and the group is not re-entered by backtracking
DATA: s_text TYPE string.
DATA: moff TYPE i.
DATA: mlen TYPE i.
s_text = `xxaabbaaaaxx`.
WRITE:/ s_text.
"Without the cut operator
FIND REGEX 'a+b+|[ab]+' IN s_text MATCH OFFSET moff MATCH LENGTH mlen.
IF sy-subrc = 0.
  WRITE:/ moff.
  WRITE:/ mlen.
  WRITE:/ s_text+moff(mlen).
ENDIF.
"With the cut operator
FIND REGEX '(?>a+b+|[ab]+)' IN s_text MATCH OFFSET moff MATCH LENGTH mlen.
IF sy-subrc = 0.
  WRITE:/ moff.
  WRITE:/ mlen.
  WRITE:/ s_text+moff(mlen).
ENDIF.
"The cut group consumes every a and gives none back, so the trailing a cannot match
FIND REGEX '(?>a+|a)a' IN s_text MATCH OFFSET moff MATCH LENGTH mlen.
IF sy-subrc <> 0.
  WRITE:/ 'Nothing found'.
ENDIF.
1.4 Replace Patterns
Replacement patterns used with REPLACE.
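Example:
"4. Replace Patterns
"The REPLACE keyword replaces the matched text
"Special characters: $ & ` '

"Example 1: Addressing the Full Occurrence
"$0 and $& both stand for the whole match; the result is Yeah,Yeah!
text = `Yeah!`.
REPLACE REGEX `\w+` IN text WITH `$0,$&`.
WRITE:/ text.

"Example 2: Addressing the Registers of Subgroups
"Reorders the captured groups; the result is CBA'n'ABC
text = `ABC'n'CBA`.
REPLACE REGEX `(\w+)(\W\w\W)(\w+)` IN text WITH `$3$2$1`.
WRITE:/ text.

"Example 3: Addressing the Text Before the Occurrence
"$` stands for the text in front of the match (here 'ABC '),
"so a copy of it is inserted after 'and'
text = `ABC and BCD`.
REPLACE REGEX 'and' IN text WITH '$0 $`'.
WRITE:/ text.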
1.5 Simplified Regular Expressions
Simplified regular expressions use a reduced syntax.
Example:
"5. Simplified Regular Expressions
"The simplified syntax is available only through the class CL_ABAP_REGEX
"It does not support +, |, (?=), (?!), (?:);
"{} is written as \{\}
"() is written as \(\)

"Example 1
DATA: lo_regex TYPE REF TO cl_abap_regex.
DATA: t_res TYPE match_result_tab.
DATA: wa_res TYPE match_result.

"Without the simplified syntax, + means the preceding character occurs {1,} times
CREATE OBJECT lo_regex
  EXPORTING
    pattern      = 'a+'
    ignore_case  = abap_true   "ignore case
    simple_regex = abap_false.
FIND ALL OCCURRENCES OF REGEX lo_regex IN 'aaa+bbb' RESULTS t_res.
LOOP AT t_res INTO wa_res.
  WRITE:/ wa_res-line, wa_res-offset, wa_res-length.
ENDLOOP.

"With the simplified syntax, + is an ordinary + character
CREATE OBJECT lo_regex
  EXPORTING
    pattern      = 'a+'
    simple_regex = abap_true.
FIND ALL OCCURRENCES OF REGEX lo_regex IN 'aaa+bbb' RESULTS t_res.
LOOP AT t_res INTO wa_res.
  WRITE:/ wa_res-line, wa_res-offset, wa_res-length.
ENDLOOP.
1.6 Special Characters in Regular Expressions
Example:
"6. Special Characters in Regular Expressions
"Special expressions in regular expressions
"\              Escape character for special characters
"$0, $&         Placeholder for the whole found location
"$1, $2, $3...  Placeholder for the registers of subgroups
"$`             Placeholder for the text before the found location
"$'             Placeholder for the text after the found location
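The earlier replace examples cover $0, $&, $1 to $3, and $`, but not $' or the escape character. The following is a minimal sketch of those two, assuming the placeholder semantics listed above; the report name, variable name, and sample texts are hypothetical.

```abap
REPORT zregex_placeholders. " hypothetical report name

DATA lv_text TYPE string.

" $' stands for the text after the match (here: def).
lv_text = `abc-def`.
REPLACE REGEX `-` IN lv_text WITH `<$'>`.
WRITE / lv_text.

" \ escapes $, so a literal dollar sign is inserted in front of
" the content of the first subgroup ($1).
lv_text = `price 10`.
REPLACE REGEX `(\d+)` IN lv_text WITH `\$$1`.
WRITE / lv_text.
```

Under these assumptions the first replacement should yield abc<def>def and the second price $10.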
2. Using Regular Expressions
2.1 The FIND and REPLACE Keywords
Example:
"Using the FIND and REPLACE keywords
"FIND
"Syntax: FIND [{FIRST OCCURRENCE}|{ALL OCCURRENCES} OF] pattern
"          IN [section_of] dobj
"          [IN {CHARACTER|BYTE} MODE]
"          [find_options].
"pattern = {[SUBSTRING] substring} | {REGEX regex}
"  searches for a substring or for a regex match
"section_of = SECTION [OFFSET off] [LENGTH len] OF
"  restricts the search to a section of dobj: off is where it starts, len is its length
"find_options = [{RESPECTING|IGNORING} CASE]
"               [MATCH COUNT mcnt]
"               { {[MATCH OFFSET moff]
"                  [MATCH LENGTH mlen]}
"               | [RESULTS result_tab|result_wa] }
"               [SUBMATCHES s1 s2 ...]
"mcnt: number of matches; with FIRST OCCURRENCE it is always 1 when a match is found
"moff: offset of the last match; with FIRST OCCURRENCE, of the first match
"mlen: length of the last match; with FIRST OCCURRENCE, of the first match
"submatches: the text captured by the groups

"Example 1
DATA: s1 TYPE string.
DATA: s2 TYPE string.
text = `Hey hey, my my, Rock and roll can never die`.
FIND REGEX `(\w+)\W+\1\W+(\w+)\W+\2` IN text
  IGNORING CASE
  MATCH OFFSET moff
  MATCH LENGTH mlen
  SUBMATCHES s1 s2.
WRITE:/ moff, mlen, s1, s2.

"REPLACE
"Syntax:
"1. REPLACE [{FIRST OCCURRENCE}|{ALL OCCURRENCES} OF] pattern
"     IN [section_of] dobj WITH new
"     [IN {CHARACTER|BYTE} MODE]
"     [replace_options].
"replace_options = [{RESPECTING|IGNORING} CASE]
"                  [REPLACEMENT COUNT rcnt]
"                  {{[REPLACEMENT OFFSET roff][REPLACEMENT LENGTH rlen]}
"                  |[RESULTS result_tab|result_wa]}
"2. REPLACE SECTION [OFFSET off] [LENGTH len] OF dobj WITH new
"     [IN {CHARACTER|BYTE} MODE].
text = 'hello1 world!22'.
"Pattern-based: replace every digit within the first 10 characters
REPLACE ALL OCCURRENCES OF REGEX '[0-9]'
  IN SECTION OFFSET 0 LENGTH 10 OF text WITH '!'.
WRITE:/ text.
"Position-based: replace the section given by offset and length
REPLACE SECTION OFFSET 10 LENGTH 5 OF text WITH '!'.
WRITE:/ text.
2.2 Using Built-in Functions
Built-in functions that accept regular expressions include find, count, match, and others.
Example:
"Using functions
"find
"Returns the offset of the match
"Syntax:
"1. find( val = text {sub = substring}|{regex = regex} [case = case] [off = off] [len = len] [occ = occ] )
"2. find_end( val = text regex = regex [case = case] [off = off] [len = len] [occ = occ] )
"3. find_any_of( val = text sub = substring [off = off] [len = len] [occ = occ] )
"4. find_any_not_of( val = text sub = substring [off = off] [len = len] [occ = occ] )
"occ selects which occurrence is returned: positive values count from the left,
"negative values count from the right
"Example
DATA: mocc TYPE i VALUE 1.
DATA: result TYPE i.
text = 'hello world world'.
"off and len select the section to search; here it is the whole text
moff = 0.
mlen = strlen( text ).
result = find( val = text sub = 'wo' case = abap_true off = moff len = mlen occ = mocc ).
WRITE:/ text, result, moff, mlen, mocc.

"count
"Returns the number of matches
"Syntax:
"1. count( val = text {sub = substring}|{regex = regex} [case = case] [off = off] [len = len] )
"2. count_any_of( val = text sub = substring [off = off] [len = len] )
"3. count_any_not_of( val = text sub = substring [off = off] [len = len] )
result = count( val = text sub = 'wo' case = abap_true off = moff len = mlen ).
WRITE:/ text, result, moff, mlen.

"match
"Returns the substring that matched
"Syntax:
"match( val = text regex = regex [case = case] [occ = occ] )
DATA: s_result TYPE string.
s_result = match( val = text regex = 'wor' case = abap_true occ = 1 ).
WRITE:/ s_result.

"contains
"Returns a boolean that tells whether the string contains the substring
"1. contains( val = text sub|start|end = substring [case = case] [off = off] [len = len] [occ = occ] )
"2. contains( val = text regex = regex [case = case] [off = off] [len = len] [occ = occ] )
"3. contains_any_of( val = text sub|start|end = substring [off = off] [len = len] [occ = occ] )
"4. contains_any_not_of( val = text sub|start|end = substring [off = off] [len = len] [occ = occ] )
"off:  offset at which the search starts
"len:  length of the section to search, counted from off
"occ:  required number of occurrences; if the substring occurs fewer times, the result is false
"case: whether the search is case-sensitive
text = 'abcdef egg help'.
IF contains( val = text sub = 'e' case = abap_true off = 0 len = 15 occ = 2 ).
  WRITE:/ 'contains: matched'.
ENDIF.

"matches
"Returns a boolean that tells whether the string matches the regex
"Syntax: matches( val = text regex = regex [case = case] [off = off] [len = len] ) ...
"Example:
text = '[email protected]'.
"Validate an e-mail address
IF matches( val = text regex = `\w+(\.\w+)*@(\w+\.)+((\l|\u){2,4})` ).
  MESSAGE 'Format OK' TYPE 'S'.
ELSEIF matches( val = text regex = `[[:alnum:],!#\$%&'\*\+/=\?\^_``\{\|}~-]+` &
                                   `(\.[[:alnum:],!#\$%&'\*\+/=\?\^_``\{\|}~-]+)*` &
                                   `@[[:alnum:]-]+(\.[[:alnum:]-]+)*` &
                                   `\.([[:alpha:]]{2,})` ).
  MESSAGE 'Syntax OK but unusual' TYPE 'S' DISPLAY LIKE 'W'.
ELSE.
  MESSAGE 'Wrong Format' TYPE 'S' DISPLAY LIKE 'E'.
ENDIF.

"replace
"1. replace( val = text [off = off] [len = len] with = new )
"   replaces the section given by off and len:
"   if off is set and len = 0, new is inserted at off;
"   if len is set and off = 0, the first len characters are replaced;
"   if off equals the string length and len = 0, new is appended to the string;
"2. replace( val = text {sub = substring}|{regex = regex} with = new [case = case] [occ = occ] )
"   replaces a matching substring; occ selects which occurrence is replaced (0 replaces all)
"Example:
text = 'hello world! welcome china!'.
text = replace( val = text off = 0 len = 5 with = 'hi' ).
WRITE:/ 'replace:', text.
"Only the first '!' is replaced here
text = replace( val = text sub = '!' with = '.' case = abap_true occ = 1 ).
WRITE:/ 'replace:', text.

"substring
"Returns a substring
"1. substring( val = text [off = off] [len = len] )
"2. substring_from( val = text {sub = substring}|{regex = regex} [case = case] [occ = occ] [len = len] )
"3. substring_after( val = text {sub = substring}|{regex = regex} [case = case] [occ = occ] [len = len] )
"4. substring_before( val = text {sub = substring}|{regex = regex} [case = case] [occ = occ] [len = len] )
"5. substring_to( val = text {sub = substring}|{regex = regex} [case = case] [occ = occ] [len = len] )
text = 'ABCDEFGHJKLMN'.
text = substring( val = text off = 0 len = 10 ).
WRITE:/ 'substring:', text.

"Returns ABCDE: the substring starting at the match, limited to len characters
text = 'ABCDEFGHJKLMN'.
text = substring_from( val = text sub = 'ABCDEF' case = abap_true occ = 1 len = 5 ).
WRITE:/ 'substring:', text.

"Returns DEFGH: the len characters that follow the match
text = 'ABCDEFGHJKLMN'.
text = substring_after( val = text sub = 'ABC' case = abap_true occ = 1 len = 5 ).
WRITE:/ 'substring:', text.

"Returns DEFGH: the len characters in front of the match
text = 'ABCDEFGHJKLMN'.
text = substring_before( val = text sub = 'JKL' case = abap_true occ = 1 len = 5 ).
WRITE:/ 'substring:', text.

"Returns GHJKL: the len characters up to and including the match
text = 'ABCDEFGHJKLMN'.
text = substring_to( val = text sub = 'JKL' case = abap_true occ = 1 len = 5 ).
WRITE:/ 'substring:', text.
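Most of the calls above pass sub rather than regex. As a small sketch of the same functions driven by a regular expression, the following uses only the regex parameter documented in the syntax comments; the report name, variable names, and sample text are hypothetical.

```abap
REPORT zregex_functions. " hypothetical report name

DATA lv_text   TYPE string.
DATA lv_count  TYPE i.
DATA lv_offset TYPE i.
DATA lv_second TYPE string.

lv_text = `order 12, item 345`.

" Number of digit runs in the text: 12 and 345.
lv_count = count( val = lv_text regex = `\d+` ).

" Offset of the first digit run (the 1 of 12).
lv_offset = find( val = lv_text regex = `\d+` ).

" The second digit run, selected with occ = 2.
lv_second = match( val = lv_text regex = `\d+` occ = 2 ).

WRITE: / lv_count, lv_offset, lv_second.
```

Because these are functions, the results can also be used directly inside IF conditions or string templates instead of being stored in helper variables first.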
2.3 Using cl_abap_regex and cl_abap_matcher
The class cl_abap_regex creates a regular expression object; cl_abap_matcher uses it to match, find, replace, and so on.
Example:
"Using the classes
"CL_ABAP_REGEX
"CL_ABAP_MATCHER
DATA: lo_matcher TYPE REF TO cl_abap_matcher.
DATA: ls_match TYPE match_result.
DATA: lv_match TYPE c LENGTH 1.

"Call the static method matches of cl_abap_matcher directly
IF cl_abap_matcher=>matches( pattern = 'ABC.*' text = 'ABCDABCE' ) = abap_true.
  "Get the matcher instance that the static call used
  lo_matcher = cl_abap_matcher=>get_object( ).
  "Get the match result
  ls_match = lo_matcher->get_match( ).
  "Attributes of cl_abap_matcher
  "text:  the string that was searched
  "table: the table that was searched
  "regex: the regular expression that was used
  WRITE:/ 'cl_abap_matcher:', lo_matcher->text, ls_match-offset, ls_match-length.
ENDIF.

"Create a matcher object, then match
lo_matcher = cl_abap_matcher=>create( pattern     = 'A.*'
                                      ignore_case = abap_true
                                      text        = 'ABC' ).
"Result: 'X' if the text matches, blank otherwise
lv_match = lo_matcher->match( ).
WRITE:/ 'cl_abap_matcher:', lv_match.

"Create a cl_abap_regex object for the regular expression,
"then create the matcher from it
"DATA: lo_regex TYPE REF TO cl_abap_regex.   (already declared in 1.5)
CREATE OBJECT lo_regex
  EXPORTING
    pattern     = '^add.*'
    ignore_case = abap_true.
lo_matcher = lo_regex->create_matcher( text = 'addition' ).
lv_match = lo_matcher->match( ).
WRITE:/ 'cl_abap_matcher:', lv_match.

"Create the matcher object with its constructor
DATA: t_result_tab TYPE match_result_tab.
DATA: s_result_tab TYPE match_result.
CREATE OBJECT lo_regex
  EXPORTING
    pattern = 'A'.
CREATE OBJECT lo_matcher
  EXPORTING
    regex = lo_regex
    text  = 'ABCDABCD'.
t_result_tab = lo_matcher->find_all( ).
LOOP AT t_result_tab INTO s_result_tab.
  WRITE:/ 'find_all:', s_result_tab-offset, s_result_tab-length.
ENDLOOP.
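The listing above reads offsets and lengths from the matcher. As a complementary sketch, the following walks through the matches one at a time with find_next and reads the captured groups with get_submatch; the report name, variable names, pattern, and sample text are hypothetical.

```abap
REPORT zregex_submatch. " hypothetical report name

DATA lo_matcher TYPE REF TO cl_abap_matcher.
DATA lv_key     TYPE string.
DATA lv_value   TYPE string.

" Two capturing groups: the word before and the word after '='.
lo_matcher = cl_abap_matcher=>create( pattern = `(\w+)=(\w+)`
                                      text    = `mode=debug level=3` ).

" find_next moves to the next match and returns abap_true while one exists.
WHILE lo_matcher->find_next( ) = abap_true.
  lv_key   = lo_matcher->get_submatch( index = 1 ).
  lv_value = lo_matcher->get_submatch( index = 2 ).
  WRITE: / lv_key, '->', lv_value.
ENDWHILE.
```

Stepping with find_next is useful when each match needs individual processing; find_all, as used above, is the simpler choice when only the positions are needed.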