Using Regular Expressions with C++ regex

In C++ there are three regular expression libraries to choose from: C++11 regex, POSIX regex, and Boost regex.
C++11 regex provides three main functions: regex_match, regex_search, and regex_replace.

regex_match

regex_match checks whether a regular expression matches an entire character sequence. The example below shows its overloads:

// regex_match example
#include <iostream>
#include <string>
#include <regex>

int main ()
{

  if (std::regex_match ("subject", std::regex("(sub)(.*)") ))
    std::cout << "string literal matched\n";

  std::string s ("subject");
  std::regex e ("(sub)(.*)");
  if (std::regex_match (s,e))
    std::cout << "string object matched\n";

  if ( std::regex_match ( s.begin(), s.end(), e ) )
    std::cout << "range matched\n";

  std::cmatch cm;    // same as std::match_results<const char*> cm;
  std::regex_match ("subject",cm,e);
  std::cout << "string literal with " << cm.size() << " matches\n";

  std::smatch sm;    // same as std::match_results<string::const_iterator> sm;
  std::regex_match (s,sm,e);
  std::cout << "string object with " << sm.size() << " matches\n";

  std::regex_match ( s.cbegin(), s.cend(), sm, e);
  std::cout << "range with " << sm.size() << " matches\n";

  // using explicit flags:
  std::regex_match ( "subject", cm, e, std::regex_constants::match_default );

  std::cout << "the matches were: ";
  for (unsigned i=0; i<sm.size(); ++i) {
    std::cout << "[" << sm[i] << "] ";
  }

  std::cout << std::endl;

  return 0;
}

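Output:

string literal matched
string object matched
range matched
string literal with 3 matches
string object with 3 matches
range with 3 matches
the matches were: [subject] [sub] [ject]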

regex_search

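regex_search is the other matching function; an example follows. The main difference between the two is that regex_match succeeds only if the pattern matches the entire target sequence, whereas regex_search looks for a matching substring anywhere in it.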

// regex_search example
#include <iostream>
#include <regex>
#include <string>
 
int main(){
  std::string s ("this subject has a submarine as a subsequence");
  std::smatch m;
  std::regex e ("\\b(sub)([^ ]*)");   // matches words beginning by "sub"
 
  std::cout << "Target sequence: " << s << std::endl;
  std::cout << "Regular expression: /\\b(sub)([^ ]*)/" << std::endl;
  std::cout << "The following matches and submatches were found:" << std::endl;
 
  while (std::regex_search (s,m,e)) {
    for (auto x=m.begin();x!=m.end();x++) 
      std::cout << x->str() << " ";
    std::cout << "--> ([^ ]*) match " << m.format("$2") <<std::endl;
    s = m.suffix().str();
  }
}

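Output:

Target sequence: this subject has a submarine as a subsequence
Regular expression: /\b(sub)([^ ]*)/
The following matches and submatches were found:
subject sub ject --> ([^ ]*) match ject
submarine sub marine --> ([^ ]*) match marine
subsequence sub sequence --> ([^ ]*) match sequence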

regex_replace

regex_replace replaces the text matched by a regular expression. An example follows:


// regex_replace example
#include <cstring>   // std::strlen
#include <iostream>
#include <regex>
#include <string>

int main() {
    char buf[20];
    const char *first = "axayaz";
    const char *last = first + std::strlen(first);
    std::regex rx("a"); 
    std::string fmt("A"); 
    std::regex_constants::match_flag_type fonly = 
        std::regex_constants::format_first_only; 
 
    *std::regex_replace(&buf[0], first, last, rx, fmt) = '\0'; 
    std::cout << &buf[0] << std::endl; 
 
    *std::regex_replace(&buf[0], first, last, rx, fmt, fonly) = '\0'; 
    std::cout << &buf[0] << std::endl; 
 
    std::string str("adaeaf"); 
    std::cout << std::regex_replace(str, rx, fmt) << std::endl; 
 
    std::cout << std::regex_replace(str, rx, fmt, fonly) << std::endl; 
 
    return 0; 
}

Output:

AxAyAz
Axayaz
AdAeAf
Adaeaf

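Note that the backslash character (\) must itself be escaped inside a C++ string literal, so every backslash in a pattern is written twice:

std::regex e1 ("\\d");  //  \d -> matches a digit character
std::regex e2 ("\\\\"); //  \\ -> matches a backslash character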

模式 “(a+)." 匹配 “aardvark” 將匹配到 aa,模式 "(a+?).” 匹配 “aardvark” 將匹配到 a

Single characters

[abc] matches a, b, or c.
[^xyz] matches any character other than x, y, or z.

Ranges
[a-z] matches any lowercase letter (a, b, c, …, z).
[abc1-5] matches a, b, c, or a digit from 1 to 5.