Using the POSIX Regular Expression API on Linux

阿新 • Published: 2018-11-10

1. Overview

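Regular expressions are used all the time on Linux, for example in grep, sed and find. There are currently two different standards, the Perl style and the POSIX style; they are largely the same, with some small differences. The C standard library has no regular expression support. C++11 added std::regex (header <regex>), whose design derives from the Boost regex library. Linux also ships its own set of functions for regular expressions, namely the POSIX C interface declared in <regex.h>.
The commonly used functions are: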
int regcomp (regex_t *compiled, const char *pattern, int cflags);
int regexec (const regex_t *compiled, const char *string, size_t nmatch, regmatch_t matchptr [], int eflags);
void regfree (regex_t *compiled);
size_t regerror (int errcode, const regex_t *compiled, char *buffer, size_t length);

2. Worked Example

Code: RegexDemo.c

The comments in the code explain each step, so little more needs to be said. Corrections are welcome.

/*************************************************************************
	> File Name: RegexDemo.c
	> Author: KentZhang
	> Mail: [email protected]
	> Created Time: Saturday, 2015-12-12 09:22:26
 ************************************************************************/

#include <stdio.h>
#include <sys/types.h>
#include <regex.h>
#include <string.h>
#include <stdlib.h>
#define BUFSIZE 256
int main(void){
	/************************************************************************************************
	 1. Compile the regular expression with regcomp
	 2. Match against it with regexec
	 3. Release it with regfree
	************************************************************************************************/

	char bufError[BUFSIZE] = {0};
	const char* strRule = "c[a-z]t";         // the regular expression
	const char* strSrc = "123citabcat+-cot"; // the source string to search
	regex_t reg;	                         // holds the compiled regular expression
	
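	int nResult = regcomp(&reg, strRule, 0); // compile the pattern; cflags 0 selects POSIX basic syntax
	if (0 != nResult){                       // on failure, fetch the error message and stop
		regerror(nResult, &reg, bufError, sizeof(bufError));
		printf("regcomp() failed:%s\n", bufError);
		return 1;                            // reg must not be used after a failed regcomp
	}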

	regmatch_t pm[1];                       // receives the match offsets; each call in the loop fetches one match, so length 1 is enough
	const size_t nMatch = 1;                // length of the array above
	char bufMatch[BUFSIZE];
      /**************************************************************************************************
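	    1. regmatch_t is the key structure. Its two members, rm_so and rm_eo, are the offsets of the start of
	   the matched substring and of the character just past its end, measured from the string passed to
	   regexec. Together with that string's pointer they give the matched text.
        2. The second argument of regexec points to the null-terminated source string, which is interpreted
	   according to the current locale (it does not have to be UTF-8). To find every match in a loop, keep
	   advancing this pointer past the text that has already been matched.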
      **************************************************************************************************/
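	while (0 == regexec(&reg, strSrc, nMatch, pm, 0)) { // loop until no further match is found
		size_t nLen = (size_t)(pm[0].rm_eo - pm[0].rm_so);
		if (nLen >= sizeof(bufMatch))       // truncate so the copy and its terminator always fit
			nLen = sizeof(bufMatch) - 1;
		memset(bufMatch, 0, sizeof(bufMatch));
		strncpy(bufMatch, strSrc + pm[0].rm_so, nLen); // copy out the matched text, then print it
		printf("Match result is:%s, rm_so=%d, rm_eo=%d\n", bufMatch, (int)pm[0].rm_so, (int)pm[0].rm_eo);

		strSrc += pm[0].rm_eo;             // move the pointer past this match and keep searching
		if ('\0' == *strSrc)
			break;
	}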

	regfree(&reg);
	return 0;

}

Output after compiling and running:

[email protected]:~/workspace$ ./a.out
Match result is:cit, rm_so=3, rm_eo=6
Match result is:cat, rm_so=2, rm_eo=5
Match result is:cot, rm_so=2, rm_eo=5
