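PostgreSQL regular-expression matching over 100 billion rows: Fast and Furious
The test environment is a PostgreSQL cluster of 8 hosts (16 cores per host) with 240 data nodes in total, holding 100.8 billion rows of test data.
Performance chart (figure omitted):
For faster response times, add hosts and nodes (or add CPUs and nodes) to shorten the time spent on recheck.
Data generation: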
#!/bin/bash
# Take the first 48 bits of the 128-bit MD5 hex digest of random() as text,
# which yields a random 12-character string over [0-9] and [a-f].
psql digoal digoal -c "create table t_regexp_100billion(info text) distributed randomly"
for ((i=1;i<=1008;i++))
do
psql digoal digoal -c "copy (select substring(md5(random()::text),1,12) from generate_series(1,100000000)) to stdout" | psql digoal digoal -c "copy t_regexp_100billion from stdin"
done
psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_1 on t_regexp_100billion(info)"
psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_2 on t_regexp_100billion(reverse(info))"
psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_gin on t_regexp_100billion using gin (info gin_trgm_ops)"
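As a rough illustration of the comment at the top of the script, the sketch below derives one 12-character hex string outside the database. It assumes `md5sum` and `cut` are installed, uses bash's `$RANDOM` as a stand-in for PostgreSQL's `random()`, and the function name `gen_info` is made up for this example. It also prints the size of the value space, 16^12 = 2^48.

```shell
#!/bin/bash
# Hypothetical helper: mimic substring(md5(random()::text),1,12) in the shell.
# $RANDOM is only a stand-in for PostgreSQL's random(); the values differ.
gen_info() {
  # md5sum prints 32 hex characters; keep the first 12 (48 bits).
  printf '%s' "$RANDOM$RANDOM" | md5sum | cut -c1-12
}

s=$(gen_info)
keyspace=$(( 16 ** 12 ))   # number of distinct 12-character hex strings

echo "sample value: $s"
echo "possible values: $keyspace"
```

The value space (about 2.8e14) is far larger than the 100.8 billion rows loaded, which is why the generator is expected to produce few duplicates.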
Data overview:
digoal=> select count(*) from t_regexp_100billion ;
count
--------------
100800000000
(1 row)
Time: 228721.386 ms
Table size:
digoal=> \dt+ t_regexp_100billion
List of relations
Schema | Name | Type | Owner | Size | Description
--------+---------------------+-------+--------+---------+-------------
public | t_regexp_100billion | table | digoal | 4158 GB |
(1 row)
Index sizes:
idx_t_regexp_100billion_1 2961 GB
idx_t_regexp_100billion_2 2961 GB
idx_t_regexp_100billion_gin 2300 GB
Sample of the test data:
digoal=> select * from t_regexp_100billion offset 1000000 limit 10;
info
--------------
bca0fb45367e
3051ca8a9a38
fadc91a3a4de
710b9c60417e
279dd9832cc3
f4743fe2e83b
9ce9e42d4039
65e64742fd3f
db3d0e0edc52
7cfb00bb38ec
(10 rows)
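Duplication sampling: strings built from the MD5 of random() keep duplication very low (999,750 distinct values in a 1,000,000-row sample):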
digoal=> select count(distinct info) from (select * from t_regexp_100billion offset 1299422811 limit 1000000) t;
count
--------
999750
(1 row)
Statistics:
digoal=> alter table t_regexp_100billion alter column info set statistics 10000;
ALTER TABLE
digoal=> analyze t_regexp_100billion ;
ANALYZE
schemaname | public
tablename | t_regexp_100billion
attname | info
inherited | f
null_frac | 0
avg_width | 13
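n_distinct | -0.836834 # sampled statistic: about 83.6834% of the values are distinct
most_common_vals | (pg_catalog.text){7f68d12d2205,00083380706d,00154b6d79e8,...
most_common_freqs | {1e-06,6.66667e-07,6.66667e-07,6.66667e-07,..... # the most frequent value has an estimated share of 1e-06, i.e. about 100,000 occurrences among 100 billion rows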
histogram_bounds | (pg_catalog.text){0000008123b7,00066c71c9bb,000d672de234,...
correlation | 0.000237291
most_common_elems |
most_common_elem_freqs |
elem_count_histogram |
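Actual number of occurrences of 7f68d12d2205. The blocks picked during sampling probably happened to contain this value more often than average, so the database overestimates its share: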
digoal=> select count(*) from t_regexp_100billion where info='7f68d12d2205';
-[ RECORD 1 ]
count | 54
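To make the gap between the planner's estimate and reality concrete, here is a small arithmetic sketch. It takes the row count from the `select count(*)` output, the top entry of `most_common_freqs` (1e-06), and the actual count of 54; the variable names are chosen for this example only.

```shell
#!/bin/bash
rows=100800000000   # total rows, from select count(*)
actual=54           # actual occurrences of 7f68d12d2205

# most_common_freqs gives 1e-06 for the top value, i.e. 1 row in 1,000,000.
est_top=$(( rows / 1000000 ))

echo "estimated occurrences: $est_top"
echo "actual occurrences:    $actual"
echo "overestimate factor:   $(( est_top / actual ))"   # integer division
```

The estimate is extrapolated from a sample, so a value that happens to appear a few times in the sampled blocks is scaled up to the whole table.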
digoal=> select ctid from t_regexp_100billion where info='7f68d12d2205' order by 1;
ctid
---------------
(15343,114)
(62134,39)
(96808,112)
(116492,176)
(194615,143)
(328074,116)
(364037,115)
(375240,158)
(376187,152)
(602144,81)
(664026,6)
(689501,136)
(695345,130)
(697374,126)
(714719,148)
(743169,20)
(802326,139)
(833830,41)
(839417,185)
(892417,78)
(892493,149)
(907979,52)
(967078,163)
(990313,159)
(1007998,27)
(1106961,57)
(1142731,165)
(1148427,67)
(1156654,156)
(1205854,137)
(1243429,68)
(1277287,165)
(1328836,98)
(1331727,150)
(1337534,3)
(1360947,104)
(1438970,97)
(1476941,22)
(1482022,82)
(1486307,69)
(1548445,155)
(1557209,82)
(1564980,158)
(1646685,76)
(1663018,99)
(1678604,77)
(1755845,177)
(1981937,153)
(1984723,98)
(2071955,59)
(2093147,149)
(2199794,102)
(2204957,44)
(2234820,142)
(54 rows)
Performance tests:
Prefix-match query speed:
digoal=> select ctid,tableoid,info from t_regexp_100billion where info ~ '^80ebcdd47';
ctid | tableoid | info
---------------+----------+--------------
(124741,60) | 16677 | 80ebcdd47006
(896121,64) | 16659 | 80ebcdd47006
(1124495,97) | 16659 | 80ebcdd47006
(1126474,141) | 16659 | 80ebcdd47006
(1059471,62) | 16659 | 80ebcdd47006
(1296562,115) | 16659 | 80ebcdd47006
(1190941,122) | 16659 | 80ebcdd47006
(680853,129) | 16659 | 80ebcdd47006
(1010667,15) | 16659 | 80ebcdd47006
(1386348,25) | 16659 | 80ebcdd47006
(1522827,90) | 16659 | 80ebcdd47006
(2204071,129) | 16659 | 80ebcdd47006
(1570431,114) | 16659 | 80ebcdd47006
(888185,38) | 16659 | 80ebcdd47006
(605886,160) | 16659 | 80ebcdd47006
(1306061,123) | 16659 | 80ebcdd47006
(757157,47) | 16659 | 80ebcdd47006
(1166290,83) | 16659 | 80ebcdd47006
(419730,1) | 16659 | 80ebcdd47006
(1833853,131) | 16659 | 80ebcdd47006
(964866,120) | 16659 | 80ebcdd47006
(904961,175) | 16659 | 80ebcdd47006
(984373,32) | 16659 | 80ebcdd47006
(891018,145) | 16659 | 80ebcdd47006
(1520483,121) | 16659 | 80ebcdd47006
(571001,124) | 16659 | 80ebcdd47006
(802093,55) | 16659 | 80ebcdd47006
(6831,172) | 16659 | 80ebcdd47006
(1169137,84) | 16659 | 80ebcdd47006
(77398,164) | 16659 | 80ebcdd47006
(24132,98) | 16659 | 80ebcdd47006
(564322,152) | 16659 | 80ebcdd47006
(357087,172) | 16659 | 80ebcdd47006
(1823628,60) | 16659 | 80ebcdd47006
(2153609,52) | 16659 | 80ebcdd47006
(816401,140) | 16659 | 80ebcdd47006
(542383,53) | 16662 | 80ebcdd47006
(1340971,64) | 16662 | 80ebcdd47006
(1239166,108) | 16662 | 80ebcdd47006
(2033648,39) | 16662 | 80ebcdd47006
(1890808,93) | 16662 | 80ebcdd47006
(1213124,4) | 16662 | 80ebcdd47006
(1025184,106) | 16662 | 80ebcdd47006
(620238,131) | 16662 | 80ebcdd47006
(583064,74) | 16662 | 80ebcdd47006
(1454680,42) | 16671 | 80ebcdd47006
(417385,74) | 16671 | 80ebcdd47006
(323669,61) | 16671 | 80ebcdd47006
(1759181,138) | 16671 | 80ebcdd47006
(2112157,146) | 16671 | 80ebcdd47006
(431326,92) | 16671 | 80ebcdd47006
(2097356,110) | 16671 | 80ebcdd47006
(52 rows)
Time: 3226.393 ms
digoal=> explain (analyze,verbose,buffers,costs,timing) select ctid,tableoid,info from t_regexp_100billion where info ~ '^80ebcdd47';
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0) (actual time=3085.502..3112.273 rows=52 loops=1)
Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info
Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data28, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9,
h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9,
h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_data14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_data7, h3_data8, h3_data9,
h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_data27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9,
h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9,
h6_data1, h6_data10, h6_data11, h6_data12, h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5, h6_data6, h6_data7, h6_data8, h6_data9,
h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25, h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9,
h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9
Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (info ~ '^80ebcdd47'::text)
Planning time: 0.061 ms
Execution time: 3112.296 ms
(6 rows)
Time: 3139.928 ms
Suffix-match query speed:
digoal=> select ctid,tableoid,info from t_regexp_100billion where reverse(info) ~ '^f42d12089b';
ctid | tableoid | info
---------------+----------+--------------
(124741,26) | 16677 | f3b98021d24f
(1696888,151) | 16659 | f3b98021d24f
(1278911,101) | 16659 | f3b98021d24f
(1427480,157) | 16659 | f3b98021d24f
(449192,30) | 16659 | f3b98021d24f
(1833887,81) | 16659 | f3b98021d24f
(229525,72) | 16659 | f3b98021d24f
(1353789,17) | 16659 | f3b98021d24f
(1875911,148) | 16659 | f3b98021d24f
(1847078,35) | 16659 | f3b98021d24f
(316780,156) | 16659 | f3b98021d24f
(1265453,120) | 16659 | f3b98021d24f
(100075,60) | 16659 | f3b98021d24f
(1924176,2) | 16659 | f3b98021d24f
(279583,2) | 16659 | f3b98021d24f
(1631226,23) | 16659 | f3b98021d24f
(1906666,50) | 16659 | f3b98021d24f
(1640803,116) | 16659 | f3b98021d24f
(629651,46) | 16659 | f3b98021d24f
(134982,13) | 16659 | f3b98021d24f
(380660,123) | 16659 | f3b98021d24f
(2158193,31) | 16659 | f3b98021d24f
(324901,64) | 16659 | f3b98021d24f
(1243973,160) | 16659 | f3b98021d24f
(540958,139) | 16659 | f3b98021d24f
(441475,99) | 16659 | f3b98021d24f
(1207114,121) | 16659 | f3b98021d24f
(574598,21) | 16659 | f3b98021d24f
(1253283,185) | 16659 | f3b98021d24f
(1396717,142) | 16659 | f3b98021d24f
(149738,9) | 16659 | f3b98021d24f
(764749,26) | 16659 | f3b98021d24f
(1211899,5) | 16659 | f3b98021d24f
(1626746,65) | 16659 | f3b98021d24f
(1342895,124) | 16659 | f3b98021d24f
(733794,136) | 16659 | f3b98021d24f
(417796,2) | 16659 | f3b98021d24f
(555520,163) | 16659 | f3b98021d24f
(232038,105) | 16659 | f3b98021d24f
(355107,127) | 16659 | f3b98021d24f
(352143,175) | 16662 | f3b98021d24f
(1856293,69) | 16662 | f3b98021d24f
(1405106,105) | 16662 | f3b98021d24f
(47689,79) | 16662 | f3b98021d24f
(679310,7) | 16671 | f3b98021d24f
(1076234,164) | 16671 | f3b98021d24f
(46 rows)
Time: 3140.835 ms
digoal=> explain (verbose,costs,timing,buffers,analyze) select ctid,tableoid,info from t_regexp_100billion where reverse(info) ~ '^f42d12089b';
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0) (actual time=3085.738..3112.216 rows=46 loops=1)
Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info
Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data28, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9,
h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9,
h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_data14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_data7, h3_data8, h3_data9,
h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_data27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9,
h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9,
h6_data1, h6_data10, h6_data11, h6_data12, h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5, h6_data6, h6_data7, h6_data8, h6_data9,
h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25, h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9,
h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9
Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (reverse(info) ~ '^f42d12089b'::text)
Planning time: 0.063 ms
Execution time: 3112.236 ms
(6 rows)
Time: 3139.890 ms
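The suffix query works by turning a suffix match on info into a prefix match on reverse(info), which is what the expression index on reverse(info) is there for. A minimal sketch of that rewrite, in pure bash, with a made-up helper name `reverse_str` and the suffix `b98021d24f` taken from the result rows above:

```shell
#!/bin/bash
# Hypothetical helper: reverse a string character by character.
reverse_str() {
  local s=$1 out="" i
  for (( i = ${#s} - 1; i >= 0; i-- )); do
    out+=${s:i:1}
  done
  printf '%s' "$out"
}

suffix="b98021d24f"                      # what we want info to end with
pattern="^$(reverse_str "$suffix")"      # anchored pattern for reverse(info)
value="f3b98021d24f"                     # one of the rows returned above

echo "pattern for reverse(info): $pattern"
if [[ $(reverse_str "$value") =~ $pattern ]]; then
  echo "$value ends with $suffix"
fi
```

The client has to reverse the search string itself before sending the query, exactly as the literal '^f42d12089b' in the SQL does.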
Query speed for a pattern unanchored at both ends (infix match):
digoal=> select ctid,tableoid,info from t_regexp_100billion where info ~ 'e7add04871';
ctid | tableoid | info
---------------+----------+--------------
(124741,45) | 16677 | be7add048713
(49315,69) | 16659 | be7add048713
(1770876,21) | 16659 | be7add048713
(199079,143) | 16659 | be7add048713
(151110,141) | 16659 | be7add048713
(1597384,137) | 16659 | be7add048713
(1693453,25) | 16659 | be7add048713
(101576,132) | 16659 | be7add048713
(1110249,50) | 16659 | be7add048713
(792326,68) | 16659 | be7add048713
(1676705,68) | 16659 | be7add048713
(1269148,101) | 16659 | be7add048713
(1027442,113) | 16659 | be7add048713
(1078144,100) | 16659 | be7add048713
(584038,141) | 16659 | be7add048713
(1245454,80) | 16659 | be7add048713
(1551184,102) | 16659 |
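The pattern in this query is not anchored, so a b-tree index cannot narrow the search; the pg_trgm GIN index created in the data generation script is presumably the one that helps here. As a simplified picture (not pg_trgm's actual implementation, which parses the regular expression and handles padding), a trigram index splits the pattern into 3-character windows, finds rows that contain all of them, and then rechecks each candidate against the real pattern. The function names below are hypothetical.

```shell
#!/bin/bash
# Print every 3-character window of a string, one per line.
trigrams() {
  local s=$1 i
  for (( i = 0; i + 3 <= ${#s}; i++ )); do
    printf '%s\n' "${s:i:3}"
  done
}

# A value is a candidate if it contains every trigram of the pattern.
is_candidate() {
  local value=$1 pattern=$2 t
  for t in $(trigrams "$pattern"); do
    [[ $value == *"$t"* ]] || return 1
  done
  return 0
}

pattern="e7add04871"
value="be7add048713"

trigrams "$pattern" | tr '\n' ' '; echo
# Recheck step: candidates are verified against the pattern itself.
if is_candidate "$value" "$pattern" && [[ $value =~ $pattern ]]; then
  echo "$value: candidate, confirmed by recheck"
fi
```

The recheck is needed because containing all trigrams does not guarantee a match, and that is the step the opening section says can be shortened by adding nodes or CPUs.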