How to use Python's re.sub

import re
text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))
JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on…
print(re.sub(r'is\s+', '-', text))
JGood -a handsome boy, he -cool, clever, and so on…
print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy, he is cool, clever, and so on…
text = "JGood is a handsome boy , he is cool , clever , and so on…"
print(re.sub(r'\s+,\s+', ',', text))
JGood is a handsome boy,he is cool,clever,and so on…
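Many references introduce it as follows:

re.sub
  re.sub replaces the matches of a pattern in a string. The following example replaces the whitespace ' ' in a string with '-':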

import re

text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))

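The signature of re.sub is: re.sub(pattern, repl, string, count=0, flags=0)

The second argument, repl, is the replacement string; in this example it is '-'.

The fourth argument, count, is the maximum number of matches to replace. It defaults to 0, which means every match is replaced.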
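
To make the effect of count concrete, here is a small sketch using the same example sentence (with a plain "..." at the end; the variable names are only for illustration). It compares the default behaviour with count=2:

```python
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."

# count=0 (the default) replaces every match.
all_replaced = re.sub(r"\s+", "-", text)

# count=2 stops after the first two matches, scanning left to right.
first_two = re.sub(r"\s+", "-", text, count=2)

print(all_replaced)  # JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on...
print(first_two)     # JGood-is-a handsome boy, he is cool, clever, and so on...
```

Passing count by keyword keeps the call readable and avoids confusing it with the flags argument that follows it.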

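re.sub also accepts a function as repl, for more complex handling of each match. For example, re.sub(r'\s', lambda m: '[' + m.group(0) + ']', text, count=0) replaces each space ' ' in the string with '[ ]'.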
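
A slightly fuller sketch of the function form, assuming a shortened sample sentence. The helper name bracket and the second input string are made up for illustration; the function receives a match object and returns the replacement text for that match:

```python
import re

text = "JGood is a handsome boy"

def bracket(match):
    # match.group(0) is the full text matched by the pattern.
    return "[" + match.group(0) + "]"

bracketed = re.sub(r"\s", bracket, text)
print(bracketed)  # JGood[ ]is[ ]a[ ]handsome[ ]boy

# The replacement can depend on the match, here on its length.
underscored = re.sub(r"\s+", lambda m: "_" * len(m.group(0)), "a  b   c")
print(underscored)  # a__b___c
```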

I tried it myself, and it did replace every " " in the sentence with "-":

text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))
JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on…

Out of curiosity, I changed r'\s+' to r'is\s+'. The result: each "is" in the sentence, together with the whitespace after it, became "-".

text = "JGood is a handsome boy, he is cool, clever, and so on…"

print(re.sub(r'is\s+', '-', text))
JGood -a handsome boy, he -cool, clever, and so on…

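My own open-source project uses re.sub(r'\s+,\s+', ', ', text). Does that just replace "," with a comma? That would be pointless. Curious, I kept experimenting, starting with a similar pattern for the period. The result:

print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy, he is cool, clever, and so on…

Indeed, with this example sentence nothing changes.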

Not ready to give up, I modified the example sentence and added a few more commas.

text = "JGood is a handsome boy, , , he is cool, clever, and so on…"
print(re.sub(r'\s+,\s+', ', ', text))
JGood is a handsome boy,, , he is cool, clever, and so on…

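All three commas are still there; only the spaces changed.
So I kept exploring: I added a space before each comma in the original sentence and tried again.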

text = "JGood is a handsome boy , he is cool , clever , and so on…"
print(re.sub(r'\s+,\s+', ',', text))
JGood is a handsome boy,he is cool,clever,and so on…

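Aha, so it removes the whitespace before and after each ",". That made the role of pattern in re.sub(pattern, repl, string, count) click: re.sub finds every part of text that matches pattern and replaces the whole match with repl, leaving the rest of text unchanged. Here each match consists of the comma plus the whitespace around it, so replacing it with ',' keeps the comma and drops the whitespace.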
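
One way to check this reading is to look at what the pattern actually matches before replacing anything. The sketch below uses re.findall and re.subn from the standard library on the same sentence (with a plain "..." at the end); the variable names are only for illustration:

```python
import re

text = "JGood is a handsome boy , he is cool , clever , and so on..."
pattern = r"\s+,\s+"

# Each match covers the comma together with the whitespace around it.
matches = re.findall(pattern, text)
print(matches)  # [' , ', ' , ', ' , ']

# re.subn returns the new string and the number of replacements made.
result, n = re.subn(pattern, ",", text)
print(result)  # JGood is a handsome boy,he is cool,clever,and so on...
print(n)       # 3
```

A comma with no whitespace in front of it, as in "boy, he", is not matched at all, which is why the original sentence came back unchanged.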

To verify once more, I changed the comma after "clever" to space + period + space.

text = "JGood is a handsome boy , he is cool , clever . and so on…"
print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy , he is cool , clever. and so on…

Clearly, the space after "clever" and before the period was removed.
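
A caveat on the period experiments: the output shown is only produced when the period in the pattern is escaped as \. because an unescaped . matches any character. The sketch below contrasts the two on the same sentence (with a plain "..." at the end); re.escape is the standard-library helper for escaping literal text, and the variable names are only for illustration:

```python
import re

text = "JGood is a handsome boy , he is cool , clever . and so on..."

# Escaped dot: matches only whitespace followed by a literal period.
escaped = re.sub(r"\s+\.", ".", text)
print(escaped)  # JGood is a handsome boy , he is cool , clever. and so on...

# Unescaped dot: matches whitespace followed by ANY character,
# so the first letter of almost every word is swallowed as well.
unescaped = re.sub(r"\s+.", ".", text)
print(unescaped)

# re.escape builds the escaped form of a literal string for you.
via_escape = re.sub(r"\s+" + re.escape("."), ".", text)
```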