Python 字符串過濾

阿新 • • 發佈：2018-09-06

color tail code fin span 內容 import pre ram

需求：

str1 = "

"""<div class="m_wrap clearfix"><ul class="clearfix"><br/><br/><
br/><br/><br/><br/><br/><br/><br/><br/><br/><br/><b
r/><br/><br/><br/><br/><br/><li class="li_1 clearfix"><spa
n class="pt_title S_txt2">公司：</span><span class="pt_detail"><a href="h
 
ttp://s.weibo.com/user/&work=%E6%89%AC%E5%B7%9E%E6%8A%A5%E4%B8%9A%E9%9B%86%E5%9B%A2&from=inf&wvr=5&loc=infjob" target="_blank">揚州報業集團</a><br/> 
地區：江蘇 ，揚州<br/> </span></li></ul></div></div></div></div>"""

想把這段字符串的標簽全部都去掉，比如去掉 </li>, </ul>, </div>.。只保留不帶<>的內容，但是要保留<br/>，

有什麽好的辦法嗎？使用正則可以實現這個工作：

# coding:utf-8
import re
newline = """<div class="m_wrap clearfix"><ul class="clearfix"><br/><br/><br/><br/><br/><br/><br/><br/><br/><br/><br/><br/><br
　　/><br/><br/><br/><br/><br/><li class="li_1 clearfix"><span class="pt_title S_txt2">公司：</span><span class="pt_detail"><a
 
 　　href="http://s.weibo.com/user/&work=%E6%89%AC%E5%B7%9E%E6%8A%A5%E4%B8%9A%E9%9B%86%E5%9B%A2&from=inf&wvr=5&loc=infjob" target="_blank">
　　揚州報業集團</a><br/> 地區：江蘇 ，揚州<br/> </span></li></ul></div></div></div></div>"""


newline= newline.replace(‘<br/>‘,‘!!!###‘)
re_comment = re.compile(‘<[^>]*>‘)
newlines = re_comment.sub(‘‘, newline)
newlines = newlines.replace(‘!!!###‘,‘<br/>‘).replace(‘<br/><br/>‘,‘<br/>‘).replace(‘<br/><br/>‘,‘<br/>‘)
print newlines

輸出結果是：

C:\Python27\python.exe F:/squid_frame/ZYXT__weibo/test.py
<br/>公司：揚州報業集團<br/> 地區：江蘇 ，揚州<br/> 

Process finished with exit code 0

Python 字符串過濾

color tail code fin span 內容 import pre ram 需求： str1 = " """<div class="m_wrap clearfix"><ul class="clearfix"><br/><

python 字符串,列表,元組,字典相互轉換

con pytho num list () content values div class 1、字典 dict = {‘name‘: ‘Zara‘, ‘age‘: 7, ‘class‘: ‘First‘} 字典轉為字符串，返回：<type ‘str‘>

python字符串、列表功能

python字符串、列表功能一、字符串功能1、capitaliza 首字母大寫# name = ‘alex‘# v = name.capitalize()# print(v)# 2、將所有大寫都變小寫，casefold 可以轉多國語言，lower只能轉英文。# name = ‘AleX‘# v = name.

python字符串格式化

python 字符串 Python 2.7.12rc1 (v2.7.12rc1:13912cd1e7e8, Jun 12 2016, 05:57:31) [MSC v.1500 64 bit (AMD64)] on win32 Type "copyright", "credits" or "licen

Python - 字符串的操作方法

表示 next ide oat string 一段帶寬 slow 通用字符串操作方法生成字符串str = ‘Python string Function study‘sequence類型都支持的一些通用操作：成員檢查：in、not in ‘

Python---字符串的操作方法

好記性不如爛筆頭 type 看看吧今天簡單介紹必須使用字符參考字符串的操作方法： Python的數據類型字符串、布爾類型、列表、元組、字典等等，今天就簡單介紹一下字符串的操作方法。 Python的學習就是要多練習，好記性不如爛筆頭，我們在實際中看看吧。下面就

python 字符串匹配問題

註釋 func tails repl post 新版 for pri shang 想匹配html = <div class="back fl"><a href="javascript:void(0);" onclick="_gaq.push([‘_trac

python 字符串操作

python 字符串操作特性：不可修改　name.capitalize() 首字母大寫 name.casefold() 大寫全部變小寫 name.center(50,"-") 輸出 ‘---------------------Alex Li----------------------‘ name.co

Python字符串基本操作

pytho cnblogs 是不是下標 dsw blog 大寫 pre pri Python字符串基本操作 1、判斷是不是合法的標識符isidentifier name="ABC" print(name.isidentifier()) 打印結果 True 2、首字母大

sphinx 字符串過濾

from ray 支持解決配置屬性 string 過濾 matches sphinx 不支持字符串作為屬性過濾。要使用字符串進行過濾可以使用下面的兩者方法進行解決：屬性查詢需要在sphinx配置文件中定義文本字段，當查詢索引時，參考其字段。sphinx配置如下： S

Python字符串

字符串 python name = "My {name} is jenkins {year}"print name.capitalize() #首字母大寫print name.count("i") #字符串i計數print name.center(50,"-") #打五十個字符，不夠的用-補上pri

Python 字符串

python 1.字符串的索引 2.字符串的切片 3.判斷字串 *判斷字符是否屬於字符串 4.重復，連接，計算字符長度 5.字符常用操作 * haha.caritaliz() 將字符首字母大寫並返回輸出 * haha.center(20.‘*’) 返回長度

python 字符串操作。。

lin python 分割 pri -1 元素 strip pytho tex #字符串操作以0開始，有負下標的使用0第一個元素，-1最後一個元素，-len第一個元素，len-1最後一個元素 name= "qwe , erw, qwe "print(name.index

python--字符串類型

python 字符串*************** 字符串類型 ***************1.字符串的定義:第一種方式:str1 = ‘our company is westos‘第二種方式:str2 = "our company is westos"第三種方式:str3 = """our

Python 字符串的所有方法詳解

drive nes and tabs 英文 spa space cas ive 1 name = "my name is {name} and my age is {age}" 2 3 # 首字母大寫 4 name.capitalize() 5 # 統計某個字

python字符串處理

values com including for color nal concat raise tween 字符串處理絕對是任何一門語言的重點。 str.partition(sep) Split the string at the first occurrence of s

python 字符串格式化（%）

元素一個空格小數指數 ive 替換 name 正常八進制轉載自： http://www.cnblogs.com/vamei/archive/2013/03/12/2954938.html 模板格式化字符串時，Python使用一個字符串作為模板。模板中有格式符

python 字符串 string

否則索引下標 join() subst swap cde find 字母 sub 字符串 string 語法： a = ‘hello world!‘ b = "hello world!" 常用操作： 1、乘法操作是將字符串重復輸出2遍 >>> a=‘

[Python][小知識] Python字符串前加 u、r、b 的含義

image cnblogs 學校 es2017 1-1 bytes unicode python字符串正常 1、字符串前加 u 　　例：u"我是含有中文字符組成的字符串。" 　　作用：後面字符串以 Unicode 格式進行編碼，一般用在中文字符串前面，防止因為源碼儲存格

python字符串詳解

bcd cnblogs line abcdefg print 子串 split 字符 true 一、截取子串 test="hello" print(test[0:4]) 二、復制字符串 #strcpy(sStr1,sStr2) sStr1 = ‘strcpy‘ s

Python 字符串過濾

相關推薦