Regular Expressions Study Notes 02

re.search scans the whole string and returns the first successful match.

import re

content = 'extra stings hello 1234567 world_this is a regex demo extra stings'
result = re.match(r'hello.*?(\d+).*?demo', content)    # match only tries at the start of the string
print(result)
None
result = re.search(r'hello.*?(\d+).*?demo', content)    # search finds the match anywhere in the string
print(result)
<_sre.SRE_Match object; span=(13, 53), match='hello 1234567 world_this is a regex demo'>

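Because re.match only succeeds when the pattern matches at the very beginning of the string, re.search is usually the more convenient choice: use search unless you specifically need a match at the start.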
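As a small sketch of that difference (the sample string `text` is made up for illustration and does not come from these notes): the same pattern fails with re.match because it does not start at position 0, but re.search finds it.

```python
import re

# Hypothetical sample string, used only for this illustration.
text = 'id: 42, name: demo'

# re.match tries the pattern only at position 0, so this returns None.
m = re.match(r'name: (\w+)', text)

# re.search tries each position in turn and returns the first match.
s = re.search(r'name: (\w+)', text)

print(m)           # None
print(s.group(1))  # demo
print(s.span())    # (8, 18)
```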

# Extract 齊秦 and 往事隨風
html = '''<div id="songs-list">
<h2 class="title">經典老歌</h2>
<p class="introduction">
經典老歌列表
</p>
<ul id="list" class="list-group">
<li data-view="2">一路上有你</li>
<li data-view="7">
<a href="/2.***" singer="任賢齊">滄海一聲笑</a>
</li>
<li data-view="4" class="active">
<a href="/3.***" singer="齊秦">往事隨風</a>
</li>
<li data-view="6"><a href="/4.***" singer="beyond">光輝歲月</a></li>
<li data-view="5"><a href="/5.***" singer="陳慧琳">記事本</a></li>
<li data-view="5">
<a href="/6.***" singer="鄧麗君"><i class="fa fa-user"></i>但願人長久</a>
</li>
</ul>
</div>'''
res = re.search(r'<li.*?active.*?singer="(.*?)">(.*?)</a>', html, re.S)
if res:
    print(res.group(1), res.group(2))    # print the contents of the two capture groups
齊秦 往事隨風

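# Drop active.*? from the pattern: the first li node that contains a singer attribute now matches
res = re.search(r'<li.*?singer="(.*?)">(.*?)</a>', html, re.S)
if res:
    print(res.group(1), res.group(2))
任賢齊 滄海一聲笑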

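# Remove the re.S flag
# Without re.S, . does not match a newline, so only the first li node written on a single line can match: 光輝歲月
res = re.search(r'<li.*?singer="(.*?)">(.*?)</a>', html)
if res:
    print(res.group(1), res.group(2))
beyond 光輝歲月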
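The effect of re.S can be isolated in a minimal sketch. The two-line string `snippet` below is a made-up example, not part of the HTML above; re.DOTALL is the long name for the same flag.

```python
import re

# Hypothetical two-line snippet, used only for this illustration.
snippet = '<p>first\nsecond</p>'

# Without re.S, "." stops at "\n", so the pattern cannot span both lines.
without_flag = re.search(r'<p>(.*?)</p>', snippet)

# With re.S (alias re.DOTALL), "." also matches "\n".
with_flag = re.search(r'<p>(.*?)</p>', snippet, re.S)

print(without_flag)              # None
print(repr(with_flag.group(1)))  # 'first\nsecond'
```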

re.findall searches the string and returns every matching substring as a list.

Get the hyperlink, singer, and song title of every a node:

# (?:<i.*?</i>)? skips the optional icon tag inside the last link
results = re.findall(r'<li.*?href="(.*?)".*?singer="(.*?)">(?:<i.*?</i>)?(.*?)</a>', html, re.S)
print(results)
[('/2.***', '任賢齊', '滄海一聲笑'), ('/3.***', '齊秦', '往事隨風'), ('/4.***', 'beyond', '光輝歲月'), ('/5.***', '陳慧琳', '記事本'), ('/6.***', '鄧麗君', '但願人長久')]
print(type(results))    # the result is a list
for result in results:    # iterate once and print each tuple
    print(result)
('/2.***', '任賢齊', '滄海一聲笑')
('/3.***', '齊秦', '往事隨風')
('/4.***', 'beyond', '光輝歲月')
('/5.***', '陳慧琳', '記事本')
('/6.***', '鄧麗君', '但願人長久')
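The shape of the list returned by re.findall depends on how many capture groups the pattern has. A minimal sketch with a made-up string `text`:

```python
import re

# Hypothetical sample string, used only for this illustration.
text = 'a=1, b=22, c=333'

# No groups: each item is the whole match.
no_group = re.findall(r'\w=\d+', text)

# One group: each item is the text captured by that group.
one_group = re.findall(r'\w=(\d+)', text)

# Two or more groups: each item is a tuple with one entry per group.
two_groups = re.findall(r'(\w)=(\d+)', text)

print(no_group)    # ['a=1', 'b=22', 'c=333']
print(one_group)   # ['1', '22', '333']
print(two_groups)  # [('a', '1'), ('b', '22'), ('c', '333')]
```

This is why the song example returns tuples of three strings: its pattern has three capture groups.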

re.sub replaces every matching substring in the string and returns the resulting string.

content = 'extra stings hello 1234567 world_this is a regex demo extra stings'
# re.sub(pattern, replacement, original string)
content = re.sub(r'\d+', '', content)    # match the digits and replace them with the empty string ''
print(content)
extra stings hello  world_this is a regex demo extra stings

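The replacement can also be the matched text itself, or a string that contains it, by referring back to a capture group.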

content = 'extra stings hello 1234567 world_this is a regex demo extra stings'
content = re.sub(r'(\d+)', r'\1 8910', content)    # \1 refers to group 1; the r prefix keeps the backslash literal (raw string)
print(content)
extra stings hello 1234567 8910 world_this is a regex demo extra stings
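Two related details of re.sub, shown as a small sketch on a made-up string `text`: `\g<1>` is the unambiguous spelling of `\1`, which matters when the replacement continues with a digit, and the `count` argument limits how many matches are replaced.

```python
import re

# Hypothetical sample string, used only for this illustration.
text = 'width 12 height 7'

# r'\10' would be read as a reference to group 10, so use \g<1> followed by 0
# to append a literal "0" after group 1.
padded = re.sub(r'(\d+)', r'\g<1>0', text)

# count=1 replaces only the first match and leaves the rest untouched.
first_only = re.sub(r'\d+', '#', text, count=1)

print(padded)      # width 120 height 70
print(first_only)  # width # height 7
```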

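re.compile compiles a regex string into a regular expression object.

Compiling the pattern once gives a reusable pattern object, so the same pattern can be applied in several calls without repeating the string and flags.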

content = 'hello 123 4567 world_this is a regex demo'
pattern = re.compile(r'hello.*demo', re.S)    # flags are given once, at compile time
res = re.match(pattern, content)
print(res)
<_sre.SRE_Match object; span=(0, 41), match='hello 123 4567 world_this is a regex demo'>
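A compiled pattern object also exposes search, findall, and sub as methods, which is where the reuse pays off. The sketch below uses a made-up date pattern and made-up input lines, purely for illustration.

```python
import re

# Hypothetical pattern and data, used only for this illustration.
date_pattern = re.compile(r'(\d{4})-(\d{2})-(\d{2})')
lines = ['released 2019-03-01', 'no date here', 'patched 2020-11-15']

# Reuse the same compiled object for searching every line.
dates = [m.groups() for m in map(date_pattern.search, lines) if m]

# Reuse it again for substitution.
stripped = [date_pattern.sub('', line).strip() for line in lines]

print(dates)     # [('2019', '03', '01'), ('2020', '11', '15')]
print(stripped)  # ['released', 'no date here', 'patched']
```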
