Python Web Scraping: Regular Expressions (3)

re.sub replaces every matched substring in a string and returns the resulting string.

import re

content = 'extra strings hello 1234567 world_this is a regex demo extra strings'
# Delete every run of digits by replacing it with an empty string
content = re.sub(r'\d+', '', content)
print(content)

import re

content = 'extra strings hello 1234567 world_this is a regex demo extra strings'
# Replace every run of digits with the word "replacement"
content = re.sub(r'\d+', 'replacement', content)
print(content)
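The two calls above replace every match. As a minimal sketch of two related options in the standard `re` module: the `count` argument of `re.sub` limits how many matches are replaced, and `re.subn` also reports how many substitutions were made. The variable names here are illustrative only.

```python
import re

content = 'extra strings hello 1234567 world_this is a regex demo extra strings'

# Remove every run of digits (the same call as in the first snippet).
removed = re.sub(r'\d+', '', content)

# Replace only the first match by passing count=1.
first_only = re.sub(r'extra', 'EXTRA', content, count=1)

# re.subn returns the new string together with the number of substitutions.
replaced, n = re.subn(r'extra', 'EXTRA', content)

print(removed)
print(first_only)
print(replaced, n)
```

Note that removing the digits leaves two consecutive spaces between "hello" and "world_this", because the spaces around the number are not part of the match.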

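# \1 is a backreference: it inserts the text captured by the first group
import re

content = 'extra strings hello 1234567 world_this is a regex demo extra strings'
# Keep the matched digits and append " 8910" after them
content = re.sub(r'(\d+)', r'\1 8910', content)
print(content)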
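Backreferences in the replacement string have a few equivalent spellings. The sketch below, using only the standard `re` module, compares `\1`, the unambiguous `\g<1>` form, and named groups. The group names `first` and `num` are made up for this illustration.

```python
import re

content = 'extra strings hello 1234567 world_this is a regex demo extra strings'

# \1 re-inserts whatever group 1 captured, then appends " 8910".
appended = re.sub(r'(\d+)', r'\1 8910', content)

# \g<1> is needed when a literal digit follows the reference:
# r'\10' would be read as a reference to group 10.
glued = re.sub(r'(\d+)', r'\g<1>0', content)

# Named groups are referenced with \g<name>; here the word and the number swap places.
swapped = re.sub(r'(?P<first>hello) (?P<num>\d+)', r'\g<num> \g<first>', content)

print(appended)
print(glued)
print(swapped)
```

Writing the replacement as a raw string (`r'...'`) matters here: without it, Python would interpret `\1` as the character with code 1 before `re` ever sees it.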

# re.compile
# Compiles a regex string into a regular expression object,
# so that the same pattern can be reused
import re

content = '''hello 1234567 world_this
is a regex demo
'''
# re.S lets "." match newlines as well
pattern = re.compile('hello.*demo', re.S)
result = re.match(pattern, content)
print(result)
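To make the role of `re.S` and the reuse of a compiled pattern concrete, here is a small sketch on the same two-line string. It relies only on standard `re` behaviour; the variable names are illustrative.

```python
import re

content = '''hello 1234567 world_this
is a regex demo
'''

# Without re.S, "." does not match a newline, so the match fails.
no_flag = re.compile(r'hello.*demo')
# With re.S (alias re.DOTALL), "." also matches newlines.
dotall = re.compile(r'hello.*demo', re.S)

print(no_flag.match(content))
print(dotall.match(content))

# A compiled object exposes the module-level functions as methods,
# so one pattern can be reused for several operations.
digits = re.compile(r'\d+')
print(digits.findall(content))
print(digits.sub('#', content))
```

Calling `pattern.match(content)` on the compiled object is equivalent to `re.match(pattern, content)` as written in the post.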

import requests
import re

# The target URL is missing from the source
content = requests.get('').text
# print(content)
# The HTML tags of this pattern are missing from the source, so it is incomplete as shown
pattern = re.compile('(.*?).*?year">(.*?).*?', re.S)
results = re.findall(pattern, content)
for result in results:
    name, author, date = result
    # Strip all whitespace from the captured fields
    author = re.sub(r"\s", "", author)
    date = re.sub(r"\s", "", date)
    print("Title:", name, "Author:", author, "Year:", date)

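The scraping snippet above cannot be run as shown, so the following is a self-contained sketch of the same technique: `re.findall` with a compiled `re.S` pattern, then `re.sub` to clean the captured fields. The HTML string and the pattern are hypothetical, invented for this illustration. They are not the page or the pattern used in the post.

```python
import re

# Hypothetical markup, invented for this sketch.
html = '''
<li>
  <a class="title" href="/book/1">Sample Book One</a>
  <span class="author">
    A. Writer
  </span>
  <span class="year">
    2016
  </span>
</li>
<li>
  <a class="title" href="/book/2">Sample Book Two</a>
  <span class="author">
    B. Author
  </span>
  <span class="year">
    2017
  </span>
</li>
'''

# Non-greedy groups capture the title, author and year of each <li> item.
pattern = re.compile(
    r'<li>.*?class="title".*?>(.*?)</a>.*?author">(.*?)</span>.*?year">(.*?)</span>.*?</li>',
    re.S,
)

books = []
for name, author, date in pattern.findall(html):
    # The captured text still contains the newlines and indentation of the markup.
    author = re.sub(r'\s', '', author)
    date = re.sub(r'\s', '', date)
    books.append((name, author, date))
    print('Title:', name, 'Author:', author, 'Year:', date)
```

Because `\s` matches every whitespace character, this cleanup also removes the space inside "A. Writer". Calling `.strip()` on the captured text is probably a better fit when the field may contain spaces that should be kept.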
This post consists of study notes on Cui Qingcai's web scraping tutorials.
