Python Regular Expressions

Syntax

Meaning and notes

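"." any character (except a newline, unless the re.S flag is set)

"^" start of the string

'^hello' matches 'helloworld' but not 'aaaahellobbb'

"$" end of the string

Same idea as "^", but anchored at the end of the string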

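"*" 0 or more repetitions of the preceding character (greedy)

'<.*>' matches all of '<title>chinaunix</title>'

"+" 1 or more repetitions of the preceding character (greedy)

Same idea as above

"?" 0 or 1 repetition of the preceding character (greedy)

Same idea as above

*?, +?, ??

Non-greedy forms of the three quantifiers above: they match as few characters as possible

'<.*?>' matches only '<title>'

{m,n} repeats the preceding character m to n times; {m} is also allowed

a{6} matches six a's; a{2,4} matches two to four a's

{m,n}? repeats the preceding character m to n times, taking as few as possible

In 'aaaaaa', a{2,4}? matches only two a's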

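"\"

Escapes a special character or introduces a special sequence

"[]" denotes a character set

[0-9], [a-z], [A-Z], [^0]

"|" or

a|b, alternation: matches either a or b

(…) matches the expression inside the parentheses and captures it as a group

(?#…)

A comment; ignored by the matcher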

(?=…)

Matches if … matches next, but doesn't consume any of the string.

'hello(?=test)' matches 'hello' in 'hellotest'

(?!…)

Matches if … doesn't match next.

'hello(?!test)' matches 'hello' only when it is not followed by 'test'

(?<=…)

Matches if preceded by … (must be fixed length).

'(?<=hello)test' matches 'test' in 'hellotest'

(?<!…)

Matches if not preceded by … (must be fixed length).
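The greedy, non-greedy, and lookaround rules in the table can be checked with a few calls to re.search. This is a minimal sketch: the sample strings are taken from the table where possible, and the tagged string in `html` is an assumption chosen for illustration.

```python
import re

html = '<title>chinaunix</title>'  # assumed sample string

# Greedy: .* grabs as much as possible, so the match runs to the last '>'.
greedy = re.search(r'<.*>', html).group()

# Non-greedy: .*? stops at the first '>' that lets the pattern succeed.
lazy = re.search(r'<.*?>', html).group()

# {m,n} is greedy; {m,n}? takes as few repetitions as allowed.
most = re.search(r'a{2,4}', 'aaaaaa').group()
fewest = re.search(r'a{2,4}?', 'aaaaaa').group()

# Lookahead and lookbehind test the surrounding text without consuming it.
ahead = re.search(r'hello(?=test)', 'hellotest').group()
not_ahead = re.search(r'hello(?!test)', 'hellotest')
behind = re.search(r'(?<=hello)test', 'hellotest').group()
not_behind = re.search(r'(?<!hello)test', 'hellotest')

print(greedy)      # <title>chinaunix</title>
print(lazy)        # <title>
print(most)        # aaaa
print(fewest)      # aa
print(ahead)       # hello
print(not_ahead)   # None
print(behind)      # test
print(not_behind)  # None
```

The two negative assertions return None here because 'hello' is followed by 'test' and 'test' is preceded by 'hello', which is exactly what they forbid.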

Character

Description

\A matches only at the start of the string

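\b matches a word boundary

\B matches a position that is not a word boundary

\d matches any decimal digit; equivalent to r'[0-9]' for ASCII text

\D matches any character that is not a decimal digit; equivalent to r'[^0-9]' for ASCII text

\s matches any whitespace character (space, tab, newline, carriage return, form feed, vertical tab)

\S matches any non-whitespace character

\w matches any alphanumeric character (and the underscore)

\W matches any non-alphanumeric character

\Z matches only at the end of the string

\\ matches a backslash character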
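To make the special sequences more concrete, here is a small sketch that applies several of them to one string. The sample `text` is hypothetical and chosen so that each sequence gives a different result.

```python
import re

text = 'cat concat 42\tcats'  # hypothetical sample

# \b anchors at a word boundary, so only the standalone word matches.
whole_words = re.findall(r'\bcat\b', text)

# \B requires a non-boundary: this 'cat' must follow another word character.
inside_words = re.findall(r'\Bcat', text)

digits = re.findall(r'\d+', text)   # runs of decimal digits
words = re.findall(r'\w+', text)    # runs of word characters
fields = re.split(r'\s+', text)     # split on any run of whitespace

# \A and \Z anchor at the very start and very end of the string.
starts = bool(re.search(r'\Acat', text))
ends = bool(re.search(r'cats\Z', text))

print(whole_words)   # ['cat']
print(inside_words)  # ['cat']  (the one inside 'concat')
print(digits)        # ['42']
print(words)         # ['cat', 'concat', '42', 'cats']
print(fields)        # ['cat', 'concat', '42', 'cats']
print(starts, ends)  # True True
```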

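#!/usr/bin/env python
import re

# match() only tests at the start of the string;
# search() scans the whole string for the first match.
r1 = re.compile(r'world')
if r1.match('helloworld'):
    print('match succeeds')
else:
    print('match fails')
if r1.search('helloworld'):
    print('search succeeds')
else:
    print('search fails')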

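A note on the r prefix: r stands for raw. Ordinary string literals interpret escape sequences, such as '\n' for a newline, so a literal backslash has to be written as '\\'. If what you actually want is a backslash followed by the letter n, without the r prefix you must write '\\n', whereas with it you simply write r'\n', which is much clearer.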
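A quick way to see the difference is to compare the lengths of the three spellings. The following is a minimal sketch; the `path` string is a hypothetical sample.

```python
import re

newline = '\n'    # one character: a line feed
escaped = '\\n'   # two characters: a backslash and the letter n
raw = r'\n'       # the same two characters, written as a raw string

print(len(newline), len(escaped), len(raw))  # 1 2 2
print(escaped == raw)                        # True

# The regex engine also treats the backslash as special, so matching one
# literal backslash needs two of them in the pattern. Without the raw
# prefix that becomes four.
pattern_plain = '\\\\'
pattern_raw = r'\\'
path = 'C:\\temp'  # hypothetical sample containing one backslash
backslashes = re.findall(pattern_raw, path)

print(pattern_plain == pattern_raw)  # True
print(len(backslashes))              # 1
```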

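Example: setting flags

import re

#r2 = re.compile(r'\n$', re.S)
#r2 = re.compile('\n$', re.S)
# re.I makes the match case-insensitive; '$' also matches just before
# a trailing newline, so this search succeeds.
r2 = re.compile('world$', re.I)
if r2.search('helloworld\n'):
    print('search succeeds')
else:
    print('search fails')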
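The effect of each flag is easier to see when the same pattern is run with and without it. This sketch covers only re.I and re.S, the two flags that appear in the example above; the sample `text` is hypothetical.

```python
import re

text = 'Hello\nWORLD'  # hypothetical sample with an embedded newline

# re.I (re.IGNORECASE): letters match regardless of case.
no_flag = re.search(r'world', text)
ignore_case = re.search(r'world', text, re.I)

# re.S (re.DOTALL): '.' also matches a newline.
dot_default = re.search(r'Hello.WORLD', text)
dot_all = re.search(r'Hello.WORLD', text, re.S)

# Flags are combined with the bitwise OR operator.
both = re.search(r'hello.world', text, re.I | re.S)

print(no_flag)              # None
print(ignore_case.group())  # WORLD
print(dot_default)          # None
print(dot_all is not None)  # True
print(both is not None)     # True
```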

Example: calling the module-level function directly

if re.search(r'abc', 'helloaaabcdworldn'):
    print('search succeeds')
else:
    print('search fails')

split

re.split(pattern, string[, maxsplit=0, flags=0])

split(string[, maxsplit=0])

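Purpose: splits the string at every part that matches the regular expression and returns the pieces as a list.

Example: simple parsing of an IP address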

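#!/usr/bin/env python
import re

# \W+ matches runs of non-word characters, here the dots.
r1 = re.compile(r'\W+')
print(r1.split('192.168.1.1'))
# With a capturing group, the separators are kept in the result.
print(re.split(r'(\W+)', '192.168.1.1'))
# maxsplit=1 stops after the first split.
print(re.split(r'(\W+)', '192.168.1.1', maxsplit=1))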

The output is:

['192', '168', '1', '1']

['192', '.', '168', '.', '1', '.', '1']

['192', '.', '168.1.1']

findall

re.findall(pattern, string[, flags])

findall(string[, pos[, endpos]])

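Purpose: finds every substring of the string that the regular expression matches and returns them as a list.

Example: finding the content enclosed in [] (greedy and non-greedy search)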

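#!/usr/bin/env python
import re

# Greedy: .* runs from the first '[' to the last ']'.
r1 = re.compile(r'(\[.*\])')
print(re.findall(r1, "hello[hi]heldfsdsf[iwonder]lo"))
# Non-greedy: .*? stops at the nearest ']'.
r1 = re.compile(r'(\[.*?\])')
print(re.findall(r1, "hello[hi]heldfsdsf[iwonder]lo"))
print(re.findall('[0-9]', "fdskfj1323jfkdj"))
print(re.findall('([0-9][a-z])', "fdskfj1323jfkdj"))
# Lookarounds consume nothing, so these return empty strings,
# one for each position before or after 'www'.
print(re.findall('(?=www)', "afdsfwwwfkdjfsdfsdwww"))
print(re.findall('(?<=www)', "afdsfwwwfkdjfsdfsdwww"))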

finditer

re.finditer(pattern, string[, flags])

finditer(string[, pos[, endpos]])

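Note: like findall, this finds every match of the regular expression in the string, but it returns an iterator over match objects instead of a list of substrings. Compiled pattern objects (RegexObject) provide the same method, in the second form shown above.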
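Because finditer yields match objects, each hit carries its position as well as its text. The following is a minimal sketch; the sample `text` and the variable names are assumptions for illustration.

```python
import re

text = 'fdskfj1323jfkdj77'  # hypothetical sample
pattern = re.compile(r'[0-9]+')

# Each item is a match object, so the span is available alongside the text.
hits = [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]

# The compiled-pattern form accepts a start position (pos) as well.
later_hits = [m.group() for m in pattern.finditer(text, 10)]

# The module-level function takes the pattern as its first argument.
same_hits = [m.group() for m in re.finditer(r'[0-9]+', text)]

print(hits)        # [('1323', 6, 10), ('77', 15, 17)]
print(later_hits)  # ['77']
print(same_hits)   # ['1323', '77']
```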

sub

re.sub(pattern, repl, string[, count, flags])

sub(repl, string[, count=0])

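Note: finds every substring of string that matches the regular expression pattern and replaces it with repl. If nothing matches pattern, string is returned unchanged. repl can be either a string or a function.

Example: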

#!/usr/bin/env python
import re
p = re.compile('(one|two|three)')
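The example above compiles a pattern but stops there. One way it might continue is sketched here, showing repl as a plain string, as a string with a group reference, and as a function. The input sentences, the `digits` dictionary, and the `to_digit` helper are all hypothetical.

```python
import re

p = re.compile('(one|two|three)')

# repl as a string: every match is replaced by the same text.
all_replaced = p.sub('num', 'one word two words three words')

# count limits how many matches are replaced.
first_two = p.sub('num', 'one word two words three words', count=2)

# repl may refer back to a captured group with \1.
bracketed = p.sub(r'<\1>', 'one two')

# repl as a function: it receives the match object and returns the text.
digits = {'one': '1', 'two': '2', 'three': '3'}  # hypothetical mapping

def to_digit(match):
    return digits[match.group(1)]

as_digits = p.sub(to_digit, 'one two three four')

print(all_replaced)  # num word num words num words
print(first_two)     # num word num words three words
print(bracketed)     # <one> <two>
print(as_digits)     # 1 2 3 four
```

A function is the natural choice whenever the replacement depends on what was matched, as in the word-to-digit mapping here.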

subn

re.subn(pattern, repl, string[, count, flags])

subn(repl, string[, count=0])

Note: this function does the same as sub(), but returns a tuple containing the new string and the number of substitutions made.
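A short sketch of the return value, reusing the pattern from the sub example; the input strings are hypothetical.

```python
import re

p = re.compile('(one|two|three)')

# subn returns a (new_string, number_of_substitutions) tuple.
new_text, n = p.subn('num', 'one two four three')

# With no match, the string comes back unchanged and the count is 0.
unchanged = p.subn('num', 'four five')

print(new_text, n)  # num num four num 3
print(unchanged)    # ('four five', 0)
```

The count makes it easy to tell whether anything was replaced without comparing the old and new strings.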
