Regular Expressions

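'.ython' matches any string whose first character is arbitrary and whose remainder is 'ython', for example 'python', '+ython', and ' ython'.

Here . is the wildcard: it matches exactly one arbitrary character (other than a newline), never zero and never two or more.

To match 'python.org' literally, use 'python\\.org'.

Why \\ rather than a single \? There are two levels of escaping: first the interpreter's string-literal escaping, then the re module's own.

In practice you can use a raw string instead: r'python\.org'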
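A minimal sketch of the two escaping levels described above. The sample strings are made up for illustration, and print() is used so the snippet runs on Python 3.

```python
import re

escaped = 'python\\.org'   # the interpreter turns \\ into \, so re sees python\.org
raw = r'python\.org'       # a raw string keeps the backslash exactly as written

same_pattern = (escaped == raw)

# With the dot escaped, only a literal '.' is accepted
hits_literal = re.match(raw, 'python.org') is not None
hits_other = re.match(raw, 'pythonzorg') is not None

# Unescaped, the dot is a wildcard and also accepts 'z'
wildcard_hits_other = re.match('python.org', 'pythonzorg') is not None

print(same_pattern, hits_literal, hits_other, wildcard_hits_other)
```

Both spellings produce the same pattern string; the raw form is simply easier to read.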

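[a-z]: matches any one character from a to z

[a-zA-Z0-9]: matches any one uppercase or lowercase letter or digit

[^abc]: matches any one character other than a, b, or c

python|perl: | is the pipe symbol; it matches either alternative (an "or" relationship)

p(ython|erl): the same idea as 1*(1+1) in arithmetic; the | inside is still alternation, and the enclosing () turns it into a subpattern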
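The character classes and the grouped alternation above can be tried out as follows. This is only a sketch; the sample strings are assumptions chosen for illustration.

```python
import re

samples = ['python', 'perl', 'ruby']

# p(ython|erl): 'p' followed by either 'ython' or 'erl'
grouped = [s for s in samples if re.match(r'p(ython|erl)', s)]

# [^abc] matches one character that is not a, b or c
negated = re.findall(r'[^abc]', 'abcde')

# [a-zA-Z0-9] matches one letter or digit at a time
alnum = re.findall(r'[a-zA-Z0-9]', 'a-Z_9')

print(grouped)
print(negated)
print(alnum)
```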

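Repeating subpatterns

(pattern)*: zero or more times

(pattern)+: one or more times

(pattern){m,n}: from m to n times

r'(http://)?(www\.)?python\.org': a ? makes the subpattern before it optional, meaning it may appear zero times or once.

Why is \\ not needed here? Note the leading r, which marks a raw string.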
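A short sketch of the repetition operators in use. The URL list and the 'ab' sample are assumptions for illustration only.

```python
import re

optional = re.compile(r'(http://)?(www\.)?python\.org')

urls = ['python.org', 'www.python.org', 'http://www.python.org', 'ftp://python.org']
# Both prefixes are optional, but nothing else may precede 'python.org'
matched = [u for u in urls if optional.match(u)]

# {m,n}: between m and n repetitions of the preceding subpattern (greedy)
two_to_three = re.match(r'(ab){2,3}', 'abababab').group()

print(matched)
print(two_to_three)
```

Because repetition is greedy, {2,3} takes three copies of 'ab' when they are available.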

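str.startswith(prefix[, start[, end]])

str.endswith(suffix[, start[, end]])

str.find(sub[, start[, end]])

When calling endswith() on a line read from a file, remember the trailing '\n' newline marker. If you call rstrip() on each line first, you no longer need to append '\n' to the keyword.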

def findkey(fname):
    # Variant 1: include the trailing newline in the suffix
    with open(fname) as f:
        for line in f:
            if line.startswith('xx') \
                    or line.endswith('xx\n'):  # or line[:-1].endswith('xx')
                print(line)
    # Variant 2: strip trailing whitespace first, then test the bare suffix
    with open(fname) as f:
        for line in f:
            line = line.rstrip()
            if line.startswith('xx') or line.endswith('xx'):
                print(line)

findkey('test.txt')
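The newline pitfall can also be seen without a file. In this sketch the list of lines is a hypothetical stand-in for what iterating over a text file would yield.

```python
# Hypothetical lines, as they would come from iterating over a text file
lines = ['xx first\n', 'ends with xx\n', 'neither\n']

# Without stripping, the newline must be part of the suffix
with_newline = [l for l in lines if l.endswith('xx\n')]

# After rstrip(), the plain suffix is enough
stripped = [l.rstrip() for l in lines]
without_newline = [l for l in stripped if l.endswith('xx')]

# Testing the bare suffix on an unstripped line finds nothing: '\n' comes last
naive = [l for l in lines if l.endswith('xx')]

print(with_newline, without_newline, naive)
```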

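Match a string that begins with a letter or an underscore

#a = ''
a = '_value'
# a must not be empty; test it first so a[0] is never evaluated on an empty string
boolean = bool(a) and (a[0] == '_' or 'a' <= a[0] <= 'z' or 'A' <= a[0] <= 'Z')
print(boolean)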

import re
text = 'x\nxx python'
# With the r prefix the pattern reaches re exactly as you wrote it
# regular = re.compile('x\n', re.I)
regular = re.compile(r'x\n', re.I)  # compile once; re.I ignores case; the r prefix is preferable
# print(regular, type(regular))
result = regular.match(text)  # returns a match object, or None on failure
# print(regular.match(text))
if result is None:
    print('nothing found')
else:
    print(result.span())  # (start, end) indices of the match
# One-off use without compiling
result = re.match(r'x', text)  # text is the target string

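The match() method in the code above starts matching at index 0. If the pattern does not match there, the match fails; printing result then shows None, and you can test for that value.

search(), as the name suggests, scans the string and stops at the first occurrence it finds.

match() only tries the pattern at the start of the string; it does not search, and if that attempt fails the whole match fails.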
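The difference between the two calls, as a minimal sketch (the sample string is an assumption):

```python
import re

text = 'xx python'

at_start = re.match(r'python', text)    # 'python' does not begin at index 0
anywhere = re.search(r'python', text)   # scans forward to the first occurrence

print(at_start)          # None
print(anywhere.span())   # (3, 9)
```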

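Also, writing out the regular expression for every match is tedious; re.compile() turns the expression into an object, which is more convenient.

For example, suppose a pattern object has already been created: pat = re.compile(r'[a-zA-Z0-9]')

Then pat.search(xstring) is equivalent to re.search(r'[a-zA-Z0-9]', xstring)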
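A quick check of that equivalence. The value of xstring is a hypothetical sample input.

```python
import re

pat = re.compile(r'[a-zA-Z0-9]')
xstring = '--- v2 ---'   # hypothetical sample input

via_object = pat.search(xstring)                  # method on the compiled pattern
via_module = re.search(r'[a-zA-Z0-9]', xstring)   # module-level function

print(via_object.group(), via_object.span())
print(via_module.group(), via_module.span())
```

Both calls find the same first match; compiling mainly pays off when the pattern is reused many times.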

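Of course, for a one-off match you can skip re.compile(). Suppose the requirement is to drop the extra separators and keep only the elements.

re.split(regular expression, text to split) shows how:

import re
text = 'alpha,,,beta,, gama delta'
# Split on runs of commas and spaces so that only the words remain
print(re.split(r'[, ]+', text))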

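1. Match a three-character string that starts with a and ends with z

result = re.match(r'a.z', 'a8z')
print(result.span())

The . (dot) matches any single character except a newline (\n): one character, and only one.

So both of the following matches fail:

result = re.match(r'a.z', 'a\nz')
result = re.match(r'a.z', 'axxz')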
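The three cases can be checked side by side. This is a small sketch; the `outcomes` dictionary is just a helper introduced here.

```python
import re

outcomes = {}
for candidate in ['a8z', 'a\nz', 'axxz']:
    result = re.match(r'a.z', candidate)
    # Store the span on success, None on failure
    outcomes[candidate] = result.span() if result else None
    print(repr(candidate), outcomes[candidate])
```

Only 'a8z' matches: the newline is excluded by the dot, and 'axxz' has two characters where the pattern allows one.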

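2. Match a string that starts with a letter or a digit

result = re.match(r'[a-zA-Z0-9]', 'hello')
print(result.group())

Square brackets match any one of the characters listed inside them.

A second solution to the same problem:

result = re.match(r'[\w]', 'hello')
print(result.group())

\w stands for any one letter, digit, or underscore, that is, any one of a~z, A~Z, 0~9, _

3. Match a string that starts with a single letter, digit, or underscore enclosed in square brackets

result = re.match(r'\[[\w]\]', '[0]891')  # to match a literal [ or ], escape it with \
print(result.group())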

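1. Match a string that starts with any number of letters, digits, or underscores

result = re.match(r'[\w]*', 'dsuio28a$$$$')
print(result.group())

The * (star) matches the preceding rule 0 to n times.

2. Match a valid Python variable name (it starts with an underscore or a letter, so that part must occur at least once)

result = re.match(r'[_a-zA-Z]+[_\w]*', '_python')
print(result.group())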
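The variable-name pattern can be wrapped in a small checker. This is a sketch under assumptions: the function name `looks_like_identifier` is hypothetical, only ASCII names are considered, reserved words such as `for` are not excluded, and re.fullmatch requires Python 3.4 or later.

```python
import re

# First character: underscore or letter; the rest: underscore, letter or digit
IDENTIFIER = re.compile(r'[_a-zA-Z][_a-zA-Z0-9]*')

def looks_like_identifier(name):
    """Return True if the whole string has the shape of an ASCII identifier."""
    # fullmatch requires the entire string to match, not only a prefix
    return IDENTIFIER.fullmatch(name) is not None

for candidate in ['_python', 'value2', '2fast', 'my-var', '']:
    print(repr(candidate), looks_like_identifier(candidate))
```

Using fullmatch matters here: re.match alone would accept 'my-var' because its prefix 'my' matches.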

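3. Match the numbers 0 to 99

A first instinct might be:

result = re.match(r'[0-99]', '99')
print(result.group())

This turns out to match only the first digit: a character class matches a single character, and [0-99] is just the range 0-9 plus a redundant 9. A number from 0 to 99 may be one character or two, so make the first digit optional:

result = re.match(r'[0-9]?[0-9]', '55')
print(result.group())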
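One caveat: re.match only anchors the start, so the pattern also accepts the first two digits of a longer number. A sketch of a stricter check follows; the helper name `in_0_to_99` is hypothetical, and re.fullmatch requires Python 3.4 or later.

```python
import re

def in_0_to_99(text):
    # fullmatch requires the entire string to match, not just a prefix
    return re.fullmatch(r'[0-9]?[0-9]', text) is not None

prefix_only = re.match(r'[0-9]?[0-9]', '123').group()
accepted = [s for s in ['0', '7', '55', '99', '100', '5a'] if in_0_to_99(s)]

print(prefix_only)
print(accepted)
```

Note that this pattern still accepts a leading zero such as '07'; whether that counts as valid depends on the use case.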

Some useful examples:

1. Replace *something* with something

pattern = r''
