Python Text Processing

Basic text operations:

In Python, a string literal can be written in either of the following ways, with single quotes ('') or double quotes (""):

'this is a literal string'
out[1]: 'this is a literal string'
"this is a literal string"
out[2]: 'this is a literal string'

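With triple quotes, a string can span several lines without newline escapes or line continuations; the text is stored exactly as written.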

bigger = """this is an even
bigger string that
spans three lines"""

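Prefixing a string with r or R makes it a true "raw" string, in which backslashes are kept literally. Prefixing a string with u or U makes it a Unicode string.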

big = r"this is a long string\
with a backslash and a newline in it"

hello = u'hello\u0020world'
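The literal forms above can be compared side by side. The following is only a small sketch and assumes Python 3, whereas the snippets in this article use Python 2 syntax. The names raw and cooked and the sample path are illustrative and not from the original text.

```python
# Triple-quoted literal: line breaks typed in the source are stored in the value.
bigger = """this is an even
bigger string that
spans three lines"""

# Raw literal: the backslashes are kept as ordinary characters.
raw = r"C:\new\table"
# Regular literal: \n and \t are read as a newline and a tab.
cooked = "C:\new\table"

# In Python 3 every str is Unicode, so \u escapes need no u prefix.
hello = "hello\u0020world"

print(bigger.count("\n"), len(raw), len(cooked), hello)
```

The raw version is two characters longer because each backslash survives as its own character instead of merging with the letter after it.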

s.isdigit()

s.upper()

s.count('needle')
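To make the three method calls above concrete, here is a minimal sketch assuming Python 3. The sample value of s is made up for illustration.

```python
s = "needle in a haystack, needle again"

print(s.isdigit())         # False: not every character is a digit
print("2024".isdigit())    # True
print(s.upper())           # a new, upper-cased copy; s itself is unchanged
print(s.count("needle"))   # 2 non-overlapping occurrences
```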

Call the built-in list with the string as its argument: thelist = list(thestring)

Iterate over the characters with a for statement, or with a list comprehension:

for c in thestring:
    do_something_with(c)

results = [do_something_with(c) for c in thestring]

Or use the map function: results = map(do_something, thestring)
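The four ways of processing a string one character at a time can be run together as follows. This is a sketch assuming Python 3, where map returns a lazy iterator rather than a list. The body of do_something_with is a hypothetical placeholder.

```python
thestring = "abc"

def do_something_with(c):
    # Hypothetical per-character operation: here it just upper-cases c.
    return c.upper()

# 1. Turn the string into a list of one-character strings.
thelist = list(thestring)

# 2. Explicit for loop.
results_loop = []
for c in thestring:
    results_loop.append(do_something_with(c))

# 3. List comprehension.
results_comp = [do_something_with(c) for c in thestring]

# 4. map; wrapped in list() because Python 3's map is lazy.
results_map = list(map(do_something_with, thestring))

print(thelist, results_loop, results_comp, results_map)
```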

The built-in functions chr, str, ord, and unichr:

print ord('a')
97
print chr(97)
a
print ord(u'\u2002')
8194
print repr(unichr(8224))
u'\u2020'
print map(ord, 'dxlmaoe')
[100, 120, 108, 109, 97, 111, 101]
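For readers on Python 3, the same conversions can be sketched as below. This assumes Python 3, where unichr no longer exists and chr accepts any Unicode code point. The variable names are illustrative.

```python
code_a = ord("a")                   # character -> code point
char_a = chr(97)                    # code point -> character
code_space = ord("\u2002")          # works for any Unicode character
dagger = chr(8224)                  # chr replaces Python 2's unichr
codes = list(map(ord, "dxlmaoe"))   # list() because map is lazy in Python 3

print(code_a, char_a, code_space, ascii(dagger), codes)
```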

Use the built-ins isinstance and basestring for a quick, simple check of whether an object is a string or a Unicode object:

def isastring(anobj):
    return isinstance(anobj, basestring)

A direct type test might also come to mind:

def isexactlyastring(anobj):
    return type(anobj) is type('')

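However, Unicode objects fail the direct type test. The isinstance check accepts them because basestring is the common base class of the str and unicode types. Even isinstance is no help, though, with instances of the UserString class from the Python standard library, which behave like strings but do not inherit from basestring. For those, test the behavior instead: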

def isstringlike(anobj):
    try: anobj + ''
    except: return False
    else: return True

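The usual approach to type checking in Python is so-called duck typing: if it walks like a duck and quacks like a duck, then for our purposes it can be treated as a duck.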
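The difference between the inheritance check and the duck-typing check can be seen with UserString. The sketch below assumes Python 3, where basestring is gone, str is the only text type, and UserString lives in the collections module. The variable names are illustrative.

```python
from collections import UserString

def isstringlike(anobj):
    # Duck-typing test: anything that can be concatenated with '' counts.
    try:
        anobj + ''
    except Exception:
        return False
    else:
        return True

wrapped = UserString("abc")

plain_is_str = isinstance("abc", str)      # plain strings pass the type check
wrapped_is_str = isinstance(wrapped, str)  # UserString is not a str subclass
wrapped_like = isstringlike(wrapped)       # but it does behave like a string
number_like = isstringlike(42)             # 42 + '' raises TypeError

print(plain_is_str, wrapped_is_str, wrapped_like, number_like)
```

The duck-typing version accepts any object that supports concatenation with a string, which is usually what calling code actually relies on.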

Aligning strings: left-justified, centered, or right-justified

print '|','hej'.ljust(20),'|','hej'.rjust(20),'|','hej'.center(20),'|'
| hej                  |                  hej |         hej          |
print 'hej'.center(20, '+')
++++++++hej+++++++++
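The same three alignment methods, written as a sketch for Python 3 (an assumption, since the snippet above uses the Python 2 print statement). The variable names are illustrative.

```python
word = "hej"

left = word.ljust(20)            # pad on the right with spaces
right = word.rjust(20)           # pad on the left with spaces
middle = word.center(20)         # pad on both sides
filled = word.center(20, "+")    # optional second argument: the fill character

print("|", left, "|", right, "|", middle, "|")
print(filled)
```

All three methods return a new string of at least the requested width and leave the original untouched.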

Getting a string with no extra whitespace at the start or end

x = '  hej  '
print '|',x.lstrip(), '|', x.rstrip(), '|', x.strip(), '|'
| hej   |   hej | hej |

x = 'xyxxyy hejyx yyx'
print '|'+x.strip('xy')+'|'
| hejyx |
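One detail of the second example is easy to miss: the argument to strip is treated as a set of characters to remove from both ends, not as a prefix or suffix. A small sketch assuming Python 3, with illustrative variable names:

```python
x = "  hej  "
no_left = x.lstrip()     # leading whitespace removed
no_right = x.rstrip()    # trailing whitespace removed
no_both = x.strip()      # both ends removed

y = "xyxxyy hejyx yyx"
# Any run of 'x' or 'y' at either end is removed; stripping stops at the
# first character not in the set, so the inner "yx" and the spaces survive.
trimmed = y.strip("xy")

print(repr(no_left), repr(no_right), repr(no_both), repr(trimmed))
```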

Suppose you have several small strings and want to combine them into one large string.

largestring = ''.join(pieces)
largestring = '%s%s something %s yet more' % (small1, small2, small3)
largestring = small1 + small2 + ' something ' + small3 + ' yet more'

import operator
largestring = reduce(operator.add, pieces, '')
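The approaches above give the same result, as this sketch shows. It assumes Python 3, where reduce has moved into functools. The contents of pieces and the small strings are made-up sample data.

```python
from functools import reduce
import operator

pieces = ["alpha", "beta", "gamma"]

# join builds the result in one pass over the sequence.
joined = "".join(pieces)

# reduce with operator.add concatenates pairwise, creating an
# intermediate string at every step.
reduced = reduce(operator.add, pieces, "")

small1, small2, small3 = "a", "b", "c"
formatted = "%s%s something %s yet more" % (small1, small2, small3)
added = small1 + small2 + " something " + small3 + " yet more"

print(joined, reduced, formatted, added)
```

For a long list of pieces, join is generally the safer default because it avoids the chain of temporary strings that repeated + creates.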

Reversing a string character by character or word by word.

revchars = astring[::-1]

revwords = astring.split()
revwords.reverse()
revwords = ' '.join(revwords)

revwords = ' '.join(astring.split()[::-1])

To reverse word by word while keeping the original whitespace unchanged, split the original string with a regular expression:

import re
revwords = re.split(r'(\s+)', astring)
revwords.reverse()
revwords = ''.join(revwords)

Or, written as a single expression: revwords = ''.join(re.split(r'(\s+)', astring)[::-1])
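The difference between the two word-reversal variants shows up only when the input has runs of more than one space. A sketch with a made-up sample string; it runs unchanged on Python 3:

```python
import re

astring = "hello   big  world"

# Character-by-character reversal.
revchars = astring[::-1]

# Word-by-word reversal; split() discards the original spacing.
revwords = " ".join(astring.split()[::-1])

# The capturing group in (\s+) makes re.split keep the whitespace runs
# as list items, so they are reversed along with the words.
revwords_ws = "".join(re.split(r"(\s+)", astring)[::-1])

print(repr(revchars), repr(revwords), repr(revwords_ws))
```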

def containsany(seq, aset):
    for c in seq:
        if c in aset: return True
    return False

import itertools
def containsany(seq, aset):
    for item in itertools.ifilter(aset.__contains__, seq):
        return True
    return False
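itertools.ifilter exists only in Python 2. As a sketch under the assumption of Python 3, the same check can be written with the built-in filter, which is lazy there, or with any. The name containsany_filter is introduced here for illustration only.

```python
def containsany(seq, aset):
    # any() stops at the first item of seq that is found in aset.
    return any(c in aset for c in seq)

def containsany_filter(seq, aset):
    # Python 3's built-in filter is lazy, so the loop body runs
    # (and returns) as soon as one matching item turns up.
    for item in filter(aset.__contains__, seq):
        return True
    return False

vowels = set("aeiou")
print(containsany("rhythm", vowels), containsany("hello", vowels))
```

Passing aset as a real set keeps each membership test fast; a string or list also works but is searched linearly.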
