Learning the Python Standard Library: string

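If you want to write good code, then besides contributing to open-source projects or interning at a large company, one of the quickest and most effective approaches is to read the Python standard library. Studying the standard library does not mean memorizing how every module is used; it means skimming through everything once so that it leaves an impression, then digging into the modules that interest you. That way, when you work on a real project, you can pick the right standard-library tool with ease.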

Step one

# Import the string module
import string

capwords

The string module provides the capwords function, which capitalizes the first letter of each word in a string. Here is how it is defined in the source:

def capwords(s, sep=None):
    return (sep or ' ').join(x.capitalize() for x in s.split(sep))

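capwords takes one positional argument, the string to process, and one optional argument, sep, the separator. By default the string is split on whitespace, as in 'my name is python'; you can also pass a separator explicitly, for example sep='-' for 'my-name-is-python'. The function capitalizes the first letter of each separated word and, because it relies on str.capitalize, lowercases the rest of the word.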

>>> from string import capwords
>>> s = 'my name is python'
>>> capwords(s)
'My Name Is Python'
>>> s = 'my-name-is-python'
>>> capwords(s, '-')
'My-Name-Is-Python'
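
The effect of the separator argument is easiest to see on a string with irregular spacing. The sketch below uses a made-up sample string (not from the original article): with sep omitted, str.split collapses runs of whitespace and drops leading and trailing whitespace, whereas an explicit sep=' ' keeps the empty fields, so the spacing survives.

```python
from string import capwords

# Illustrative input, chosen only to show the difference.
messy = '  hello   wORLD '

# sep omitted: whitespace runs collapse and the ends are stripped.
collapsed = capwords(messy)

# sep=' ': empty fields between consecutive spaces are kept.
preserved = capwords(messy, ' ')

print(repr(collapsed))   # 'Hello World'
print(repr(preserved))   # '  Hello   World '
```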

To summarize: we pass a string to capwords. It splits the string into words with str.split, capitalizes each word with str.capitalize inside a generator expression, and finally joins the words back into a single string with str.join.
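
The three steps can be unrolled into separate statements to make the intermediate values visible. This is only a sketch of the same logic for the default separator, not the library's code; the variable names are chosen for illustration.

```python
s = 'my name is python'

words = s.split(None)                          # ['my', 'name', 'is', 'python']
capitalized = [w.capitalize() for w in words]  # ['My', 'Name', 'Is', 'Python']
result = ' '.join(capitalized)                 # 'My Name Is Python'

print(result)
```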

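The above is CPython's implementation. For the simpler functions in the standard library, it is worth asking how you would write the function yourself, and then using the timeit module to compare the performance of the two versions.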

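For example, this function can also be written with map. Both versions below are functionally equivalent to the CPython implementation: the first passes str.capitalize to map directly, and the second calls each string's capitalize method through operator.methodcaller.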

from operator import methodcaller

# Version 1: pass the unbound str.capitalize method to map.
def capwords1(s: str, sep: str = None) -> str:
    return (sep or ' ').join(map(str.capitalize, s.split(sep)))

# Version 2: look up 'capitalize' on each word through methodcaller.
def capwords2(s: str, sep: str = None) -> str:
    return (sep or ' ').join(map(methodcaller('capitalize'), s.split(sep)))

Now let's compare them with the standard implementation. I ran the test in IPython:

text = "your time is limted, so don't waste it living someone else's lives" * 10000
%timeit capwords(text)
24.9 ms ± 588 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit capwords1(text)
22.1 ms ± 721 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit capwords2(text)
28.4 ms ± 3.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

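In this test, the first version appears slightly faster than the standard implementation (22.1 ms versus 24.9 ms), while the second is the slowest (28.4 ms) and also shows the largest run-to-run spread (±3.38 ms).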
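
Outside IPython, a similar comparison can be sketched with the timeit module directly. The input here is scaled down (repeated 100 times rather than 10000) and the number and repeat values are arbitrary choices to keep the run short, so the figures will not match the ones above.

```python
import timeit
from operator import methodcaller
from string import capwords

def capwords1(s, sep=None):
    return (sep or ' ').join(map(str.capitalize, s.split(sep)))

def capwords2(s, sep=None):
    return (sep or ' ').join(map(methodcaller('capitalize'), s.split(sep)))

# Smaller input than in the article, an assumption made to keep this quick.
text = "your time is limted, so don't waste it living someone else's lives" * 100

for func in (capwords, capwords1, capwords2):
    # repeat() returns one total per run; the minimum is the least noisy figure.
    best = min(timeit.repeat(lambda: func(text), number=20, repeat=5))
    print(f'{func.__name__}: {best / 20 * 1e3:.3f} ms per call')
```

Taking the minimum over several repeats is a common way to reduce the influence of background activity on the measurement.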

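Template

Next is the basic usage of Template, the string-interpolation class provided by the string module. It converts the values passed in to strings and then substitutes them, so it does not support format specifications, but the upside is that it is safer.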
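
One way to see the safety difference is to compare what each mechanism allows a template string to reach. In the sketch below, Settings is a hypothetical class invented for illustration: a str.format template can read attributes of the objects passed in, while Template only replaces whole identifiers with their string form.

```python
from string import Template

class Settings:
    """Hypothetical object holding a value that should not leak."""
    password = 'hunter2'

settings = Settings()

# A format string can reach into attributes of the objects passed to it.
leaked = '{s.password}'.format(s=settings)

# Template substitutes the identifier 's' only; '.password' stays literal text.
plain = Template('$s.password').substitute(s='settings')

print(leaked)  # hunter2
print(plain)   # settings.password
```

This matters mainly when the template text itself comes from an untrusted source.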

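First create a template from a string. Placeholders in the string take the form $ followed by an identifier (the first character must be a letter or an underscore, and the remaining characters may only be letters, underscores, or digits). Calling the substitute method then replaces the placeholders.

Recognized patterns: $$ (a literal $), $name, ${name}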

from string import Template

# Labels: 姓名 = name, 年齡 = age, 愛好 = hobby
string = '姓名：$name 年齡：$age 愛好：$hobby'
template = Template(string)

The argument to substitute can be a dictionary:

>>> template.substitute({'name': 'python', 'age': 30, 'hobby': 'all'})
'姓名：python 年齡：30 愛好：all'

It can also be keyword arguments:

>>> template.substitute(name='python', age=30, hobby='all')
'姓名：python 年齡：30 愛好：all'
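
The three placeholder forms that Template recognizes can be exercised in a single string. The template text below is made up for illustration: $$ produces a literal dollar sign, and the braced form is needed when the identifier is immediately followed by characters that could be part of a name.

```python
from string import Template

# '$$5' -> literal '$5'; '${item}s' -> value of item followed by 's'.
t = Template('$$5 for ${item}s, billed to $name')
result = t.substitute(item='apple', name='python')

print(result)  # $5 for apples, billed to python
```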

If a keyword is misspelled, substitute raises KeyError:

>>> template.substitute(name='python', age=20, hobb='all')
KeyError: 'hobby'

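In that case we can use another method provided by Template, safe_substitute, to avoid the exception. When safe_substitute cannot find a matching key, it leaves the placeholder in the result unchanged.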

>>> template.safe_substitute(name='python', age=30, hobb='all')
'姓名：python 年齡：30 愛好：$hobby'
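
When the strict behavior is wanted but the program should still recover, the KeyError can be caught and the missing name read from the exception. This is a minimal sketch with a made-up template string, shown next to the safe_substitute result for comparison.

```python
from string import Template

t = Template('$name likes $hobby')

try:
    t.substitute(name='python')
except KeyError as exc:
    # The missing identifier is the exception's first argument.
    missing = exc.args[0]

# safe_substitute leaves the unknown placeholder in place instead.
partial = t.safe_substitute(name='python')

print(missing)  # hobby
print(partial)  # python likes $hobby
```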

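Template has four class attributes. delimiter is the delimiter, '$' by default, which is followed by the identifier; by overriding delimiter we can substitute on other symbols such as %. idpattern is the rule that identifiers must match, braceidpattern is the rule for identifiers inside braces (None means the same rule as idpattern), and flags holds the regular-expression flags, re.IGNORECASE by default, so matching ignores case.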

class Template(metaclass=_TemplateMetaclass):
    """A string class for supporting $-substitutions."""

    delimiter = '$'
    idpattern = r'(?a:[_a-z][_a-z0-9]*)'
    braceidpattern = None
    flags = _re.IGNORECASE
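
The effect of the default flags value can be checked with a small experiment. Although idpattern lists only lowercase letters, the sketch below (with made-up template strings) suggests that uppercase identifiers are accepted too, both in the plain and in the braced form.

```python
from string import Template

# idpattern names only a-z, but flags = re.IGNORECASE also admits A-Z.
upper = Template('Hello, $NAME').substitute(NAME='python')

# braceidpattern is None by default, so ${...} follows the same rule.
braced = Template('Hello, ${NAME}!').substitute(NAME='python')

print(Template.delimiter)  # $
print(upper)               # Hello, python
print(braced)              # Hello, python!
```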

For example, we can override the class attributes delimiter and idpattern.

class MyTemplate(Template):
    delimiter = '%'
    idpattern = '[_][a-z]+_[a-z]+'

Above we defined a custom class that inherits from string.Template and overrides the delimiter and idpattern class attributes.

>>> s = '%_name_main %age'
>>> template = MyTemplate(s)
>>> template.substitute(_name_main='python', age=30)
ValueError: Invalid placeholder in string
>>> template.safe_substitute(_name_main='python', age=30)
'python %age'

As we can see, the delimiter is now the percent sign, and an identifier must have the form _letters_letters; otherwise substitute raises ValueError.
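
Overriding only the delimiter is often enough. In this sketch, PercentTemplate is a hypothetical subclass that keeps the default idpattern, so ordinary names work, a doubled delimiter yields a literal percent sign, and dollar signs become plain text.

```python
from string import Template

class PercentTemplate(Template):
    """Hypothetical subclass: only the delimiter changes."""
    delimiter = '%'

first = PercentTemplate('%name is %age; 100%% sure').substitute(name='python', age=30)

# '$' is no longer special, so it passes through untouched.
second = PercentTemplate('%name costs $5').substitute(name='tea')

print(first)   # python is 30; 100% sure
print(second)  # tea costs $5
```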

We can also learn a few techniques from the source code:

from collections import ChainMap as _ChainMap

def substitute(*args, **kws):
    ...
    mapping = _ChainMap(kws, args[0])
    ...

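*args receives a dictionary and kws receives the keyword arguments. ChainMap links several mappings together, so a key can be looked up in both args and kws. In CPython's implementation kws is listed first, so a keyword argument takes precedence over a dictionary entry with the same name.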
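
The lookup order of ChainMap, and its visible effect on substitute, can be sketched as follows. The dictionaries and template text are made up for illustration.

```python
from collections import ChainMap
from string import Template

defaults = {'name': 'python', 'age': 30}
overrides = {'age': 31}

# Lookups try the first mapping, then fall back to the next one.
merged = ChainMap(overrides, defaults)
print(merged['age'], merged['name'])  # 31 python

# Passing a dict and a keyword together: the keyword wins on conflict.
result = Template('$name is $age').substitute(defaults, age=31)
print(result)  # python is 31
```

ChainMap does not copy the underlying dictionaries, which makes it a cheap way to layer keyword overrides on top of a base mapping.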

Those are my thoughts on studying the Python standard library. I hope you found them useful, and your support is much appreciated.
