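I often need to turn zhangwei into forms such as ZhangWei, zw, or zhangw, which calls for a pinyin-splitting algorithm. Here is a demo I wrote to share with everyone.
My idea is to first convert the initials (shengmu) to uppercase; individual syllables can then be split at the uppercase letters.
Conversion code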
def sm(strs):
    smlist = 'bpmfdtnlgkhjqxrzcsyw'
    for s in smlist:
        strs = strs.replace(s,s.upper())
    return strs
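Then a problem shows up: the finals (yunmu) also contain letters that are initials, so zhangwei becomes ZHaNGWei.
There are two issues. One is that zh, ch, and sh contain the initial h; the other is that finals such as er, an, en, in, un, vn, ang, eng, ing, and ong contain the initials r, n, and g.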
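The effect of the naive pass is easy to reproduce. The following is a minimal sketch rather than the author's code: the helper name `upper_initials` is made up for illustration, and it simply uppercases every letter that appears in the initials table.

```python
# Letters that can act as initials (same table as in the article).
INITIALS = set('bpmfdtnlgkhjqxrzcsyw')

def upper_initials(text):
    """Uppercase every letter found in the initials table (hypothetical helper)."""
    return ''.join(ch.upper() if ch in INITIALS else ch for ch in text)

print(upper_initials('zhangwei'))     # ZHaNGWei
print(upper_initials('chenguiying'))  # CHeNGuiYiNG
```

In both outputs the h of zh/ch and the n and g inside the finals are capitalized even though they do not start a syllable.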
So I added another conversion step.
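def sm(strs):
    smlist = 'bpmfdtnlgkhjqxrzcsyw'
    # Fragments of finals left over after the uppercase pass; restored to lowercase
    nosm = ['eR','aN','eN','iN','uN','vN','nG','NG']
    # zh/ch/sh are single initials, so the h goes back to lowercase
    rep = {'ZH':'Zh','CH':'Ch','SH':'Sh'}
    for s in smlist:
        strs = strs.replace(s,s.upper())
    for s in nosm:
        strs = strs.replace(s,s.lower())
    for s in rep.keys():
        strs = strs.replace(s,rep[s])
    return strs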
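At this point zhangwei converts to ZhangWei.
During batch conversion another problem appeared: a name like chenguiying (陳桂英) is converted to ChenguiYing. This is because r, n, and g can either end a final or start a syllable. So I go through the nosm list once more: whenever one of these fragments is found, look at the character that follows it and test whether it is in the initials table. If it is not, the last letter of the fragment is itself an initial and is converted back to uppercase.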
def sm(strs):
    smlist = 'bpmfdtnlgkhjqxrzcsyw'
    nosm = ['eR','aN','eN','iN','uN','vN','nG','NG']
    rep = {'ZH':'Zh','CH':'Ch','SH':'Sh'}
    for s in smlist:
        strs = strs.replace(s,s.upper())
    for s in nosm:
        strs = strs.replace(s,s.lower())
    for s in rep.keys():
        strs = strs.replace(s,rep[s])
    for s in nosm:
        tmp_num = 0
        isok = False
        while (tmp_num < len(strs)) and (isok==False):
            try:
                tmp_num = strs.index(s.lower(),tmp_num)
            except ValueError:
                # No more occurrences of this fragment
                isok = True
            else:
                tmp_num = tmp_num + len(s)
                # Next character is not an initial, so the fragment's last letter starts a syllable
                if strs[tmp_num:tmp_num+1].lower() not in smlist:
                    strs = strs[:tmp_num-1]+strs[tmp_num-1:tmp_num].upper()+strs[tmp_num:]
    return strs
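Now the initials can be extracted, and the rest is simple: every uppercase letter marks the start of a syllable, and the abbreviation (jianpin) consists of just the uppercase letters.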
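Splitting
def onep(strs):
    restr = ''
    strs = sm(strs)
    for s in strs:
        # An uppercase letter starts a new syllable
        if 'A' <= s and s <= 'Z':
            restr = restr + ' ' + s
        else:
            restr = restr + s
    # Drop the leading space without losing a leading vowel
    restr = restr.strip()
    restr = restr.lower()
    return restr.split(' ')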
For chenguiying this returns ['chen','gui','ying']
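The same rule can also be written as a single left-to-right scan instead of repeated string replacement. The block below is a sketch under that assumption, not the author's implementation; `is_syllable_start` and `split_pinyin` are hypothetical names. A letter starts a syllable if it is in the initials table, unless it is the h of zh/ch/sh, or an n, g, or r that closes a final and is not followed by a vowel.

```python
INITIALS = set('bpmfdtnlgkhjqxrzcsyw')
VOWELS = set('aeiouv')

def is_syllable_start(text, i):
    """Return True if text[i] begins a new syllable (hypothetical helper)."""
    ch = text[i]
    if ch not in INITIALS:
        return False
    prev = text[i - 1] if i > 0 else ''
    nxt = text[i + 1] if i + 1 < len(text) else ''
    # h directly after z, c, or s belongs to the initial zh/ch/sh.
    if ch == 'h' and prev in ('z', 'c', 's'):
        return False
    followed_by_vowel = nxt in VOWELS
    # n after a vowel, g after n, and r after e close a final
    # unless a vowel follows, in which case they open the next syllable.
    if ch == 'n' and prev in VOWELS and not followed_by_vowel:
        return False
    if ch == 'g' and prev == 'n' and not followed_by_vowel:
        return False
    if ch == 'r' and prev == 'e' and not followed_by_vowel:
        return False
    return True

def split_pinyin(text):
    """Split a lowercase pinyin string into syllables."""
    if not text:
        return []
    starts = [i for i in range(len(text)) if is_syllable_start(text, i)]
    # A string that begins with a vowel still starts a syllable at index 0.
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    return [text[a:b] for a, b in zip(starts, starts[1:] + [len(text)])]

print(split_pinyin('chenguiying'))  # ['chen', 'gui', 'ying']
print(split_pinyin('zhangwei'))     # ['zhang', 'wei']
```

This sketch cannot resolve strings that are genuinely ambiguous without a dictionary: fangan, for example, is always split as fan + gan, never fang + an.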
Extracting the abbreviation
def simplep(strs):
    restr = ''
    strs = sm(strs)
    for s in strs:
        # Keep only the uppercase letters, i.e. the syllable starts
        if 'A' <= s and s <= 'Z':
            restr = restr + s
    restr = restr.lower()
    return restr
Returns cgy
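Once a name is available as a list of syllables, the forms mentioned at the start (ZhangWei, zw, zhangw) follow directly. This is an illustrative sketch: `name_variants` and its dictionary keys are invented here, and reading zhangw as "first syllable plus the first letters of the remaining syllables" is an assumption.

```python
def name_variants(syllables):
    """Build common spellings from a list of non-empty pinyin syllables
    (hypothetical helper, not part of the original demo)."""
    if not syllables:
        return {}
    first_letters = [s[0] for s in syllables]
    return {
        'full': ''.join(syllables),
        'capitalized': ''.join(s.capitalize() for s in syllables),
        'initials': ''.join(first_letters),
        # Assumed meaning of "zhangw": full first syllable + remaining first letters
        'first_plus_initials': syllables[0] + ''.join(first_letters[1:]),
    }

variants = name_variants(['zhang', 'wei'])
print(variants['capitalized'])          # ZhangWei
print(variants['initials'])             # zw
print(variants['first_plus_initials'])  # zhangw
```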
After that there is a lot you can do with it.
Attached is a script that generates a weak-password dictionary from pinyin.
#!/usr/bin/python
# author : wkong
# crack
def clearchar(chars):
    # Strip whitespace characters from a line read from the user file
    restr = ['\n','\r','\t',' ']
    for res in restr:
        chars = chars.replace(res, '')
    return chars
def sm(strs):
    smlist = 'bpmfdtnlgkhjqxrzcsyw'
    nosm = ['eR','aN','eN','iN','uN','vN','nG','NG']
    rep = {'ZH':'Zh','CH':'Ch','SH':'Sh'}
    for s in smlist:
        strs = strs.replace(s,s.upper())
    for s in nosm:
        strs = strs.replace(s,s.lower())
    for s in rep.keys():
        strs = strs.replace(s,rep[s])
    for s in nosm:
        tmp_num = 0
        isok = False
        while (tmp_num < len(strs)) and (isok==False):
            try:
                tmp_num = strs.index(s.lower(),tmp_num)
            except ValueError:
                isok = True
            else:
                tmp_num = tmp_num + len(s)
                if strs[tmp_num:tmp_num+1].lower() not in smlist:
                    strs = strs[:tmp_num-1]+strs[tmp_num-1:tmp_num].upper()+strs[tmp_num:]
    return strs
def simplep(strs):
    restr = ''
    strs = sm(strs)
    for s in strs:
        if 'A' <= s and s <= 'Z':
            restr = restr + s
    restr = restr.lower()
    restr = restr.capitalize()
    return restr
def repass(name):
    # The contents of ulist and the inner loop body are missing from the source;
    # the two lines marked "assumed" are only a guess at the intent.
    ulist = [name, simplep(name)]  # assumed
    pwdlist = []
    ce = ['!@#123','123!@#','@123','@1234','@12345','@123456','123','1234','12345','123456','123.','1234.','12345.','123456.','123123','abc','abc@123','qwer!@#','!@#qwer','qwe!@#','!@#qwe','!qaz2wsx','1q2w3e']
    for s in ce:
        for u in ulist:
            pwdlist.append(u + s)  # assumed
    return pwdlist
def autocrack(username, password):
    print(username+':'+password)
if __name__ == '__main__':
    userfile = 'zhangwei.txt'
    with open(userfile, 'r') as puserfile:
        userlist = puserfile.readlines()
    for user in userlist:
        user = clearchar(user)
        pwd = repass(user)
        for pw in pwd:
            autocrack(user, pw)