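You could first split on "/" and mark the special characters, saving the result in a temporary variable such as a tuple, a list, or a dict. Then iterate over that variable and split out the parentheses, using a special marker for the content inside them; all that matters is being able to tell parenthesized content from the rest. Save that to a second variable, and finally iterate over the second variable to generate the phrases.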
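The first two steps above can be sketched roughly as follows. This is only an illustration under assumptions: the function name `parse_slots` and the "list of slots, each a list of choices" representation are made up here, with an empty string standing for "leave the optional part out".

```python
import re


def parse_slots(phrase):
    """Turn a phrase pattern into a list of slots; each slot is a list of choices."""
    slots = []
    # The capturing group makes re.split keep the parenthesized parts.
    for part in re.split(r"(\([^()]*\))", phrase):
        if part.startswith("(") and part.endswith(")"):
            # Optional group: the empty string means "leave it out".
            slots.append([""] + part[1:-1].split("/"))
        else:
            # Plain text: every word is a slot, "/" separates its alternatives.
            slots.extend(word.split("/") for word in part.split())
    return slots


print(parse_slots("quarrel (with sb) about/for/over"))
# [['quarrel'], ['', 'with sb'], ['about', 'for', 'over']]
```

Once the line is in this shape, the last step is just a matter of picking one choice from every slot.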
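Sorry, I have not been in great shape lately and have also been busy. I wrote a rough version today and it should work. There is still the question of the generation order, which I will look at when I have time; if it needs changing, the place to do it is probably the bottom_fuc function and the logic that calls it. Another approach would be to build a list for each phrase pattern and join everything at the end, but I think that would use more memory, so I did not go that way. The code, pasted as is: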
import logging
import re

f = open("./phasesplit")
line_true = f.readline()
list_all = []
list_size = 0
i = 0


# Join the two arguments in every combination
# inner_list: the strings to append
# org_str_list: the strings built so far
def bottom_fuc(inner_list, org_str_list):
    inner_new_str_list = []
    for s in inner_list:
        st = str(s)
        for s1 in org_str_list:
            st1 = str(s1)
            inner_new_str_list.append(st1 + " " + st)
    return inner_new_str_list


# Main loop
while line_true:
    # Skip blank lines
    if not line_true.strip():
        line_true = f.readline()
        continue
    # Content after the semicolon
    semi_str = ""
    # Content before the semicolon
    line = line_true.strip()
    # The number of semicolons could be checked here; only the first one is used
    if ";" in line:
        line, semi_str = line.split(";", 1)
        semi_str = semi_str.strip()
        line = line.strip()
    # Split out the parenthesized groups, keeping them in the result
    list_for_loop = re.split(r"(\(.+?\))", line)
    list_for_loop_new = []
    # Break the parts without parentheses into words
    for lp in list_for_loop:
        if "(" not in lp and ")" not in lp:
            list_for_loop_new.extend(lp.split())
        else:
            list_for_loop_new.append(lp)
    list_str = []
    # Split each token further
    for s in list_for_loop_new:
        str_tmp = s
        # Remove the parentheses and add an empty first alternative,
        # because a parenthesized group is optional
        if "(" in str_tmp or ")" in str_tmp:
            str_tmp = "/" + str_tmp.strip("()")
        # Split on "/": a list of alternatives, otherwise a plain word
        if "/" in str_tmp:
            list_str.append(str_tmp.split("/"))
        else:
            list_str.append(str_tmp)
    new_str_list = []
    # Assemble the split data
    for l_str in list_str:
        if isinstance(l_str, str):
            if len(new_str_list) == 0:
                new_str_list.append(l_str)
            else:
                for ind, ns in enumerate(new_str_list):
                    new_str_list[ind] = ns + " " + l_str
        elif isinstance(l_str, list):
            if len(new_str_list) == 0:
                new_str_list = list(l_str)
            else:
                new_str_list = bottom_fuc(l_str, new_str_list)
        else:
            logging.error("Unexpected type: %s %s", type(l_str), l_str)
            raise SystemExit(-1)
    # Formatting: collapse repeated spaces and re-attach the semicolon part
    for ind, ns in enumerate(new_str_list):
        cleaned = re.sub(" +", " ", ns.strip())
        if len(semi_str) > 0:
            new_str_list[ind] = cleaned + ";" + semi_str
        else:
            new_str_list[ind] = cleaned
    # Put the original line first
    if len(semi_str) > 0:
        new_str_list.insert(0, line + ";" + semi_str)
    else:
        new_str_list.insert(0, line)
    # Add to the overall list
    list_all.append(new_str_list)
    i += 1
    # Read the next line
    line_true = f.readline()
list_size = i
f.close()
# Write the output file
with open("result.txt", "w") as nf:
    nf.write("#############################################\n")
    nf.write("#section:{}\n".format(list_size))
    nf.write("#############################################\n")
    for la in list_all:
        for nl in la:
            nf.write(nl + "\n")
        nf.write("\n")
        nf.write("#############################################\n")
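For comparison, the alternative mentioned earlier (one list of choices per position, joined at the end) might look something like the sketch below. It is not the code used above: the `expand` helper and the hard-coded `slots` value are hypothetical, and it leans on `itertools.product`, which yields one combination at a time rather than holding every phrase in memory.

```python
from itertools import product


def expand(slots):
    """Yield every phrase obtainable by picking one choice per slot."""
    # product() varies its last argument fastest. Reversing the slots, and
    # then reversing each pick back, makes the leftmost slot vary fastest,
    # which is the order shown in the sample output.
    for picks in product(*reversed(slots)):
        words = [w for w in reversed(picks) if w]  # drop skipped optional parts
        yield " ".join(words)


# Hypothetical pre-parsed form of "quarrel (with sb) about/for/over";
# the empty string stands for leaving the optional group out.
slots = [["quarrel"], ["", "with sb"], ["about", "for", "over"]]
for phrase in expand(slots):
    print(phrase)
```

Changing the generation order then only means changing how the slots are handed to `product`, for example dropping both `reversed` calls to make the rightmost slot vary fastest.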
Input file (phasesplit)
quarrel (with sb) about/for/over ; 2313
dabble at/in/with
(sb/sth) damn and blast (sb/sth)
dance on/upon a rope/nothing
dance on (the) air
dead/flat/stark calm
do/go/make the/one's round
do (sb/sth) grace
Output file (result.txt)
#section:8
quarrel (with sb) about/for/over;2313
quarrel about;2313
quarrel with sb about;2313
quarrel for;2313
quarrel with sb for;2313
quarrel over;2313
quarrel with sb over;2313
dabble at/in/with
dabble at
dabble in
dabble with
(sb/sth) damn and blast (sb/sth)
damn and blast
sb damn and blast
sth damn and blast
damn and blast sb
sb damn and blast sb
sth damn and blast sb
damn and blast sth
sb damn and blast sth
sth damn and blast sth
dance on/upon a rope/nothing
dance on a rope
dance upon a rope
dance on a nothing
dance upon a nothing
dance on (the) air
dance on air
dance on the air
dead/flat/stark calm
dead calm
flat calm
stark calm
do/go/make the/one's round
do the round
go the round
make the round
do one's round
go one's round
make one's round
do (sb/sth) grace
do grace
do sb grace
do sth grace