An auto-completing prefix tree (trie) implemented in Python

1. The code:

import json
import sys


class trietree:
    # Keys that hold node metadata; every other key is a child node.
    META_KEYS = ("##terminal", "##cnt", "##wordtag")

    def __init__(self, is_debug=1, is_sentence=0):
        self.tree = {}
        self.is_debug = is_debug
        self.is_sentence = is_sentence
        self.prefix_list = []

    def addfromfile(self, filepath):
        # Each line is "#"-separated; the first field is the key path,
        # split on whitespace into the units stored along the tree.
        with open(filepath, encoding="utf-8") as f:
            for line in f:
                line_list = line.strip().strip("#").split("#")
                main_word = line_list[0].strip().split()
                if not self.is_sentence:
                    sub_word_list = [u.replace(" ", "") for u in line_list]
                else:
                    sub_word_list = line_list
                target_dict = self.tree
                for i, w in enumerate(main_word):
                    if w not in target_dict:
                        target_dict[w] = {"##cnt": 1, "##terminal": [], "##wordtag": 0}
                    else:
                        target_dict[w]["##cnt"] += 1
                    if i == len(main_word) - 1:
                        # Last unit: store the completions and mark the node.
                        target_dict[w]["##terminal"].extend(sub_word_list)
                        target_dict[w]["##wordtag"] = 1
                    target_dict = target_dict[w]
        if self.is_debug:
            # Dump the whole tree so its structure can be inspected.
            with open("./debug.json", "w", encoding="utf-8") as out:
                out.write(json.dumps(self.tree, indent=2, ensure_ascii=False) + "\n")

    def searchprefix(self, prefix_string):
        self.prefix_list = []
        target_dict = self.tree
        if not self.tree:
            return self.prefix_list
        if self.is_sentence:
            # Split the same way addfromfile does, so repeated spaces are harmless.
            prefix_string = prefix_string.strip().split()
        for w in prefix_string:
            if w not in target_dict:
                return self.prefix_list
            target_dict = target_dict[w]

        def deepsearch(node):
            # Collect the entries stored at this node, then visit its children.
            self.prefix_list.extend(node.get("##terminal", []))
            for k, child in node.items():
                if k not in self.META_KEYS:
                    deepsearch(child)

        deepsearch(target_dict)
        return self.prefix_list


if __name__ == "__main__":
    trie = trietree(is_debug=1, is_sentence=1)
    trie.addfromfile(sys.argv[1])
    while True:
        raw = input("please input:")
        print(trie.searchprefix(raw))
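
To make the node layout easier to picture, here is a minimal, self-contained sketch that builds the same kind of nested dictionary in memory for two sentences. The metadata keys ("##cnt", "##terminal", "##wordtag") are the ones used by the code above; the helper name insert_sentence is hypothetical and exists only for this illustration.

```python
import json


def insert_sentence(tree, sentence):
    """Insert one sentence word by word (hypothetical helper, for illustration)."""
    words = sentence.split()
    node = tree
    for i, word in enumerate(words):
        if word not in node:
            node[word] = {"##cnt": 1, "##terminal": [], "##wordtag": 0}
        else:
            # The word is already on this path: count one more pass through it.
            node[word]["##cnt"] += 1
        if i == len(words) - 1:
            # Last word: remember the full sentence and flag the node.
            node[word]["##terminal"].append(sentence)
            node[word]["##wordtag"] = 1
        node = node[word]
    return tree


tree = {}
for s in ["i work at a bank.", "i work at a restaurant."]:
    insert_sentence(tree, s)

# Print the nested structure; the shared prefix "i work at a" appears once.
print(json.dumps(tree, indent=2, ensure_ascii=False))
```

The two sentences share the nodes for "i", "work", "at" and "a", each with "##cnt" equal to 2, and only the final nodes "bank." and "restaurant." carry a non-empty "##terminal" list.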

2. Test data: paste the English sentences below into a file named sent.d.

hi, my name is steve.#

it's nice to meet you.#

it's a pleasure to meet you i'm jack.#

what do you do for a living.#

i work at a restaurant.#

i work at a bank.#

i work in a software company.#

i'm a dentist.#

what is your name.#

what was that again.#

excuse me.#

pardon me.#

are you ready?#

are you free now?#

are you mr. murthy?#

are you angry with me?#

are you afraid of them?#

are you tired?#

are you married?#

are you employed?#

are you interested in that?#

are you awake?#

are you aware of that?#

are you a relative of mr. mohan?#

are you not well?#

are they your relatives?#

are they from abroad?#

are the shops open?#

are you satisfied now?#

are you joking?#

3. Running the test

Run this in a Linux shell:

python trietree.py sent.d

You can now type a prefix made up of complete words to look up the matching sentences.
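
The "complete words" requirement can be checked without the interactive loop. The sketch below is a simplified stand-in, not the class from section 1: it keeps only the "##terminal" key, and the function names build and complete are made up for this example.

```python
def build(sentences):
    """Build a nested-dict trie keyed by whitespace-separated words."""
    tree = {}
    for sentence in sentences:
        node = tree
        for word in sentence.split():
            node = node.setdefault(word, {"##terminal": []})
        node["##terminal"].append(sentence)
    return tree


def complete(tree, prefix):
    """Return every stored sentence that starts with the words in `prefix`."""
    node = tree
    for word in prefix.split():
        if word not in node:
            return []
        node = node[word]
    # Walk the subtree below the prefix with an explicit stack.
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        found.extend(current.get("##terminal", []))
        stack.extend(v for k, v in current.items() if k != "##terminal")
    return found


sample = [
    "are they your relatives?",
    "are they from abroad?",
    "are the shops open?",
    "excuse me.",
]
tree = build(sample)
print(sorted(complete(tree, "are they")))
# ['are they from abroad?', 'are they your relatives?']
print(complete(tree, "are th"))
# [] because "th" is not a complete word in the tree
```

Because the tree is keyed by whole words, a partial word such as "th" matches nothing even though both "the" and "they" are stored.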

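Note: you may have a question at this point. This algorithm can only search by prefix. Going by the examples in section 2, typing "are" returns only the sentences that start with "are", and typing "are you" returns only the sentences that start with "are you". What if I want every sentence that contains the word "shops"? This is where a "suffix tree" comes in. Despite the name, it is not a separate structure: it simply pushes every suffix of every sentence into one prefix tree. For example: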

are you a lucky dog?

All the suffixes of this sentence are:

are you a lucky dog?

you a lucky dog?

a lucky dog?

lucky dog?

dog?

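Once all the suffixes of every sentence have been pushed into the prefix tree, finding every sentence that contains a given word becomes an ordinary prefix lookup.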
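
The idea can be sketched as follows. This is one possible reading of the description above, not the original author's code: the function names add_with_suffixes and search are assumptions, and each suffix path stores the full sentence so that a match can be reported in its original form.

```python
def add_with_suffixes(tree, sentence):
    """Insert every word-level suffix of `sentence`; each one maps back to the full sentence."""
    words = sentence.split()
    for start in range(len(words)):
        node = tree
        for word in words[start:]:
            node = node.setdefault(word, {"##terminal": []})
        if sentence not in node["##terminal"]:
            node["##terminal"].append(sentence)


def search(tree, query):
    """Return the sentences that contain the words of `query` consecutively."""
    node = tree
    for word in query.split():
        if word not in node:
            return []
        node = node[word]
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for s in current.get("##terminal", []):
            # A sentence with a repeated word is reachable by several paths.
            if s not in found:
                found.append(s)
        stack.extend(v for k, v in current.items() if k != "##terminal")
    return found


tree = {}
for s in ["are the shops open?", "are you a lucky dog?", "excuse me."]:
    add_with_suffixes(tree, s)

print(search(tree, "shops"))       # ['are the shops open?']
print(search(tree, "lucky dog?"))  # ['are you a lucky dog?']
```

Two things to keep in mind with this approach: a sentence of n words adds n paths to the tree, so memory use grows much faster than with plain prefixes, and punctuation stays attached to the words, so "dog" will not match "dog?" unless the text is tokenized more carefully first.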
