NLP code: checking text coverage

import pandas as pd
from tqdm import tqdm
import operator

# Load every token in the vocabulary (gettokendict is defined further down)
dict_path = '../bertmodel/vocab.txt'
token_dict = gettokendict(dict_path)
# Get the sentences
Related functions
# Build a frequency dictionary from the text
def build_vocab(sentences):
    '''
    sentences: [sentence]
    sentence: "w1w2w3w4,w5w6,w7."
    return: character frequencies of the text
    '''
    vocab = {}  # key is word, value is frequency
    for sentence in tqdm(sentences):
        for word in sentence:
            try:
                vocab[word] += 1
            except KeyError:
                vocab[word] = 1
    return vocab
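The same character counting can be sketched with `collections.Counter` from the standard library. The function name `build_vocab_counter` and the sample sentences are assumptions made for illustration; they are not part of the original code.

```python
from collections import Counter


def build_vocab_counter(sentences):
    """Hypothetical alternative to build_vocab: count characters with Counter."""
    vocab = Counter()
    for sentence in sentences:
        # Iterating over a str yields one character at a time.
        vocab.update(sentence)
    return dict(vocab)


# Made-up sample data, only for illustration.
sample_sentences = ["abca", "ab,"]
sample_vocab = build_vocab_counter(sample_sentences)
print(sample_vocab)
```

Punctuation is counted like any other character, so a comma missing from the vocabulary file will also show up as uncovered.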
def check_coverage(vocab, embeddings_index):
    '''
    Measure how much of the text the dictionary covers.
    return: uncovered characters with their frequencies, most frequent first
    '''
    iv = {}   # in vocab
    oov = {}  # out of vocab
    k = 0     # total frequency of covered characters
    i = 0     # total frequency of uncovered characters
    for word in tqdm(vocab):
        try:
            # the word from the text is present in the embedding
            iv[word] = embeddings_index[word]
            k += vocab[word]
        except KeyError:
            oov[word] = vocab[word]
            i += vocab[word]
    print('found embeddings for {:.2%} of vocab'.format(len(iv) / len(vocab)))
    print('found embeddings for {:.2%} of all text'.format(k / (k + i)))
    sorted_x = sorted(oov.items(), key=operator.itemgetter(1))[::-1]
    return sorted_x
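The two ratios that check_coverage prints measure different things: the share of distinct characters that are covered, and the share of all character occurrences that are covered. The sketch below is a minimal illustration of that difference; the helper `coverage_ratios` and the toy data are assumptions, not part of the original code.

```python
def coverage_ratios(vocab, embeddings_index):
    """Hypothetical helper: return (vocab coverage, text coverage)."""
    # Share of distinct characters that have an entry.
    covered_types = sum(1 for w in vocab if w in embeddings_index)
    # Share of character occurrences that have an entry.
    covered_freq = sum(f for w, f in vocab.items() if w in embeddings_index)
    total_freq = sum(vocab.values())
    return covered_types / len(vocab), covered_freq / total_freq


# Toy data: "c" appears once in the text and is missing from the index.
toy_vocab = {"a": 3, "b": 2, "c": 1, ",": 1}
toy_index = {"a": 0, "b": 1, ",": 2}

vocab_cov, text_cov = coverage_ratios(toy_vocab, toy_index)
print("{:.2%} of vocab, {:.2%} of all text".format(vocab_cov, text_cov))
# prints "75.00% of vocab, 85.71% of all text"
```

Text coverage is usually the more useful number, because a rare uncovered character lowers vocab coverage much more than it affects the text a model actually sees.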
def gettokendict(dict_path, encoding='utf-8'):
    '''
    dict_path: dictionary file, one token per line.
    '''
    token_dict = {}
    with open(dict_path, encoding=encoding) as reader:
        for line in reader:
            token = line.strip()
            token_dict[token] = len(token_dict)
    return token_dict
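To see what the loaded dictionary looks like, the sketch below writes a three-line vocabulary file to a temporary directory and reads it back. The name `load_token_dict` and the file contents are assumptions for illustration; a real BERT `vocab.txt` is much larger.

```python
import os
import tempfile


def load_token_dict(path, encoding="utf-8"):
    """Hypothetical stand-in for gettokendict: map each token to its line index."""
    token_dict = {}
    with open(path, encoding=encoding) as reader:
        for line in reader:
            token = line.strip()
            token_dict[token] = len(token_dict)
    return token_dict


with tempfile.TemporaryDirectory() as tmp_dir:
    vocab_path = os.path.join(tmp_dir, "vocab.txt")
    # Made-up vocabulary with one token per line.
    with open(vocab_path, "w", encoding="utf-8") as f:
        f.write("[PAD]\n[UNK]\nthe\n")
    demo_dict = load_token_dict(vocab_path)

print(demo_dict)
```

Because the index is `len(token_dict)` at the time a token is read, it matches the line number only when the file has no duplicate or blank lines.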
import re

def clean_numbers(x):
    '''
    Replace runs of digits with '#' placeholders.
    '''
    x = re.sub('[0-9]{5,}', '#####', x)
    x = re.sub('[0-9]{4}', '####', x)
    x = re.sub('[0-9]{3}', '###', x)
    x = re.sub('[0-9]{2}', '##', x)
    return x
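A single-pass sketch of the same idea: assuming the intent is to mask each run of two or more digits with as many `#` characters as it has digits, capped at five, one `re.sub` call with a replacement function is enough. The name `mask_digit_runs` and the sample strings are assumptions for illustration.

```python
import re


def mask_digit_runs(text):
    """Hypothetical one-pass variant: mask runs of 2+ digits with up to five '#'."""
    return re.sub(
        r"[0-9]{2,}",
        lambda match: "#" * min(len(match.group()), 5),
        text,
    )


masked = mask_digit_runs("call 12345678 at 2019, room 7")
print(masked)  # prints "call ##### at ####, room 7"
```

Single digits are left unchanged, so they still have to be present in the vocabulary to count as covered.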