Hamlet word frequency count

part2 code

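#10.1calhamlet.py
def gettext():
    # Read the whole file; the with-block closes it afterwards
    with open("hamlet.txt", "r") as f:
        txt = f.read()
    # Convert all English letters in the text to lowercase
    txt = txt.lower()
    return txt

hamlettxt = gettext()
# Split on whitespace; punctuation stays attached to the words
words = hamlettxt.split()
# Use a dict to store each word and the number of times it occurs
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1
# Convert the dict into a list of (word, count) records
items = list(counts.items())
# Sort by the second value of each item (the count), in descending order
items.sort(key=lambda x: x[1], reverse=True)
for i in range(10):
    word, count = items[i]
    # Word left-aligned in 10 columns; count right-aligned in 5 columns; fill character is a space
    print("{0:<10}{1:>5}".format(word, count))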
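
The alignment described in the last comment (left-aligned in 10 columns, right-aligned in 5 columns, padded with spaces) corresponds to the format specification `{0:<10}{1:>5}`. The following is a minimal sketch for trying it in isolation; `format_row` is a hypothetical helper name that does not appear in the script, and the sample pairs are for illustration only.

```python
def format_row(word, count):
    # "<10" left-aligns the word in a 10-character field;
    # ">5" right-aligns the count in a 5-character field.
    # When no fill character is given, a space is used.
    return "{0:<10}{1:>5}".format(word, count)


for word, count in [("the", 70), ("bernardo", 22)]:
    print(format_row(word, count))
```

Each row is always 15 characters wide as long as the word has at most 10 characters and the count at most 5 digits; longer values are not truncated, they simply push the row wider.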

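Notes:

1. When opening hamlet.txt, put the path where the file is actually saved in front of the file name (unless the file is in the current working directory).

2. The dict method counts.get(word,0) means: if word is in counts, return the value stored for word; if word is not in counts, return 0.

counts[word] = counts.get(word,0) + 1 is equivalent to:

if word in counts:
    counts[word] = counts[word] + 1
else:
    counts[word] = 1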
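
The equivalence stated in note 2 can be checked on a small input. This is only an illustrative sketch: the sample sentence and the names `counts_get` and `counts_if` are made up for the comparison and are not part of the original script.

```python
sample = "to be or not to be".split()

# Version 1: dict.get with a default of 0 for unseen words
counts_get = {}
for word in sample:
    counts_get[word] = counts_get.get(word, 0) + 1

# Version 2: explicit membership test before updating
counts_if = {}
for word in sample:
    if word in counts_if:
        counts_if[word] = counts_if[word] + 1
    else:
        counts_if[word] = 1

# Both loops build the same dictionary
print(counts_get == counts_if)
print(counts_get)
```

The `get` form does the same work in one line, which is why the script uses it.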

part3 the result

the          70
and          55
of           47
to           37
our          24
it           23
bernardo     22
this         22
in           22
horatio      20
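
For comparison, a ranking of this kind can also be produced with `collections.Counter` from the standard library. This is not what the script above uses; it is a minimal sketch under the assumption that the text is already held in a string. The function name `top_words` and the sample sentence are hypothetical.

```python
from collections import Counter


def top_words(text, n=10):
    # Lowercase and split on whitespace, the same preprocessing as the script
    words = text.lower().split()
    # most_common(n) returns the n (word, count) pairs with the highest counts
    return Counter(words).most_common(n)


for word, count in top_words("The ghost and the king and the guard", n=2):
    print("{0:<10}{1:>5}".format(word, count))
```

`Counter` replaces both the manual counting loop and the sort, at the cost of hiding the dictionary logic that the notes explain.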
