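Python Exercise Book, Problem 0004

Count how many times each word occurs in an arbitrary English text file.

Analysis:

This problem is not very hard. Words are usually separated by spaces, but some words are followed by special symbols such as punctuation, so those symbols just need to be replaced before counting.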
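To see why the symbols matter, here is a small sketch (the sample sentence is made up for illustration). A plain split() leaves the punctuation attached, so the same word ends up under several different keys.

```python
# Hypothetical sample text, not part of the exercise.
sample = "Hello, world! Hello world."

tokens = sample.split()
print(tokens)  # ['Hello,', 'world!', 'Hello', 'world.']
```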

Code 1

    import re

    file_name = 'code.txt'

    lines_count = 0
    chars_count = 0
    words_dict = {}

    with open(file_name, 'r') as f:
        for line in f:
            lines_count = lines_count + 1
            chars_count = chars_count + len(line)
            # re.findall returns a list of every substring of the line
            # that matches the pattern (here: runs of non-alphanumerics).
            match = re.findall(r'[^a-zA-Z0-9]+', line)
            for i in match:
                # Keep only English words and drop everything else.
                # Replace with a space rather than '' so that neighbouring
                # words are not glued together (the pattern matches spaces too).
                line = line.replace(i, ' ')
            lines_list = line.split()
            for i in lines_list:
                if i not in words_dict:
                    words_dict[i] = 1
                else:
                    words_dict[i] = words_dict[i] + 1

    # words_count is the number of distinct words.
    print('words_count is', len(words_dict))
    print('lines_count is', lines_count)
    print('chars_count is', chars_count)

    for k, v in words_dict.items():
        print(k, v)

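This code is a bit long-winded; I found it online. The idea is to use a regular expression to find every run of characters that is neither a letter nor a digit and store those runs, then go back over the text and blank each of them out.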

If I may say so, wouldn't it be simpler to just replace them directly?
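A minimal sketch of that direct replacement, assuming re.sub is an acceptable stand-in for the find-then-replace loop. The helper name clean_line and the sample line are made up for illustration. Replacing each run with a single space, rather than with nothing, keeps neighbouring words apart.

```python
import re

def clean_line(line):
    """Replace every run of non-alphanumeric characters with one space."""
    return re.sub(r'[^a-zA-Z0-9]+', ' ', line)

words = clean_line("Hello, world! It's 2018.\n").split()
print(words)  # ['Hello', 'world', 'It', 's', '2018']
```

Note that the apostrophe is treated like any other symbol, so "It's" becomes two tokens; whether that is acceptable depends on how a word is defined for the exercise.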

Code 2:

I wrote this one myself; it is somewhat more concise than Code 1.

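    import re

    # Read the whole file; the with block closes it afterwards.
    with open("code.txt", 'r') as f:
        s = f.read()

    # str.replace takes a literal substring, not a regular expression, and
    # returns a new string, so use re.sub and keep its result.
    # Every run of non-letters becomes a single space.
    s = re.sub(r"[^a-zA-Z]+", " ", s)
    s = s.split()

    word = {}
    for i in s:
        if i not in word:
            word[i] = 1
        else:
            word[i] = word[i] + 1

    for k, v in word.items():
        print(k, v)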

Code 3:

Do you think what you wrote is concise enough? No: Python already ships a ready-made class for this.


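    import collections
    import re

    def calwords(path):
        word = []
        with open(path) as file:
            data = file.readlines()
        for line in data:
            # Split each line on a space or a full-width comma.
            word += re.split(' |，', line.strip('\n'))
        # Counter tallies how often each item occurs in the list.
        print(collections.Counter(word))

    if __name__ == '__main__':
        # Path exactly as it appears in the source, where it looks truncated.
        calwords('e:')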
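A variant sketch of the same idea that is easier to try out, under a few assumptions that are not in the listing above: the function takes a string instead of a path, a word is a run of ASCII letters, and case is ignored. The name count_words is hypothetical.

```python
import re
from collections import Counter

def count_words(text):
    """Count case-insensitive runs of ASCII letters in text."""
    return Counter(re.findall(r'[a-z]+', text.lower()))

counts = count_words("The cat saw the other cat.")
print(counts['the'], counts['cat'], counts['dog'])  # 2 2 0
```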


Notes on the methods used

Basic usage of re.findall: it returns every substring of string that matches pattern, as a list.

Syntax: findall(pattern, string, flags=0)
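For illustration, a small sketch of findall on a made-up string, using a negated character class like the one in the first listing and then its complement.

```python
import re

text = "one, two; three"

# Every run of characters that is not a letter or digit.
separators = re.findall(r'[^a-zA-Z0-9]+', text)
print(separators)  # [', ', '; ']

# The complementary pattern returns the words themselves.
words = re.findall(r'[a-zA-Z0-9]+', text)
print(words)  # ['one', 'two', 'three']
```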

The string replace method returns a copy of the string in which occurrences of the first argument are replaced by the second argument.
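A short sketch of two points that are easy to trip over: replace returns a new string rather than changing the original, and its first argument is a literal substring, not a regular expression. The sample string is made up.

```python
s = "a-b-c"

replaced = s.replace("-", " ")
print(replaced)  # a b c
print(s)         # a-b-c  (strings are immutable, so s is unchanged)

# The first argument is taken literally, so nothing matches here.
unchanged = s.replace("[^a-z]", "")
print(unchanged)  # a-b-c
```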

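The string split method

str.split()
For a single separator, str.split() is enough.
str.split does not support regular expressions or several different separators. Called with no argument, it treats any run of whitespace as one separator, however many spaces there are.
re.split()
For several separators or more complicated splitting, use re.split.
Prototype: re.split(pattern, string, maxsplit=0, flags=0)
Splits the string wherever the regular expression matches. If the pattern is wrapped in capturing parentheses, the matched separators are also included in the returned list. maxsplit is the number of splits to perform: maxsplit=1 splits once, and the default of 0 means no limit.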
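Examples:

    >>> import re
    >>> a = 'w w w'

1. Split on whitespace

    >>> re.split(r'[\s]', a)
    ['w', 'w', 'w']

2. Split only once

    >>> re.split(r'[\s]', a, 1)
    ['w', 'w w']

3. Split on several characters

    >>> c = 'w!w@w%w^w'
    >>> re.split(r'[!@%^]', c)
    ['w', 'w', 'w', 'w', 'w']

4. With (?:...) the string comes back unchanged. (?:!@%^) is a group, not a character class, so it looks for the literal sequence !@%, which does not occur in c.

    >>> re.split(r'(?:!@%^)', c)
    ['w!w@w%w^w']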
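The difference between str.split and re.split, and the effect of a capturing group, can be sketched as follows (the sample string is made up).

```python
import re

line = "a  b,c"

# str.split() with no argument collapses runs of whitespace.
no_arg = line.split()
print(no_arg)      # ['a', 'b,c']

# str.split(' ') with an explicit separator does not.
explicit = line.split(' ')
print(explicit)    # ['a', '', 'b,c']

# re.split accepts a pattern that covers several separators.
multi = re.split(r'[\s,]+', line)
print(multi)       # ['a', 'b', 'c']

# A capturing group keeps the separators in the result.
captured = re.split(r'([\s,]+)', line)
print(captured)    # ['a', '  ', 'b', ',', 'c']
```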

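The strip method

Python's strip() method removes the specified characters from the beginning and end of a string (by default, whitespace, including newlines). The argument is treated as a set of characters, not as a fixed sequence.

Note: this method only removes characters at the start or end of the string; it cannot remove characters in the middle.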
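A small sketch of these points on made-up strings: the default call, a call with an explicit character, and a call showing that the argument acts as a set of characters.

```python
line = "  hello world \n"

default = line.strip()
print(repr(default))       # 'hello world'

# Only the trailing newline is removed; the inner space always stays.
newline_only = line.strip('\n')
print(repr(newline_only))  # '  hello world '

# Any mix of 'x' and 'y' is removed from both ends.
trimmed = "xxhixyx".strip("xy")
print(trimmed)             # hi
```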

Counter (collections.Counter) is a container that counts how many times each element of a list occurs.
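A minimal sketch of Counter on a made-up list, showing a lookup, a missing key, and most_common.

```python
from collections import Counter

words = ['a', 'b', 'a', 'c', 'a', 'b']
counts = Counter(words)

print(counts['a'])            # 3
print(counts['z'])            # 0, missing keys count as zero
print(counts.most_common(2))  # [('a', 3), ('b', 2)]
```

Because missing keys return 0 instead of raising KeyError, the if/else bookkeeping used with a plain dict is not needed.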

Posted 2018-10-23 20:27 by 大眼俠
