Counting Lines in Large Files with Python: Methods and Performance Comparison

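How can you count the total number of lines in a large file quickly and efficiently with Python? Below are several implementations and a comparison of their performance.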

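1. readlines: read all lines

Read every line into a list with the readlines method and take its length:

    def readline_count(file_name):
        # Loads the whole file into memory as a list of lines.
        return len(open(file_name).readlines())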

2. Read line by line

Iterate over the file one line at a time and count:

    def ******_count(file_name):
        lines = 0
        for _ in open(file_name):
            lines += 1
        return lines

3. sum count

Count with the built-in sum function over a generator:

    def sum_count(file_name):
        return sum(1 for _ in open(file_name))

4. enumerate count

Count with enumerate:

    def enumerate_count(file_name):
        count = 0  # stays 0 for an empty file
        with open(file_name) as f:
            for count, _ in enumerate(f, 1):
                pass
        return count

5. buff count

Read a fixed-size chunk at a time and count the newline bytes in each:

    def buff_count(file_name):
        with open(file_name, 'rb') as f:
            count = 0
            buf_size = 1024 * 1024
            buf = f.read(buf_size)
            while buf:
                count += buf.count(b'\n')
                buf = f.read(buf_size)
            return count

6. wc count

Call the wc command to count lines:

    def wc_count(file_name):
        import subprocess
        out = subprocess.getoutput("wc -l %s" % file_name)
        return int(out.split()[0])
7. partial count

Build on buff_count by using functools.partial:

    def partial_count(file_name):
        from functools import partial
        buffer = 1024 * 1024
        with open(file_name) as f:
            return sum(x.count('\n') for x in iter(partial(f.read, buffer), ''))

8. iter count

Build on buff_count by using the itertools module:

    def iter_count(file_name):
        from itertools import takewhile, repeat
        buffer = 1024 * 1024
        with open(file_name) as f:
            buf_gen = takewhile(lambda x: x, (f.read(buffer) for _ in repeat(None)))
            return sum(buf.count('\n') for buf in buf_gen)

I tested each method on my local machine (4 cores, 8 GB RAM, Python 3.6) against files of 100 MB, 500 MB, 1 GB, and 10 GB, measuring the run time in seconds.
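The timing setup itself is not shown, so here is a minimal sketch of how such a measurement could be taken. The helpers time_counter and make_sample_file are hypothetical names introduced for illustration, and sum_count is repeated so the sketch runs on its own. It is not necessarily how the figures in this article were produced.

```python
import os
import tempfile
import time


def time_counter(func, file_name, repeats=3):
    """Return (line_count, best_seconds) for func(file_name)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(file_name)
        best = min(best, time.perf_counter() - start)
    return result, best


def make_sample_file(num_lines):
    """Write num_lines short lines to a temp file and return its path."""
    with tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt") as tmp:
        for i in range(num_lines):
            tmp.write("line %d\n" % i)
        return tmp.name


def sum_count(file_name):
    # Same approach as method 3, repeated here so the sketch runs alone.
    with open(file_name) as f:
        return sum(1 for _ in f)


if __name__ == "__main__":
    path = make_sample_file(100000)
    try:
        lines, seconds = time_counter(sum_count, path)
        print("sum_count: %d lines, best of 3 runs %.4f s" % (lines, seconds))
    finally:
        os.remove(path)
```

Repeated runs over the same file usually benefit from the operating system's file cache, so the first run and later runs can differ noticeably. Keep that in mind when comparing methods on very large files.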
