Reading a CSV file in Python, dropping a column, and writing the result to a new file

I solved this problem in two ways, both of which are existing solutions found online.

Scenario:

There is a data file stored as plain text with three columns: user_id, plan_id, mobile_id. The goal is a new file that contains only mobile_id and plan_id.

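Solutions

Approach 1: Use Python's plain file open and write calls to walk through the data directly, processing each line inside a for loop and writing it to the new file.

The code is as follows: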

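    def readwrite1(input_file, output_file):
        # Open both files; the with-block closes them even if an error occurs.
        with open(input_file, 'r') as f, open(output_file, 'w') as out:
            for line in f:
                # Strip the trailing newline so the last field stays clean.
                a = line.rstrip("\n").split(",")
                # Columns are user_id, plan_id, mobile_id; keep mobile_id, plan_id.
                x = a[2] + "," + a[1] + "\n"
                out.write(x)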
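Splitting each line on "," works for simple files but breaks when a field contains a quoted comma. As a minimal sketch of the same idea using the standard library's csv module, the function below selects columns by name instead of by position. The names `readwrite_csv` and `keep` are made up for this illustration and are not part of the original solutions; it also assumes the first row of the file is a header.

```python
import csv

def readwrite_csv(input_file, output_file, keep=("mobile_id", "plan_id")):
    """Copy only the columns named in `keep` (hypothetical helper for illustration)."""
    # newline="" lets the csv module handle line endings itself.
    with open(input_file, "r", newline="") as src, \
         open(output_file, "w", newline="") as dst:
        reader = csv.DictReader(src)  # first row is used as the column names
        # extrasaction="ignore" drops any column that is not listed in `keep`.
        writer = csv.DictWriter(
            dst,
            fieldnames=list(keep),
            extrasaction="ignore",
            lineterminator="\n",
        )
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

Because columns are picked by name, the function keeps working if the column order in the input file changes.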

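Approach 2: Use pandas to read the data into a DataFrame, select the wanted columns, and write them to the new file with the DataFrame's own writer.

The code is as follows: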

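    import pandas as pd

    def readwrite2(input_file, output_file):
        # Read the CSV; the first row (header=0) holds the column names.
        data_1 = pd.read_csv(input_file, header=0, sep=',')
        # Keep only mobile_id and plan_id, and write them without the row index.
        data_1[['mobile_id', 'plan_id']].to_csv(output_file, sep=',', header=True, index=False)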

Judging from the code, the pandas logic is clearer.

Now let's look at the execution efficiency.

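    import time

    def getruntimes(fun, input_file, output_file):
        # Time one call to fun and print the elapsed wall-clock time in milliseconds.
        begin_time = int(round(time.time() * 1000))
        fun(input_file, output_file)
        end_time = int(round(time.time() * 1000))
        print(end_time - begin_time, "ms")

    # input_file, output_file and output_file1 are file paths defined elsewhere.
    getruntimes(readwrite1, input_file, output_file)   # plain loop over the lines
    getruntimes(readwrite2, input_file, output_file1)  # read and write with a DataFrame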
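A single timed run can be skewed by disk caching or other processes. As a hedged sketch, the variant below repeats the call a few times and reports the fastest run, using `time.perf_counter`, which is meant for measuring short intervals. The name `best_runtime_ms` and the `repeat` parameter are assumptions made for this illustration, not part of the original test.

```python
import time

def best_runtime_ms(fun, *args, repeat=3):
    """Return the fastest of `repeat` timed calls to fun(*args), in milliseconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fun(*args)
        elapsed_ms = (time.perf_counter() - start) * 1000
        best = min(best, elapsed_ms)
    return best
```

It could be called as best_runtime_ms(readwrite1, input_file, output_file), with the same arguments that getruntimes takes.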

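input_file has roughly 270,000 records, and the DataFrame approach is somewhat faster than the for loop. Would the difference be more pronounced with more data?

Below I increase the number of records in input_file, with the following results: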

    input_file (records)    readwrite1 (ms)    readwrite2 (ms)
    270,000                 976                777
    550,000                 1989               1509
    1,100,000               4312               3158

Judging from the test results above, the DataFrame approach is roughly 30% more efficient.
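To make the comparison concrete, here is a small sketch that computes the relative difference from the reported timings. It assumes the three measurements are 976/777 ms, 1989/1509 ms and 4312/3158 ms for 270,000, 550,000 and 1,100,000 records; the function name `compare` is made up for this illustration.

```python
# Timings in ms as read from the results above: (records, readwrite1, readwrite2).
# The row boundaries are an assumption about how the results table is laid out.
timings = [(270_000, 976, 777), (550_000, 1989, 1509), (1_100_000, 4312, 3158)]

def compare(t_loop, t_df):
    """Return (percent less run time, percent higher throughput) of t_df versus t_loop."""
    time_saved = (t_loop - t_df) / t_loop * 100
    speedup = (t_loop / t_df - 1) * 100
    return time_saved, speedup

for records, t_loop, t_df in timings:
    saved, faster = compare(t_loop, t_df)
    print(f"{records:>9,} rows: {saved:.1f}% less time, {faster:.1f}% faster")
```

Under these assumptions, the "about 30%" figure matches the throughput gain more closely than the reduction in run time, and both numbers grow as the file gets larger.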
