Python file data operations

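# coding=utf-8
import struct
import numpy as np

define = 8000 * 3   # only the first three blocks of 8000 lines are processed

if __name__ == '__main__':
    sourcefilename = r'c:\users\sophie\desktop\allfea_basedon_2194_localconv3_normalized.txt'
    feature = []        # assumed: one dict per block (the initial value is missing in the post)
    person_feat = {}    # person id -> list of feature rows for the current block
    print("start")
    with open(sourcefilename, 'r') as sourcefile:
        for i, line in enumerate(sourcefile):
            if i >= define:
                break
            if i % 8000 == 0:
                # a new block starts: open a fresh dict and register it
                # (assumed: the append step is missing in the post)
                person_feat = {}
                feature.append(person_feat)
            k = line.split()
            person_num = k[0]
            # one row of data; assumed 32-bit floats, the format only says "float"
            tmp = np.array([float(x) for x in k[1:]], dtype=np.float32)
            if person_num not in person_feat:
                person_feat[person_num] = []
            person_feat[person_num].append(tmp)   # assumed: missing in the post

    # prepare for writing
    sample = []   # number of samples for each person
    number = 0    # for test: total number of people
    print("start write")
    for feat in feature:
        number += len(feat)
        for nump, pfeat in feat.items():
            tmplen = len(pfeat)
            print(tmplen)
            sample.append(tmplen)   # assumed: missing in the post

    featurelen = 256
    outfilename = r'outfile.txt'
    with open(outfilename, 'wb') as outfile:
        # header: feature length and total number of people
        outfile.write(struct.pack("ii", featurelen, number))
        # then the sample count of every person
        for count in sample:
            outfile.write(struct.pack('i', count))
        # then the features, in the same order as the sample counts
        for feat in feature:
            for nump, pfeat in feat.items():
                for row in pfeat:
                    row.tofile(outfile)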

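Background: in a face recognition project a large amount of data had been extracted, and a suitable subset had to be pulled out for the recognizer.

The recognizer needs a binary file laid out as: feature length, total number of people, number of samples for each person, then the corresponding features.

The feature length is 256. The total number of people and the sample counts have to be computed. The features are float data.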
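To check a file written in this layout, it can be read back in the same order. The following is a minimal sketch, not the project's actual reader: the function name `read_feature_file` is hypothetical, and it assumes little-endian 32-bit integers and 32-bit floats, which the post does not state explicitly.

```python
import struct
import numpy as np

def read_feature_file(path):
    """Read back: feature length, people count, per-person sample counts, features.

    Assumes little-endian int32 header fields and float32 features.
    """
    with open(path, "rb") as f:
        feature_len, n_people = struct.unpack("<ii", f.read(8))
        counts = list(struct.unpack("<%di" % n_people, f.read(4 * n_people)))
        data = np.frombuffer(f.read(), dtype="<f4")
    # the payload must hold exactly one feature row per declared sample
    if data.size != sum(counts) * feature_len:
        raise ValueError("payload size does not match the header")
    data = data.reshape(sum(counts), feature_len)
    # cut the rows back into one array per person
    bounds = np.cumsum(counts, dtype=np.int64)[:-1]
    return feature_len, counts, np.split(data, bounds)
```

Comparing `len(counts)` and `sum(counts)` with the values computed while writing is a quick sanity check on the header.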

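The current file format is:

Every 8000 lines hold the face features from one TV series, and there are more than 10 series in total.

Each line has 257 fields: the first identifies the person, and the remaining 256 are the features.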
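The line format above can be parsed and validated as sketched here. The helper names `parse_line` and `series_index` are hypothetical, and storing the features as 32-bit floats is an assumption.

```python
import numpy as np

FEATURE_LEN = 256   # feature length given in the text
BLOCK_ROWS = 8000   # lines per TV series given in the text

def parse_line(line, feature_len=FEATURE_LEN):
    """Split one text line into (person id, feature vector)."""
    fields = line.split()
    if len(fields) != feature_len + 1:
        raise ValueError("expected %d fields, got %d" % (feature_len + 1, len(fields)))
    # assumption: features are kept as float32
    vec = np.array([float(x) for x in fields[1:]], dtype=np.float32)
    return fields[0], vec

def series_index(row_number, block_rows=BLOCK_ROWS):
    """Zero-based index of the TV series that a zero-based row belongs to."""
    return row_number // block_rows
```

Rejecting lines with the wrong field count early avoids writing rows of unequal length into the binary file.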

The script makes neat use of Python dictionaries and NumPy arrays.
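The dictionary-plus-array pattern can be shown in isolation. This sketch swaps the explicit "key not in dict" check for `collections.defaultdict`, which is a substitution rather than what the script does, and the name `group_by_person` is hypothetical.

```python
from collections import defaultdict
import numpy as np

def group_by_person(rows):
    """Group (person id, 1-D feature vector) pairs into one 2-D array per person."""
    grouped = defaultdict(list)
    for person_id, vec in rows:
        grouped[person_id].append(vec)
    # stack each person's rows into an array of shape (n_samples, feature_len)
    return {pid: np.vstack(vecs) for pid, vecs in grouped.items()}
```

The number of rows in each array is that person's sample count, so the counts written to the header can be taken straight from the array shapes.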

There are actually some flaws in this code, which I will cover next time.
