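Bioinformatics scripting practice (4): merging files by row and column

This is a new requirement: merge the following two files, each with 2 columns and 5 data rows, into a single file with 3 columns and 5 data rows.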

program2_1.txt

seq length
cog4 210
cog2 94
cog3 210
cog1 113
cog5 152

program2_2.txt

seq depth
cog5 93
cog1 110
cog2 114
cog4 91
cog3 111

Here is my solution:

import re

# Read both input files; the first line of each is the header.
with open("program2_1.txt", "r") as f:
    length = f.readlines()
print("origin", length)
with open("program2_2.txt", "r") as f:
    dep = f.readlines()
print("origin", dep)

# Drop the header and blank lines, then sort by sequence ID so the rows of both files line up.
# This pairing assumes both files contain exactly the same IDs.
length_data = sorted(line for line in length[1:] if line.strip())
dep_data = sorted(line for line in dep[1:] if line.strip())
print("lengthdata", length_data, "\n", "depthdata", dep_data)
print("^^^^^^^^^^^^^")

# Extract the length value (last column) of every row, keeping a trailing newline.
length_temp = []
for i in length_data:
    length_temp.append(re.split(r"\s+", i.strip())[-1] + "\n")
print(length_temp)

# Replace the trailing newline of each depth row with a tab.
depth_data_t = []
for i in dep_data:
    depth_data_t.append("\t".join(i.split()) + "\t")
print("depth_data_t", depth_data_t)

# Append each extracted length value to the matching depth row, before the line ends.
temp = ["seq\tdepth\tlength\n"]
n = 0
for i in depth_data_t:
    if n < len(length_temp):
        temp.append(i + length_temp[n])
        n += 1
print(temp)

with open("data2.txt", "w") as output:
    output.writelines(temp)
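
The script above pairs rows by sorting both files, so it relies on the two files containing exactly the same sequence IDs. As an alternative, here is a minimal sketch that joins the rows on the sequence ID itself. It assumes whitespace-separated columns, exactly two columns per row, and a one-line header in each file. The helper names `read_table` and `merge_tables` are hypothetical and do not come from the original script.

```python
def read_table(path):
    """Return (header_fields, {seq_id: value}) for a two-column text file."""
    with open(path, "r") as handle:
        # Skip blank lines and split every remaining line on whitespace.
        rows = [line.split() for line in handle if line.strip()]
    header, body = rows[0], rows[1:]
    # Assumption: every data row has exactly two fields (ID and value).
    return header, {seq_id: value for seq_id, value in body}


def merge_tables(path_a, path_b, out_path):
    """Join two two-column tables on the first column; write a three-column table."""
    header_a, table_a = read_table(path_a)
    header_b, table_b = read_table(path_b)
    # Keep only the IDs present in both files, in sorted order.
    shared = sorted(set(table_a) & set(table_b))
    with open(out_path, "w") as out:
        out.write("\t".join([header_a[0], header_a[1], header_b[1]]) + "\n")
        for seq_id in shared:
            out.write("\t".join([seq_id, table_a[seq_id], table_b[seq_id]]) + "\n")
    return shared
```

Calling `merge_tables("program2_1.txt", "program2_2.txt", "data2.txt")` would write the columns in the order seq, length, depth. Because the rows are matched by key, an ID missing from one file is dropped rather than shifting every later row out of alignment.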
