轉錄組分析轉錄組分析使用STAR進行比對

通過二代測序我們可以獲得150bp左右的reads，如果想要知道reads是從哪個轉錄本上測出來的，就需要將reads比對到參考基因組上。比對的演算法很複雜，但簡單理解就是看reads與基因組上哪個區域一致。

wget -c

## 解壓 star

tar -xvzf 2.7.3a.tar.gz

## 執行 star

./star-2.7.3a/bin/linux_x86_64/star

在進行reads比對前，我們需要先構建基因組索引。

## 構建基因組索引 star --runthreadn 6 --runmode genomegenerate --genomedir index_dir --genomefastafiles genome.fasta --sjdbgtffile genome.gtf --sjdboverhang 149

--runthreadn：執行緒數。

--runmode genomegenerate：構建基因組索引。

--genomedir：索引目錄。（index_dir一定要是存在的資料夾，需提前建好）

--genomefastafiles：基因組檔案。

--sjdbgtffile：基因組注釋檔案。

--sjdboverhang：reads長度減1。

索引構建完成後，就可以看到index_dir中生成了以下檔案：

有了索引後，我們就可以進行reads比對了。

## 進行 reads 比對 star --twopassmode basic --quantmode transcriptomesam genecounts --runthreadn 6 --genomedir index_dir --alignintronmin 20 --alignintronmax 50000 --outsamtype bam sortedbycoordinate --sjdboverhang 149 --outsamattrrgline id:sample sm:sample pl:illumina --outfiltermismatchnmax 2 --outsjfilterreads unique --outsammultnmax 1 --outfilenameprefix out_prefix --outsammapqunique 60 --readfilescommand gunzip -c --readfilesin seq1.fq.gz seq2.fq.gz

--twopassmode basic：使用two-pass模式進行reads比對。簡單來說就是先按索引進行第一次比對，而後把第一次比對發現的新剪下位點資訊加入到索引中進行第二次比對。

--quantmode transcriptomesam genecounts：將reads比對至轉錄本序列。

--runthreadn：執行緒數。

--genomedir：索引目錄。

--alignintronmin：最短的內含子長度。（根據gtf檔案計算）

--alignintronmax：最長的內含子長度。（根據gtf檔案計算）

--outsamtype bam sortedbycoordinate：輸出bam檔案並進行排序。

--sjdboverhang：reads長度減1。

--outsamattrrgline：id代表樣本id，sm代表樣本名稱，pl為測序平台。在使用gatk進行snp calling時同一sm的樣本可以合併在一起。

--outfiltermismatchnmax：比對時允許的最大錯配數。

--outsjfilterreads unique：對於跨越剪下位點的reads（junction reads)，只考慮跨越唯一剪下位點的reads。

--outsammultnmax：每條reads輸出比對結果的數量。

--outfilenameprefix：輸出檔案字首。

--readfilescommand：對fastq檔案進行操作。

--readfilesin：輸入fastq檔案的路徑。比對完成後，我們可以看到輸出目錄下有以下檔案：

我們可以使用samtools檢視生成的bam檔案。

## 檢視 bam 檔案 samtools view ck-1_aligned.sortedbycoord.out.bam |head -n 5

可以看到，以"aligned.sortedbycoord.out.bam"為字尾的bam檔案中，reads比對到的位置是基因組位置。

以"aligned.totranscriptome.out.bam"為字尾的bam檔案中，reads比對到的位置是轉錄本位置。

"log.final.out"裡記錄了許多比對情況的統計資訊。

轉錄組分析轉錄組分析使用STAR進行比對

轉錄組分析處理流程

轉錄組分析高階轉錄組分析和R資料視覺化

轉錄組分析的正確姿勢

轉錄組分析 轉錄組分析 使用STAR進行比對

轉錄組分析處理流程

轉錄組分析 高階轉錄組分析和R資料視覺化

轉錄組分析的正確姿勢

相關推薦

轉錄組分析轉錄組分析使用STAR進行比對

轉錄組分析高階轉錄組分析和R資料視覺化