A BAM file can be split by chromosome (reference sequence) or by tag, and bamtools provides a split subcommand for this. Its help text reads:
Description: splits a BAM file on user-specified property, creating a new BAM output file for each value found.
Usage: bamtools split [-in <filename>] [-stub <filename stub>] < -mapped | -paired | -reference [-refPrefix <prefix>] | -tag <TAG> >
Input & Output:
  -in <BAM filename>        the input BAM file [stdin]
  -refPrefix <string>       custom prefix for splitting by references. Currently files end with REF_<refName>.bam. This option allows you to replace "REF_" with a prefix of your choosing.
  -tagPrefix <string>       custom prefix for splitting by tags. Current files end with TAG_<tagname>_<tagvalue>.bam. This option allows you to replace "TAG_" with a prefix of your choosing.
  -stub <filename stub>     prefix stub for output BAM files (default behavior is to use input filename, without .bam extension, as stub). If input is stdin and no stub provided, a timestamp is generated as the stub.
  -tagListDelim <string>    delimiter used to separate values in the filenames generated from splitting on list-type tags [--]
Split Options:
  -mapped                   split mapped/unmapped alignments
  -paired                   split single-end/paired-end alignments
  -reference                split alignments by reference
  -tag <tag name>           splits alignments based on all values of TAG encountered (i.e. -tag RG creates a BAM file for each read group in original BAM file)
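In short, the bamtools split options used most often are:
-in: the input BAM file to split
-reference: split by chromosome (reference sequence)
-refPrefix: replace the "REF_" prefix in the names of the files produced by splitting by chromosome
-tagPrefix: replace the "TAG_" prefix in the names of the files produced by splitting by tag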
1. Split a BAM file by chromosome
bamtools split -in tmp.bam -reference
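To make the naming rule from the help text concrete, here is a minimal sketch that predicts the name of one output file and then runs the split with a custom prefix. The helper ref_split_name, the prefix CHR_ and the reference name chr1 are made up for illustration. The dot between the stub and the prefix is an assumption based on commonly observed output such as tmp.REF_chr1.bam; the help text only states that files end with REF_<refName>.bam.

```shell
#!/bin/sh
# Hypothetical helper (not part of bamtools): predict the output file name
# for one reference, ASSUMING the pattern <stub>.<prefix><refName>.bam.
ref_split_name() {
    input=$1             # input BAM path, e.g. tmp.bam
    ref=$2               # reference name, e.g. chr1
    prefix=${3:-REF_}    # default prefix; -refPrefix replaces it
    stub=${input%.bam}   # default stub: input file name without .bam
    printf '%s.%s%s.bam\n' "$stub" "$prefix" "$ref"
}

# Run the split only when bamtools and the input file are both available.
if command -v bamtools >/dev/null 2>&1 && [ -f tmp.bam ]; then
    # Same command as above, but with "REF_" replaced by "CHR_".
    bamtools split -in tmp.bam -reference -refPrefix CHR_
    echo "expected output for chr1: $(ref_split_name tmp.bam chr1 CHR_)"
fi
```

Calling ref_split_name tmp.bam chr1 without a third argument gives the name expected from the plain -reference command. Checking the directory listing after a real run is the reliable way to confirm the pattern for your bamtools version.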
2. Split a BAM file by tag
bamtools split -in tmp.bam -tag RG
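The same idea applies to tag-based splitting. The sketch below combines -tag with -stub so that the output files get a chosen name stub instead of the input file name. The helper tag_split_name, the stub by_rg and the read group value sample1 are hypothetical, and the dot between the stub and the prefix is again an assumption; the help text only states that files end with TAG_<tagname>_<tagvalue>.bam.

```shell
#!/bin/sh
# Hypothetical helper (not part of bamtools): predict the output file name
# for one tag value, ASSUMING the pattern <stub>.<prefix><tag>_<value>.bam.
tag_split_name() {
    stub=$1              # output stub, e.g. the value passed to -stub
    tag=$2               # tag name, e.g. RG
    value=$3             # tag value, e.g. a read group ID
    prefix=${4:-TAG_}    # default prefix; -tagPrefix replaces it
    printf '%s.%s%s_%s.bam\n' "$stub" "$prefix" "$tag" "$value"
}

# Run the split only when bamtools and the input file are both available.
if command -v bamtools >/dev/null 2>&1 && [ -f tmp.bam ]; then
    # One output file per read group, named with the stub "by_rg".
    bamtools split -in tmp.bam -tag RG -stub by_rg
    echo "expected output for read group sample1: $(tag_split_name by_rg RG sample1)"
fi
```

Setting -stub explicitly is mainly useful when the input comes from stdin, because the help text says a timestamp is used as the stub in that case.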
Reference: https://github.com/pezmaster31/bamtools/issues/135