RTK: Rarefaction Curves for Big Data

Journal: Bioinformatics

Year: 2017

Link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870771/

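Rarefaction toolkit (RTK) is a tool for computing rarefaction curves on large datasets. It is available in C++ and R versions, and it is a useful complement to QIIME and Mothur.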
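To make the idea of rarefaction concrete, here is a minimal base-R sketch: it subsamples one sample's reads without replacement down to a fixed depth. The helper `rarefy_counts` and the toy counts are hypothetical and only illustrate the principle; this is not RTK's implementation, which is designed to be far more memory- and time-efficient.

```r
# Hypothetical helper (not part of rtk): rarefy one sample's count vector
# to a fixed depth by drawing reads without replacement.
rarefy_counts <- function(counts, depth) {
  stopifnot(depth <= sum(counts))
  # Expand the counts into one taxon label per read
  reads <- rep(seq_along(counts), times = counts)
  # Draw `depth` reads without replacement (index-based, so a single
  # remaining read is handled correctly)
  picked <- reads[sample.int(length(reads), size = depth)]
  # Tabulate back into a count vector of the original length
  tabulate(picked, nbins = length(counts))
}

set.seed(1)
counts <- c(50, 30, 15, 4, 1, 0)   # toy sample with 100 reads
rarefied <- rarefy_counts(counts, depth = 20)
sum(rarefied)        # always equals the requested depth
sum(rarefied > 0)    # richness observed at this depth
```

Repeating the draw several times and averaging the resulting richness at several depths gives the points of a rarefaction curve.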

At the time of writing it had been cited 10 times, and the papers citing it are all of good quality.

Besides rarefaction curves, RTK can also compute diversity indices such as Pielou's evenness, Chao1, Shannon, and Simpson.
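For reference, the indices named above can be sketched in a few lines of base R using their common textbook definitions. The helper `diversity_indices` is hypothetical, and the exact variants are assumptions (Gini-Simpson form for Simpson, bias-corrected form for Chao1); RTK's own definitions may differ in detail.

```r
# Hypothetical helper (not part of rtk): textbook diversity indices
# for a single sample's count vector.
diversity_indices <- function(counts) {
  counts <- counts[counts > 0]          # ignore absent taxa
  p <- counts / sum(counts)             # relative abundances
  S <- length(counts)                   # observed richness
  shannon <- -sum(p * log(p))           # Shannon entropy (natural log)
  simpson <- 1 - sum(p^2)               # Gini-Simpson form (assumed)
  evenness <- if (S > 1) shannon / log(S) else NA_real_  # Pielou's J
  f1 <- sum(counts == 1)                # singletons
  f2 <- sum(counts == 2)                # doubletons
  chao1 <- S + f1 * (f1 - 1) / (2 * (f2 + 1))  # bias-corrected Chao1 (assumed)
  c(richness = S, shannon = shannon, simpson = simpson,
    evenness = evenness, chao1 = chao1)
}

# Four equally abundant taxa:
# shannon = log(4), simpson = 0.75, evenness = 1, chao1 = 4
diversity_indices(c(10, 10, 10, 10))
```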

The results show that RTK outperforms QIIME, Mothur, and the R package vegan in both memory usage and run time.


install.packages("rtk")
library(rtk)
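# Basic usage:
# rtk(input, repeats = 10, depth = 1000, ReturnMatrix = 1, margin = 2,
#     verbose = FALSE, threads = 1, tmpdir = NULL )

# input: a count matrix, or a path to a file containing one
# repeats: number of times the rarefaction is repeated (default 10)
# depth: rarefaction depth, i.e. counts drawn per sample (default 1000)
# ReturnMatrix: number of rarefied matrices to return (default 1)
# margin: 1 = rarefy by row; 2 = rarefy by column
# verbose: whether to print progress while running
# threads: number of threads to use
# tmpdir: location for temporary files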

data  <- matrix(sample(x = c(rep(0, 15000),rep(1:10, 100)),
                             size = 10000, replace = TRUE), ncol = 80)
data.r  <- rtk(data, ReturnMatrix = 1, depth = min(colSums(data)))
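# The result contains richness, evenness, Chao1, Shannon, and Simpson values,
# plus the rarefaction results.

# collectors.curve plots a species accumulation (collector's) curve
collectors.curve(data.r, xlab = "No. of samples (rarefied data)", ylab = "richness")
# Color the curve by group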
cls <- rep_len(c("a","b","c","d"), ncol(data))  # study origin of each sample
accumOrder <- c("b","a","d","c")      # define the order, for the plot
colors     <- c(1,2,3,4)
names(colors) <- accumOrder # names used for legend
collectors.curve(data, xlab = "No. of samples",
                 ylab = "richness", col = colors, bin = 1,cls = cls, 
                 accumOrder = accumOrder)
#### Diversity curves
data  <- matrix(sample(x = c(rep(0, 1500),rep(1:10, 500),1:1000),
                                 size = 120, replace = TRUE), 40)
samplesize  <- min(colSums(data))
d1  <- rtk(input = data, depth = samplesize)
# Several rarefaction depths
d2  <- rtk(input = data, depth = round(seq(1, samplesize, length.out = 10)))

# Richness at a single rarefaction depth
plot(d1, div = "richness")
# Rarefaction curve across depths, with a fitted model (here for evenness)
plot(d2, div = "eveness", fit = "arrhenius", pch = c(1,2,3))
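What a collector's (species accumulation) curve measures can be sketched in base R: cumulative richness as samples are added one at a time. The helper `accumulate_richness` and the toy matrix are hypothetical; this sketch uses a single fixed sample order, whereas the package function may handle sample ordering differently.

```r
# Hypothetical helper (not part of rtk): cumulative richness as samples
# (columns) are added in the given order.
accumulate_richness <- function(mat) {
  seen <- rep(FALSE, nrow(mat))        # taxa observed so far
  richness <- integer(ncol(mat))
  for (j in seq_len(ncol(mat))) {
    seen <- seen | (mat[, j] > 0)      # mark taxa present in sample j
    richness[j] <- sum(seen)
  }
  richness
}

# Toy matrix: 3 taxa (rows) x 4 samples (columns)
toy <- matrix(c(1, 0, 0,
                1, 2, 0,
                0, 0, 0,
                0, 1, 3), nrow = 3)
accumulate_richness(toy)   # 1 2 2 3
```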
