1. Scatter Plot
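# Read the table of representative prokaryotic genomes
m <- read.table("prok_representative.csv", sep = ",", header = TRUE)
x <- m[, 2]  # plotted on the "Genome Size" axis
y <- m[, 4]  # plotted on the "Genes" axis
plot(x, y, pch = 16, xlab = "Genome Size", ylab = "Genes")
# Fit a linear model and draw the regression line
fit <- lm(y ~ x)
abline(fit, col = "blue", lwd = 1.8)
# Adjusted R-squared and coefficients, rounded to two decimals
rr <- round(summary(fit)$adj.r.squared, 2)
intercept <- round(summary(fit)$coefficients[1], 2)
slope <- round(summary(fit)$coefficients[2], 2)
# Annotation: equation on top, R^2 below (a negative intercept would print as "+ -...")
eq <- bquote(atop("y = " * .(slope) * " x + " * .(intercept), R^2 == .(rr)))
text(12, 6e3, eq)  # position is in data units; adjust to your own data range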
Scatter plot of the correlation between genome size and number of genes
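The label built above assumes a positive intercept and a fixed text position. A minimal sketch of a more defensive variant follows, using made-up synthetic data in place of `prok_representative.csv`. The names `sign_chr`, `eq_line`, and `usr` are illustrative only, and the placement rule (5% in from the left, 10% down from the top) is an arbitrary choice.

```r
# Synthetic stand-in for genome size (x) and gene count (y); the values are made up
set.seed(1)
x <- runif(50, 1, 10)
y <- 900 * x + 150 + rnorm(50, sd = 200)

fit <- lm(y ~ x)
slope <- round(unname(coef(fit)[2]), 2)
intercept <- round(unname(coef(fit)[1]), 2)
rr <- round(summary(fit)$adj.r.squared, 2)

# Pick the sign explicitly so a negative intercept prints as "- 5" rather than "+ -5"
sign_chr <- if (intercept < 0) " - " else " + "
eq_line <- paste0("y = ", slope, " x", sign_chr, abs(intercept))

plot(x, y, pch = 16, xlab = "Genome Size", ylab = "Genes")
abline(fit, col = "blue", lwd = 1.8)

# Place the label relative to the plot region instead of hard-coding data coordinates
usr <- par("usr")  # c(xmin, xmax, ymin, ymax) of the current plot
text(usr[1] + 0.05 * diff(usr[1:2]), usr[4] - 0.10 * diff(usr[3:4]),
     bquote(atop(.(eq_line), R^2 == .(rr))), adj = 0)
```

Reading the coefficients with `coef(fit)` gives the same numbers as `summary(fit)$coefficients[1]` and `[2]`, but by name rather than by position in the coefficient matrix.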
2. Histogram of Gene Length Distribution
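# Gene length distribution
# Read the GFF annotation: tab-separated, no header, first 7 lines skipped
x <- read.table("H37Rv.gff", sep = "\t", header = FALSE, skip = 7, quote = "")
# Keep only rows whose feature type (column 3) is "gene"
x <- x[x$V3 == "gene", ]
# x <- x %>% dplyr::filter(V3 == 'gene')
# Gene length = |end - start| + 1, because GFF coordinates are 1-based and inclusive
x <- abs(x$V5 - x$V4) + 1
# x <- x %>% dplyr::mutate(gene_len = abs(V5 - V4) + 1)
# head(x$gene_len)
length(x)  # number of genes
range(x)   # shortest and longest gene
# Explore different binning options
hist(x)
hist(x, breaks = 80)
?hist
hist(x, breaks = "Sturges")
hist(x, breaks = c(0, 500, 1000, 1500, 2000, 2500, 15000))
hist(x, breaks = 80, freq = FALSE)   # y axis shows density instead of counts
hist(x, breaks = 80, density = TRUE) # density = shading lines per inch; TRUE is coerced to 1
hist(rivers, density = TRUE, breaks = 10)
# Write the final figure to a PDF file
pdf(file = "hist.pdf")
h <- hist(x, nclass = 80, col = "pink", xlab = "Gene Length (bp)", main = "Histogram of Gene Length")
rug(x)
# Overlay a normal curve with the same mean and standard deviation
xfit <- seq(min(x), max(x), length = 100)
yfit <- dnorm(xfit, mean = mean(x), sd = sd(x))
# Rescale from density to counts: density * bin width * number of genes
yfit <- yfit * diff(h$mids[1:2]) * length(x)
lines(xfit, yfit, col = "blue", lwd = 2)
dev.off()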
Histogram of gene length distribution
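The step `yfit * diff(h$mids[1:2]) * length(x)` in the code above may look arbitrary. For equal-width bins, a bin's count equals its density times the bin width times the number of observations, so the same factor moves a normal density onto the count scale. A small sketch on made-up data (not the H37Rv gene lengths) checks this identity; `bin_width`, `n`, and `rescaled` are illustrative names.

```r
# Made-up stand-in for gene lengths
set.seed(42)
x <- rnorm(1000, mean = 1000, sd = 300)

h <- hist(x, breaks = 40, plot = FALSE)
bin_width <- diff(h$mids[1:2])  # bins are equal-width, so any adjacent pair of midpoints works
n <- length(x)

# For equal-width bins: count = density * bin width * n
rescaled <- h$density * bin_width * n
stopifnot(isTRUE(all.equal(rescaled, as.numeric(h$counts))))

# The same factor puts the fitted normal density on the count scale
xfit <- seq(min(x), max(x), length.out = 100)
yfit <- dnorm(xfit, mean = mean(x), sd = sd(x)) * bin_width * n

plot(h, col = "pink", xlab = "Length", main = "Counts with rescaled normal curve")
lines(xfit, yfit, col = "blue", lwd = 2)
```

The identity only holds when all bins have the same width. With unequal breaks, such as `c(0, 500, 1000, 1500, 2000, 2500, 15000)`, `hist()` plots densities by default and a single scaling factor no longer applies.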
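A final note: we will do our best to post updates when time allows. For discussion, you are welcome to visit the forum at the address below (copy it into your browser), which makes up for the official account's lack of a comment feature. The original address (bioinfoer.com) is not yet active.
sx.voiceclouds.cn
Some boards could also be set aside for sharing everyday anecdotes and the like; suggestions are welcome.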