When installing complex software with conda, consider giving it a dedicated environment (rmats as an example)

2022-03-03 14:05:34

First, install conda on your own server. The installation commands are as follows:

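# Download the installer first; at 20 MB/s this takes only a few seconds
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Run the downloaded installer with bash, and answer yes at every prompt
bash Miniconda3-latest-Linux-x86_64.sh
# After installation, reload ~/.bashrc so the updated environment variables take effect
source ~/.bashrc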

Once conda is installed, configure the channels and their mirrors.

conda config --add channels r 
conda config --add channels conda-forge 
conda config --add channels bioconda
conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/cloud/bioconda/
conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge/
conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/main/
conda config --set show_channel_urls yes 

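As we have stressed many times, the Tsinghua University mirror we used to recommend may be overloaded, so use your own judgment and pick a mirror that works for you.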

Creating an rmats environment with conda

Remember: create a new rmats environment first, and then install the rmats software inside that environment. The commands are as follows:

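# Create an empty environment named rmats and switch to it
conda create -n rmats
conda activate rmats

# List the rmats versions available in the bioconda channel
conda search rmats -c bioconda

conda install -c bioconda rmats=4.1.1
# Clear cached packages and index files, then retry (only needed if the first attempt failed)
conda clean --all
conda install -c bioconda rmats=4.1.1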
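
Because the whole point is to install into the rmats environment rather than into base, a small guard can catch the case where `conda activate rmats` was skipped or failed. The following is only a sketch: `require_env` is a hypothetical helper name, and it relies on the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated.

```shell
# Hypothetical helper (not part of conda): succeed only when the named
# conda environment is the active one.
# Assumption: conda exports CONDA_DEFAULT_ENV on activation.
require_env() {
  wanted="$1"
  if [ "${CONDA_DEFAULT_ENV:-}" != "$wanted" ]; then
    echo "active env is '${CONDA_DEFAULT_ENV:-none}', expected '$wanted'" >&2
    return 1
  fi
}

# Usage: only install when the check passes.
# require_env rmats && conda install -c bioconda rmats=4.1.1
```

Chaining the install behind the check means a missed activation produces an error message instead of a polluted base environment.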

Look carefully at everything conda has to do in order to install this one piece of software, rmats:

The following NEW packages will be INSTALLED:

  _libgcc_mutex      anaconda/cloud/conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
  _openmp_mutex      anaconda/cloud/conda-forge/linux-64::_openmp_mutex-4.5-1_gnu
  ca-certificates    anaconda/cloud/conda-forge/linux-64::ca-certificates-2020.12.5-ha878542_0
  certifi            anaconda/cloud/conda-forge/linux-64::certifi-2020.12.5-py38h578d9bd_1
  gsl                anaconda/cloud/conda-forge/linux-64::gsl-2.6-he838d99_2
  ld_impl_linux-64   anaconda/cloud/conda-forge/linux-64::ld_impl_linux-64-2.35.1-hea4e1c9_2
  libblas            anaconda/cloud/conda-forge/linux-64::libblas-3.9.0-8_openblas
  libcblas           anaconda/cloud/conda-forge/linux-64::libcblas-3.9.0-8_openblas
  libffi             anaconda/cloud/conda-forge/linux-64::libffi-3.3-h58526e2_2
  libgcc-ng          anaconda/cloud/conda-forge/linux-64::libgcc-ng-9.3.0-h2828fa1_18
  libgfortran-ng     anaconda/cloud/conda-forge/linux-64::libgfortran-ng-7.5.0-h14aa051_18
  libgfortran4       anaconda/cloud/conda-forge/linux-64::libgfortran4-7.5.0-h14aa051_18
  libgomp            anaconda/cloud/conda-forge/linux-64::libgomp-9.3.0-h2828fa1_18
  liblapack          anaconda/cloud/conda-forge/linux-64::liblapack-3.9.0-8_openblas
  libopenblas        anaconda/cloud/conda-forge/linux-64::libopenblas-0.3.12-pthreads_hb3c22a3_1
  libstdcxx-ng       anaconda/cloud/conda-forge/linux-64::libstdcxx-ng-9.3.0-h6de172a_18
  ncurses            anaconda/cloud/conda-forge/linux-64::ncurses-6.2-h58526e2_4
  numpy              anaconda/cloud/conda-forge/linux-64::numpy-1.20.1-py38h18fd61f_0
  openssl            anaconda/cloud/conda-forge/linux-64::openssl-1.1.1j-h7f98852_0
  pip                anaconda/cloud/conda-forge/noarch::pip-21.0.1-pyhd8ed1ab_0
  python             anaconda/cloud/conda-forge/linux-64::python-3.8.8-hffdb5ce_0_cpython
  python_abi         anaconda/cloud/conda-forge/linux-64::python_abi-3.8-1_cp38
  readline           anaconda/cloud/conda-forge/linux-64::readline-8.0-he28a2e2_2
  rmats              bioconda/linux-64::rmats-4.1.1-py38h566bde1_0
  setuptools         anaconda/cloud/conda-forge/linux-64::setuptools-49.6.0-py38h578d9bd_3
  sqlite             anaconda/cloud/conda-forge/linux-64::sqlite-3.34.0-h74cdb3f_0
  star               bioconda/linux-64::star-2.7.8a-0
  tk                 anaconda/cloud/conda-forge/linux-64::tk-8.6.10-h21135ba_1
  wheel              anaconda/cloud/conda-forge/noarch::wheel-0.36.2-pyhd3deb0d_0
  xz                 anaconda/cloud/conda-forge/linux-64::xz-5.2.5-h516909a_1
  zlib               anaconda/cloud/conda-forge/linux-64::zlib-1.2.11-h516909a_1010

After a successful installation, check the installed software:

$ STAR --version 
2.7.8a 
$ rmats.py  --version 
v4.1.1
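
The same check can be scripted so that a pipeline stops early when a tool is missing. This is a minimal sketch; `require_tools` is a hypothetical helper name, and it only tests that each command is on `PATH`, not which version it is.

```shell
# Hypothetical helper: report every command that cannot be found on PATH
# and return non-zero if any is missing.
require_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "not found on PATH: $tool" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage inside the activated rmats environment:
# require_tools STAR rmats.py
```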

Running alternative splicing analysis on the BAM files produced by STAR

Example sizes of the BAM files produced by a successful STAR run:

$ cat *txt|xargs ls -lh |cut -d" " -f 5-
3.7G 3月  10 18:25 SRR8518122.bam
3.9G 3月  10 19:25 SRR8518123.bam
3.6G 3月  10 18:21 SRR8518124.bam
7.9G 3月  12 12:12 SRR8518436.bam
3.2G 3月  12 12:59 SRR8518442.bam
7.2G 3月  12 15:05 SRR8518448.bam

The paths of the BAM files need to be written into two text files, one per sample group, as shown below:

jmzeng 21:30:42 ~/tnbc/test_rmats
$ cat g1.txt 
SRR8518122.bam,SRR8518123.bam,SRR8518124.bam
$ cat g2.txt 
SRR8518436.bam,SRR8518442.bam,SRR8518448.bam  
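
Files in this format can be generated rather than typed by hand. The sketch below assumes a hypothetical helper named `make_group_file`; it writes the paths exactly as given on a single comma-separated line, so pass absolute paths if the run will be started from another directory.

```shell
# Hypothetical helper: write the given BAM paths to a file as one
# comma-separated line, the layout shown in g1.txt and g2.txt above.
make_group_file() {
  out="$1"
  shift
  # One path per line, then join the lines with commas.
  printf '%s\n' "$@" | paste -s -d, - > "$out"
}

make_group_file g1.txt SRR8518122.bam SRR8518123.bam SRR8518124.bam
make_group_file g2.txt SRR8518436.bam SRR8518442.bam SRR8518448.bam
```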

When running rmats, pass the two files with --b1 and --b2:

gtf=$HOME/rna/SUPPA2/gtf/gencode.v37.annotation.gtf
# Trailing backslashes continue the single rmats.py command across lines
rmats.py --b1 g1.txt --b2 g2.txt \
  --gtf "$gtf" \
  -t paired --readLength 147 --nthread 4 \
  --od results --tmp tmp_output
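
Before launching the run, it can help to confirm that every BAM file named in the group files actually exists, since a wrong path or stray whitespace is easy to miss in a comma-separated line. This is a sketch only; `check_bam_list` is a hypothetical helper and is not part of rmats.

```shell
# Hypothetical pre-flight check: split a comma-separated list file into
# individual paths, trim surrounding whitespace, and report missing files.
# Returns non-zero if at least one listed file does not exist.
check_bam_list() {
  list_file="$1"
  tr ',' '\n' < "$list_file" |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | {
      missing=0
      while IFS= read -r bam; do
        [ -n "$bam" ] || continue   # skip empty fields
        if [ ! -f "$bam" ]; then
          echo "missing: $bam" >&2
          missing=1
        fi
      done
      [ "$missing" -eq 0 ]
    }
}

# Usage:
# check_bam_list g1.txt && check_bam_list g2.txt
```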

The log of a successful run looks like this:

gtf: 26.418766975402832
There are 60651 distinct gene ID in the gtf file
There are 234485 distinct transcript ID in the gtf file
There are 36780 one-transcript genes in the gtf file
There are 1460986 exons in the gtf file
There are 25134 one-exon transcripts in the gtf file
There are 22496 one-transcript genes with only one exon in the transcript
Average number of transcripts per gene is 3.866136
Average number of exons per transcript is 6.230616
Average number of exons per transcript excluding one-exon tx is 6.858587
Average number of gene per geneGroup is 8.495835
statistic: 0.04167461395263672

It usually runs quickly:

==========
Done processing each gene from dictionary to compile AS events
Found 55759 exon skipping events
Found 4089 exon MX events
Found 18752 alt SS events
There are 11349 alt 3 SS events and 7403 alt 5 SS events.
Found 8037 RI events
==========

ase: 3.8115618228912354
count: 5.383385896682739
Processing count files.
Done processing count files.

The run produces a large number of result files:

382K 3月  13 21:44 A3SS.MATS.JCEC.txt
368K 3月  13 21:44 A3SS.MATS.JC.txt
257K 3月  13 21:44 A5SS.MATS.JCEC.txt
241K 3月  13 21:44 A5SS.MATS.JC.txt
1.1M 3月  13 21:44 fromGTF.A3SS.txt
703K 3月  13 21:44 fromGTF.A5SS.txt
464K 3月  13 21:44 fromGTF.MXE.txt
 16K 3月  13 21:44 fromGTF.novelJunction.A3SS.txt
 11K 3月  13 21:44 fromGTF.novelJunction.A5SS.txt
 30K 3月  13 21:44 fromGTF.novelJunction.MXE.txt
2.1K 3月  13 21:44 fromGTF.novelJunction.RI.txt
356K 3月  13 21:44 fromGTF.novelJunction.SE.txt
 102 3月  13 21:44 fromGTF.novelSpliceSite.A3SS.txt
 102 3月  13 21:44 fromGTF.novelSpliceSite.A5SS.txt
 140 3月  13 21:44 fromGTF.novelSpliceSite.MXE.txt
 108 3月  13 21:44 fromGTF.novelSpliceSite.RI.txt
 104 3月  13 21:44 fromGTF.novelSpliceSite.SE.txt
758K 3月  13 21:44 fromGTF.RI.txt
5.3M 3月  13 21:44 fromGTF.SE.txt
 83K 3月  13 21:44 JCEC.raw.input.A3SS.txt
 56K 3月  13 21:44 JCEC.raw.input.A5SS.txt
 38K 3月  13 21:44 JCEC.raw.input.MXE.txt
123K 3月  13 21:44 JCEC.raw.input.RI.txt
356K 3月  13 21:44 JCEC.raw.input.SE.txt
 80K 3月  13 21:44 JC.raw.input.A3SS.txt
 52K 3月  13 21:44 JC.raw.input.A5SS.txt
 34K 3月  13 21:44 JC.raw.input.MXE.txt
112K 3月  13 21:44 JC.raw.input.RI.txt
329K 3月  13 21:44 JC.raw.input.SE.txt
199K 3月  13 21:44 MXE.MATS.JCEC.txt
177K 3月  13 21:44 MXE.MATS.JC.txt
579K 3月  13 21:44 RI.MATS.JCEC.txt
523K 3月  13 21:44 RI.MATS.JC.txt
1.6M 3月  13 21:44 SE.MATS.JCEC.txt
1.5M 3月  13 21:44 SE.MATS.JC.txt
 377 3月  13 21:44 summary.txt

Interpreting them in detail is time-consuming and requires working through the documentation bit by bit.
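
As a first pass before reading the documentation, one might simply count how many events in a result table pass a significance cutoff. The sketch below assumes the table is tab-separated with a header row containing a column named `FDR`, as in the `*.MATS.JC.txt` and `*.MATS.JCEC.txt` files; the helper name `count_significant` and the default cutoff of 0.05 are illustrative choices, not something prescribed by rmats.

```shell
# Hypothetical helper: count rows whose FDR is below a cutoff.
# $1: tab-separated result table with a header row
# $2: FDR cutoff (default 0.05, an arbitrary illustrative value)
count_significant() {
  awk -F'\t' -v cutoff="${2:-0.05}" '
    # Locate the FDR column by name in the header row.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "FDR") col = i; next }
    # Skip NA values, then compare numerically against the cutoff.
    col && $col != "NA" && $col + 0 < cutoff { n++ }
    END { print n + 0 }
  ' "$1"
}

# Usage:
# count_significant results/SE.MATS.JC.txt
# count_significant results/RI.MATS.JCEC.txt 0.01
```

Looking the column up by name rather than by position keeps the sketch usable across the five event types, whose tables have different numbers of coordinate columns.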

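If you genuinely feel that my tutorials have helped your research, gave you a moment of clarity, or if your project relies heavily on the skills I have shared, please consider adding a short acknowledgment when you publish your results, as shown below: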

We thank Dr. Jianming Zeng (University of Macau), and all the members of his bioinformatics team, biotrainee, for generously sharing their experience and codes.

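Ten years from now, when I travel to universities and research institutes around the world (mainland China included, of course), I will make a point of meeting you first if we share this kind of connection.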
