Computing Molecular Descriptors in R

2019-07-31 10:18:30

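Anyone who studies medicinal chemistry will know that many software packages can compute molecular descriptors for a compound. This post shows how to compute them in R, using MACCS molecular fingerprints as the worked example.

  1. The required R packages are rJava, rcdklibs, and rcdk (the main package).
  2. First, a look at the functions and help topics that make up the rcdk package.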

| Function | Description |
| --- | --- |
| bpdata | Boiling Point Data |
| cdk.version | Get Current CDK Version |
| cdkFormula-class | Class cdkFormula, a class for handling molecular formula |
| charge | Get the Total Charges for the Molecule |
| compare.isotope.pattern | Compare isotope patterns. |
| convert.implicit.to.explicit | Operations on molecules |
| copy.image.to.clipboard | View and Copy 2D Structure Diagrams |
| depict | View and Copy 2D Structure Diagrams |
| do.aromaticity | Perform Aromaticity Detection, atom typing or isotopic configuration |
| do.isotopes | Perform Aromaticity Detection, atom typing or isotopic configuration |
| do.typing | Perform Aromaticity Detection, atom typing or isotopic configuration |
| eval.atomic.desc | Evaluate an Atomic Descriptor |
| eval.desc | Evaluate a Molecular Descriptor |
| fragment | Molecule Fragmentation Methods |
| generate.2d.coordinates | Generate 2D Coordinates from Connectivity Information |
| generate.formula | Generate molecular formulae given a target mass and a set of elements and counts. |
| generate.formula.iter | Generate molecular formulae given a target mass and a set of elements and counts. |
| get.adjacency.matrix | Get adjacency matrix for a molecule. |
| get.alogp | Commonly Used Molecular Descriptors |
| get.atom.count | Get the atoms from a molecule or bond |
| get.atom.index | Operations on atoms |
| get.atomic.desc.names | Get the names of the available atomic descriptors |
| get.atomic.number | Operations on atoms |
| get.atoms | Get the atoms from a molecule or bond |
| get.bonds | Get the bonds from a molecule |
| get.charge | Operations on atoms |
| get.connected.atom | Get the atom connected to an atom in a bond |
| get.connected.atoms | Operations on atoms |
| get.connection.matrix | Get connection matrix for a molecule. |
| get.depictor | View and Copy 2D Structure Diagrams |
| get.desc.categories | Get Descriptor Class Names |
| get.desc.names | Get Descriptor Class Names |
| get.exact.mass | Operations on molecules |
| get.exhaustive.fragments | Molecule Fragmentation Methods |
| get.fingerprint | Evaluate Fingerprints |
| get.formal.charge | Operations on atoms |
| get.formula | Get the formula object from a formula character. |
| get.hydrogen.count | Operations on atoms |
| get.isotope.pattern.generator | Construct an isotope pattern generator. |
| get.isotope.pattern.similarity | Construct an isotope pattern similarity calculator. |
| get.isotopes.pattern | Generate the isotope pattern. |
| get.largest.component | Get the Largest Component in a Disconnected Molecule |
| get.mcs | Perform Substructure Searching & MCS Detection |
| get.mol2formula | Parser a molecule to formula object. |
| get.murcko.fragments | Molecule Fragmentation Methods |
| get.natural.mass | Operations on molecules |
| get.point2d | Operations on atoms |
| get.point3d | Operations on atoms |
| get.properties | Get All Property Values of a Molecule |
| get.property | Get the Value of a Molecule Property |
| get.smiles | Get the SMILES for a Molecule |
| get.smiles.parser | Get a SMILES Parser |
| get.symbol | Operations on atoms |
| get.title | Get the Value of a Molecule Property |
| get.total.charge | Get the Total Charges for the Molecule |
| get.total.formal.charge | Get the Total Charges for the Molecule |
| get.total.hydrogen.count | Get the Total Hydrogen Count for a Molecule |
| get.tpsa | Commonly Used Molecular Descriptors |
| get.volume | Commonly Used Molecular Descriptors |
| get.xlogp | Commonly Used Molecular Descriptors |
| hasNext | Does This Iterator Have A Next Element |
| hasNext.iload.molecules | Does This Iterator Have A Next Element |
| iload.molecules | Load Molecular Structures From Disk |
| is.aliphatic | Operations on atoms |
| is.aromatic | Operations on atoms |
| is.connected | Get the Largest Component in a Disconnected Molecule |
| is.in.ring | Operations on atoms |
| is.neutral | Operations on molecules |
| is.subgraph | Perform Substructure Searching & MCS Detection |
| isvalid.formula | Validate a cdkFormula object. |
| load.molecules | Load Molecular Structures From Disk |
| match | Perform Substructure Searching & MCS Detection |
| matches | Perform Substructure Searching & MCS Detection |
| mcs | Perform Substructure Searching & MCS Detection |
| parse.smiles | Parse a Vector of SMILES Strings |
| remove.hydrogens | Remove Hydrogens from a Molecule |
| remove.property | Remove A Property From a Molecule |
| set.charge.formula | Set the charge to a cdkFormula object. |
| set.property | Set A Property On A Molecule |
| show-method | Class cdkFormula, a class for handling molecular formula |
| smarts | Perform Substructure Searching & MCS Detection |
| smiles.flavors | Generate flag for customizing SMILES generation. |
| substructure | Perform Substructure Searching & MCS Detection |
| view.image.2d | View and Copy 2D Structure Diagrams |
| view.molecule.2d | View and Copy 2D Structure Diagrams |
| view.table | View 2D Structures With Data |
| write.molecules | Write Molecules To Disk |
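
The same listing can be pulled up from an R session instead of being read off a table. This is a minimal sketch, assuming rcdk and a working Java installation are already in place; the category name "topological" is only an illustrative choice.

```r
library(rcdk)

# Report the version of the CDK Java library that rcdk wraps.
cdk.version()

# List every object exported by the package (functions and data sets).
rcdk_exports <- ls("package:rcdk")
head(rcdk_exports)

# Descriptors are grouped into categories; list the categories, then
# the descriptor class names that belong to one of them.
desc_categories <- get.desc.categories()
topo_descriptors <- get.desc.names("topological")
head(topo_descriptors)
```

The descriptor class names returned by get.desc.names() are what eval.desc() expects when evaluating a descriptor on a molecule.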

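  3. Installing the rcdk package:

a. On Windows:

First, download the Java JDK from the official Java website:

http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Then install rJava and rcdk, in that order.

b. On Linux:

As on Windows, install the Java JDK first. On Ubuntu the Java environment can be installed directly with: apt install openjdk-8-jre-headless; sudo apt-get install openjdk-8-jdk

For installing R itself, see the article "Installing R on Linux" (R语言在Linux的安装). Then install rJava and rcdk, in that order.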
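
Once Java is in place, the R side of the installation might look like the sketch below. It assumes the packages are installed from CRAN, and the cloud.r-project.org mirror is a choice made here, not something the article specifies.

```r
# Packages named in step 1; rcdk is the main one.
pkgs <- c("rJava", "rcdklibs", "rcdk")

# Install only the packages that are not already present.
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}

# Loading rcdk starts a Java VM through rJava, so a failure at this
# point is often a sign that Java is not set up correctly.
library(rcdk)
cdk.version()
```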

  4. Importing data

a. load.molecules() loads molecular structures from files on disk.

Example: mol <- load.molecules("G:/drugbank.sdf")

b. parse.smiles() parses SMILES strings.

Example: mol <- parse.smiles('C1C=CCC1N(C)c1ccccc1')[[1]]
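
Both import functions return a list of molecule objects, which is why the SMILES example above takes the first element with [[1]]. The sketch below illustrates this with a few SMILES strings; ethanol and benzene are added here purely for illustration, and the SDF path is the article's example, so it is guarded in case the file does not exist.

```r
library(rcdk)

# parse.smiles() accepts a character vector and returns a list with
# one molecule object per input string.
smiles <- c("CCO", "c1ccccc1", "C1C=CCC1N(C)c1ccccc1")
mols <- parse.smiles(smiles)
length(mols)

# A single molecule is extracted with [[ ]].
mol <- mols[[3]]

# load.molecules() also returns a list of molecules. The file is only
# read if it is actually present on this machine.
sdf_path <- "G:/drugbank.sdf"
if (file.exists(sdf_path)) {
  sdf_mols <- load.molecules(sdf_path)
  length(sdf_mols)
}
```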
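  5. Computing MACCS fingerprints and basic molecular descriptions

a. get.smiles() returns the SMILES string of a molecule.

b. get.atom.count() returns the number of atoms that make up a molecule.

c. get.fingerprint() computes a molecule's fingerprint; pass type = "maccs" to get the MACCS fingerprint.

[Figure: excerpt of the MACCS fingerprint output]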
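
The three calls above can be combined as in the following sketch. The input molecules are illustrative, and fp.to.matrix() comes from the fingerprint package, which rcdk depends on; using it to build a 0/1 matrix is an approach chosen here, not one the article prescribes.

```r
library(rcdk)
library(fingerprint)  # provides fp.to.matrix()

smiles <- c("CCO", "c1ccccc1", "C1C=CCC1N(C)c1ccccc1")
mols <- parse.smiles(smiles)

# Basic descriptions of a single molecule.
mol <- mols[[3]]
get.smiles(mol)
get.atom.count(mol)

# One MACCS fingerprint per molecule.
fps <- lapply(mols, get.fingerprint, type = "maccs")

# Positions of the bits that are switched on in the first fingerprint.
fps[[1]]@bits

# Stack the fingerprints into a 0/1 matrix: one row per molecule,
# one column per bit position.
fp_matrix <- fp.to.matrix(fps)
rownames(fp_matrix) <- smiles
dim(fp_matrix)
```

A matrix layout is convenient because most downstream modelling functions in R expect one row per sample and one column per feature.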

  6. Exporting data

Export uses the familiar write.csv(). Once all the fingerprint data has been written out, it is ready for the next stage of computation.
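
As a rough illustration of the export step, the sketch below writes a small fingerprint table to CSV and reads it back. The molecules, the column names (bit1, bit2, ...), and the output file name are all hypothetical choices made for this example.

```r
library(rcdk)
library(fingerprint)

smiles <- c("CCO", "c1ccccc1", "C1C=CCC1N(C)c1ccccc1")
mols <- parse.smiles(smiles)
fps <- lapply(mols, get.fingerprint, type = "maccs")

# One row per molecule, one 0/1 column per fingerprint bit.
fp_df <- as.data.frame(fp.to.matrix(fps))
colnames(fp_df) <- paste0("bit", seq_len(ncol(fp_df)))
fp_df <- cbind(smiles = smiles, fp_df)

# Write to a temporary directory so the example leaves no stray files.
out_file <- file.path(tempdir(), "maccs_fingerprints.csv")
write.csv(fp_df, out_file, row.names = FALSE)

# Read the file back to confirm the table survived the round trip.
check <- read.csv(out_file)
dim(check)
```

Keeping the SMILES string as the first column makes it possible to match each fingerprint row back to its compound later.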
