Retrieving and comparing protein sequences with the SeqinR package

2019-03-04 11:04:42

1. Retrieving protein sequences from UniProt

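# Retrieve two UniProt protein sequences using SeqinR
library("seqinr")
choosebank("swissprot")                        # select the Swiss-Prot database on the ACNUC server
leprae <- query("leprae", "AC=Q9CD83")         # look up the entry with accession Q9CD83
lepraeseq <- getSequence(leprae$req[[1]])      # extract its sequence as a vector of characters
ulcerans <- query("ulcerans", "AC=A0PQ23")     # look up the entry with accession A0PQ23
ulceransseq <- getSequence(ulcerans$req[[1]])
closebank()                                    # close the database connection
lepraeseq                                      # print the first sequence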
> lepraeseq
  [1] "M" "T" "N" "R" "T" "L" "S" "R" "E" "E" "I" "R" "K" "L" "D" "R" "D" "L" "R"
 [20] "I" "L" "V" "A" "T" "N" "G" "T" "L" "T" "R" "V" "L" "N" "V" "V" "A" "N" "E"
 [39] "E" "I" "V" "V" "D" "I" "I" "N" "Q" "Q" "L" "L" "D" "V" "A" "P" "K" "I" "P"
 [58] "E" "L" "E" "N" "L" "K" "I" "G" "R" "I" "L" "Q" "R" "D" "I" "L" "L" "K" "G"
 [77] "Q" "K" "S" "G" "I" "L" "F" "V" "A" "A" "E" "S" "L" "I" "V" "I" "D" "L" "L"
 [96] "P" "T" "A" "I" "T" "T" "Y" "L" "T" "K" "T" "H" "H" "P" "I" "G" "E" "I" "M"
[115] "A" "A" "S" "R" "I" "E" "T" "Y" "K" "E" "D" "A" "Q" "V" "W" "I" "G" "D" "L"
[134] "P" "C" "W" "L" "A" "D" "Y" "G" "Y" "W" "D" "L" "P" "K" "R" "A" "V" "G" "R"
[153] "R" "Y" "R" "I" "I" "A" "G" "G" "Q" "P" "V" "I" "I" "T" "T" "E" "Y" "F" "L"
[172] "R" "S" "V" "F" "Q" "D" "T" "P" "R" "E" "E" "L" "D" "R" "C" "Q" "Y" "S" "N"
[191] "D" "I" "D" "T" "R" "S" "G" "D" "R" "F" "V" "L" "H" "G" "R" "V" "F" "K" "N"
[210] "L"
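
The output above is an ordinary R character vector with one element per residue, so base R functions apply to it directly. As a minimal sketch, the snippet below uses only the first ten residues, typed in by hand so that it runs without a database connection; the variable names are chosen here for illustration.

```r
# Toy input: the first ten residues of the output above, typed in by hand
residues <- c("M", "T", "N", "R", "T", "L", "S", "R", "E", "E")

# Number of residues in the vector
n_residues <- length(residues)

# Collapse the vector into a single string
residue_string <- paste(residues, collapse = "")

# Count how often each amino acid occurs
composition <- table(residues)

print(n_residues)
print(residue_string)
print(composition)
```

The same calls should work unchanged on the full lepraeseq and ulceransseq vectors.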

2. Comparing the two sequences with a dotplot

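  • Two protein, DNA or RNA sequences can be compared with a dotplot. A dotplot is a two-dimensional matrix with one sequence along the horizontal axis and the other along the vertical axis, so every position of one sequence is compared with every position of the other.
  • In the simplest dotplot, a cell of the matrix is shaded black when the two residues or bases at that pair of positions are identical. Matching stretches of the two sequences then appear as diagonal lines across the matrix.
  • If the two sequences differ overall but share some similar regions, the dotplot shows short diagonal lines, which may be offset from the main diagonal.
  • In other words, a dotplot makes any region of similarity between two protein or DNA sequences easy to see. The dotPlot function in the SeqinR package draws one.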
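
The matrix described above can be sketched in base R on two toy sequences. The sequences and variable names below are made up for illustration and are not the proteins from section 1; the sketch only shows the idea and is not how SeqinR implements its plot.

```r
# Two made-up toy sequences (hypothetical, for illustration only)
seq_a <- c("M", "T", "N", "R", "T", "L")
seq_b <- c("M", "T", "Q", "R", "T", "L")

# outer() compares every residue of seq_a with every residue of seq_b;
# dot_matrix[i, j] is TRUE when seq_a[i] and seq_b[j] are identical
dot_matrix <- outer(seq_a, seq_b, "==")
dimnames(dot_matrix) <- list(seq_a, seq_b)

# Show matches as "*" and mismatches as "."
print(ifelse(dot_matrix, "*", "."), quote = FALSE)
```

The run of matches along the main diagonal is interrupted at the single substitution (N versus Q), and the repeated T produces two isolated off-diagonal matches. For the real sequences, the SeqinR call below does the plotting.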
 dotPlot(lepraeseq, ulceransseq)

[Figure: dotplot of lepraeseq against ulceransseq]

Interpretation:

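  • In the figure above, lepraeseq is on the x-axis and ulceransseq is on the y-axis. The plot shows the similarity between the two amino acid sequences.
  • Many dots fall along the diagonal, which shows that the two protein sequences share many identical amino acids at corresponding positions. This is not surprising, because the two sequences are homologous.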
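
One way to put a number on "many dots along the diagonal" is to count the matches on each diagonal of the dot matrix. The sketch below does this in base R on two made-up toy sequences (hypothetical, not the proteins above); the names offset and matches_per_diagonal are chosen here for illustration.

```r
# Two made-up toy sequences (hypothetical, for illustration only)
seq_a <- c("M", "T", "N", "R", "T", "L", "S")
seq_b <- c("M", "T", "Q", "R", "T", "L", "S")

# TRUE wherever residue i of seq_a equals residue j of seq_b
dot_matrix <- outer(seq_a, seq_b, "==")

# Offset of each cell from the main diagonal (0 = main diagonal)
offset <- col(dot_matrix) - row(dot_matrix)

# Number of matching cells on each diagonal
matches_per_diagonal <- tapply(as.vector(dot_matrix), as.vector(offset), sum)
print(matches_per_diagonal)
```

A diagonal with far more matches than its neighbours corresponds to a visible line in the plot, while isolated matches on other diagonals are mostly chance. SeqinR's dotPlot also takes wsize, wstep and nmatch arguments for windowed matching, which can be used to suppress such isolated dots; see its help page for details.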
