GEO数据挖掘—3

2023-03-20 22:19:59 浏览数 (1)

GEO数据挖掘—3

富集分析

(一)GO富集分析(用差异基因做富集)

输入数据

代码语言:javascript复制
#(1)输入数据
gene_up = deg$ENTREZID[deg$change == 'up'] 
gene_down = deg$ENTREZID[deg$change == 'down'] 
gene_diff = c(gene_up,gene_down)   #得到了差异基因
代码语言:javascript复制
#(2)富集
#以下步骤耗时很长,设置了存在即跳过
f = paste0(gse_number,"_GO.Rdata")
if(!file.exists(f)){
  ego <- enrichGO(gene = gene_diff,
                  OrgDb= org.Hs.eg.db,
                  ont = "ALL",
                  readable = TRUE)
  ego_BP <- enrichGO(gene = gene_diff,
                  OrgDb= org.Hs.eg.db,
                  ont = "BP",
                  readable = TRUE)
  #ont参数:One of "BP", "MF", and "CC" subontologies, or "ALL" for all three.
  save(ego,ego_BP,file = f)
}
代码语言:javascript复制
#(3)可视化
#条带图
barplot(ego)
barplot(ego, split = "ONTOLOGY", font.size = 10, 
        showCategory = 5)   
  facet_grid(ONTOLOGY ~ ., space = "free_y",scales = "free_y") 
  
  #气泡图
dotplot(ego)
​
dotplot(ego, split = "ONTOLOGY", font.size = 10, 
        showCategory = 5)   
  facet_grid(ONTOLOGY ~ ., space = "free_y",scales = "free_y") 
代码语言:javascript复制
#(4)展示top通路的共同基因,要放大看。
​
#gl 用于设置下图的颜色
gl = deg$logFC
names(gl)=deg$ENTREZID
#Gene-Concept Network,要放大看
cnetplot(ego,
         #layout = "star",
         color.params = list(foldChange = gl),
         showCategory = 3)

(二)KEGG富集分析

上调、下调、差异、所有基因

代码语言:javascript复制
#(1)输入数据
gene_up = deg[deg$change == 'up','ENTREZID'] 
gene_down = deg[deg$change == 'down','ENTREZID'] 
gene_diff = c(gene_up,gene_down)
代码语言:javascript复制
#(2)对上调/下调/所有差异基因进行富集分析
f2 = paste0(gse_number,"_KEGG.Rdata")
if(!file.exists(f2)){
  kk.up <- enrichKEGG(gene         = gene_up,
                      organism     = 'hsa')
  kk.down <- enrichKEGG(gene         =  gene_down,
                        organism     = 'hsa')
  kk.diff <- enrichKEGG(gene         = gene_diff,
                        organism     = 'hsa')
  save(kk.diff,kk.down,kk.up,file = f2)
}
load(f2)
代码语言:javascript复制
#(3)看看富集到了吗?https://mp.weixin.qq.com/s/NglawJgVgrMJ0QfD-YRBQg
table(kk.diff@result$p.adjust<0.05)
table(kk.up@result$p.adjust<0.05)
table(kk.down@result$p.adjust<0.05)
代码语言:javascript复制
#(4)双向图
# 富集分析所有图表默认都是用p.adjust,富集不到可以退而求其次用p值,在文中说明即可
down_kegg <- kk.down@result %>%
  filter(pvalue<0.05) %>% #筛选行
  mutate(group=-1) #新增列
​
up_kegg <- kk.up@result %>%
  filter(pvalue<0.05) %>%
  mutate(group=1)
​
source("kegg_plot_function.R")
g_kegg <- kegg_plot(up_kegg,down_kegg)
g_kegg
标准流程的后续
问题数据和常见错误分析
复杂数据及其分析

1.多分组数据:示例GSE474

2.多数据联系分析:例如GSE83521_ and_ GSE89143

批次效应

0 人点赞