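Introduction
An enriched heatmap is a special type of heatmap that visualizes how genomic signals are enriched over specific target regions, for example the enrichment of histone modifications around transcription start sites (TSS).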
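Background
Today we bring you an R package dedicated to drawing enriched heatmaps: EnrichedHeatmap. It is built on top of the ComplexHeatmap package, and with it we can produce rich visualizations of all kinds of epigenomic data.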
Installing the R package
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("EnrichedHeatmap")
library(EnrichedHeatmap)
Visualization
01
Basic visualization
First, load the human lung tissue data from the Roadmap dataset.
set.seed(123)
load(system.file("extdata", "chr21_test_data.RData", package = "EnrichedHeatmap"))
## The methylation data were downsampled to 100,000 CpG sites.
tss = promoters(genes, upstream = 0, downstream = 1)
tss[1:5]
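As with other tools, the visualization takes two steps: first, normalize the genomic signals to a matrix that captures their association with the targets; second, visualize that matrix as a heatmap.
mat1 = normalizeToMatrix(H3K4me3, tss, value_column = "coverage",
    extend = 5000, mean_mode = "w0", w = 50)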
mat1
## Normalize H3K4me3 to tss:
## Upstream 5000 bp (100 windows)
## Downstream 5000 bp (100 windows)
## Include target regions (width = 1)
## 720 target regions
EnrichedHeatmap(mat1, name = "H3K4me3")
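The mean_mode argument decides how signal values are averaged when a window is only partly covered by signal regions. As a rough illustration, based on our reading of the package documentation and using made-up toy numbers, the sketch below compares three of the modes for a single window. The variable names are hypothetical and this is not the package's internal code.

```r
# Toy numbers (assumed, for illustration only): four signal regions
# overlap one window, and 4 bp of the window carry no signal at all.
value     <- c(40, 30, 50, 20)  # signal value of each overlapping region
overlap   <- c(4, 6, 3, 3)      # bp that each region shares with the window
uncovered <- 4                  # bp of the window not covered by any region

# "absolute": plain mean of the values, ignoring overlap widths
mean_absolute <- mean(value)

# "weighted": mean weighted by overlap width
mean_weighted <- sum(value * overlap) / sum(overlap)

# "w0": like "weighted", but uncovered bases count as zero signal
mean_w0 <- sum(value * overlap) / (sum(overlap) + uncovered)

print(c(absolute = mean_absolute, weighted = mean_weighted, w0 = mean_w0))
```

With w0, sparsely covered windows are pulled toward zero, which seems a natural fit for coverage-like signals such as H3K4me3, where no reads means no signal.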
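02
Parameter settings
As with ordinary heatmaps, the simplest way to set colors is to provide a color vector.
EnrichedHeatmap(mat1, col = c("white", "red"), name = "H3K4me3")
Alternatively, define a color mapping function that maps colors only to values below the 99th percentile; values above it get the same color as the 99th percentile, which limits the influence of extreme values.
library(circlize)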
col_fun = colorRamp2(quantile(mat1, c(0, 0.99)), c("white", "red"))
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3")
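To make the capping idea concrete, here is a minimal base R sketch on toy data. It is only an approximation of what a capped color mapping does: it interpolates linearly in RGB with colorRampPalette(), which may differ from how colorRamp2() interpolates, and all variable names are made up for the example.

```r
# Toy data (assumed): mostly small values plus one extreme outlier.
set.seed(1)
x <- c(rexp(1000), 50)

lo  <- min(x)
cap <- unname(quantile(x, 0.99))  # 99th percentile used as the upper break

# Clamp values above the cap, then rescale to [0, 1].
scaled <- (pmin(x, cap) - lo) / (cap - lo)

# Map the rescaled values onto a 101-step white-to-red palette.
pal  <- colorRampPalette(c("white", "red"))(101)
cols <- pal[round(scaled * 100) + 1]

# The outlier shares its color with every other value at or above the cap.
all(cols[x >= cap] == pal[101])
```

Without the cap, the single outlier at 50 would stretch the scale so far that almost all other values would look white.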
Splitting by rows
Set the row_split option to split rows by a vector or a data frame.
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3",
    row_split = sample(c("A", "B"), length(genes), replace = TRUE),
    column_title = "Enrichment of H3K4me3")
Set the row_km option to split rows by k-means clustering.
set.seed(123)
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3", row_km = 3,
    column_title = "Enrichment of H3K4me3", row_title_rot = 0)
Clustering rows
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3",
    cluster_rows = TRUE, column_title = "Enrichment of H3K4me3")
Extending the target regions
# 1 kb upstream and 2 kb downstream (either side can be set to 0)
mat12 = normalizeToMatrix(H3K4me3, tss, value_column = "coverage",
    extend = c(1000, 2000), mean_mode = "w0", w = 50)
EnrichedHeatmap(mat12, name = "H3K4me3", col = col_fun)
Enrichment annotation
The axis of the enrichment annotation is configured through axis_param in anno_enriched().
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3",
top_annotation = HeatmapAnnotation(
enriched = anno_enriched(
ylim = c(0, 10),
axis_param = list(
at = c(0, 5, 10),
labels = c("zero", "five", "ten"),
side = "left",
facing = "outside"
)))
)
When generating the matrix, set smooth = TRUE to smooth the signal.
mat1_smoothed = normalizeToMatrix(H3K4me3, tss, value_column = "coverage",
    extend = 5000, mean_mode = "w0", w = 50, background = 0, smooth = TRUE)
EnrichedHeatmap(mat1_smoothed, col = col_fun, name = "H3K4me3_smoothed",
column_title = "smoothed")
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3", column_title = "unsmoothed")
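In the plots above, the smoothed heatmap on the left arguably looks better than the unsmoothed one on the right. Next, we show that smoothing can markedly improve the enrichment pattern for the methylation dataset.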
## Unsmoothed
mat2 = normalizeToMatrix(meth, tss, value_column = "meth", mean_mode = "absolute",
    extend = 5000, w = 50, background = NA)
meth_col_fun = colorRamp2(c(0, 0.5, 1), c("blue", "white", "red"))
EnrichedHeatmap(mat2, col = meth_col_fun, name = "methylation", column_title = "methylation near TSS")
## Smoothed
mat2 = normalizeToMatrix(meth, tss, value_column = "meth", mean_mode = "absolute",
    extend = 5000, w = 50, background = NA, smooth = TRUE)
EnrichedHeatmap(mat2, col = meth_col_fun, name = "methylation", column_title = "methylation near TSS")
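To see why smoothing helps with sparse signals like methylation, where many windows contain no CpG and are left as NA, here is a small base R sketch on a toy profile. It uses stats::loess() as a stand-in smoother; the smoother that normalizeToMatrix() applies internally may differ, and the data and variable names are invented for the example.

```r
# Toy methylation-like profile across 100 windows (assumed data):
# low near the center (the "TSS"), high further away.
set.seed(42)
x     <- 1:100
truth <- 1 / (1 + exp(-(abs(x - 50) - 15) / 4))
obs   <- pmin(pmax(truth + rnorm(100, sd = 0.1), 0), 1)

# Blank out 40 interior windows to mimic windows with no CpG coverage.
obs[sample(2:99, 40)] <- NA

# Fit a local regression on the observed windows, then predict every window.
fit      <- loess(obs ~ x, span = 0.3)
smoothed <- predict(fit, newdata = data.frame(x = x))
smoothed <- pmin(pmax(smoothed, 0), 1)  # keep values in the [0, 1] range

sum(is.na(obs))       # windows without a value before smoothing
sum(is.na(smoothed))  # windows without a value after smoothing
```

The fitted curve fills the gaps from neighboring windows, which is the kind of effect that makes the smoothed methylation heatmap look far less patchy.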
Multiple heatmaps
The strength of the EnrichedHeatmap package is that heatmaps can be concatenated side by side: enriched heatmaps, ordinary heatmaps, and row annotations can all be combined with the + operator. This offers a very effective way to visualize multiple sources of information.
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3",
    top_annotation = HeatmapAnnotation(enrich = anno_enriched(axis_param = list(side = "left")))) +
EnrichedHeatmap(mat2, col = meth_col_fun, name = "methylation") +
Heatmap(log2(rpkm + 1), col = c("white", "orange"), name = "log2(rpkm+1)",
    show_row_names = FALSE, width = unit(5, "mm"))
# Partition the rows by k-means, then split every heatmap in the list by that partition
partition = paste0("cluster", kmeans(mat1, centers = 3)$cluster)
lgd = Legend(at = c("cluster1", "cluster2", "cluster3"), title = "Clusters",
    type = "lines", legend_gp = gpar(col = 2:4))
ht_list = Heatmap(partition, col = structure(2:4, names = paste0("cluster", 1:3)), name = "partition",
        show_row_names = FALSE, width = unit(3, "mm")) +
    EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3",
        top_annotation = HeatmapAnnotation(lines = anno_enriched(gp = gpar(col = 2:4))),
        column_title = "H3K4me3") +
    EnrichedHeatmap(mat2, col = meth_col_fun, name = "methylation",
        top_annotation = HeatmapAnnotation(lines = anno_enriched(gp = gpar(col = 2:4))),
        column_title = "Methylation") +
    Heatmap(log2(rpkm + 1), col = c("white", "orange"), name = "log2(rpkm+1)",
        show_row_names = FALSE, width = unit(15, "mm"),
        top_annotation = HeatmapAnnotation(summary = anno_summary(gp = gpar(fill = 2:4),
            outline = FALSE, axis_param = list(side = "right"))))
draw(ht_list, split = partition, annotation_legend_list = list(lgd),
    ht_gap = unit(c(2, 8, 8), "mm"))
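As far as we understand it, the colored lines that anno_enriched() draws above a split heatmap summarize the average signal per window within each cluster. The base R sketch below reproduces that idea on a toy matrix so the numbers can be inspected directly; the matrix and all names are invented for the example.

```r
# Toy matrix (assumed): 60 regions x 20 windows, three groups of 20 regions
# with increasing signal strength around the center.
set.seed(123)
bump     <- dnorm(1:20, mean = 10.5, sd = 3)  # bell-shaped profile
strength <- rep(c(0, 15, 30), each = 20)      # per-region signal strength
mat <- outer(strength, bump) + matrix(rnorm(60 * 20, sd = 0.1), nrow = 60)

# Same partitioning step as in the tutorial code, applied to the toy matrix.
partition <- paste0("cluster", kmeans(mat, centers = 3)$cluster)

# Mean profile per cluster: one row per cluster, one column per window.
profiles <- rowsum(mat, partition) / as.vector(table(partition))

round(profiles[, 9:12], 2)  # average signal in the central windows
```

Because k-means starts from random centers, the cluster labels can change between runs, which is why the tutorial calls set.seed() before clustering.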
Visualizing positive and negative signals separately
load(system.file("extdata", "H3K4me1_corr_normalize_to_tss.RData", package = "EnrichedHeatmap"))
mat_H3K4me1
corr_col_fun = colorRamp2(c(-1, 0, 1), c("darkgreen", "white", "red"))
EnrichedHeatmap(mat_H3K4me1, col = corr_col_fun, name = "corr_H3K4me1",
    top_annotation = HeatmapAnnotation(
        enrich = anno_enriched(gp = gpar(neg_col = "darkgreen", pos_col = "red"),
            axis_param = list(side = "left"))
    ), column_title = "separate neg and pos") +
EnrichedHeatmap(mat_H3K4me1, col = corr_col_fun, show_heatmap_legend = FALSE,
    top_annotation = HeatmapAnnotation(enrich = anno_enriched(value = "abs_mean")),
    column_title = "pool neg and pos")
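Summary
EnrichedHeatmap and ComplexHeatmap share the same author, who clearly knows heatmaps inside out. EnrichedHeatmap offers powerful functionality for visualizing enrichment and is especially well suited to epigenetic modifications. Give it a try!
For detailed usage, see: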
https://www.bioconductor.org/packages/release/bioc/vignettes/EnrichedHeatmap/inst/doc/EnrichedHeatmap.html