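Contents

  • Drawing customized heatmaps with the ComplexHeatmap package
  • Checking, installing, and loading packages
  • Creating a test dataset
  • Plotting with a single command
  • Tuning parameters for a polished figure
  • You may also like
  • Closing notes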

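Drawing customized heatmaps with the ComplexHeatmap package

Author: Liu Mengyao (刘梦瑶), Novogene (诺禾致源), Microbial Informatics


ComplexHeatmap, created by Dr. Zuguang Gu (顾祖光), is a comprehensive R package for drawing heatmaps. It can reproduce many of the polished figures seen in the literature, such as the heatmap from a 16S analysis paper shown in the figure below. This article shows how to use the package to draw a similar customized heatmap.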

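Checking, installing, and loading packages

# Check for the CRAN packages and install any that are missing
# (grid ships with R, so it only needs to be loaded, not installed)
package_list = c("circlize", "BiocManager")
for (p in package_list) {
  if (!requireNamespace(p, quietly = TRUE))
    install.packages(p)
}
# Check for the Bioconductor packages and install any that are missing
package_list = c("ComplexHeatmap")
for (p in package_list) {
  if (!requireNamespace(p, quietly = TRUE))
    BiocManager::install(p)
}

# Load the required packages
library(circlize)
library(grid)
library(ComplexHeatmap)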

Creating a test dataset

The test data can be created by following the ComplexHeatmap documentation on the Bioconductor website (http://bioconductor.org/packages/release/bioc/vignettes/ComplexHeatmap/inst/doc/s2.single_heatmap.html).

# Set the random seed so the random steps of the analysis are reproducible
set.seed(123)
# Simulate the data: a matrix with 12 rows and 10 columns
mat = cbind(rbind(matrix(rnorm(16, -1), 4),
                  matrix(rnorm(32, 1), 8)),
            rbind(matrix(rnorm(24, 1), 4),
                  matrix(rnorm(48, -1), 8)))
# Shuffle the rows and columns at random
mat = mat[sample(nrow(mat), nrow(mat)),
          sample(ncol(mat), ncol(mat))]
# Add row and column names
rownames(mat) = paste0("R", 1:12)
colnames(mat) = paste0("C", 1:10)
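
As a quick sanity check on the simulated data, the sketch below rebuilds the same matrix before shuffling and looks at its shape and block structure. The names mat_blocks and block_means are introduced here for illustration only.

```r
set.seed(123)
# Rebuild the unshuffled matrix: a 12 x 4 block next to a 12 x 6 block
mat_blocks = cbind(rbind(matrix(rnorm(16, -1), 4),
                         matrix(rnorm(32, 1), 8)),
                   rbind(matrix(rnorm(24, 1), 4),
                         matrix(rnorm(48, -1), 8)))
dim(mat_blocks)   # 12 rows, 10 columns

# Mean of each of the four blocks; opposite corners share the same target mean
block_means = c(top_left     = mean(mat_blocks[1:4, 1:4]),
                bottom_left  = mean(mat_blocks[5:12, 1:4]),
                top_right    = mean(mat_blocks[1:4, 5:10]),
                bottom_right = mean(mat_blocks[5:12, 5:10]))
round(block_means, 2)
```

The shuffling step in the article hides this block pattern, which is what gives the clustering in the next section something to recover.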

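Plotting with a single command

With the default parameters, a single command is enough to produce a plot.

# By default, both rows and columns are clustered
Heatmap(mat)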
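
To see what row and column clustering does to this dataset outside of the plot, the sketch below runs hierarchical clustering in base R. It assumes Euclidean distance with complete linkage, which to our knowledge are the Heatmap() defaults; the leaf order drawn by Heatmap() may still differ, so treat this as an illustration rather than a reproduction of its internals. The names row_hc, col_hc, row_groups, and col_groups are illustrative.

```r
# Rebuild the test matrix so this sketch runs on its own
set.seed(123)
mat = cbind(rbind(matrix(rnorm(16, -1), 4), matrix(rnorm(32, 1), 8)),
            rbind(matrix(rnorm(24, 1), 4), matrix(rnorm(48, -1), 8)))
mat = mat[sample(nrow(mat), nrow(mat)), sample(ncol(mat), ncol(mat))]
rownames(mat) = paste0("R", 1:12)
colnames(mat) = paste0("C", 1:10)

# Hierarchical clustering of rows and of columns (assumed settings, see above)
row_hc = hclust(dist(mat), method = "complete")
col_hc = hclust(dist(t(mat)), method = "complete")

# Cut each tree into two groups to see which rows and columns go together
row_groups = cutree(row_hc, k = 2)
col_groups = cutree(col_hc, k = 2)
table(row_groups)
table(col_groups)
```

Clustering can be switched off with cluster_rows = FALSE and cluster_columns = FALSE, as done later in this article.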

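Tuning parameters for a polished figure

Next we customize the heatmap through its parameters.

The HeatmapAnnotation function builds an annotation object. An annotation can be fully user-defined or built from the package's built-in annotation functions.

By position, annotations are either row annotations or column annotations. Taking column annotations as an example, the built-in functions cover six plot types: anno_points(), anno_barplot(), anno_boxplot(), anno_histogram(), anno_density(), and anno_text().

The built-in functions for row annotations mirror those for column annotations, with a row prefix added, for example row_anno_points().
Detailed examples are available at: http://bioconductor.org/packages/release/bioc/vignettes/ComplexHeatmap/inst/doc/s4.heatmap_annotation.html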
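
As a brief illustration of one of the other built-in functions, the sketch below adds a bar plot of column means above a heatmap using anno_barplot(). The random matrix, the annotation name col_mean, and the variable names are assumptions made for this example; the plotting part is skipped if ComplexHeatmap is not installed.

```r
set.seed(123)
# Illustrative 12 x 10 matrix, not the article's dataset
mat = matrix(rnorm(120), nrow = 12,
             dimnames = list(paste0("R", 1:12), paste0("C", 1:10)))
# One value per column: here simply the column means
col_means = colMeans(mat)

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  # Column annotation drawn as a bar plot, one bar per heatmap column
  ha_bar = HeatmapAnnotation(col_mean = anno_barplot(col_means))
  ht = Heatmap(mat, name = "value", top_annotation = ha_bar)
  draw(ht)
}
```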

This article focuses on the use of anno_points().

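# Create a vector of ten values, all equal to 0.5
value = rep(0.5, 10)
# Set the values, symbols, sizes, and colors of the points
ha = HeatmapAnnotation(
  "type" = anno_points(
    value,
    pch = c(19, 19, 15, 15, 24, 24, 23, 23, 3, 3),
    size = unit(7, "mm"),
    gp = gpar(col = c("#bf94e4", "#bf94e4", "#bf94e4", "#bf94e4",
                      "#1dacd6", "#1dacd6", "#1dacd6", "#1dacd6",
                      "red", "red")),
    border = FALSE,
    ylim = c(0, 1)),
  show_annotation_name = FALSE)

"type" is the name of this annotation track, and show_annotation_name = FALSE hides that name. pch selects the symbol used to draw each point; R provides 26 symbols (pch codes 0 through 25), including upward triangles, downward triangles, circles, and squares. See "R in Action" for the full list. size sets the symbol size, and gp sets the graphical parameters, here the symbol color.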

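The pch and color vectors above repeat each entry for a pair of adjacent columns, so they can also be generated rather than typed out. A minimal sketch, assuming the ten columns form five groups of two as in the annotation above; the names pch_by_group, col_by_group, pch_vec, and col_vec are illustrative.

```r
# One symbol and one color per group of two columns
pch_by_group = c(19, 15, 24, 23, 3)
col_by_group = c("#bf94e4", "#bf94e4", "#1dacd6", "#1dacd6", "red")

# Repeat each entry twice to get one value per heatmap column
pch_vec = rep(pch_by_group, each = 2)
col_vec = rep(col_by_group, each = 2)

# Side-by-side view of the per-column settings
data.frame(column = paste0("C", 1:10), pch = pch_vec, col = col_vec)
```

The per-group vectors are the same ones the custom legend at the end of the article needs, so defining them once keeps the annotation and its legend in sync.
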
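# Z-score each row: subtract the row mean, then divide by the row standard deviation
mat_scaled = apply(mat, 1, scale)
# apply() returns the scaled rows as columns, so the original column names become the row names
rownames(mat_scaled) = colnames(mat)
# Transpose to restore the original orientation
mat_scaled = t(mat_scaled)
# Define a custom color mapping with colorRamp2() from the circlize package
col_fun = circlize::colorRamp2(c(-3, 0, 3), c("black", "white", "yellow"))
# Build a character matrix of marks: "+" where the Z-score is at least 1, "" elsewhere
# (a common way to flag significant cells). ifelse() keeps the matrix shape and tests
# the numeric values; assigning "+" into a numeric matrix cell by cell would coerce the
# whole matrix to character and turn the later comparisons into string comparisons.
shape = ifelse(mat_scaled >= 1, "+", "")

If the data need to be standardized, this has to be done before plotting, here with apply(). Colors are customized with colorRamp2() from the circlize package. The values in mat_scaled are then thresholded to produce a new character matrix whose entries are either a plus sign or an empty string. This part can be adapted to the requirements of the figure.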
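
To check that the row-wise standardization behaves as described, the sketch below compares the apply() route with the equivalent t(scale(t(mat))) form and confirms that every row ends up with mean 0 and standard deviation 1. The random matrix and the names z_apply and z_scale are assumptions made for this example.

```r
set.seed(123)
# Illustrative 12 x 10 matrix, not the article's dataset
mat = matrix(rnorm(120), nrow = 12,
             dimnames = list(paste0("R", 1:12), paste0("C", 1:10)))

# Route used in the article: scale each row, then transpose back
z_apply = t(apply(mat, 1, scale))
colnames(z_apply) = colnames(mat)

# Equivalent route: scale() standardizes columns, so transpose before and after
z_scale = t(scale(t(mat)))

# Both routes give the same values
all.equal(z_apply, z_scale, check.attributes = FALSE)

# Each row now has mean 0 and standard deviation 1, up to floating-point error
range(rowMeans(z_scale))
range(apply(z_scale, 1, sd))
```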

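P1 = Heatmap(mat_scaled,
  name = "hello",
  top_annotation = ha,
  col = col_fun,
  rect_gp = gpar(col = "black", lty = 2, lwd = 1),
  cell_fun = function(j, i, x, y, width, height, fill) {
    grid.text(shape[i, j], x = x, y = y, gp = gpar(fontsize = 10, col = "red"))
  },
  cluster_rows = FALSE,
  cluster_columns = FALSE,
  row_names_side = "left",
  column_names_side = "bottom",
  row_names_gp = gpar(col = c("#8B7500", "#8B7500", "#8B7500", "#8B7500", "#8B7500", "#8B7500",
                              "#0000FF", "#0000FF", "#0000FF", "#0000FF", "#0000FF", "#0000FF")))

name sets the title of the heatmap legend. top_annotation takes the column annotation defined above and places it above the heatmap; bottom_annotation places a column annotation below the heatmap instead. rect_gp sets the border color, line type, and line width of the cells. cell_fun customizes each individual cell of the heatmap; here it draws the "+" marks, and it can equally be used to display numbers. cluster_rows and cluster_columns control whether rows and columns are clustered. row_names_side sets where the row names are shown (default "right"), and column_names_side sets where the column names are shown (default "bottom"). row_names_gp sets the graphical parameters of the row names, here their colors.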
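
Since cell_fun can also display numbers, the sketch below prints each cell's Z-score, rounded to one decimal place, instead of a "+" mark. The random matrix, the legend name "z-score", and the names z and labels are assumptions made for this example; the plotting part is skipped if ComplexHeatmap is not installed.

```r
set.seed(123)
# Illustrative 12 x 10 matrix, standardized by row
mat = matrix(rnorm(120), nrow = 12,
             dimnames = list(paste0("R", 1:12), paste0("C", 1:10)))
z = t(scale(t(mat)))

# Format all labels once so cell_fun only has to look them up
labels = matrix(sprintf("%.1f", z), nrow = nrow(z), dimnames = dimnames(z))

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  library(grid)
  ht = Heatmap(z, name = "z-score",
               cluster_rows = FALSE, cluster_columns = FALSE,
               # i is the row index and j the column index of the cell being drawn
               cell_fun = function(j, i, x, y, width, height, fill) {
                 grid.text(labels[i, j], x = x, y = y, gp = gpar(fontsize = 8))
               })
  draw(ht)
}
```

Clustering is switched off here only to keep the example simple; cell_fun receives the original row and column indices, so the same lookup also works when clustering reorders the cells.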

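# First column of extra row labels
texta = c("A", "B", "C", "D", "EEEEE", "F", "G", "H", "I", "J", "K", "L")
# Row annotation holding the labels; its width is set to that of the widest label
ha_texta = rowAnnotation(text = row_anno_text(texta), width = max_text_width(texta))
# Second column of extra row labels
textb = c("M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X")
ha_textb = rowAnnotation(text = row_anno_text(textb), width = max_text_width(textb))
# Attach the two label annotations to the heatmap
ht_list = P1 + ha_texta + ha_textb
# Build a custom legend: labels with their matching symbols and colors
lgd = legendGrob(c("A", "B", "C", "D", "E"), pch = c(19, 15, 24, 23, 3),
                 gp = gpar(col = c("#bf94e4", "#bf94e4", "#1dacd6", "#1dacd6", "red")))
# Draw the figure: heatmap legend on the left, custom legend added as an annotation legend
draw(ht_list, heatmap_legend_side = "left", annotation_legend_list = list(lgd))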