一、Trajectory analysis 轨迹推断
轨迹推断(Trajectory Inference,TI),也称为细胞分析轨迹(differentiation trajectories)基于单细胞转录组数据,利用模型预测细胞分化过程。
由于分析结果是基于某个组织一瞬间的群体单细胞数据预测得到的,并非真实时间维度上的分化结果,因此也被称为伪时序分析 (pseudotime analysis),也可以叫做拟时分析。
拟时分析通过分析关键基因的表达模式,将所有细胞按照发育时间的先后排布在拟时间轴上,模拟发育过程中的细胞分化过程。通过对细胞轨迹的分析,我们可以挖掘出细胞分化过程中经历的细胞类型变化、伴随发育过程变化的动态变化基因、祖细胞不同的分化命运等与生命发育息息相关的信息。
二、monocle3 进行细胞轨迹推断
文档:https://cole-trapnell-lab.github.io/monocle3/docs/trajectories/
代码语言:javascript复制#1 读取数据
getwd()
setwd('/home/xhs/jyxy/16.rscrna/')
expression_matrix <- readRDS("./monocle3/packer/packer_embryo_expression.rds")
cell_metadata <- readRDS("./monocle3/packer/packer_embryo_colData.rds")
gene_annotation <- readRDS("./monocle3/packer/packer_embryo_rowData.rds")
cds <- new_cell_data_set(expression_matrix,cell_metadata = cell_metadata,gene_metadata = gene_annotation)
#2 数据处理以及去除批次效应
cds <- preprocess_cds(cds, num_dim = 50)
cds <- align_cds(cds, alignment_group = "batch", residual_model_formula_str = "~ bg.300.loading bg.400.loading bg.500.1.loading bg.500.2.loading bg.r17.loading bg.b01.loading bg.b02.loading")
#3 数据降维以及结果可视化
cds <- reduce_dimension(cds)
#设置group_label_size = 3,cell_size = 0.7调整文本和点的大小
plot_cells(cds, label_groups_by_cluster=FALSE,color_cells_by = "cell.type",group_label_size = 3,cell_size = 0.7)
#基因沿着轨迹变化
ciliated_genes <- c("che-1","hlh-17","nhr-6","dmd-6","ceh-36","ham-1")
plot_cells(cds,genes=ciliated_genes,label_cell_groups=FALSE,show_trajectory_graph=FALSE)
#4 聚类
cds <- cluster_cells(cds)
plot_cells(cds, color_cells_by = "partition")
#5 构建Learn the trajectory graph
cds <- learn_graph(cds)
plot_cells(cds,color_cells_by = "cell.type",label_groups_by_cluster=FALSE,label_leaves=FALSE,label_branch_points=FALSE)
#6排序 Order the cells in pseudotime
plot_cells(cds,color_cells_by = "embryo.time.bin",label_cell_groups=FALSE,label_leaves=TRUE,label_branch_points=TRUE, graph_label_size=1.5)
#7 选取root跟节点
cds <- order_cells(cds)
plot_cells(cds,
color_cells_by = "pseudotime",
label_cell_groups=FALSE,
label_leaves=FALSE,
label_branch_points=FALSE,
graph_label_size=1.5)
#8 自定义函数,确定根节点
get_earliest_principal_node <- function(cds, time_bin="130-170"){
cell_ids <- which(colData(cds)[, "embryo.time.bin"] == time_bin)
closest_vertex <-
cds@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex
closest_vertex <- as.matrix(closest_vertex[colnames(cds), ])
root_pr_nodes <-
igraph::V(principal_graph(cds)[["UMAP"]])$name[as.numeric(names
(which.max(table(closest_vertex[cell_ids,]))))]
root_pr_nodes
}
cds <- order_cells(cds, root_pr_nodes=get_earliest_principal_node(cds))
plot_cells(cds,color_cells_by = "pseudotime",label_cell_groups=FALSE,label_leaves=FALSE,label_branch_points=FALSE,graph_label_size=1.5)
#9 选取树枝Subset cells by branch
cds_sub <- choose_graph_segments(cds)
#10 绘制3D 轨迹图
cds_3d <- reduce_dimension(cds, max_components = 3)
cds_3d <- cluster_cells(cds_3d)
cds_3d <- learn_graph(cds_3d)
cds_3d <- order_cells(cds_3d, root_pr_nodes=get_earliest_principal_node(cds))
cds_3d_plot_obj <- plot_cells_3d(cds_3d, color_cells_by="partition")
写在最后:有时间我们会努力更新的。大家互动交流可以前去论坛,地址在下面,复制去浏览器即可访问,弥补下公众号没有留言功能的缺憾。原地址暂未启用(bioinfoer.com)。
代码语言:javascript复制sx.voiceclouds.cn
有些板块也可以预设为大家日常趣事的分享等,欢迎大家来提建议。