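Single-cell code walkthrough: single-cell transcriptome and chromatin accessibility analysis of gynecologic cancers, part 1: https://cloud.tencent.com/developer/article/2055573
Single-cell code walkthrough: single-cell transcriptome and chromatin accessibility analysis of gynecologic cancers, part 2: https://cloud.tencent.com/developer/article/2072069
Single-cell code walkthrough: single-cell transcriptome and chromatin accessibility analysis of gynecologic cancers, part 3: https://cloud.tencent.com/developer/article/2078159
Single-cell code walkthrough: single-cell transcriptome and chromatin accessibility analysis of gynecologic cancers, part 4: https://cloud.tencent.com/developer/article/2078348
Single-cell code walkthrough: single-cell transcriptome and chromatin accessibility analysis of gynecologic cancers, part 5: https://cloud.tencent.com/developer/article/2084580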
Code walkthrough
# Part 4: SingleR cell typing
###########################################################
# SingleR labeling of celltypes
##########################################################################
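## Begin cell-type annotation of the single cells.
## SingleR assigns labels by comparing each cell against reference datasets with known labels; human and mouse references are available.
## Convert the Seurat object `rna` from the earlier parts to a SingleCellExperiment (an S4 object).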
rna.sce <- as.SingleCellExperiment(rna)
# Set reference paths:
# Wang et al. GSE111976
ref.data.counts <- readRDS("./GSE111976_ct_endo_10x.rds")
meta <- read.csv("./GSE111976_summary_10x_day_donor_ctype.csv")
rownames(meta) <- meta$X
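# Sanity check: count the metadata rows whose names match the count-matrix columns
# (should equal ncol(ref.data.counts) if both are in the same order)
length(which(rownames(meta) == colnames(ref.data.counts)))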
ref.data.endo <- CreateSeuratObject(ref.data.counts,meta.data = meta)
Idents(ref.data.endo) <- "cell_type"
ref.data.endo <- NormalizeData(ref.data.endo)
# SingleR annotation
#####################################################
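## Load the reference datasets
# Read in reference datasets for SingleR annotation
# 1) scRNA-seq reference: the Wang et al. GSE111976 endometrium data loaded above
#    (the original comment here cited Slyper et al. Nat. Medicine 2020, ovarian tumor)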
ref.data.endo <- as.SingleCellExperiment(ref.data.endo)
# 2) Human Primary Cell Atlas Data (microarray)
ref.data.HPCA <- readRDS("./HPCA_celldex.rds")
# 3) BluePrint Encode (bulk RNA-seq)
ref.data.BED <- readRDS("./BluePrintEncode_celldex.rds")
# 2) Single-cell level label transfer:
## SingleR: a tool for assigning cell types to single cells
predictions.HPCA.sc <- SingleR(test = rna.sce, assay.type.test = "logcounts", assay.type.ref = "logcounts",
                               ref = ref.data.HPCA, labels = ref.data.HPCA$label.main)
predictions.BED.sc <- SingleR(test = rna.sce, assay.type.test = "logcounts", assay.type.ref = "logcounts",
                              ref = ref.data.BED, labels = ref.data.BED$label.main)
# Use de.method wilcox with the scRNA-seq reference because the reference data is more sparse
predictions.endo.sc <- SingleR(test = rna.sce, assay.type.test = "logcounts", assay.type.ref = "logcounts",
                               ref = ref.data.endo, labels = ref.data.endo$cell_type, de.method = "wilcox")
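# Store the pruned labels in the Seurat metadata
# (pruned.labels is NA for cells whose assignment SingleR flagged as low quality)
rna$SingleR.HPCA <- predictions.HPCA.sc$pruned.labels
rna$SingleR.BED <- predictions.BED.sc$pruned.labels
rna$SingleR.endo <- predictions.endo.sc$pruned.labels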
# Save Seurat object
#date <- Sys.Date()
saveRDS(rna,paste0(SAMPLE.ID,"_scRNA_processed.rds"))
# ###########################################################################################################
# # Starting cells, PostQC cells, doublets, Post doublet/QC cells, Cluster #
output.meta <- data.frame(StartingNumCells = length(colnames(counts.init)),
                          nMADLogCounts = 2,
                          nMADLogFeatures = 2,
                          nMADLog1pMito = 2,
                          PostQCNumCells = PostQCNumCells,
                          ExpectedDoubletFraction = doublet.rate,
                          ObservedDoubletFraction = length(doublets) / length(colnames(counts.init)),
                          PostDoubletNumCells = length(colnames(rna)),
                          NumClusters = length(levels(Idents(rna))),
                          DoubletFinderpK = pK.1,
                          MinNumCounts = min(rna$nCount_RNA),
                          MaxNumCounts = max(rna$nCount_RNA),
                          MedianNumbCounts = median(rna$nCount_RNA),
                          MinNumFeats = min(rna$nFeature_RNA),
                          MaxNumFeats = max(rna$nFeature_RNA),
                          MedianNumbFeats = median(rna$nFeature_RNA),
                          stringsAsFactors = FALSE)
output <- as.data.frame(t(output.meta))
colnames(output) <- SAMPLE.ID
xlsx::write.xlsx(output, "scRNA_pipeline_summary.xlsx",
                 row.names = TRUE, col.names = TRUE)
#########################################################################################################
# END OF SCRIPT
## SingleR repository: https://github.com/dviraran/SingleR
#########################################################################################################
Summary
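Reference-based annotation is a quick way to obtain cell types, and it works well when plenty of reference datasets are available. For plants, however, annotation remains a problem for every species other than the model plant Arabidopsis, so I think combining single-cell data with bulk transcriptome data is the better approach. I plan to cover this myself later and will add it when I reproduce other analyses. I also find the authors' single-cell code somewhat redundant; the tasks above can be solved with considerably less code.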
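As a rough illustration of the bulk-transcriptome idea, the sketch below passes a bulk expression matrix to SingleR as the reference, in the same way the script above uses the bulk BluePrint Encode reference. Everything in it is simulated and hypothetical: the tissue names (root, leaf, stem), the gene names, the marker layout and the helper `simulate.expr` are placeholders, not data or code from the original pipeline. With real data, the reference would be log-normalized bulk expression with one label per sample.

```r
library(SingleR)
library(SummarizedExperiment)

set.seed(1)
genes <- paste0("gene", 1:200)
# Hypothetical marker layout: 20 marker genes per tissue type
marker.rows <- list(root = 1:20, leaf = 21:40, stem = 41:60)

# Hypothetical helper: simulate log-scale expression in which each
# tissue type has elevated expression of its own marker genes
simulate.expr <- function(types, prefix) {
  m <- matrix(rnorm(length(genes) * length(types), mean = 5),
              nrow = length(genes),
              dimnames = list(genes, paste0(prefix, seq_along(types))))
  for (tp in names(marker.rows)) {
    m[marker.rows[[tp]], types == tp] <- m[marker.rows[[tp]], types == tp] + 4
  }
  m
}

# Simulated bulk reference: two replicate samples per tissue type
bulk.types <- rep(names(marker.rows), each = 2)
ref <- SummarizedExperiment(
  assays = list(logcounts = simulate.expr(bulk.types, "bulk")),
  colData = DataFrame(label = bulk.types)
)

# Simulated single cells: ten cells per tissue type
cell.types <- rep(names(marker.rows), each = 10)
test <- simulate.expr(cell.types, "cell")

# Label each cell against the bulk reference (default de.method for bulk data)
pred <- SingleR(test = test, ref = ref, labels = ref$label)
print(table(predicted = pred$labels, simulated = cell.types))
```

The default `de.method` is kept here because the reference is bulk; the script above switches to `de.method = "wilcox"` only for the single-cell reference.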