单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析1:https://cloud.tencent.com/developer/article/2055573
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析2:https://cloud.tencent.com/developer/article/2072069
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析3:https://cloud.tencent.com/developer/article/2078159
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析4:https://cloud.tencent.com/developer/article/2078348
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析5:https://cloud.tencent.com/developer/article/2084580
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析6:https://cloud.tencent.com/developer/article/2085385
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析7:https://cloud.tencent.com/developer/article/2085705
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析8:https://cloud.tencent.com/developer/article/2086805
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析9:https://cloud.tencent.com/developer/article/2087563
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析10:https://cloud.tencent.com/developer/article/2090290
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析11:https://cloud.tencent.com/developer/article/2093123
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析12:https://cloud.tencent.com/developer/article/2093208
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析13:https://cloud.tencent.com/developer/article/2093567
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析14: https://cloud.tencent.com/developer/article/2094034
单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析15:https://cloud.tencent.com/developer/article/2102671
这部分是远端调控元件的转录因子基序分析的分析内容。
首先是用cellranger-dna对样本8和9进行bam文件的分析
代码语言:javascript复制#!/bin/bash
##cellranger-dna bamslice 从 cellranger-dna cnv 获取 BAM 文件并将其子集到指定的感兴趣单元格
cellranger-dna bamslice --id=BAMs_3E5CFL
--csv=3E5CFL_config.csv
--bam=/home/regnerm/ENDO_OVAR_PROJ/ovar_3E5CFL_ATAC/possorted_bam.bam
cellranger-dna bamslice --id=BAMs_3BAE2L
--csv=3BAE2L_config.csv
--bam=/home/regnerm/ENDO_OVAR_PROJ/ovar_3BAE2L_ATAC/possorted_bam.bam#!/bin/bash
我们使用患者 9(又名患者 ID 3E5CFL)的恶性特异性 bam 文件作为变体调用的输入,然后在感兴趣的基因组区域进行 FIMO 基序扫描:
代码语言:javascript复制#!/bin/bash
# # call variants in the epithelial fraction of Patient 9
# The Patient_9 epithelial bam file was generated with Cell Rangerf's bamslice on the original Patient 9 bam file
# (https://support.10xgenomics.com/single-cell-dna/software/pipelines/latest/using/bamslice)
##进行snp及indel分析
bcftools mpileup -d 1000 -Ou -f /fasta/genome.fa ./BAMs_3E5CFL/outs/subsets/3E5CFL_Epithelial_cell.bam | bcftools call -mv -Oz -o variants.vcf.gz
bcftools index variants.vcf.gz
bcftools view --types snps variants.vcf.gz > variants_new.vcf
bgzip -c variants_new.vcf > variants_new.vcf.gz
tabix -p vcf variants_new.vcf.gz
#Apply cluster variants to output cluster reference genome
cat /fasta/genome.fa | bcftools consensus variants_new.vcf.gz > consensus.fa
# Get sequence of enhancers and promoter
##getfasta可以根据BED/GFF/VCF文件提供的feature在染色体上的位置信息,从fasta中提取feature的碱基序列
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_1.fa -bed enhancer_1.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_2.fa -bed enhancer_2.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_3.fa -bed enhancer_3.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_4.fa -bed enhancer_4.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_5.fa -bed enhancer_5.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_promoter.fa -bed promoter.bed
# Run FIMO motif scanning
fimo --bgfile motif-file --oc enhancer_1_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_1.fa
fimo --bgfile motif-file --oc enhancer_2_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_2.fa
fimo --bgfile motif-file --oc enhancer_3_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_3.fa
fimo --bgfile motif-file --oc enhancer_4_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_4.fa
fimo --bgfile motif-file --oc enhancer_5_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_5.fa
fimo --bgfile motif-file --oc promoter_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_promoter.fa
最后,我们将 FIMO motif扫描结果读入 R 并筛选 q 值 < 0.10 的 TF 匹配。 为了进一步对这些 TF 基序预测进行排序,我们按患者 9 恶性部分中标准化的假体 TF 基因表达的降序对它们进行排序,并在箱形图中绘制它们相应的表达模式。 请注意,“0-上皮细胞”、“7-上皮细胞”、“11-上皮细胞”、“16-上皮细胞”代表患者 9 的恶性分数。
代码语言:javascript复制library(Seurat)
library(dplyr)
library(ggplot2)
##因子处理 forcats包
library(forcats)
ovar_HGSOC_scRNA_processed <- readRDS("../ovar_HGSOC_scRNA_processed.rds")
labels <- c("0-Epithelial cell","7-Epithelial cell","11-Epithelial cell","16-Epithelial cell")
hgsoc <- ovar_HGSOC_scRNA_processed[,ovar_HGSOC_scRNA_processed$cell.type %in% labels]
counts <- hgsoc@assays$RNA@data
counts.bulk <- as.data.frame(rowSums(as.matrix(counts)))
counts.bulk$gene <- rownames(counts.bulk)
fimo_outs <- list.files(pattern = "_fimo")
for (i in fimo_outs){
fimo <- read.table(paste0(i,"/fimo.txt"))
counts.bulk.new <- counts.bulk[counts.bulk$gene %in% fimo$V2,]
colnames(fimo)[2] <- "gene"
test <- merge(fimo,counts.bulk.new,by = "gene")
colnames(test)[11] <- "Expr"
test <- dplyr::filter(test,V9 <= 0.1)
test <- dplyr::arrange(test,desc(Expr))
write.table(test,paste0(i,"_w_expr.txt"),sep = "t",quote = F,row.names = F,col.names = F)
}
# Make boxplot for enhancer 2 TF expression levels:
enh.2.tfs.1 <- data.frame(Data = hgsoc@assays$RNA@data[12582,],
TF ="A.KLF6")
print(rownames(hgsoc)[12582])
enh.2.tfs.2 <- data.frame(Data = hgsoc@assays$RNA@data[15548,],
TF="B.YY1")
print(rownames(hgsoc)[15548])
enh.2.tfs.3 <- data.frame(Data = hgsoc@assays$RNA@data[6857,],
TF="C.TFAP2A")
print(rownames(hgsoc)[6857])
# Make boxplot for enhancer 4 TF expression levels:
enh.4.tfs.1 <- data.frame(Data = hgsoc@assays$RNA@data[9985,],
TF ="D.CEBPD")
print(rownames(hgsoc)[9985])
enh.4.tfs.2 <- data.frame(Data = hgsoc@assays$RNA@data[15548,],
TF="E.YY1")
print(rownames(hgsoc)[15548])
enh.4.tfs.3 <- data.frame(Data = hgsoc@assays$RNA@data[6857,],
TF="F.TFAP2A")
print(rownames(hgsoc)[6857])
# Make boxplot for promoter TF expression levels:
prom.tfs.1 <- data.frame(Data = hgsoc@assays$RNA@data[12582,],
TF ="G.KLF6")
print(rownames(hgsoc)[12582])
prom.tfs.2 <- data.frame(Data = hgsoc@assays$RNA@data[15260,],
TF="H.HIF1A")
print(rownames(hgsoc)[15260])
prom.tfs.3 <- data.frame(Data = hgsoc@assays$RNA@data[6925,],
TF="I.SOX4")
print(rownames(hgsoc)[6925])
prom.tfs.4 <- data.frame(Data = hgsoc@assays$RNA@data[15548,],
TF="J.YY1")
print(rownames(hgsoc)[15548])
tfs <- rbind(enh.2.tfs.1,enh.2.tfs.2,enh.2.tfs.3,
enh.4.tfs.1,enh.4.tfs.2,enh.4.tfs.3,
prom.tfs.1,prom.tfs.2,prom.tfs.3,prom.tfs.4)
ggplot(tfs,aes(x=fct_rev(TF),y=Data))
geom_boxplot(lwd=0.45,outlier.size = 0.95,fatten = 0.95) theme_classic() ylim(c(0,4))
coord_flip()
ggsave("Vln.pdf",width =4,height = 8)
writeLines(capture.output(sessionInfo()), "sessionInfo.txt")