单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析16

2022-09-12 16:47:36 浏览数 (3)

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析1:https://cloud.tencent.com/developer/article/2055573

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析2:https://cloud.tencent.com/developer/article/2072069

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析3:https://cloud.tencent.com/developer/article/2078159

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析4:https://cloud.tencent.com/developer/article/2078348

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析5:https://cloud.tencent.com/developer/article/2084580

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析6:https://cloud.tencent.com/developer/article/2085385

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析7:https://cloud.tencent.com/developer/article/2085705

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析8:https://cloud.tencent.com/developer/article/2086805

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析9:https://cloud.tencent.com/developer/article/2087563

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析10:https://cloud.tencent.com/developer/article/2090290

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析11:https://cloud.tencent.com/developer/article/2093123

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析12:https://cloud.tencent.com/developer/article/2093208

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析13:https://cloud.tencent.com/developer/article/2093567

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析14: https://cloud.tencent.com/developer/article/2094034

单细胞代码解析-妇科癌症单细胞转录组及染色质可及性分析15:https://cloud.tencent.com/developer/article/2102671

image.pngimage.png

这部分是远端调控元件的转录因子基序分析的分析内容。

首先是用cellranger-dna对样本8和9进行bam文件的分析

代码语言:javascript复制
#!/bin/bash

##cellranger-dna bamslice 从 cellranger-dna cnv 获取 BAM 文件并将其子集到指定的感兴趣单元格
cellranger-dna bamslice --id=BAMs_3E5CFL 
                       --csv=3E5CFL_config.csv 
                       --bam=/home/regnerm/ENDO_OVAR_PROJ/ovar_3E5CFL_ATAC/possorted_bam.bam
cellranger-dna bamslice --id=BAMs_3BAE2L 
                       --csv=3BAE2L_config.csv 
                       --bam=/home/regnerm/ENDO_OVAR_PROJ/ovar_3BAE2L_ATAC/possorted_bam.bam#!/bin/bash

我们使用患者 9(又名患者 ID 3E5CFL)的恶性特异性 bam 文件作为变体调用的输入,然后在感兴趣的基因组区域进行 FIMO 基序扫描:

代码语言:javascript复制
#!/bin/bash

# # call variants in the epithelial fraction of Patient 9
# The Patient_9 epithelial bam file was generated with Cell Rangerf's bamslice on the original Patient 9 bam file 
# (https://support.10xgenomics.com/single-cell-dna/software/pipelines/latest/using/bamslice)
##进行snp及indel分析
bcftools mpileup -d 1000 -Ou -f /fasta/genome.fa ./BAMs_3E5CFL/outs/subsets/3E5CFL_Epithelial_cell.bam | bcftools call -mv -Oz -o variants.vcf.gz

bcftools index variants.vcf.gz
bcftools view --types snps variants.vcf.gz > variants_new.vcf
bgzip -c variants_new.vcf > variants_new.vcf.gz
tabix -p vcf variants_new.vcf.gz

#Apply cluster variants to output cluster reference genome
cat /fasta/genome.fa | bcftools consensus variants_new.vcf.gz > consensus.fa

# Get sequence of enhancers and promoter
##getfasta可以根据BED/GFF/VCF文件提供的feature在染色体上的位置信息,从fasta中提取feature的碱基序列
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_1.fa -bed enhancer_1.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_2.fa -bed enhancer_2.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_3.fa -bed enhancer_3.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_4.fa -bed enhancer_4.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_enhancer_5.fa -bed enhancer_5.bed
bedtools getfasta -fi consensus.fa -fo GeneOfInterest_promoter.fa -bed promoter.bed

# Run FIMO motif scanning
fimo --bgfile motif-file --oc enhancer_1_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_1.fa
fimo --bgfile motif-file --oc enhancer_2_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_2.fa
fimo --bgfile motif-file --oc enhancer_3_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_3.fa
fimo  --bgfile motif-file --oc enhancer_4_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_4.fa
fimo --bgfile motif-file --oc enhancer_5_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_enhancer_5.fa
fimo --bgfile motif-file --oc promoter_fimo ${dir}/JASPAR2020_CORE_vertebrates_non-redundant_pfms_meme.meme GeneOfInterest_promoter.fa					   

最后,我们将 FIMO motif扫描结果读入 R 并筛选 q 值 < 0.10 的 TF 匹配。 为了进一步对这些 TF 基序预测进行排序,我们按患者 9 恶性部分中标准化的假体 TF 基因表达的降序对它们进行排序,并在箱形图中绘制它们相应的表达模式。 请注意,“0-上皮细胞”、“7-上皮细胞”、“11-上皮细胞”、“16-上皮细胞”代表患者 9 的恶性分数。

代码语言:javascript复制
library(Seurat)
library(dplyr)
library(ggplot2)
##因子处理 forcats包
library(forcats)

ovar_HGSOC_scRNA_processed <- readRDS("../ovar_HGSOC_scRNA_processed.rds")

labels <- c("0-Epithelial cell","7-Epithelial cell","11-Epithelial cell","16-Epithelial cell")


hgsoc <- ovar_HGSOC_scRNA_processed[,ovar_HGSOC_scRNA_processed$cell.type %in% labels] 

counts <- hgsoc@assays$RNA@data
counts.bulk <- as.data.frame(rowSums(as.matrix(counts)))

counts.bulk$gene <- rownames(counts.bulk)



fimo_outs <- list.files(pattern = "_fimo")

for (i in fimo_outs){
  fimo <- read.table(paste0(i,"/fimo.txt"))
  counts.bulk.new <- counts.bulk[counts.bulk$gene %in% fimo$V2,]
  
  colnames(fimo)[2] <- "gene"
  
  test <- merge(fimo,counts.bulk.new,by = "gene")
  colnames(test)[11] <- "Expr"
  
  test <- dplyr::filter(test,V9 <= 0.1)
  test <- dplyr::arrange(test,desc(Expr))
  write.table(test,paste0(i,"_w_expr.txt"),sep = "t",quote = F,row.names = F,col.names = F)
}

# Make boxplot for enhancer 2 TF expression levels:
enh.2.tfs.1 <- data.frame(Data = hgsoc@assays$RNA@data[12582,],
                          TF ="A.KLF6")
print(rownames(hgsoc)[12582])
enh.2.tfs.2 <- data.frame(Data = hgsoc@assays$RNA@data[15548,],
                          TF="B.YY1")
print(rownames(hgsoc)[15548])
enh.2.tfs.3 <- data.frame(Data = hgsoc@assays$RNA@data[6857,],
                          TF="C.TFAP2A")
print(rownames(hgsoc)[6857])


# Make boxplot for enhancer 4 TF expression levels:
enh.4.tfs.1 <- data.frame(Data = hgsoc@assays$RNA@data[9985,],
                          TF ="D.CEBPD")
print(rownames(hgsoc)[9985])
enh.4.tfs.2 <- data.frame(Data = hgsoc@assays$RNA@data[15548,],
                          TF="E.YY1")
print(rownames(hgsoc)[15548])
enh.4.tfs.3 <- data.frame(Data = hgsoc@assays$RNA@data[6857,],
                          TF="F.TFAP2A")
print(rownames(hgsoc)[6857])

# Make boxplot for promoter TF expression levels:
prom.tfs.1 <- data.frame(Data = hgsoc@assays$RNA@data[12582,],
                         TF ="G.KLF6")
print(rownames(hgsoc)[12582])
prom.tfs.2 <- data.frame(Data = hgsoc@assays$RNA@data[15260,],
                         TF="H.HIF1A")
print(rownames(hgsoc)[15260])
prom.tfs.3 <- data.frame(Data = hgsoc@assays$RNA@data[6925,],
                         TF="I.SOX4")
print(rownames(hgsoc)[6925])
prom.tfs.4 <- data.frame(Data = hgsoc@assays$RNA@data[15548,],
                         TF="J.YY1")
print(rownames(hgsoc)[15548])


tfs <- rbind(enh.2.tfs.1,enh.2.tfs.2,enh.2.tfs.3,
                  enh.4.tfs.1,enh.4.tfs.2,enh.4.tfs.3,
                  prom.tfs.1,prom.tfs.2,prom.tfs.3,prom.tfs.4)
ggplot(tfs,aes(x=fct_rev(TF),y=Data)) 
  geom_boxplot(lwd=0.45,outlier.size = 0.95,fatten = 0.95) theme_classic() ylim(c(0,4)) 
  coord_flip() 
  ggsave("Vln.pdf",width =4,height = 8)

writeLines(capture.output(sessionInfo()), "sessionInfo.txt")				

0 人点赞