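A genome-wide association meta-analysis (GWAS meta-analysis) combines several GWAS to detect associations between genotype and phenotype, which increases statistical power while keeping the false-positive rate under control.
As mentioned in the previous post, the authors of the PDE5 and ED paper note that their blood pressure data came from a meta-analysis. That got me reading up on GWAS meta-analysis. METAL is the most commonly used tool, so here is a brief introduction~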
METAL
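Download: https://csg.sph.umich.edu/abecasis/Metal/download/
Apart from the GWAS data, the only thing you need to prepare is a config file.
Just write a txt file modelled on the example file; note that the effect here is the beta.
The command is simple too~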
../generic-metal/metal metal metal_config.txt
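But if you are not careful, you will find that your results contain only a Z-score and a P value. After racking my brain for a long while (rather embarrassing...), I went back to the example file that ships with the software: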
To help identify allele flips, it can be useful to track allele frequencies in the meta-analysis. To enable this capability, uncomment the following two lines:
AVERAGEFREQ ON
MINMAXFREQ ON
To restric meta-analysis to two previously reported SNPs and summarize study specific results, uncomment the two lines that follow:
ADDFILTER SNP IN (rs10830963,rs563694)
VERBOSE ON
For more details, I recommend this post: [Software introduction] GWAS meta-analysis software: METAL (https://blog.csdn.net/qq_22253901/article/details/117464933)
metagen
In the supplementary material of this paper, the authors mention another meta-analysis method:
Single-nucleotide (SNP) genetic associations with erectile dysfunction were estimated as the inverse-variance weighted meta-analysis of SNP effects form two GWASs.
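IVW meta-analysis?
The inverse-variance weighted (IVW) method pools the effect sizes of several independent studies by taking their weighted average, with the inverse of each study's variance as its weight [1].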
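To make the definition concrete, here is a minimal base-R sketch of fixed-effect IVW pooling for a single SNP measured in two studies. The helper name `ivw_pool` and the numbers are made up for illustration; they are not taken from the ED data or from the paper.

```r
# Fixed-effect inverse-variance weighted (IVW) pooling for one SNP.
# ivw_pool() is a hypothetical helper; the inputs below are invented values.
ivw_pool <- function(beta, se) {
  w <- 1 / se^2                        # weight = inverse of each study's variance
  beta_ivw <- sum(w * beta) / sum(w)   # weighted average of the effect sizes
  se_ivw <- sqrt(1 / sum(w))           # standard error of the pooled effect
  z <- beta_ivw / se_ivw               # Wald Z statistic
  p <- 2 * pnorm(-abs(z))              # two-sided P value
  c(beta = beta_ivw, se = se_ivw, z = z, p = p)
}

# Two studies: the more precise one (se = 0.05) gets four times the weight.
res <- ivw_pool(beta = c(0.10, 0.20), se = c(0.05, 0.10))
print(res)  # pooled beta is 0.12, pulled toward the more precise study
```

With weights of 400 and 100, the pooled estimate is (400 × 0.10 + 100 × 0.20) / 500 = 0.12, which sits closer to the study with the smaller standard error.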
In "Doing Meta-Analysis in R: A Hands-on Guide", the authors write:
This method is the most common approach to calculate average effects in meta-analyses. Because we use the inverse of the variance, it is often called inverse-variance weighting or simply inverse-variance meta-analysis.
So how do we do this in R?
The function of choice for pre-calculated effect sizes is metagen [from the meta package]. Its name stands for generic inverse variance meta-analysis.
The ED data come from FinnGen and the GWAS Catalog; both datasets are publicly available.
https://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST006001-GCST007000/GCST006956/ED_AJHG_Bovijn_et_al_2018.gz
https://storage.googleapis.com/finngen-public-data-r10/summary_stats/finngen_R10_ERECTILE_DYSFUNCTION.gz (I used the R10 release here)
Next, let's try a meta-analysis with these two datasets:
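# Start from a clean workspace
rm(list = ls())
library(meta)        # metagen()
library(data.table)  # fread()
library(dplyr)       # %>%
library(tidyr)       # drop_na()
# Read both ED GWAS summary-statistics files as plain data frames
ED_cat <- fread("../control_outcome/ED_Bovijn_gwas.txt", data.table = FALSE)
ED_fin <- fread("../control_outcome/ED_finn_gwas.txt", fill = TRUE, data.table = FALSE) %>% drop_na()
head(ED_cat)
head(ED_fin)
# Rename the FinnGen columns; rbind() assumes ED_cat already uses these same names
colnames(ED_fin) <- c("SNP", "CHR", "BP", "A1", "A2", "FRQ", "BETA", "SE", "P", "n")
data <- rbind(ED_cat, ED_fin)
dat <- data[1:10, ]  # take a small subset for a first test run
# Run the IVW meta-analysis
# Note: metagen() pools every row it is given into a single estimate,
# so the ten rows here are treated as ten studies of one effect
result <- metagen(
  TE = dat$BETA,          # effect sizes
  seTE = dat$SE,          # standard errors
  studlab = dat$SNP,      # row labels
  method.tau = "DL"       # DerSimonian-Laird estimator of between-study variance
)
# Inspect the meta-analysis result
print(result)
# Draw a forest plot
forest(result)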
Without taking a subset, this simply never finishes running [tearing my hair out~]
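One reason the full run is so heavy is that every row is handed to a single model. For a fixed-effect IVW across exactly two studies, the pooling can instead be done per SNP with vectorised arithmetic. The sketch below uses base R only and small made-up data frames; the `_s1`/`_s2` suffixes and output column names are my own choices. It is not what metagen does internally, and strand-ambiguous SNPs (A/T, C/G) are not handled.

```r
# Toy summary statistics standing in for the two ED GWAS files (invented values).
gwas1 <- data.frame(SNP = c("rs1", "rs2", "rs3"),
                    A1 = c("A", "C", "G"), A2 = c("G", "T", "A"),
                    BETA = c(0.10, -0.05, 0.02), SE = c(0.05, 0.04, 0.03),
                    stringsAsFactors = FALSE)
gwas2 <- data.frame(SNP = c("rs1", "rs2", "rs4"),
                    A1 = c("A", "T", "C"), A2 = c("G", "C", "T"),
                    BETA = c(0.20, 0.03, 0.01), SE = c(0.10, 0.05, 0.02),
                    stringsAsFactors = FALSE)

# Keep only SNPs present in both studies, one row per SNP.
m <- merge(gwas1, gwas2, by = "SNP", suffixes = c("_s1", "_s2"))

# Align study 2 to study 1's effect allele (A1 is assumed to be the effect allele).
same <- m$A1_s1 == m$A1_s2 & m$A2_s1 == m$A2_s2
flip <- m$A1_s1 == m$A2_s2 & m$A2_s1 == m$A1_s2
m$BETA_s2[flip] <- -m$BETA_s2[flip]  # swapped alleles: reverse the sign
m <- m[same | flip, ]                # drop SNPs whose alleles do not match

# Vectorised fixed-effect IVW: one pooled estimate per SNP.
w1 <- 1 / m$SE_s1^2
w2 <- 1 / m$SE_s2^2
m$BETA_IVW <- (w1 * m$BETA_s1 + w2 * m$BETA_s2) / (w1 + w2)
m$SE_IVW <- sqrt(1 / (w1 + w2))
m$P_IVW <- 2 * pnorm(-abs(m$BETA_IVW / m$SE_IVW))
print(m[, c("SNP", "BETA_IVW", "SE_IVW", "P_IVW")])
```

Because everything is column arithmetic on the merged table, the same code should scale to millions of SNPs far better than fitting one model per call.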
But then, looking back... I realised that METAL does support IVW analysis; it is just that by default the weights are based on sample size.
SCHEME STDERR - classical approach, uses effect size estimates and standard errors
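That makes things easy. I encourage everyone to try this step yourselves first with the two ED datasets; tomorrow evening I will post the config file needed to run METAL in the comments. Good luck (ง •_•)ง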
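As a starting point for that exercise, here is a hedged sketch that writes a METAL script from R. It is not the author's config. The column labels are assumed to match the names used in the R code above, A1 is assumed to be the effect allele, and the output prefix `ED_meta_` is invented. The command names follow my reading of the METAL documentation, so check them against the example script shipped with the download.

```r
# Write a METAL script for an inverse-variance (SCHEME STDERR) meta-analysis.
# Column labels, effect-allele order and the output prefix are assumptions.
config <- c(
  "SCHEME STDERR",       # weight by standard error instead of sample size
  "AVERAGEFREQ ON",      # track allele frequencies to help spot allele flips
  "MINMAXFREQ ON",
  "",
  "# Column labels, assumed to be shared by both input files",
  "MARKER SNP",
  "ALLELE A1 A2",
  "FREQLABEL FRQ",
  "EFFECT BETA",
  "STDERR SE",
  "PVALUE P",
  "",
  "PROCESS ../control_outcome/ED_Bovijn_gwas.txt",
  "PROCESS ../control_outcome/ED_finn_gwas.txt",
  "",
  "OUTFILE ED_meta_ .tbl",
  "ANALYZE"
)
writeLines(config, "metal_config.txt")
```

With SCHEME STDERR the output should report a pooled effect and its standard error rather than only a Z-score and a P value.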
PS: This paper uses meta-analysis in other places as well; I will try that step in the next post ➡
Wald estimates for each variant were then meta-analysed with a multiplicative random effects model while using a linkage disequilibrium matrix corresponding to the European ancestry participants within the 1000G panel, as a source of reference to account for the correlation between variants.
References:
[1] Lee CH, Cook S, Lee JS, Han B. Comparison of Two Meta-Analysis Methods: Inverse-Variance-Weighted Average and Weighted Sum of Z-Scores. Genomics Inform. 2016 Dec;14(4):173-180. doi: 10.5808/GI.2016.14.4.173. Epub 2016 Dec 30. PMID: 28154508; PMCID: PMC5287121.