Bayesian multivariate re-analysis of large genetic studies identifies many novel associations
Genome-wide association studies (GWAS) are now a common tool to identify genetic variants that affect traits of interest. To date, the NHGRI GWAS Catalog has over 24,000 SNP-phenotype associations. However, the vast majority of these GWAS are conducted in univariate frameworks, i.e., genetic variants are tested against one phenotype at a time. This is in contrast to multivariate frameworks, where genetic variants are tested against different combinations of traits simultaneously. Under many biological scenarios, taking into account the context of multiple phenotypes drastically increases statistical power. Additionally, by testing combinations of traits, multivariate frameworks allow researchers to investigate a greater level of biological complexity. Despite these clear advantages, multivariate analyses are seldom implemented. Univariate GWAS already involve a large computational and statistical burden; performing an additional, exponentially greater number of tests is a strong deterrent. Furthermore, it is often unclear how to properly compare different multivariate models, even when they can be run efficiently.

Here, we present bmass (Bayesian multivariate analysis of summary statistics), a framework and R package that alleviates these obstacles. bmass runs solely on univariate GWAS summary statistics. It can quickly conduct all possible multivariate analyses for up to 8 phenotypes. It also provides Bayes factors for each multivariate analysis, allowing models to be compared directly. Running bmass on various publicly available GWAS datasets consistently shows an increase in power of up to 40% over univariate approaches while keeping false discovery rates (FDRs) as low as 15%. bmass identifies many new significant associations as well as the phenotypic combinations driving these associations, thus providing novel levels of biological insight. Overall, bmass is a powerful tool that should further enable researchers to perform multivariate analysis of GWAS.
Date: 15 May 2017, 14:00 (Monday, 4th week, Trinity 2017)
Venue: Big Data Institute, Old Road Campus OX3 7LF
Venue Details: BDI seminar room
Speaker: Michael Turchin (Department of Human Genetics, University of Chicago)
Organisers: Natasha Bowyer (BDI, University of Oxford), Carol Mulligan-John (University of Oxford)
Part of: BDI seminars
Booking required?: Not required
Audience: Members of the University only
Editors: Graham Bagley, Natasha Bowyer