Next-generation genotype imputation service and methods

Sayantan Das, Lukas Forer, Sebastian Schönherr, Carlo Sidore, Adam E Locke, Alan Kwong, Scott I Vrieze, Emily Y Chew, Shawn Levy, Matt McGue, David Schlessinger, Dwight Stambolian, Po-Ru Loh, William G Iacono, Anand Swaroop, Laura J Scott, Francesco Cucca, Florian Kronenberg, Michael Boehnke, Gonçalo R Abecasis, Christian Fuchsberger, Sayantan Das, Lukas Forer, Sebastian Schönherr, Carlo Sidore, Adam E Locke, Alan Kwong, Scott I Vrieze, Emily Y Chew, Shawn Levy, Matt McGue, David Schlessinger, Dwight Stambolian, Po-Ru Loh, William G Iacono, Anand Swaroop, Laura J Scott, Francesco Cucca, Florian Kronenberg, Michael Boehnke, Gonçalo R Abecasis, Christian Fuchsberger

Abstract

Genotype imputation is a key component of genetic association studies, where it increases power, facilitates meta-analysis, and aids interpretation of signals. Genotype imputation is computationally demanding and, with current tools, typically requires access to a high-performance computing cluster and to a reference panel of sequenced genomes. Here we describe improvements to imputation machinery that reduce computational requirements by more than an order of magnitude with no loss of accuracy in comparison to standard imputation tools. We also describe a new web-based service for imputation that facilitates access to new reference panels and greatly improves user experience and productivity.

Figures

Figure 1
Figure 1
Overview of state space reduction. We consider a chromosome region with M = 9 markers and H = 8 haplotypes: X1, X2, ..., X8. We break the region into consecutive genomic segments (blocks) and start by analyzing block B from marker 1 to marker 6. In block B, we identify U = 3 unique haplotypes: Y1, Y2, and Y3 (colored in green, red, and blue, respectively). Given we know the left probabilities of the original state space at marker 1 (that is, L1(X1), ..., L1(X8)), we fold them to get the left probabilities of the reduced state space at marker 1: L1(Y1), L1(Y2), and L1(Y3). We implement HMM on the reduced state space (Y1, Y2, and Y3) from marker 1 to marker 6 to get L6(Y1), L6(Y2), and L6(Y3). We next unfold the left probabilities of the reduced state space at marker 6 to obtain the left probabilities of the original state space: L6(X1), ..., L6(X8). We repeat this procedure on the next block, starting with L6(X1), ..., L6(X8), to finally obtain L9(X1), ..., L9(X8).

Source: PubMed

3
订阅