Effective diagnosis of genetic disease by computational phenotype analysis of the disease-associated genome

Tomasz Zemojtel, Sebastian Köhler, Luisa Mackenroth, Marten Jäger, Jochen Hecht, Peter Krawitz, Luitgard Graul-Neumann, Sandra Doelken, Nadja Ehmke, Malte Spielmann, Nancy Christine Oien, Michal R Schweiger, Ulrike Krüger, Götz Frommer, Björn Fischer, Uwe Kornak, Ricarda Flöttmann, Amin Ardeshirdavani, Yves Moreau, Suzanna E Lewis, Melissa Haendel, Damian Smedley, Denise Horn, Stefan Mundlos, Peter N Robinson

Abstract

Less than half of patients with suspected genetic disease receive a molecular diagnosis. We have therefore integrated next-generation sequencing (NGS), bioinformatics, and clinical data into an effective diagnostic workflow. We used variants in the 2741 established Mendelian disease genes [the disease-associated genome (DAG)] to develop a targeted enrichment DAG panel (7.1 Mb), which achieves a coverage of 20-fold or better for 98% of bases. Furthermore, we established a computational method [Phenotypic Interpretation of eXomes (PhenIX)] that evaluated and ranked variants based on pathogenicity and semantic similarity of patients' phenotype described by Human Phenotype Ontology (HPO) terms to those of 3991 Mendelian diseases. In computer simulations, ranking genes based on the variant score put the true gene in first place less than 5% of the time; PhenIX placed the correct gene in first place more than 86% of the time. In a retrospective test of PhenIX on 52 patients with previously identified mutations and known diagnoses, the correct gene achieved a mean rank of 2.1. In a prospective study on 40 individuals without a diagnosis, PhenIX analysis enabled a diagnosis in 11 cases (28%, at a mean rank of 2.4). Thus, the NGS of the DAG followed by phenotype-driven bioinformatic analysis allows quick and effective differential diagnostics in medical genetics.
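The contrast between ranking by variant score alone and ranking with phenotype information can be illustrated with a minimal sketch. The abstract does not state how the two scores are combined, so the unweighted mean used here is an assumption, and the gene labels and score values are hypothetical placeholders, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    gene: str
    variant_score: float    # hypothetical pathogenicity score in [0, 1]
    phenotype_score: float  # hypothetical phenotype similarity score in [0, 1]


def rank_by_variant(candidates):
    """Rank genes using only the variant score, highest first."""
    ordered = sorted(candidates, key=lambda c: c.variant_score, reverse=True)
    return [c.gene for c in ordered]


def rank_combined(candidates):
    """Rank genes by the mean of variant and phenotype scores.

    Assumption: an unweighted mean; the source does not specify the formula.
    """
    ordered = sorted(
        candidates,
        key=lambda c: (c.variant_score + c.phenotype_score) / 2,
        reverse=True,
    )
    return [c.gene for c in ordered]


# Hypothetical toy data: GENE_B fits the patient's phenotype best.
candidates = [
    Candidate("GENE_A", variant_score=0.99, phenotype_score=0.10),
    Candidate("GENE_B", variant_score=0.95, phenotype_score=0.90),
    Candidate("GENE_C", variant_score=0.97, phenotype_score=0.20),
]

print(rank_by_variant(candidates))
print(rank_combined(candidates))
```

In this toy case many candidates carry similarly high variant scores, so the variant-only ranking cannot separate them, whereas the phenotype score moves the best-matching gene to the top.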

Conflict of interest statement

Competing interests: Y.M. is an equity holder of Cartagenia NV, and S.M. is a paid consultant for Agilent. The other authors declare that they have no competing interests.

Copyright © 2014, American Association for the Advancement of Science.

Figures

Fig. 1
Computational evaluation of PhenIX. HGMD mutations were inserted into variant files from DAG panels from which the causative mutations had been removed, and phenotypic annotations of the corresponding diseases were extracted from the HPO database. The genes were ranked with PhenIX. Results were simulated either on the entire disease set (All) or by filtering for known autosomal dominant (AD) or autosomal recessive (AR) diseases (fig. S2). A total of 8504 (All), 3471 (AD), and 5006 (AR) simulations were performed. Data are shown as the percentage of simulations in which the correct gene was ranked in Nth place. Variant, only variant scores were used to rank candidate genes; All terms, all HPO terms used to annotate a disease were used for PhenIX analysis; ≤5 terms, up to five HPO terms were chosen at random from the terms used to annotate the disease; ≤5 terms & noise, up to five annotations were used, two of which were made imprecise by exchanging them with a more general parent term; additionally, two random noise terms were added. Results are shown for the correct gene being ranked as the single top hit, or being among the top 5, 10, or 20 hits for the three test scenarios.
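The "≤5 terms & noise" scenario and the top-N summary described in this caption can be sketched roughly as follows. This is an illustrative reading of the caption, not the authors' simulation code: the function names, the `parent_of` mapping, and the term labels are hypothetical, and a real run would draw terms and parent relations from the HPO.

```python
import random


def perturb_terms(disease_terms, parent_of, all_terms, rng,
                  max_terms=5, n_imprecise=2, n_noise=2):
    """Build a noisy query from a disease's annotation terms.

    disease_terms: list of terms annotating the disease
    parent_of:     hypothetical map from a term to a more general parent
    all_terms:     list of every term available for drawing noise
    """
    # Choose up to max_terms annotations at random.
    chosen = rng.sample(disease_terms, min(max_terms, len(disease_terms)))
    # Make n_imprecise of them imprecise by swapping in the parent term.
    for i in rng.sample(range(len(chosen)), min(n_imprecise, len(chosen))):
        chosen[i] = parent_of.get(chosen[i], chosen[i])
    # Add random noise terms that do not annotate the disease.
    pool = [t for t in all_terms if t not in disease_terms]
    noise = rng.sample(pool, min(n_noise, len(pool)))
    return chosen + noise


def top_n_percentages(ranks, cutoffs=(1, 5, 10, 20)):
    """Percentage of simulations with the correct gene at rank <= N."""
    return {n: 100.0 * sum(r <= n for r in ranks) / len(ranks) for n in cutoffs}


# Hypothetical placeholder terms (not real HPO identifiers).
disease_terms = [f"T{i}" for i in range(1, 8)]
parent_of = {t: "P" + t[1:] for t in disease_terms}
all_terms = disease_terms + ["N1", "N2", "N3", "N4"]

query = perturb_terms(disease_terms, parent_of, all_terms, random.Random(0))
print(query)
print(top_n_percentages([1, 3, 7, 25]))
```

Passing a seeded `random.Random` instance keeps each simulated query reproducible, which is useful when comparing the three test scenarios on the same set of diseases.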
Fig. 2
PhenIX workflow, showing the clinical and bioinformatic analysis steps. After initial clinical evaluation, a decision is made to perform PhenIX analysis if no clinical diagnosis can be found. After sequencing and computational analysis, clinical evaluation of the top 20 gene candidates identifies genes for validation by Sanger sequencing and cosegregation studies.

Source: PubMed
