Genome-wide association study-based prediction of atrial fibrillation using artificial intelligence
Oh-Seok Kwon, Myunghee Hong, Tae-Hoon Kim, Inseok Hwang, Jaemin Shim, Eue-Keun Choi, Hong Euy Lim, Hee Tae Yu, Jae-Sun Uhm, Boyoung Joung, Seil Oh, Moon-Hyoung Lee, Young-Hoon Kim, Hui-Nam Pak
Abstract
Objective: We previously reported genetic loci associated with early-onset atrial fibrillation (AF) in a Korean population. We explored whether AF-associated single-nucleotide polymorphisms (SNPs) selected from the genome-wide association study (GWAS) of a large external cohort have predictive power for AF in a Korean population when used as input to a convolutional neural network (CNN).
Methods: This study included 6358 subjects (872 cases, 5486 controls) from the Korean population GWAS data. We extracted lists of SNPs at each p value threshold of the association statistics from three previously reported ethnicity-specific GWASs. The Korean GWAS data were divided into training (64%), validation (16%) and test (20%) sets, and stratified K-fold cross-validation was performed and repeated five times after data shuffling.
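The split described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not state K, so the sketch assumes a 20% stratified hold-out followed by 5-fold stratified cross-validation on the remaining 80% (which yields roughly 64% training and 16% validation per fold). The function names `stratified_folds` and `make_splits` are hypothetical, and the labels are placeholders that only reproduce the reported case and control counts.

```python
import random


def stratified_folds(labels, k, rng):
    """Deal sample indices into k folds, class by class, so class ratios stay similar."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # shuffle within each class before dealing
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def make_splits(labels, seed, k=5):
    """Return (test, [(train, val), ...]) as lists of sample indices.

    Assumption: k=5 inner folds; the abstract does not give K.
    """
    rng = random.Random(seed)
    outer = stratified_folds(labels, 5, rng)
    test = outer[0]  # one of five outer folds: about 20%
    rest = [idx for fold in outer[1:] for idx in fold]  # about 80%
    inner = stratified_folds([labels[i] for i in rest], k, rng)
    splits = []
    for v in range(k):
        val = [rest[j] for j in inner[v]]  # about 16% of all samples
        train = [rest[j] for f in range(k) if f != v for j in inner[f]]  # about 64%
        splits.append((train, val))
    return test, splits


# Placeholder labels: 872 cases (1) and 5486 controls (0), as reported.
labels = [1] * 872 + [0] * 5486
for repeat in range(5):  # five repeats, reshuffled via a new seed each time
    test, splits = make_splits(labels, seed=repeat)
    train, val = splits[0]
    print(repeat, len(train), len(val), len(test))
```

Stratifying both steps keeps the proportion of cases (about 14%) similar across the training, validation and test sets, which matters when cases are much rarer than controls.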
Results: The predictive power of the CNN-GWAS for AF, measured as the area under the curve (AUC), was 0.78±0.01 based on the Japanese GWAS, 0.79±0.01 based on the European GWAS, and 0.82±0.01 based on the multiethnic GWAS. Gradient-weighted class activation mapping assigned high saliency scores to AF-associated SNPs, and PITX2 obtained the highest saliency score. The CNN-GWAS showed no predictive power for AF when trained on the subset of SNPs with non-significant p values (AUC 0.56±0.01), despite the larger number of SNPs. The CNN-GWAS also had no predictive power for odd versus even registration numbers (AUC 0.51±0.01).
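For readers less familiar with the metric, the AUC reported above equals the probability that a randomly chosen case receives a higher score than a randomly chosen control. The sketch below computes it by direct pairwise comparison and mimics the odd-even negative control with scores that carry no information about the label. The function `roc_auc`, the toy scores and the simulated registration numbers are illustrative assumptions, not the study's data or code.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four case-control pairs are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Negative control in the spirit of the odd-even test: the label is the parity
# of a simulated registration number and the scores are pure noise.
rng = random.Random(0)
registration_numbers = range(1000)
control_labels = [n % 2 for n in registration_numbers]
control_scores = [rng.random() for _ in registration_numbers]
control_auc = roc_auc(control_labels, control_scores)

print(toy_auc, round(control_auc, 2))
```

An AUC close to 0.5 on such a control indicates that the model is not picking up structure unrelated to the phenotype, which is the purpose of the odd-even comparison.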
Conclusions: AF can be predicted by genetic information alone with moderate accuracy. The CNN-GWAS can be a robust and useful tool for detecting polygenic diseases by capturing the cumulative effects and genetic interactions of moderately associated but statistically significant SNPs.
Trial registration number: NCT02138695.
Keywords: atrial fibrillation; genetics; genome-wide association study.
Conflict of interest statement
Competing interests: None declared.
© Author(s) (or their employer(s)) 2022. Re-use permitted under CC BY. Published by BMJ.
Source: PubMed