Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View

Wei Luo, Dinh Phung, Truyen Tran, Sunil Gupta, Santu Rana, Chandan Karmakar, Alistair Shilton, John Yearwood, Nevenka Dimitrova, Tu Bao Ho, Svetha Venkatesh, Michael Berk

Abstract

Background: As researchers increasingly turn to big data for new opportunities in biomedical discovery, machine learning models, the backbone of big data analysis, appear ever more often in biomedical journals. However, owing to their inherent complexity, machine learning methods are prone to misuse. Moreover, because of the flexibility in specifying machine learning models, model specifications and results are often insufficiently reported in research articles, hindering reliable assessment of model validity and consistent interpretation of model outputs.

Objective: To establish a set of guidelines for the use of machine learning predictive models in clinical settings, ensuring that models are correctly applied and sufficiently reported so that true discoveries can be distinguished from random coincidence.

Methods: A multidisciplinary panel of machine learning experts, clinicians, and traditional statisticians was interviewed, using an iterative process in accordance with the Delphi method.

Results: The process produced a set of guidelines comprising (1) a list of reporting items to be included in research articles and (2) a set of practical sequential steps for developing predictive models.

Conclusions: A set of guidelines was generated to enable correct application of machine learning models and consistent reporting of model specifications and results in biomedical research. We believe that such guidelines will accelerate the adoption of big data analysis, particularly with machine learning methods, in the biomedical research community.

Keywords: clinical prediction rule; guideline; machine learning.

Conflict of interest statement

Conflicts of Interest: None declared.

©Wei Luo, Dinh Phung, Truyen Tran, Sunil Gupta, Santu Rana, Chandan Karmakar, Alistair Shilton, John Yearwood, Nevenka Dimitrova, Tu Bao Ho, Svetha Venkatesh, Michael Berk. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 16.12.2016.

Figures

Figure 1. Steps to identify the prediction problem.
Figure 2. Information flow in the predictive modelling process.
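To make the information flow of Figure 2 concrete, below is a minimal sketch of a predictive modelling workflow with a held-out test set and cross-validated hyperparameter tuning. It assumes a scikit-learn-style pipeline; the synthetic dataset, the lasso-penalized logistic regression, and the parameter grid are illustrative assumptions, not the paper's own guideline items.

    # Minimal sketch of a predictive modelling workflow (assumed, not the
    # paper's guideline): hold out a test set first, tune on development
    # data only, and evaluate once on the untouched test set.
    import numpy as np
    from sklearn.model_selection import train_test_split, GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))                         # placeholder predictors
    y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)   # placeholder outcome

    # Hold out a test set before any model development to avoid information leakage.
    X_dev, X_test, y_dev, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0, stratify=y)

    # Keep preprocessing inside the pipeline so cross-validation refits it per fold.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(penalty="l1", solver="liblinear")),
    ])

    # Tune the regularization strength on the development set only.
    search = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]},
                          cv=5, scoring="roc_auc")
    search.fit(X_dev, y_dev)

    # Report performance exactly once, on the held-out test set.
    auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
    print(f"held-out AUC: {auc:.3f}")

The key design point, and a common source of the overfitting pitfalls such guidelines address, is that feature scaling and hyperparameter selection happen only within the development folds; the test set is consulted exactly once, at the end.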
