Objective risk stratification of prostate cancer using machine learning and radiomics applied to multiparametric magnetic resonance images

Bino Varghese, Frank Chen, Darryl Hwang, Suzanne L Palmer, Andre Luis De Castro Abreu, Osamu Ukimura, Monish Aron, Manju Aron, Inderbir Gill, Vinay Duddalwar, Gaurav Pandey

Abstract

Multiparametric magnetic resonance imaging (mpMRI) has become increasingly important for the clinical assessment of prostate cancer (PCa), but its interpretation varies between readers because it is relatively subjective. Radiomics and classification methods have shown potential for improving the accuracy and objectivity of mpMRI-based PCa assessment. However, existing studies are limited by their use of a small number of classification methods, evaluation using the AUC score only, and a non-rigorous assessment of the possible combinations of radiomics and classification methods. This paper presents a systematic and rigorous framework, comprising classification, cross-validation and statistical analyses, that was developed to identify the best performing classifier for PCa risk stratification based on radiomic features derived from the mpMRI of a sizeable cohort. This classifier performed well in an independent validation set, including performing better than PI-RADS v2 in some aspects, indicating the value of objectively interpreting mpMRI using radiomics and classification methods for PCa risk assessment.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Results of the performance evaluation of the classification algorithms and their resultant models tested in our framework, grouped by evaluation measure ((A) high-risk class; (B) lower-risk class). Also shown are the results of the statistical comparison of these performances as Critical Difference (CD) plots for the high-risk (C) and lower-risk (D) PCa classes, respectively. Classification algorithms, represented by combined vertical and horizontal lines, are displayed from left to right in order of the average rank obtained by their resultant models across the ten cross-validation rounds, and classifiers producing statistically equivalent performance are connected by horizontal lines. These results show that the quadratic kernel-based SVM (QSVM) is the best performer overall: it is the only classifier that is statistically the best performer (the leftmost classifier in the plots, either by itself or tied with another classifier such as CSVM or LogReg) on all the evaluation measures for both classes. The CD plots were drawn using open-source MATLAB code.
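The ranking behind a CD plot can be illustrated with a short sketch. It assumes the common construction in which classifiers are ranked within each cross-validation round (1 = best, ties sharing the average rank) and the ranks are then averaged; it also assumes the Nemenyi critical difference, CD = q_alpha * sqrt(k(k+1)/(6N)), which the caption does not name explicitly. The function names, the toy scores and the choice of test are illustrative assumptions, not the authors' code. The value q_alpha = 2.949 is the tabulated Nemenyi constant for k = 7 classifiers at alpha = 0.05.

```python
import math


def average_ranks(scores):
    """Mean rank of each classifier across rounds (1 = best).

    scores maps a classifier name to its per-round scores, where a
    higher score is better. Tied classifiers share the average rank.
    """
    names = list(scores)
    n_rounds = len(scores[names[0]])
    totals = {name: 0.0 for name in names}
    for r in range(n_rounds):
        # Order classifiers from best to worst in this round.
        ordered = sorted(names, key=lambda name: scores[name][r], reverse=True)
        i = 0
        while i < len(ordered):
            # Extend j over the block of classifiers tied with position i.
            j = i
            while (j + 1 < len(ordered)
                   and scores[ordered[j + 1]][r] == scores[ordered[i]][r]):
                j += 1
            shared_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
            for name in ordered[i:j + 1]:
                totals[name] += shared_rank
            i = j + 1
    return {name: totals[name] / n_rounds for name in names}


def nemenyi_cd(k, n_rounds, q_alpha):
    """Critical difference in average rank for k classifiers over n_rounds."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_rounds))


# Toy scores for two rounds; these numbers are invented for illustration.
toy_scores = {"QSVM": [0.9, 0.8], "LogReg": [0.7, 0.8], "CSVM": [0.5, 0.6]}
toy_ranks = average_ranks(toy_scores)

# Seven algorithms and ten cross-validation rounds, as in the caption.
cd_seven_by_ten = nemenyi_cd(k=7, n_rounds=10, q_alpha=2.949)
```

Under these assumptions, two classifiers whose average ranks differ by less than the critical difference would be joined by a horizontal line in the plot.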
Figure 2
Flowchart of some sample quantitative radiomic features used in our study that were extracted from segmented tumor regions of interest (ROI) of mpMRI images. In summary, 55 different features were extracted per image type (i.e., T2WI or ADC) using four different texture extraction methods, yielding 110 radiomic features per patient. The four texture methods included histogram analysis, Gray-Level Co-occurrence and Difference Matrix methods (GLCM and GLDM) and Fast Fourier Transform (FFT). Some of these features are highlighted in green, blue and red respectively. The full list and details of these features are provided in the online Appendix in Supplementary Information. Note that all these features were 2D, as the input imaging data were two-dimensional.
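As an illustration of one of the texture families named above, the following is a minimal sketch of 2D GLCM features. It assumes an ROI already quantized to a small number of integer gray levels, a single horizontal pixel offset, and a symmetric, normalized matrix. The offset, the number of gray levels and the three features shown (contrast, energy, homogeneity) are assumptions chosen for illustration; the settings and feature list actually used in the study are given in its Supplementary Information and are not reproduced here.

```python
def glcm(image, levels, dr=0, dc=1):
    """Symmetric, normalized gray-level co-occurrence matrix.

    image is a list of rows of integer gray levels in [0, levels).
    (dr, dc) is the pixel offset; the default pairs each pixel with
    its right-hand neighbour.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1  # count the pair in both directions
                counts[b][a] += 1  # so that the matrix is symmetric
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def contrast(p):
    """Weights co-occurrences by the squared gray-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Sum of squared matrix entries; equals 1 for a uniform region."""
    return sum(value ** 2 for row in p for value in row)


def homogeneity(p):
    """Rewards co-occurrences that lie close to the matrix diagonal."""
    n = len(p)
    return sum(p[i][j] / (1 + abs(i - j)) for i in range(n) for j in range(n))


# Toy 4x4 ROI with four gray levels; the values are invented for illustration.
toy_roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
toy_glcm = glcm(toy_roi, levels=4)
toy_features = (contrast(toy_glcm), energy(toy_glcm), homogeneity(toy_glcm))
```

Computing such features separately on the T2WI and ADC ROIs of one patient and concatenating them is how a per-patient feature vector of the kind described in the caption would be assembled.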
Figure 3
Workflow of our ML-based framework used to identify the best combination of radiomic features and classification algorithm for categorizing PCa patients into high-risk and lower-risk categories. Cross-validation was used to identify the best performing algorithm out of seven commonly used algorithms, which was then used to train the final classifier on the entire development set (68 PCa patients). This classifier was then evaluated on an independent validation set of 53 PCa patients in terms of a variety of performance measures, namely AUC, Fmax, Pmax and Rmax.
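The threshold-based measures named above can be sketched as follows. The sketch assumes that Fmax is the maximum F-measure over all candidate score thresholds for a given class, and that Pmax and Rmax are the precision and recall at the threshold that yields Fmax; this reading is an assumption, since the caption does not define the measures. The function name and the toy labels and scores are hypothetical.

```python
def f_max(y_true, y_score, positive=1):
    """Return (Fmax, Pmax, Rmax, threshold) for the class `positive`.

    y_score holds the classifier's score for the positive class. A
    sample is predicted positive when its score is >= the threshold.
    """
    best = (0.0, 0.0, 0.0, None)
    for threshold in sorted(set(y_score)):
        tp = fp = fn = 0
        for label, score in zip(y_true, y_score):
            predicted_positive = score >= threshold
            if predicted_positive and label == positive:
                tp += 1
            elif predicted_positive:
                fp += 1
            elif label == positive:
                fn += 1
        if tp == 0:
            continue  # precision and recall are both zero or undefined
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f_measure = 2 * precision * recall / (precision + recall)
        if f_measure > best[0]:
            best = (f_measure, precision, recall, threshold)
    return best


# Toy validation labels (1 = high-risk) and scores, invented for illustration.
toy_labels = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_fmax, toy_pmax, toy_rmax, toy_threshold = f_max(toy_labels, toy_scores)
```

To obtain the same measures for the lower-risk class, one would pass that class's label as `positive` together with the scores the classifier assigns to that class.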

Source: PubMed
