A deep learning model for detection of cervical spinal cord compression in MRI scans

Zamir Merali, Justin Z Wang, Jetan H Badhiwala, Christopher D Witiw, Jefferson R Wilson, Michael G Fehlings

Abstract

Magnetic resonance imaging (MRI) evidence of spinal cord compression plays a central role in the diagnosis of degenerative cervical myelopathy (DCM). There is growing recognition that deep learning models may help manage the increasing volume of medical imaging data and provide an initial interpretation of images gathered in a primary-care setting. We aimed to develop and validate a deep learning model for detection of cervical spinal cord compression in MRI scans. Patients undergoing surgery for DCM as part of the AO Spine CSM-NA or CSM-I prospective cohort studies were included in our study. Patients were assigned to either a training/validation dataset or a holdout dataset. Images were labelled by two specialist physicians. We trained a deep convolutional neural network using images from the training/validation dataset and assessed model performance on the holdout dataset. The training/validation dataset included 201 patients with 6588 images, and the holdout dataset included 88 patients with 2991 images. On the holdout dataset, the deep learning model achieved an overall AUC of 0.94, a sensitivity of 0.88, a specificity of 0.89, and an F1 score of 0.82. This model could improve the efficiency and objectivity of the interpretation of cervical spine MRI scans.
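For readers less familiar with the metrics reported on the holdout dataset, the following is a minimal sketch of how sensitivity, specificity, F1 score, and AUC can be computed for a binary "compression / no compression" classifier. It is not the authors' evaluation code. The function names (`binary_metrics`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    sensitivity: float  # true positive rate (recall)
    specificity: float  # true negative rate
    f1: float           # harmonic mean of precision and recall


def binary_metrics(y_true, y_pred):
    """Compute metrics from 0/1 ground-truth labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return BinaryMetrics(sensitivity, specificity, f1)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive image scores
    higher than a randomly chosen negative image (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only: 1 = compression present, 0 = absent (not study data).
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
    preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold
    print(binary_metrics(labels, preds))
    print(roc_auc(labels, scores))
```

Sensitivity, specificity, and F1 depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.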

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
CONSORT diagram showing the process of data acquisition and partitioning into training/validation and holdout datasets. CSM-NA: Cervical Spondylotic Myelopathy North American Clinical Trial. CSM-I: Cervical Spondylotic Myelopathy International Trial.
Figure 2
Representative axial T2-weighted MRI images showing (A) no spinal cord compression, (B) partial spinal cord compression, and (C) circumferential spinal cord compression.
Figure 3
Overview of the convolutional neural network model architecture. The convolutional layers (orange) were derived from the ResNet-50 model, while the fully connected layers were modified for our classification task. Seven separate model configurations were tested as shown.
Figure 4
Receiver operating characteristic (ROC) curves showing model performance on each patient in the holdout dataset. The green curve represents the average ROC curve for all patients, while the light blue lines represent ROC curves for each individual patient. The gray region represents one standard deviation above and below the average curve.
Figure 5
Class activation maps for example images that were classified correctly (true positives) or incorrectly (false negatives). The first column shows the axial T2-weighted MRI image. The second column shows the MRI image with the class activation map overlaid. In the class activation maps, blue represents no activation while red represents maximal activation. In the true positive examples, the maximal activation tended to be over the spinal canal and spinal cord (images A to F). In the false negative examples, the activation was sometimes over irrelevant areas of the image, such as the paraspinal muscles or vascular structures (images G, I, K, and L). In other false negative examples there was activation over the spinal cord and spinal canal (images H and J), but in these cases there was also activation over other seemingly irrelevant areas of the image.
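To make the class activation maps in Figure 5 more concrete, here is a minimal sketch of the classic formulation, in which the map for a class is a weighted sum of the final convolutional feature maps, using that class's output-layer weights. This page does not state how the authors generated their maps, so this is an assumed illustration rather than their method. The function names, the tiny 2x2 feature maps, and the weights are hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each an H x W grid of floats).

    feature_maps:  list of K grids from the last convolutional layer
    class_weights: K weights linking each pooled feature map to the class
    """
    if len(feature_maps) != len(class_weights):
        raise ValueError("need one weight per feature map")
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def normalize(cam):
    """Rescale to [0, 1] so 0 can be drawn as blue and 1 as red."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]


if __name__ == "__main__":
    # Hypothetical 2x2 feature maps and weights, for illustration only.
    maps = [
        [[0.0, 1.0], [2.0, 3.0]],
        [[1.0, 0.0], [0.0, 1.0]],
    ]
    weights = [1.0, -1.0]
    heat = normalize(class_activation_map(maps, weights))
    for row in heat:
        print(row)
```

In practice the low-resolution map is upsampled to the input image size before being overlaid on the MRI slice, and that step is omitted here.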

Source: PubMed
