Fast and accurate view classification of echocardiograms using deep learning

Ali Madani, Ramy Arnaout, Mohammad Mofrad, Rima Arnaout

Abstract

Echocardiography is essential to cardiology. However, the need for human interpretation has limited echocardiography's full potential for precision medicine. Deep learning is an emerging tool for analyzing images but has not yet been widely applied to echocardiograms, partly due to their complex multi-view format. The essential first step toward comprehensive computer-assisted echocardiographic interpretation is determining whether computers can learn to recognize these views. We trained a convolutional neural network to simultaneously classify 15 standard views (12 video, 3 still), based on labeled still images and videos from 267 transthoracic echocardiograms that captured a range of real-world clinical variation. Our model classified among 12 video views with 97.8% overall test accuracy without overfitting. Even on single low-resolution images, accuracy among 15 views was 91.7% vs. 70.2-84.0% for board-certified echocardiographers. Data visualization experiments showed that the model recognizes similarities among related views and classifies using clinically relevant image features. Our results provide a foundation for artificial intelligence-assisted echocardiographic interpretation.

Conflict of interest statement

Competing interests: The authors declare no competing financial interests.

Figures

Fig. 1
Convolutional neural network architecture for image classification. a The neural network used for classification included six convolutional layers and two fully connected layers of 1028 and 512 nodes, respectively. The softmax classifier (pink circles) consisted of up to 15 nodes, depending on the classification task at hand. b Training, validation, and test data were split by study, and test data were not used for training or validating the model. The model was trained to classify still images; videos were classified by a majority-rules vote over their constituent frames. Conv convolutional layer, Max Pool max pooling layer, FC fully connected layer
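The architecture in panel a and the voting scheme in panel b can be sketched in a few lines of Keras. This is a minimal illustration, not the authors' implementation: only the six convolutional layers, the 1028- and 512-node fully connected layers, and the up-to-15-node softmax come from the caption, while the filter counts, kernel sizes, pooling placement, optimizer, and 60 × 80 grayscale input (implied by the 4800-dimensional pixel space described in Fig. 4) are assumptions.

```python
import numpy as np
from tensorflow.keras import layers, models

def build_view_classifier(input_shape=(60, 80, 1), n_views=15):
    """Six conv layers, FC(1028) and FC(512), softmax head (Fig. 1a)."""
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for filters in (32, 64, 128):
        # Two conv layers per block, then max pooling; filter counts
        # and kernel sizes are illustrative assumptions.
        model.add(layers.Conv2D(filters, 3, activation="relu", padding="same"))
        model.add(layers.Conv2D(filters, 3, activation="relu", padding="same"))
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(1028, activation="relu"))
    model.add(layers.Dense(512, activation="relu"))
    model.add(layers.Dense(n_views, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

def classify_video(model, frames):
    """Video label = majority-rules vote over per-frame predictions (Fig. 1b)."""
    probs = model.predict(frames, verbose=0)  # frames: (n_frames, 60, 80, 1)
    votes = np.argmax(probs, axis=1)          # per-frame predicted view index
    return np.bincount(votes).argmax()        # most common vote wins
```

Splitting by study, as in panel b, means all images from a given echocardiogram are assigned to exactly one of the training, validation, or test sets, so the test set contains no frames from studies seen during training.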
Fig. 2
Sample input images. Views classified included parasternal long axis (psla), right ventricular inflow (rv inflow), basal short axis (sax basal), short axis at mid/mitral level (sax mid), apical four-chamber (a4c), apical five-chamber (a5c), apical two-chamber (a2c), apical three-chamber/apical long axis (a3c), subcostal four-chamber (sub4c), subcostal inferior vena cava (ivc), subcostal/abdominal aorta (subao), suprasternal aorta/aortic arch (supao), pulsed-wave Doppler (pw), continuous-wave Doppler (cw), and m-mode (mmode). Note that these images are shown at the actual resolution of the input data to the deep-learning algorithm
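For use in the sketches accompanying the other figures, the abbreviations in this caption can be collected into one mapping. The dictionary below simply restates the caption; the key spellings are assumptions.

```python
# View abbreviations from Fig. 2 mapped to full view names.
VIEWS = {
    "psla": "parasternal long axis",
    "rv inflow": "right ventricular inflow",
    "sax basal": "basal short axis",
    "sax mid": "short axis at mid/mitral level",
    "a4c": "apical four-chamber",
    "a5c": "apical five-chamber",
    "a2c": "apical two-chamber",
    "a3c": "apical three-chamber/apical long axis",
    "sub4c": "subcostal four-chamber",
    "ivc": "subcostal inferior vena cava",
    "subao": "subcostal/abdominal aorta",
    "supao": "suprasternal aorta/aortic arch",
    "pw": "pulsed-wave Doppler",
    "cw": "continuous-wave Doppler",
    "mmode": "m-mode",
}
```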
Fig. 3
Natural variations in input data. In addition to applying data augmentation algorithms, we included in each category a range of images representing the natural variation seen in real-life echocardiography; the parasternal long-axis view is shown here as an example. Variations include a range of timepoints spanning diastole and systole; differences in gain or chroma map; use of dual-mode acquisition; differences in depth and zoom; technically challenging images; use of 3D acquisition; a range of pathologies (seen here, concentric left ventricular hypertrophy and pericardial effusion); use of color Doppler; and differences in angulation, sector width, and use of LV contrast. Note that these images are shown at the actual resolution of the input data to the deep-learning algorithm
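The caption mentions data augmentation applied on top of this natural variation. A minimal sketch of such a pipeline, using Keras's ImageDataGenerator, is shown below; the choice of transforms and their ranges are assumptions, not the authors' settings.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation only; ranges are assumptions. Flips are
# deliberately omitted, since mirroring an echocardiographic view can
# change its anatomic meaning.
augmenter = ImageDataGenerator(
    rotation_range=10,       # small rotations mimic probe angulation
    width_shift_range=0.1,   # translations mimic framing differences
    height_shift_range=0.1,
    zoom_range=0.1,          # mimics depth/zoom variation
)
train_flow = augmenter.flow(x_train, y_train, batch_size=32)  # x_train: (n, 60, 80, 1)
```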
Fig. 4
Deep-learning model simultaneously distinguishes among 15 standard echocardiographic views. We developed a deep-learning method to classify standard echocardiographic views, represented here by t-SNE clustering analysis of image classification. On the left, t-SNE clustering of input echocardiogram images: each image is represented as a point in 4800-dimensional space (one dimension per pixel) and projected into two dimensions for visualization. Different colored dots represent different view classes (see legend in figure). Prior to neural network analysis, the input data do not cluster into clear groups. On the right, the same data as processed through the last fully connected layer of the neural network, again projected into two dimensions, now organized into clusters according to view category. Abbreviations: a4c apical 4 chamber, psla parasternal long axis, saxbasal short axis basal, a2c apical 2 chamber, saxmid short axis mid/mitral, a3c apical 3 chamber, sub4c subcostal 4 chamber, a5c apical 5 chamber, ivc subcostal ivc, rvinflow right ventricular inflow, supao suprasternal aorta/aortic arch, subao subcostal/abdominal aorta, cw continuous-wave Doppler, pw pulsed-wave Doppler, mmode m-mode recording
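The two panels can be reproduced schematically with scikit-learn's t-SNE, run once on raw pixels (each 60 × 80 image flattened to a 4800-dimensional vector, as in the caption) and once on activations of the last fully connected layer. The sketch below assumes images, integer labels, and the model from the Fig. 1 sketch; taking the 512-node layer as the feature layer is an assumption.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from tensorflow.keras.models import Model

# Left panel: t-SNE on raw pixels (4800-dimensional vectors).
X_raw = images.reshape(len(images), -1)            # images: (n, 60, 80, 1)
emb_raw = TSNE(n_components=2).fit_transform(X_raw)

# Right panel: t-SNE on last fully connected layer activations.
feature_model = Model(model.input, model.layers[-2].output)  # 512-node layer
emb_fc = TSNE(n_components=2).fit_transform(feature_model.predict(images))

for emb, title in ((emb_raw, "raw pixels"), (emb_fc, "FC-layer features")):
    plt.figure()
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=4, cmap="tab20")
    plt.title(f"t-SNE of {title}")
plt.show()
```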
Fig. 5
Echocardiogram view classification by deep-learning model. Confusion matrices showing actual view labels on the y-axis and neural-network-predicted view labels on the x-axis, by view category, for video classification (a) and still-image classification (b), compared with a representative board-certified echocardiographer (c). Reading across true-label rows, the numbers in the boxes give the percentage of labels predicted for each category. Color intensity corresponds to percentage (see heatmap on far right); a white background indicates zero percent. Categories are clustered according to areas of the most confusion. Rows may not add up to 100 percent due to rounding. d Comparison of accuracy by view category for deep-learning-assisted video classification, still-image classification, and still-image classification by a representative echocardiographer. e Percentage of images correctly predicted by view category when considering the model's highest-probability top hit (white boxes) vs. its top two hits (blue boxes). f Receiver operating characteristic curves for view categories were very similar, with AUCs ranging from 0.985 to 1.00 (mean 0.996). Abbreviations: saxmid short axis mid/mitral, ivc subcostal ivc, subao subcostal/abdominal aorta, supao suprasternal aorta/aortic arch, saxbasal short axis basal, rvinflow right ventricular inflow, a2c apical 2 chamber, a3c apical 3 chamber, a4c apical 4 chamber, a5c apical 5 chamber, psla parasternal long axis, sub4c subcostal 4 chamber
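Panels a-c, e, and f correspond to standard multi-class metrics, computable as in the sketch below; model, X_test, and the integer label vector y_test are assumed to follow the earlier sketches.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

probs = model.predict(X_test, verbose=0)  # (n, n_views) softmax outputs
pred = probs.argmax(axis=1)

# Row-normalized confusion matrix (percent of each true label), as in a-c.
cm = confusion_matrix(y_test, pred)
cm_pct = 100 * cm / cm.sum(axis=1, keepdims=True)

# Top-1 vs. top-2 accuracy (panel e).
top1_acc = np.mean(pred == y_test)
top2 = np.argsort(probs, axis=1)[:, -2:]            # two highest-probability views
top2_acc = np.mean([y in row for y, row in zip(y_test, top2)])

# One-vs-rest ROC AUC per view category (panel f).
aucs = roc_auc_score(np.eye(probs.shape[1])[y_test], probs, average=None)
```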
Fig. 6
Visualization of decision-making by the neural network. a Occlusion experiments. All test images (a basal short-axis sample image is shown here) were modified with grey masks of different shapes and sizes as shown, and test accuracy was recalculated for the test set under each modification. Masking that covered cardiac structures degraded predictions the most. b Saliency maps. For two example images (left; suprasternal aorta/aortic arch and short axis mid/mitral examples shown), the input pixels weighted most heavily in the neural network's classification decision were calculated and plotted. The most important pixels (right) trace an outline of structures clinically relevant to the view shown
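Both visualization experiments can be sketched compactly with TensorFlow. In the code below, the patch size, the grey value (which assumes pixels scaled to [0, 1]), and the use of the raw input gradient as the saliency signal are assumptions; the paper's exact masking shapes and saliency computation may differ.

```python
import numpy as np
import tensorflow as tf

def occlusion_map(model, image, true_idx, patch=10, gray=0.5):
    """Slide a grey patch over the image and record the model's
    probability for the true view at each position (panel a)."""
    h, w = image.shape[:2]
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = gray
            p = model.predict(occluded[None, ...], verbose=0)[0, true_idx]
            heat[i // patch, j // patch] = p
    return heat  # low values mark regions the model relies on

def saliency_map(model, image, true_idx):
    """Gradient of the true class score w.r.t. input pixels (panel b)."""
    x = tf.convert_to_tensor(image[None, ...], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        score = model(x)[0, true_idx]
    return tf.abs(tape.gradient(score, x))[0].numpy()
```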


