BS-Net: Learning COVID-19 pneumonia severity on a large chest X-ray dataset

Alberto Signoroni, Mattia Savardi, Sergio Benini, Nicola Adami, Riccardo Leonardi, Paolo Gibellini, Filippo Vaccher, Marco Ravanelli, Andrea Borghesi, Roberto Maroldi, Davide Farina

Abstract

In this work we design an end-to-end deep learning architecture for predicting, on chest X-ray (CXR) images, a multi-regional score conveying the degree of lung compromise in COVID-19 patients. This semi-quantitative scoring system, the Brixia score, is applied in the serial monitoring of such patients and has shown significant prognostic value in one of the hospitals that experienced one of the highest pandemic peaks in Italy. To solve this challenging visual task, we adopt a weakly supervised learning strategy structured to handle different tasks (segmentation, spatial alignment, and score estimation), trained with a "from-the-part-to-the-whole" procedure involving different datasets. In particular, we exploit a clinical dataset of almost 5,000 annotated CXR images collected in the same hospital. Our BS-Net demonstrates self-attentive behavior and a high degree of accuracy in all processing stages. Through inter-rater agreement tests and a gold standard comparison, we show that our solution outperforms single human annotators in rating accuracy and consistency, thus supporting the possibility of using this tool in computer-assisted monitoring. Highly resolved (super-pixel level) explainability maps are also generated, with an original technique, to help visualize the network activity on the lung areas. We also consider other scores proposed in the literature and provide a comparison with a recently proposed non-specific approach. Finally, we test the robustness of our model on an assorted public COVID-19 dataset, for which we also provide Brixia score annotations, observing good direct generalization and fine-tuning capabilities that highlight the portability of BS-Net to other clinical settings. The CXR dataset, along with the source code and the trained model, is publicly released for research purposes.

Keywords: COVID-19 severity assessment; Chest X-rays; Convolutional neural networks; End-to-end learning; Semi-quantitative rating.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Copyright © 2021. Published by Elsevier B.V.

Figures

Graphical abstract
Fig. 1
Brixia score: (a) zone definition and (b–d) examples of annotations. Lungs are first divided into six zones on frontal chest X-rays. Line A is drawn at the level of the inferior wall of the aortic arch. Line B is drawn at the level of the inferior wall of the right inferior pulmonary vein. A and D upper zones; B and E middle zones; C and F lower zones. A score ranging from 0 (green) to 3 (black) is then assigned to each sector, based on the observed lung abnormalities. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
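As a minimal sketch of the scoring arithmetic described in this caption, the snippet below sums six zone scores (A to F, each from 0 to 3) into a global score ranging from 0 to 18, the range reported for the Global Score in Fig. 6. The function name `global_brixia_score` and the dictionary layout are illustrative assumptions, not part of the released BS-Net code, and the example values are invented.

```python
# Illustrative only: names and data layout are assumptions, not the authors' code.
ZONES = ("A", "B", "C", "D", "E", "F")  # six lung zones from the caption


def global_brixia_score(zone_scores):
    """Sum six per-zone scores (each an integer 0..3) into a global score (0..18)."""
    missing = [z for z in ZONES if z not in zone_scores]
    if missing:
        raise ValueError(f"missing zones: {missing}")
    for zone in ZONES:
        score = zone_scores[zone]
        if not isinstance(score, int) or not 0 <= score <= 3:
            raise ValueError(f"zone {zone} has invalid score {score!r}")
    return sum(zone_scores[z] for z in ZONES)


# Invented example annotation, not taken from the dataset.
example = {"A": 0, "B": 1, "C": 2, "D": 0, "E": 2, "F": 3}
print(global_brixia_score(example))  # 8
```

Validating the per-zone range before summing keeps an out-of-range annotation from silently producing a plausible-looking global score.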
Fig. 2
Overview of the proposed method: representation of the two COVID-19 datasets (on the left) with associated Brixia score annotations, and of the other two datasets (on the right) used for pre-training. Dataset splitting and usage are indicated (in the middle) for the training/validation/test phases. The outputs of the proposed system are illustrated as well (bottom right).
Fig. 3
Brixia score distribution with sex stratification on the Brixia COVID-19 dataset (left), and on the dataset of Cohen et al., 2020b (right).
Fig. 4
Detailed scheme of the proposed architecture. In particular, in the top-middle the CXR to be analyzed is fed to the network. The produced outputs are: the segmentation mask of the lungs (top-left); the aligned mask (middle-left); the Brixia score (top-right).
Fig. 5
Example of the alignment through the resampling grid produced by the transformation matrix, and its application to both the segmentation mask and the feature maps. On the right, the hard-attention mechanism and the ROI Pooling operation.
Fig. 6
Consistency/confusion matrices based on lung regions score values (top, 0–3), and on Global Score values (bottom, 0–18).
Fig. 7
Single and joint MAE distribution for lung regions and Global Score predictions obtained by BS-Net (ENS).
Fig. 8
Training curves related to BS-Net-HA. Segmentation (a); Alignment (b); Brixia score prediction – best single model (c).
Fig. 9
Pairwise inter-rater results in terms of MAE (and SD). The rightmost column (orange) shows the inter-rater results against the predictions of BS-Net-Ens.
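One plausible way to compute such pairwise figures is sketched below. It assumes that MAE is the mean absolute difference between two raters' zone scores and that SD is the population standard deviation of those absolute differences; the caption does not spell out the exact definition, so both are assumptions. The helper name `pairwise_mae` and the toy ratings are invented for illustration.

```python
from itertools import combinations
from statistics import mean, pstdev


def pairwise_mae(ratings):
    """Return {(rater_a, rater_b): (MAE, SD)} for every pair of raters.

    `ratings` maps a rater name to a list of zone scores, in the same order
    for every rater. SD is taken over the absolute differences (an assumption).
    """
    results = {}
    for (name_a, a), (name_b, b) in combinations(ratings.items(), 2):
        if len(a) != len(b):
            raise ValueError(f"{name_a} and {name_b} rated different numbers of zones")
        diffs = [abs(x - y) for x, y in zip(a, b)]
        results[(name_a, name_b)] = (mean(diffs), pstdev(diffs))
    return results


# Toy ratings for a single image (six zones); values are invented.
toy = {
    "R0": [0, 1, 2, 0, 2, 3],
    "R1": [0, 2, 2, 1, 2, 3],
    "model": [0, 1, 3, 0, 2, 2],
}

for pair, (mae, sd) in pairwise_mae(toy).items():
    print(pair, f"MAE={mae:.3f}", f"SD={sd:.3f}")
```

Treating the model as one more entry in the ratings dictionary mirrors the figure's layout, where the network's predictions are compared against each human rater in the same way the raters are compared with each other.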
Fig. 10
Results and related explainability maps obtained on five examples from the Brixia COVID-19 test set. (top) Three examples of accurate predictions. (bottom) Two critical cases in which the prediction is poor with respect to the original clinical annotation R0. For each block, the leftmost image is the input CXR, followed by the aligned and masked lungs. In the second row we show the predicted Brixia score with respect to the original clinical annotation R0, and the explainability map. In these maps the relevance is colored so that white means that the region does not contribute to the prediction, while the class color (i.e., 1 = orange, 2 = red, 3 = black) means that the region had an important role in the prediction of the T score class.
Fig. 11
MAE on regions (left) and MAE of the global score (right) versus synthetic rotation. The blue curve is from the network ‘without’ the alignment block, while the orange is ‘with’ the alignment block enabled. The shaded areas correspond to the 95% confidence interval. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

References

    1. Blain M., Kassin M.T., Varble N., Wang X., Xu Z., Xu D., Carrafiello G., Vespro V., Stellato E., Ierardi A.M., et al. Determination of disease severity in COVID-19 patients using deep learning in chest X-ray images. Diagn. Interv. Radiol. 2021;27(1):20–27.
    1. Bontempi D., Benini S., Signoroni A., Svanera M., Muckli L. CEREBRUM: a fast and fully-volumetric convolutional Encoder-decodeR for weakly-supervised sEgmentation of BRain strUctures from out-of-the-scanner MRI. Med. Image Anal. 2020;62
    1. Borghesi A., Maroldi R. COVID-19 outbreak in Italy: experimental chest X-ray scoring system for quantifying and monitoring disease progression. Radiol. Med. 2020;125:509–513.
    1. Borghesi A., Zigliani A., Golemi S., Carapella N., Maculotti P., Farina D., Maroldi R. Chest X-ray severity index as a predictor of in-hospital mortality in coronavirus disease 2019: a study of 302 patients from Italy. Int. J. Infect. Dis. 2020;96:291–293.
    1. Amer, R., Frid-Adar, M., Gozes, O., Nassar, J., Greenspan, H., 2020. COVID-19 in CXR: from detection and severity scoring to patient disease monitoring. arXiv: doi: .
    1. Burlacu, A., Crisan-Dabija, R., Popa, I. V., Artene, B., Birzu, V., Pricop, M., Plesoianu, C., Generali, D., 2020. Curbing the AI-induced enthusiasm in diagnosing COVID-19 on chest X-rays: the present and the near-future. medRxiv.
    1. Borghesi A., Zigliani A., Masciullo R., Golemi S., Maculotti P., Farina D., Maroldi R. Radiographic severity index in COVID-19 pneumonia: relationship to age and sex in 783 Italian patients. Radiol. Med. 2020;125:461–464.
    1. Buslaev A., Iglovikov V.I., Khvedchenya E., Parinov A., Druzhinin M., Kalinin A.A. Albumentations: fast and flexible image augmentations. Information. 2020;11(2)
    1. Candemir S., Antani S. A review on lung boundary detection in chest X-rays. Int. J. Comput. Assist. Radiol. Surg. 2019;14(4):563–576.
    1. Candemir S., Jaeger S., Palaniappan K., Musco J.P., Singh R.K., Xue Z., Karargyris A., Antani S., Thoma G., McDonald C.J. Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration. IEEE Trans. Med. Imaging. 2014;33(2):577–590.
    1. Castiglioni, I., Ippolito, D., Interlenghi, M., Monti, C. B., Salvatore, C., Schiaffino, S., Polidori, A., Gandola, D., Messa, C., Sardanelli, F., 2020. Artificial intelligence applied on chest X-ray can aid in the diagnosis of COVID-19 infection: a first experience from lombardy, Italy. medRxiv.
    1. Cohen, J. P., Dao, L., Morrison, P., Roth, K., Bengio, Y., Shen, B., Abbasi, A., Hoshmand-Kochi, M., Ghassemi, M., Li, H., Duong, T. Q., 2020a. Predicting COVID-19 pneumonia severity on chest X-ray with deep learning. .
    1. Cohen J.P., Morrison P., Dao L., Roth K., Duong T.Q., Ghassemi M. COVID-19 image data collection: prospective predictions are the future. J. Mach. Learn. Biomed. Imaging. 2020;2:1–38.
    1. Toussie D., Voutsinas N., Finkelstein M., Cedillo M.A., Manna S., Maron S.Z., Jacobi A., Chung M., Bernheim A., Eber C., Concepcion J., Fayad Z., Gupta Y.S. Clinical and chest radiography features determine patient outcomes in young and middle age adults with COVID-19. Radiology. 2020;297(1):E197–E206.
    1. van Ginneken B., Stegmann M.B., Loog M. Segmentation of anatomical structures in chest radiographs using supervised methods: a comparative study on a public database. Med. Image Anal. 2006;10(1):19–40.
    1. Cohen, J. P., Morrison, P., Dao, L., 2020b. COVID-19 image data collection. arXiv:, .
    1. Glasmachers, T., 2017. Limits of End-to-End Learning. .
    1. Gozes, O., Frid-Adar, M., Greenspan, H., Browning, P. D., Zhang, H., Ji, W., Bernheim, A., Siegel, E., 2020. Rapid AI development cycle for the coronavirus (COVID-19) pandemic: initial results for automated detection and patient monitoring using deep learning CT image analysis. .
    1. Frid-Adar M., Ben-Cohen A., Amer R., Greenspan H. Image Analysis for Moving Organ, Breast, and Thoracic Images. Springer; 2018. Improving the segmentation of anatomical structures in chest radiographs using U-net with an imagenet pre-trained encoder; pp. 159–168.
    1. He K., Zhang X., Ren S., Sun J. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. Deep residual learning for image recognition; pp. 770–778.
    1. Hryniewska, W., Bombiński, P., Szatkowski, P., Tomaszewska, P., Przelaskowski, A., Biecek, P., 2020. Do not repeat these mistakes – a critical appraisal of applications of explainable artificial intelligence for image based COVID-19 detection. .
    1. Huang L., Han R., Ai T., Yu P., Kang H., Tao Q., Xia L. Serial quantitative chest CT assessment of COVID-19: deep-learning approach. Radiol.: Cardiothorac. Imag. 2020;2(2):e200075.
    1. Huang G., Liu Z., Van Der Maaten L., Weinberger K.Q. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017. Densely connected convolutional networks; pp. 4700–4708.
    1. Irvin J., Rajpurkar P., Ko M., Yu Y., Ciurea-Ilcus S., Chute C., Marklund H., Haghgoo B., Ball R., Shpanskaya K., et al. Proceedings of the AAAI Conf. on Artificial Intelligence. Vol. 33. 2019. Chexpert: a large chest radiograph dataset with uncertainty labels and expert comparison; pp. 590–597.
    1. Jaderberg M., Simonyan K., Zisserman A., Kavukcuoglu K. Advances in Neural Information Processing Systems. Vol. 28. Curran Associates, Inc.; 2015. Spatial transformer networks; pp. 2017–2025.
    1. Jaeger S., Candemir S., Antani S., Wáng Y.-X.J., Lu P.-X., Thoma G. Two public chest X-ray datasets for computer-aided screening of pulmonary diseases. Quant. Imaging Med. Surg. 2014;4(6):475.
    1. Kalkreuth, R., Kaufmann, P., 2020. COVID-19: a survey on public medical imaging data resources. .
    1. Karim, M. R., Döhmen, T., Rebholz-Schuhmann, D., Decker, S., Cochez, M., Beyan, O., 2020. DeepCOVIDExplainer: explainable COVID-19 predictions based on chest X-ray images. .
    1. Karimi, D., Dou, H., Warfield, S. K., Gholipour, A., 2019. Deep learning with noisy labels: exploring techniques and remedies in medical image analysis. .
    1. Kingma, D. P., Ba, J., 2014. Adam: a method for stochastic optimization. .
    1. Kundu S., Elhalawani H., Gichoya J.W., Kahn C.E. How might AI and chest imaging help unravel COVID-19's mysteries? Radiol.: Artif. Intell. 2020;2(3):e200053.
    1. Laghi A. Cautions about radiologic diagnosis of COVID-19 infection driven by artificial intelligence. Lancet Digit. Health. 2020;2(5):e225.
    1. Latif S., Usman M., Manzoor S., Iqbal W., Qadir J., Tyson G., Castro I., Razi A., Boulos M.N.K., Weller A., Crowcroft J. Leveraging data science to combat COVID-19: a comprehensive review. IEEE Trans. Artif. Intell. 2020;1(1):85–103.
    1. Lessmann N., Sanchez C.I., Beenen L., Boulogne L.H., Brink M., Calli E., Charbonnier J.-P., Dofferhoff T., van Everdingen W.M., Gerke P.K., Geurts B., Gietema H.A., Groeneveld M., van Harten L., Hendrix N., Hendrix W., Huisman H.J., Išgum I., Jacobs C., Kluge R., Kok M., Krdzalic J., Lassen-Schmidt B., van Leeuwen K., Meakin J., Overkamp M., van Rees Vellinga T., van Rikxoort E.M., Samperna R., Schaefer-Prokop C., Schalekamp S., Scholten E.T., Sital C., Stager L., Teuwen J., Vaidhya Venkadesh K., de Vente C., Vermaat M., Xie W., de Wilde B., Prokop M., van Ginneken B. Automated assessment of CO-RADS and chest CT severity scores in patients with suspected COVID-19 using artificial intelligence. Radiology. 2021;298(1):E18–E28. doi: 10.1148/radiol.2020202439. PMID: 32729810.

    1. Li, M. D., Arun, N. T., Aggarwal, M., Gupta, S., Singh, P., Little, B. P., Mendoza, D. P., Corradi, G. C. A., Takahashi, M. S., Ferraciolli, S. F., Succi, M. D., Lang, M., Bizzo, B. C., Dayan, I., Kitamura, F. C., Kalpathy-Cramer, J., 2020a. Improvement and multi-population generalizability of a deep learning-based chest radiograph severity score for COVID-19. medRxiv. 10.1101/2020.09.15.20195453.
    1. Li M.D., Arun N.T., Gidwani M., Chang K., Deng F., Little B.P., Mendoza D.P., Lang M., Lee S.I., O'Shea A., Parakh A., Singh P., Kalpathy-Cramer J. Automated assessment and tracking of COVID-19 pulmonary disease severity on chest radiographs using convolutional siamese neural networks. Radiology. 2020;2(4):e200079.
    1. Li, X., Li, C., Zhu, D., 2020c. COVID-MobileXpert: On-device COVID-19 screening using snapshots of chest X-ray. .
    1. Wang, L., Lin, Z. Q., Wong, A., 2020. COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest radiography images. .
    1. Lin T.-Y., Dollar P., Girshick R., He K., Hariharan B., Belongie S. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) IEEE; 2017. Feature Pyramid Networks for Object Detection; pp. 936–944.
    1. Long J., Shelhamer E., Darrell T. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. Fully convolutional networks for semantic segmentation; pp. 3431–3440.
    1. Maguolo, G., Nanni, L., 2020. A Critic Evaluation of Methods for COVID-19 Automatic Detection from X-Ray Images. .
    1. Manna S., Wruble J., Maron S., Toussie D., Voutsinas N., Finkelstein M., Cedillo M.A., Diamond J., Eber C., Jacobi A., Chung M., Bernheim A. COVID-19: a multimodality review of radiologic techniques, clinical utility, and imaging features. Radiol.: Cardiothorac. Imag. 2020;2(3):e200210.
    1. Maroldi R., Rondi P., Agazzi G.M., Ravanelli M., Borghesi A., Farina D. Which role for chest X-ray score in predicting the outcome in COVID-19 pneumonia? Eur. Radiol. 2020:1–7. doi: 10.1007/s00330-020-07504-2.
    1. Minaee S., Kafieh R., Sonka M., Yazdani S., Jamalipour Soufi G. Deep-COVID: predicting COVID-19 from chest X-ray images using deep transfer learning. Med. Image Anal. 2020;65:101794.
    1. Oh Y., Park S., Ye J.C. Deep learning COVID-19 features on CXR using limited training data sets. IEEE Trans. Med. Imaging. 2020;39(8):2688–2700.
    1. Rajaraman, S., Siegelman, J., Alderson, P. O., Folio, L. S., Folio, L. R., Antani, S. K., 2020. Iteratively pruned deep learning ensembles for COVID-19 detection in chest X-rays. .
    1. Ramachandran, P., Zoph, B., Le, Q. V., 2017. Searching for activation functions. .
    1. Pereira R.M., Bertolini D., Teixeira L.O., Silla C.N., Costa Y.M.G. COVID-19 identification in chest X-ray images on flat and hierarchical classification scenarios. Comput. Methods Prog. Biomed. 2020;194:105532.
    1. Reyes M., Meier R., Pereira S., Silva C.A., Dahlweid F.-M., Tengg-Kobligk H.v., Summers R.M., Wiest R. On the interpretability of artificial intelligence in radiology: challenges and opportunities. Radiol.: Artif. Intell. 2020;2(3):e190043.
    1. Ribeiro M.T., Singh S., Guestrin C. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery; New York, NY, USA: 2016. "Why should I trust you?": explaining the predictions of any classifier; pp. 1135–1144.
    1. Ronneberger O., Fischer P., Brox T. International Conference on Medical image Computing and Computer-Assisted Intervention. Springer; 2015. U-net: convolutional networks for biomedical image segmentation; pp. 234–241.
    1. Rubin G.D., Ryerson C.J., Haramati L.B., Sverzellati N., Kanne J.P., Raoof S., Schluger N.W., Volpi A., Yim J.-J., Martin I.B.K., Anderson D.J., Kong C., Altes T., Bush A., Desai S.R., Goldin J., Goo J.M., Humbert M., Inoue Y., Kauczor H.-U., Luo F., Mazzone P.J., Prokop M., Remy-Jardin M., Richeldi L., Schaefer-Prokop C.M., Tomiyama N., Wells A.U., Leung A.N. The role of chest imaging in patient management during the COVID-19 pandemic: a multinational consensus statement from the Fleischner society. Radiology (Simultaneously Published in Chest) 2020;296(1):172–180.
    1. Sardanelli F., Di Leo G. Assessing the value of diagnostic tests in the new world of COVID-19 pandemic. Radiology. 2020;296(3):E193–E194.
    1. Selvaraju R.R., Cogswell M., Das A., Vedantam R., Parikh D., Batra D. The IEEE International Conference on Computer Vision (ICCV) IEEE; 2017. Grad-CAM: visual explanations from deep networks via gradient-based localization; pp. 618–626.
    1. Shi F., Wang J., Shi J., Wu Z., Wang Q., Tang Z., He K., Shi Y., Shen D. Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for COVID-19. IEEE Rev. Biomed. Eng. 2021;14:4–15.
    1. Shiraishi J., Katsuragawa S., Ikezoe J., Matsumoto T., Kobayashi T., Komatsu K.-i., Matsui M., Fujita H., Kodera Y., Doi K. Development of a digital image database for chest radiographs with and without a lung nodule: receiver operating characteristic analysis of radiologists’ detection of pulmonary nodules. Am. J. Roentgenol. 2000;174(1):71–74.
    1. Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. .
    1. Stirenko S., Kochura Y., Alienin O., Rokovyi O., Gordienko Y., Gang P., Zeng W. 2018 IEEE 38th International Conference on Electronics and Nanotechnology (ELNANO) IEEE; 2018. Chest X-ray analysis of tuberculosis by deep learning with segmentation and augmentation; pp. 422–428.
    1. Summers R.M. Artificial intelligence of COVID-19 imaging: a hammer in search of a nail. Radiology (Published Online) 2021;298(3):E162–E164. PMID: 33350895.

    1. Szegedy C., Ioffe S., Vanhoucke V., Alemi A.A. Thirty-First AAAI Conference on artificial Intelligence. ACM; 2017. Inception-v4, inception-resnet and the impact of residual connections on learning; pp. 4278–4284.
    1. Tajbakhsh N., Jeyaseelan L., Li Q., Chiang J.N., Wu Z., Ding X. Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation. Med. Image Anal. 2020;63:101693.
    1. Tan, M., Pang, R., Le, Q. V., 2019. EfficientDet: Scalable and Efficient Object Detection. .
    1. Tartaglione, E., Barbano, C. A., Berzovini, C., Calandri, M., Grangetto, M., 2020. Unveiling COVID-19 from chest X-ray with deep learning: a hurdles race with small data. .
    1. Ting D.S.W., Carin L., Dzau V., Wong T.Y. Digital technology and COVID-19. Nat. Med. 2020;26(4):459–461.
    1. Vedaldi A., Soatto S. In: European Conference of Computer Vision (ECCV) Forsyth D., Torr P., Zisserman A., editors. Springer Berlin Heidelberg; Berlin, Heidelberg: 2008. Quick shift and kernel methods for mode seeking; pp. 705–718.
    1. Wang X., Peng Y., Lu L., Lu Z., Bagheri M., Summers R.M. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases; pp. 3462–3471.
    1. Warren M.A., Zhao Z., Koyama T., Bastarache J.A., Shaver C.M., Semler M.W., Rice T.W., Matthay M.A., Calfee C.S., Ware L.B. Severity scoring of lung oedema on the chest radiograph is associated with clinical outcomes in ARDS. Thorax. 2018;73(9):840–846.
    1. WHO, 2020. Coronavirus disease (COVID-19) outbreak. World Health Organization. .
    1. Wong, A., Lin, Z. Q., Wang, L., Chung, A. G., Shen, B., Abbasi, A., Hoshmand-Kochi, M., Duong, T. Q., 2020a. COVIDNet-S: Towards computer-aided severity assessment via training and validation of deep neural networks for geographic extent and opacity extent scoring of chest X-rays for SARS-CoV-2 lung disease severity. .
    1. Wong H.Y.F., Lam H.Y.S., Fong A.H.-T., Leung S.T., Chin T.W.-Y., Lo C.S.Y., Lui M.M.-S., Lee J.C.Y., Chiu K.W.-H., Chung T., Lee E.Y.P., Wan E.Y.F., Hung F.N.I., Lam T.P.W., Kuo M., Ng M.-Y. Frequency and distribution of chest radiographic findings in COVID-19 positive patients. Radiology. 2020;296(2):E72–E78.
    1. Xu Y., Zhu J.-Y., Chang E.I.-C., Lai M., Tu Z. Weakly supervised histopathology cancer image segmentation and classification. Med. Image Anal. 2014;18(3):591–604.
    1. Zhou Z., Rahman Siddiquee M.M., Tajbakhsh N., Liang J. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer International Publishing; Cham: 2018. UNet++: a nested U-Net architecture for medical image segmentation; pp. 3–11.
    1. Zhou Z.-H. A brief introduction to weakly supervised learning. Natl. Sci. Rev. 2017;5(1):44–53.
    1. Zhu J., Shen B., Abbasi A., Hoshmand-Kochi M., Li H., Duong T.Q. Deep transfer learning artificial intelligence accurately stages COVID-19 lung disease severity on portable chest radiographs. PLoS One. 2020;15(7):1–11.

Source: PubMed
