Application of U-Net with Global Convolution Network Module in Computer-Aided Tongue Diagnosis

Meng-Yi Li, Ding-Ju Zhu, Wen Xu, Yu-Jie Lin, Kai-Leung Yung, Andrew W H Ip

Abstract

The rapid development of intelligent manufacturing provides strong support for the intelligent medical service ecosystem. Researchers are committed to building Wise Information Technology of 120 (WIT 120) for residents and medical personnel around the concept of simple smart medical care, using core technologies such as the Internet of Things, big data analytics, artificial intelligence, and microservice frameworks to improve patient safety, medical quality, clinical efficiency, and operational benefits. Within this ecosystem, using computers and deep learning technology to assist the diagnosis of tongue images and realize intelligent tongue diagnosis has become a major trend. Tongue cracks are an important feature of the tongue state: changes in crack state objectively and accurately reflect the course of some typical diseases and TCM syndromes, and semantic segmentation of the fissured tongue can be combined with other tongue-state features to further improve the identification accuracy of tongue diagnosis systems. Although computer-aided tongue diagnosis has made great progress, there are few studies on the fissured tongue, and most existing work focuses on analysis of the tongue coating and tongue body. In this paper, we carry out a systematic and in-depth study and propose an improved U-Net network for semantic segmentation of fissured tongue images. Introducing the Global Convolution Network (GCN) module into the encoder of U-Net addresses the problem that the original encoder is relatively simple and cannot extract sufficiently abstract high-level semantic features. Finally, the method is verified by experiments: the improved U-Net achieves a better segmentation effect and higher segmentation accuracy on the fissured tongue image dataset and can be used to design a computer-aided tongue diagnosis system.
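For readers who want a concrete picture of the modification described above, the following is a minimal PyTorch sketch of a Global Convolution Network (GCN) block and a Boundary Refinement (BR) block attached to one encoder feature map. The channel counts, kernel size k, and attachment point are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (assumed configuration) of GCN + BR blocks for a U-Net encoder.
import torch
import torch.nn as nn

class GCN(nn.Module):
    """Approximates a large k x k convolution with two separable branches."""
    def __init__(self, in_ch, out_ch, k=7):
        super().__init__()
        p = k // 2
        # Branch A: (1 x k) followed by (k x 1)
        self.branch_a = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=(1, k), padding=(0, p)),
            nn.Conv2d(out_ch, out_ch, kernel_size=(k, 1), padding=(p, 0)),
        )
        # Branch B: (k x 1) followed by (1 x k)
        self.branch_b = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=(k, 1), padding=(p, 0)),
            nn.Conv2d(out_ch, out_ch, kernel_size=(1, k), padding=(0, p)),
        )

    def forward(self, x):
        return self.branch_a(x) + self.branch_b(x)

class BoundaryRefinement(nn.Module):
    """Residual block that sharpens object boundaries after the GCN."""
    def __init__(self, ch):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return x + self.refine(x)

if __name__ == "__main__":
    feat = torch.randn(1, 128, 64, 64)   # hypothetical encoder feature map
    enriched = BoundaryRefinement(128)(GCN(128, 128, k=7)(feat))
    print(enriched.shape)                # spatial size and channels are preserved
```

The two stacked one-dimensional convolutions in each branch approximate a dense k x k kernel at much lower cost, which is how the GCN enlarges the encoder's effective receptive field.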

Conflict of interest statement

The authors declare that there are no conflicts of interest regarding the publication of this study.

Copyright © 2021 Meng-Yi Li et al.

Figures

Figure 1
Inception module with dimensionality reduction from the GoogLeNet architecture.
Figure 2
Residual block from the ResNet architecture.
Figure 3
The U-Net architecture. Blue boxes represent feature maps, and the number of channels is denoted above each feature map. (a) U-Net network structure, (b) GCN module and BR module, and (c) improved U-Net network structure.
Figure 4
Six cases of fissured tongue images and data preprocessing results: (a) the cracks in the picture are evenly distributed and obvious; (b) the cracks in the picture are scattered and obvious; (c, d) the crack distribution in the picture is scattered and not obvious; (e) the cracks in the picture are widely distributed and obvious, which is difficult to segment; (f) the crack distribution in the picture is single and easy to segment.
Figure 5
Comparison of different pretrained networks.
Figure 6
Prediction results with different pretrained networks used as the encoder.
Figure 7
Comparison of MIoU between classical segmentation models and the improved U-Net model on the test dataset (a minimal MIoU sketch follows this figure list).
Figure 8
Segmentation prediction results of different models in the test dataset.
Figure 9
Overlay effect of the original picture and predicted segmentation result image.
Figure 10
Experimental flow chart of the validation model.
Figure 11
The experimental prediction results of six models with high MIoU in Section 4.5.
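
Figures 7 and 11 report segmentation accuracy as mean intersection over union (MIoU). The snippet below is a minimal sketch of how MIoU can be computed for a binary crack mask; the class layout (background = 0, crack = 1) and the use of NumPy are assumptions for illustration, not the paper's evaluation code.

```python
# Minimal sketch (assumed class layout) of MIoU for binary crack segmentation.
import numpy as np

def mean_iou(pred, target, num_classes=2):
    """Average the per-class intersection-over-union over all classes."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                     # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

# Example with two tiny 4 x 4 masks.
pred   = np.array([[0, 0, 1, 1]] * 4)
target = np.array([[0, 1, 1, 1]] * 4)
print(round(mean_iou(pred, target), 3))
```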


Source: PubMed
