Assessment of Accuracy of an Artificial Intelligence Algorithm to Detect Melanoma in Images of Skin Lesions

Michael Phillips, Helen Marsden, Wayne Jaffe, Rubeta N Matin, Gorav N Wali, Jack Greenhalgh, Emily McGrath, Rob James, Evmorfia Ladoyanni, Anthony Bewley, Giuseppe Argenziano, Ioulios Palamaras

Abstract

Importance: A high proportion of suspicious pigmented skin lesions referred for investigation are benign. Techniques to improve the accuracy of melanoma diagnoses throughout the patient pathway are needed to reduce the pressure on secondary care and pathology services.

Objective: To determine the accuracy of an artificial intelligence algorithm in identifying melanoma in dermoscopic images of lesions taken with smartphone and digital single-lens reflex (DSLR) cameras.

Design, setting, and participants: This prospective, multicenter, single-arm, masked diagnostic trial took place in dermatology and plastic surgery clinics in 7 UK hospitals. Dermoscopic images of suspicious and control skin lesions from 514 patients with at least 1 suspicious pigmented skin lesion scheduled for biopsy were captured on 3 different cameras. Data were collected from January 2017 to July 2018. Clinicians and the Deep Ensemble for Recognition of Malignancy, a deterministic artificial intelligence algorithm trained to identify melanoma in dermoscopic images of pigmented skin lesions using deep learning techniques, assessed the likelihood of melanoma. Initial data analysis was conducted in September 2018; further analysis was conducted from February 2019 to August 2019.

Interventions: Clinician and algorithmic assessment of melanoma.

Main outcomes and measures: Area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity of the algorithmic and specialist assessment, determined using histopathology diagnosis as the criterion standard.
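As an illustration of the three outcome measures named above, the following is a minimal sketch, not the study's analysis code. It assumes each lesion has an algorithm score between 0 and 1 and a histopathology label (1 for melanoma, 0 otherwise). The function names, the toy scores, and the 0.5 threshold are hypothetical and chosen only for the example.

```python
from typing import Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a randomly chosen melanoma scores
    higher than a randomly chosen non-melanoma (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one melanoma and one non-melanoma lesion")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called melanoma."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 3 melanomas and 4 benign lesions (not study data).
toy_scores = [0.9, 0.8, 0.35, 0.7, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0]

print("AUROC:", auroc(toy_scores, toy_labels))
print("Sensitivity, specificity at 0.5:",
      sensitivity_specificity(toy_scores, toy_labels, 0.5))
```

The pairwise definition of AUROC is used here because it needs no external library; for large datasets a rank-based implementation would be more efficient.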

Results: The study population of 514 patients included 279 women (55.7%) and 484 white patients (96.8%), with a mean (SD) age of 52.1 (18.6) years. A total of 1550 images of skin lesions were included in the analysis (551 [35.6%] biopsied lesions; 999 [64.4%] control lesions); 286 images (18.6%) were used to train the algorithm, and a further 849 (54.8%) images were missing or unsuitable for analysis. Of the biopsied lesions that were assessed by the algorithm and specialists, 125 (22.7%) were diagnosed as melanoma. Of these, 77 (16.7%) were used for the primary analysis. The algorithm achieved an AUROC of 90.1% (95% CI, 86.3%-94.0%) for biopsied lesions and 95.8% (95% CI, 94.1%-97.6%) for all lesions using iPhone 6s images; an AUROC of 85.8% (95% CI, 81.0%-90.7%) for biopsied lesions and 93.8% (95% CI, 91.4%-96.2%) for all lesions using Galaxy S6 images; and an AUROC of 86.9% (95% CI, 80.8%-93.0%) for biopsied lesions and 91.8% (95% CI, 87.5%-96.1%) for all lesions using DSLR camera images. At 100% sensitivity, the algorithm achieved a specificity of 64.8% with iPhone 6s images. Specialists achieved an AUROC of 77.8% (95% CI, 72.5%-81.9%) and a specificity of 69.9%.
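The specificity "at 100% sensitivity" reported above corresponds to the operating point at which no melanoma is missed. One way such a figure could be derived is sketched below, assuming per-lesion scores and histopathology labels are available. This is an illustrative assumption, not the authors' stated procedure; the function name and toy values are hypothetical.

```python
from typing import Sequence, Tuple


def specificity_at_full_sensitivity(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[float, float]:
    """Return (threshold, specificity) at the highest threshold that still
    classifies every melanoma (label 1) as positive."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one melanoma and one non-melanoma lesion")
    # Scores >= threshold are called melanoma, so the lowest melanoma score
    # is the highest threshold that keeps sensitivity at 100%.
    threshold = min(pos)
    true_negatives = sum(1 for s in neg if s < threshold)
    return threshold, true_negatives / len(neg)


# Hypothetical toy data: 3 melanomas and 4 benign lesions (not study data).
toy_scores = [0.9, 0.8, 0.35, 0.7, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0]

threshold, specificity = specificity_at_full_sensitivity(toy_scores, toy_labels)
print("Threshold:", threshold, "Specificity:", specificity)
```

Because the threshold is set by the single lowest-scoring melanoma, this operating point is sensitive to outliers, which is one reason it is usually reported alongside the AUROC.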

Conclusions and relevance: In this study, the algorithm demonstrated an ability to identify melanoma from dermoscopic images of selected lesions with an accuracy similar to that of specialists.

Conflict of interest statement

Conflict of Interest Disclosures: Mr Phillips reported having a familial relationship with Skin Analytics Limited. Dr Marsden reported working for Skin Analytics Limited and receiving share options during the conduct of the study. Dr Matin reported receiving grants from Barco outside the submitted work and being coauthor of a suite of Cochrane diagnostic test accuracy systematic reviews, including the diagnosis of melanoma. Dr Greenhalgh reported working for Skin Analytics Limited during the conduct of the study and holding patent US 20150254851 A1. Dr Bewley reported working as an ad hoc consultant for Almirall, AbbVie, Galderma, LEO Pharma, Eli Lilly and Co, Sanofi, Novartis, and Janssen Pharmaceuticals. No other disclosures were reported.

Figures

Figure 1. Receiver Operating Characteristic Curves for Clinical and Trained Algorithm Assessment of Biopsied Lesions
A, Area under receiver operator characteristic curve, 0.779. B, Area under receiver operator characteristic curve, 0.902. C, Area under receiver operator characteristic curve, 0.858. D, Area under receiver operator characteristic curve, 0.869. DSLR indicates digital single-lens reflex.
Figure 2. Receiver Operating Characteristic Curves for Clinical and Trained Algorithm Assessment of All Lesions
A, Area under receiver operator characteristic curve, 0.909. B, Area under receiver operator characteristic curve, 0.959. C, Area under receiver operator characteristic curve, 0.938. D, Area under receiver operator characteristic curve, 0.918. DSLR indicates digital single-lens reflex.

Source: PubMed
