Human-AI Uncertainty Calibration for Improved Skin Lesion Segmentation

March 11, 2026 updated by: Julie Renata Bjerremand, Copenhagen Academy for Medical Education and Simulation

The Effect of Human-AI Uncertainty Calibration vs. AI Uncertainty Alone on the Diagnostic Accuracy of Human Experts for Skin Lesions - a Randomized Controlled Trial.

The goal of this randomized controlled study is to compare the effect of a new, personalized, uncertainty-aware decision model (the Final Decision Model, FDM) with that of a standard image recognition model on improving diagnostic accuracy and reducing diagnostic uncertainty in experienced dermatologists tasked with differentiating between melanomas, moles and other benign skin lesions. The main question it aims to answer: Is the FDM a feasible method for an improved human-AI partnership in which trust is built, misdiagnoses are avoided, and uncertainty is duly introduced or reduced?

The investigators expect to see only a slight increase in collective diagnostic accuracy for both interventions, as the human participants are skilled dermatologists and thus have high accuracy pre-intervention.

The investigators expect to see a larger increase in diagnostic certainty for the FDM intervention than for the Base Model intervention.

The investigators expect to see a higher number of diagnosis changes from incorrect to correct in the FDM group than in the Base Model group.

The investigators do not expect any learning effect during the study.

Participants will start by answering a series of training cases consisting of images of skin lesions. These answers are used to train their individual FDM (only for the FDM intervention group). The participants will then be randomized into two arms, which determines which of the two interventions they are exposed to. The participants will solve each case without any intervention first, and this reply will act as a control.
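The allocation step described above can be illustrated with a minimal sketch. The record does not state the randomization method, so the simple 1:1 shuffle, the function name `randomize_participants`, the seed, and the participant IDs below are all assumptions for illustration only.

```python
import random


def randomize_participants(participant_ids, seed=0):
    """Shuffle participants and split them into two arms of near-equal size.

    Hypothetical helper: a simple 1:1 allocation, not the study's stated method.
    """
    rng = random.Random(seed)  # fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "Base Model": sorted(shuffled[:half]),
        "FDM": sorted(shuffled[half:]),
    }


# 50 hypothetical participant IDs, matching the estimated enrollment.
arms = randomize_participants(range(1, 51), seed=2026)
print(len(arms["Base Model"]), len(arms["FDM"]))  # 25 25
```

A fixed seed is used here only so that the example is repeatable; a real trial would follow its own documented allocation procedure.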

Study Overview

Status

Not yet recruiting

Conditions

Intervention / Treatment

Detailed Description

A detailed description of the FDM is presented in the references.

Study Type

Interventional

Enrollment (Estimated)

50

Phase

  • Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

  • Child
  • Adult
  • Older Adult

Accepts Healthy Volunteers

Yes

Description

Inclusion Criteria:

  • Board certified dermatologists with clinical experience in dermoscopic diagnosis.

Exclusion Criteria:

  • Doctors who have not yet finished their specialization as dermatologists.
  • Dermatologists without clinical experience in dermoscopic diagnosis.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

  • Primary Purpose: Diagnostic
  • Allocation: Randomized
  • Interventional Model: Parallel Assignment
  • Masking: None (Open Label)

Arms and Interventions

Participant Group / Arm
Intervention / Treatment
Active Comparator: Base Model

The study participant is presented with a patient case including patient demographics (gender, age, location of the lesion) and two lesion images: one overview image and one dermoscopic image. They are first asked to indicate an initial diagnosis along with their self-perceived uncertainty for this specific case before they receive Intervention 1. This initial diagnosis will act as the control. Intervention 1 consists of AI-generated multi-class probabilities (from a model trained on a large dataset of dermoscopic and overview images similar to the ones used for testing); only the most likely diagnosis is presented, accompanied by uncertainty estimates in percent.

After the AI input, the study participant is given the chance to change their diagnosis and indicate any potential shift in uncertainty.
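As a rough illustration of how the Base Model output might be reduced to a single displayed diagnosis, the sketch below picks the most likely class from a set of multi-class probabilities and reports it as a percentage. The function name `top_diagnosis` and the probability values are hypothetical, and treating the displayed percentage as the top-class probability is an assumption; the record does not specify how the uncertainty estimate is derived.

```python
def top_diagnosis(probabilities):
    """Return the most likely class and its probability as a rounded percent.

    Assumption: the displayed percentage is the top-class probability.
    """
    label = max(probabilities, key=probabilities.get)
    return label, round(100 * probabilities[label])


# Illustrative values only; the class names follow the primary outcome measure.
example = {"melanoma": 0.72, "nevus": 0.20, "benign keratosis": 0.08}
label, percent = top_diagnosis(example)
print(f"{label}: {percent}%")  # melanoma: 72%
```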

See arm description.
Other Names:
  • Intervention 1
Experimental: FDM

The initial diagnosis and indication of self-perceived uncertainty follow the same procedure as for Intervention 1. Intervention 2 is the most likely diagnosis accompanied by a calibrated uncertainty generated by the FDM (i.e., a model trained on the study participant's previous answers, the crowd annotations on the training data, and the base model prediction).

After the AI input, the study participant is given the chance to change their diagnosis and indicate any potential shift in uncertainty.

See arm description.
Other Names:
  • Intervention 2
  • Final Decision Model

What is the study measuring?

Primary Outcome Measures

Outcome Measure: Accuracy
Measure Description: Diagnostic accuracy in differentiating between melanoma, nevus, and benign keratosis, defined as the percentage of correct diagnoses. Ground truth is based on histopathologically verified diagnoses.
Time Frame: Immediately after the intervention.
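The accuracy definition above (percentage of correct diagnoses against histopathological ground truth) can be written out as a short sketch. It also counts changes from an incorrect to a correct diagnosis, which the investigators expect to differ between arms. The function names and the four example cases are hypothetical.

```python
def accuracy_percent(diagnoses, ground_truth):
    """Percentage of diagnoses that match the ground-truth labels."""
    correct = sum(d == t for d, t in zip(diagnoses, ground_truth))
    return 100 * correct / len(ground_truth)


def count_corrections(initial, final, ground_truth):
    """Number of cases changed from an incorrect to a correct diagnosis."""
    return sum(i != t and f == t for i, f, t in zip(initial, final, ground_truth))


# Illustrative data: four cases, before (control) and after the AI input.
truth = ["melanoma", "nevus", "benign keratosis", "nevus"]
initial = ["nevus", "nevus", "benign keratosis", "melanoma"]
final = ["melanoma", "nevus", "benign keratosis", "melanoma"]

print(accuracy_percent(initial, truth))          # 50.0
print(accuracy_percent(final, truth))            # 75.0
print(count_corrections(initial, final, truth))  # 1
```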

Secondary Outcome Measures

Outcome Measure: Uncertainty
Measure Description: Change in self-assessed uncertainty from pre- to post-intervention, rated on a scale from 0 (very uncertain) to 10 (very certain).
Time Frame: Immediately after the intervention.

Outcome Measure: Cut-off uncertainty
Measure Description: The self-assessed uncertainty of cases where the participant has clicked a "Would you like to discuss this case with a colleague" button.
Time Frame: Immediately after the intervention.
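The two secondary measures can be sketched as simple summaries over per-case ratings on the 0 to 10 scale. The record does not say how the ratings are aggregated or whether the cut-off measure uses the pre- or post-intervention rating, so the use of means, the default `key="post"`, the field names, and the example data are all assumptions.

```python
from statistics import mean

# Illustrative per-case ratings; "discuss" marks cases where the participant
# clicked the button to discuss the case with a colleague.
cases = [
    {"pre": 4, "post": 7, "discuss": False},
    {"pre": 6, "post": 6, "discuss": False},
    {"pre": 3, "post": 2, "discuss": True},
    {"pre": 5, "post": 4, "discuss": True},
]


def mean_shift(cases):
    """Mean change in self-assessed rating from pre- to post-intervention."""
    return mean(c["post"] - c["pre"] for c in cases)


def cutoff_uncertainty(cases, key="post"):
    """Mean rating over flagged cases, or None if no case was flagged."""
    flagged = [c[key] for c in cases if c["discuss"]]
    return mean(flagged) if flagged else None


print(mean_shift(cases))
print(cutoff_uncertainty(cases))
```

Passing `key="pre"` would summarize the pre-intervention ratings of the flagged cases instead.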

Other Outcome Measures

Outcome Measure: Time
Measure Description: Time from the start to the finish of each case, with a split time corresponding to the end of the control phase (the time at which the "Show AI input" button is clicked).
Time Frame: Immediately after the intervention.
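The timing measure with its split point can be sketched as follows, assuming each case logs three timestamps in seconds. The function name `case_times`, the dictionary keys, and the example timestamps are hypothetical.

```python
def case_times(start, ai_shown, finish):
    """Split the total case time at the moment the AI input is revealed."""
    if not start <= ai_shown <= finish:
        raise ValueError("timestamps must satisfy start <= ai_shown <= finish")
    return {
        "control_phase": ai_shown - start,        # before "Show AI input" is clicked
        "intervention_phase": finish - ai_shown,  # after the AI input is shown
        "total": finish - start,
    }


# Illustrative timestamps in seconds since the case was opened.
times = case_times(start=0.0, ai_shown=42.5, finish=60.0)
print(times)
# {'control_phase': 42.5, 'intervention_phase': 17.5, 'total': 60.0}
```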

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Investigators

  • Study Chair: Martin Tolsgaard, Professor, Copenhagen Academy for Medical Education and Simulation

Publications and helpful links

The person responsible for entering information about the study voluntarily provides these publications. These may be about anything related to the study.

General Publications

  • Kampen, P.J.T. et al. (2026). Uncertainty-Aware Classification: A Human-Guided Bayesian Deep Learning Framework. In: Sudre, C.H., et al. Uncertainty for Safe Utilization of Machine Learning in Medical Imaging. UNSURE 2025. Lecture Notes in Computer Science, vol 16166. Springer, Cham. https://doi.org/10.1007/978-3-032-06593-3_19

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Estimated)

March 1, 2026

Primary Completion (Estimated)

July 1, 2026

Study Completion (Estimated)

November 1, 2026

Study Registration Dates

First Submitted

January 29, 2026

First Submitted That Met QC Criteria

March 11, 2026

First Posted (Actual)

March 12, 2026

Study Record Updates

Last Update Posted (Actual)

March 12, 2026

Last Update Submitted That Met QC Criteria

March 11, 2026

Last Verified

March 1, 2026

More Information

Terms related to this study

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.
