Impact of AI Feedback on Ultrasound Biometry Accuracy Across the Expertise Levels

March 26, 2026 updated by: Mary Le Ngo, Copenhagen Academy for Medical Education and Simulation

Evaluating the Sensitivity to Change of AI-Feedback in Ultrasound Biometry: A Stratified Randomized Controlled Trial Across the Expertise Gradient

Objective: To evaluate the impact of real-time AI feedback on fetal biometry accuracy and to investigate the Expertise Reversal Effect, that is, whether the benefit of AI diminishes as user experience increases.

Design: A stratified randomized trial of 75 participants (25 Novices, 25 Intermediates, 25 Experts). Users are randomized 1:1 to either AI-assisted or manual measurement groups.

Outcomes:

  • Primary: Accuracy of estimated fetal weight (EFW), measured as mean absolute percentage error (MAPE) against actual birthweight.
  • Secondary: Procedure time, image quality, error relative to baseline scans, and cognitive workload (NASA-TLX).

Study Overview

Status

Not yet recruiting

Intervention / Treatment

Detailed Description

Study Overview: This study evaluates how real-time Artificial Intelligence (AI) feedback impacts the accuracy of fetal biometry measurements in obstetric ultrasound. While AI tools are designed to assist clinicians, their effectiveness may vary depending on the user's baseline skill level, a phenomenon known as the "Expertise Reversal Effect."

Research Aim: The primary objective is to determine whether AI-guided feedback significantly reduces measurement error in ultrasound fetal weight estimation compared with traditional manual methods. The study specifically investigates whether the benefit of AI is greater for novice and intermediate users than for experienced specialists.

Study Design: This is a stratified, randomized controlled trial involving 75 participants categorized into three expertise tiers:

Novices (e.g., students or residents with minimal scan experience).

Intermediate Users (e.g., physicians in mid-level training).

Experts (e.g., senior specialists).

Participants within each tier will be randomized 1:1 to either the AI-Assisted Group (receiving real-time automated plane validation and calipers) or the Control Group (performing standard manual biometry).
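The registration does not describe the allocation procedure. As a minimal sketch of what stratified 1:1 randomization could look like, the Python below shuffles a near-balanced list of arm labels within each expertise tier. The function names, participant IDs, and seed are hypothetical, and because each tier has 25 participants, one arm per tier necessarily receives 13 and the other 12.

```python
import random

ARMS = ("AI-assisted", "Control")

def randomize_stratum(participant_ids, rng):
    """Allocate one expertise tier 1:1 by shuffling a near-balanced label list."""
    n = len(participant_ids)
    # With an odd n, one arm gets the extra slot; pick that arm at random.
    first, second = rng.sample(ARMS, 2)
    labels = [first] * ((n + 1) // 2) + [second] * (n // 2)
    rng.shuffle(labels)
    return dict(zip(participant_ids, labels))

def randomize(strata, seed=0):
    """Randomize every tier independently using one seeded generator."""
    rng = random.Random(seed)
    return {tier: randomize_stratum(ids, rng) for tier, ids in strata.items()}

# Hypothetical participant IDs: 25 per tier, as in the study design.
strata = {
    tier: [f"{tier[0]}{i:02d}" for i in range(1, 26)]
    for tier in ("Novice", "Intermediate", "Expert")
}
allocation = randomize(strata, seed=2026)

for tier, assigned in allocation.items():
    counts = {arm: list(assigned.values()).count(arm) for arm in ARMS}
    print(tier, counts)
```

Randomizing within tiers rather than across the whole sample keeps the two arms comparable in expertise, which matters here because expertise is the effect modifier under study.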

Primary Outcome Measure: Accuracy of Estimated Fetal Weight (EFW): The Mean Absolute Percentage Error (MAPE) of the EFW relative to the actual birthweight, assessing the clinical impact of AI assistance on weight prediction.
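As a worked illustration of the primary outcome, MAPE is the average of |EFW − birthweight| / birthweight across scans, expressed as a percentage. The sketch below assumes weights in grams; the function name `mape` and the three example values are invented for illustration and are not study data.

```python
def mape(efw_grams, abw_grams):
    """Mean absolute percentage error of estimated vs. actual birthweight."""
    if len(efw_grams) != len(abw_grams) or not efw_grams:
        raise ValueError("inputs must be non-empty and of equal length")
    # Per-scan absolute error, scaled by the actual birthweight.
    errors = [abs(e - a) / a for e, a in zip(efw_grams, abw_grams)]
    return 100.0 * sum(errors) / len(errors)

# Invented example values (grams), not study data.
efw = [3300.0, 3600.0, 2850.0]
abw = [3000.0, 4000.0, 3000.0]
print(round(mape(efw, abw), 2))  # per-scan errors of 10%, 10% and 5% -> 8.33
```

Because each error is scaled by the actual birthweight, the same absolute error in grams counts for more in a smaller fetus than in a larger one.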

Secondary Outcome Measures:

  • Procedural Efficiency: Total procedure time (probe-to-skin) required to complete the biometry.
  • Image Quality: Objective assessment of captured planes based on standardized Salomon criteria.
  • Relative Measurement Error: Deviation of estimated fetal weight when compared to a standard (expert-validated) ultrasound scan.
  • Subjective Workload: Evaluation of cognitive load and user effort using the NASA Task Load Index (NASA-TLX).
  • Determination of Experience Threshold: Defining the 'cutoff' in clinical experience (years and total scans) for significant AI-mediated accuracy gains.

Study Type

Interventional

Enrollment (Estimated)

75

Phase

  • Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Study Locations

    • København Ø
      • Copenhagen, København Ø, Denmark, 2100
        • Rigshospitalet

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

  • Child
  • Adult
  • Older Adult

Accepts Healthy Volunteers

Yes

Description

Clinical Target Population: Healthcare professionals and students, including but not limited to:

  • Medical students (at master's level).
  • Resident physicians and Senior Consultants in Obstetrics and Gynecology.

Exclusion:

  • Participants who do not understand and speak either Danish or English.

Pregnant women:

Inclusion Criteria:

  • Pre-pregnancy BMI < 40
  • Singleton pregnancy
  • GA ≥ 37+0 at time of induction
  • Intact membranes (to ensure consistent amniotic fluid index)

Exclusion Criteria:

  • Major fetal anatomical anomaly
  • Anhydramnios (DVP < 2 cm)
  • Cerebroplacental ratio (CPR) < 2.5th percentile

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

  • Primary Purpose: Diagnostic
  • Allocation: Randomized
  • Interventional Model: Parallel Assignment
  • Masking: Single

Arms and Interventions

Participant Group / Arm
Intervention / Treatment
No Intervention: Control Group
Participants in the control arm perform fetal biometry using standard manual techniques without any AI assistance.
Experimental: AI Intervention Group
Participants in the intervention arm perform fetal biometry with the assistance of real-time Artificial Intelligence (AI) feedback software.
The software provides real-time "traffic light" or score-based feedback to validate when the correct anatomical plane (BPD, HC, AC, or FL) has been reached.

What is the study measuring?

Primary Outcome Measures

Outcome Measure
Measure Description
Time Frame
To evaluate the sensitivity to change in ultrasound measurement accuracy when using AI-feedback compared to standard scanning
Time Frame: The two scans will be performed within a timeframe of 14 days.
Mean absolute percentage error (MAPE), defined as the absolute difference between estimated fetal weight (EFW) and actual birth weight (ABW) divided by actual birth weight and expressed as a percentage, for AI-assisted and manual fetal biometry.

Secondary Outcome Measures

Outcome Measure
Measure Description
Time Frame
Procedural Efficiency
Time Frame: The duration of the scan, maximum of 30 minutes
Scan duration (seconds) will be modeled as the dependent variable to assess the temporal impact of the AI feedback.
Cognitive and Physiological Load
Time Frame: During the ultrasound procedure (GSR) and immediately following the procedure (NASA-TLX), approximately 30 minutes in total.
Both the NASA-TLX (subjective) and GSR (objective) data will be modeled as dependent variables. These analyses will determine whether the AI intervention significantly alters the mental effort or autonomic stress response during the procedure.
Measurement Deviation
Time Frame: From the pre-study scan to the study scan.
The absolute difference between the participant's EFW and a baseline EFW obtained by an experienced clinician will be modeled to assess whether AI reduces inter-observer variability.
Image Quality
Time Frame: Through study completion, an average of 1 year.
Salomon Quality Score on the 16-point fetal biometry plane quality scale, comparing AI-assisted and manual ultrasound acquisition. The scale ranges from 0 to 16, with higher scores indicating better anatomical plane quality.
Experience Threshold for AI-Mediated Accuracy Gains
Time Frame: Through study completion, an average of 1 year.
Estimated interaction effect between operator experience (continuous, experience level) and intervention (AI-assisted vs manual) on mean absolute percentage error (MAPE), and the corresponding experience level at which the adjusted difference in MAPE between groups is not statistically significant.
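The registration does not specify the statistical model behind the experience-threshold outcome. One common formulation, offered here only as an assumption, is a linear model with an experience-by-intervention interaction. The symbols $\beta_0, \dots, \beta_3$, $A_i$, and $x_i$ are introduced for illustration.

```latex
\begin{align}
\mathrm{APE}_i &= \beta_0 + \beta_1 A_i + \beta_2 x_i + \beta_3 A_i x_i + \varepsilon_i, \\
\Delta(x) &= \beta_1 + \beta_3 x, \\
\operatorname{Var}\bigl[\hat{\Delta}(x)\bigr] &= \operatorname{Var}(\hat{\beta}_1)
  + x^2 \operatorname{Var}(\hat{\beta}_3)
  + 2x \operatorname{Cov}(\hat{\beta}_1, \hat{\beta}_3).
\end{align}
```

Here $\mathrm{APE}_i$ is the absolute percentage error of scan $i$, $A_i = 1$ for the AI-assisted arm and $0$ for manual biometry, and $x_i$ is operator experience. $\Delta(x)$ is the adjusted difference in MAPE between arms at experience level $x$. Under this sketch, the threshold would be the lowest $x$ at which the confidence interval for $\hat{\Delta}(x)$ includes zero, in the style of a Johnson-Neyman analysis.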

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Estimated)

March 1, 2026

Primary Completion (Estimated)

March 1, 2027

Study Completion (Estimated)

March 1, 2027

Study Registration Dates

First Submitted

February 26, 2026

First Submitted That Met QC Criteria

March 12, 2026

First Posted (Actual)

March 17, 2026

Study Record Updates

Last Update Posted (Actual)

March 31, 2026

Last Update Submitted That Met QC Criteria

March 26, 2026

Last Verified

March 1, 2026

More Information

Terms related to this study

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.
