Scalable Clinical Oversight of Large Language Models Via Uncertainty Triangulation (SCOUT)

February 14, 2026 updated by: China National Center for Cardiovascular Diseases

Prospective Evaluation of a Model-Agnostic Meta-Verification Framework (SCOUT) for Scalable Clinical Oversight of Large Language Model Outputs in Coronary Heart Disease Diagnosis: A Multi-Reader, Randomized, Crossover Trial

This prospective, multi-reader, randomized crossover trial evaluates SCOUT (Scalable Clinical Oversight via Uncertainty Triangulation), a model-agnostic meta-verification framework that selectively defers unreliable large language model (LLM) predictions to clinicians by triangulating three orthogonal uncertainty signals: model heterogeneity, stochastic inconsistency, and reasoning critique. The trial assesses whether SCOUT-assisted review can reduce physician review time compared with standard manual review of AI-generated diagnoses while maintaining non-inferior diagnostic accuracy in coronary heart disease (CHD) subtyping.

Study Overview

Status

Not yet recruiting

Conditions

Coronary Heart Disease (CHD)

Intervention / Treatment

Detailed Description

Background: Large language models are increasingly deployed in clinical workflows, yet requiring clinician review of every AI output negates the efficiency gains that motivate their adoption. SCOUT addresses this efficiency-safety paradox through algorithmic meta-verification.

The SCOUT framework triangulates three orthogonal external signals to determine case-level uncertainty: (1) Model Heterogeneity - whether a structurally different auxiliary LLM agrees with the primary model; (2) Stochastic Inconsistency - whether repeated sampling from the same model yields divergent outputs; (3) Reasoning Critique - whether an external checker model identifies logical flaws in the chain-of-thought reasoning.
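
The record does not state how the three signals are combined into the deferral decision D(x). As a minimal sketch only, assuming a simple "defer if any signal fires" rule and hypothetical field names (none of which come from the trial record), the triangulation could look like this in Python:

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-case inputs; these names are illustrative assumptions."""
    primary_label: str            # diagnosis from the primary model
    auxiliary_label: str          # diagnosis from a structurally different auxiliary LLM
    resampled_labels: list[str]   # diagnoses from repeated sampling of the primary model
    critique_flagged: bool        # True if the checker model found a flaw in the reasoning


def deferral_decision(s: Signals) -> int:
    """Return D(x): 1 = defer to a clinician, 0 = auto-accept.

    Assumption: a case is deferred if ANY of the three signals fires.
    The actual SCOUT combination rule is not given in the record.
    """
    model_heterogeneity = s.auxiliary_label != s.primary_label
    stochastic_inconsistency = any(
        label != s.primary_label for label in s.resampled_labels
    )
    reasoning_critique = s.critique_flagged
    return int(model_heterogeneity or stochastic_inconsistency or reasoning_critique)


# Two illustrative cases (not trial data).
agree = Signals("NSTEMI", "NSTEMI", ["NSTEMI", "NSTEMI"], False)
split = Signals("NSTEMI", "NSTEMI", ["NSTEMI", "unstable angina"], False)
print(deferral_decision(agree), deferral_decision(split))
```

A weighted or thresholded combination would fit the same interface; the OR rule is chosen here only because it is the most conservative option for a safety-oriented deferral step.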

In this crossover trial, 7 clinicians of varying seniority (2 junior residents, 3 senior residents, 2 attending physicians) each review 110 cases in total across the two workflows: a control set (n=54) under standard manual review and an intervention set (n=56) under SCOUT-assisted review. The study evaluates workflow efficiency (primary endpoint) and diagnostic accuracy (secondary endpoint).

Study Type

Interventional

Enrollment (Estimated)

Phase

Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Name: Xiaojin Gao, Dr.
Phone Number: +86 010 88322415
Email: sophie_gao@sina.com

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

Description

Inclusion Criteria:

Board-certified or in-training cardiologists at Fuwai Hospital
Spanning three experience strata: junior residents, senior residents, attending physicians

Exclusion Criteria:

Clinicians involved in the development or optimization of the SCOUT framework
Clinicians involved in the gold-standard adjudication process

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Diagnostic
Allocation: Randomized
Interventional Model: Crossover Assignment
Masking: None (Open Label)

Number of Arms

Arms and Interventions


Participant Group / Arm

Intervention / Treatment

Active Comparator: Control (Standard Manual Review)

Physicians manually review all cases in the control set (n=54) with access to AI predictions and reasoning. No selective deferral.

Diagnostic test: Standard Manual Review Workflow

Physicians perform a full manual review of 54 cases using raw medical records with access to the AI model's predictions and reasoning, but without SCOUT uncertainty stratification or selective deferral.

Experimental: Experimental (SCOUT-Assisted Review)

Physicians process the intervention set (n=56) through the SCOUT framework. Low-uncertainty cases are auto-accepted; high-uncertainty cases undergo physician review with full audit trail.

Diagnostic test: SCOUT-Assisted Review Workflow

SCOUT-Assisted Review (Intervention Arm): Physicians review 56 cases processed through the SCOUT framework. For cases classified as low-uncertainty (D(x)=0), the AI prediction is auto-accepted without physician review. For high-uncertainty cases (D(x)=1), the physician reviews the case with access to the main model's chain-of-thought reasoning and the meta-verification audit results. The main model is DeepSeek-V3.1 with chain-of-thought prompting.
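
The routing step in the SCOUT-assisted arm can be illustrated with a short Python sketch. The case identifiers and D(x) values below are hypothetical placeholders, not trial data, and the function name is an assumption made for illustration:

```python
def route_cases(decisions: dict[str, int]) -> tuple[list[str], list[str]]:
    """Split case IDs into (auto_accepted, deferred) by their D(x) value."""
    auto_accepted = [case_id for case_id, d in decisions.items() if d == 0]
    deferred = [case_id for case_id, d in decisions.items() if d == 1]
    return auto_accepted, deferred


# Hypothetical D(x) values for five illustrative cases.
decisions = {"case-01": 0, "case-02": 1, "case-03": 0, "case-04": 0, "case-05": 1}

auto_accepted, deferred = route_cases(decisions)
# Share of cases that still require physician review.
deferral_rate = len(deferred) / len(decisions)
print(auto_accepted, deferred, deferral_rate)
```

Only the deferred list reaches the physician, so the deferral rate is the main driver of how much review time the workflow can save.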

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Mean physician review time per case (minutes)	Mean time spent by each clinician reviewing and rendering a diagnostic decision per case under each arm. Measured in minutes.	Through study completion, an average of 2 hours.

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Diagnostic accuracy (%)	Proportion of correct CHD subtype classifications (STEMI, NSTEMI, unstable angina, chronic coronary syndromes) under each arm.	Through study completion, an average of 2 hours.
Computational Return on Investment (ROI)	Ratio of physician time savings (valued at standardized minute-wages from Sanming healthcare reform benchmarks) to computational cost of SCOUT inference, stratified by clinician seniority level.	Through study completion, an average of 2 hours.
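
The ROI measure is described as a ratio of the monetary value of time saved to inference cost. A minimal Python sketch of that arithmetic follows, assuming time saved is expressed in minutes and that wages and compute cost share one currency. All numbers are placeholders chosen for easy arithmetic; they are not Sanming benchmark values or trial results.

```python
def computational_roi(
    minutes_saved: float, wage_per_minute: float, compute_cost: float
) -> float:
    """Ratio of the value of physician time saved to SCOUT inference cost."""
    if compute_cost <= 0:
        raise ValueError("compute_cost must be positive")
    return (minutes_saved * wage_per_minute) / compute_cost


# Placeholder inputs per seniority stratum: (minutes saved, wage per minute).
strata = {
    "junior resident": (100.0, 1.0),
    "senior resident": (100.0, 2.0),
    "attending physician": (100.0, 4.0),
}
compute_cost = 50.0  # placeholder cost of SCOUT inference

roi_by_stratum = {
    name: computational_roi(minutes, wage, compute_cost)
    for name, (minutes, wage) in strata.items()
}
print(roi_by_stratum)
```

With equal minutes saved, the ratio scales directly with the minute-wage, which is why stratifying by seniority matters for this measure.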

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

China National Center for Cardiovascular Diseases

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Estimated)

February 19, 2026

Primary Completion (Estimated)

February 28, 2026

Study Completion (Estimated)

February 28, 2026

Study Registration Dates

First Submitted

February 9, 2026

First Submitted That Met QC Criteria

February 14, 2026

First Posted (Actual)

February 17, 2026

Study Record Updates

Last Update Posted (Actual)

February 17, 2026

Last Update Submitted That Met QC Criteria

February 14, 2026

Last Verified

February 1, 2026

More Information

Terms related to this study

Keywords

artificial intelligence

Additional Relevant MeSH Terms

Other Study ID Numbers

2025-2702-1

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

YES

IPD Plan Description

De-identified individual participant data underlying the results reported in this study will be made available.

IPD Sharing Time Frame

Beginning 1 month after publication of the primary results and available for up to 60 months.

IPD Sharing Access Criteria

Data are available from the corresponding author upon reasonable request. Requestors will need to provide a methodologically sound research proposal and sign a data use agreement.

IPD Sharing Supporting Information Type

STUDY_PROTOCOL
SAP
ICF
ANALYTIC_CODE
CSR

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product

This information was retrieved directly from the website clinicaltrials.gov without any changes.
