Physician Reasoning on Diagnostic Cases With Large Language Models

February 15, 2024 updated by: Jonathan Chen, Stanford University

Diagnostic Reasoning With Large Language Model Chat Bots

This study will evaluate the effect of providing access to GPT-4, a large language model, compared to traditional diagnostic decision support tools on performance on case-based diagnostic reasoning tasks.

Study Overview

Status

Completed

Conditions

Diagnosis

Intervention / Treatment

Other: GPT-4

Detailed Description

Artificial intelligence (AI) technologies, specifically advanced large language models like OpenAI's ChatGPT, have the potential to improve medical decision-making. Although ChatGPT-4 was not developed for medical applications, it has demonstrated promise in various healthcare contexts, including medical note-writing, addressing patient inquiries, and facilitating medical consultation. However, little is known about how ChatGPT augments the clinical reasoning abilities of clinicians.

Clinical reasoning is a complex process involving pattern recognition, knowledge application, and probabilistic reasoning. Integrating AI tools like ChatGPT-4 into physician workflows could help reduce clinician workload and decrease the likelihood of missed diagnoses. However, ChatGPT-4 was not developed for clinical reasoning, nor has it been validated for this purpose. Further, it may generate misinformation, including convincing confabulations that may mislead clinicians. If clinicians misuse this tool, it may not improve diagnostic reasoning and could even cause harm. Therefore, it is important to study how clinicians use large language models to augment clinical reasoning prior to routine incorporation into patient care.

In this study, we will randomize participants to answer diagnostic cases with or without access to ChatGPT-4. The participants will be asked to give three differential diagnoses for each case, with supporting and opposing findings for each diagnosis. Additionally, they will be asked to provide their top diagnosis along with the next diagnostic steps. Answers will be graded by independent reviewers blinded to treatment assignment.

Study Type

Interventional

Enrollment (Actual)

Phase

Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Name: Robert J Gallo, MD
Phone Number: (650) 723-4000
Email: rjgallo@stanford.edu

Study Contact Backup

Name: Jonathan H Chen, MD, PhD
Phone Number: (650) 723-4000
Email: jonc101@stanford.edu

Study Locations

United States
- California
  - Palo Alto, California, United States, 94304
    - Stanford University

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Child
Adult
Older Adult

Accepts Healthy Volunteers

Yes

Description

Inclusion Criteria:

Participants must be licensed physicians and have completed at least post-graduate year 2 (PGY2) of medical training.
Training in internal medicine, family medicine, or emergency medicine.

Exclusion Criteria:

Not currently practicing clinically.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Diagnostic
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Active Comparator: GPT-4. Group will be given access to GPT-4.	Other: GPT-4. OpenAI's GPT-4 large language model with chat interface.
No Intervention: Usual resources. Group will not be given access to GPT-4 but will be encouraged to use any resources they wish besides large language models (UpToDate, DynaMed, Google, etc.).

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Diagnostic reasoning	The primary outcome will be the percent correct (range: 0 to 100) for each case. For each case, participants will be asked for three top diagnoses and the findings from the case that support and oppose each diagnosis. Participants will receive 1 point for each plausible diagnosis. Findings supporting the diagnosis and findings opposing the diagnosis will also be graded based on correctness, with 1 point for partially correct and 2 points for completely correct responses. Participants will then be asked to name their top diagnosis, earning 1 point for a reasonable response and 2 points for the most correct response. Finally, participants will be asked to name up to 3 next steps to further evaluate the patient, with 1 point awarded for a partially correct response and 2 points for a completely correct response. The primary outcome will be compared at the case level between the randomized groups.	During evaluation
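
The scoring rule described above can be sketched in code to make the arithmetic concrete. This is only an illustrative sketch, not the investigators' grading tool: the names DiagnosisGrade, CaseGrade, and percent_correct are hypothetical, and the record does not state the maximum number of points per case. The sketch assumes the next steps are graded once on a 0 to 2 scale, which gives a maximum of 3 x (1 + 2 + 2) + 2 + 2 = 19 points. If each next step were graded separately, the maximum would differ.

```python
from dataclasses import dataclass
from typing import List

# ASSUMPTION: next steps are graded once (0-2), so the per-case maximum is
# 3 diagnoses x (1 + 2 + 2) + 2 (top diagnosis) + 2 (next steps) = 19 points.
MAX_POINTS = 3 * (1 + 2 + 2) + 2 + 2


@dataclass
class DiagnosisGrade:
    plausible: bool   # 1 point if the reviewer judges the diagnosis plausible
    supporting: int   # 0 = incorrect, 1 = partially correct, 2 = completely correct
    opposing: int     # same 0-2 scale as supporting findings


@dataclass
class CaseGrade:
    diagnoses: List[DiagnosisGrade]  # up to three differential diagnoses
    top_diagnosis: int               # 0, 1 (reasonable), or 2 (most correct)
    next_steps: int                  # 0, 1 (partially), or 2 (completely correct)


def _check(value: int, name: str) -> int:
    """Reject grades outside the 0-2 scale described in the record."""
    if value not in (0, 1, 2):
        raise ValueError(f"{name} must be 0, 1, or 2, got {value}")
    return value


def percent_correct(grade: CaseGrade) -> float:
    """Return the case score as a percentage of the assumed maximum."""
    if len(grade.diagnoses) > 3:
        raise ValueError("at most three diagnoses are graded per case")
    points = 0
    for d in grade.diagnoses:
        points += 1 if d.plausible else 0
        points += _check(d.supporting, "supporting")
        points += _check(d.opposing, "opposing")
    points += _check(grade.top_diagnosis, "top_diagnosis")
    points += _check(grade.next_steps, "next_steps")
    return 100.0 * points / MAX_POINTS


# Made-up example grades (not study data): 4 + 2 + 0 + 2 + 1 = 9 of 19 points.
example = CaseGrade(
    diagnoses=[
        DiagnosisGrade(plausible=True, supporting=2, opposing=1),
        DiagnosisGrade(plausible=True, supporting=1, opposing=0),
        DiagnosisGrade(plausible=False, supporting=0, opposing=0),
    ],
    top_diagnosis=2,
    next_steps=1,
)
print(round(percent_correct(example), 1))
```

Keeping the maximum in a single constant makes the assumption easy to change if the actual rubric awards next-step points differently.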

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Time Spent on Diagnosis	We will compare how much time (in minutes) participants spend per case between the two study arms.	During evaluation
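
As a rough illustration of the comparison described above, the sketch below computes the mean minutes per case in each arm and their difference. The record does not specify a statistical method, so this is a descriptive summary only. The function name mean_minutes_by_arm, the arm labels, and all values are hypothetical and are not study data.

```python
from statistics import mean
from typing import Dict, List, Tuple


def mean_minutes_by_arm(records: List[Tuple[str, float]]) -> Dict[str, float]:
    """Group (arm, minutes) records by arm and return the mean per arm."""
    by_arm: Dict[str, List[float]] = {}
    for arm, minutes in records:
        by_arm.setdefault(arm, []).append(minutes)
    return {arm: mean(values) for arm, values in by_arm.items()}


# Made-up per-case times in minutes (illustrative only, not study data).
records = [
    ("gpt4", 11.0), ("gpt4", 13.0),
    ("usual", 12.0), ("usual", 14.0), ("usual", 16.0),
]
means = mean_minutes_by_arm(records)
difference = means["gpt4"] - means["usual"]  # negative means less time with GPT-4
print(means, difference)
```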

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Stanford University

Collaborators

Beth Israel Deaconess Medical Center

University of Minnesota

Investigators

Principal Investigator: Jonathan H Chen, MD, PhD, Stanford University
Principal Investigator: Adam Rodman, MD, Beth Israel Deaconess Medical Center
Principal Investigator: Andrew Olson, MD, University of Minnesota

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

November 29, 2023

Primary Completion (Actual)

December 30, 2023

Study Completion (Actual)

December 30, 2023

Study Registration Dates

First Submitted

November 27, 2023

First Submitted That Met QC Criteria

November 27, 2023

First Posted (Actual)

December 6, 2023

Study Record Updates

Last Update Posted (Actual)

February 20, 2024

Last Update Submitted That Met QC Criteria

February 15, 2024

Last Verified

February 1, 2024

More Information

Terms related to this study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

71319

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.

Clinical Trials on Diagnosis

SuperSonic Imagine

Not yet recruiting

Improvement Image Quality for SuperSonic® MACH Ultrasound System (MACH IQ)

Diagnosis
Umraniye Education and Research Hospital

Completed

Uterine Artery Diastolic Notching & Apelin-13 and 36

Diagnosis

Turkey
Beytepe Murat Erdi Eker State Hospital

Completed

Effects of Selenium and Melatonin on Ocular Ischemic Syndrome

Anterior Segment Ischemia (Diagnosis)
Identifai Genetics

Recruiting

Identifai Genetics Analytic Validity Study - Compound Heterozygosity and Samples Collection

Genetics | Prenatal Diagnosis

United States
Danderyd Hospital

Recruiting

MEDECA - Markers in Early Detection of Cancer (MEDECA)

Cancer | Diagnosis

Sweden
University Hospital, Limoges
University Hospital, Bordeaux

Not yet recruiting

Echocardiographic Standards and Sub-Saharian Africans (SSA) Migrants' in New Aquitaine Region (NEMANA)

Diagnosis

France
Aisap LTD

Completed

Prospective Acquisition of Cardiac Ultrasound Images at the Point of Care (POCUS_ACQ)

Diagnosis

United States, Israel
University of Southern Denmark

Completed

Complaints and Presumptive Diagnosis in a Danish Emergency Department

Symptoms on Admission | Presumptive Diagnosis on Admission | Final Diagnosis

Denmark
Ankara Education and Research Hospital

Completed

Benign-Malign Differentiation of Axillary Lymph Nodes: The Role Of Superb Microvascular Imaging

Diagnosis | Axilla; Breast

Turkey
Seoul National University Hospital
Bayer

Completed

Added Value of Gadoxetic Acid-enhanced Liver MRI

Diagnosis | Hcc

Korea, Republic of

Clinical Trials on GPT-4

Stanford University
Beth Israel Deaconess Medical Center; University of Minnesota

Recruiting

Physician Reasoning on Management Cases With Large Language Models

Clinical Decision-making

United States
Wang Shalong
Central South University

Active, not recruiting

ChatGPT Helping Advance Training for Medical Students: A Study on Self-Directed Learning Enhancement (CHAT-MS)

Medical Education | Artificial Intelligence | Self-Directed Learning

China
Hoffmann-La Roche

Recruiting

A Study to Evaluate the Safety, Pharmacokinetics and Preliminary Anti-Tumor Activity of RO7227166 in Combination With Obinutuzumab and in Combination With Glofitamab Following a Pre-Treatment Dose of Obinutuzumab Administered in Participants With Relapsed/Refractory B-Cell Non-Hodgkin's Lymphoma

Lymphoma, Non-Hodgkin

United States, Belgium, Australia, Denmark, Italy, Spain, France, United Kingdom
Taipei Veterans General Hospital, Taiwan

Recruiting

Evaluating the Role of ChatGPT in Educating Patients With Early-stage Hepatocellular Carcinoma

Carcinoma, Hepatocellular

Taiwan
Pharma Holdings AS
CTC Clinical Trial Consultants AB

Completed

Study to Evaluate the Efficacy, Safety and Tolerability of 3% LTX-109 for Nasal Decolonisation of Staphylococcus Aureus

Nasal Decolonization of Staphylococcus Aureus

Sweden
Janssen Research & Development, LLC

Completed

A Study in Healthy Adults to Evaluate the Safety and Immunogenicity of Different Doses of JNJ-63871860

Healthy

United States
Maisonneuve-Rosemont Hospital

Completed

Inspiratory Support Improves Preoxygenation in Healthy Subjects

Healthy

Canada
University of Utah
Novartis

Withdrawn

BKM120 in Advanced, Metastatic, or Recurrent Endometrial Cancers (BKM120)

Endometrial Cancer

United States
Jeffrey A. Cohen, MD
Jacobus Pharmaceutical

Terminated

Controlled Trial of 3,4-Diaminopyridine (3-4DAP) in Lambert-Eaton Myasthenic Syndrome (LEMS) (3-4DAP)

Muscle Weakness

United States
Norwegian University of Science and Technology
St. Olavs Hospital

Completed

Interval Training and Resting Metabolism (NEAT)

Healthy Subjects

Norway
