Physician Reasoning on Management Cases With Large Language Models

September 26, 2024 updated by: Jonathan Chen, Stanford University

Management Reasoning With AI Chat Bots

This study will evaluate how access to GPT-4, a large language model, affects performance on case-based management reasoning tasks compared with traditional management decision support tools.

Study Overview

Status

Completed

Intervention / Treatment

Detailed Description

Artificial intelligence (AI) technologies, specifically advanced large language models like OpenAI's ChatGPT, have the potential to improve medical decision-making. Although ChatGPT-4 was not developed for medical applications, it has shown promise in various healthcare contexts, including medical note-writing, addressing patient inquiries, and facilitating medical consultation. However, little is known about how ChatGPT augments the clinical reasoning abilities of clinicians.

Clinical reasoning is a complex process involving pattern recognition, knowledge application, and probabilistic reasoning. Integrating AI tools like ChatGPT-4 into physician workflows could help reduce clinician workload and decrease the likelihood of mismanagement. However, ChatGPT-4 was not developed for clinical reasoning, nor has it been validated for this purpose. Further, it may produce misinformation, including convincing confabulations that may mislead clinicians. If clinicians misuse this tool, it may not improve reasoning and could even cause harm. Therefore, it is important to study how clinicians use large language models to augment clinical reasoning before they are routinely incorporated into patient care.

In this study, participants will be randomized to answer clinical management cases with or without access to ChatGPT-4. Each case has multiple components, and the participants will be asked to discuss their reasoning for each component. Answers will be graded by independent reviewers blinded to treatment assignment. A grading rubric was developed for each case by a panel of 4-7 expert discussants. Discussants independently developed a rubric for each case, and then any discrepancies were resolved through multiple rounds of discussions.

Study Type

Interventional

Enrollment (Actual)

92

Phase

  • Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Locations

    • California
      • Palo Alto, California, United States, 94304
        • Stanford University

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

  • Child
  • Adult
  • Older Adult

Accepts Healthy Volunteers

Yes

Description

Inclusion Criteria:

  • Participants must be licensed physicians and have completed at least post-graduate year 2 (PGY2) of medical training.
  • Training in internal medicine, family medicine, or emergency medicine.

Exclusion Criteria:

  • Not currently practicing clinically.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

  • Primary Purpose: Treatment
  • Allocation: Randomized
  • Interventional Model: Parallel Assignment
  • Masking: Single

Arms and Interventions

  • Active Comparator: GPT-4
    • Participant Group / Arm: Group will be given access to GPT-4.
    • Intervention / Treatment: OpenAI's GPT-4 large language model with chat interface.
  • No Intervention: Usual Resources
    • Participant Group / Arm: Group will not be given access to GPT-4 but will be encouraged to use any resources they wish besides large language models (UpToDate, DynaMed, Google, etc.).

What is the study measuring?

Primary Outcome Measures

  • Outcome Measure: Management Reasoning
  • Measure Description: Percent correct (range: 0 to 100) for each case.
  • Time Frame: Within one-hour study
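
The primary outcome is reported as a percent-correct score per case. The record does not describe how rubric points are converted to that score, so the following is only a minimal sketch under an assumed scheme: points awarded divided by points possible, scaled to 0 to 100. The function name `percent_correct` and the example point totals are hypothetical and are not study data.

```python
def percent_correct(points_awarded: float, points_possible: float) -> float:
    """Convert rubric points for one case to a 0-100 percent score.

    Assumption: the score is simply awarded / possible * 100. The study
    record does not state the actual scoring formula.
    """
    if points_possible <= 0:
        raise ValueError("points_possible must be positive")
    if not 0 <= points_awarded <= points_possible:
        raise ValueError("points_awarded must be between 0 and points_possible")
    return 100.0 * points_awarded / points_possible


# Hypothetical rubric totals (awarded, possible) for illustration only.
example_cases = {"case_1": (7, 10), "case_2": (12, 16), "case_3": (5, 5)}

# One percent score per case, matching the "for each case" wording above.
scores = {name: percent_correct(a, p) for name, (a, p) in example_cases.items()}

for name, score in scores.items():
    print(f"{name}: {score:.1f}%")
```

Keeping one score per case, rather than pooling points across cases, mirrors the outcome description; any aggregation across cases or participants would be a separate analysis choice not specified here.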

Secondary Outcome Measures

  • Outcome Measure: Time Spent on Management
  • Measure Description: Time (in minutes) participants spend per case, compared between the two study arms.
  • Time Frame: Within one-hour study

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Investigators

  • Principal Investigator: Jonathan H Chen, MD, PhD, Stanford University
  • Principal Investigator: Adam Rodman, MD, Beth Israel Deaconess Medical Center
  • Principal Investigator: Andrew Olson, MD, University of Minnesota

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

December 28, 2023

Primary Completion (Actual)

April 19, 2024

Study Completion (Actual)

April 19, 2024

Study Registration Dates

First Submitted

December 20, 2023

First Submitted That Met QC Criteria

January 5, 2024

First Posted (Actual)

January 17, 2024

Study Record Updates

Last Update Posted (Actual)

September 27, 2024

Last Update Submitted That Met QC Criteria

September 26, 2024

Last Verified

September 1, 2024

More Information

Terms related to this study

Other Study ID Numbers

  • 71319b

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

No

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.
