- Clinical Trial NCT06774612
The Impact of Large Language Models on Diagnostic Reasoning Among LLM-Trained Medical Doctors
Diagnostic Reasoning With and Without AI Support: A Randomized Controlled Trial of LLM-Trained Medical Doctors
Study Overview
Detailed Description
Diagnostic errors are a major source of preventable patient harm. Recent large language models (LLMs), particularly ChatGPT-4o, have shown promise in enhancing medical decision-making. However, little is known about their impact on the diagnostic reasoning of medical doctors (e.g., physicians and surgeons).
Diagnostic accuracy relies on complex clinical reasoning and careful evaluation of patient data. While AI assistance could potentially reduce errors and improve efficiency, ChatGPT-4o lacks medical validation and could introduce new risks through incorrect information generation (also known as hallucinations). To mitigate these risks, doctors need adequate training in understanding ChatGPT-4o's capabilities, limitations, and proper usage. Given these uncertainties and the importance of proper AI training, systematic evaluation is essential before clinical implementation.
This randomized study will assess whether ChatGPT-4o access improves LLM-trained medical doctors' diagnostic performance compared to conventional resources (e.g., textbooks, online medical databases) alone. All participating doctors will have completed at least a 10-hour training program covering ChatGPT-4o usage, prompt engineering techniques, and output evaluation strategies. Participants will provide differential diagnoses with supporting evidence and recommended next steps for clinical cases, with responses evaluated by blinded reviewers.
Study Type
Enrollment (Actual)
Phase
- Not Applicable
Contacts and Locations
Study Locations
- Punjab
  - Lahore, Punjab, Pakistan, 54792
    - Lahore University of Management Sciences
Participation Criteria
Eligibility Criteria
Ages Eligible for Study
- Child
- Adult
- Older Adult
Accepts Healthy Volunteers
Description
Inclusion Criteria:
- Full or Provisionally Registered Medical Practitioners with the Pakistan Medical and Dental Council (PMDC).
- Completed the Bachelor of Medicine, Bachelor of Surgery (MBBS) examination. The equivalent degree in the US and Canada is the Doctor of Medicine (MD).
- Completed a structured training program on the use of ChatGPT (or a comparable large language model) totaling at least 10 hours of instruction. The program must include hands-on practice with LLMs, specifically prompt engineering and content evaluation.
Exclusion Criteria:
- Any other Registered Medical Practitioners (Full or Provisional) with the PMDC (e.g., holders of a Bachelor of Dental Surgery, BDS).
Study Plan
How is the study designed?
Design Details
- Primary Purpose: Diagnostic
- Allocation: Randomized
- Interventional Model: Parallel Assignment
- Masking: None (Open Label)
Arms and Interventions
| Participant Group / Arm | Intervention / Treatment |
|---|---|
| Active Comparator: ChatGPT-4o. Group will be given access to ChatGPT-4o. | OpenAI's ChatGPT-4o large language model with chat interface. |
| No Intervention: Conventional resources. Group will not be given access to ChatGPT-4o but will be encouraged to use any resources they wish besides large language models (PubMed, Google without AI Overviews, etc.). | |
What is the study measuring?
Primary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Diagnostic reasoning | The primary outcome will be the percent correct for each case (range: 0 to 100). For each case, participants will be asked for their three top diagnoses, the findings from the case that support each diagnosis, and the findings from the case that oppose it. Participants will receive 1 point for each plausible diagnosis. Findings supporting the diagnosis and findings opposing the diagnosis will also be graded for correctness, with 1 point for partially correct and 2 points for completely correct responses. Participants will then be asked to name their top diagnosis, earning 1 point for a reasonable response and 2 points for the most correct response. Finally, participants will be asked to name up to 3 next steps to further evaluate the patient, with 1 point awarded for a partially correct response and 2 points for a completely correct response. The primary outcome will be compared at the case level between the randomized groups. | Assessed at a single time point for each case, during the scheduled diagnostic reasoning evaluation session, which takes place between 0-4 days after participant enrollment. |
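The rubric for the primary outcome can be illustrated with a short Python sketch. The record does not state the maximum score used as the denominator, nor whether supporting and opposing findings are graded once per diagnosis or once per case. The sketch assumes per-diagnosis grading and that each of the up to 3 next steps is graded separately, which gives a hypothetical maximum of 23 points. The names `DiagnosisScore`, `CaseScore`, and `percent_correct` are illustrative and do not come from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiagnosisScore:
    plausible: bool   # 1 point if the diagnosis is plausible
    supporting: int   # 0, 1 (partially correct) or 2 (completely correct)
    opposing: int     # 0, 1 (partially correct) or 2 (completely correct)


@dataclass
class CaseScore:
    diagnoses: List[DiagnosisScore]  # up to three diagnoses
    top_diagnosis: int               # 0, 1 (reasonable) or 2 (most correct)
    next_steps: List[int]            # up to three steps, each graded 0, 1 or 2


MAX_DIAGNOSES = 3
MAX_NEXT_STEPS = 3
# ASSUMPTION: the denominator is not given in the record. Here it is
# 3 diagnoses * (1 + 2 + 2) + 2 (top diagnosis) + 3 next steps * 2 = 23.
MAX_POINTS = MAX_DIAGNOSES * (1 + 2 + 2) + 2 + MAX_NEXT_STEPS * 2


def percent_correct(case: CaseScore) -> float:
    """Return the case score as a percentage between 0 and 100."""
    if len(case.diagnoses) > MAX_DIAGNOSES:
        raise ValueError("at most three diagnoses are graded")
    if len(case.next_steps) > MAX_NEXT_STEPS:
        raise ValueError("at most three next steps are graded")
    graded = [case.top_diagnosis, *case.next_steps]
    for d in case.diagnoses:
        graded.extend([d.supporting, d.opposing])
    if any(g not in (0, 1, 2) for g in graded):
        raise ValueError("graded items must be 0, 1 or 2")

    points = sum(int(d.plausible) + d.supporting + d.opposing
                 for d in case.diagnoses)
    points += case.top_diagnosis + sum(case.next_steps)
    return 100.0 * points / MAX_POINTS


# Example: one plausible diagnosis with complete supporting findings and
# partial opposing findings, a reasonable top diagnosis, and one fully
# correct next step: (1 + 2 + 1) + 1 + 2 = 7 of 23 points.
example = CaseScore(
    diagnoses=[DiagnosisScore(plausible=True, supporting=2, opposing=1)],
    top_diagnosis=1,
    next_steps=[2],
)
print(round(percent_correct(example), 1))
```

Under a different reading of the rubric (for example, findings graded once per case), only `MAX_POINTS` and the per-diagnosis sum would need to change.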
Secondary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Time Spent on Diagnosis | We will compare how much time (in seconds) participants spend per case between the two study arms. | Assessed at a single time point for each case, during the scheduled diagnostic reasoning evaluation session, which takes place between 0-4 days after participant enrollment. |
Collaborators and Investigators
Collaborators
Investigators
- Principal Investigator: Ihsan Ayyub Qazi, PhD, Lahore University of Management Sciences
- Principal Investigator: Muhammad Asadullah Khawaja, MBBS, King Edward Medical University
- Principal Investigator: Ayesha Ali, PhD, Lahore University of Management Sciences
Study record dates
Study Major Dates
Study Start (Actual)
Primary Completion (Actual)
Study Completion (Actual)
Study Registration Dates
First Submitted
First Submitted That Met QC Criteria
First Posted (Actual)
Study Record Updates
Last Update Posted (Actual)
Last Update Submitted That Met QC Criteria
Last Verified
More Information
Terms related to this study
Additional Relevant MeSH Terms
Other Study ID Numbers
- IRB-0342
Plan for Individual participant data (IPD)
Plan to Share Individual Participant Data (IPD)?
Drug and device information, study documents
Studies a U.S. FDA-regulated drug product
Studies a U.S. FDA-regulated device product
product manufactured in and exported from the U.S.