The Big Unknown: A Journey Into Generative AI's Transformative Effect on Medical Professions

January 21, 2026 updated by: Maastricht University

This is a parallel-group randomized controlled trial using a superiority framework. Clinical vignettes will be used to assess the impact of a large language model on the clinical reasoning of physicians. Quantitative analyses will be performed on graded vignette responses.

Study Overview

Status

Completed

Conditions

Intervention / Treatment

Other: GPT-4o

Detailed Description

This study is a multi-country, parallel-group randomized controlled trial designed to evaluate whether access to a large language model (LLM) improves physician clinical decision-making. The trial uses a superiority framework and compares physicians randomized to complete standardized clinical vignettes either with access to GPT-4o or without any AI assistance.

Clinical vignettes simulate common primary care conditions such as cardiovascular, respiratory, musculoskeletal, fatigue-related, and infectious diseases. Each vignette includes multiple steps in the clinical reasoning process, from initial history-taking to diagnosis, treatment, and follow-up. Physician responses are graded using rubrics developed from evidence-based, context-specific best-practice guidelines.

The study is conducted across three countries (Indonesia, Kenya, and the Netherlands) representing different income levels and health system contexts. The primary outcome is performance on clinical vignettes, defined as adherence to best-practice guidelines. Secondary objectives include examining cross-country variation in physician performance, variation in performance distributions, and the role of engagement with the LLM in shaping outcomes.

Study Type

Interventional

Enrollment (Actual)

249

Phase

Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Locations

Indonesia
- Jakarta, Indonesia
    - Universitas Indonesia
Kenya
- Nairobi, Kenya
    - Aga Khan University Hospital
Netherlands
- Maastricht, Netherlands
    - Maastricht University

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Child
Adult
Older Adult

Accepts Healthy Volunteers

Yes

Description

Inclusion Criteria:

Registered medical physicians
Training in internal or family medicine

Exclusion Criteria:

Not currently practicing clinically

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Diagnostic
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

2

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
No Intervention: Own Knowledge. Group will not be given access to GPT-4 or other online resources.	(none)
Active Comparator: GPT-4o. Group given GPT-4o access.	Other: GPT-4o. GPT-4o provided via an iFrame in the online Qualtrics environment.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Percentage Correct Score	Following Peabody et al. (2000), the primary outcome is a percentage correct score across all steps in a vignette. It is generated by dividing the weighted sum of rubric items assessed as present by the total number of rubric items possible in a vignette. Our expert panel will weight rubric items according to their relevance.	During Evaluation
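The primary outcome can be illustrated with a minimal Python sketch. It follows the wording of the measure description literally: the weights of the items marked present are summed and divided by the count of rubric items in the vignette. The registry entry does not say whether the denominator is itself weighted, so the unweighted count used here is an assumption. The `RubricItem` type, the function name, and the example rubric items and weights are hypothetical and are not taken from the study materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricItem:
    """One rubric item for a vignette (hypothetical structure)."""
    name: str
    weight: float   # relevance weight assigned by the expert panel
    present: bool   # True if reviewers judged the item present in the response


def percentage_correct(items: list[RubricItem]) -> float:
    """Weighted sum of present items divided by the number of possible items."""
    if not items:
        raise ValueError("a vignette needs at least one rubric item")
    weighted_present = sum(item.weight for item in items if item.present)
    # Assumption: the denominator is the plain item count, as the text reads.
    return 100.0 * weighted_present / len(items)


# Hypothetical vignette with four rubric items, three of them marked present.
example = [
    RubricItem("asks about onset of chest pain", 1.0, True),
    RubricItem("orders an ECG", 1.0, True),
    RubricItem("asks about family history", 0.5, True),
    RubricItem("schedules follow-up", 0.5, False),
]
score = percentage_correct(example)
print(f"Percentage correct: {score:.1f}%")
```

Under this reading, a response that contains every item scores 100% only if all weights equal 1; dividing by the sum of weights instead would be a different normalization that the registry text does not describe.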

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Quality Per Answer	This outcome is the average weight of rubric items assessed as present across vignettes. Because each item is assigned a weight (0.33, 0.5, or 1), the average weight is the sum of the weights of items marked as present divided by the number of answers marked as present.	During Evaluation
Number of Answers	This outcome is the total number of answers assessed as present by reviewers per vignette.	During Evaluation
Less Obvious Answers	This outcome is the number of answers given that are less obvious, i.e. mentioned less frequently by the control group. An answer is considered less obvious if it is mentioned by 25% or less of the control group.	During Evaluation
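The three secondary outcomes above can be sketched in the same way. This is an illustrative Python sketch under stated assumptions, not the study's analysis code: the function names, the representation of answers as strings, and the example data are all hypothetical. The handling of a response with no present answers (returning 0.0 for quality per answer) is also an assumption, since the registry entry does not cover that case.

```python
def quality_per_answer(present_weights: list[float]) -> float:
    """Average weight of the answers marked present (weights are 0.33, 0.5, or 1)."""
    if not present_weights:
        return 0.0  # assumption: no present answers yields a quality of 0
    return sum(present_weights) / len(present_weights)


def number_of_answers(present_weights: list[float]) -> int:
    """Count of answers that reviewers assessed as present in one vignette."""
    return len(present_weights)


def less_obvious_count(
    answers: set[str],
    control_responses: list[set[str]],
    threshold: float = 0.25,
) -> int:
    """Count answers mentioned by `threshold` (25%) or less of the control group."""
    if not control_responses:
        raise ValueError("control group must contain at least one response")
    n_control = len(control_responses)
    count = 0
    for answer in answers:
        share = sum(answer in response for response in control_responses) / n_control
        if share <= threshold:
            count += 1
    return count


# Hypothetical control group of four physicians for one vignette.
control = [
    {"ecg", "aspirin"},
    {"ecg"},
    {"ecg", "troponin"},
    {"ecg", "aspirin"},
]
# Hypothetical participant: three answers marked present, with their weights.
participant_answers = {"ecg", "troponin", "d-dimer"}
participant_weights = [1.0, 0.5, 0.33]

quality = quality_per_answer(participant_weights)
n_answers = number_of_answers(participant_weights)
n_less_obvious = less_obvious_count(participant_answers, control)
print(quality, n_answers, n_less_obvious)
```

In this example "troponin" is mentioned by exactly 25% of the control group and "d-dimer" by none, so both count as less obvious, while "ecg" (mentioned by everyone) does not.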

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Maastricht University

Collaborators

Aga Khan University

University of Indonesia, Jakarta, Indonesia

Investigators

Principal Investigator: Mark Levels, PhD, Maastricht University

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

August 1, 2024

Primary Completion (Actual)

January 17, 2025

Study Completion (Actual)

January 17, 2025

Study Registration Dates

First Submitted

January 21, 2026

First Submitted That Met QC Criteria

January 21, 2026

First Posted (Actual)

January 29, 2026

Study Record Updates

Last Update Posted (Actual)

January 29, 2026

Last Update Submitted That Met QC Criteria

January 21, 2026

Last Verified

September 1, 2025

More Information

Terms related to this study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

ERCIC_572_25_04_2024

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product
