- Clinical Trial NCT06823765
Can Feedback From a Large Language Model Improve Health Care Quality?
A Pilot Study of an LLM Tool to Support Frontline Health Workers in Low-Resource Settings
The goal of this study is to learn if computer-assisted advice can help improve patient care in Nigerian health clinics. The main question it aims to answer is: does giving healthcare workers instant computer feedback help them make better decisions about patient care?
Researchers will compare patient care notes written by healthcare workers before and after they receive computer feedback to see if the feedback improves care quality. A doctor who doesn't know if feedback was given will review these notes.
Participants will:
- Be seen by a community healthcare worker who uses the computer feedback system
- Be treated by a fully trained medical doctor
- Get tested for malaria, anemia, or urinary tract infections if they have certain symptoms
Study Overview
Status
Conditions
Intervention / Treatment
Detailed Description
This project tests whether Large Language Models (LLMs) can improve patient care in Nigerian primary care clinics by giving customized and instant feedback to the provider in natural language. An LLM-based tool integrated into an electronic patient record management system provides "second opinions" to community health extension workers (CHEWs) at two clinics in Nigeria. These second opinions are intended to mirror what a reviewing physician might advise the CHEWs after seeing or hearing their initial report on a patient.
For the main analysis, this study employs a within-patient comparison of two patient notes created by the CHEW: one during the initial patient consultation, and one after the LLM feedback was received. The patient is also seen by a fully trained medical officer (MO) who is in charge of patient care. The MO conducts a blinded review of the CHEW's patient notes to measure changes in the CHEW's care as a result of the LLM feedback. The data come from the information captured in the electronic medical record (EMR) of the patient and from survey data collected from CHEWs, reviewing MOs, and a panel of reviewing Medical Doctors (MDs).
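The within-patient comparison described above can be illustrated with a small sketch. It assumes each patient contributes one pair of 0/1 error indicators, the first from the unassisted note and the second from the assisted note, and reports the mean paired difference. The function name, record layout, and toy data are hypothetical and are not taken from the study's analysis plan.

```python
from statistics import mean


def paired_error_change(pairs):
    """Mean within-patient change in a 0/1 error indicator.

    `pairs` holds one (unassisted_error, assisted_error) tuple per patient.
    A negative result means fewer errors in the assisted notes.
    """
    if not pairs:
        raise ValueError("need at least one patient")
    return mean(assisted - unassisted for unassisted, assisted in pairs)


# Hypothetical toy data, not study results.
toy_pairs = [(1, 0), (1, 1), (0, 0), (1, 0)]
print(paired_error_change(toy_pairs))
```

Because both notes belong to the same patient, patient-level characteristics cancel out in each paired difference, which is the appeal of the within-patient design.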
Study Type
Enrollment (Actual)
Phase
- Not Applicable
Contacts and Locations
Study Locations
Kano State
- EHA Clinics REACH Community Clinic, Gyadi Gyadi, Kano, Kano State, Nigeria
- EHA Clinics, 33 Lamido Crescent, Kano, Kano State, Nigeria
Participation Criteria
Eligibility Criteria
Ages Eligible for Study
- Child
- Adult
- Older Adult
Accepts Healthy Volunteers
Description
Inclusion Criteria:
- Patient is at the clinic for outpatient consultation
- Parent/guardian consent is required for individuals under 18
Exclusion Criteria:
- Patient requires emergency care
- Patient is at the clinic for a checkup (e.g. weight, blood pressure, follow-up after recovery)
- Patient is a trauma patient (visit is for an accident, wound, or injury)
- Patient is at the clinic for a scheduled procedure or a birth
Study Plan
How is the study designed?
Design Details
- Primary Purpose: Health Services Research
- Allocation: N/A
- Interventional Model: Single Group Assignment
- Masking: None (Open Label)
Arms and Interventions
| Participant Group / Arm | Intervention / Treatment |
|---|---|
| Experimental: Clinical Assessment with and without LLMs. The investigators employ a within-patient design. Patients receive two sequential assessments from a Community Health Extension Worker: first without and then with Large Language Model assistance. | A Large Language Model (LLM) integrated into the clinic's Electronic Medical Record system provides real-time feedback on patient assessments. Community Health Extension Workers first create a standard SOAP note, submit it to the LLM, and receive detailed feedback and key recommendations. They can then update their assessment based on this feedback. All final treatment decisions are made by Medical Officers who independently evaluate patients. |
What is the study measuring?
Primary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Indicator for an Error in the Treatment Plan (with the Potential for Harm) | During SOAP note evaluation, the MO is asked to indicate whether the treatment plan for the patient contains any errors, conditional on the MO's own diagnosis. This is coded as 1 if the MO indicates there is an error and 0 otherwise. The introductory text (here for SOAP Note A) is: Please evaluate whether the treatment in SOAP Note A is appropriate for this patient's condition. Please base this on your own diagnosis, not the CHEW's diagnosis in SOAP Note A. This is followed by the question: Is the treatment plan for the patient in SOAP Note A completely appropriate given your own diagnosis (accounting for conditional treatments based on medical tests)? Answer "No" if the patient should receive different medical care given your diagnosis. This can include both minor differences (for example, the patient should be advised to rest) and major errors (for example, the patient should receive a completely different set of medications). (Answer options: yes/no/unsure) | Through study completion, an average of six months |
| Indicator for an Error in the Treatment Plan that Causes a Loss of at least X Quality-Adjusted Life Days | This variable is coded as 1 if the MO indicates there is such an error and 0 otherwise. X is defined to be the highest benchmark on the appropriate DALY scale so that at least 5% of patients have an error that large in the unassisted SOAP note. In other words, severe errors are any errors that generate a harm rating at or above the 95th percentile of harm on the unassisted scale (pooling child and adult scales). | Through study completion, an average of six months |
| Indicator for the Better Treatment Plan (as Determined by the MOs) | Based on the DALY rating of SOAP Note A vs. B (counting instances with no errors as 0 DALY loss), the indicator is coded as 1 if the SOAP note has the better treatment plan (lower DALY loss) and 0 if MOs judge both notes to be the same in response to the following question: Are there any meaningful differences in the treatment plans of SOAP Note A and B? | Through study completion, an average of six months |
| Indicator for whether Treatment is Consistent with a Predetermined "Standard of Care" | At-risk patients receive malaria, anemia and UTI screening in accordance with certain demographic criteria. A dataset is then constructed with one observation for each (patient, screening test, note), up to six per patient. The indicator of treatment misallocation records whether a patient was incorrectly treated for a condition based on the test result or lack of symptoms. The variable is coded as 1 if the patient tested positive and either received inappropriate or no treatment. It is also coded as 1 if the patient tested negative or was not tested based on the symptom screen but received treatment for the condition. The variable is only coded as 0 if the patient tested negative and was correctly not treated for the corresponding condition, or if they tested positive and received the correct treatment. | Through study completion, an average of six months |
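Two of the coding rules in the primary outcomes can be sketched in code: the standard-of-care misallocation indicator and the severity cutoff X. This is a minimal illustration under stated assumptions, not the investigators' implementation. The function names, argument names, and benchmark values are hypothetical. The registry text does not say how an untested, untreated patient is coded, so the sketch returns `None` for that case, and it ignores the pooling of child and adult scales.

```python
from typing import Optional, Sequence


def misallocation(test_result: str, treated: bool,
                  treatment_appropriate: bool = False) -> Optional[int]:
    """Code the misallocation indicator for one (patient, test, note) row.

    test_result is "positive", "negative", or "not_tested" (hypothetical labels).
    """
    if test_result not in {"positive", "negative", "not_tested"}:
        raise ValueError(f"unknown test_result: {test_result!r}")
    if test_result == "positive":
        # Correct only if the patient was treated and the treatment was appropriate.
        return 0 if (treated and treatment_appropriate) else 1
    if treated:
        # Treated despite a negative test or a negative symptom screen.
        return 1
    # Untreated: 0 for a negative test; unspecified in the source if not tested.
    return 0 if test_result == "negative" else None


def severe_threshold(benchmarks: Sequence[float],
                     unassisted_losses: Sequence[float],
                     min_share: float = 0.05) -> Optional[float]:
    """Highest benchmark reached by at least `min_share` of unassisted notes."""
    n = len(unassisted_losses)
    if n == 0:
        return None
    eligible = [
        b for b in benchmarks
        if sum(loss >= b for loss in unassisted_losses) / n >= min_share
    ]
    return max(eligible) if eligible else None


# Hypothetical benchmark scale (in days) and toy ratings, not study data.
print(severe_threshold([1, 7, 30, 365], [0] * 8 + [7, 30]))
print(misallocation("negative", treated=True))
```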
Secondary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Indicators Denoting Diagnosis and Treatment Alignment Between CHEWs and MOs | For each medication in the CHEW's treatment plan, there is a "clinical indication" (the diagnosis associated with the drug) along with an indicator that specifies if a given prescription is conditional on a medical test result. The research team will consider three indicators of a match: | Through study completion, an average of six months |
| Alternative Indicators for Treatment Misallocation | The research team will construct the following indicators of treatment misallocation: | Through study completion, an average of six months |
| Relationship of QALY Loss to Severity of Patient Condition | In patients with only mild illnesses, the scope for QALY loss from mistakes may be limited relative to patients with more severe illnesses. With this in mind, QALY loss is regressed on indicators for mild, moderate, and severe illnesses (as assessed by the MO) each interacted with the assisted note indicator, controlling for patient fixed effects. Results will be shown graphically. | Through study completion, an average of six months |
| Indicators for the Appropriateness of Medical Testing Decisions | The potential misallocation of medical testing is operationalized in two ways: Combining these indicators, a mismatch occurs if and only if either: i) a test was not requested by the CHEW but was positive, or ii) the test was requested by the CHEW but the result was negative and no equivalent test was ordered by the MO. | Through study completion, an average of six months |
| Average and Distribution of DALY Lost | The effect of LLM assistance on DALYs lost is measured directly rather than indirectly (as with the probability of error, the probability of severe error, and which note is the better note). The full distribution of DALY ratings for the assisted and unassisted notes will also be shown in the results. | Through study completion, an average of six months |
| MO Evaluation of SOAP Notes: Deviations from the MO's SOAP | The MO is asked to assess for each SOAP note whether medical tests ordered were necessary or clinically useful, whether there are missing or incorrect/unnecessary diagnoses, and whether there are missing or incorrect/unnecessary treatment plan elements. | Through study completion, an average of six months |
| MO Evaluation of SOAP Notes: Types of Harm Incurred | The MO is asked to assess any short-term harm (additional symptoms or discomfort for some period), and any long-term serious harm (risk of impairment, death etc.) from the treatment plan in the SOAP note. | Through study completion, an average of six months |
| MO Evaluation of SOAP Notes: Measuring Healthy Time Lost in DALY | The MO also provides an overall rating that is intended to reflect the "healthy time lost" from any errors in treatment in the SOAP note. For each assessment and plan constructed by a CHEW (with or without LLM advice), an MO will assess the expected magnitude of healthy life that would be lost if the CHEW plan were implemented instead of the MO's plan. | Through study completion, an average of six months |
| MD Evaluation of CHEW and MO Notes: Flagging MO Error | In a first step, the MDs will review the MO notes only and record whether there is any error in the diagnosis or treatment proposed in the conditional note or in the final note. If an error is identified, the MDs will rate the error by severity to distinguish medical mistakes from differences in opinion about a patient who is not present. | Through study completion, an average of six months |
| MD Evaluation of CHEW and MO Notes: SOAP Note Rating | The MD is asked to assess any short-term harm (additional symptoms or discomfort for some period), and any long-term serious harm (risk of impairment, death etc.) from the treatment plan in the SOAP note. The MD also provides an overall rating that is intended to reflect the "healthy time lost" from any errors in treatment in the SOAP note. For each assessment and plan constructed by a CHEW (with or without LLM advice), an MO will assess the expected magnitude of healthy life that would be lost if the CHEW plan were implemented instead of the MO's plan. Healthy time is measured in units of disability-adjusted life years (DALYs), which reflect both length and quality of life. | Through study completion, an average of six months |
| MD Evaluation of CHEW and MO Notes: LLM Review | The MDs will also review the LLM feedback and answer the following questions: "Did the CHEW follow all, some, or none of the LLM recommendations?" If some or none: "Imagine the CHEW had followed all the recommendations of the LLM. Would the resulting treatment plan be an improvement over their assisted note?" (Yes/no) If yes: "Please explain." "Did the LLM make any mistakes?" (Yes/no) If yes: "Was any aspect of the CHEW's assisted treatment plan worse than the unassisted plan because the CHEW followed the LLM's erroneous recommendation?" If yes: "Please explain." | Through study completion, an average of six months |
| Indicator for the Appropriateness of Triage Decisions | For each (patient, note), an indicator records whether the CHEW triage decision (an intent to triage indicated in the SOAP note) and the MO suggested triage decision align. | Through study completion, an average of six months |
Collaborators and Investigators
Sponsor
Collaborators
Investigators
- Principal Investigator: Jason Abaluck, Yale University
Study record dates
Study Major Dates
Study Start (Actual)
Primary Completion (Actual)
Study Completion (Actual)
Study Registration Dates
First Submitted
First Submitted That Met QC Criteria
First Posted (Actual)
Study Record Updates
Last Update Posted (Actual)
Last Update Submitted That Met QC Criteria
Last Verified
More Information
Terms related to this study
Keywords
Other Study ID Numbers
- 2000035990
Plan for Individual participant data (IPD)
Plan to Share Individual Participant Data (IPD)?
IPD Plan Description
The following de-identified individual participant data (IPD) will be shared:
- Patient demographics and vitals
- Symptoms and clinical findings documented by CHEWs and MOs
- Test results (malaria, anemia, UTI)
- Treatment plans and prescriptions
- SOAP notes with and without LLM assistance from both CHEWs and MOs
- Provider assessments and DALY ratings
- Survey responses from CHEWs and MD panel reviews
IPD Sharing Time Frame
IPD Sharing Access Criteria
IPD Sharing Supporting Information Type
- STUDY_PROTOCOL
- SAP
- ICF
- ANALYTIC_CODE
- CSR
Drug and device information, study documents
Studies a U.S. FDA-regulated drug product
Studies a U.S. FDA-regulated device product
This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.