Clinical Language Evaluation With AI for Residents (CLEAR2)

October 28, 2025 updated by: Krislynn Michelle Mueck, The University of Texas Health Science Center, Houston

Clinical Language Evaluation With AI for Residents (CLEAR2) - A Pilot Randomized Controlled Trial

The purpose of this study is to refine and test an existing enterprise-grade large language model (LLM) based on generative artificial intelligence (AI); to assess the feasibility and acceptability of LLM-based feedback; to assess the ability of LLM-based feedback to improve residents' communication; to explore the ability of standardized patients to assess residents' communication; and to explore the ability of residents to self-assess the complexity of their communication.

Study Overview

Status

Not yet recruiting

Study Type

Interventional

Enrollment (Estimated)

64

Phase

  • Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Study Contact Backup

Study Locations

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

  • Adult

Accepts Healthy Volunteers

Yes

Description

Inclusion Criteria:

  • McGovern Medical School (MMS) general surgery residents
  • postgraduate year (PGY) 1-5

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

  • Primary Purpose: Other
  • Allocation: Randomized
  • Interventional Model: Parallel Assignment
  • Masking: None (Open Label)

Arms and Interventions

Participant Group / Arm
Intervention / Treatment
No Intervention: Control
Experimental: Educational LLM-based feedback tool
Participants will have their verbal communications with standardized patients (SPs) in 3 different scenarios recorded, transcribed, and analyzed in real time by the large language model (LLM), and will receive feedback in the form of suggestions and alternative scripts. Residents will review this feedback between SP scenarios.

What is the study measuring?

Primary Outcome Measures

Outcome Measure
Measure Description
Time Frame
Readability discernment as assessed by a survey
This will be scored by Cohen's kappa values from 1-5. Higher Cohen's kappa scores mean a better outcome.
Time Frame: end of intervention (1 hour after baseline)
Quality discernment as assessed by a survey
This will be scored by Cohen's kappa values from 1-5. Higher Cohen's kappa scores mean a better outcome.
Time Frame: end of intervention (1 hour after baseline)
Correctness of recommendations as assessed by a survey
This will be reported on a 5-point Likert scale from 1 (very incorrect) to 5 (very correct).
Time Frame: end of intervention (1 hour after baseline)
Applicability of recommendations as assessed by a survey
This will be reported on a 5-point Likert scale from 1 (very inapplicable) to 5 (very applicable).
Time Frame: end of intervention (1 hour after baseline)
Perceived readability of resident-standardized patient (SP) interactions as assessed by a survey: schooling level
This will be reported categorically as one of the following schooling levels: elementary, middle, high, college, graduate.
Time Frame: end of intervention (1 hour after baseline)
Confidence in communication ability
This is scored from 1 (very unconfident) to 5 (very confident).
Time Frame: end of intervention (1 hour after baseline)
Usefulness of the LLM
This is scored from 1 (very useless) to 5 (very useful).
Time Frame: end of intervention (1 hour after baseline)
Acceptability of future use
This is scored from 1 (very unlikely) to 5 (very likely).
Time Frame: end of intervention (1 hour after baseline)

Secondary Outcome Measures

Outcome Measure
Measure Description
Time Frame
Confidence in communication ability
This is scored from 1 (very unconfident) to 5 (very confident).
Time Frame: end of intervention (1 hour after baseline)
Survey feedback on the LLM interface
This is scored from 1 (very unrealistic) to 5 (very realistic).
Time Frame: end of intervention (1 hour after baseline)
Readability grade level of resident-SP transcripts as assessed by the Flesch-Kincaid Grade Level (FKGL) readability tool

Readability of resident-SP encounter transcripts will be assessed using the Flesch-Kincaid Grade Level formula, which estimates the U.S. school grade level required to understand the text. Higher scores indicate a higher reading grade level (i.e., lower readability).

Formula used:

Grade level = 0.39 × (total words / total sentences) + 11.8 × (total syllables / total words) - 15.59

Time Frame: end of intervention (1 hour after baseline)
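As an illustration of the Flesch-Kincaid Grade Level formula given for this outcome, the following is a minimal Python sketch. It is not the study's scoring tool. The function names (`fkgl_from_counts`, `count_syllables`, `fkgl`) are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for demonstration; dedicated readability tools count syllables more carefully and may give slightly different scores.

```python
import re


def fkgl_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Apply the FKGL formula from the record to pre-computed counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Assumed heuristic: count vowel groups, discount a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fkgl(text: str) -> float:
    """Estimate the FKGL of a transcript using simple regex tokenization."""
    # Sentences are split on runs of ., ! or ? (an assumption of this sketch).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return fkgl_from_counts(len(words), len(sentences), syllables)


if __name__ == "__main__":
    sample = "We will remove your gallbladder. You can go home tomorrow."
    print(round(fkgl(sample), 2))
```

For example, a transcript with 100 words, 5 sentences, and 150 syllables gives 0.39 × 20 + 11.8 × 1.5 - 15.59 = 9.91, roughly a tenth-grade reading level.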
Quality based on Ensuring Quality Information for Patients (EQIP) score of resident-SP transcripts
Percentage score based on a validated questionnaire. The questionnaire has 20 questions, each scored 1 (yes), 0.5 (partly), or 0 (no); a question is removed if it does not apply. Scores are reported as a percentage, and a higher percentage indicates better quality.
Time Frame: end of intervention (1 hour after baseline)
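The EQIP scoring described for this outcome can be sketched as follows. This is an illustrative example only: the record states the per-question scores (1, 0.5, 0) and that non-applicable questions are removed, while the exact percentage calculation used here (total score divided by the number of applicable questions) is an assumption, and the names `SCORES` and `eqip_percentage` are hypothetical.

```python
from typing import Optional, Sequence

# Per-question scores as stated in the record.
SCORES = {"yes": 1.0, "partly": 0.5, "no": 0.0}


def eqip_percentage(answers: Sequence[Optional[str]]) -> float:
    """Return the EQIP score as a percentage.

    Each answer is "yes", "partly", or "no"; None marks a question that
    does not apply and is removed from the denominator (assumed here).
    """
    applicable = [a for a in answers if a is not None]
    if not applicable:
        raise ValueError("at least one question must be applicable")
    total = sum(SCORES[a] for a in applicable)
    return 100.0 * total / len(applicable)


if __name__ == "__main__":
    # 20 questions: 10 yes, 4 partly, 4 no, 2 not applicable.
    answers = ["yes"] * 10 + ["partly"] * 4 + ["no"] * 4 + [None] * 2
    print(round(eqip_percentage(answers), 1))
```

In this example, 18 questions apply and the total score is 12, so the transcript scores 12 / 18, or about 66.7%.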
Perceived readability of SP-resident interactions as assessed by a standardized survey
This will be reported categorically as one of the following schooling levels: elementary, middle, high, college, graduate.
Time Frame: end of intervention (1 hour after baseline)

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Investigators

  • Principal Investigator: Krislynn M Mueck, MD, MS, MPH, The University of Texas Health Science Center, Houston

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Estimated)

October 23, 2025

Primary Completion (Estimated)

March 26, 2026

Study Completion (Estimated)

May 28, 2026

Study Registration Dates

First Submitted

October 28, 2025

First Submitted That Met QC Criteria

October 28, 2025

First Posted (Estimated)

October 30, 2025

Study Record Updates

Last Update Posted (Estimated)

October 30, 2025

Last Update Submitted That Met QC Criteria

October 28, 2025

Last Verified

October 1, 2025

More Information

Terms related to this study

Other Study ID Numbers

  • HSC-MS-25-0920

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

NO

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.
