- Clinical Trial NCT07491068
The Accuracy and Efficacy of Large Language Model Written Hospital Course Summaries (CLEAN)
March 25, 2026 updated by: Martin Janičko, Pavol Jozef Safarik University
Safety and Workflow Impact of Large Language Model-Assisted Hospital Course Summaries: Protocol for a Randomized, Evaluator-Blinded Non-Inferiority Trial
Background: Physicians worldwide face an increasing administrative burden that diverts time from direct patient care.
Among inpatient documentation tasks, authoring hospital course summaries is particularly time-consuming and critical for safe care transitions.
Large language models (LLMs) have shown promise for clinical text generation; however, robust evidence from randomized, evaluator-blinded trials conducted in routine hospital practice remains limited.
Objectives: The CLEAN study aims to evaluate whether LLM-assisted, specialist-edited generation of hospital course summaries is non-inferior in safety compared with standard clinician-written documentation in routine inpatient care.
Secondary objectives include non-inferiority assessments of resident-edited and unedited LLM-generated summaries.
Additional objectives are to evaluate summary quality across predefined domains, quantify physician documentation time, assess LLM generation stability, measure clinician adoption following the randomized phase, and examine inter-observer, intra-observer, and test-retest reliability of expert assessments.
Methods: This is a single-centre, double-campus, exploratory randomized controlled non-inferiority trial conducted at a tertiary university hospital.
Consecutive hospital discharges across multiple clinical departments are randomized 1:1 to either an LLM-assisted documentation workflow or standard manual authorship.
The intervention integrates an on-premise LLM into a parallel hospital information system, generating draft hospital course summaries from complete, uncurated clinical documentation, which physicians may review and edit prior to finalization.
Safety, the primary outcome, is defined as the presence of all important information and the absence of incorrect or hallucinated information; it is assessed by an adjudication committee blinded to the documentation workflow.
Secondary outcomes include content validity, workflow efficiency, generation stability, post-trial clinician adoption, and reliability metrics.
A total of 786 discharge episodes are required to assess non-inferiority using a predefined margin of 5 percentage points.
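As a rough illustration of how a sample size for a non-inferiority comparison of two proportions can be derived, the sketch below applies the standard normal-approximation formula. This record states only the 5-percentage-point margin and the total of 786 episodes. The expected proportion of safe summaries (0.95 in both arms), the one-sided significance level (0.025), the power (0.90), the treatment of safety as a binary proportion, and the function name `noninferiority_n_per_arm` are all assumptions made for illustration, so the result is not expected to reproduce 786.

```python
from math import ceil
from statistics import NormalDist


def noninferiority_n_per_arm(p_control, p_treatment, margin,
                             alpha=0.025, power=0.90):
    """Approximate per-arm sample size for a non-inferiority comparison of
    two proportions (normal approximation, 1:1 allocation).

    All default values are illustrative assumptions, not protocol values.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    # Distance between the expected difference and the non-inferiority boundary.
    effect = (p_treatment - p_control) + margin
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical inputs: 95% of summaries rated safe in both arms.
n_per_arm = noninferiority_n_per_arm(0.95, 0.95, margin=0.05)
print(n_per_arm, 2 * n_per_arm)
```

Changing the assumed proportions, power, or an allowance for unevaluable summaries shifts the total noticeably, which is why the protocol's own assumptions would be needed to recover the registered figure.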
Ethics and Dissemination: The study will be conducted in accordance with the Declaration of Helsinki, Good Clinical Practice, and the General Data Protection Regulation.
A waiver of informed consent is sought due to minimal risk and exclusive use of routine clinical data.
Results will be disseminated through peer-reviewed publication and engagement with healthcare stakeholders.
Study Overview
Status
Not yet recruiting
Conditions
Intervention / Treatment
Detailed Description
The World Health Organization has identified a severe global health workforce crisis, estimating a shortage of approximately 12.7 million physicians worldwide in 2020.
Projections indicate that this deficit will continue to worsen in the coming years [1].
Furthermore, these shortages occur alongside rising healthcare demand driven by population ageing and the increasing prevalence of multimorbidity.
Even so, administrative burden remains a persistent and systemic challenge across healthcare systems worldwide.
Initiatives aimed at reducing physicians' non-clinical workload have achieved only limited success, leading to the continued diversion of clinical expertise away from direct patient care [2].
Among administrative responsibilities, the preparation of discharge letters is one of the most time-consuming tasks for physicians caring for hospitalized patients, yet it is also central to ensuring continuity of care after discharge.
Despite the substantial time they consume, discharge summaries frequently fail to meet expected standards, with many physicians reporting shortcomings in both their completeness and quality [3], [4].
The adoption of electronic medical records has prompted efforts to alleviate this burden by transitioning from fully manual authorship toward partial automation, enabling selected sections of discharge letters to be generated automatically.
However, the hospital course, the most clinically informative component of the discharge letter, remains almost universally dependent on manual narrative documentation by the treating physician.
B. Objectives
The primary objective of this study is to evaluate whether LLM-assisted, specialist-edited generation of hospital course summaries is safe for use in routine inpatient care, compared with standard clinician-written documentation.
Secondary objectives include evaluating the safety of resident-edited and unedited LLM-generated hospital course summaries.
Additional objectives are to assess the quality of LLM-assisted summaries across predefined domains, evaluate the stability of generated outputs, quantify the impact of AI-assisted workflows on physician documentation time, examine clinician adoption of LLM-supported documentation following completion of the randomized phase, and assess the reliability of evaluator ratings.
Collectively, these objectives aim to determine the safety and feasibility of integrating LLM-based narrative generation into real-world hospital discharge workflows.
Study Type
Interventional
Enrollment (Estimated)
786
Phase
- Not Applicable
Contacts and Locations
This section provides the contact details for those conducting the study, and information on where this study is being conducted.
Study Contact
- Name: Jakub Gazda, MD, PhD
- Phone Number: +421556403517
- Email: jakub.gazda@upjs.sk
Study Contact Backup
- Name: Martin Janicko, MD, PhD
- Phone Number: +421556403527
- Email: martin.janicko@upjs.sk
Study Locations
- Louis Pasteur University Hospital Kosice, Košice, Košice Region, Slovakia, 04001
Participation Criteria
Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.
Eligibility Criteria
Ages Eligible for Study
- Adult
- Older Adult
Accepts Healthy Volunteers
No
Description
Inclusion Criteria:
- All consecutively discharged patients, including those who died during hospitalization, will be eligible for inclusion.
Exclusion Criteria:
- No exclusion criteria
Study Plan
This section provides details of the study plan, including how the study is designed and what the study is measuring.
How is the study designed?
Design Details
- Primary Purpose: Health Services Research
- Allocation: Randomized
- Interventional Model: Parallel Assignment
- Masking: Single
Arms and Interventions
| Participant Group / Arm | Intervention / Treatment |
|---|---|
| Experimental: LLM assisted | The intervention consists of an LLM-assisted workflow for generating the hospital course summary at discharge. The treating physician initiates discharge using an application, CorteVision Hospital Suite, connected to the hospital Informix database. The model's output, a generated draft, is returned to the application interface, where the treating physician reviews it and may edit, correct, expand, or shorten the text before finalization. The finalized hospital course summary is entered into the medical record only after physician review and confirmation. |
| No Intervention: Control. Standard manual authorship of the hospital course summary by the responsible physician. | |
What is the study measuring?
Primary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Safety assessed by the outcome evaluator on an ordinal scale (1, 2, 3) | Safety will be assessed on a three-category ordinal scale ranging from "acceptable without further changes", through "acceptable with minor revisions", to "unacceptable in its current form" (absence of important information or presence of incorrect/hallucinated information). | Assessment at one time point: hospital discharge (up to 5 days) |
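One common way to judge non-inferiority against a 5-percentage-point margin is to compare a confidence bound on the difference in proportions with the margin. The sketch below is only an illustration of that idea: this record does not describe the statistical analysis, so the dichotomization of the ordinal rating into "safe" versus "not safe", the Wald interval, the one-sided level of 0.025, the function name `noninferiority_check`, and the counts in the example are all assumptions.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_check(safe_llm, n_llm, safe_ctrl, n_ctrl,
                         margin=0.05, alpha=0.025):
    """Return (difference, lower_bound, is_noninferior) for the difference in
    the proportion of safe summaries, LLM-assisted minus control.

    Uses a one-sided Wald confidence bound; the analysis method is an
    assumption, not taken from the protocol.
    """
    p_llm = safe_llm / n_llm
    p_ctrl = safe_ctrl / n_ctrl
    diff = p_llm - p_ctrl
    se = sqrt(p_llm * (1 - p_llm) / n_llm + p_ctrl * (1 - p_ctrl) / n_ctrl)
    lower = diff - NormalDist().inv_cdf(1 - alpha) * se
    # Non-inferior if the lower bound stays above the negative margin.
    return diff, lower, lower > -margin


# Hypothetical counts, for illustration only.
diff, lower, ok = noninferiority_check(370, 393, 373, 393)
print(f"difference={diff:.4f}, lower bound={lower:.4f}, non-inferior={ok}")
```

With small samples or proportions close to 1, a score-based interval such as Newcombe's would usually be preferred over the Wald bound used here.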
Secondary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Safety of the edited and unedited LLM-generated course summaries evaluated on an ordinal scale (1, 2, 3) | Safety of LLM-generated, resident-edited and unedited hospital course summaries compared with standard specialist-written documentation. Safety will be assessed on a three-category ordinal scale ranging from "acceptable without further changes" (best outcome), through "acceptable with minor revisions" (uncertain outcome), to "unacceptable in its current form" (absence of important information or presence of incorrect/hallucinated information; worst outcome). | At hospital discharge (up to 5 days) |
| Content validity across predefined domains (5C) | Detailed evaluation of quality across the following predefined domains (5C): completeness, conciseness, cohesiveness, absence of critical errors (including hallucinations), and cultural and linguistic fidelity. Each domain will be assessed separately on a 5-point Likert scale (1 = complete absence of agreement, 2 = partial absence of agreement, 3 = neutral, 4 = partial agreement, 5 = complete agreement). | At hospital discharge (up to 5 days) |
| Time to complete hospital summary (seconds) | Workflow-related secondary outcomes include the time efficiency of hospital course documentation with the LLM-assisted workflow compared with standard practice. The outcome measure is the time taken to complete the hospital course summary (seconds), from initiating discharge to "ready for signing". | At hospital discharge (up to 5 days) |
| Generation stability assessed on an ordinal scale (1, 2, 3) | Generation stability will assess the consistency of unedited LLM-generated hospital course summaries when the model is applied repeatedly to identical source clinical documentation. It will be evaluated by the same adjudicator on a scale of 1 = stable, 2 = acceptable variation, 3 = unacceptable variation. | From hospital discharge to 30 days after hospital discharge |
| Adoption: percentage of hospital discharges where the LLM-generated course summary was utilized | After completion of the randomized controlled trial, adoption of the LLM-assisted workflow will be evaluated over a one-month period. The outcome measure is the proportion of eligible discharges for which LLM-derived summary generation was utilized. | From hospital discharge to 30 days after hospital discharge |
| Inter-rater reliability: percentage agreement between two adjudicators of a hospital course summary on safety and content validity ratings | Reliability-related outcomes will assess the consistency of expert evaluations of hospital course summaries. Percentage agreement will quantify the concordance between evaluators on safety and content validity ratings. | From hospital discharge to 30 days after hospital discharge |
| Intra-rater reliability: concordance of safety and content validity ratings between two evaluations of the same hospital course by the same adjudicator | Intra-rater reliability will quantify the consistency of repeated evaluations by the same evaluator, based on the concordance of safety and content validity ratings (percentage agreement). | At hospital discharge (up to 5 days) |
| Temporal stability (test-retest reliability): concordance between the ratings of the same adjudicator after a washout period | Temporal stability of evaluator judgments will be assessed through repeated evaluation of identical hospital course summaries by the same evaluator after a predefined 30-day washout period (test-retest reliability). Outcome measures will be based on the concordance of safety and content validity ratings (percentage agreement). | 30 days after hospital discharge |
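The reliability outcomes above are all expressed as percentage agreement between two sets of ratings of the same summaries. A minimal sketch of that calculation follows; the function name `percent_agreement` and the example ratings are hypothetical and serve only to show the arithmetic.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of items on which two rating sequences are identical."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


# Hypothetical safety ratings (1, 2, 3) for eight summaries.
first_rating = [1, 1, 2, 3, 1, 2, 1, 1]
second_rating = [1, 2, 2, 3, 1, 1, 1, 1]
agreement = percent_agreement(first_rating, second_rating)
print(f"{agreement:.1f}% agreement")
```

The same function covers the inter-rater case (two adjudicators), the intra-rater case (one adjudicator, two evaluations), and the test-retest case (one adjudicator, before and after the washout period). Raw percentage agreement does not correct for agreement expected by chance.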
Collaborators and Investigators
This is where you will find people and organizations involved with this study.
Sponsor
Collaborators
Investigators
- Principal Investigator: Jakub Gazda, MD, PhD, Pavol Jozef Safarik University
Publications and helpful links
The person responsible for entering information about the study voluntarily provides these publications. These may be about anything related to the study.
General Publications
- Kripalani S, LeFevre F, Phillips CO, Williams MV, Basaviah P, Baker DW. Deficits in communication and information transfer between hospital-based and primary care physicians: implications for patient safety and continuity of care. JAMA. 2007 Feb 28;297(8):831-41. doi: 10.1001/jama.297.8.831.
- Ganzinger M, Kunz N, Fuchs P, Lyu CK, Loos M, Dugas M, Pausch TM. Automated generation of discharge summaries: leveraging large language models with clinical data. Sci Rep. 2025 May 12;15(1):16466. doi: 10.1038/s41598-025-01618-7.
- Burden M, Astik G, Auerbach A, Bowling G, Kangelaris KN, Keniston A, Kochar A, Leykum LK, Linker AS, Sakumoto M, Rogers K, Schwatka N, Westergaard S. Identifying and Measuring Administrative Harms Experienced by Hospitalists and Administrative Leaders. JAMA Intern Med. 2024 Sep 1;184(9):1014-1023. doi: 10.1001/jamainternmed.2024.1890.
Study record dates
These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.
Study Major Dates
Study Start (Estimated)
June 1, 2026
Primary Completion (Estimated)
December 31, 2026
Study Completion (Estimated)
December 31, 2027
Study Registration Dates
First Submitted
February 25, 2026
First Submitted That Met QC Criteria
March 19, 2026
First Posted (Actual)
March 24, 2026
Study Record Updates
Last Update Posted (Actual)
March 30, 2026
Last Update Submitted That Met QC Criteria
March 25, 2026
Last Verified
March 1, 2026
More Information
Terms related to this study
Keywords
Other Study ID Numbers
- 2IK2026_1
Plan for Individual participant data (IPD)
Plan to Share Individual Participant Data (IPD)?
UNDECIDED
Drug and device information, study documents
Studies a U.S. FDA-regulated drug product
No
Studies a U.S. FDA-regulated device product
No