The Accuracy and Efficacy of Large Language Model Written Hospital Course Summaries (CLEAN)

March 25, 2026 updated by: Martin Janičko, Pavol Jozef Safarik University

Safety and Workflow Impact of Large Language Model-Assisted Hospital Course Summaries: Protocol for a Randomized, Evaluator-Blinded Non-Inferiority Trial

Background: Physicians worldwide face an increasing administrative burden that diverts time from direct patient care. Among inpatient documentation tasks, authoring hospital course summaries is particularly time-consuming and critical for safe care transitions. Large language models (LLMs) have shown promise for clinical text generation; however, robust evidence from randomized, evaluator-blinded trials conducted in routine hospital practice remains limited.

Objectives: The CLEAN study aims to evaluate whether LLM-assisted, specialist-edited generation of hospital course summaries is non-inferior in safety compared with standard clinician-written documentation in routine inpatient care. Secondary objectives include non-inferiority assessments of resident-edited and unedited LLM-generated summaries. Additional objectives are to evaluate summary quality across predefined domains, quantify physician documentation time, assess LLM generation stability, measure clinician adoption following the randomized phase, and examine inter-observer, intra-observer, and test-retest reliability of expert assessments.

Methods: This is a single-centre, double-campus, exploratory randomized controlled non-inferiority trial conducted at a tertiary university hospital. Consecutive hospital discharges across multiple clinical departments are randomized 1:1 to either an LLM-assisted documentation workflow or standard manual authorship. The intervention integrates an on-premise LLM into a parallel hospital information system, generating draft hospital course summaries from complete, uncurated clinical documentation, which physicians may review and edit prior to finalization. Safety, the primary outcome, is defined as the presence of all important information and the absence of incorrect or hallucinated information, and is assessed by an adjudication committee blinded to the documentation workflow. Secondary outcomes include content validity, workflow efficiency, generation stability, post-trial clinician adoption, and reliability metrics. A total of 786 discharge episodes are required to assess non-inferiority using a predefined margin of 5 percentage points.

Ethics and Dissemination: The study will be conducted in accordance with the Declaration of Helsinki, Good Clinical Practice, and the General Data Protection Regulation. A waiver of informed consent is sought due to minimal risk and exclusive use of routine clinical data. Results will be disseminated through peer-reviewed publication and engagement with healthcare stakeholders.
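To illustrate how a non-inferiority margin of 5 percentage points can be applied, the following is a minimal Python sketch, not the study's analysis plan. It assumes that the ordinal safety rating is dichotomized into "unacceptable" versus not, that a one-sided alpha of 0.025 is used, and that the comparison relies on a Wald confidence bound for the difference in proportions; none of these choices is stated in the record. The function name and all counts are hypothetical, and the 393 per arm simply reflects 786 episodes split 1:1.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_check(unsafe_llm, n_llm, unsafe_ctrl, n_ctrl,
                         margin=0.05, alpha=0.025):
    """Return (difference, upper confidence bound, non-inferior?).

    Assumption: the outcome is the proportion of summaries rated
    "unacceptable", so a higher proportion is worse, and the LLM arm is
    non-inferior when the upper bound of (p_llm - p_ctrl) is below the margin.
    """
    p_llm = unsafe_llm / n_llm
    p_ctrl = unsafe_ctrl / n_ctrl
    diff = p_llm - p_ctrl
    # Wald standard error for a difference of two independent proportions
    se = sqrt(p_llm * (1 - p_llm) / n_llm + p_ctrl * (1 - p_ctrl) / n_ctrl)
    z = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    upper = diff + z * se
    return diff, upper, upper < margin


# Hypothetical counts: 20 unacceptable summaries out of 393 in each arm
diff, upper, ok = noninferiority_check(20, 393, 20, 393)
print(f"difference={diff:.3f}, upper bound={upper:.3f}, non-inferior={ok}")
```

With equal event counts in both arms the difference is zero and the conclusion depends only on the width of the confidence bound relative to the margin.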

Study Overview

Status

Not yet recruiting

Conditions

Detailed Description

The World Health Organization has identified a severe global health workforce crisis, estimating a shortage of approximately 12.7 million physicians worldwide in 2020. Projections indicate that this deficit will continue to worsen in the coming years [1]. Furthermore, these shortages occur alongside rising healthcare demand driven by population ageing and the increasing prevalence of multimorbidity. Despite this, administrative burden remains a persistent and systemic challenge across healthcare systems worldwide. Initiatives aimed at reducing physicians' non-clinical workload have achieved only limited success, leading to the continued diversion of clinical expertise away from direct patient care [2].

Among administrative responsibilities, the preparation of discharge letters is one of the most time-consuming tasks for physicians caring for hospitalized patients, yet it is also central to ensuring continuity of care after discharge. Despite the substantial time invested, discharge summaries frequently fail to meet expected standards, with many physicians reporting shortcomings in both their completeness and quality [3], [4]. The adoption of electronic medical records has prompted efforts to alleviate this burden by transitioning from fully manual authorship toward partial automation, enabling selected sections of discharge letters to be generated automatically. However, the hospital course, the most clinically informative component of the discharge letter, remains almost universally dependent on manual narrative documentation by the treating physician.

Objectives: The primary objective of this study is to evaluate whether LLM-assisted, specialist-edited generation of hospital course summaries is safe for use in routine inpatient care, compared with standard clinician-written documentation. Secondary objectives include evaluating the safety of resident-edited and unedited LLM-generated hospital course summaries. Additional objectives are to assess the quality of LLM-assisted summaries across predefined domains, evaluate the stability of generated outputs, quantify the impact of AI-assisted workflows on physician documentation time, examine clinician adoption of LLM-supported documentation following completion of the randomized phase, and assess the reliability of evaluator ratings. Collectively, these objectives aim to determine the safety and feasibility of integrating LLM-based narrative generation into real-world hospital discharge workflows.

Study Type

Interventional

Enrollment (Estimated)

786

Phase

  • Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Study Contact Backup

Study Locations

    • Košice Region
      • Košice, Košice Region, Slovakia, 04001
        • Louis Pasteur University Hospital Kosice

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

  • Adult
  • Older Adult

Accepts Healthy Volunteers

No

Description

Inclusion Criteria:

- All consecutively discharged patients, including those who died during hospitalization, will be eligible for inclusion.

Exclusion Criteria:

  • No exclusion criteria

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

  • Primary Purpose: Health Services Research
  • Allocation: Randomized
  • Interventional Model: Parallel Assignment
  • Masking: Single

Arms and Interventions

Participant Group / Arm
Intervention / Treatment
Experimental: LLM assisted
The intervention consists of an LLM-assisted workflow for generating the hospital course summary at discharge. The treating physician initiates discharge using an application, CorteVision Hospital Suite, connected to the hospital Informix database. The model's output, a generated draft, is returned to the application interface, where the treating physician reviews it and may edit, correct, expand, or shorten the text before finalization. The finalized hospital course summary is entered into the medical record only after physician review and confirmation.
No Intervention: Control
Standard manual authorship of the hospital course summary by the responsible physician.

What is the study measuring?

Primary Outcome Measures

Outcome Measure
Measure Description
Time Frame
Safety assessed by outcome evaluator on an ordinal scale (1,2,3)
Time Frame: Assessment at one time point - hospital discharge (up to 5 days)
Safety will be assessed using an ordinal scale ranging from "acceptable without further changes", through "acceptable with minor revisions", to "unacceptable in its current form" (absence of important information or presence of incorrect/hallucinated information).

Secondary Outcome Measures

Outcome Measure
Measure Description
Time Frame
Safety of the edited and unedited LLM-generated course summaries evaluated on an ordinal scale (1,2,3)
Time Frame: At hospital discharge (up to 5 days)
Safety of LLM-generated, resident-edited and unedited hospital course summaries compared with standard specialist-written documentation. Safety will be assessed using a three-category ordinal scale ranging from "acceptable without further changes" (best outcome), through "acceptable with minor revisions" (uncertain outcome), to "unacceptable in its current form" (absence of important information or presence of incorrect/hallucinated information; worst outcome).
Content validity across five predefined domains (5C)
Time Frame: At hospital discharge (up to 5 days)
Detailed evaluation of quality across the following predefined domains (5C): completeness, conciseness, cohesiveness, absence of critical errors (including hallucinations), and cultural and linguistic fidelity. Each domain will be assessed separately using a 5-point Likert scale (1 - complete absence of agreement, 2 - partial absence of agreement, 3 - neutral agreement, 4 - partial agreement, 5 - complete agreement).
Time to complete hospital summary (seconds)
Time Frame: At hospital discharge (up to 5 days)
Workflow-related secondary outcomes include the time efficiency of hospital course documentation when using the LLM-assisted workflow compared with standard practice. The outcome measure will be the time taken to complete the hospital course summary (seconds), from initiating discharge to "ready for signing".
Generation stability assessed on an ordinal scale (1,2,3)
Time Frame: From hospital discharge to 30 days after hospital discharge
Generation stability will assess the consistency of unedited LLM-generated hospital course summaries when the model is applied repeatedly to identical source clinical documentation. Generation stability will be evaluated by the same adjudicator on a scale of 1 - stable, 2 - acceptable variation, 3 - unacceptable variation.
Adoption. Percentage of hospital discharges where the LLM generated course summary was utilized.
Time Frame: From hospital discharge to 30 days after hospital discharge
After completing the randomized controlled trial, the adoption of the LLM-generated workflow will be evaluated over a one-month period. The outcome measure will be defined as the proportion of eligible discharges for which LLM-derived summary generation was utilized.
Inter-rater reliability. Percentage agreement between two adjudicators of a hospital course summary on safety and content validity ratings.
Time Frame: From hospital discharge to 30 days after hospital discharge
Reliability-related outcomes will assess the consistency of expert evaluations of hospital course summaries. Percentage agreement will quantify the concordance between evaluators on safety and content validity ratings.
Intra-rater reliability. Concordance of safety and content validity ratings between two evaluations of the same hospital course summary by the same adjudicator.
Time Frame: At hospital discharge (up to 5 days)
Intra-rater reliability will quantify the consistency of repeated evaluations by the same evaluator, based on the concordance of safety and content validity ratings (percentage agreement).
Temporal stability (test-retest reliability). Concordance between the ratings of the same adjudicator after a washout period.
Time Frame: 30 days after hospital discharge
Temporal stability of evaluator judgments will be assessed through repeated evaluation of identical hospital course summaries by the same evaluator after a predefined 30-day washout period (test-retest reliability). Outcome measures will be based on the concordance of safety and content validity ratings (percentage agreement).
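The percentage-agreement measure named for the inter-rater, intra-rater, and test-retest outcomes above can be sketched in a few lines of Python. This is an illustrative example only: the function name and the ratings are hypothetical, and the record does not say whether chance-corrected statistics will be reported alongside raw agreement.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of summaries given the identical rating in both passes.

    ratings_a and ratings_b are equal-length sequences of ordinal ratings
    (for example 1, 2, 3 on the safety scale), paired by summary. They may
    come from two adjudicators (inter-rater) or from the same adjudicator
    at two time points (intra-rater, test-retest).
    """
    if len(ratings_a) != len(ratings_b):
        raise ValueError("rating lists must be paired and of equal length")
    if not ratings_a:
        raise ValueError("at least one rated summary is required")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


# Hypothetical safety ratings for five summaries
first_pass = [1, 2, 3, 1, 1]
second_pass = [1, 2, 2, 1, 3]
print(percent_agreement(first_pass, second_pass))  # 3 of 5 match -> 60.0
```

Raw percentage agreement does not account for agreement expected by chance, so with a three-category scale dominated by one category it can look high even when raters differ on the rare categories.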

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Investigators

  • Principal Investigator: Jakub Gazda, MD, PhD, Pavol Jozef Safarik University

Publications and helpful links

The person responsible for entering information about the study voluntarily provides these publications. These may be about anything related to the study.

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Estimated)

June 1, 2026

Primary Completion (Estimated)

December 31, 2026

Study Completion (Estimated)

December 31, 2027

Study Registration Dates

First Submitted

February 25, 2026

First Submitted That Met QC Criteria

March 19, 2026

First Posted (Actual)

March 24, 2026

Study Record Updates

Last Update Posted (Actual)

March 30, 2026

Last Update Submitted That Met QC Criteria

March 25, 2026

Last Verified

March 1, 2026

More Information

Terms related to this study

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

UNDECIDED

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.
