- Clinical Trial NCT07449429
A Privacy-Preserving OCR-LLM System for Coronary Syndrome Subtyping From Admission HPI: Multicenter Validation in China and the US (OCR-LLM-CHD)
Development and Multicenter Validation of a Privacy-Preserving OCR-LLM Pipeline for Four-Subtype Coronary Syndrome Classification Using Admission HPI Across Heterogeneous EHR Systems
Study Overview
Status
Conditions
Study Type
Enrollment (Estimated)
Contacts and Locations
Study Contact
- Name: Xiaojin Gao, Dr
- Phone Number: +86010 88322415
- Email: sophie_gao@sina.com
Participation Criteria
Eligibility Criteria
Ages Eligible for Study
- Adult
- Older Adult
Accepts Healthy Volunteers
Sampling Method
Study Population
Description
Inclusion Criteria:
- Hospital encounters with admission HPI documenting sym… relevant to coronary syndrome subtyping.
- Cases with sufficient documentation to assign one of four target subtypes (STEMI, NSTEMI, UA, CCS) by adjudication.

Exclusion Criteria:
- Unclear subtype or incomplete/uncertain time information preventing gold standard assignment.
- Non-CHD primary reason for admission after screening (for MIMIC-IV cohort).
Study Plan
How is the study designed?
Design Details
Cohorts and Interventions
| Group / Cohort | Intervention / Treatment |
|---|---|
| Internal Development and Validation Cohort: Retrospective cohort used for model development and internal validation. Inputs are de-identified admission HPI records (image or text) from the AIM-CHD dataset. Expert adjudication provides the reference standard labels for 4-class coronary syndrome subtyping (STEMI, NSTEMI, unstable angina, chronic coronary syndrome). | OCR-Prompt-LLM workflow; standard manual abstraction |
| Multicenter External Validation Cohort: Retrospective multicenter cohort used for external validation across heterogeneous EHR templates and documentation styles. De-identified admission HPI records are processed through the same OCR-LLM pipeline, and predictions are compared with expert-adjudicated reference labels to assess generalizability. | OCR-Prompt-LLM workflow; standard manual abstraction |
| Emergency Department External Validation Cohort: Retrospective cohort representing real-world emergency department workflow. De-identified ED admission HPI records are used to evaluate model performance under time-sensitive, information-limited conditions and to assess robustness to ED documentation variability. | OCR-Prompt-LLM workflow; standard manual abstraction |
| English EHR External Validation Cohort: Retrospective cohort derived from the public de-identified MIMIC-IV database. English admission notes/HPI text are used to evaluate cross-language transportability and performance of the same classification prompts and post-processing rules against reference labels derived by adjudication/structured diagnosis mapping (as prespecified in the protocol). | OCR-Prompt-LLM workflow; standard manual abstraction |
| Clinician Usability Cohort: Prospective usability evaluation cohort. Physicians complete a structured coronary syndrome subtyping task using admission HPI cases. Outcomes include diagnostic accuracy and time to completion; within-participant comparisons may be performed between unassisted and tool-assisted conditions as prespecified. | OCR-Prompt-LLM workflow; standard manual abstraction |

The same two interventions are listed for every cohort:

OCR-Prompt-LLM workflow: An automated clinical data management workflow integrating Optical Character Recognition (OCR), optimized prompt engineering, and large language models (LLMs). The system processes unstructured inpatient/ED records (primarily admission history of present illness and related narrative text) to extract prespecified key clinical indicators (e.g., left ventricular ejection fraction, coronary syndrome subtype, medications) and to classify cases into prespecified coronary artery disease categories (e.g., unstable angina, STEMI, NSTEMI, chronic coronary syndrome). The workflow outputs structured fields and a classification result with supporting evidence excerpts.

Standard manual abstraction: Standard manual process in which experienced clinicians review patient medical records and extract the same prespecified clinical indicators and coronary artery disease categories using routine clinical judgment and documentation review. This manual abstraction serves as the human benchmark for comparing diagnostic accuracy, completeness, and operational efficiency against the automated OCR-Prompt-LLM workflow.
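To make the workflow's output more concrete, the following is a minimal sketch of how one structured result could be represented. Only the four subtype labels and the general idea of extracted indicators plus supporting evidence excerpts come from the record; the class name `SubtypingResult`, every field name, the `parse_subtype` helper, and its alias table are hypothetical and are not taken from the study protocol.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Subtype(str, Enum):
    """The four target coronary syndrome subtypes named in the record."""
    STEMI = "STEMI"
    NSTEMI = "NSTEMI"
    UA = "UA"      # unstable angina
    CCS = "CCS"    # chronic coronary syndrome


@dataclass
class SubtypingResult:
    """Hypothetical container for one processed admission HPI record."""
    record_id: str                          # assumed de-identified key
    subtype: Subtype                        # classification result
    lvef_percent: Optional[float] = None    # extracted indicator, if documented
    medications: List[str] = field(default_factory=list)
    evidence_excerpts: List[str] = field(default_factory=list)


def parse_subtype(raw_label: str) -> Subtype:
    """Normalize a free-text label to one of the four subtypes.

    Raises ValueError when the label is not recognized, so that
    unexpected model output is surfaced rather than silently mapped.
    """
    aliases = {
        "unstable angina": Subtype.UA,
        "chronic coronary syndrome": Subtype.CCS,
    }
    cleaned = raw_label.strip().lower()
    if cleaned in aliases:
        return aliases[cleaned]
    return Subtype(cleaned.upper())
```

Rejecting unknown labels instead of assigning a default is one possible design choice here: a case that cannot be mapped to a subtype stays visible for manual review.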
What is the study measuring?
Primary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Overall classification accuracy | Proportion of cases with correct subtype (STEMI/NSTEMI/UA/CCS) compared with expert-adjudicated gold standard. Assessed up to completion of dataset evaluation (internal + external cohorts). | 1 month |
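The primary outcome is a simple proportion, which can be sketched as follows. This is an illustrative computation under assumed inputs (two equal-length lists of subtype labels), not the study's analysis code. The function names are hypothetical, and the per-subtype breakdown is an extra illustration rather than a registered outcome measure.

```python
from collections import Counter
from typing import Dict, Sequence

SUBTYPES = ("STEMI", "NSTEMI", "UA", "CCS")


def overall_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Proportion of cases whose predicted subtype matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one case is required")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


def per_subtype_recall(predicted: Sequence[str], reference: Sequence[str]) -> Dict[str, float]:
    """Fraction of reference cases of each subtype that were predicted correctly.

    Subtypes with no reference cases are omitted from the result.
    """
    totals = Counter(reference)
    hits = Counter(r for p, r in zip(predicted, reference) if p == r)
    return {s: hits[s] / totals[s] for s in SUBTYPES if totals[s] > 0}


# Made-up toy labels, for illustration only (not study data).
reference = ["STEMI", "NSTEMI", "UA", "CCS", "UA"]
predicted = ["STEMI", "NSTEMI", "CCS", "CCS", "UA"]
print(overall_accuracy(predicted, reference))  # 4 of 5 correct -> 0.8
```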
Collaborators and Investigators
Study record dates
Study Major Dates
Study Start (Estimated)
Primary Completion (Estimated)
Study Completion (Estimated)
Study Registration Dates
First Submitted
First Submitted That Met QC Criteria
First Posted (Actual)
Study Record Updates
Last Update Posted (Actual)
Last Update Submitted That Met QC Criteria
Last Verified
More Information
Terms related to this study
Additional Relevant MeSH Terms
- Pain
- Neurologic Manifestations
- Vascular Diseases
- Cardiovascular Diseases
- Pathologic Processes
- Heart Diseases
- Infarction
- Necrosis
- Arteriosclerosis
- Arterial Occlusive Diseases
- Coronary Disease
- Myocardial Ischemia
- Ischemia
- Chest Pain
- Pathological Conditions, Signs and Symptoms
- Signs and Symptoms
- ST Elevation Myocardial Infarction
- Non-ST Elevated Myocardial Infarction
- Coronary Artery Disease
- Myocardial Infarction
- Acute Coronary Syndrome
- Angina Pectoris
Other Study ID Numbers
- CAD-LLM-002
- Sponsor (Other Grant/Funding Number: China National Center for Cardiovascular Diseases)
Plan for Individual participant data (IPD)
Plan to Share Individual Participant Data (IPD)?
IPD Plan Description
Drug and device information, study documents
Studies a U.S. FDA-regulated drug product
Studies a U.S. FDA-regulated device product