Performance of an OCR-Prompt-LLM Integrated Workflow for Extracting Multi-dimensional Clinical Data in Ischemic Heart Disease (OPAL-CAD)

March 24, 2026 updated by: China National Center for Cardiovascular Diseases

This study evaluates an AI-driven workflow for clinical data extraction and diagnostic classification in coronary artery disease (CAD). Using optical character recognition (OCR) and large language models (LLMs), the system extracts ten key clinical parameters (such as LVEF and laboratory results) and assigns a diagnostic subtype (UA, STEMI, NSTEMI, or CCS) directly from unstructured inpatient records. In a human-versus-machine comparison on a test set of 308 patients, the performance of the LLM-based workflow will be benchmarked against the average diagnostic accuracy and processing time of seven clinical physicians. The findings will provide evidence on the feasibility of using LLMs to improve clinical data structuring and diagnostic efficiency in cardiology.
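The benchmark described above compares the workflow with the mean accuracy and mean processing time of seven physicians. A minimal Python sketch of that comparison is given here; the function name, the dictionary keys, and every number are hypothetical placeholders chosen for illustration, not study data or results.

```python
from statistics import mean


def compare_with_physicians(llm_accuracy, llm_seconds,
                            physician_accuracies, physician_seconds):
    """Compare one workflow against the physicians' mean accuracy and mean time."""
    return {
        # Positive value: the workflow is more accurate than the physician mean.
        "accuracy_difference": llm_accuracy - mean(physician_accuracies),
        # Value below 1: the workflow is faster than the physician mean.
        "time_ratio": llm_seconds / mean(physician_seconds),
    }


# Placeholder values for seven physicians (NOT results from this study).
physician_accuracies = [0.80, 0.85, 0.90, 0.85, 0.80, 0.90, 0.85]
physician_seconds = [300, 320, 280, 310, 290, 305, 295]

result = compare_with_physicians(
    llm_accuracy=0.90,
    llm_seconds=30,
    physician_accuracies=physician_accuracies,
    physician_seconds=physician_seconds,
)
print(result)
```

Averaging across the seven readers before comparing mirrors the wording of the summary; a per-physician comparison would be an equally plausible design that the record does not describe.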

Study Overview

Status

Completed

Conditions

Intervention / Treatment

Study Type

Observational

Enrollment (Actual)

308

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Locations

China
- Beijing, China, 100037
    - Fuwai Hospital

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

Yes

Sampling Method

Probability Sample

Study Population

The study population consists of 308 patients diagnosed with various subtypes of coronary artery disease (CAD). The cohort is derived from two major clinical studies: the AIM-CHD study (for pilot testing and prompt optimization) and the SMART-CHD study (for internal validation), both conducted at Fuwai Hospital. Additionally, an external validation cohort is included, comprising patients from 8 independent clinical sub-centers across China to ensure geographical and institutional diversity. The population covers a spectrum of CAD presentations, including Unstable Angina (UA), STEMI, NSTEMI, and Chronic Coronary Syndrome (CCS), providing a robust dataset for evaluating AI-driven diagnostic and data extraction performance.

Description

Inclusion Criteria:

Patients aged 18 years and older.
Clinical records of patients who were previously enrolled in the AIM-CHD (for the pilot/prompt optimization set) or SMART-CHD (for the internal validation cohort) studies.
Patients diagnosed with, or suspected of having, coronary artery disease (CAD), including subtypes: Unstable Angina (UA), STEMI, NSTEMI, and Chronic Coronary Syndrome (CCS).

Exclusion Criteria:

Clinical records with severe data fragmentation or missing more than 50% of the key clinical indicators.
Handwritten medical records or low-quality scans that are illegible for Optical Character Recognition (OCR) processing.
Duplicate records or records with conflicting "Gold Standard" labels that cannot be reconciled by the expert committee.
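
The first exclusion criterion (records missing more than 50% of the key clinical indicators) can be illustrated with a short Python sketch. The record does not list all ten indicators, so the names below are placeholders, and the helper functions are hypothetical rather than part of the study's software.

```python
# Placeholder names: the registry entry does not enumerate the ten indicators.
KEY_INDICATORS = [f"indicator_{i}" for i in range(1, 11)]


def missing_fraction(record, indicators=KEY_INDICATORS):
    """Fraction of key indicators that are absent (or None) in a record."""
    missing = sum(1 for name in indicators if record.get(name) is None)
    return missing / len(indicators)


def is_excluded_for_missingness(record, threshold=0.5):
    """Exclude only when strictly MORE than `threshold` of indicators are missing."""
    return missing_fraction(record) > threshold


# Toy records: five of ten indicators present, then four of ten present.
half_complete = {name: 1.0 for name in KEY_INDICATORS[:5]}
sparse = {name: 1.0 for name in KEY_INDICATORS[:4]}

print(is_excluded_for_missingness(half_complete))  # False: exactly 50% missing
print(is_excluded_for_missingness(sparse))         # True: 60% missing
```

The strict inequality follows the phrase "more than 50%"; a record missing exactly half of the indicators is retained under this reading.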

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Number of groups / cohorts

Cohorts and Interventions

Group / Cohort	Intervention / Treatment
Test Cohort: This group consists of 50 patient records from the AIM-CHD Study at Fuwai Hospital. These data are specifically utilized for refining OCR processing and optimizing Prompt Engineering for the LLM-based workflow.	Device: OCR-Prompt-LLM Information Extraction Workflow. The intervention is an automated clinical data management system integrating Optical Character Recognition (OCR), optimized Prompt Engineering, and Large Language Models (LLMs). The workflow processes unstructured inpatient records to extract 10 key clinical indicators (e.g., LVEF, CAD subtypes, medications) and classifies the patient into specific coronary artery disease categories (UA, STEMI, NSTEMI, CCS). Device: Manual Clinical Data Review. Standard manual process where experienced clinical physicians collect and interpret patient information from medical records. This serves as the human benchmark for comparing diagnostic accuracy and operational efficiency.
Internal Validation Cohort: This cohort includes 188 clinical cases sourced from the SMART-CHD Study at Fuwai Hospital. These records serve as the primary internal benchmark to evaluate the diagnostic and extraction accuracy of the LLM workflow against the established ground truth.	Device: OCR-Prompt-LLM Information Extraction Workflow. The intervention is an automated clinical data management system integrating Optical Character Recognition (OCR), optimized Prompt Engineering, and Large Language Models (LLMs). The workflow processes unstructured inpatient records to extract 10 key clinical indicators (e.g., LVEF, CAD subtypes, medications) and classifies the patient into specific coronary artery disease categories (UA, STEMI, NSTEMI, CCS). Device: Manual Clinical Data Review. Standard manual process where experienced clinical physicians collect and interpret patient information from medical records. This serves as the human benchmark for comparing diagnostic accuracy and operational efficiency.
External Validation Cohort: This cohort comprises 70 patient records collected from 8 independent sub-centers (excluding Fuwai Hospital) to assess the generalizability and robustness of the model across diverse clinical environments and different medical record formats.	Device: OCR-Prompt-LLM Information Extraction Workflow. The intervention is an automated clinical data management system integrating Optical Character Recognition (OCR), optimized Prompt Engineering, and Large Language Models (LLMs). The workflow processes unstructured inpatient records to extract 10 key clinical indicators (e.g., LVEF, CAD subtypes, medications) and classifies the patient into specific coronary artery disease categories (UA, STEMI, NSTEMI, CCS). Device: Manual Clinical Data Review. Standard manual process where experienced clinical physicians collect and interpret patient information from medical records. This serves as the human benchmark for comparing diagnostic accuracy and operational efficiency.
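
As a quick arithmetic check, the three cohort sizes stated in the table (50, 188, and 70) add up to the reported enrollment of 308. The Python sketch below only restates those figures; the dictionary labels are informal names chosen for illustration.

```python
# Cohort sizes as stated in the table above; labels are informal.
cohorts = {
    "test (AIM-CHD)": 50,
    "internal validation (SMART-CHD)": 188,
    "external validation (8 sub-centers)": 70,
}

total = sum(cohorts.values())
print(total)  # 308, matching the reported enrollment

# Share of the full data set contributed by each cohort.
for name, n in cohorts.items():
    print(f"{name}: {n} records ({n / total:.1%})")
```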

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Overall Diagnostic and Extraction Accuracy Rate	Overall accuracy rate of the LLM-based workflow across 308 cases (including the pilot set, internal validation cohort, and external validation cohort) for 10 clinical indicators (e.g., LVEF, blood glucose) and 4 diagnostic subtypes of coronary artery disease. Accuracy is defined as the proportion of cases where the LLM's extraction or diagnostic results are perfectly consistent with the 'Gold Standard' established by human clinical experts.	Through study completion, an average of 3 months.
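
The accuracy definition above (the proportion of cases whose output exactly matches the gold standard) can be sketched in Python as follows. This is one reading of the definition, computed per field; the record does not say whether accuracy is pooled across indicators. The function name, field names, and toy values are assumptions made for illustration, not study data.

```python
def field_accuracy(predicted, gold, field):
    """Share of cases whose predicted value for `field` equals the gold label exactly."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for p, g in zip(predicted, gold) if p.get(field) == g[field])
    return hits / len(gold)


# Toy gold-standard labels and workflow outputs (NOT study data).
gold = [
    {"subtype": "UA", "lvef": 55},
    {"subtype": "STEMI", "lvef": 40},
    {"subtype": "CCS", "lvef": 60},
    {"subtype": "NSTEMI", "lvef": 48},
]
predicted = [
    {"subtype": "UA", "lvef": 55},
    {"subtype": "NSTEMI", "lvef": 40},   # wrong subtype
    {"subtype": "CCS", "lvef": 60},
    {"subtype": "NSTEMI", "lvef": 45},   # wrong LVEF
]

print(field_accuracy(predicted, gold, "subtype"))  # 0.75
print(field_accuracy(predicted, gold, "lvef"))     # 0.75
```

Exact matching is strict: a numeric value that differs only by rounding or units counts as an error, so any tolerance rule would have to be defined separately.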

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

China National Center for Cardiovascular Diseases

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

February 23, 2026

Primary Completion (Actual)

March 1, 2026

Study Completion (Actual)

March 2, 2026

Study Registration Dates

First Submitted

March 24, 2026

First Submitted That Met QC Criteria

March 24, 2026

First Posted (Actual)

March 30, 2026

Study Record Updates

Last Update Posted (Actual)

March 30, 2026

Last Update Submitted That Met QC Criteria

March 24, 2026

Last Verified

February 1, 2026

More Information

Terms related to this study

Additional Relevant MeSH Terms

Other Study ID Numbers

CAD-LLM-2025-01

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

IPD Plan Description

To protect patient privacy and comply with the data management policies of the participating institutions (Fuwai Hospital and sub-centers), individual participant data will not be made publicly available. However, aggregated study results and statistical analyses will be included in the final publication.

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.

