CARAMEL: Retrospective Study for Personalized Risk Assessment of Cardiovascular Disease in Menopausal and Perimenopausal Women Using Real World Data (CARAMEL RS)

January 13, 2026 updated by: Luis Gabriel Luque Romero, Hospital Universitario Virgen Macarena

This retrospective observational study, part of the EU-funded CARAMEL project, aims to develop and validate personalized cardiovascular disease (CVD) risk assessment models specifically designed for menopausal and perimenopausal women (ages 40-60). The study leverages Real World Data (RWD) collected from multiple international clinical partners, including electronic health records (EHR), diagnostic imaging data, and signal data.

The main objective is to improve the prediction of CVD precursors such as hypertension and dyslipidemia, as well as mid- and long-term risk of CVD events, through advanced artificial intelligence (AI) models. These models will be trained on multimodal data to capture complex, individualized risk trajectories that current risk calculators fail to address, particularly in women. Special focus is placed on under-researched, women-specific risk factors and their interactions with traditional predictors.

The study includes several research objectives: (1) predicting the onset of hypertension and dyslipidemia using EHR data; (2) modeling the long-term risk of fatal and non-fatal cardiovascular events and disease trajectories; (3) identifying novel imaging biomarkers from routine screening tests such as mammography, DXA, ultrasound, and cardiac MRI; (4) developing multimodal prediction models combining imaging and clinical data; (5) creating automated AI tools for imaging biomarker extraction; and (6) using signal data from cardiac devices to predict disease progression and events.

The study population consists of middle-aged women with retrospective data available across different health systems. The expected outcome is a validated set of stratified, personalized CVD risk models that can support targeted prevention strategies and enable more equitable, sex-specific care. This will contribute to reducing the burden of CVD in women and addressing critical gaps in early detection, clinical decision-making, and health policy.

This project has received funding from the European Union's Horizon Europe Research and Innovation Programme under Grant Agreement No 101156210.

Study Overview

Study Type

Observational

Enrollment (Estimated)

1500000

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

  • Adult

Accepts Healthy Volunteers

No

Sampling Method

Non-Probability Sample

Study Population

Participants are identified retrospectively from electronic health records, imaging archives, and device registries across multiple healthcare systems and countries.

Description

Inclusion Criteria:

Self-identified as female in the electronic health record (EHR).

Age between 40 and 60 years at the time of data collection/index date.

Availability of at least 5-6 years of retrospective data in the EHR, depending on the research objective.

At least one healthcare encounter (visit, imaging, lab test, diagnosis, etc.) within the defined age range.

For imaging substudies (e.g., RO3-RO5): availability of at least one relevant imaging test (e.g., DXA, digital mammography, cMRI, CCTA, US) during the age range.

For signal-based analysis (RO6): presence of ECG monitoring data from implanted devices and at least 2 years of follow-up.

Exclusion Criteria:

Prior diagnosis of cardiovascular disease before the observation window (only applicable to specific ROs, e.g., RO2, RO4).

Insufficient data quality or missing key variables needed for modeling (e.g., absence of blood pressure or lipid profile).

Patients with incomplete or inconsistent records (e.g., duplicate IDs, mismatched time frames).

For signal-based RO6: hospitalizations or diagnoses unrelated to cardiovascular health that may bias AI model training.
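
As an illustration only, the general criteria above could be encoded as a record-level filter. The sketch below is written in Python under stated assumptions: the `PatientRecord` fields, the `is_eligible` function, the 365.25-day year, and the single `window_start` date are all hypothetical and do not come from the study protocol. Objective-specific criteria (imaging availability, signal data, record consistency checks) are left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientRecord:
    # All field names are hypothetical; real EHR schemas will differ.
    sex: str                        # sex as recorded in the EHR
    birth_date: date
    index_date: date                # data collection / index date
    first_record_date: date         # earliest entry available in the EHR
    last_record_date: date          # latest entry available in the EHR
    window_start: date              # assumed start of the observation window
    encounters_in_age_range: int    # encounters recorded at ages 40-60
    has_blood_pressure: bool
    has_lipid_profile: bool
    prior_cvd_date: Optional[date] = None  # first CVD diagnosis, if any


def age_on(birth: date, when: date) -> int:
    """Age in completed years on a given date."""
    had_birthday = (when.month, when.day) >= (birth.month, birth.day)
    return when.year - birth.year - (0 if had_birthday else 1)


def is_eligible(p: PatientRecord, min_history_years: int = 5,
                exclude_prior_cvd: bool = False) -> bool:
    """Apply the general inclusion and exclusion criteria to one record."""
    if p.sex.strip().lower() != "female":
        return False
    if not 40 <= age_on(p.birth_date, p.index_date) <= 60:
        return False
    # Length of available history, approximated with 365.25 days per year.
    history_days = (p.last_record_date - p.first_record_date).days
    if history_days < min_history_years * 365.25:
        return False
    if p.encounters_in_age_range < 1:
        return False
    # Key modeling variables must be present.
    if not (p.has_blood_pressure and p.has_lipid_profile):
        return False
    # Only some research objectives (e.g., RO2, RO4) exclude prior CVD.
    if (exclude_prior_cvd and p.prior_cvd_date is not None
            and p.prior_cvd_date < p.window_start):
        return False
    return True
```

The `exclude_prior_cvd` flag is off by default because the prior-CVD exclusion applies only to specific research objectives; `min_history_years` can be raised to 6 where an objective requires it.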

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Cohorts and Interventions

Group / Cohort
ASCIRES IMAGE DATABASE
Digital imaging biobank 10y long from several manufacturers / modalities

1,000 cMRI; 500 cardiac CT; 500 coronary artery calcification; 1,000 DXA. From women 40-60y
Basque Health Service Database

Longitudinal EHR data up to 15y including diagnosis, procedures, prescriptions, lab tests, visits, imaging, etc.

~128,00 women 40-60; 14,880 DM, 3,124 DXA, 332 carotid US

Clalit Primary Prevention Database

Manually curated DB of structured EHR data

~750,000 middle-aged women

Irish Implant Devices Registry

Irish Implant Devices Registry (REG) (HRI): 15y of data for implant procedures and follow-ups (pacemakers, ICDs, loop recorders)

~85,000 implant (pacemaker) procedures; ~700,000 follow-ups with indications & diagnosis

Keralty Colombia Database

EHR data from primary/specialised care centres. Longitudinal EHR data up to 5-10y, including diagnosis, procedures, prescriptions, lab tests, visits, etc.

~85,593 women 40-60y; ~25,000 women with CVD problems

Andalusian Health Population Database & Macarena University Hospital EHR

Longitudinal EHR data up to 15y including diagnosis, clinical procedures, prescriptions, lab tests, visits, etc. The hospital dataset is mapped to the OMOP CDM

~700,000 middle-aged women

Lithuanian High Cardiovascular Risk (LitHiR) primary prevention programme database

EHR data from primary cardiovascular prevention programme in VULSK (1 centre). Data include demographics, risk factors, lab tests (including lipid profile, renal function, etc.), and arterial markers (pulse wave velocity analysis data; Cardio-Ankle Vascular Index data; carotid artery intima-media thickness data).

Some patients have 5-10y longitudinal data with outcomes.

~6000 women 40-65y with high to very high cardiovascular risk, but without overt CVD

National and Kapodistrian University of Athens Database - Aretaieion Hospital

EHR data from Menopause clinic of Aretaieion university hospital including blood tests, medication, prescriptions, visits

~4000 middle-aged women

CoroPrevention - Tampere University (TAU)
Pan-European (25 sites) contemporary prospective CVD prevention cohort from an ongoing HEU project. It includes clinical data, 3-year CV event data, lifestyle, and RFs, plus standard and CVD biomarkers (CERT2, hsTNI, NTproBNP, Cystatin C…). N=~3,000 women (subsample of whole cohort)
AKRIBEA - Cooperative Research Centre for Biosciences Association (CIC)
Non-oriented 7y follow-up cohort from Basque Country Region. Urine+serum biomarkers and metabolome; serum lipoproteins by NMR; demographics & RFs N=~ 2,500 women (40 to 60 y)
MENO - Cooperative Research Centre for Biosciences Association (CIC)
Pre- and post-menopausal women cohort from Basque Country Region. Urine+serum biomarkers and metabolome; serum lipoproteins by NMR; demographics & RFs N =~ 1,700 women
UK Biobank - UK Biobank

Largest geno-phenotype-rich population-based study in the world (500K), includes multi-modal imaging data (60K) and eye and vision (67K), biomarkers, demographic data, lifestyle (100K with wearables) and health outcomes.

Middle-aged women among:

  • 500K baseline
  • 60K imaging study
  • 67K retina & OCT
Qatar Biobank
Population-based with annotated data, biological samples, tests and imaging for 60K participants. It includes demographic data, lifestyle, biomarkers, weight & body fat, hip & waist, BP, ECG, carotid US, full-body MRI, retinography, and DXA.

Middle-aged women among ~60K total participants
International Agency for Research on Cancer (IARC) / EPIC-Europa

Long-term European population-based cohort (520K participants across 10 countries). Includes clinical data, anthropometric measurements, demographic, lifestyle, dietary habits, and socioeconomic data, reproductive history, and biological samples such as serum, plasma and DNA for biochemical and genotyping data.

N = ~367k women between 35 and 65 years old (subsample of whole cohort)

~65k CVD cases across the full cohort

ILERVAS - Institute for Research in Biomedicine IRB Lleida

Interventional longitudinal study that includes detailed assessments of subclinical atheromatosis in 12 vascular territories using ultrasound, along with clinical, anthropometric, lifestyle, dietary, and biochemical data.

N = ~4165 women (50 to 70y) (subsample of whole cohort)

What is the study measuring?

Primary Outcome Measures

Outcome Measure
Measure Description
Time Frame
Occurrence and Predicted Risk of Cardiovascular Disease (CVD) Events (fatal and non-fatal)
Time Frame: up to 10 years

The study will retrospectively evaluate the occurrence of cardiovascular disease (CVD) events and develop predictive models to estimate individual risk profiles for such events. CVD events include both fatal and non-fatal occurrences such as myocardial infarction, stroke, heart failure, arrhythmias, and atherosclerotic disease. Events will be identified using structured electronic health records (EHR) and coded using ICD-10 classifications. Risk will be modeled using multimodal data sources (EHR, imaging, and signals) to predict short- and long-term outcomes, stratified by individual characteristics.

The outcome integrates:

Event-based measures: Time to first fatal or non-fatal CVD event.

Risk-based measures: Individual predicted probabilities of experiencing a CVD event or precursor condition (e.g., hypertension, dyslipidemia) over different time frames.

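
To make the event-based measure concrete, the following Python sketch computes a time to first CVD event for one participant, with censoring at the end of follow-up or at the 10-year time frame. It is an illustration under assumptions: the function name `time_to_first_event`, its parameters, the 365.25-day year, and the choice to ignore events dated on or before the index date are hypothetical and are not taken from the study record.

```python
from datetime import date
from typing import Iterable, Tuple


def time_to_first_event(index_date: date,
                        event_dates: Iterable[date],
                        last_follow_up: date,
                        horizon_years: float = 10.0) -> Tuple[float, bool]:
    """Return (time in years, event observed) for one participant.

    Follow-up is censored at the last available record or at the
    horizon, whichever comes first.
    """
    horizon_days = horizon_years * 365.25
    censor_days = min((last_follow_up - index_date).days, horizon_days)

    # Keep only events that occur strictly after the index date (assumption).
    days_to_events = [(d - index_date).days
                      for d in event_dates if d > index_date]

    if days_to_events and min(days_to_events) <= censor_days:
        return min(days_to_events) / 365.25, True
    return censor_days / 365.25, False
```

The returned pair has the usual shape of survival data (duration plus event indicator), so it could be passed to whichever time-to-event method is eventually chosen.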

Secondary Outcome Measures

Outcome Measure
Measure Description
Time Frame
RO1. Personalized risk prediction of CVD precursors
Time Frame: up to 8 years

First observation of hypertension (HT) or dyslipidemia (DY) in the EHR, recorded as a diagnostic code or as a laboratory result or test. These include:

  • Diagnosis of HT registered in the EHR with any of the following ICD-10 codes:

    • I10 Essential (primary) hypertension
    • I11.0 Hypertensive heart disease with heart failure
    • I11.9 Hypertensive heart disease without heart failure
    • I12.0 Hypertensive chronic kidney disease with stage 5 chronic kidney disease or end stage renal disease
    • I13.0 Hypertensive heart and chronic kidney disease with heart failure and stage 1 through stage 4 chronic kidney disease, or unspecified chronic kidney disease
    • I13.1 Hypertensive heart and chronic kidney disease without heart failure
    • I13.2 Hypertensive heart and chronic kidney disease with heart failure and with stage 5 chronic kidney disease, or end stage renal disease
  • Diagnosis of DY registered in the EHR with any of the following ICD-10 codes:

    • E78.1 Pure hyperglyceridemia
    • E78.2 Mixed hyperlipidemia
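
The code lists for RO1 can be read as a simple lookup. The Python sketch below finds the earliest recorded HT and DY diagnosis from a list of dated ICD-10 codes. It is illustrative only: `first_onset`, the `(date, code)` input format, and exact-string matching are assumptions, and onset defined through laboratory results or tests is not covered.

```python
from datetime import date
from typing import Dict, Iterable, Optional, Tuple

# ICD-10 codes listed for RO1 in this record.
HT_CODES = {"I10", "I11.0", "I11.9", "I12.0", "I13.0", "I13.1", "I13.2"}
DY_CODES = {"E78.1", "E78.2"}


def first_onset(diagnoses: Iterable[Tuple[date, str]]
                ) -> Dict[str, Optional[date]]:
    """Return the earliest diagnosis date for HT and DY (None if absent)."""
    onset: Dict[str, Optional[date]] = {"HT": None, "DY": None}
    for when, code in diagnoses:
        normalized = code.strip().upper()
        for label, codes in (("HT", HT_CODES), ("DY", DY_CODES)):
            if normalized in codes:
                current = onset[label]
                if current is None or when < current:
                    onset[label] = when
    return onset
```

For example, a record containing `I10` in 2018 and `E78.2` in 2016 would yield an HT onset in 2018 and a DY onset in 2016; codes outside the two lists are ignored.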
RO2. Personalized Risk Prediction of CVD Events and CVD trajectories
Time Frame: Up to 16 years

The occurrence of CVD events, which will be classified as fatal (if they are registered as the cause of death) or non-fatal (if they are not registered as the cause of death).

  • Fatal CVD events include the following ICD codes registered in the EHR:

    • I10-16 Hypertensive disease
    • I20-25 Ischemic heart disease
    • I46-52 Arrhythmias and heart failure, excluding I51.4 (Myocarditis unspecified)
    • I60-69 Cerebrovascular diseases
    • I70-73 Atherosclerosis/AAA
  • Non-fatal CVD events include only the following ICD codes:

    • I21-I23 Non-fatal myocardial infarction
    • I60-69 Non-fatal stroke
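
The classification rule above can be sketched as a range check on the three-character ICD-10 category. The Python example below is a minimal illustration, not the study's implementation: `classify_event`, the `is_cause_of_death` flag, and the handling of codes written without a dot are assumptions introduced here.

```python
from typing import List, Optional, Tuple

# ICD-10 category ranges listed in this record (inclusive bounds).
FATAL_RANGES: List[Tuple[str, str]] = [
    ("I10", "I16"),  # hypertensive disease
    ("I20", "I25"),  # ischemic heart disease
    ("I46", "I52"),  # arrhythmias and heart failure
    ("I60", "I69"),  # cerebrovascular diseases
    ("I70", "I73"),  # atherosclerosis/AAA
]
FATAL_EXCLUDED_PREFIXES = ("I51.4",)  # myocarditis, unspecified
NON_FATAL_RANGES: List[Tuple[str, str]] = [
    ("I21", "I23"),  # myocardial infarction
    ("I60", "I69"),  # stroke
]


def normalize(code: str) -> str:
    """Upper-case the code and insert the dot if it is missing."""
    c = code.strip().upper()
    if len(c) > 3 and c[3] != ".":
        c = c[:3] + "." + c[3:]
    return c


def in_ranges(code: str, ranges: List[Tuple[str, str]]) -> bool:
    """True if the three-character category falls in any range."""
    category = code[:3]
    return any(low <= category <= high for low, high in ranges)


def classify_event(code: str, is_cause_of_death: bool) -> Optional[str]:
    """Return 'fatal', 'non-fatal', or None if the code is not an event."""
    c = normalize(code)
    if is_cause_of_death:
        if c.startswith(FATAL_EXCLUDED_PREFIXES):
            return None
        return "fatal" if in_ranges(c, FATAL_RANGES) else None
    return "non-fatal" if in_ranges(c, NON_FATAL_RANGES) else None
```

Comparing categories as strings works here because every bound shares the letter `I` followed by two digits. Note the asymmetry in the definition: a heart failure code such as I50.9 counts only when it is registered as the cause of death.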
RO3. Novel Imaging Biomarkers and Patterns for CVD Risk Assessment
Time Frame: Baseline
Evaluates the predictive performance of multimodal models combining imaging features (e.g., cardiac MRI, DXA, digital mammography) and electronic health record (EHR) variables to estimate the mid- and long-term risk of cardiovascular disease (CVD) events in women aged 40-60. The endpoint is the first occurrence of a fatal or non-fatal CVD event after the imaging test, as documented in the EHR. The models will be compared against standard risk assessment tools (e.g., SCORE2).
RO4. Multimodal EHR and Image-Based CVD Prediction Models
Time Frame: Up to 16 years

The occurrence of CVD events, which will be classified as fatal (if they are registered as the cause of death) or non-fatal (if they are not registered as the cause of death).

  • Fatal CVD events include the following ICD codes registered in the EHR:

    • I10-16 Hypertensive disease
    • I20-25 Ischemic heart disease
    • I46-52 Arrhythmias and heart failure, excluding I51.4 (Myocarditis unspecified)
    • I60-69 Cerebrovascular diseases
    • I70-73 Atherosclerosis/AAA
  • Non-fatal CVD events include only the following ICD codes:

    • I21-I23 Non-fatal myocardial infarction
    • I60-69 Non-fatal stroke
RO5. Automatic imaging marker and pattern extraction
Time Frame: Baseline
The performance and clinical relevance of AI-based tools for the automatic extraction of cardiovascular imaging biomarkers in women aged 40-60. These tools will be used to segment anatomical regions and calculate quantitative measures from multimodal imaging (e.g., ultrasound, DXA, cardiac CT, cMRI, mammography).
RO6. Signal-based CVD prediction models
Time Frame: Up to 16 years

Occurrence of CVD events, which include:

  • Arrhythmia episodes, including atrial fibrillation, ventricular tachycardia, and bradyarrhythmias.
  • Heart failure and structural heart disease, particularly severe left ventricular dysfunction and cardiomyopathy.
  • Ischemic events such as myocardial infarction (MI), coronary artery disease (CAD), and cerebrovascular accidents (CVA).
  • Device-related events, including the transition from loop recorders to pacemakers or ICDs due to worsening conditions.

Unit of Measure: Recorded episodes (frequency/time) or binary outcome (present/absent).
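
The two units of measure for RO6 (episodes per unit time, or a present/absent flag) can be derived from the same list of recorded episode dates. The short Python sketch below shows one way to do so; `episode_summary`, the inclusive follow-up window, and the 365.25-day year are assumptions made for illustration.

```python
from datetime import date
from typing import Iterable, Tuple


def episode_summary(episode_dates: Iterable[date],
                    start: date, end: date) -> Tuple[float, bool]:
    """Return (episodes per year, any episode present) within [start, end]."""
    if end <= start:
        raise ValueError("follow-up must end after it starts")
    # Count only episodes recorded inside the follow-up window.
    count = sum(1 for d in episode_dates if start <= d <= end)
    years = (end - start).days / 365.25
    return count / years, count > 0
```

For instance, three arrhythmia episodes recorded during a two-year follow-up give a rate of roughly 1.5 episodes per year and a binary outcome of present.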

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Estimated)

March 1, 2026

Primary Completion (Estimated)

December 1, 2027

Study Completion (Estimated)

April 30, 2028

Study Registration Dates

First Submitted

May 14, 2025

First Submitted That Met QC Criteria

May 22, 2025

First Posted (Actual)

May 31, 2025

Study Record Updates

Last Update Posted (Actual)

January 15, 2026

Last Update Submitted That Met QC Criteria

January 13, 2026

Last Verified

January 1, 2026

More Information

Terms related to this study

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

UNDECIDED

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No
