Diagnostic Accuracy of Carebot AI MMG in Mammography Screening: Multicenter MRMC Study (CARE-MMG-MRMC)

January 13, 2026 updated by: Carebot s.r.o.

Retrospective Multicenter Multi-Reader, Multi-Case Diagnostic Accuracy Study of Carebot AI MMG Compared With Radiologists on 2D Full-Field Digital Mammography in Breast Cancer Screening

This study evaluates the diagnostic performance of Carebot AI MMG, an artificial intelligence (AI)-enabled medical device for evaluating mammograms. The software analyzes standard full-field digital mammography (FFDM) images and classifies each examination as having no suspicious finding ("Low Risk"), a probably benign mass ("Medium Risk"), or a suspicious malignant mass ("High Risk").

The study is retrospective and observational. It uses anonymized mammography examinations from four screening centers, without any additional imaging or contact with patients. Three experienced breast radiologists independently read the same set of cases, and their assessments are used as the human benchmark. A histopathology-based reference standard, supplemented by radiologist consensus and follow-up information for negative cases, is used to determine whether cancer is present.

The main goal is to compare the AI system with human radiologists in terms of sensitivity and specificity for detecting breast cancer, and to assess whether the AI can achieve non-inferior performance at two predefined operating points: one favoring higher sensitivity and negative predictive value (rule-out) and one favoring higher specificity and positive predictive value (rule-in).

Study Overview

Status

Completed

Conditions

Intervention / Treatment

Device: Carebot AI MMG software analysis

Detailed Description

Design and setting: This is a retrospective, multicenter, multi-reader, multi-case (MRMC) diagnostic accuracy study of Carebot AI MMG, conducted on anonymized 2D full-field digital mammography (FFDM) examinations acquired as part of routine breast cancer screening. Mammograms were collected from four screening centers over a defined time period. No additional imaging was performed for the purpose of this study, and no subjects were contacted.

Data source and population: The source dataset consists of 4,729 screening mammography examinations from women aged 32 to 88 years (mean approximately 57 years). Only 2D FFDM studies with a complete set of standard projections (LCC, RCC, LMLO, RMLO) were included. Examinations with incomplete series, unreadable or corrupted DICOM files, or missing/inconsistent key metadata were excluded, as were tomosynthesis (DBT) studies, men, and women under 18 years of age. To ensure sufficient precision of performance estimates, the dataset was enriched with additional biopsy-proven cancers. The final analytical subset comprises 222 examinations, including 48 malignant and 174 non-malignant studies, with representation across three mammography devices (Hologic Selenia Dimensions, Hologic Lorad Selenia, Fujifilm FDR-3000AWS).

Investigational device and comparator: The investigational device is Carebot AI MMG (software version 2.9, deep-learning models v2.3), a stand-alone AI system that analyzes 2D FFDM exams and outputs a case-level classification into three categories, together with an internal risk score. Two predefined operating points are evaluated: a high-sensitivity (HSe) threshold, where both benign and malignant masses are treated as "positive" (rule-out setting), and a high-specificity (HSp) threshold, where only malignant masses are counted as "positive" (rule-in setting).
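The two operating points can be read as two rules for collapsing the three-category output into a binary call. The following is a minimal sketch of that reading, not the device's implementation: the `Category` enum and the `binarize` function are hypothetical names introduced here, and the record does not say how the internal risk score relates to the categories.

```python
from enum import IntEnum


class Category(IntEnum):
    """Three-level case classification (names are illustrative)."""
    LOW = 0     # no suspicious finding
    MEDIUM = 1  # probably benign mass
    HIGH = 2    # suspicious malignant mass


def binarize(category: Category, operating_point: str) -> bool:
    """Return True if the case counts as positive at the given operating point.

    HSe (rule-out): benign and malignant masses both count as positive.
    HSp (rule-in):  only malignant masses count as positive.
    """
    if operating_point == "HSe":
        return category >= Category.MEDIUM
    if operating_point == "HSp":
        return category == Category.HIGH
    raise ValueError(f"unknown operating point: {operating_point!r}")


calls = [Category.LOW, Category.MEDIUM, Category.HIGH]
print([binarize(c, "HSe") for c in calls])  # [False, True, True]
print([binarize(c, "HSp") for c in calls])  # [False, False, True]
```

Under this reading, the only cases whose binary label differs between the two operating points are those in the middle category.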

As a human comparator, three experienced radiologists (RAD 1-3) independently read the same anonymized studies using a dedicated DICOM viewer integrated with a labeling application. Radiologists were blinded to AI outputs, clinical information, and outcomes and recorded a case-level classification into the same three categories (Negative/Benign/Malignant). For primary analyses, their binary decisions are derived using the same HSe and HSp rules. In addition, a "random reader" benchmark is constructed for balanced accuracy by repeatedly sampling one radiologist's decision per case in a bootstrap framework (20,000 iterations).
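The "random reader" construction can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the function names are hypothetical, the toy data are invented, and the sketch resamples only the reader per case (the record does not state whether cases are also resampled).

```python
import random
from statistics import mean


def balanced_accuracy(truth, calls):
    """Mean of sensitivity and specificity for paired boolean lists."""
    pos = sum(truth)
    neg = len(truth) - pos
    tp = sum(1 for t, c in zip(truth, calls) if t and c)
    tn = sum(1 for t, c in zip(truth, calls) if not t and not c)
    return 0.5 * (tp / pos + tn / neg)


def random_reader_ba(truth, reader_calls, iterations=20_000, seed=0):
    """Distribution of BA for a synthetic reader.

    In each iteration, one reader is drawn uniformly at random for every
    case and that reader's binary call is used for the case.
    """
    rng = random.Random(seed)
    n_readers = len(reader_calls)
    draws = []
    for _ in range(iterations):
        sampled = [
            reader_calls[rng.randrange(n_readers)][i]
            for i in range(len(truth))
        ]
        draws.append(balanced_accuracy(truth, sampled))
    return draws


# Toy data (invented for illustration): 2 malignant, 4 non-malignant cases.
truth = [True, True, False, False, False, False]
readers = [
    [True, True, False, False, False, True],
    [True, False, False, False, False, False],
    [True, True, False, True, False, False],
]
draws = random_reader_ba(truth, readers, iterations=2_000)
print(round(mean(draws), 3))
```

The resulting list of BA values plays the role of the benchmark distribution against which the AI's balanced accuracy is compared.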

Reference standard: The reference standard is established at the level of the individual examination (case). A case is labeled "Malignant" if there is histopathological confirmation of breast cancer from biopsy performed in temporal association with the index mammogram. A case is labeled "Non-malignant" if there is consensus between two local radiologists that the finding is negative or stably benign, typically corroborated by at least 2 years of imaging follow-up. Tumor staging (e.g., TNM) is not used in the present analysis. All 48 malignant cases from the participating centers are included in the analytical subset; no cancer-positive examinations were excluded.

Objectives and endpoints: The primary objectives are: (1) to demonstrate that the balanced accuracy (BA) of Carebot AI MMG is at least 0.80 at both the HSe and HSp operating points; (2) to demonstrate non-inferiority of the AI's balanced accuracy compared with the MRMC "random reader" benchmark with a non-inferiority margin of 0.05; and (3) to demonstrate non-inferiority of the sensitivity (Se) of the AI versus each of the three radiologists at both operating points, with a non-inferiority margin of 0.07. Secondary objectives are to describe specificity (Sp), positive predictive value (PPV), and negative predictive value (NPV), and to characterize patterns of false-negative and false-positive decisions and their potential implications for clinical risk management.
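One conventional way to write the three primary objectives as formulas is shown below. The record does not give the hypotheses formally, so the null/alternative layout is an assumption based on standard non-inferiority notation; the margins (0.05 and 0.07) and the 0.80 target are taken from the text, and the record does not say whether the 0.80 target applies to the point estimate or to a confidence bound.

```latex
% Balanced accuracy at a given operating point
\[
\mathrm{BA} = \tfrac{1}{2}\,(\mathrm{Se} + \mathrm{Sp})
\]

% Objective 1: performance target at both HSe and HSp
\[
\mathrm{BA}_{\mathrm{AI}} \ge 0.80
\]

% Objective 2: non-inferiority versus the random reader, margin 0.05
\[
H_0:\ \mathrm{BA}_{\mathrm{AI}} - \mathrm{BA}_{\mathrm{random}} \le -0.05
\quad\text{vs.}\quad
H_1:\ \mathrm{BA}_{\mathrm{AI}} - \mathrm{BA}_{\mathrm{random}} > -0.05
\]

% Objective 3: non-inferiority in sensitivity versus each radiologist r, margin 0.07
\[
H_0:\ \mathrm{Se}_{\mathrm{AI}} - \mathrm{Se}_{r} \le -0.07
\quad\text{vs.}\quad
H_1:\ \mathrm{Se}_{\mathrm{AI}} - \mathrm{Se}_{r} > -0.07,
\qquad r \in \{1, 2, 3\}
\]
```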

Statistical analysis: Diagnostic performance metrics (Se, Sp, PPV, NPV, BA) are calculated at the case level for Carebot AI MMG and each radiologist at both the HSe and HSp operating points. Wilson 95% confidence intervals are used for proportions. Paired McNemar tests are used to compare AI and individual readers in terms of Se and Sp. For balanced accuracy, an MRMC bootstrap procedure with 20,000 iterations is used to construct the distribution of the random reader and to estimate the probability that BA_AI is greater than or equal to BA_random minus the pre-specified margin. Non-inferiority in sensitivity is assessed using a Nam-Blackwelder-type framework on discordant pairs. False-negative and false-positive cases are reviewed qualitatively with emphasis on lesion conspicuity, breast density, and typical error patterns (e.g., dense parenchyma, lesions near the pectoral muscle, benign vascular structures, asymmetries).
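Two of the methods named above, the Wilson interval and the paired McNemar test, can be sketched with the standard library alone. The function names are hypothetical, the counts in the example are invented and are not study results, and the exact (binomial) form of the McNemar test is an assumption, since the record does not say whether an exact or asymptotic version is used.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (default ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases where only the first rater is correct
    c: cases where only the second rater is correct
    Under H0 the discordant pairs split 50/50, so the test reduces to a
    two-sided binomial test on min(b, c) out of b + c.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical example: 40 of 48 malignant cases detected.
low, high = wilson_interval(40, 48)
print(f"Se = {40 / 48:.3f}, 95% CI {low:.3f} to {high:.3f}")

# Hypothetical discordant pairs between the AI and one reader.
print(mcnemar_exact(1, 5))  # 0.21875
```

Only discordant pairs carry information in the McNemar test, which is why the function takes two counts rather than the full 2x2 table.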

Risk, ethics, and data protection: The study is non-interventional and entirely retrospective. All mammography examinations were acquired as part of routine care before the study, and all DICOM data were irreversibly anonymized at the site level in compliance with GDPR and applicable national law before transfer to the sponsor. No additional radiation exposure or patient contact occurs, and no adverse events are expected. Given this design, the study does not meet the MDR definition of a clinical investigation under Article 62 and is not subject to prior notification under Article 74(1); individual informed consent is not required. Results are intended to support the clinical evaluation of Carebot AI MMG as a decision-support tool in organized mammography screening.

Study Type

Observational

Enrollment (Actual)

222

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Locations

Czechia
  - Prague, Czechia, 14000
    - Poliklinika MEDICON Budějovická
Slovakia
  - Dolný Kubín, Slovakia, 026 14
    - Dolnooravská nemocnica s poliklinikou MUDr. L. N. Jégého
  - Považská Bystrica, Slovakia, 017 01
    - Nemocnica s poliklinikou Považská Bystrica
  - Stará Ľubovňa, Slovakia, 064 01
    - Ľubovnianska nemocnica

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

Yes

Sampling Method

Non-Probability Sample

Study Population

Women undergoing routine screening mammography at four participating centers in Central Europe. The analytical dataset consists of a retrospectively assembled case-control subset of 222 anonymized 2D FFDM examinations (48 malignant and 174 non-malignant) selected from a source cohort of 4,729 screening studies.

Description

Inclusion Criteria:

Female sex
Age ≥ 18 years at the time of the screening mammogram
Screening full-field digital mammography (FFDM) examination with all four standard views (LCC, RCC, LMLO, RMLO) available
Sufficient image quality and complete DICOM metadata to allow retrospective analysis

Exclusion Criteria:

Male sex
Age < 18 years
Digital breast tomosynthesis (DBT/3D) examinations without a corresponding full 2D FFDM four-view series
Incomplete mammography series (missing one or more of LCC, RCC, LMLO, RMLO)
Corrupted or unreadable DICOM files
Missing or inconsistent key metadata (e.g., laterality, view, acquisition date)

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Number of groups / cohorts

Cohorts and Interventions

Group / Cohort	Intervention / Treatment
Malignant cases: Women with biopsy-proven breast cancer included in the analytical subset (n = 48). Each case corresponds to a screening full-field digital mammography (FFDM) examination with all four standard views (LCC, RCC, LMLO, RMLO), retrospectively identified from participating screening centers.	Device: Carebot AI MMG software analysis. Retrospective stand-alone AI analysis of anonymized 2D full-field digital mammography (FFDM) examinations. The AI system (Carebot AI MMG, version 2.9) processes existing images and outputs case-level risk classifications; no additional imaging, randomization, or changes to patient management occur as part of this study.
Non-malignant cases: Women without histopathological evidence of breast cancer, classified as negative or stably benign by two independent local radiologists with at least 2 years of imaging follow-up (n = 174). Each case corresponds to a screening FFDM examination with all four standard views (LCC, RCC, LMLO, RMLO), retrospectively selected from the same screening population.	Device: Carebot AI MMG software analysis. Retrospective stand-alone AI analysis of anonymized 2D full-field digital mammography (FFDM) examinations. The AI system (Carebot AI MMG, version 2.9) processes existing images and outputs case-level risk classifications; no additional imaging, randomization, or changes to patient management occur as part of this study.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Balanced accuracy (BA) of Carebot AI MMG for detecting malignant versus non-malignant examinations	Balanced accuracy (BA) is defined as the average of sensitivity and specificity for classifying each mammography examination as malignant or non-malignant. BA will be calculated at two pre-specified operating points of the AI system: a high-sensitivity (HSe) setting and a high-specificity (HSp) setting. Performance will be estimated with 95% confidence intervals and compared to a multi-reader benchmark constructed from three experienced radiologists and a bootstrap-based "random reader" reference.	Baseline (index mammography examination; examinations acquired between 01-01-2025 and 14-11-2025; retrospective assessment)
Sensitivity (Se) of Carebot AI MMG versus histopathology-based reference standard	Sensitivity is defined as the proportion of malignant examinations correctly classified as positive by the AI system. Sensitivity will be calculated at both the HSe and HSp operating points and compared pairwise with the sensitivity of each of the three radiologists using the same case-level ground truth.	Baseline (index mammography examination; examinations acquired between 01-01-2025 and 14-11-2025; retrospective assessment)

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Specificity (Sp) of Carebot AI MMG versus histopathology-based reference standard	Specificity is defined as the proportion of non-malignant examinations correctly classified as negative by the AI system. Specificity will be calculated at both the HSe and HSp operating points and compared pairwise with the specificity of each of the three radiologists.	Baseline (index mammography examination; examinations acquired between 01-01-2025 and 14-11-2025; retrospective assessment)
Positive predictive value (PPV) of Carebot AI MMG	Positive predictive value is defined as the proportion of AI-positive examinations that are truly malignant according to the reference standard. PPV will be reported for both HSe and HSp operating points and compared descriptively with PPV values for each radiologist.	Baseline (index mammography examination; examinations acquired between 01-01-2025 and 14-11-2025; retrospective assessment)
Negative predictive value (NPV) of Carebot AI MMG	Negative predictive value is defined as the proportion of AI-negative examinations that are truly non-malignant according to the reference standard. NPV will be reported for both HSe and HSp operating points and compared descriptively with NPV values for each radiologist.	Baseline (index mammography examination; examinations acquired between 01-01-2025 and 14-11-2025; retrospective assessment)
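The outcome definitions above all derive from a single 2x2 table per operating point. The sketch below shows the arithmetic; the `Confusion` class is a hypothetical helper, and the counts are invented for illustration (chosen only so that they sum to the 48 malignant and 174 non-malignant examinations in the analytical subset) and are not study results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Case-level 2x2 counts against the reference standard."""
    tp: int  # malignant, called positive
    fp: int  # non-malignant, called positive
    tn: int  # non-malignant, called negative
    fn: int  # malignant, called negative

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def balanced_accuracy(self) -> float:
        return 0.5 * (self.sensitivity + self.specificity)


# Hypothetical counts, NOT study results.
m = Confusion(tp=42, fp=24, tn=150, fn=6)
for name in ("sensitivity", "specificity", "ppv", "npv", "balanced_accuracy"):
    print(f"{name}: {getattr(m, name):.3f}")
```

Sensitivity, specificity, and BA do not depend on the case mix, whereas PPV and NPV do. Because the analytical subset is enriched (48 of 222 examinations are malignant), PPV and NPV computed on it describe this dataset rather than an unenriched screening population.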

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Carebot s.r.o.

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

January 1, 2025

Primary Completion (Actual)

November 3, 2025

Study Completion (Actual)

November 3, 2025

Study Registration Dates

First Submitted

November 14, 2025

First Submitted That Met QC Criteria

December 17, 2025

First Posted (Estimated)

December 23, 2025

Study Record Updates

Last Update Posted (Actual)

January 14, 2026

Last Update Submitted That Met QC Criteria

January 13, 2026

Last Verified

January 1, 2026

More Information

Terms related to this study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

CB-MMG-02-MC (Other Identifier: Carebot s.r.o.)

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

IPD Plan Description

This is a retrospective, multicenter diagnostic accuracy study using anonymized full-field digital mammography (FFDM) images and associated metadata obtained under local data use agreements. Individual-level imaging data are not planned to be shared outside the participating institutions due to contractual, privacy, and regulatory constraints. Aggregate, de-identified summary results (including performance metrics and key subgroup analyses) may be shared in publications and upon reasonable request, but no IPD repository is planned.

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product
