AI-Driven Cancer Diagnosis and Prediction With EHR

July 25, 2025 updated by: Kang Zhang, The Eye Hospital of Wenzhou Medical University

AI-Based Cancer Diagnosis and Prediction Using Electronic Health Records

This is a multi-center clinical study designed to evaluate the application and effectiveness of an AI-assisted predictive model for identifying and diagnosing cancer, leveraging multimodal health data.

Study Overview

Status

Recruiting

Conditions

Detailed Description

Cancer diagnosis and early detection are crucial for improving patient outcomes and survival rates. Early identification of cancers and appropriate intervention can significantly impact treatment success and prognosis. In clinical practice, oncologists often need to integrate a variety of patient data, including medical history, laboratory test results, imaging data such as CT scans and MRIs, and genetic markers, to make an accurate diagnosis and develop a personalized treatment plan.

To build the foundation for our work, the first phase of the project was initiated in 2023 as a large-scale retrospective study. This foundational phase involved analyzing comprehensive, multimodal data from approximately 1 million cancer patients. The goal was to identify key patterns and build robust preliminary models.

As precision medicine becomes increasingly important, the challenge remains to identify cancer at early stages, especially when symptoms are subtle or absent. Building on the insights from our initial analysis, the project's second phase, a prospective study, was launched in February 2025. This current study aims to develop and validate an AI-assisted decision-making system by integrating multimodal data from electronic health records, imaging, laboratory results, and genetic data in a real-world clinical setting. The objective is to improve diagnostic accuracy, optimize clinical workflows, and provide more personalized treatment options for cancer patients. Ultimately, through this comprehensive two-phase approach, the system seeks to improve early detection, guide effective treatment strategies, and enhance patient survival rates.

Study Type

Observational

Enrollment (Estimated)

1000000

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Study Locations

    • Guangdong
      • Guangzhou, Guangdong, China
        • Recruiting
        • Sun Yat-sen Memorial Hospital, Sun Yat-sen University
        • Contact:
      • Guangzhou, Guangdong, China
        • Recruiting
        • Guangzhou Women and Children's Medical Center
        • Contact:
      • Guangzhou, Guangdong, China
      • Guangzhou, Guangdong, China
        • Recruiting
        • Sun Yat-sen University Cancer Hospital
        • Contact:
    • Sichuan
      • Chengdu, Sichuan, China
        • Recruiting
        • West China Hospital
        • Contact:
    • Zhejiang
      • Wenzhou, Zhejiang, China
        • Recruiting
        • First Affiliated Hospital of Wenzhou Medical University
        • Contact:
      • Wenzhou, Zhejiang, China
        • Recruiting
        • Second Affiliated Hospital of Wenzhou Medical University
        • Contact:

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

  • Child
  • Adult
  • Older Adult

Accepts Healthy Volunteers

Yes

Sampling Method

Non-Probability Sample

Study Population

The study population consists of individuals aged 0 to 90 years who have received care at participating study centers. Participants must have comprehensive electronic health records (EHRs) available, including medical history, laboratory test results, imaging data, and genetic information (if available). Both individuals diagnosed with cancer (including pediatric and adult cancers) and healthy individuals with no history of cancer will be included in the study to evaluate the AI-assisted model's diagnostic and predictive capabilities. The study will focus on patients with complete and documented care records from the participating centers, ensuring a diverse cohort for analysis across different age groups and cancer types.

Description

Inclusion Criteria:

1. Patients with comprehensive electronic health records (EHRs), including medical history, laboratory test results, imaging data, and genetic data (if available).

2. Individuals without severe cognitive impairments or conditions that would prevent them from providing informed consent or participating in the study.

3. Parents or guardians must provide informed consent for minors, while adult participants must provide informed consent for themselves.

Exclusion Criteria:

  1. Patients with incomplete or missing key electronic health record data or insufficient follow-up data.
  2. Individuals with severe cognitive disorders or other terminal illnesses that would prevent meaningful participation.
  3. Pregnant women, who are excluded for safety reasons (pediatric cancer patients remain eligible).
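
The inclusion and exclusion criteria above can be read as a simple screening rule. The following is a minimal sketch only, not the study's actual screening procedure: the `Candidate` record, its field names, and the adult age cutoff of 18 are assumptions introduced for illustration, while the 0 to 90 age range comes from the Study Population text.

```python
from dataclasses import dataclass

ADULT_AGE = 18  # assumption: the record does not state an age-of-majority cutoff


@dataclass
class Candidate:
    """Hypothetical screening record; all field names are illustrative."""
    age_years: int
    has_medical_history: bool
    has_lab_results: bool
    has_imaging: bool
    has_sufficient_follow_up: bool
    severe_cognitive_impairment: bool
    terminal_illness_preventing_participation: bool
    is_pregnant: bool
    self_consent: bool = False
    guardian_consent: bool = False


def has_valid_consent(c: Candidate) -> bool:
    # Minors need parent or guardian consent; adults consent for themselves.
    return c.guardian_consent if c.age_years < ADULT_AGE else c.self_consent


def is_eligible(c: Candidate) -> bool:
    # Genetic data is listed as "if available", so it is not required here.
    ehr_complete = c.has_medical_history and c.has_lab_results and c.has_imaging
    in_age_range = 0 <= c.age_years <= 90
    included = (
        in_age_range
        and ehr_complete
        and not c.severe_cognitive_impairment
        and has_valid_consent(c)
    )
    excluded = (
        not c.has_sufficient_follow_up
        or c.terminal_illness_preventing_participation
        or c.is_pregnant
    )
    return included and not excluded


adult = Candidate(54, True, True, True, True, False, False, False, self_consent=True)
child = Candidate(9, True, True, True, True, False, False, False, guardian_consent=True)
pregnant = Candidate(31, True, True, True, True, False, False, True, self_consent=True)
print(is_eligible(adult), is_eligible(child), is_eligible(pregnant))
```

Keeping inclusion and exclusion as separate boolean expressions mirrors the two lists in the record, which makes it easier to audit each rule against the text.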

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Cohorts and Interventions

Group / Cohort: Healthy Cohort
This group consists of individuals without any diagnosed cancer. Participants in this cohort will serve as the control group for comparison to the experimental group. No interventions or treatments will be administered to this cohort, as they represent a baseline of healthy individuals.
Intervention / Treatment: This intervention involves an AI system that integrates multimodal data, including patient medical history, laboratory test results, imaging data, and genetic information, to predict the risk of cancer. The system uses deep learning algorithms to provide real-time, accurate predictions, enabling early identification of cancer risks. By analyzing historical health data, the model aims to predict potential cancer developments, improving early detection and treatment outcomes.

Group / Cohort: Tumor Cohort
This group consists of individuals diagnosed with cancer, including various types. Participants in this cohort will serve as the experimental group for evaluating the effectiveness of the early prediction model in identifying cancer risks and improving diagnostic accuracy.
Intervention / Treatment: This intervention involves an AI system that integrates multimodal data, including patient medical history, laboratory test results, imaging data, and genetic information, to predict the risk of cancer. The system uses deep learning algorithms to provide real-time, accurate predictions, enabling early identification of cancer risks. By analyzing historical health data, the model aims to predict potential cancer developments, improving early detection and treatment outcomes.

What is the study measuring?

Primary Outcome Measures

Outcome Measure: Area Under the Curve (AUC)
Measure Description: Area under the receiver operating characteristic (ROC) curve, used to quantify diagnostic accuracy. Unitless, expressed as a number between 0 and 1.
Time Frame: 1 year

Outcome Measure: F1 Score
Measure Description: The F1 score is the harmonic mean of precision and sensitivity (recall). It reflects the model's ability to both identify true positives and limit false positives, and is especially useful when the classes are imbalanced (e.g., when the number of healthy cases is much higher than the number of disease cases). The F1 score ranges from 0 to 1, with 1 indicating perfect precision and recall.
Time Frame: 1 year
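
As an illustration of the two primary measures, the sketch below computes the ROC AUC through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half) and the F1 score from confusion counts. The labels, scores, and the 0.5 decision threshold are made-up toy values, not study data, and the function names are illustrative.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def f1_score(labels, predictions):
    """Harmonic mean of precision and recall for binary 0/1 labels."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: 3 cancer cases (1) and 5 healthy cases (0).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print(f"AUC = {roc_auc(labels, scores):.3f}")       # AUC = 0.933
print(f"F1  = {f1_score(labels, predictions):.3f}")  # F1  = 0.667
```

Note that AUC is threshold-free, whereas F1 depends on the chosen decision threshold. The pairwise loop is quadratic in the number of cases, so a cohort of the size planned here would in practice call for a sort-based or library implementation.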

Secondary Outcome Measures

Outcome Measure: Sensitivity (True Positive Rate)
Measure Description: Sensitivity measures how well the AI model identifies true positive cases, that is, the proportion of participants with cancer whom the model correctly identifies as positive.
Time Frame: 1 year

Outcome Measure: Specificity (True Negative Rate)
Measure Description: Specificity measures the ability of the AI model to correctly identify cases without disease, so that healthy participants are correctly identified as negative.
Time Frame: 1 year
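
Both secondary measures follow directly from the confusion matrix: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below shows the calculation on made-up toy labels; the function name and data are illustrative and not part of the study record.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary 0/1 labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: 4 cancer cases (1) and 6 healthy cases (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(labels, predictions)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
# sensitivity = 0.750, specificity = 0.833
```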

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

January 19, 2025

Primary Completion (Estimated)

October 1, 2025

Study Completion (Estimated)

October 1, 2025

Study Registration Dates

First Submitted

January 19, 2025

First Submitted That Met QC Criteria

January 19, 2025

First Posted (Actual)

January 24, 2025

Study Record Updates

Last Update Posted (Actual)

July 30, 2025

Last Update Submitted That Met QC Criteria

July 25, 2025

Last Verified

July 1, 2025

More Information

Terms related to this study

Additional Relevant MeSH Terms

Other Study ID Numbers

  • Cancer

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

NO

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

Product manufactured in and exported from the U.S.

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.
