Lung Cancer Multi-omics Digital Human Avatars for Integrating Precision Medicine Into Clinical Practice (LANTERN)

Lung Cancer Multi-omics Digital Human Avatars for Integrating Precision Medicine Into Clinical Practice: the LANTERN Study

The goal of this multi-centric observational clinical trial is to develop accurate predictive models for lung cancer patients through the creation of Digital Human Avatars, using various omics-based variables and integrating well-established clinical factors with "big data" and advanced imaging features.

The main goals of the LANTERN project are:

  • To develop prevention models for early lung cancer diagnosis;
  • To set up personalized predictive models for individual-specific treatments.

Lung cancer patients will be prospectively enrolled and main omics data (including radiomics and genomics) will be collected, reflecting the main omics domains associated with the lung cancer diagnosis and decision making pathway.

An exploratory analysis across all collected datasets will select a pool of potential biomarkers to create multiple distinct multivariate models, trained through advanced machine learning (ML) and AI techniques and subdivided into specific areas of interest. Finally, the developed predictive models will be validated in order to test their robustness, transferability and generalizability, leading to the development of the Digital Human Avatar.

Study Overview

Status

Not yet recruiting

Intervention / Treatment

Detailed Description

Patient enrolment and omics data collection

The objective of this work package (WP) is to gather information from all the clinical and omics-based data sources considered clinically significant for decision support in the comprehensive lung cancer diagnosis and therapy workflow. A structured terminological system will be developed for prospective data collection through specific Case Report Forms (CRFs).

Patients will be enrolled by the dedicated research enrolment centres, and data obtained from the five omics-based variables will be collected and recorded in a secure database.

Omics data archiving and inter-actionability

The main aim of this WP is to allow complete data integration into both existing and new archiving systems and to ensure an easy and effective use and sharing of collected omics data.

All collected data representing the different omics domains considered will be recorded according to a shared common ontology. The shared general ontology will represent a structured terminological system for data archiving and analysis, in which all the different omics domains will be recorded in a specific eCRF, ensuring coherence across all the collected data variables. Finally, the collected omics-related data will undergo radiomic analysis, and radiomic features will be extracted.

Omics data modelling, Digital Human avatar (DHA) creation and validation

This WP is focused on developing accurate predictive models (by creating Digital Human Avatars (DHAs)) and on their validation. The purpose of this WP is to identify effective primary biomarkers, harmonize them through compact statistical models and subsequently create DHAs that are unique to each patient. We plan to integrate all the aforementioned omics data into predictive models that will form the basis of a fully personalized and innovative integrated decision support system for lung cancer. This WP is divided into three phases:

  • Phase 1: Omics features identification and selection
  • Phase 2: Predictive model development and DHA creation
  • Phase 3: Predictive model and DHA validation

Omics features identification and selection:

In the first step, an exploratory analysis across all collected datasets from approximately 240 NSCLC patients will start the biomarker identification process and narrow the vast amount of information down to a more selected pool of potential biomarkers. This first phase will employ robust data analysis techniques to identify relevant variables in a univariate setting, taking individual statistical distributions, feature-relevant correlations and general descriptive statistics into account.
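
As a minimal sketch of such a univariate screen, assuming a binary outcome and numeric features, each feature can be ranked by the absolute Pearson (point-biserial) correlation with the outcome. The feature names, toy values and the 0.5 threshold are illustrative assumptions and are not taken from the study protocol.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(vx * vy)

def screen_features(features, outcome, threshold=0.5):
    """Return (name, r) pairs with |r| >= threshold, strongest first."""
    scored = [(name, pearson(values, outcome)) for name, values in features.items()]
    kept = [(name, r) for name, r in scored if abs(r) >= threshold]
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy data: 6 patients, binary outcome (1 = event).
outcome = [0, 0, 0, 1, 1, 1]
features = {
    "radiomic_entropy": [1.0, 1.2, 0.9, 2.1, 2.4, 2.0],
    "age_scaled": [0.5, 0.7, 0.4, 0.6, 0.5, 0.6],
    "constant_marker": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}
selected = screen_features(features, outcome)
```

On this toy data only `radiomic_entropy` passes the threshold. A real screen would also need to handle missing values and correct for multiple testing.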

Predictive model development and DHA creation:

The objective of the second phase is to create multiple distinct but modular multivariate models, trained through advanced ML and AI techniques and segmented into specific areas of interest, followed by the creation of the DHA. Different supervised models will be developed, including logistic regression, decision tree, support vector machine, random forest, XGBoost classifier and artificial neural networks. k-fold cross-validation will be used for hyperparameter tuning, and the performance of the ML models will be compared for statistical significance. Predictive performance will be evaluated on accuracy (number of subjects correctly classified out of the total number of patients), precision (true positives out of total test positives), recall (sensitivity), F1 score (2*precision*recall/(precision+recall)) and AUC-ROC.
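
The metrics named in this paragraph follow directly from confusion-matrix counts. The sketch below assumes binary labels coded 0/1 and uses hypothetical toy predictions. The `k_fold_indices` helper is an illustrative, unshuffled split and not the study's actual cross-validation procedure. AUC-ROC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity) and F1 score."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop

# Hypothetical labels and predictions for 8 patients.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
folds = list(k_fold_indices(len(y_true), k=4))
```

In hyperparameter tuning, each candidate setting would be fitted on `train_idx` and scored on `test_idx` for every fold, and the fold scores averaged before candidates are compared.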

The DHA creation will involve integrating specific algorithms into the data extraction pipeline to clean and restructure the flow of data, while applying text mining and natural language processing to unstructured text. The results of this pre-processing will then be recoded through a specifically assigned ontology to reveal duplicates. This leads to the creation of data marts, which will be updated continuously and automatically with new data. Building on the data already processed, the developed algorithm and its underlying infrastructure will be used to classify new patient data entered by clinicians through the interface. The dynamic interface will allow thorough exploration of patient data already present in the database, so that the best course of action can be inferred from historical data and the clinician's experience. This will lead to a more generalized exploration workflow that acts as a hypothesis generator for the user by clustering information on custom criteria, thereby producing an exploratory analysis of the available data.

The investigators estimate that approximately 300 NSCLC cases with complete data will be adequate to start this process. User friendliness and model explainability will serve as the primary standards of the model development strategies. Easily interpretable values such as SHAP (SHapley Additive exPlanations) values will be attached to each model in order to avoid black-box approaches that might make model outputs difficult to explain to patients during their interactions with clinicians.
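
To illustrate what a Shapley attribution reports, the sketch below computes exact Shapley values for a hypothetical three-feature linear risk score by enumerating feature coalitions, with absent features replaced by baseline values. It is neither the SHAP library nor the study's model; the feature names, weights and baseline are assumptions made for illustration only.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    names = list(x)
    n = len(names)

    def value(coalition):
        # Features in the coalition take the patient's value, the rest the baseline.
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

# Hypothetical linear risk score (weights are illustrative only).
WEIGHTS = {"tumour_size_cm": 0.5, "radiomic_entropy": 0.25, "age_decades": 0.125}

def risk_score(features):
    return sum(WEIGHTS[f] * v for f, v in features.items())

patient = {"tumour_size_cm": 4.0, "radiomic_entropy": 2.0, "age_decades": 7.0}
baseline = {"tumour_size_cm": 2.0, "radiomic_entropy": 1.0, "age_decades": 6.0}
phi = shapley_values(risk_score, patient, baseline)
```

For a linear score each attribution reduces to the weight times the difference between the patient's value and the baseline, and the attributions sum to the gap between the patient's score and the baseline score. Exact enumeration grows exponentially with the number of features, which is why approximation methods are used in practice.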

Predictive model and DHA validation:

Both the developed model and the comprehensive DHA will be validated in order to test their robustness, transferability and generalizability. Two consecutive validation strategies will be employed: internal validation followed by external validation. We estimate that a total of approximately 420 NSCLC cases will be needed to start the validation process.
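
One common way to separate the two validation steps, shown here as a sketch under stated assumptions rather than the study's actual design, is to hold out all patients from one enrolling centre for external validation and to split the remaining patients into development and internal-validation sets. The centre labels, the 25% internal hold-out fraction and the fixed seed are hypothetical.

```python
import random

def split_cohort(patients, external_centre, internal_fraction=0.25, seed=0):
    """Split patients into development, internal and external validation sets."""
    # External validation: every patient from the held-out centre.
    external = [p for p in patients if p["centre"] == external_centre]
    remaining = [p for p in patients if p["centre"] != external_centre]

    # Internal validation: a random share of the remaining patients.
    rng = random.Random(seed)
    rng.shuffle(remaining)
    n_internal = round(len(remaining) * internal_fraction)
    internal = remaining[:n_internal]
    development = remaining[n_internal:]
    return development, internal, external

# Hypothetical cohort: 12 patients from two centres, "A" and "B".
patients = [{"id": i, "centre": "A" if i % 3 else "B"} for i in range(12)]
development, internal, external = split_cohort(patients, external_centre="B")
```

Keeping the external set from a separate centre tests transferability, since no patient from that centre contributes to model fitting or tuning.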

Study Type

Observational

Enrollment (Anticipated)

600

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

18 years and older (Adult, Older Adult)

Accepts Healthy Volunteers

No

Genders Eligible for Study

All

Sampling Method

Non-Probability Sample

Study Population

Patients with early-stage non-small cell lung cancer who underwent surgical resection.

Description

Inclusion Criteria:

  • Patients with (suspected) NSCLC
  • Age >18 yrs
  • ECOG 0-3
  • Written Informed Consent

Exclusion Criteria:

  • ECOG 4
  • Psychosocial or emotional conditions contraindicating participation in the study

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Cohorts and Interventions

Group / Cohort: Enrolled patients. Non-small cell lung cancer patients who underwent surgical resection. Part of this cohort will be used to build the predictive models and a second part to validate the created models.
Intervention / Treatment: Surgical resection of the lung cancer

What is the study measuring?

Primary Outcome Measures

Outcome Measure: To develop prevention models for early lung cancer diagnosis

Measure Description: Development of a prognostic model in NSCLC patients using omics data. In particular, the association of radiomic characteristics and biomarkers with lung cancer stage and survival outcome will be determined. The omics data and prognostic models will be tested in terms of disease-free and overall survival, comparing the different models.

Time Frame: 36 months

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Anticipated)

June 1, 2023

Primary Completion (Anticipated)

October 1, 2025

Study Completion (Anticipated)

June 1, 2026

Study Registration Dates

First Submitted

February 16, 2023

First Submitted That Met QC Criteria

April 4, 2023

First Posted (Actual)

April 7, 2023

Study Record Updates

Last Update Posted (Actual)

April 7, 2023

Last Update Submitted That Met QC Criteria

April 4, 2023

Last Verified

February 1, 2023

More Information

Terms related to this study

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

NO

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.
