Applying Data Warehousing to a Phase III Clinical Trial From the Fondazione Italiana Linfomi Ensures Superior Data Quality and Improved Assessment of Clinical Outcomes

Gian Maria Zaccaria, Simone Ferrero, Samanta Rosati, Marco Ghislieri, Elisa Genuardi, Andrea Evangelista, Rebecca Sandrone, Cristina Castagneri, Daniela Barbero, Mariella Lo Schirico, Luca Arcaini, Anna Lia Molinari, Filippo Ballerini, Andres Ferreri, Paola Omedè, Alberto Zamò, Gabriella Balestra, Mario Boccadoro, Sergio Cortelazzo, Marco Ladetto

Abstract

Purpose: Data collection in clinical trials is becoming increasingly complex, with a large number of variables that need to be recorded, verified, and analyzed to effectively measure clinical outcomes. In this study, we used data warehouse (DW) concepts to address this complexity. A DW was developed to accommodate data from a large clinical trial, including all the characteristics collected. We present the results related to baseline variables with the following objectives: developing a data quality (DQ) control strategy and improving outcome analysis according to the clinical trial primary end points.

Methods: Data were retrieved from the electronic case reporting forms (eCRFs) of the phase III, multicenter MCL0208 trial (ClinicalTrials.gov identifier: NCT02354313) of the Fondazione Italiana Linfomi for younger patients with untreated mantle cell lymphoma (MCL). The DW was created with a relational database management system. Recommended DQ dimensions were applied to monitor the activity of each site and to manage DQ during patient follow-up. To assess its impact, the DQ management was applied to clinically relevant parameters that predicted progression-free survival.
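
As an illustration of the approach described in the Methods, the following minimal sketch builds a tiny relational DW with a central subject table and computes a completeness index per center. It uses an in-memory SQLite database; the table names, column names (ldh, wbc), sample values, and the definition of completeness as the fraction of non-missing items are all assumptions made for illustration and are not taken from the study.

```python
import sqlite3

# Hypothetical miniature schema: a central subject table linked to a
# center dimension and to one data category (laboratory data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE center (center_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE subject (
    subject_id INTEGER PRIMARY KEY,
    center_id  INTEGER REFERENCES center(center_id)
);
CREATE TABLE laboratory_data (
    subject_id INTEGER REFERENCES subject(subject_id),
    ldh REAL,  -- illustrative variable, may be missing (NULL)
    wbc REAL   -- illustrative variable, may be missing (NULL)
);
""")
conn.executemany("INSERT INTO center VALUES (?, ?)",
                 [(1, "Center A"), (2, "Center B")])
conn.executemany("INSERT INTO subject VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 2)])
conn.executemany("INSERT INTO laboratory_data VALUES (?, ?, ?)",
                 [(1, 210.0, 6.1), (2, None, 7.4), (3, None, None)])


def completeness_by_center(conn):
    """Return {center name: fraction of non-missing laboratory items}."""
    rows = conn.execute("""
        SELECT c.name,
               COUNT(l.ldh) + COUNT(l.wbc) AS filled,   -- COUNT(col) skips NULLs
               2 * COUNT(*)                AS expected  -- two items per subject
        FROM center c
        JOIN subject s         ON s.center_id  = c.center_id
        JOIN laboratory_data l ON l.subject_id = s.subject_id
        GROUP BY c.center_id
        ORDER BY c.center_id
    """).fetchall()
    return {name: filled / expected for name, filled, expected in rows}


completeness = completeness_by_center(conn)
print(completeness)
```

Keeping the index as a query over the warehouse, rather than a value stored per site, means it can be recomputed after each milestone as centers answer queries and fill in missing items.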

Results: The DW encompassed 16 tables, which included 226 variables for 300 patients and 199,500 items of data. The tool allowed cross-comparison analysis and detected some incongruities in eCRFs, prompting queries to clinical centers. This had an impact on clinical end points, as the DQ control strategy was able to improve the prognostic stratification according to single parameters, such as tumor infiltration by flow cytometry, and even using established prognosticators, such as the MCL International Prognostic Index.
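
The kind of cross-comparison that detects incongruities and prompts queries to centers could look roughly like the sketch below. It checks one eCRF record for missing or out-of-range values (plausibility) and compares it with a laboratory source (concordance). The function name, field names, plausibility ranges, and tolerance are hypothetical and chosen only for illustration; the rules actually used in the study are not described in this abstract.

```python
def check_record(ecrf, lab, ranges, tolerance=0.01):
    """Return a list of (field, issue) pairs for one patient record."""
    issues = []
    # Plausibility: each value must be present and inside its allowed range.
    for field, (low, high) in ranges.items():
        value = ecrf.get(field)
        if value is None:
            issues.append((field, "missing"))
        elif not (low <= value <= high):
            issues.append((field, "implausible"))
    # Concordance: the eCRF value must agree with the laboratory source.
    for field, lab_value in lab.items():
        value = ecrf.get(field)
        if value is not None and lab_value is not None:
            if abs(value - lab_value) > tolerance:
                issues.append((field, "discordant"))
    return issues


# Hypothetical ranges and values: a misplaced decimal point in the eCRF.
ranges = {"age": (18, 65), "fc_infiltration_pct": (0.0, 100.0)}
ecrf = {"age": 57, "fc_infiltration_pct": 645.0}
lab = {"fc_infiltration_pct": 6.45}

issues = check_record(ecrf, lab, ranges)
print(issues)
```

Each reported pair could then be turned into a query addressed to the clinical center that entered the record.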

Conclusion: The DW is a powerful tool to organize results from large phase III clinical trials and to improve DQ through the application of engineered tools.

Conflict of interest statement

Simone Ferrero

Consulting or Advisory Role: Janssen-Cilag, EUSA Pharma

Speakers' Bureau: Janssen-Cilag, Gilead Sciences, SERVIER

Research Funding: Gilead Sciences

Travel, Accommodations, Expenses: Roche, SERVIER, Sanofi, Janssen-Cilag, EUSA Pharma, Gentili

Luca Arcaini

Consulting or Advisory Role: Roche, Celgene, Janssen-Cilag, Verastem Oncology

Speakers' Bureau: Celgene

Research Funding: Gilead

Travel, Accommodations, Expenses: Roche, Celgene, Gilead Sciences

Andres Ferreri

Consulting or Advisory Role: Kite-Gilead, Celgene, SERVIER

Research Funding: Celgene (Inst), Roche (Inst)

Travel, Accommodations, Expenses: Gilead Sciences, MolMed, Takeda, Roche

Paola Omedè

Consulting or Advisory Role: Janssen

Mario Boccadoro

Honoraria: Sanofi, Celgene, Amgen, Janssen, Novartis, Bristol-Myers Squibb, AbbVie

Research Funding: Sanofi (Inst), Celgene (Inst), Amgen (Inst), Janssen (Inst), Novartis (Inst), Bristol-Myers Squibb (Inst), Mundipharma (Inst)

Marco Ladetto

Honoraria: AbbVie, Acerta Pharma, Amgen, Archigen Biotech, ADC Therapeutics, Celgene, Gilead Sciences, Johnson & Johnson, Jazz Pharmaceuticals, Pfizer, Roche, Sandoz, Takeda

No other potential conflicts of interest were reported.

Figures

FIG 1.
Structure of the data warehouse (DW) for the collection of data recorded in the electronic case reporting forms (eCRFs) and in the data sources from laboratories during the clinical trial. The DW had a snowflake architecture. The subject table represented the center of the DW design and was directly connected to other categories: Protocol Data, Laboratory Data, Pathologic Data, Clinical Data, Imaging Data, and Minimal Residual Disease (MRD) Data tables. Three auxiliary tables were used in the Imaging Data category to encode the supra-diaphragmatic, sub-diaphragmatic, and extranodal involvements. The MRD Data category contained the information of both IgH and BCL1 biomarkers detected by both nested (qualitative) and real-time quantitative polymerase chain reaction in bone marrow (BM) and peripheral blood (PB) at baseline (mrd_baseline table); the MRD analysis was performed to monitor minimal residual disease on IgH and BCL1 markers from both BM and PB at each restaging (mrd_restaging table) and after the leukapheresis procedure (mrd_lk table). DQ, data quality.
FIG 2.
Results of the data quality (DQ) assessment applied after each milestone. (A-C) Radial graphs of (A) the completeness, (B) the plausibility, and (C) the concordance indexes computed after each milestone for each center. (D) The same indexes represented by bar diagrams divided into small centers (

FIG 3.
Outcome analysis after the data quality (DQ) application for data retrieved from the FIL-MCL0208 clinical study. Progression-free survival (PFS) curves calculated (A) at the beginning of the study (No_DQ time point, n = 277), and (B) after the last milestone (PM-4 time point, n = 300) for the three classes of Mantle Cell Lymphoma International Prognostic Index (MIPI). The log-rank test results are reported in terms of the P values obtained comparing the curves of adjacent classes: low (L-MIPI) versus intermediate classes (I-MIPI; P < .001), I-MIPI versus high classes (H-MIPI; P = .626) for A; L-MIPI versus I-MIPI (P < .001), I-MIPI versus H-MIPI (P = .113) for B. PFS discrimination was based on the infiltration of disease detected by flow cytometry from (C) the No_DQ time point (n = 120) to (D) the PM-4 time point (n = 252). The log-rank test results are reported in terms of P values obtained comparing the curves of adjacent classes: < median versus ≥ median (P < .567) for C (median = 4.45%); < median versus ≥ median (P < .012) for D (median = 6.55%). Significance level set at .05.
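
For readers unfamiliar with the comparison reported in this caption, the following is a minimal sketch of a two-group log-rank test written with the Python standard library only. It is a textbook formulation, not the authors' analysis code, and the survival times used in the example are invented. Each group is given as follow-up times plus event flags (1 = progression observed, 0 = censored).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented example: group A progresses early, group B progresses late.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
                       [10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

With three ordered classes such as L-MIPI, I-MIPI, and H-MIPI, the function would be applied once per pair of adjacent classes and each P value compared with the .05 significance level.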

Source: PubMed
