Machine-learning classification of neurocognitive performance in children with perinatal HIV initiating de novo antiretroviral therapy

Robert H Paul, Kyu S Cho, Andrew C Belden, Claude A Mellins, Kathleen M Malee, Reuben N Robbins, Lauren E Salminen, Stephen J Kerr, Badri Adhikari, Paola M Garcia-Egan, Jiratchaya Sophonphan, Linda Aurpibul, Kulvadee Thongpibul, Pope Kosalaraksa, Suparat Kanjanavanit, Chaiwat Ngampiyaskul, Jurai Wongsawat, Saphonn Vonthanak, Tulathip Suwanlerk, Victor G Valcour, Rebecca N Preston-Campbell, Jacob D Bolzenious, Merlin L Robb, Jintanat Ananworanich, Thanyawee Puthanakit, PREDICT Study Group

Abstract

Objective: To develop a predictive model of neurocognitive trajectories in children with perinatal HIV (pHIV).

Design: Machine learning analysis of baseline and longitudinal predictors derived from clinical measures utilized in pediatric HIV.

Methods: Two hundred and eighty-five children (ages 2-14 years at baseline; mean age = 6.4 years) with pHIV in Southeast Asia underwent neurocognitive assessment at study enrollment and twice annually thereafter for an average of 5.4 years. Neurocognitive slopes were modeled to establish two subgroups [above-average (n = 145) and below-average (n = 140) trajectories]. Gradient-boosted multivariate regressions (GBM) with five-fold cross validation were conducted to examine baseline (pre-ART) and longitudinal predictive features derived from demographic, HIV disease, immune, mental health, and physical health indices (i.e. complete blood count [CBC]).
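As an illustration of the subgroup step only, the following minimal sketch fits one least-squares slope per child and splits the cohort at the mean slope. The abstract does not state how slopes were estimated or where the cutoff was placed, so the per-child ordinary least-squares fit, the mean-slope cutoff, and the names `ols_slope` and `split_by_slope` are all assumptions made for this example.

```python
from statistics import mean


def ols_slope(times, scores):
    """Ordinary least-squares slope of scores against times for one child."""
    t_bar, s_bar = mean(times), mean(scores)
    num = sum((t - t_bar) * (s - s_bar) for t, s in zip(times, scores))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den


def split_by_slope(visits):
    """Label each child 'above' or 'below' the cohort's mean slope.

    visits maps a child id to a list of (years_since_enrollment, score)
    pairs. The mean-slope cutoff is an assumption, not the study's rule.
    """
    slopes = {cid: ols_slope(*zip(*obs)) for cid, obs in visits.items()}
    cutoff = mean(list(slopes.values()))
    return {cid: ("above" if s > cutoff else "below") for cid, s in slopes.items()}
```

Each child needs at least two visits at distinct times for the slope to be defined.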

Results: The baseline GBM established a classifier of neurocognitive group designation with an average AUC of 79% built from HIV disease severity and immune markers. GBM analysis of longitudinal predictors with and without interactions improved the average AUC to 87 and 90%, respectively. Mental health problems and hematocrit levels also emerged as salient features in the longitudinal models, with novel interactions between mental health problems and both CD4 cell count and hematocrit levels. Average AUCs derived from each GBM model were higher than results obtained using logistic regression.
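To make the reported metric concrete, the sketch below computes an AUC for each held-out fold and then averages across folds. It assumes that "average AUC" means the unweighted mean of per-fold AUCs; the abstract does not specify the averaging, and the function names `auc` and `mean_cv_auc` are hypothetical. The AUC is computed from its pairwise definition: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count a win for each correctly ordered pair and half a win for a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_cv_auc(folds):
    """Unweighted mean AUC over folds given as (labels, scores) pairs."""
    return sum(auc(y, s) for y, s in folds) / len(folds)
```

Each fold must contain at least one case from each group, otherwise the AUC for that fold is undefined.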

Conclusion: Our findings support the feasibility of machine learning to identify children with pHIV at risk for suboptimal neurocognitive development. Results also suggest that interactions between HIV disease and mental health problems are early antecedents to neurocognitive difficulties in later childhood among youth with pHIV.

Trial registration: ClinicalTrials.gov NCT00234091.

Conflict of interest statement

Conflicts of Interest: Dr. Jintanat Ananworanich received honoraria for participating in advisory meetings for ViiV Healthcare, Gilead, Merck, Roche and AbbVie. Dr. Victor Valcour received honoraria from ViiV Healthcare. No conflicts reported for the remaining authors.

Figures

Fig. 1.
Receiver operating characteristic curves comparing average AUC for the baseline, longitudinal and interactive GBM and logistic regression analyses.
Fig. 2.
Feature importance ranking for the baseline, longitudinal, and interactive GBM models. Baseline model (top panel): HIV RNA (copies/mL), white blood cell count (WBC), CD8 T cell %, lymphocytes, CD8 T cell count, and CD4 T cell count. Longitudinal model (middle panel): CBCL Affective score minimum percent change (min % Δ), CD4 T cell count average value (avg), CBCL Somatic Complaints standard deviation (std), hematocrit (avg), HIV RNA slope, and CBCL Somatic Problems score maximum (max). Interaction model (bottom panel): hematocrit (avg) x CBCL Somatic Problems score (max % Δ), CD4 count at baseline (first) x CBCL Total Score (avg % Δ), CBCL Somatic Complaints score (std) x CBCL Internalizing score (min % Δ), CBCL Withdrawn score (max) x CBCL Withdrawn score (min % Δ), CBCL Total Score (min % Δ) x CBCL Affective score (min % Δ), and CBCL Thought Problems score (max) x CBCL Thought Problems score (avg).
Fig. 3.
Classification performance comparing baseline, longitudinal and interactive GBM to logistic regression.

Source: PubMed