NLP Analysis of Weekly Narratives for Dynamic Clinical Assessment in SUD (DYNA-NLP)

May 14, 2026 updated by: Lauro Gutiérrez Castro

Analysis of Weekly Narratives Using Natural Language Processing in Patients Undergoing Residential Rehabilitation: A Hybrid Approach for the Dynamic Assessment of Clinical Processes

This prospective observational study follows adults undergoing residential rehabilitation for severe substance use disorders at a specialized treatment center in Mexico. Participants provide weekly written narratives describing their emotions, challenges, coping strategies, and treatment experiences, and complete validated psychological questionnaires every two weeks, including the Generalized Anxiety Disorder-7 (GAD-7), Environmental Reward Observation Scale (EROS), Automatic Thoughts Questionnaire-8 (ATQ-8), and Behavioral Activation for Depression Scale (BADS).

The study applies natural language processing (NLP) and machine learning methods to analyze participants' narratives and identify emotional, cognitive, and behavioral patterns associated with clinical change over time. Narrative-derived features are combined with questionnaire scores to generate a dynamic clinical risk representation that may help detect early signs of psychological worsening or improvement during residential treatment.

Participants continue receiving standard residential care, and the study does not modify treatment decisions or clinical interventions. Up to 35 participants with sufficient longitudinal follow-up data will be included in the primary analysis. Data collection is expected to continue through September 2026.

Study Overview

Status

Recruiting

Conditions

Intervention / Treatment

Other: Residential rehabilitation as usual

Detailed Description

Patients undergoing treatment for substance use disorders frequently describe changes in mood, motivation, hopelessness, cognitive rigidity, emotional distress, and coping strategies through spontaneous written language. These narratives may contain clinically meaningful indicators associated with psychological deterioration, treatment progress, or relapse vulnerability. However, systematic manual analysis of longitudinal written narratives is difficult to implement in routine clinical practice because of the volume and complexity of the data.

Recent developments in natural language processing (NLP), representation learning, and deep learning provide methods for extracting quantitative linguistic and semantic information from written text. Integrating these features with repeated psychometric assessments may support the development of longitudinal models capable of characterizing changes in mental health status over time in patients receiving residential addiction treatment.

Objectives

The objectives of this study are:

To extract semantic, emotional, and linguistic features from weekly patient narratives using NLP methods, including sentence embeddings, sentiment and emotion classification, and semantic similarity analyses based on Acceptance and Commitment Therapy (ACT) constructs.

To integrate narrative-derived variables with repeated psychometric measures (GAD-7, EROS, ATQ-8, and BADS) using a multimodal deep learning autoencoder that generates a low-dimensional representation of longitudinal clinical status.

To evaluate, using leave-one-patient-out cross-validation, whether latent representations generated by the autoencoder are associated with periods of clinical worsening or clinical improvement over time.

To develop a dynamic longitudinal risk representation capable of estimating future changes in automatic negative thoughts, measured through subsequent ATQ-8 scores.

Study Design

This study is a prospective observational cohort conducted at a single residential rehabilitation center in Mexico.

Eligible participants are men aged 18 years or older with a diagnosis of severe substance use disorder who are admitted for residential treatment (all participants in the center's residential program are male). Individuals with active psychotic symptoms or cognitive impairment that substantially interferes with the ability to complete written narratives are excluded.

Participants complete:

Weekly digital written narratives using open-ended prompts focused on emotions, challenges, coping responses, interpersonal experiences, and perceived treatment progress.

Biweekly administration of the following validated self-report instruments:

GAD-7

EROS

ATQ-8

BADS

All information is collected through digital forms integrated into routine clinical monitoring procedures at the treatment center.

Participants may contribute data for up to 20 weeks, depending on duration of residential stay. The study began on 25 May 2025, and primary completion is anticipated in September 2026.

Up to 35 participants with sufficient longitudinal observations will be included in the primary analytic cohort.

NLP and Machine Learning Pipeline

Text preprocessing

Narratives are written in Spanish and undergo preprocessing procedures that include:

Minimum length filtering

Normalization

Consolidation of narrative fields into a single weekly text sample

Tokenization and linguistic annotation
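The preprocessing steps listed above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the function names, the 20-word threshold, the regex tokenizer, and the order of operations are all assumptions, and the linguistic annotation step (done with udpipe in the study) is left out.

```python
import re
import unicodedata

MIN_WORDS = 20  # hypothetical threshold; the record does not state the cutoff


def normalize(text: str) -> str:
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def consolidate(fields: list[str]) -> str:
    """Join the non-empty narrative fields of one week into a single text."""
    cleaned = [normalize(f) for f in fields if f and f.strip()]
    return " ".join(cleaned)


def tokenize(text: str) -> list[str]:
    # \w matches accented Spanish letters in Python 3 string patterns.
    return re.findall(r"\w+", text.lower())


def preprocess_week(fields: list[str], min_words: int = MIN_WORDS):
    """Return (text, tokens) for one weekly sample, or None if it is too short."""
    text = consolidate(fields)
    tokens = tokenize(text)
    if len(tokens) < min_words:
        return None
    return text, tokens
```

Calling `preprocess_week` with the answers to one week's prompts returns a single consolidated text plus its tokens, or `None` when the sample falls under the length threshold and would be dropped.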

Feature extraction

Narrative features include:

Sentiment classification (positive, neutral, negative)

Emotion probabilities for joy, sadness, anger, fear, surprise, and disgust using the pysentimiento library

Linguistic variables, obtained through udpipe, including:

Type-token ratio

Mean sentence length

Proportion of first-person pronouns

Proportion of past-tense verbs
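Three of the linguistic variables above can be approximated without a parser, which helps show what each one measures. The sketch below is an assumption-laden stand-in: it uses a regex tokenizer and a short hand-written pronoun list (`FIRST_PERSON`, hypothetical and non-exhaustive) instead of udpipe annotations, and it omits the past-tense proportion because that requires morphological tagging.

```python
import re

# Hypothetical, non-exhaustive list of Spanish first-person pronouns and possessives.
FIRST_PERSON = {
    "yo", "me", "mi", "mí", "mis", "conmigo",
    "nosotros", "nosotras", "nos",
    "nuestro", "nuestra", "nuestros", "nuestras",
}


def linguistic_features(text: str) -> dict[str, float]:
    """Type-token ratio, mean sentence length (in tokens), first-person ratio."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"\w+", text.lower())
    if not tokens or not sentences:
        return {"ttr": 0.0, "mean_sentence_length": 0.0, "first_person_ratio": 0.0}
    return {
        "ttr": len(set(tokens)) / len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences),
        "first_person_ratio": sum(t in FIRST_PERSON for t in tokens) / len(tokens),
    }
```

Type-token ratio is sensitive to text length, so comparisons across weeks are more meaningful when narratives are of similar size.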

Semantic similarity measures between patient narratives and ACT-related prototype domains including:

Experiential avoidance

Cognitive fusion

Rule-governed behavior

Helplessness

Achievement orientation

Hopefulness

Semantic similarity is computed through cosine similarity between sentence-transformer embeddings and predefined prototype centroids.
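The similarity computation described above reduces to averaging the prototype embeddings of each domain and taking a cosine against the narrative embedding. The following sketch assumes the embeddings are already available as plain lists of floats; the toy three-dimensional vectors and the function names are illustrative only, and no sentence-transformer model is loaded.

```python
import math


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def domain_similarities(narrative_vec, prototypes):
    """Map each domain name to the cosine between the narrative and its centroid."""
    return {
        domain: cosine_similarity(narrative_vec, centroid(vecs))
        for domain, vecs in prototypes.items()
    }
```

Raw cosine similarity lies between -1 and 1. Any rescaling or clipping to a narrower range is a separate step that this sketch does not attempt.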

Principal component analysis (PCA) is applied to selected linguistic and semantic variables to derive a principal component representing orientation toward internal emotional experience versus external contextual events.

Autoencoder Architecture

The multimodal model receives concatenated longitudinal feature vectors after robust scaling.
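Robust scaling usually means centering each feature on its median and dividing by its interquartile range, so that a few extreme weeks do not dominate the scale. The record does not give the exact variant used; the sketch below assumes that common definition and uses only the standard library.

```python
import statistics


def robust_scale(values):
    """Scale one feature as (x - median) / IQR; needs at least two values."""
    median = statistics.median(values)
    # "inclusive" quartiles use linear interpolation between observed points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        # Constant (or nearly constant) feature: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - median) / iqr for v in values]
```

With cross-validation, the median and interquartile range would normally be estimated on the training patients only and then applied to the held-out patient, to avoid leaking information across folds.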

The architecture includes:

A bidirectional long short-term memory (BiLSTM) encoder

Multi-head attention mechanisms

Variational regularization using a β-variational autoencoder (β-VAE) framework

A decoder trained to reconstruct the original temporal feature sequence

An auxiliary classification component predicting next-period clinical worsening

The training objective combines:

Mean squared reconstruction error

Focal loss for classification

Kullback-Leibler divergence

Optional temporal smoothness regularization
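How the four terms of the objective fit together can be shown with scalar functions. This is a sketch under stated assumptions, not the study's training code: the weights `beta`, `lam_cls`, and `lam_smooth`, the focal-loss defaults (`gamma=2.0`, `alpha=0.25`), and the form of the smoothness penalty (mean squared difference between consecutive latent vectors) are all hypothetical choices.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for predicted probability p and label y in {0, 1}."""
    p = min(max(p, 1e-7), 1 - 1e-7)  # clamp to keep the logarithm finite
    p_t = p if y == 1 else 1 - p
    alpha_t = alpha if y == 1 else 1 - alpha
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)


def kl_divergence(mu, log_var):
    """KL between N(mu, diag(exp(log_var))) and N(0, I), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def smoothness(latents):
    """Mean squared difference between consecutive latent vectors."""
    if len(latents) < 2:
        return 0.0
    diffs = [
        sum((a - b) ** 2 for a, b in zip(z1, z0))
        for z0, z1 in zip(latents, latents[1:])
    ]
    return sum(diffs) / len(diffs)


def total_loss(recon, focal, kl, smooth=0.0, beta=1.0, lam_cls=1.0, lam_smooth=0.0):
    """Weighted sum of the four terms; all weights here are hypothetical."""
    return recon + lam_cls * focal + beta * kl + lam_smooth * smooth
```

In a β-VAE, `beta` scales the Kullback-Leibler term. Setting `lam_smooth` to zero switches off the optional smoothness term.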

Model performance is evaluated using leave-one-patient-out cross-validation procedures. A final model may subsequently be trained using the complete dataset.
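Leave-one-patient-out cross-validation holds out every observation from one patient per fold, so that weeks from the same person never appear in both training and test data. A minimal sketch of the fold construction follows; the function name and the string patient identifiers are illustrative.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out, train_idx, test_idx), one fold per distinct patient."""
    for held_out in sorted(set(patient_ids)):
        train_idx = [i for i, pid in enumerate(patient_ids) if pid != held_out]
        test_idx = [i for i, pid in enumerate(patient_ids) if pid == held_out]
        yield held_out, train_idx, test_idx
```

Here `patient_ids` has one entry per weekly observation, so a cohort of 35 patients would produce 35 folds.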

Primary Analyses

Primary analyses include:

Evaluation of discriminative performance for prediction of subsequent clinical worsening using area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), F1 score, and Matthews correlation coefficient (MCC)

Examination of associations between latent trajectory representations and longitudinal psychometric changes

Development of a Composite Clinical Risk Index (CCRI) derived from latent representations

Comparison of CCRI trajectories with weeks classified as clinical worsening according to predefined multimodal criteria
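Three of the discrimination metrics named above have short closed-form definitions, sketched here with the standard library. The function names are illustrative, labels are assumed to be coded 1 for a worsening week and 0 otherwise, and AUPRC is left out for brevity.

```python
import math


def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

When worsening weeks are rare, MCC and AUPRC tend to be more informative than accuracy-style summaries, which is one plausible reason for reporting them alongside AUROC.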

Ethics and Dissemination

The study has received approval from the institutional review board of Under The Tree Miller A.C.

All participants provide written informed consent prior to participation.

Participation does not alter or replace standard residential treatment. All clinical decisions remain under the responsibility of treating professionals independent of study procedures.

Study findings will be submitted for publication in peer-reviewed scientific journals regardless of outcome. De-identified datasets and analysis code may be made available upon reasonable request and in accordance with institutional and ethical requirements.

Recruitment Status

Recruitment is ongoing. Final data collection for the primary outcome is anticipated in September 2026.

Study Type

Observational

Enrollment (Estimated)

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Name: Ricardo Fernandez
Phone Number: +52 1 33 1544 5474

Study Contact Backup

Name: Lauro Gutiérrez Castro
Phone Number: +523314696107
Email: saraqael_sefer@hotmail.com

Study Locations

Mexico
- Jalisco
  - Potrerillos, Jalisco, Mexico, 45815
    - Recruiting
    - Under The Tree Potrerillos
    - Contact: Lauro Gutiérrez Castro
      Phone Number: +523314696107
      Email: saraqael_sefer@hotmail.com

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

Sampling Method

Non-Probability Sample

Study Population

Adult males (≥18 years) with severe substance use disorder, residing in a single residential rehabilitation center in Mexico. Participants are enrolled consecutively as they enter the program and meet eligibility criteria. The sample is non-probabilistic, reflecting the real-world clinical population of this specific center.

Description

Inclusion Criteria:

Clinical diagnosis of severe substance use disorder (polydrug use, including cocaine, methamphetamines, alcohol, and/or cannabis), confirmed by the center's admission assessment.
Male sex (all participants in the center's residential program are male).
Age 18 years or older.
Current resident of the participating residential rehabilitation center in Mexico.
Completed at least four weeks of residential treatment at the time of study enrollment.
Able to write coherent weekly narratives in Spanish (no severe cognitive impairment or active psychosis).
Willing to provide written informed consent.

Exclusion Criteria:

Presence of acute psychotic symptoms that interfere with the ability to write or understand the study procedures.
Severe cognitive impairment (e.g., due to traumatic brain injury, intellectual disability) that prevents meaningful narrative production.
Inability to comply with weekly narrative writing (e.g., illiteracy, severe visual impairment).
Planned discharge from the residential program within 4 weeks of enrollment.
Enrollment in another interventional clinical trial that could confound the interpretation of outcomes.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Number of groups / cohorts

Cohorts and Interventions

Group / Cohort

Intervention / Treatment

Residential rehabilitation cohort

Adult men (≥18 years) with severe substance use disorder, admitted to a residential rehabilitation center in Mexico. All participants receive the center's standard, multimodal treatment program, which includes group therapy, individual counseling, occupational activities, 12-step facilitation, and relapse prevention education. The study does not assign, modify, or withhold any component of this program. Participants are followed prospectively for the duration of their stay (10-20 weeks).

Other: Residential rehabilitation as usual

Participants follow the standard residential rehabilitation program provided by the center. This is a naturalistic exposure; the study does not impose any additional intervention. The "intervention" of interest is the routine therapeutic environment and its associated psychological processes (e.g., changes in avoidance, cognitive fusion, hope, and behavioral activation). These processes are measured through weekly written narratives and biweekly validated clinical scales (GAD-7, EROS, ATQ-8, BADS).

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Change in negative automatic thoughts measured by the Automatic Thoughts Questionnaire 8-item version	The Automatic Thoughts Questionnaire 8-item version is administered every two weeks. Total scores range from 8 to 40, with higher scores indicating more frequent negative automatic thoughts. The primary outcome is the change from baseline in total score over the course of residential treatment.	Every 2 weeks from baseline until discharge from residential treatment (up to 20 weeks)
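The primary outcome is simple arithmetic on biweekly totals, as the sketch below shows. It assumes each of the eight items is scored from 1 to 5, which is inferred from the stated 8 to 40 total range rather than given in the record; the function names are illustrative.

```python
ITEM_MIN, ITEM_MAX, N_ITEMS = 1, 5, 8  # inferred from the stated 8 to 40 total range


def atq8_total(items):
    """Sum eight item responses after checking count and range."""
    if len(items) != N_ITEMS:
        raise ValueError("ATQ-8 requires exactly 8 item responses")
    if any(not ITEM_MIN <= i <= ITEM_MAX for i in items):
        raise ValueError("item response outside the assumed 1 to 5 range")
    return sum(items)


def change_from_baseline(totals):
    """Differences between each follow-up total and the first (baseline) total."""
    baseline = totals[0]
    return [t - baseline for t in totals[1:]]
```

A negative change means fewer negative automatic thoughts than at baseline, since higher totals indicate more frequent negative thoughts.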

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Change in generalized anxiety symptoms measured by the Generalized Anxiety Disorder 7-item scale	The Generalized Anxiety Disorder 7-item scale is administered every two weeks. Total scores range from 0 to 21, with higher scores indicating greater anxiety symptom severity.	Every 2 weeks from baseline until discharge (up to 20 weeks)
Change in environmental reward measured by the Environmental Reward Observation Scale	The Environmental Reward Observation Scale is administered every two weeks. Total scores range from 10 to 40, with higher scores indicating greater exposure to positive environmental reinforcement.	Every 2 weeks from baseline until discharge (up to 20 weeks)
Change in behavioral activation and avoidance measured by the Behavioral Activation for Depression Scale	The Behavioral Activation for Depression Scale is administered every two weeks. Higher scores on the Activation subscale indicate greater engagement in goal-directed activity; higher scores on the Avoidance/Rumination subscale indicate greater behavioral avoidance and rumination.	Every 2 weeks from baseline until discharge (up to 20 weeks)
Weekly self-reported emotional intensity	Participants report the intensity of their predominant weekly emotion on a 0 to 10 numeric rating scale, where higher scores indicate greater emotional intensity.	Weekly from baseline until discharge (up to 20 weeks)
Weekly self-reported craving intensity	Participants rate average craving for substance use during the prior week on a 0 to 10 numeric rating scale, where higher scores indicate greater craving intensity.	Weekly from baseline until discharge (up to 20 weeks)

Other Outcome Measures

Outcome Measure	Measure Description	Time Frame
Narrative sentiment valence index derived from weekly written narratives	Sentiment valence is calculated from weekly written narratives using a validated Spanish-language sentiment classification model. Scores range from -1 to +1, with lower scores indicating more negative emotional valence.	Weekly from baseline until discharge (up to 20 weeks)
Semantic similarity to Acceptance and Commitment Therapy-related constructs	Semantic similarity scores are computed between weekly narratives and prototype domains (e.g., experiential avoidance, cognitive fusion) using sentence embedding methods. Scores range from 0 to 1 (higher = greater similarity).	Weekly from baseline until discharge (up to 20 weeks)
Composite Clinical Risk Index derived from longitudinal narrative features	A composite longitudinal risk score is calculated using multiple narrative-derived emotional variability indicators. Higher scores indicate greater estimated clinical vulnerability over time.	Weekly from baseline until discharge (up to 20 weeks)
Linguistic markers extracted from weekly written narratives	Linguistic variables including lexical diversity (type-token ratio), first-person pronoun frequency, and past-tense verb proportion are extracted from weekly written narratives using automated natural language processing (NLP) procedures.	Weekly from baseline until discharge (up to 20 weeks)

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Lauro Gutiérrez Castro

Investigators

Principal Investigator: Lauro Gutiérrez Castro, Under The Tree

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

May 25, 2025

Primary Completion (Estimated)

September 1, 2026

Study Completion (Estimated)

September 1, 2026

Study Registration Dates

First Submitted

May 9, 2026

First Submitted That Met QC Criteria

May 14, 2026

First Posted (Actual)

May 19, 2026

Study Record Updates

Last Update Posted (Actual)

May 19, 2026

Last Update Submitted That Met QC Criteria

May 14, 2026

Last Verified

May 1, 2026

More Information

Terms related to this study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

SISAP-TUS-HYB/WN-01

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

YES

IPD Plan Description

De-identified individual participant data (IPD) from the weekly narratives and biweekly clinical scales will be shared after removal of all direct identifiers (name, date of birth) and conversion of exact dates to study day/week numbers. Only aggregated or pseudonymized data will be shared.
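Converting exact dates to study day and week numbers can be done relative to each participant's baseline date. The sketch below is illustrative only: the record field names (`name`, `date_of_birth`, `date`) are hypothetical, and counting baseline as day 1 of week 1 is an assumed convention.

```python
from datetime import date


def to_study_time(event_date: date, baseline_date: date) -> dict[str, int]:
    """Replace a calendar date with study day and week (baseline = day 1, week 1)."""
    offset = (event_date - baseline_date).days
    if offset < 0:
        raise ValueError("event precedes baseline")
    return {"study_day": offset + 1, "study_week": offset // 7 + 1}


def deidentify(record: dict, baseline_date: date) -> dict:
    """Drop direct identifiers and swap the calendar date for study day/week."""
    direct_identifiers = {"name", "date_of_birth"}  # hypothetical field names
    out = {
        k: v for k, v in record.items()
        if k not in direct_identifiers and k != "date"
    }
    out.update(to_study_time(record["date"], baseline_date))
    return out
```

For example, an assessment dated 14 days after baseline would be stored as study day 15 in study week 3, with no calendar date retained.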

IPD Sharing Time Frame

IPD and supporting information will be available starting 6 months after publication of the primary results and will remain available for 5 years.

IPD Sharing Access Criteria

Data will be shared upon reasonable request to the corresponding author. Requesters must sign a data use agreement that prohibits re-identification and restricts use to replicating the published analyses or conducting secondary analyses approved by the study's ethics committee. De-identified data will be provided in CSV format; analytic code will be provided as R and Python scripts via a public repository (e.g., GitHub).

IPD Sharing Supporting Information Type

STUDY_PROTOCOL
SAP
ICF
ANALYTIC_CODE

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product


