Impact of GPT Use on Essay Writing Performance and Cognitive Abilities

Updated July 4, 2025 by Simiao Chen, University Hospital Heidelberg

A Randomized Controlled Trial on the Impact of Using Generative Artificial Intelligence on Analytical Writing Performance and Cognitive Abilities

The goal of this randomized controlled lab experiment is to examine whether using generative artificial intelligence (AI) technology affects college students' academic performance and cognitive abilities in the context of analytical writing. The main questions it aims to answer are:

Does using the technology affect students' writing performance?
Does using the technology affect students' cognitive effort during the writing process?

Participants will be randomly assigned to either a control group, which writes without AI assistance, or an experimental group, which writes with the assistance of ChatGPT. Researchers will compare the two groups to see if ChatGPT affects students' writing performance and cognitive effort.
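
The record does not describe the randomization procedure itself. As a minimal sketch only, assuming simple balanced 1:1 allocation, the assignment of the 160 enrolled participants to the two arms could look like the following. The function name `assign_arms`, the arm labels, and the seed are hypothetical and are not taken from the study.

```python
import random


def assign_arms(participant_ids, seed=0):
    """Shuffle participants and split them 1:1 into two arms.

    Assumption: balanced allocation. The arm labels are illustrative only.
    With an odd number of participants, the extra one goes to "control".
    """
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {pid: ("chatgpt" if i < half else "control")
            for i, pid in enumerate(ids)}


# Hypothetical participant IDs 1..160, matching the reported enrollment.
allocation = assign_arms(range(1, 161), seed=2024)
```

Seeding the generator keeps the allocation list reproducible for auditing. A real trial would typically generate and conceal the list before enrollment.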

For each participant, the lab experiment will last no more than 1.5 hours. An eye tracker will monitor the participant's gaze activity and pupil size. A functional near-infrared spectroscopy (fNIRS) device will monitor brain activity in the participant's frontal lobe. During the experiment, participants will be asked to:

Read learning materials on analytical writing techniques.
Based on the previously provided materials, complete an analytical writing assignment that will take approximately 30 minutes either with or without the aid of ChatGPT.
Answer survey questions about their experience with the writing assignment, attitudes on using ChatGPT, and demographic backgrounds.

Study Overview

Status

Completed

Conditions

Intervention / Treatment

Behavioral: GPT Support

Study Type

Interventional

Enrollment (Actual)

160

Phase

Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Locations

Germany
- Heidelberg, Germany
    - Core Facility for Neuroscience of Self-Regulation, Heidelberg University

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult

Accepts Healthy Volunteers

Yes

Description

Inclusion Criteria:

Full-time university student.
Able to read and write in English.
Use the computer most days of the week.
Have not taken, and are not currently preparing for, the Graduate Record Examinations (GRE).
Do not wear glasses (contact lenses are allowed).
Have no eye impairment.
Not currently taking any opioids, epinephrine, or anti-hypertensive drugs.
During the experiment, not wearing any makeup around the eyes.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Other
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Experimental: Intervention arm. In the intervention arm, participants are instructed to use ChatGPT for assistance to complete an analytical writing task.	Behavioral: GPT Support. The computer interface used for the essay writing task follows a split-screen design. The writing instructions and text input field are administered on a survey platform, placed on the left half of the screen. ChatGPT is placed on the right half of the screen for technology assistance.
No Intervention: Control arm. In the control arm, participants are instructed to complete an analytical writing task independently without access to any technology assistance.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Writing Performance	The essay writing task is derived from the Analytical Writing section in the Graduate Record Examinations (GRE), which is a worldwide and standardized computer-based exam developed by the Educational Testing Service (ETS). The participants' essays will be scored on a scale from 0 to 6 by an automatic and validated third-party scoring tool that is also developed by ETS.	1.5 hours
Cognitive Effort Measured by Pupil Size	Cognitive effort is quantified by monitoring changes in pupil size. To achieve this, pupil diameters are recorded throughout the writing task using a near-infrared eye tracker, specifically the Tobii Pro Fusion model. At the start of the experiment, individual baseline pupil diameters are measured during a 30-second relaxation task.	1.5 hours
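
The record states that a baseline pupil diameter is measured for each participant, but not how it enters the analysis. One common option is subtractive baseline correction. The following is a minimal sketch under that assumption. The function name, the sample values, and the millimetre unit are hypothetical.

```python
from statistics import mean


def baseline_corrected(task_samples_mm, baseline_samples_mm):
    """Subtract the mean baseline diameter from each task sample.

    Assumption: subtractive correction. The study may use another method,
    such as divisive correction.
    """
    baseline = mean(baseline_samples_mm)
    return [sample - baseline for sample in task_samples_mm]


# Made-up diameters for illustration only.
baseline_samples = [3.0, 3.1, 2.9]  # from the relaxation task
task_samples = [3.4, 3.6, 3.5]      # from the writing task
changes = baseline_corrected(task_samples, baseline_samples)
mean_change = mean(changes)  # average dilation relative to baseline
```

A positive mean change indicates that pupils were on average more dilated during writing than at rest, which is the quantity read as higher cognitive effort.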

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Self-Perception of Writing Performance	This is a one-item scale: Using the same grading rubric from before, what score do you think your essay should get (0 being the lowest and 6 being the highest)? The score ranges from 0 to 6. A higher score indicates higher self-perceived writing performance. The variable is treated as a continuous variable.	1.5 hours
Self-Perception of Cognitive Effort	This is a one-item Likert scale adapted from the National Aeronautics and Space Administration Task Load Index (NASA-TLX; Hart, 2006; Hart & Staveland, 1988): On a scale of 1 to 7, rate how hard you have to work to accomplish your level of performance. The Likert score ranges from 1 to 7 (1 being "very low" and 7 being "very high"). A higher score indicates higher self-perceived cognitive effort. The variable is treated as a continuous variable. References: Hart, S. G. (2006). NASA-Task Load Index (NASA-TLX); 20 Years Later. Proceedings of the Human Factors and Ergonomics Society Annual Meeting. https://doi.org/10.1177/154193120605000909 Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Advances in Psychology (Vol. 52, pp. 139-183). North-Holland.	1.5 hours
Cognitive Effort Measured by Cortical Hemodynamic Activity in the Frontal Lobe	Cognitive effort is quantified by monitoring changes in the cortical hemodynamic activity in the frontal lobe. To achieve this, brain activity is recorded throughout the writing task using a functional near-infrared spectroscopy (fNIRS) device, specifically the NIRSport2 model.	1.5 hours
Self-Perception of Stress	This is a one-item Likert sub-scale adapted from the Primary Appraisal Secondary Appraisal scale (PASA; Gaab, 2009; Pollak et al., 2020): On a scale of 1 to 7, how much would you agree or disagree with the following statement on perceived stress: The analytical writing assignment was stressful to me. The Likert score ranges from 1 to 7 (1 being "strongly disagree" and 7 being "strongly agree"). A higher score indicates higher self-perceived stress. The variable is treated as a continuous variable. References: Gaab, J. (2009). PASA-Primary Appraisal Secondary Appraisal. A questionnaire for the assessment of cognitive appraisals of situations. Verhaltenstherapie, 19(2), 114-115. Pollak, A., Paliga, M., Pulopulos, M. M., Kozusznik, B., & Kozusznik, M. W. (2020). Stress in manual and autonomous modes of collaboration with a cobot. Computers in Human Behavior, 112, 106469. https://doi.org/10.1016/j.chb.2020.106469	1.5 hours
Self-Perception of Challenge	This is a one-item Likert sub-scale adapted from the Primary Appraisal Secondary Appraisal scale (PASA; Gaab, 2009; Pollak et al., 2020): On a scale of 1 to 7, how much would you agree or disagree with the following statement on perceived challenge: I find the analytical writing assignment a challenge. The Likert score ranges from 1 to 7 (1 being "strongly disagree" and 7 being "strongly agree"). A higher score indicates higher self-perceived challenge. The variable is treated as a continuous variable. References: Gaab, J. (2009). PASA-Primary Appraisal Secondary Appraisal. A questionnaire for the assessment of cognitive appraisals of situations. Verhaltenstherapie, 19(2), 114-115. Pollak, A., Paliga, M., Pulopulos, M. M., Kozusznik, B., & Kozusznik, M. W. (2020). Stress in manual and autonomous modes of collaboration with a cobot. Computers in Human Behavior, 112, 106469. https://doi.org/10.1016/j.chb.2020.106469	1.5 hours
Self-Efficacy in Writing	This is a sixteen-item Likert scale that measures three dimensions of writing self-efficacy: ideation, convention and self-regulation (Bruning et al., 2013). The Likert score ranges from 1 to 7 (1 being "strongly disagree" and 7 being "strongly agree"). A higher score indicates higher self-efficacy. The three dimensions will be treated separately, each as a continuous variable. Reference: Bruning, R., Dempsey, M., Kauffman, D. F., McKim, C., & Zumbrunn, S. (2013). Examining dimensions of self-efficacy for writing. Journal of Educational Psychology, 105(1), 25.	1.5 hours
Situational Interest in Analytical Writing	This is a four-item Likert scale adapted from the situational interest scale (Hulleman et al., 2010). This scale measures participants' situational interest in analytical writing: On a scale of 1 to 7, how much would you agree or disagree with the following statements on your interest in the analytical writing assignment that you just completed? The analytical writing assignment was interesting. Working on the essay was fun. I enjoyed writing the essay. The analytical writing assignment was enjoyable. The Likert score ranges from 1 to 7 (1 being "strongly disagree" and 7 being "strongly agree"). A higher score indicates higher situational interest. The variable is treated as a continuous variable. Reference: Hulleman, C. S., Godes, O., Hendricks, B. L., & Harackiewicz, J. M. (2010). Enhancing interest and performance with a utility value intervention. Journal of Educational Psychology, 102(4), 880.	1.5 hours
Behavioral Intention in Using ChatGPT	This is a two-item Likert scale that measures participants' behavioral intention in using ChatGPT in the future for essay writing tasks (Albayati, 2024): On a scale of 1 to 7, how much would you agree or disagree with the following statements on using ChatGPT in essay writing assignments? If I have access to ChatGPT, I would use it for essay writing tasks. I plan to use ChatGPT in the future if I have an essay writing task. The Likert score ranges from 1 to 7 (1 being "strongly disagree" and 7 being "strongly agree"). A higher score indicates higher behavioral intention in using ChatGPT. The variable is treated as a continuous variable. Reference: Albayati, H. (2024). Investigating undergraduate students' perceptions and awareness of using ChatGPT as a regular assistance tool: A user acceptance perspective study. Computers and Education: Artificial Intelligence, 6, 100203. https://doi.org/10.1016/j.caeai.2024.100203	1.5 hours
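
Several of the multi-item Likert scales above are treated as a single continuous variable, but the record does not say how item responses are combined. A minimal sketch, assuming the scale score is the mean of the item responses, is shown below. The function name `scale_score` and the example responses are hypothetical.

```python
from statistics import mean


def scale_score(responses, low=1, high=7):
    """Return the mean of the item responses after checking their range.

    Assumption: scale score = item mean. The record does not specify
    the aggregation rule.
    """
    if not responses:
        raise ValueError("at least one item response is required")
    for response in responses:
        if not low <= response <= high:
            raise ValueError(f"response {response} outside {low}-{high}")
    return mean(responses)


# Made-up responses for one participant.
interest = scale_score([6, 5, 5, 6])  # four-item situational interest scale
intention = scale_score([7, 6])       # two-item behavioral intention scale
```

Averaging keeps the score on the original 1 to 7 range regardless of the number of items, which makes scales of different lengths easier to compare.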

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

University Hospital Heidelberg

Investigators

Principal Investigator: Till Bärnighausen, Heidelberg Institute of Global Health

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

July 18, 2024

Primary Completion (Actual)

February 20, 2025

Study Completion (Actual)

February 20, 2025

Study Registration Dates

First Submitted

July 15, 2024

First Submitted That Met QC Criteria

July 15, 2024

First Posted (Actual)

July 19, 2024

Study Record Updates

Last Update Posted (Actual)

July 9, 2025

Last Update Submitted That Met QC Criteria

July 4, 2025

Last Verified

July 1, 2025

More Information

Terms related to this study

Other Study ID Numbers

S-117/2024

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product

product manufactured in and exported from the U.S.

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.

