Prospective Validation of an Artificial Intelligence Tool for Pre-Anesthetic Assessment

December 5, 2025 updated by: Andre Prato Schmidt, Hospital Nossa Senhora da Conceicao
This prospective observational cohort study aims to validate an artificial intelligence (AI) tool designed for pre-anesthetic assessment in Portuguese, tailored to the Brazilian healthcare context. Conducted at two tertiary hospitals, the study will enroll 270 adult patients (aged ≥18 years) scheduled for elective non-cardiac surgeries. Participants will use the AI tool to complete a self-assessment, generating general patient guidance and a detailed medical evaluation (the latter withheld from the evaluating anesthesiologist). A standard pre-anesthetic evaluation will then be performed by an anesthesiologist blinded to the AI results. A third, blinded anesthesiologist will compare the assessments for accuracy, consistency, and risk identification (e.g., ASA classification and perioperative risk models). The primary outcome is concordance between AI and human assessments, measured with Cohen's Kappa. Secondary outcomes include anesthesiologist perceptions of the tool's utility, impact on assessment quality, and patient usability challenges. The study poses minimal risk, with data collected over 24 months, and aims to enhance perioperative safety and efficiency in Brazil.

Study Overview

Detailed Description

Background:

The pre-anesthetic assessment is a critical process preceding surgical procedures, playing a fundamental role in ensuring perioperative safety and quality of care. Over recent decades, anesthesiologists have contributed significantly to improving perioperative safety and quality, as highlighted in the Institute of Medicine's report on quality of health care. This evaluation has been continually refined to enhance clinical outcomes and reduce costs associated with unnecessary preoperative laboratory tests and examinations.

The benefits of pre-anesthetic assessment are well-documented, including increased operational efficiency through early identification of potential complications, optimization of preoperative management for patients with underlying conditions such as diabetes, cardiopulmonary diseases, and renal insufficiency, and minimization of postoperative complications. In Brazil, the Federal Council of Medicine's Resolution No. 1.802/2006 mandates pre-anesthetic evaluation as an essential component for patient safety, recommending it be performed before hospital admission for elective procedures.

Recently, Brazil has experienced a surge in demand for surgical procedures, exacerbated by the COVID-19 pandemic. A 2022 survey by the Oswaldo Cruz Foundation (Fiocruz) revealed a backlog of 910,621 surgeries in the Unified Health System (SUS), with significant deficits in digestive system surgeries (374,475), genitourinary procedures (241,752), circulatory system interventions (104,925), and upper airway, face, head, and neck surgeries (102,352). This pressure on the public health system led to the launch of the National Program for Reducing Waiting Lists in 2024 by the Brazilian federal government. Consequently, there is a need to rethink the flow and process of pre-anesthetic assessments to ensure safe and adequate care for this pent-up demand.

With technological advancements and the increasing volume of available medical data, clinical evaluations have become more complex, requiring faster and more precise decisions. In this context, artificial intelligence (AI) emerges as a promising tool to transform anesthesiology, particularly in pre-anesthetic assessment. AI tools, including Natural Language Processing (NLP) and Large Language Models (LLMs), have shown potential to improve accuracy, efficiency, and personalization of medical care.

Recent studies have demonstrated the positive impact of AI in this area. For instance, NLP for reviewing medical records and identifying relevant preoperative clinical information has shown high concordance (81.24%) between machine and anesthesiologist assessments regarding the presence or absence of conditions, and it identified medical conditions in 16.6% of cases overlooked by the anesthesiologist. Personalized decision support systems using OWL ontologies and semantic web technologies have proven effective in generating risk assessment reports and customized clinical recommendations. Machine learning models for perioperative risk prediction have enabled the creation of individual risk profiles and personalized assessments.

Large Language Models (LLMs), such as GPT, have shown potential to enhance patient communication and education during the surgical journey. A 2024 comparative study of GPT versions indicated its ability to provide accurate and readable responses to patients on anesthesia-related questions. However, studies on the clinical application of AI in anesthesiology involving LLMs remain scarce; a 2023 systematic review identified no studies using this technology. Moreover, most studies and developments have been conducted in international contexts, not addressing the Portuguese language or the specificities of the Brazilian healthcare system. To date, no published studies exist on the development and validation of an LLM tool in Portuguese for pre-anesthetic assessment, particularly considering Brazil's epidemiological profile and healthcare challenges.

Additionally, concerns about the reliability of AI tools in medicine are notable. A recent study published in Nature Medicine revealed that many AI devices approved by the FDA in the United States lack adequate clinical validation. Of 521 FDA-authorized AI devices between 1995 and 2022, only 56% reported some form of clinical validation, with just 28.4% validated prospectively and 4.2% through randomized clinical trials. Notably, 43.4% had no publicly available clinical validation data. These findings underscore the importance of conducting rigorous clinical validation studies, especially in specific contexts, before implementing AI tools in medical practice.

Therefore, this project proposes to develop and prospectively validate an AI tool based on an LLM in Portuguese for pre-anesthetic assessment in Brazil, in a clinical setting. This addresses a gap in the literature and meets the specific needs of the national context. The goal is to enhance the accuracy, personalization, reliability, and efficiency of preoperative evaluations using advanced AI, considering the peculiarities of the Brazilian population and healthcare system. Implementing this tool could potentially transform anesthesiology practice in the country, providing improved decision support and promoting greater surgical safety.

Study Objectives:

Primary Objective:

- To prospectively evaluate the accuracy and consistency of the pre-anesthetic assessment performed by the AI tool compared to assessments conducted by anesthesiologists in a national tertiary hospital.

Secondary Objectives:

  • To determine the level of concordance between the preoperative risk assessment performed by the AI tool and human anesthesiologists, considering aspects such as the American Society of Anesthesiologists (ASA) classification and validated surgical risk models in a national tertiary hospital.
  • To investigate anesthesiologists' perceptions of the AI tool's utility, including their confidence in the results generated by the tool and their willingness to integrate it into clinical practice.
  • To assess whether the use of the AI tool influences the overall quality of the pre-anesthetic assessment, including the detection of risk conditions that may be underestimated or overlooked in conventional human evaluations.
  • To evaluate any difficulties patients face in using the AI tool in a national tertiary hospital.

Study Design:

This is a prospective, observational, longitudinal cohort study conducted at two centers in Brazil: the Anesthesia Service of Hospital Nossa Senhora da Conceição in Porto Alegre and Santa Casa de Ribeirão Preto.

Participants:

Inclusion Criteria:

  • Patients aged 18 years or older.
  • Scheduled for elective non-cardiac surgery at either institution.

Exclusion Criteria:

  • Patients undergoing diagnostic procedures with isolated sedation or local anesthesia.
  • Additional surgical interventions during the same hospitalization: if a patient undergoes more than one intervention, only the major procedure will be considered.

Sample Size:

The sample size was calculated based on the primary outcome of concordance between AI and human preoperative risk assessments using Cohen's Kappa coefficient. Assuming an expected concordance of 80% (based on prior studies showing around 0.80 for perioperative risk assessments between AI and clinicians), with a Kappa value ≥ 0.70 considered substantial concordance, a margin of error of 5%, and a 95% confidence level, the minimum sample size is approximately 246 patients. Accounting for potential losses to follow-up and exclusions (10% margin), the total sample will be 270 patients.
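
The protocol does not state the formula behind the figure of 246. As a rough check, the standard sample size formula for estimating a single proportion, n = z² p (1 - p) / d², reproduces it when p is the expected concordance (0.80), d is the margin of error (0.05), and z is the normal quantile for 95% confidence. The sketch below assumes that formula; the function names are illustrative and not part of the protocol.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p: float, margin: float, confidence: float = 0.95) -> int:
    """Minimum n to estimate a proportion p within +/- margin (assumed formula)."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def inflate_for_losses(n: int, loss_rate: float) -> float:
    """Add a proportional allowance for losses and exclusions (left unrounded)."""
    return n * (1 + loss_rate)


base_n = sample_size_for_proportion(p=0.80, margin=0.05)
target_n = inflate_for_losses(base_n, loss_rate=0.10)
print(base_n)              # 246
print(round(target_n, 1))  # 270.6
```

Under these assumptions the 10% allowance gives 270.6, in line with the planned enrollment of 270. Dividing by (1 - 0.10) instead would give about 273, so the multiplicative convention appears closer to what the protocol reports.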

Procedures:

  1. Surgical Indication:

    • The surgeon determines the patient's surgical indication and records the surgery name according to the procedure code in the Ex-Care risk model.
  2. Use of the AI Tool:

    • The surgeon instructs the patient to use the AI tool for pre-anesthetic assessment. The patient accesses the tool and completes the requested information.
  3. Processing of the Assessment:

    • The tool processes the patient's assessment, generating two types of results:
      • General Guidance: A set of general instructions sent directly to the patient to aid in surgical preparation.
      • Specific Assessment: A detailed evaluation, including recommendations and warning signs, generated for medical use but not made available to the anesthesiologist performing the preoperative evaluation.
  4. Pre-Anesthetic Evaluation:

    • The patient undergoes a traditional pre-anesthetic evaluation by an anesthesiologist, who will not have access to the AI-generated assessment.
  5. Comparison of Assessments:

    • A third anesthesiologist, blinded to the prior process, will compare the two assessments (AI tool versus human evaluation). This comparison will consider the quality of collected information, precision of clinical judgment, and identification of potential risks or complications.

Data Collection:

Data will be collected prospectively, including variables such as age, sex, surgery code, and results from both assessments (AI tool and anesthesiologist). A data collection form (attached in the protocol after references) outlines the variables included in this research protocol.

Outcome Measures:

Primary Outcome:

- Concordance between preoperative risk assessments by the AI tool and the human anesthesiologist, in terms of quality of collected information and clinical judgment.

Secondary Outcomes:

  • Level of agreement on ASA classification and validated surgical risk models.
  • Anesthesiologists' perceptions of the tool's utility, confidence, and integration potential (assessed via surveys).
  • Impact on assessment quality, including detection of overlooked risks.
  • Patient-reported difficulties in using the AI tool (assessed via feedback).

Statistical Analysis:

Concordance tests will be used to compare preoperative risk assessment results between the AI tool and the anesthesiologist. Cohen's Kappa coefficient will measure agreement for categorical variables (information quality and clinical judgment). For continuous variables (e.g., age), Student's t-test or the Mann-Whitney U test will be applied, depending on data distribution (normality checked with the Shapiro-Wilk test). Statistical significance will be set at P < 0.05.
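
To make the agreement statistic concrete, the following is a minimal sketch of unweighted Cohen's Kappa, computed as (observed agreement - chance agreement) / (1 - chance agreement). The function name and the six ASA classifications are hypothetical and serve only as an illustration; they are not study data, and the protocol does not specify the software that will be used.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / n ** 2
    if p_chance == 1.0:
        return math.nan  # both raters used one identical label; kappa is undefined
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ASA classes for six patients (illustrative data only).
ai_asa = ["I", "I", "II", "II", "III", "III"]
anesthesiologist_asa = ["I", "I", "II", "III", "III", "III"]
kappa = cohen_kappa(ai_asa, anesthesiologist_asa)
print(round(kappa, 2))  # 0.75
```

In this toy example the raters agree on five of six patients (observed agreement 0.83) while chance agreement is 0.33, giving a Kappa of 0.75. This is the unweighted form; the protocol does not say whether a weighted variant would be applied to ordinal scales such as ASA class.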

Timeline:

The study will span 36 months: 24 months for data collection, 6 months for analysis, and 6 months for result formulation and publication.

Funding:

Costs for data collection materials, computers, statistical software, and documentation will be covered by the involved authors. Budget details include: folders for storage (R$250.00), data collection sheets (R$500.00), and analysis software (R$1,000.00).

Dissemination:

Results will be stored in a secure, confidential database and submitted for publication in an indexed scientific journal upon completion.

Study Type

Observational

Enrollment (Estimated)

270

Contacts and Locations

Study Locations

    • Rio Grande do Sul
      • Porto Alegre, Rio Grande do Sul, Brazil, 91787-400
        • Hospital Nossa Senhora da Conceição (Grupo Hospitalar Conceição)

Participation Criteria

Eligibility Criteria

Ages Eligible for Study

  • Adult
  • Older Adult

Accepts Healthy Volunteers

No

Sampling Method

Non-Probability Sample

Study Population

The study population consists of adult patients (aged ≥ 18 years) scheduled for elective non-cardiac surgeries at two tertiary hospitals in Brazil. This population represents a typical cohort in a Brazilian public health system context, including individuals with varying comorbidities and surgical needs, such as those arising from the backlog of procedures in the Unified Health System (SUS) due to the COVID-19 pandemic. The focus is on patients requiring pre-anesthetic assessment for non-emergent, non-cardiac interventions, reflecting common surgical demands in digestive, genitourinary, circulatory, and upper airway/head/neck procedures. The estimated sample size is 270 participants to ensure adequate statistical power for evaluating the concordance between AI-based and human pre-anesthetic assessments.

Description

Inclusion Criteria:

  • Patients aged 18 years or older
  • Patients scheduled for elective non-cardiac surgeries

Exclusion Criteria:

  • Patients undergoing diagnostic procedures with isolated sedation or local anesthesia
  • If a patient undergoes more than one surgical intervention during the same hospitalization, only the major procedure will be considered (i.e., additional procedures during the same admission are not eligible for separate inclusion).

Study Plan

What is the study measuring?

Primary Outcome Measures

Outcome Measure
Preoperative Risk Assessment

Measure Description
The primary outcome is the level of concordance between the preoperative risk assessments performed by the artificial intelligence (AI) tool, based on a Large Language Model (LLM) in Portuguese, and those conducted by a human anesthesiologist. This concordance is evaluated in terms of the quality of information collected (e.g., completeness and relevance of patient data) and the precision of clinical judgment (e.g., identification of perioperative risks, American Society of Anesthesiologists (ASA) classification, and alignment with validated surgical risk models such as Ex-Care). Measurement Method: A third anesthesiologist, blinded to both the AI and human evaluations, will compare the assessments. The concordance will be quantified using Cohen's Kappa coefficient for categorical variables (e.g., quality of information and clinical judgment).

Time Frame
The assessments are conducted preoperatively for each participant, with data collected during the preoperative evaluation phase. The comparison is performed during the data analysis phase, after the 24-month data collection period.

Study record dates

Study Major Dates

Study Start (Estimated)

March 1, 2026

Primary Completion (Estimated)

June 30, 2027

Study Completion (Estimated)

June 30, 2028

Study Registration Dates

First Submitted

December 5, 2025

First Submitted That Met QC Criteria

December 5, 2025

First Posted (Actual)

December 18, 2025

Study Record Updates

Last Update Posted (Actual)

December 18, 2025

Last Update Submitted That Met QC Criteria

December 5, 2025

Last Verified

October 1, 2025

More Information

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

NO

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No
