Performance Comparison of Large Language Models in TAP Block Ultrasound Interpretation

February 2, 2026 updated by: Engin Ihsan Turan, Kanuni Sultan Suleyman Training and Research Hospital

Performance Comparison of Large Language Models in TAP Block Ultrasound Interpretation: A Double-Blind Prospective Study

The goal of this study is to learn how accurately two artificial intelligence (AI) models, Gemini 2.5 Pro and ChatGPT-5.1, can interpret ultrasound videos of the Transversus Abdominis Plane (TAP) block, a regional anesthesia technique used for pain control after surgery.

The main questions this study aims to answer are:

How accurately can each AI model identify anatomical structures on TAP block ultrasound videos?
Can the AI models correctly evaluate the spread of local anesthetic and determine whether the block is successful?
How closely do the AI models' answers match the evaluations of expert anesthesiologists?

No additional procedures will be performed on patients. TAP blocks will be done as part of routine clinical care, and the ultrasound videos will be recorded and de-identified.

Participants will not need to do anything extra for the study. Experienced anesthesiologists will review the videos and provide expert answers. The AI models will be given the same videos and asked the same questions. A second expert, who does not know which answers came from humans or AI, will compare all responses.

The results will help researchers understand whether advanced AI systems can safely support clinicians in interpreting ultrasound-guided regional anesthesia procedures and improve education and decision-making in anesthesia practice.

Study Overview

Status

Recruiting

Conditions

Artificial Intelligence

Intervention / Treatment

Detailed Description

This study aims to evaluate how two advanced artificial intelligence (AI) models, Gemini 2.5 Pro and ChatGPT-5.1, interpret ultrasound videos of Transversus Abdominis Plane (TAP) block procedures. TAP blocks are performed as part of routine clinical care by experienced anesthesiologists. The ultrasound videos recorded during these procedures serve as the data source for this study. No additional procedures or patient involvement are required beyond standard care.

Ultrasound Video Processing

All ultrasound recordings will be fully de-identified by removing patient names, dates, and any other identifying information.

Gemini 2.5 Pro will receive original video files. ChatGPT-5.1 will receive high-resolution GIF segments generated from the same recordings.
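
The GIF preparation step for ChatGPT-5.1 could be sketched as follows. The record does not state which tool, frame rate, resolution, or segment length the investigators use. The choice of ffmpeg, the 10 fps rate, the 640-pixel width, the file names, and the function name `build_gif_command` are all assumptions made for illustration.

```python
from typing import List


def build_gif_command(src: str, dst: str, start_s: float, duration_s: float,
                      fps: int = 10, width: int = 640) -> List[str]:
    """Build an ffmpeg argument list that cuts one segment and writes a GIF.

    Assumption: ffmpeg is the conversion tool; fps and width are illustrative.
    """
    # fps= lowers the frame rate; scale=W:-1 keeps the aspect ratio;
    # lanczos selects a high-quality resampling algorithm.
    video_filter = f"fps={fps},scale={width}:-1:flags=lanczos"
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-ss", str(start_s),      # seek to the start of the segment
        "-t", str(duration_s),    # read this many seconds of input
        "-i", src,
        "-vf", video_filter,
        dst,
    ]


# Hypothetical file names; the command is only built here, not executed.
cmd = build_gif_command("tap_block_001.mp4", "tap_block_001_seg1.gif", 5, 8)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed. Because the two models receive different input formats, the conversion settings would need to be held constant across all recordings.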

Both models will be given identical structured prompts consisting of eight clinically relevant questions about anatomic structures, needle placement, local anesthetic spread, dermatomal effects, and potential safety concerns.

Expert Participation

Two anesthesiology experts will participate independently:

Expert A will review each ultrasound video and answer the same set of eight clinical questions. These answers will serve as the primary human clinical reference.

Expert B will independently evaluate all responses (those from Expert A, Gemini 2.5 Pro, and ChatGPT-5.1) after they have been anonymized and randomly ordered. Expert B will not know whether a response originated from an AI model or a human expert. Expert B will assess anatomical accuracy, clarity, clinical appropriateness, and overall content quality for each answer.
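
The anonymization and random ordering described for Expert B's review could be sketched as below. The record does not describe the actual procedure; the neutral codes (`R1`, `R2`, ...), the seeded shuffle, and the function name `blind_responses` are assumptions for illustration.

```python
import random
from typing import Dict, List, Tuple


def blind_responses(responses: Dict[str, str],
                    seed: int) -> Tuple[List[Tuple[str, str]], Dict[str, str]]:
    """Shuffle responses and replace source names with neutral codes.

    Returns the blinded list of (code, text) pairs for the evaluator and a
    separate key (code -> source) that is withheld until scoring is finished.
    """
    rng = random.Random(seed)          # seeded so the order can be audited later
    sources = list(responses)
    rng.shuffle(sources)
    blinded: List[Tuple[str, str]] = []
    key: Dict[str, str] = {}
    for i, source in enumerate(sources, start=1):
        code = f"R{i}"
        blinded.append((code, responses[source]))
        key[code] = source
    return blinded, key


# Placeholder answer texts, not study data.
answers = {
    "Expert A": "placeholder answer 1",
    "Gemini 2.5 Pro": "placeholder answer 2",
    "ChatGPT-5.1": "placeholder answer 3",
}
blinded, key = blind_responses(answers, seed=2026)
for code, text in blinded:
    print(code, text)
```

Keeping the key in a separate object mirrors the idea that the evaluator sees only the coded responses. A different seed per video would prevent the same source from always landing in the same position.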

If Expert A and Expert B disagree on the interpretation or quality assessment of any response, a third expert (Expert C), who is also experienced in ultrasound-guided regional anesthesia, will independently review the relevant responses. Expert C's evaluation will be used to resolve discrepancies and establish the final consensus.
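
One possible reading of this adjudication rule is sketched below. It assumes that ratings are comparable integer scores and that Expert C's rating is adopted as the final value whenever A and B differ; the record does not specify either point, and `final_rating` is a hypothetical name.

```python
from typing import Optional


def final_rating(expert_a: int, expert_b: int,
                 expert_c: Optional[int] = None) -> int:
    """Return the consensus rating for one response.

    Assumption: if A and B agree, their shared rating stands; otherwise
    Expert C's independent rating is taken as final.
    """
    if expert_a == expert_b:
        return expert_a
    if expert_c is None:
        raise ValueError("Experts A and B disagree; Expert C's rating is required")
    return expert_c


print(final_rating(2, 2))        # agreement, no third review needed
print(final_rating(1, 2, 2))     # disagreement resolved by Expert C
```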

Data Collected

For each TAP block video, the following information will be recorded:

Ultrasound and procedural details.

Patient demographic descriptors (age, sex, BMI, ASA classification), used only for general characterization of the sample.

AI-related performance features such as response completeness, relevance, confidence level, and response time.
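
Response time, one of the AI-related features listed above, could be captured with a small wrapper like the one below. How the investigators actually record timing is not stated; `timed_call` and the stand-in `fake_model` are hypothetical, and no real model API is called.

```python
import time
from typing import Callable, Tuple


def timed_call(ask: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Call a model function and return (answer, elapsed seconds)."""
    start = time.perf_counter()        # monotonic, high-resolution clock
    answer = ask(prompt)
    elapsed = time.perf_counter() - start
    return answer, elapsed


def fake_model(prompt: str) -> str:
    """Stand-in for a real model client; returns a fixed string."""
    return "stub answer"


answer, seconds = timed_call(fake_model, "Question 1 of 8")
print(f"{seconds:.4f} s")
```

Wall-clock timing of this kind includes network and upload latency, which may differ between a full video file and a GIF segment, so the two models' times are not strictly like-for-like.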

Study Type

Observational

Enrollment (Estimated)

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Name: Engin Ihsan Turan, principal investigator
Phone Number: +905382431114
Email: enginihsan@hotmail.com

Study Locations

Turkey (Türkiye)
- Istanbul
  - Istanbul, Istanbul, Turkey (Türkiye), 34303
    - Recruiting
    - Health Science University İstanbul Kanuni Sultan Süleyman Education and Training Hospital
    - Contact:
      
      Engin Ihsan Turan
      
      Phone Number: 05382431114
      
      Email: enginihsan@hotmail.com

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

Sampling Method

Probability Sample

Study Population

This study will include adult surgical patients who receive a lateral Transversus Abdominis Plane (L-TAP) block as part of routine anesthesia practice during elective operations. The population consists only of patients whose block procedures are already clinically indicated and performed by experienced anesthesiologists. No additional procedures are carried out for research purposes. Ultrasound videos obtained during standard block practice are de-identified and analyzed. The study does not involve patient follow-up or any change in clinical management.

Description

Inclusion Criteria:

Adults aged 18-85 years
ASA I-III physical status
Undergoing elective surgery with a lateral TAP block performed as part of routine anesthesia care
Complete ultrasound-guided block procedure recorded on video
Able to provide written informed consent

Exclusion Criteria:

Unsuccessful or incomplete TAP block procedure
Poor-quality ultrasound video (needle tip or anesthetic spread not visible)
Missing demographic or clinical data
Withdrawal of consent at any time
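
The inclusion and exclusion criteria above can be expressed as a simple screening check. This is a sketch only: the `Candidate` fields and the function `is_eligible` are hypothetical names, and each field is assumed to have been judged by a clinician beforehand.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    age: int
    asa_class: int                 # ASA physical status as an integer (I = 1, II = 2, ...)
    elective_lateral_tap: bool     # elective surgery with a routine lateral TAP block
    full_recording: bool           # complete block procedure recorded on video
    consent_given: bool            # written informed consent obtained
    block_completed: bool          # False if the block was unsuccessful or incomplete
    video_quality_ok: bool         # needle tip and anesthetic spread both visible
    data_complete: bool            # no missing demographic or clinical data
    consent_withdrawn: bool = False


def is_eligible(c: Candidate) -> bool:
    """Apply the listed inclusion criteria, then the exclusion criteria."""
    included = (
        18 <= c.age <= 85
        and 1 <= c.asa_class <= 3
        and c.elective_lateral_tap
        and c.full_recording
        and c.consent_given
    )
    excluded = (
        not c.block_completed
        or not c.video_quality_ok
        or not c.data_complete
        or c.consent_withdrawn
    )
    return included and not excluded


# Invented example values, not study data.
ok = Candidate(54, 2, True, True, True, True, True, True)
blurry = Candidate(54, 2, True, True, True, True, False, True)
print(is_eligible(ok), is_eligible(blurry))
```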

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Anatomical Interpretation Accuracy	For each ultrasound video, the ability of both AI models (ChatGPT-5.1 and Gemini 2.5 Pro) to correctly identify key anatomical structures of the lateral TAP block (internal oblique, transversus abdominis, fascial plane, needle tip) will be evaluated. The accuracy of each model will be compared with the expert-defined reference answer.	At the time of video analysis
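
The record does not define how anatomical accuracy is computed. One simple possibility, shown here as an assumption, is the proportion of the four named structures that a model identified in agreement with the expert reference for a given video. The function name `structure_accuracy` and the example judgements are hypothetical.

```python
from typing import Dict

# The four structures named in the primary outcome measure.
STRUCTURES = (
    "internal oblique",
    "transversus abdominis",
    "fascial plane",
    "needle tip",
)


def structure_accuracy(judgements: Dict[str, bool]) -> float:
    """Fraction of structures judged correct against the expert reference.

    `judgements` maps each structure name to True (matches the reference)
    or False (does not match).
    """
    missing = [s for s in STRUCTURES if s not in judgements]
    if missing:
        raise KeyError(f"no judgement recorded for: {missing}")
    return sum(judgements[s] for s in STRUCTURES) / len(STRUCTURES)


# Invented example for one video, not study data.
example = {
    "internal oblique": True,
    "transversus abdominis": True,
    "fascial plane": True,
    "needle tip": False,
}
print(structure_accuracy(example))
```

Per-video proportions like this could then be averaged per model, although the record leaves the aggregation and statistical comparison unspecified.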

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Block Success Interpretation	Assessment of whether each AI model correctly determines block success based on needle placement and local anesthetic spread, compared with expert reference evaluation.	At the time of video analysis.
Needle Plane Evaluation	Determination of whether AI models correctly assess the needle tip location and whether it is within the correct interfascial plane (IO-TA fascia), compared with the expert reference.	At the time of video analysis.
Dermatomal Level Prediction	Comparison of each AI model's predicted dermatomal coverage (e.g., T10-T12) with the expert-provided reference dermatomal level.	At the time of video analysis.
Risk Awareness Assessment	Evaluation of whether each AI model correctly identifies potential risks on ultrasound images (e.g., peritoneal proximity, vascular structures). Scoring: 0 = no risk awareness, 1 = partial, 2 = complete and appropriate risk identification.	At the time of video analysis.
Recommendation Quality	Assessment of the appropriateness of each model's suggestions (e.g., need for additional injection, repositioning) based on the ultrasound appearance. Scored qualitatively by the expert evaluator on a 0-10 scale.	At the time of video analysis.
Agreement Between Experts	Evaluation of whether Expert A and Expert B provide consistent judgments for each parameter, with discrepancies resolved by Expert C when needed. Recorded as agreement, or as disagreement resolved by the third expert.	During expert evaluation phase.
AI Response Time	Time required for each AI model to generate answers to the eight standardized questions, measured in seconds (continuous variable).	Captured automatically during model output.
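
For the Agreement Between Experts measure, the record reports only agreement or disagreement and does not name a statistic. If a chance-corrected summary were wanted, Cohen's kappa is one common option; its use here is an assumption, and the ratings below are invented for illustration on the 0-2 scale used for risk awareness.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two raters scoring the same items."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(a)
    # Observed agreement: share of items with identical ratings.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement by chance, from each rater's category frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    if expected == 1.0:
        return 1.0                     # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Invented ratings for eight responses, not study data.
rater_1 = [2, 1, 0, 2, 1, 0, 2, 2]
rater_2 = [2, 1, 0, 1, 1, 0, 2, 0]
print(round(cohen_kappa(rater_1, rater_2), 3))
```

For ordinal scores such as the 0-10 recommendation rating, a weighted kappa or an intraclass correlation might be more suitable; that choice is left open in the record.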

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Kanuni Sultan Suleyman Training and Research Hospital

Study record dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

January 15, 2026

Primary Completion (Estimated)

April 30, 2026

Study Completion (Estimated)

May 1, 2026

Study Registration Dates

First Submitted

November 21, 2025

First Submitted That Met QC Criteria

November 21, 2025

First Posted (Actual)

December 3, 2025

Study Record Updates

Last Update Posted (Actual)

February 4, 2026

Last Update Submitted That Met QC Criteria

February 2, 2026

Last Verified

November 1, 2025

More Information

Terms related to this study

Keywords

artificial intelligence

Other Study ID Numbers

SUCCESS OF LLMs in TAP BLOCK

Plan for Individual participant data (IPD)

Plan to Share Individual Participant Data (IPD)?

UNDECIDED

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product

