Preliminary Evaluation of a Large Language Model-Based Tool for Complex Surgical Decision Support in Lung Cancer

Updated 13 June 2026 by: XiuYuan Chen, Peking University People's Hospital

This is an exploratory effect-size estimation study with two objectives: ① to obtain a point estimate and 95% confidence interval of the Win Ratio for the experimental group (GAPS-Agent) versus the control group (general-purpose large language model) in blinded pairwise preference judgments by thoracic surgery expert adjudicators, for use as a sample size planning parameter in subsequent multicenter confirmatory clinical trials; ② to preliminarily evaluate the value of GAPS-Agent within clinical workflows.

The study hypothesis is that, compared with a general-purpose large language model without medical enhancement (control group), a structured agentic workflow optimized on the basis of the GAPS evaluation framework (GAPS-Agent, experimental group) helps junior resident physicians generate clinical decision plans for complex lung cancer cases that senior thoracic surgery expert adjudicators more strongly prefer.

Study Overview

Status

Enrolling by invitation

Conditions

Intervention / Treatment

Study Type

Interventional

Enrollment (Estimated)

Phase

Not Applicable

Contacts and Locations

This section provides the contact details of those conducting the study and information on where the study is being conducted.

Study Locations

China
- Beijing Municipality
  - Beijing, Beijing Municipality, China, 100044
    - Peking University People's Hospital

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Examples of these criteria include a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

No

Description

Inclusion Criteria:

Resident Physician Subjects:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of resident physician in a thoracic surgery department at a tertiary Class A (3A) hospital;
3. Agrees to complete all assessment tasks of the main study phase in accordance with the study protocol;
4. Can guarantee the time and effort required to complete all assessment tasks of the main study.
Study Cases:
1. The case was discussed at the Thoracic Oncology Multidisciplinary Team (MDT) conference of Peking University People's Hospital between January 2025 and May 2026;
2. The current version of the NCCN guidelines does not provide an explicit recommendation covering the management of the case;
3. Does not overlap with the GAPS evaluation set;
4. The case is presented as plain text in a structured format, with all direct and indirect identifiers removed and complete de-identification performed prior to inclusion;
5. From the pool of eligible cases, 12 cases will be randomly drawn using Python (numpy.random, with a fixed and archived seed) to serve as the main study cases. The cases will cover 6 themes (chest mass of undetermined diagnosis, early-stage lung cancer, locally advanced lung cancer, oligometastatic/oligoprogressive disease, special intraoperative situations, and tumor recurrence), with 2 cases per theme.
Adjudication Expert Panel:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of attending physician or above in a thoracic surgery department at a tertiary Class A hospital;
3. Chairs or regularly participates in lung cancer multidisciplinary team (MDT) work in their department.

Exclusion Criteria:

Resident Physician Subjects:
1. Has previously participated in the construction of the GAPS evaluation set or the development of GAPS-Agent;
2. Unable to complete the tasks of the study phase.
Study Cases:
1. Key case information is missing, such as text-form data on pathology (including IHC/NGS), imaging, laboratory tests, prior medical history, comorbidities, or PS score;
2. Decision-making for the case is strictly dependent on non-text information.
Adjudication Expert Panel:
1. Participated in the construction of the GAPS evaluation set, the content validity verification, or the development of GAPS-Agent for this study;
2. Has a direct conflict of interest with any specific product used as a tool in either of the two study arms.
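
The random draw described in item 5 of the study case inclusion criteria could be sketched as follows. This is a minimal illustration, not the study's actual script: it assumes the draw is stratified (2 cases per theme, which is one way to satisfy the stated coverage), and the pool layout, the case IDs, the function name `draw_cases`, and the seed value are all hypothetical.

```python
import numpy as np

THEMES = [
    "chest mass of undetermined diagnosis",
    "early-stage lung cancer",
    "locally advanced lung cancer",
    "oligometastatic/oligoprogressive disease",
    "special intraoperative situations",
    "tumor recurrence",
]

def draw_cases(pool, seed, per_theme=2):
    """Draw `per_theme` case IDs per theme, without replacement.

    pool: dict mapping theme -> list of eligible case IDs (hypothetical layout).
    seed: fixed integer seed, to be archived with the result for reproducibility.
    """
    rng = np.random.default_rng(seed)
    selected = {}
    for theme in THEMES:
        # Sort first so the draw does not depend on the input order of the pool.
        candidates = sorted(pool[theme])
        if len(candidates) < per_theme:
            raise ValueError(f"not enough eligible cases for theme: {theme}")
        picked = rng.choice(candidates, size=per_theme, replace=False)
        selected[theme] = [str(c) for c in picked]
    return selected

# Hypothetical pool: five placeholder case IDs per theme.
example_pool = {t: [f"T{i}-C{j}" for j in range(5)] for i, t in enumerate(THEMES)}
main_cases = draw_cases(example_pool, seed=20260610)  # seed value is illustrative only
```

Re-running `draw_cases` with the same pool and the same archived seed reproduces the same 12 cases, which is the property the fixed seed is meant to guarantee.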

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Other
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Experimental: test arm GAPS-Agent	Other: GAPS-Agent The research group has previously developed the GAPS evaluation framework for complex clinical decision-making in lung cancer. In this framework, G (Grounding) characterizes the cognitive depth of decision-making (ranging from knowledge retrieval to decisions that go beyond clinical guidelines), A (Authority) corresponds to the grading of evidence strength, P (Perturbation) describes the identification and management of real-world clinical confounding factors, and S (Strength) corresponds to the calibration of recommendation strength. Within this framework, the research group has completed the construction of a 100-item complex lung cancer decision-making evaluation set along with its corresponding rubrics, and has invited multiple thoracic oncology experts to complete content validity validation. Based on this, the research group developed GAPS-Agent, which uses an open-source large language model as its foundation and integrates functional modules such as guideline and evidence retri
Active Comparator: control arm LLM	Other: LLM Open-source large language model that is not specifically enhanced for the medical field.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Overall plan Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
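
The win ratio and its bootstrap interval described above could be computed along the following lines. This is a sketch under stated assumptions, not the study's analysis code: it assumes each judgment is stored with a physician ID and a case ID and coded 1 (win), 0 (tie), or -1 (loss); it interprets "two-level" as nested resampling (physicians first, then cases within each sampled physician); and it skips bootstrap replicates with zero wins or zero losses, where the log win ratio is undefined. The function names and the toy data are hypothetical.

```python
import numpy as np

def win_ratio(outcomes):
    """Wins divided by losses; ties are ignored. Returns nan if there are no losses."""
    outcomes = np.asarray(outcomes)
    wins = np.sum(outcomes == 1)
    losses = np.sum(outcomes == -1)
    return float(wins / losses) if losses > 0 else float("nan")

def bootstrap_win_ratio_ci(physician, case, outcome, n_boot=10_000, seed=0, alpha=0.05):
    """Point estimate and percentile CI (computed on the log scale) for the win ratio."""
    physician = np.asarray(physician)
    case = np.asarray(case)
    outcome = np.asarray(outcome)
    rng = np.random.default_rng(seed)

    # Group judgments as physician -> list of per-case outcome arrays.
    groups = {}
    for p in np.unique(physician):
        mask_p = physician == p
        groups[p] = [outcome[mask_p & (case == c)] for c in np.unique(case[mask_p])]
    physicians = list(groups)

    log_wr = []
    for _ in range(n_boot):
        wins = losses = 0
        # Level 1: resample physicians with replacement.
        for p_idx in rng.integers(len(physicians), size=len(physicians)):
            cases = groups[physicians[p_idx]]
            # Level 2: resample that physician's cases with replacement.
            for c_idx in rng.integers(len(cases), size=len(cases)):
                o = cases[c_idx]
                wins += int(np.sum(o == 1))
                losses += int(np.sum(o == -1))
        if wins > 0 and losses > 0:  # assumption: drop degenerate replicates
            log_wr.append(np.log(wins / losses))

    lo, hi = np.quantile(log_wr, [alpha / 2, 1 - alpha / 2])
    return win_ratio(outcome), float(np.exp(lo)), float(np.exp(hi))

# Hypothetical toy data: 8 physicians x 12 cases, one judgment per pair.
physician = np.repeat(np.arange(8), 12)
case = np.tile(np.arange(12), 8)
outcome = np.array([1, 1, 0, -1, 1, -1] * 16)  # 48 wins, 16 ties, 32 losses
point, lo, hi = bootstrap_win_ratio_ci(physician, case, outcome, n_boot=500, seed=42)
```

In the toy data the point estimate is 48 ÷ 32 = 1.5. The example uses `n_boot=500` only to keep it fast; the registered description specifies B = 10,000.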

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Inter-rater agreement	For the ternary preference judgment results of 10 expert judges across 192 paired comparisons and 6 evaluation domains, Fleiss' kappa was used to assess inter-rater agreement. The kappa value and its 95% confidence interval are reported for each evaluation domain.	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Redundancy Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Evidence-based medicine adherence Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Actionability Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Completeness Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Safety Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
GAPS automated rubric score	A third-party large language model, independent of the two study arms' base models, served as the judge model and automatically scored all 96 plans according to the GAPS rubric.	Generated up to 3 weeks after residents finished their plan generation.
Subject physician's self-confidence score	After submitting each case plan, the participating physicians self-rated their confidence in their own plan using a 5-point Likert scale (1 to 5).	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Tool satisfaction score	After submitting each case plan, the participating physicians rated their satisfaction with the tool using a 5-point Likert scale (1 to 5).	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Tool trustworthiness score	After submitting each case plan, the participating physicians rated the tool's credibility using a 5-point Likert scale (1 to 5).	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Decision-making time	The time taken (in minutes) by each participating physician to complete the production of each case plan was automatically recorded by the evaluation platform. Differences between groups were analyzed using a linear mixed-effects model.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
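
For the inter-rater agreement measure in the table above, Fleiss' kappa can be computed from a table of category counts per comparison. The sketch below uses the standard Fleiss formula; the input layout (one row per paired comparison, one column per Win/Tie/Loss category, each row summing to the number of judges), the function name, and the toy counts are assumptions for illustration. It does not cover the 95% confidence interval that the measure description also calls for.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for an (items x categories) matrix of rating counts.

    counts[i, j] is the number of raters who assigned item i to category j.
    Every item must have been rated by the same number of raters.
    """
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    if not np.all(counts.sum(axis=1) == n_raters):
        raise ValueError("every item must be rated by the same number of raters")

    # Overall proportion of assignments falling in each category.
    p_cat = counts.sum(axis=0) / (n_items * n_raters)
    # Per-item agreement: share of rater pairs that agree on that item.
    p_item = (np.sum(counts**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))

    p_bar = p_item.mean()      # mean observed agreement
    p_exp = np.sum(p_cat**2)   # agreement expected by chance
    return float((p_bar - p_exp) / (1 - p_exp))

# Hypothetical toy data: 4 comparisons, 10 judges, columns = (Win, Tie, Loss).
example_counts = np.array([
    [10, 0, 0],
    [7, 2, 1],
    [4, 3, 3],
    [1, 1, 8],
])
kappa = fleiss_kappa(example_counts)
```

Applied per evaluation domain, this would give one kappa value for each of the 6 domains named in the measure description.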

Collaborators and Investigators

This is where you will find the people and organizations involved in this study.

Sponsor

Peking University People's Hospital

Study Record Dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

10 June 2026

Primary Completion (Estimated)

21 June 2026

Study Completion (Estimated)

21 June 2026

Study Registration Dates

First Submitted

10 June 2026

First Submitted That Met QC Criteria

13 June 2026

First Posted (Actual)

17 June 2026

Study Record Updates

Last Update Posted (Actual)

17 June 2026

Last Update Submitted That Met QC Criteria

13 June 2026

Last Verified

1 June 2026

More Information

Terms Related to This Study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

2026PHB458-001

Plan for Individual Participant Data (IPD)

Plan to share individual participant data (IPD)?

No

Drug and Device Information, Study Documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No
