Clinical Trial NCT07654036
Preliminary Evaluation of a Large Language Model-Based Tool for Complex Surgical Decision Support in Lung Cancer
Updated June 13, 2026 by: XiuYuan Chen, Peking University People's Hospital
This is an exploratory effect-size estimation study with two objectives: ① to obtain a point estimate and 95% confidence interval for the Win Ratio of the experimental group (GAPS-Agent) versus the control group (large language model) in blinded pairwise preference judgments by thoracic surgery expert adjudicators, for use as a sample-size planning parameter in subsequent multicenter confirmatory clinical trials; ② to preliminarily evaluate the value of GAPS-Agent within clinical workflows. The study hypothesis is that, compared with a general-purpose large language model without medical enhancement (control group), a structured agentic workflow optimized on the basis of the GAPS evaluation framework (GAPS-Agent, experimental group) can help junior resident physicians generate clinical decision plans for complex lung cancer cases that senior thoracic surgery expert adjudicators prefer more strongly.
Study Overview
Status
Enrolling by invitation
Conditions
Intervention / Treatment
Study Type
Interventional
Enrollment (Estimated)
12
Phase
- Not Applicable
Contacts and Locations
This section provides the contact details for those conducting the study, and information on where this study is being conducted.
Study Locations
- Beijing, Beijing Municipality, China, 100044
  - Peking University People's Hospital
Participation Criteria
Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.
Eligibility Criteria
Ages Eligible for Study
- Adult
- Older Adult
Accepts Healthy Volunteers
No
Description
Inclusion Criteria:
Resident Physician Subjects:
- Holds a valid and legally effective Physician Practice License of the People's Republic of China;
- Currently holds the rank of resident physician in a thoracic surgery department at a tertiary Class A (3A) hospital;
- Agrees to complete all assessment tasks of the main study phase in accordance with the study protocol;
- Can guarantee the time and effort required to complete all assessment tasks of the main study.
Study Cases:
- The case was discussed at the Thoracic Oncology Multidisciplinary Team (MDT) conference of Peking University People's Hospital between January 2025 and May 2026;
- The current version of the NCCN guidelines does not provide an explicit recommendation covering the management of the case;
- Does not overlap with the GAPS evaluation set;
- The case is presented as plain text in a structured format, with all direct and indirect identifiers removed and de-identification completed before inclusion;
- From the pool of eligible cases, 12 cases will be randomly drawn using Python (numpy.random, with a fixed and archived seed) to serve as the main study cases. The cases will cover 6 themes (chest mass of undetermined diagnosis, early-stage lung cancer, locally advanced lung cancer, oligometastatic/oligoprogressive disease, special intraoperative situations, and tumor recurrence), with 2 cases per theme.
Adjudication Expert Panel:
- Holds a valid and legally effective Physician Practice License of the People's Republic of China;
- Currently holds the rank of attending physician or above in a thoracic surgery department at a tertiary Class A hospital;
- Chairs or regularly participates in lung cancer multidisciplinary team (MDT) work in their department.
Exclusion Criteria:
Resident Physician Subjects:
- Has previously participated in the construction of the GAPS evaluation set or the development of GAPS-Agent;
- Unable to complete the tasks of the study phase.
Study Cases:
- Key case information is missing, such as text-form data on pathology (including IHC/NGS), imaging, laboratory tests, prior medical history, comorbidities, or PS score;
- Decision-making for the case is strictly dependent on non-text information.
Adjudication Expert Panel:
- Participated in the construction of the GAPS evaluation set, the content validity verification, or the development of GAPS-Agent for this study;
- Has a direct conflict of interest with any specific product used as a tool in either arm of this study.
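The Study Cases inclusion criteria describe drawing 12 main-study cases with numpy.random under a fixed, archived seed, 2 per theme. The sketch below shows one way such a reproducible draw could look. It is only an illustration: the record does not say how theme coverage is enforced, so drawing separately within each theme is an assumption, and the pool structure, case IDs, function name `draw_cases`, and seed value are all hypothetical.

```python
import numpy as np

# The six themes listed in the inclusion criteria.
THEMES = [
    "chest mass of undetermined diagnosis",
    "early-stage lung cancer",
    "locally advanced lung cancer",
    "oligometastatic/oligoprogressive disease",
    "special intraoperative situations",
    "tumor recurrence",
]


def draw_cases(pool, seed, per_theme=2):
    """Draw `per_theme` case IDs per theme without replacement.

    `pool` maps theme -> list of eligible case IDs (hypothetical structure).
    """
    rng = np.random.default_rng(seed)  # fixed seed makes the draw reproducible
    selected = {}
    for theme in THEMES:  # fixed theme order, so the seed fully determines the result
        candidates = sorted(pool[theme])  # sort so input order does not matter
        if len(candidates) < per_theme:
            raise ValueError(f"not enough eligible cases for theme: {theme}")
        picked = rng.choice(candidates, size=per_theme, replace=False)
        selected[theme] = [str(case_id) for case_id in picked]
    return selected


# Hypothetical pool with 5 placeholder case IDs per theme.
example_pool = {t: [f"T{i}-C{j}" for j in range(5)] for i, t in enumerate(THEMES)}
ARCHIVED_SEED = 20260610  # placeholder; the record does not state the actual seed
main_cases = draw_cases(example_pool, ARCHIVED_SEED)
```

Re-running `draw_cases` with the same pool and seed returns the same 12 cases, which is the property that archiving the seed is meant to guarantee.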
Study Plan
This section provides details of the study plan, including how the study is designed and what the study is measuring.
How is the study designed?
Design Details
- Primary Purpose: Other
- Allocation: Randomized
- Interventional Model: Parallel Assignment
- Masking: Single
Arms and Interventions
| Participant Group / Arm | Intervention / Treatment |
|---|---|
| Experimental: test arm (GAPS-Agent) | The research group has previously developed the GAPS evaluation framework for complex clinical decision-making in lung cancer. In this framework, G (Grounding) characterizes the cognitive depth of decision-making (ranging from knowledge retrieval to decisions that go beyond clinical guidelines), A (Authority) corresponds to the grading of evidence strength, P (Perturbation) describes the identification and management of real-world clinical confounding factors, and S (Strength) corresponds to the calibration of recommendation strength. Within this framework, the research group has constructed a 100-item evaluation set for complex lung cancer decision-making, together with its rubrics, and has invited multiple thoracic oncology experts to validate its content. On this basis, the research group developed GAPS-Agent, which uses an open-source large language model as its foundation and integrates functional modules such as guideline and evidence retri |
| Active Comparator: control arm (LLM) | An open-source large language model that has not been specifically enhanced for the medical field. |
What is the study measuring?
Primary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Overall plan Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of overall plan quality. The Win Ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap (B = 10,000 resamples, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
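The Win Ratio and its bootstrap interval can be illustrated with a short sketch. This is one possible reading of the description, not the study's analysis code: it assumes that "two-level (physician × case)" means physicians and cases are resampled with replacement independently (a crossed design), that ties are simply left out of the ratio, and that replicates with zero wins or zero losses are skipped. The function name `win_ratio_ci`, the array layout, and the demo data are hypothetical.

```python
import numpy as np


def win_ratio_ci(wins, losses, n_boot=10_000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) for Wins / Losses.

    wins, losses: 2-D arrays indexed [physician, case] holding the number of
    judge votes won / lost by the experimental arm in that cell.
    """
    wins = np.asarray(wins, dtype=float)
    losses = np.asarray(losses, dtype=float)
    n_phys, n_case = wins.shape
    point = wins.sum() / losses.sum()

    rng = np.random.default_rng(seed)
    log_wr = []
    for _ in range(n_boot):
        # Resample both cluster levels with replacement (assumed crossed design).
        p_idx = rng.integers(0, n_phys, size=n_phys)
        c_idx = rng.integers(0, n_case, size=n_case)
        cells = np.ix_(p_idx, c_idx)
        w_sum, l_sum = wins[cells].sum(), losses[cells].sum()
        if w_sum > 0 and l_sum > 0:  # log ratio undefined otherwise; skip (assumption)
            log_wr.append(np.log(w_sum / l_sum))

    # Quantile (percentile) interval on the log scale, mapped back by exp.
    lo_log, hi_log = np.quantile(log_wr, [alpha / 2, 1 - alpha / 2])
    return float(point), float(np.exp(lo_log)), float(np.exp(hi_log))


# Synthetic demo data only: 4 physicians x 6 cases, vote counts per cell.
demo_rng = np.random.default_rng(1)
demo_wins = demo_rng.integers(3, 8, size=(4, 6))
demo_losses = demo_rng.integers(1, 4, size=(4, 6))
point, lo, hi = win_ratio_ci(demo_wins, demo_losses, n_boot=2000, seed=42)
```

Resampling whole physicians and whole cases, rather than individual judgments, keeps the correlated judgments within a cluster together, which is why a cluster bootstrap tends to give wider and more realistic intervals than resampling the comparisons as if they were independent.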
Secondary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Inter-rater agreement | Fleiss' kappa was used to assess inter-rater agreement among the ternary preference judgments of the 10 expert judges across 192 paired comparisons and 6 evaluation domains. The kappa value and its 95% confidence interval are reported for each evaluation domain. | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Redundancy Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of overall plan quality. The Win Ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap (B = 10,000 resamples, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Evidence-based medicine adherence Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of overall plan quality. The Win Ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap (B = 10,000 resamples, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Actionability Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of overall plan quality. The Win Ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap (B = 10,000 resamples, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Completeness Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of overall plan quality. The Win Ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap (B = 10,000 resamples, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Safety Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of overall plan quality. The Win Ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap (B = 10,000 resamples, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| GAPS automated rubric score | A third-party large language model, independent of the base models of the two study arms, served as the judge model and automatically scored all 96 plans according to the GAPS rubric. | Generated up to 3 weeks after the residents finished generating their plans. |
| Subject physician's self-confidence score | After submitting each case plan, the participating physicians rated their confidence in their own plan on a 5-point Likert scale (1 to 5). | Completed when the residents submitted their plans. Calculated up to 3 weeks after submission. |
| Tool satisfaction score | After submitting each case plan, the participating physicians rated their satisfaction with the tool on a 5-point Likert scale (1 to 5). | Completed when the residents submitted their plans. Calculated up to 3 weeks after submission. |
| Tool trustworthiness score | After submitting each case plan, the participating physicians rated the tool's credibility on a 5-point Likert scale (1 to 5). | Completed when the residents submitted their plans. Calculated up to 3 weeks after submission. |
| Decision-making time | The time (in minutes) each participating physician took to produce each case plan was recorded automatically by the evaluation platform. Differences between groups were analyzed using a linear mixed-effects model. | Completed when the residents submitted their plans. Calculated up to 3 weeks after submission. |
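For the inter-rater agreement measure, the Fleiss' kappa point estimate can be computed from a table of vote counts. The sketch below uses the standard Fleiss formula and assumes every item is rated by the same number of judges; the function name `fleiss_kappa` and the example tables are hypothetical. The record does not say how the 95% confidence interval for kappa is obtained, so only the point estimate is shown.

```python
import numpy as np


def fleiss_kappa(counts):
    """Fleiss' kappa for a table counts[i, k] = number of raters who put
    item i in category k (for example Win / Tie / Loss)."""
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    if not np.all(counts.sum(axis=1) == n_raters):
        raise ValueError("every item must be rated by the same number of raters")

    # Share of all ratings that fall in each category.
    p_cat = counts.sum(axis=0) / (n_items * n_raters)
    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = (np.sum(counts**2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_item.mean()        # mean observed agreement
    p_exp = np.sum(p_cat**2)     # agreement expected by chance
    return float((p_bar - p_exp) / (1 - p_exp))


# Hypothetical tables with 10 raters and 3 categories per item.
perfect = fleiss_kappa([[10, 0, 0], [0, 10, 0], [0, 0, 10]])  # full agreement
split = fleiss_kappa([[5, 5, 0], [5, 5, 0]])                  # raters split evenly
```

In the study's setting, each evaluation domain would contribute its own table with one row per paired comparison (192 rows) and one column per judgment category, with each row summing to the 10 judges.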
Collaborators and Investigators
This is where you will find the people and organizations involved with this study.
Sponsor
Study Record Dates
These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.
Study Major Dates
Study Start (Actual)
June 10, 2026
Primary Completion (Estimated)
June 21, 2026
Study Completion (Estimated)
June 21, 2026
Study Registration Dates
First Submitted
June 10, 2026
First Submitted That Met QC Criteria
June 13, 2026
First Posted (Actual)
June 17, 2026
Study Record Updates
Last Update Posted (Actual)
June 17, 2026
Last Update Submitted That Met QC Criteria
June 13, 2026
Last Verified
June 1, 2026
More Information
Terms related to this study
Keywords
Additional Relevant MeSH Terms
Other Study ID Numbers
- 2026PHB458-001
Plan for Individual Participant Data (IPD)
Plan to Share Individual Participant Data (IPD)?
No
Drug and device information, study documents
Studies a U.S. FDA-regulated drug product
No
Studies a U.S. FDA-regulated device product
No