Preliminary Evaluation of a Large Language Model-Based Tool for Complex Surgical Decision Support in Lung Cancer

Updated 13 June 2026 by: XiuYuan Chen, Peking University People's Hospital

This is an exploratory effect-size estimation study with two specific objectives: ① to estimate the point estimate and 95% confidence interval of the Win Ratio for the experimental group (GAPS-Agent) versus the control group (large language model) in blinded pairwise preference judgments by thoracic surgery expert adjudicators, to serve as a sample-size planning parameter for subsequent multicenter confirmatory clinical trials; ② to preliminarily evaluate the value of GAPS-Agent within clinical workflows.

The study hypothesis is as follows: compared with a general-purpose large language model without medical enhancement (control group), a structured agentic workflow optimized on the basis of the GAPS evaluation framework (GAPS-Agent, experimental group) can help junior resident physicians generate clinical decision plans for complex lung cancer cases that senior thoracic surgery expert adjudicators prefer more strongly.

Study Overview

Status

Enrolling by invitation

Conditions

Intervention / Treatment

Study Type

Interventional

Enrollment (Estimated)

Phase

Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Locations

China
- Beijing Municipality
  - Beijing, Beijing Municipality, China, 100044
    - Peking University People's Hospital

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

No

Description

Inclusion Criteria:

Resident Physician Subjects:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of resident physician in a thoracic surgery department at a tertiary Class A (3A) hospital;
3. Agrees to complete all assessment tasks of the main study phase in accordance with the study protocol;
4. Can guarantee the time and effort required to complete all assessment tasks of the main study.
Study Cases:
1. The case was discussed at the Thoracic Oncology Multidisciplinary Team (MDT) conference of Peking University People's Hospital between January 2025 and May 2026;
2. The current version of the NCCN guidelines does not provide an explicit recommendation covering the management of the case;
3. Does not overlap with the GAPS evaluation set;
4. The case is presented as plain text in a structured format, with all direct and indirect identifiers removed and de-identification completed prior to inclusion;
5. From the pool of eligible cases, 12 cases will be randomly drawn using Python (numpy.random, with a fixed and archived seed) to serve as the main study cases. The cases will cover 6 themes (chest mass of undetermined diagnosis, early-stage lung cancer, locally advanced lung cancer, oligometastatic/oligoprogressive disease, special intraoperative situations, and tumor recurrence), with 2 cases per theme.
Adjudication Expert Panel:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of attending physician or above in a thoracic surgery department at a tertiary Class A hospital;
3. Chairs or regularly participates in lung cancer multidisciplinary team (MDT) work in their department.

Exclusion Criteria:

Resident Physician Subjects:
1. Has previously participated in the construction of the GAPS evaluation set or the development of GAPS-Agent;
2. Unable to complete the tasks of the study phase.
Study Cases:
1. Key case information is missing, such as text-form data on pathology (including IHC/NGS), imaging, laboratory tests, prior medical history, comorbidities, or PS score;
2. Decision-making for the case is strictly dependent on non-text information.
Adjudication Expert Panel:
1. Participated in the construction of the GAPS evaluation set, the content validity verification, or the development of GAPS-Agent for this study;
2. Has a direct conflict of interest relating to any specific product used as a tool in either arm of this study.
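
Study Cases inclusion criterion 5 describes a seeded random draw of 12 cases, 2 for each of 6 themes. The following is a minimal sketch of how such a stratified draw could be made reproducible. The seed value, the case identifiers, the `pool` layout, the helper name `draw_cases`, and the choice of `numpy.random.default_rng` over other `numpy.random` interfaces are all illustrative assumptions, not the study's archived procedure.

```python
import numpy as np

# The six themes listed in the inclusion criteria.
THEMES = [
    "chest mass of undetermined diagnosis",
    "early-stage lung cancer",
    "locally advanced lung cancer",
    "oligometastatic/oligoprogressive disease",
    "special intraoperative situations",
    "tumor recurrence",
]

def draw_cases(pool, seed, per_theme=2):
    """Draw `per_theme` case IDs per theme without replacement.

    pool: dict mapping theme -> list of eligible case IDs (hypothetical layout).
    seed: fixed integer seed, to be archived together with the result.
    """
    rng = np.random.default_rng(seed)
    selected = {}
    for theme in THEMES:  # fixed theme order keeps the draw reproducible
        candidates = sorted(pool[theme])  # sorting removes dependence on input order
        if len(candidates) < per_theme:
            raise ValueError(f"not enough eligible cases for theme: {theme}")
        picked = rng.choice(candidates, size=per_theme, replace=False)
        selected[theme] = [str(c) for c in picked]
    return selected

# Hypothetical pool: five placeholder case IDs per theme.
example_pool = {t: [f"T{i}-C{j}" for j in range(5)] for i, t in enumerate(THEMES)}
main_cases = draw_cases(example_pool, seed=20260610)  # seed value is an assumption
```

Re-running `draw_cases` with the same pool and seed returns the same 12 cases, which is the property that archiving the seed is meant to guarantee.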

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Other
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Experimental: test arm GAPS-Agent	Other: GAPS-Agent. The research group has previously developed the GAPS evaluation framework for complex clinical decision-making in lung cancer. In this framework, G (Grounding) characterizes the cognitive depth of decision-making (ranging from knowledge retrieval to decisions that go beyond clinical guidelines), A (Authority) corresponds to the grading of evidence strength, P (Perturbation) describes the identification and management of real-world clinical confounding factors, and S (Strength) corresponds to the calibration of recommendation strength. Within this framework, the research group has completed the construction of a 100-item complex lung cancer decision-making evaluation set along with its corresponding rubrics, and has invited multiple thoracic oncology experts to complete content validity validation. Based on this, the research group developed GAPS-Agent, which uses an open-source large language model as its foundation and integrates functional modules such as guideline and evidence retri
Active Comparator: control arm LLM	Other: LLM. An open-source large language model that is not specifically enhanced for the medical field.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Overall plan Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
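
The primary outcome combines a simple ratio (Wins ÷ Losses) with a two-level cluster bootstrap and a quantile interval on the log scale. The sketch below shows one possible reading of that description: physicians and cases are resampled with replacement as two crossed cluster levels, the log win ratio is recomputed for each replicate, and the 2.5% and 97.5% quantiles are exponentiated. The count layout `[physician, case]`, the function name `win_ratio_ci`, the rule of skipping replicates with zero wins or zero losses, and the demo counts are assumptions for illustration only, not the study's analysis code.

```python
import numpy as np

def win_ratio_ci(wins, losses, n_boot=10_000, seed=0, alpha=0.05):
    """Win ratio (Wins / Losses) with a crossed two-level cluster bootstrap CI.

    wins, losses: 2-D arrays of counts indexed [physician, case]; ties are
    not counted. This data layout is an assumption of the sketch.
    """
    wins = np.asarray(wins, dtype=float)
    losses = np.asarray(losses, dtype=float)
    n_phys, n_case = wins.shape
    point = wins.sum() / losses.sum()

    rng = np.random.default_rng(seed)
    log_wr = []
    for _ in range(n_boot):
        # Resample both cluster levels with replacement.
        p_idx = rng.integers(0, n_phys, size=n_phys)
        c_idx = rng.integers(0, n_case, size=n_case)
        n_win = wins[np.ix_(p_idx, c_idx)].sum()
        n_loss = losses[np.ix_(p_idx, c_idx)].sum()
        if n_win > 0 and n_loss > 0:  # assumption: skip undefined log ratios
            log_wr.append(np.log(n_win / n_loss))
    # Quantile interval on the log scale, transformed back to the ratio scale.
    lo, hi = np.quantile(log_wr, [alpha / 2, 1 - alpha / 2])
    return float(point), float(np.exp(lo)), float(np.exp(hi))

# Made-up counts for 4 physicians x 3 cases, for illustration only.
demo_wins = np.array([[5, 4, 6], [3, 5, 4], [6, 6, 5], [4, 3, 5]])
demo_losses = np.array([[2, 3, 1], [4, 2, 3], [1, 2, 2], [3, 4, 2]])
wr, ci_low, ci_high = win_ratio_ci(demo_wins, demo_losses, n_boot=2000, seed=1)
```

Resampling whole physicians and whole cases, rather than individual judgments, keeps correlated judgments together, which is the usual motivation for a cluster bootstrap.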

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Inter-rater agreement	For the ternary preference judgment results of 10 expert judges across 192 paired comparisons and 6 evaluation domains, Fleiss' kappa was used to assess inter-rater agreement. The kappa value and its 95% confidence interval are reported for each evaluation domain.	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Redundancy Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Evidence-based medicine adherence Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Actionability Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Completeness Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Safety Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
GAPS automated rubric score	A third-party large language model, independent of the two study arms' base models, served as the judge model and automatically scored all 96 plans according to the GAPS rubric.	Generated up to 3 weeks after residents finished their plan generation.
Subject physician's self-confidence score	After submitting each case plan, the participating physicians self-rated their confidence in their own plan using a 5-point (1-5) Likert scale.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Tool satisfaction score	After submitting each case plan, the participating physicians rated their satisfaction with the tool using a 5-point (1-5) Likert scale.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Tool trustworthiness score	After submitting each case plan, the participating physicians rated the tool's credibility using a 5-point (1-5) Likert scale.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Decision-making time	The time taken (in minutes) by each participating physician to complete each case plan was automatically recorded by the evaluation platform. Differences between groups were analyzed using a linear mixed-effects model.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
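
The inter-rater agreement measure above relies on Fleiss' kappa for ternary (Win/Tie/Loss) judgments. As a minimal sketch, the standard formula can be computed from a matrix that counts, for each paired comparison, how many judges chose each category. The function name `fleiss_kappa`, the input layout, and the demo counts are assumptions for illustration; the sketch returns only the point estimate and does not cover the 95% confidence interval mentioned in the measure description.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa from an items x categories matrix of rating counts.

    counts[i, j] = number of raters who assigned item i to category j;
    every row must sum to the same number of raters.
    """
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    if not np.all(counts.sum(axis=1) == n_raters):
        raise ValueError("each item needs the same number of ratings")
    # Observed agreement per item, then averaged over items.
    p_i = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Chance agreement from the marginal category proportions.
    p_j = counts.sum(axis=0) / (n_items * n_raters)
    p_e = np.sum(p_j ** 2)
    return float((p_bar - p_e) / (1 - p_e))

# Made-up example: 4 paired comparisons, 10 judges, columns = Win / Tie / Loss.
demo_counts = np.array([
    [10, 0, 0],
    [0, 10, 0],
    [0, 0, 10],
    [10, 0, 0],
])
kappa_perfect = fleiss_kappa(demo_counts)  # all judges agree on every item
```

In this study's setting, one such matrix would be built per evaluation domain, with one row per paired comparison.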

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Peking University People's Hospital

Study Record Dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

10 June 2026

Primary Completion (Estimated)

21 June 2026

Study Completion (Estimated)

21 June 2026

Study Registration Dates

First Submitted

10 June 2026

First Submitted that Met QC Criteria

13 June 2026

First Posted (Actual)

17 June 2026

Study Record Updates

Last Update Posted (Actual)

17 June 2026

Last Update Submitted that Met QC Criteria

13 June 2026

Last Verified

1 June 2026

More Information

Terms related to this study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

2026PHB458-001

Plan for Individual Participant Data (IPD)

Plan to Share Individual Participant Data (IPD)?

No

Drug and device information, study documents

Studies a U.S. FDA-regulated Drug Product

No

Studies a U.S. FDA-regulated Device Product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove, or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, it will be updated automatically on our website as well.
