Preliminary Evaluation of a Large Language Model-Based Tool for Complex Surgical Decision Support in Lung Cancer

Updated June 13, 2026 by: XiuYuan Chen, Peking University People's Hospital

This is an exploratory effect-size estimation study with two objectives: ① to obtain the point estimate and 95% confidence interval of the Win Ratio for the experimental group (GAPS-Agent) versus the control group (large language model) in blinded pairwise preference judgments by thoracic surgery expert adjudicators, for use as a sample size planning parameter in subsequent multicenter confirmatory clinical trials; ② to preliminarily evaluate the value of GAPS-Agent within clinical workflows.

The study hypothesis is that, compared with a general-purpose large language model without medical enhancement (control group), a structured agentic workflow optimized on the basis of the GAPS evaluation framework (GAPS-Agent, experimental group) helps junior resident physicians generate clinical decision plans for complex lung cancer cases that senior thoracic surgery expert adjudicators prefer more strongly.

Study Overview

Status

Enrolling by invitation

Condition

Intervention / Treatment

Study Type

Interventional

Enrollment (Estimated)

Phase

Not Applicable

Contacts and Locations

This section provides the contact details of those conducting the study and information on where the study is being conducted.

Study Locations

China
- Beijing Municipality
  - Beijing, Beijing Municipality, China, 100044
    - Peking University People's Hospital

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

No

Description

Inclusion Criteria:

Resident Physician Subjects:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of resident physician in a thoracic surgery department at a tertiary Class A (3A) hospital;
3. Agrees to complete all assessment tasks of the main study phase in accordance with the study protocol;
4. Can guarantee the time and effort required to complete all assessment tasks of the main study.
Study Cases:
1. The case was discussed at the Thoracic Oncology Multidisciplinary Team (MDT) conference of Peking University People's Hospital between January 2025 and May 2026;
2. The current version of the NCCN guidelines does not provide an explicit recommendation covering the management of the case;
3. Does not overlap with the GAPS evaluation set;
4. The case is presented in pure text in a structured format, with all direct and indirect identifiers removed and complete de-identification performed prior to inclusion;
5. From the pool of eligible cases, 12 cases will be randomly drawn using Python (numpy.random, with a fixed and archived seed) to serve as the main study cases. The cases will cover 6 themes (chest mass of undetermined diagnosis, early-stage lung cancer, locally advanced lung cancer, oligometastatic/oligoprogressive disease, special intraoperative situations, and tumor recurrence), with 2 cases per theme.
Adjudication Expert Panel:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of attending physician or above in a thoracic surgery department at a tertiary Class A hospital;
3. Chairs or regularly participates in lung cancer multidisciplinary team (MDT) work in their department.
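
The case draw described under Study Cases, item 5 (12 cases, 2 per theme, fixed seed) can be sketched as a stratified random draw. This is a minimal illustration and not the study's actual script: the function name `draw_study_cases`, the placeholder case IDs, and the seed value are hypothetical, and the record only says `numpy.random` is used, so the choice of `numpy.random.default_rng` is an assumption.

```python
import numpy as np

# The six themes listed in the inclusion criteria.
THEMES = [
    "chest mass of undetermined diagnosis",
    "early-stage lung cancer",
    "locally advanced lung cancer",
    "oligometastatic/oligoprogressive disease",
    "special intraoperative situations",
    "tumor recurrence",
]


def draw_study_cases(pool, seed, per_theme=2):
    """Draw `per_theme` cases per theme without replacement.

    pool: dict mapping each theme to a list of eligible case IDs.
    seed: fixed integer seed, archived so the draw can be reproduced.
    """
    rng = np.random.default_rng(seed)  # assumed interface; the record only names numpy.random
    drawn = {}
    for theme in THEMES:
        # Sort first so the result depends only on the seed, not on input order.
        candidates = sorted(pool[theme])
        if len(candidates) < per_theme:
            raise ValueError(f"not enough eligible cases for theme: {theme}")
        picks = rng.choice(len(candidates), size=per_theme, replace=False)
        drawn[theme] = sorted(candidates[i] for i in picks)
    return drawn


# Hypothetical eligible pool: five placeholder case IDs per theme.
pool = {
    theme: [f"theme{t + 1}-case{i + 1:02d}" for i in range(5)]
    for t, theme in enumerate(THEMES)
}
selected = draw_study_cases(pool, seed=20260610)  # placeholder seed value
for theme, case_ids in selected.items():
    print(theme, case_ids)
```

Drawing within each theme, rather than from the pooled list, is what guarantees the 2-per-theme coverage; rerunning with the archived seed and the same pool reproduces the same 12 cases.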

Exclusion Criteria:

Resident Physician Subjects:
1. Has previously participated in the construction of the GAPS evaluation set or the development of GAPS-Agent;
2. Unable to complete the tasks of the study phase.
Study Cases:
1. Key case information is missing, such as text-form data on pathology (including IHC/NGS), imaging, laboratory tests, prior medical history, comorbidities, or PS score;
2. Decision-making for the case is strictly dependent on non-text information.
Adjudication Expert Panel:
1. Participated in the construction of the GAPS evaluation set, the content validity verification, or the development of GAPS-Agent for this study;
2. Has a direct conflict of interest with any product used in either arm of this study.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Other
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Experimental: test arm GAPS-Agent	Other: GAPS-Agent. The research group has previously developed the GAPS evaluation framework for complex clinical decision-making in lung cancer. In this framework, G (Grounding) characterizes the cognitive depth of decision-making (ranging from knowledge retrieval to decisions that go beyond clinical guidelines), A (Authority) corresponds to the grading of evidence strength, P (Perturbation) describes the identification and management of real-world clinical confounding factors, and S (Strength) corresponds to the calibration of recommendation strength. Within this framework, the research group has constructed a 100-item complex lung cancer decision-making evaluation set along with its corresponding rubrics, and has invited multiple thoracic oncology experts to validate its content validity. Based on this, the research group developed GAPS-Agent, which uses an open-source large language model as its foundation and integrates functional modules such as guideline and evidence retri
Active Comparator: control arm LLM	Other: LLM. An open-source large language model that is not specifically enhanced for the medical field.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Overall plan Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
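
The win ratio and its bootstrap interval can be sketched as follows. This is one possible reading of the "two-level (physician × case) cluster bootstrap", not the study's analysis code: it assumes each judgment is tagged with one physician ID and one case ID, resamples physicians and cases independently with replacement, and pools the judgments in the resulting physician × case cells. The function names, the record layout, and the toy data are hypothetical.

```python
import math
import random
from collections import defaultdict


def win_ratio(wins, losses):
    """Win ratio as described in the outcome measure: Wins / Losses (ties are ignored)."""
    if losses == 0:
        raise ValueError("win ratio is undefined when there are no losses")
    return wins / losses


def bootstrap_win_ratio_ci(records, n_boot=10_000, alpha=0.05, seed=0):
    """records: iterable of (physician_id, case_id, outcome), outcome in {"win", "tie", "loss"}.

    Returns (point_estimate, ci_low, ci_high) using percentiles of log(win ratio).
    """
    # Pre-aggregate [wins, losses] per (physician, case) cell.
    cells = defaultdict(lambda: [0, 0])
    for physician, case, outcome in records:
        cell = cells[(physician, case)]
        if outcome == "win":
            cell[0] += 1
        elif outcome == "loss":
            cell[1] += 1
        elif outcome != "tie":
            raise ValueError(f"unknown outcome: {outcome!r}")

    physicians = sorted({p for p, _ in cells})
    cases = sorted({c for _, c in cells})
    point = win_ratio(sum(w for w, _ in cells.values()),
                      sum(l for _, l in cells.values()))

    rng = random.Random(seed)
    log_ratios = []
    for _ in range(n_boot):
        # Level 1: resample physicians; level 2: resample cases (both with replacement).
        boot_physicians = rng.choices(physicians, k=len(physicians))
        boot_cases = rng.choices(cases, k=len(cases))
        wins = losses = 0
        for p in boot_physicians:
            for c in boot_cases:
                w, l = cells.get((p, c), (0, 0))
                wins += w
                losses += l
        if wins > 0 and losses > 0:  # the log is undefined otherwise; such replicates are skipped
            log_ratios.append(math.log(wins / losses))
    if not log_ratios:
        raise ValueError("no usable bootstrap replicates")

    log_ratios.sort()

    def quantile(q):
        # Nearest-rank percentile on the log scale.
        return log_ratios[round(q * (len(log_ratios) - 1))]

    return point, math.exp(quantile(alpha / 2)), math.exp(quantile(1 - alpha / 2))


# Hypothetical toy data: 4 physicians x 12 cases, 10 judgments per cell.
records = [
    (p, c, ("win", "win", "tie", "loss")[(p + c + j) % 4])
    for p in range(4) for c in range(12) for j in range(10)
]
point, low, high = bootstrap_win_ratio_ci(records, n_boot=2_000, seed=1)
print(f"win ratio {point:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Resampling whole physicians and whole cases, rather than individual judgments, keeps correlated judgments together, which is the reason for using a cluster bootstrap here. How replicates with zero wins or zero losses are handled is a design choice that the record does not specify.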

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Inter-rater agreement	For the ternary preference judgment results of 10 expert judges across 192 paired comparisons and 6 evaluation domains, Fleiss' kappa was used to assess inter-rater agreement. The kappa value and its 95% confidence interval are reported for each evaluation domain.	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Redundancy Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Evidence-based medicine adherence Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Actionability Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Completeness Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
Safety Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgements. Calculated up to 3 weeks after the preference judgements.
GAPS automated rubric score	A third-party large language model, independent of the two study arms' base models, served as the judge model and automatically scored all 96 plans according to the GAPS rubric.	Generated up to 3 weeks after residents finished their plan generation.
Participating physician's self-confidence score	After submitting each case plan, the participating physicians self-rated their confidence in their own plan on a 5-point Likert scale (1-5).	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Tool satisfaction score	After submitting each case plan, the participating physicians rated their satisfaction with the tool on a 5-point Likert scale (1-5).	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Tool trustworthiness score	After submitting each case plan, the participating physicians rated the tool's trustworthiness on a 5-point Likert scale (1-5).	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Decision-making time	The time (in minutes) each participating physician took to produce each case plan was automatically recorded by the evaluation platform. Differences between groups were analyzed using a linear mixed-effects model.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
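
For the inter-rater agreement measure, the standard Fleiss' kappa formula can be computed from a table of category counts per rated item. The sketch below is illustrative only: the function name `fleiss_kappa` and the toy counts are hypothetical, the input layout (one row per paired comparison, one column each for Win, Tie, and Loss, every row summing to the number of judges) is an assumption, and the 95% confidence interval mentioned in the record is not computed here.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][k] is the number of raters
    who assigned item i to category k. Every item must have the same number
    of raters (here: the same number of judges per paired comparison).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Overall proportion of assignments falling in each category.
    p_cat = [
        sum(row[k] for row in counts) / (n_items * n_raters)
        for k in range(n_categories)
    ]
    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(x * x for x in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy table: 4 paired comparisons, 10 judges, columns = (Win, Tie, Loss).
toy_counts = [
    [8, 1, 1],
    [2, 3, 5],
    [6, 2, 2],
    [1, 1, 8],
]
print(f"Fleiss' kappa: {fleiss_kappa(toy_counts):.3f}")
```

Under this layout, one table would be built per evaluation domain, matching the record's statement that kappa is reported for each domain.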

Collaborators and Investigators

Here you will find the people and organizations involved in this study.

Sponsor

Peking University People's Hospital

Study Record Dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

June 10, 2026

Primary Completion (Estimated)

June 21, 2026

Study Completion (Estimated)

June 21, 2026

Study Registration Dates

First Submitted

June 10, 2026

First Submitted That Met QC Criteria

June 13, 2026

First Posted (Actual)

June 17, 2026

Study Record Updates

Last Update Posted (Actual)

June 17, 2026

Last Update Submitted That Met QC Criteria

June 13, 2026

Last Verified

June 1, 2026

More Information

Terms Related to This Study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

2026PHB458-001

Plan for Individual Participant Data (IPD)

Plan to Share Individual Participant Data (IPD)?

No

Drug and Device Information, Study Documents

Studies a U.S. FDA-regulated Drug Product

No

Studies a U.S. FDA-regulated Device Product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove, or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, it is also updated automatically on our website.
