Preliminary Evaluation of a Large Language Model-Based Tool for Complex Surgical Decision Support in Lung Cancer

Updated June 13, 2026 by: XiuYuan Chen, Peking University People's Hospital

This is an exploratory effect-size estimation study with two objectives: ① to obtain a point estimate and 95% confidence interval of the Win Ratio for the experimental group (GAPS-Agent) versus the control group (large language model) in blinded pairwise preference judgments by thoracic surgery expert adjudicators, to serve as a sample size planning parameter for subsequent multicenter confirmatory clinical trials; ② to preliminarily evaluate the value of GAPS-Agent within clinical workflows.

The hypothesis is that, compared with a general-purpose large language model without medical enhancement (control group), a structured agentic workflow optimized on the basis of the GAPS evaluation framework (GAPS-Agent, experimental group) can help junior resident physicians generate clinical decision plans for complex lung cancer cases that are more strongly preferred by senior thoracic surgery expert adjudicators.

Study Overview

Status

Enrolling by invitation

Conditions

Intervention / Treatment

Study Type

Interventional

Enrollment (Estimated)

Phase

Not Applicable

Contacts and Locations

This section provides the contact details of those conducting the study and information on where this study is being conducted.

Study Locations

China
- Beijing Municipality
  - Beijing, Beijing Municipality, China, 100044
    - Peking University People's Hospital

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

No

Description

Inclusion Criteria:

Resident Physician Subjects:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of resident physician in a thoracic surgery department at a tertiary Class A (3A) hospital;
3. Agrees to complete all assessment tasks of the main study phase in accordance with the study protocol;
4. Can guarantee the time and effort required to complete all assessment tasks of the main study.
Study Cases:
1. The case was discussed at the Thoracic Oncology Multidisciplinary Team (MDT) conference of Peking University People's Hospital between January 2025 and May 2026;
2. The current version of the NCCN guidelines does not provide an explicit recommendation covering the management of the case;
3. Does not overlap with the GAPS evaluation set;
4. The case is presented as plain text in a structured format, with all direct and indirect identifiers removed and de-identification completed prior to inclusion;
5. From the pool of eligible cases, 12 cases will be randomly drawn using Python (numpy.random, with a fixed and archived seed) to serve as the main study cases. The cases will cover 6 themes (chest mass of undetermined diagnosis, early-stage lung cancer, locally advanced lung cancer, oligometastatic/oligoprogressive disease, special intraoperative situations, and tumor recurrence), with 2 cases per theme.
Adjudication Expert Panel:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of attending physician or above in a thoracic surgery department at a tertiary Class A hospital;
3. Chairs or regularly participates in lung cancer multidisciplinary team (MDT) work in their department.
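
The random draw of main study cases described above (12 cases, 2 per theme across 6 themes, with a fixed seed) could be sketched as follows. This is a minimal illustration, not the study's actual script: the registry names numpy.random, whereas this sketch uses Python's standard-library random module so that it runs without third-party packages. The seed value, the case IDs, and the function name draw_cases are hypothetical.

```python
import random

THEMES = [
    "chest mass of undetermined diagnosis",
    "early-stage lung cancer",
    "locally advanced lung cancer",
    "oligometastatic/oligoprogressive disease",
    "special intraoperative situations",
    "tumor recurrence",
]


def draw_cases(pool, per_theme=2, seed=20260610):
    """Draw `per_theme` cases for each theme from `pool` (theme -> list of case IDs).

    The default seed is a placeholder; the study archives its own seed.
    """
    rng = random.Random(seed)  # dedicated generator, so the draw is reproducible
    selected = {}
    for theme in THEMES:
        candidates = sorted(pool[theme])  # sort so the result ignores input order
        if len(candidates) < per_theme:
            raise ValueError(f"not enough eligible cases for theme: {theme}")
        selected[theme] = rng.sample(candidates, per_theme)  # without replacement
    return selected


# Hypothetical pool of eligible, de-identified case IDs (5 per theme).
example_pool = {
    theme: [f"T{i}-C{j}" for j in range(1, 6)]
    for i, theme in enumerate(THEMES, start=1)
}
main_cases = draw_cases(example_pool)
```

Drawing within each theme separately guarantees the 2-per-theme balance, and re-running with the archived seed and the same pool reproduces the same selection.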

Exclusion Criteria:

Resident Physician Subjects:
1. Has previously participated in the construction of the GAPS evaluation set or the development of GAPS-Agent;
2. Unable to complete the tasks of the study phase.
Study Cases:
1. Key case information is missing, such as text-form data on pathology (including IHC/NGS), imaging, laboratory tests, prior medical history, comorbidities, or PS score;
2. Decision-making for the case is strictly dependent on non-text information.
Adjudication Expert Panel:
1. Participated in the construction of the GAPS evaluation set, the content validity verification, or the development of GAPS-Agent for this study;
2. Has a direct conflict of interest with any specific product used in either of the two study arms.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Other
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Experimental: test arm GAPS-Agent	Other: GAPS-Agent. The research group has previously developed the GAPS evaluation framework for complex clinical decision-making in lung cancer. In this framework, G (Grounding) characterizes the cognitive depth of decision-making (ranging from knowledge retrieval to decisions that go beyond clinical guidelines), A (Authority) corresponds to the grading of evidence strength, P (Perturbation) describes the identification and management of real-world clinical confounding factors, and S (Strength) corresponds to the calibration of recommendation strength. Within this framework, the research group has constructed a 100-item evaluation set for complex lung cancer decision-making, along with its corresponding rubrics, and has invited multiple thoracic oncology experts to assess its content validity. Based on this, the research group developed GAPS-Agent, which uses an open-source large language model as its foundation and integrates functional modules such as guideline and evidence retri
Active Comparator: control arm LLM	Other: LLM. An open-source large language model that is not specifically enhanced for the medical field.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Overall plan Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured when the experts complete their preference judgments; calculated up to 3 weeks after the preference judgments.
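
The registry gives the estimator (Wins ÷ Losses) and names a two-level (physician × case) cluster bootstrap with B = 10,000 and quantiles on the log scale, but it does not spell out the resampling scheme. The following is a minimal sketch under stated assumptions: physicians and cases are resampled independently with replacement, the judgments in every selected physician-case cell are pooled, replicates without any win or any loss are skipped because their log ratio is undefined, and each judgment is tagged with a single physician ID. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import defaultdict


def count_cells(records):
    """Tally [wins, losses] per (physician, case) cell; ties count toward neither."""
    cells = defaultdict(lambda: [0, 0])
    for physician, case, outcome in records:
        cell = cells[(physician, case)]  # registers the cell even if it only has ties
        if outcome == "win":
            cell[0] += 1
        elif outcome == "loss":
            cell[1] += 1
    return dict(cells)


def win_ratio(cells):
    """Point estimate: total wins divided by total losses."""
    wins = sum(w for w, _ in cells.values())
    losses = sum(l for _, l in cells.values())
    if losses == 0:
        raise ValueError("win ratio is undefined when there are no losses")
    return wins / losses


def bootstrap_ci(cells, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile interval computed on log(win ratio), then back-transformed."""
    physicians = sorted({p for p, _ in cells})
    cases = sorted({c for _, c in cells})
    rng = random.Random(seed)
    log_ratios = []
    for _ in range(n_boot):
        boot_physicians = rng.choices(physicians, k=len(physicians))  # level 1
        boot_cases = rng.choices(cases, k=len(cases))                 # level 2
        wins = losses = 0
        for p in boot_physicians:
            for c in boot_cases:
                w, l = cells.get((p, c), (0, 0))
                wins += w
                losses += l
        if wins > 0 and losses > 0:  # assumption: skip undefined replicates
            log_ratios.append(math.log(wins / losses))
    if not log_ratios:
        raise ValueError("no usable bootstrap replicates")
    log_ratios.sort()
    last = len(log_ratios) - 1
    lower = log_ratios[math.floor(alpha / 2 * last)]
    upper = log_ratios[math.ceil((1 - alpha / 2) * last)]
    return math.exp(lower), math.exp(upper)


# Synthetic, deterministic toy data (not study data): 4 physicians x 3 cases x 4 judges.
PATTERN = ["win", "win", "tie", "loss", "win"]
toy_records = [
    (p, c, PATTERN[(p + c + j) % 5])
    for p in range(4) for c in range(3) for j in range(4)
]
toy_cells = count_cells(toy_records)
point = win_ratio(toy_cells)
ci_low, ci_high = bootstrap_ci(toy_cells, n_boot=2_000, seed=1)
```

Resampling whole physicians and whole cases, rather than individual judgments, keeps correlated judgments together, which is the reason for clustering at all. How skipped replicates and the paired structure of the comparisons are handled in the actual analysis is not stated in the record.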

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Inter-rater agreement	For the ternary preference judgment results of 10 expert judges across 192 paired comparisons and 6 evaluation domains, Fleiss' kappa was used to assess inter-rater agreement. The kappa value and its 95% confidence interval are reported for each evaluation domain.	Measured when the experts complete their preference judgments; calculated up to 3 weeks after the preference judgments.
Redundancy Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured when the experts complete their preference judgments; calculated up to 3 weeks after the preference judgments.
Evidence-based medicine adherence Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured when the experts complete their preference judgments; calculated up to 3 weeks after the preference judgments.
Actionability Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured when the experts complete their preference judgments; calculated up to 3 weeks after the preference judgments.
Completeness Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured when the experts complete their preference judgments; calculated up to 3 weeks after the preference judgments.
Safety Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured when the experts complete their preference judgments; calculated up to 3 weeks after the preference judgments.
GAPS automated rubric score	A third-party large language model, independent of the two study arms' base models, served as the judge model and automatically scored all 96 plans according to the GAPS rubric.	Generated up to 3 weeks after the residents finish generating their plans.
Subject physician's self-confidence score	After submitting each case plan, the participating physicians self-rated their confidence in their own plan on a 5-point Likert scale (1 to 5).	Completed when the residents submit their plans; calculated up to 3 weeks after submission.
Tool satisfaction score	After submitting each case plan, the participating physicians rated their satisfaction with the tool on a 5-point Likert scale (1 to 5).	Completed when the residents submit their plans; calculated up to 3 weeks after submission.
Tool trustworthiness score	After submitting each case plan, the participating physicians rated the tool's credibility on a 5-point Likert scale (1 to 5).	Completed when the residents submit their plans; calculated up to 3 weeks after submission.
Decision-making time	The time (in minutes) each participating physician took to produce each case plan was automatically recorded by the evaluation platform. Differences between groups were analyzed using a linear mixed-effects model.	Completed when the residents submit their plans; calculated up to 3 weeks after submission.
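
For the inter-rater agreement measure in the table above, Fleiss' kappa can be computed from a table of category counts per rated item. The sketch below implements the standard formula in plain Python; the function name and the toy counts are hypothetical, the three columns are assumed to be Win/Tie/Loss, and the 95% confidence interval mentioned in the record is not covered here.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of category counts.

    counts: one row per rated item (e.g. a paired comparison); each row lists
    how many raters chose each category (e.g. [wins, ties, losses]).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example (not study data): 4 comparisons, 10 judges, columns Win/Tie/Loss.
toy_counts = [
    [10, 0, 0],
    [8, 1, 1],
    [5, 3, 2],
    [2, 2, 6],
]
kappa = fleiss_kappa(toy_counts)
```

In the study's setting this would be applied once per evaluation domain, with one row per paired comparison and a row sum of 10 judges.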

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Peking University People's Hospital

Study Record Dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

June 10, 2026

Primary Completion (Estimated)

June 21, 2026

Study Completion (Estimated)

June 21, 2026

Study Registration Dates

First Submitted

June 10, 2026

First Submitted That Met QC Criteria

June 13, 2026

First Posted (Actual)

June 17, 2026

Study Record Updates

Last Update Posted (Actual)

June 17, 2026

Last Update Submitted That Met QC Criteria

June 13, 2026

Last Verified

June 1, 2026

More Information

Terms related to this study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

2026PHB458-001

Plan for Individual Participant Data (IPD)

Plan to Share Individual Participant Data (IPD)?

No

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No
