Preliminary Evaluation of a Large Language Model-Based Tool for Complex Surgical Decision Support in Lung Cancer
Study Overview
Status
Conditions
Intervention / Treatment
Study Type
Enrollment (Estimated)
Phase
- Not applicable
Contacts and Locations
Study Locations
- Peking University People's Hospital, Beijing, Beijing Municipality, China, 100044
Participation Criteria
Eligibility Criteria
Ages Eligible for Study
- Adult
- Older Adult
Accepts Healthy Volunteers
Description
Inclusion Criteria:
Resident Physician Subjects:
- Holds a valid and legally effective Physician Practice License of the People's Republic of China;
- Currently holds the rank of resident physician in a thoracic surgery department at a tertiary Class A (3A) hospital;
- Agrees to complete all assessment tasks of the main study phase in accordance with the study protocol;
- Can guarantee the time and effort required to complete all assessment tasks of the main study.
Study Cases:
- The case was discussed at the Thoracic Oncology Multidisciplinary Team (MDT) conference of Peking University People's Hospital between January 2025 and May 2026;
- The current version of the NCCN guidelines does not provide an explicit recommendation covering the management of the case;
- Does not overlap with the GAPS evaluation set;
- The case is presented as plain text in a structured format, with all direct and indirect identifiers removed and de-identification completed before inclusion;
- From the pool of eligible cases, 12 cases will be randomly drawn using Python (numpy.random, with a fixed and archived seed) to serve as the main study cases. The cases will cover 6 themes (chest mass of undetermined diagnosis, early-stage lung cancer, locally advanced lung cancer, oligometastatic/oligoprogressive disease, special intraoperative situations, and tumor recurrence), with 2 cases per theme.
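The random draw described in the last criterion could be sketched as follows. This is a minimal illustration, not the study's actual script: it assumes the draw is stratified by theme (2 cases from each of the 6 themes), and the case IDs, the pool size, and the seed value are hypothetical placeholders, since the record gives none of them.

```python
import numpy as np

# The six themes named in the inclusion criteria.
THEMES = [
    "chest mass of undetermined diagnosis",
    "early-stage lung cancer",
    "locally advanced lung cancer",
    "oligometastatic/oligoprogressive disease",
    "special intraoperative situations",
    "tumor recurrence",
]


def draw_study_cases(eligible, seed, per_theme=2):
    """Draw `per_theme` case IDs from each theme without replacement.

    `eligible` maps theme name -> list of de-identified case IDs
    (hypothetical structure, assumed for this sketch).
    """
    rng = np.random.default_rng(seed)  # fixed seed makes the draw reproducible
    selected = {}
    for theme in THEMES:
        pool = sorted(eligible[theme])  # sort so the result ignores input order
        if len(pool) < per_theme:
            raise ValueError(f"not enough eligible cases for theme: {theme}")
        picks = rng.choice(pool, size=per_theme, replace=False)
        selected[theme] = [str(p) for p in picks]
    return selected


# Hypothetical pool: five eligible cases per theme, with made-up IDs.
pool = {t: [f"T{i}-C{j:02d}" for j in range(1, 6)]
        for i, t in enumerate(THEMES, start=1)}
ARCHIVED_SEED = 20260101  # placeholder; the study's real seed is not in the record
cases = draw_study_cases(pool, ARCHIVED_SEED)
```

Archiving the seed together with the sorted pool of eligible IDs is what allows the same 12 cases to be regenerated later for audit.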
Adjudication Expert Panel:
- Holds a valid and legally effective Physician Practice License of the People's Republic of China;
- Currently holds the rank of attending physician or above in a thoracic surgery department at a tertiary Class A hospital;
- Chairs or regularly participates in lung cancer multidisciplinary team (MDT) work in their department.
Exclusion Criteria:
Resident Physician Subjects:
- Has previously participated in the construction of the GAPS evaluation set or the development of GAPS-Agent;
- Unable to complete the tasks of the study phase.
Study Cases:
- Key case information is missing, such as text-form data on pathology (including IHC/NGS), imaging, laboratory tests, prior medical history, comorbidities, or PS score;
- Decision-making for the case is strictly dependent on non-text information.
Adjudication Expert Panel:
- Participated in the construction of the GAPS evaluation set, the content validity verification, or the development of GAPS-Agent for this study;
- Has a direct conflict of interest with any specific product used as a tool in either arm of this study.
Study Plan
How is the study designed?
Design Details
- Primary Purpose: Other
- Allocation: Randomized
- Interventional Model: Parallel Assignment
- Masking: Single
Number of Arms
Arms and Interventions
| Participant Group / Arm | Intervention / Treatment |
|---|---|
| Experimental: test arm (GAPS-Agent) | The research group has previously developed the GAPS evaluation framework for complex clinical decision-making in lung cancer. In this framework, G (Grounding) characterizes the cognitive depth of decision-making (ranging from knowledge retrieval to decisions that go beyond clinical guidelines), A (Authority) corresponds to the grading of evidence strength, P (Perturbation) describes the identification and management of real-world clinical confounding factors, and S (Strength) corresponds to the calibration of recommendation strength. Within this framework, the research group has constructed a 100-item complex lung cancer decision-making evaluation set with corresponding rubrics, and has invited multiple thoracic oncology experts to validate its content. Based on this, the research group developed GAPS-Agent, which uses an open-source large language model as its foundation and integrates functional modules such as guideline and evidence retri |
| Active Comparator: control arm (LLM) | An open-source large language model with no specific enhancement for the medical field. |
What is the study measuring?
Primary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Overall plan Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of overall plan quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
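The win ratio and its bootstrap interval can be illustrated with a short sketch. It rests on assumptions the record does not spell out: each judgment is stored as a (physician, case, outcome) tuple, the two levels are resampled in a nested way (physicians first, then cases within each sampled physician), and replicates with zero wins or zero losses are skipped because their log win ratio is undefined. The function names and the toy data are made up for illustration and are not study results.

```python
import math
import random
from collections import defaultdict


def count_wins_losses(outcomes):
    """Count wins and losses; ties do not enter the win ratio."""
    wins = sum(o == "win" for o in outcomes)
    losses = sum(o == "loss" for o in outcomes)
    return wins, losses


def bootstrap_win_ratio_ci(records, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile CI on the log scale from a nested two-level cluster bootstrap.

    records: iterable of (physician_id, case_id, outcome) tuples (assumed layout).
    """
    rng = random.Random(seed)
    # Group outcomes as physician -> case -> list of judgments.
    nested = defaultdict(lambda: defaultdict(list))
    for physician, case, outcome in records:
        nested[physician][case].append(outcome)
    physicians = sorted(nested)

    log_ratios = []
    for _ in range(n_boot):
        sample = []
        # Level 1: resample physicians with replacement.
        for p in rng.choices(physicians, k=len(physicians)):
            case_ids = sorted(nested[p])
            # Level 2: resample that physician's cases with replacement.
            for c in rng.choices(case_ids, k=len(case_ids)):
                sample.extend(nested[p][c])
        wins, losses = count_wins_losses(sample)
        if wins and losses:  # skip replicates where log(W/L) is undefined
            log_ratios.append(math.log(wins / losses))

    log_ratios.sort()
    # Simple lower-rank quantiles of the log win ratios, then back-transform.
    lo_idx = int((alpha / 2) * (len(log_ratios) - 1))
    hi_idx = int((1 - alpha / 2) * (len(log_ratios) - 1))
    return math.exp(log_ratios[lo_idx]), math.exp(log_ratios[hi_idx])


# Made-up toy data: 4 physicians x 3 cases x 2 judgments each.
pattern = ["win", "win", "loss", "tie", "win", "loss"]
toy = [
    (p, c, pattern[(i + 2 * j + k) % len(pattern)])
    for i, p in enumerate(["P1", "P2", "P3", "P4"])
    for j, c in enumerate(["C1", "C2", "C3"])
    for k in range(2)
]

wins, losses = count_wins_losses([o for _, _, o in toy])
point = wins / losses  # 12 wins / 8 losses on the toy data
ci_low, ci_high = bootstrap_win_ratio_ci(toy, n_boot=2000, seed=1)
```

Resampling whole physicians and whole cases, rather than individual judgments, keeps correlated judgments together so the interval reflects the clustering. A crossed resampling scheme would be an equally plausible reading of "physician × case" and would change only the inner loop.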
Secondary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Inter-rater agreement | For the ternary preference judgments of the 10 expert judges across 192 paired comparisons and 6 evaluation domains, Fleiss' kappa was used to assess inter-rater agreement. The kappa value and its 95% confidence interval are reported for each evaluation domain. | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Redundancy Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of redundancy. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Evidence-based medicine adherence Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of adherence to evidence-based medicine. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Actionability Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of actionability. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Completeness Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of completeness. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| Safety Win Ratio | A total of 10 blinded expert judges made ternary (Win/Tie/Loss) preference judgments on 192 paired plan comparisons in terms of safety. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale). | Measured when the experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments. |
| GAPS automated rubric score | A third-party large language model, independent of the base models of both study arms, served as the judge model and automatically scored all 96 plans according to the GAPS rubric. | Generated up to 3 weeks after the residents finished generating their plans. |
| Subject physician's self-confidence score | After submitting each case plan, the participating physicians rated their confidence in their own plan on a 5-point Likert scale (1-5). | Completed when the residents submitted their plans. Calculated up to 3 weeks after the submission. |
| Tool satisfaction score | After submitting each case plan, the participating physicians rated their satisfaction with the tool on a 5-point Likert scale (1-5). | Completed when the residents submitted their plans. Calculated up to 3 weeks after the submission. |
| Tool trustworthiness score | After submitting each case plan, the participating physicians rated the tool's trustworthiness on a 5-point Likert scale (1-5). | Completed when the residents submitted their plans. Calculated up to 3 weeks after the submission. |
| Decision-making time | The time (in minutes) each participating physician took to produce each case plan was recorded automatically by the evaluation platform. Differences between groups were analyzed using a linear mixed-effects model. | Completed when the residents submitted their plans. Calculated up to 3 weeks after the submission. |
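For the inter-rater agreement measure, Fleiss' kappa can be computed from a table of per-item category counts. The sketch below implements the standard formula in plain Python. The input layout (one row per paired comparison, one column each for Win, Tie, and Loss) and the toy counts are assumptions for illustration, not study data, and the 95% confidence interval mentioned in the record is not covered here.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every row must sum to the same
    number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)

    # Undefined (division by zero) if all ratings fall in a single category.
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up toy table: 4 paired comparisons, 10 judges, columns = Win, Tie, Loss.
toy_counts = [
    [8, 1, 1],
    [2, 2, 6],
    [5, 5, 0],
    [0, 1, 9],
]
kappa = fleiss_kappa(toy_counts)

# Sanity check: unanimous judges on every item give kappa = 1.
perfect = fleiss_kappa([[10, 0, 0], [0, 10, 0], [0, 0, 10]])
```

Running the function once per evaluation domain, on that domain's table of counts, would give the per-domain kappa values the measure describes.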
Collaborators and Investigators
Sponsor
Study Record Dates
Study Major Dates
Study Start (Actual)
Primary Completion (Estimated)
Study Completion (Estimated)
Study Registration Dates
First Submitted
First Submitted that Met QC Criteria
First Posted (Actual)
Study Record Updates
Last Update Posted (Actual)
Last Update Submitted that Met QC Criteria
Last Verified
More Information
Terms related to this study
Keywords
Additional Relevant MeSH Terms
Other Study ID Numbers
- 2026PHB458-001
Plan for Individual Participant Data (IPD)
Plan to Share Individual Participant Data (IPD)?
Drug and device information, study documents
Studies a U.S. FDA-regulated drug product
Studies a U.S. FDA-regulated device product
This information was retrieved directly from the clinicaltrials.gov website without any changes. To request a change, removal, or update of the study details, contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, it will be updated automatically on our website as well.