Preliminary Evaluation of a Large Language Model-Based Tool for Complex Surgical Decision Support in Lung Cancer

Updated June 13, 2026 by: XiuYuan Chen, Peking University People's Hospital

This is an exploratory effect-size estimation study with two objectives: ① to obtain a point estimate and 95% confidence interval of the Win Ratio for the experimental group (GAPS-Agent) versus the control group (large language model) in blinded pairwise preference judgments by thoracic surgery expert adjudicators, to serve as a sample size planning parameter for subsequent multicenter confirmatory clinical trials; ② to preliminarily evaluate the value of GAPS-Agent within clinical workflows.

The study hypothesis is that, compared with a general-purpose large language model without medical enhancement (control group), a structured agentic workflow optimized on the basis of the GAPS evaluation framework (GAPS-Agent, experimental group) can help junior resident physicians generate clinical decision plans for complex lung cancer cases that are more strongly preferred by senior thoracic surgery expert adjudicators.

Study Overview

Status

Enrolling by invitation

Conditions

Intervention / Treatment

Study Type

Interventional

Enrollment (Estimated)

Phase

Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study and information on where this study is being conducted.

Study Locations

China
- Beijing Municipality
  - Beijing, Beijing Municipality, China, 100044
    - Peking University People's Hospital

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

No

Description

Inclusion Criteria:

Resident Physician Subjects:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of resident physician in a thoracic surgery department at a tertiary Class A (3A) hospital;
3. Agrees to complete all assessment tasks of the main study phase in accordance with the study protocol;
4. Can guarantee the time and effort required to complete all assessment tasks of the main study.
Study Cases:
1. The case was discussed at the Thoracic Oncology Multidisciplinary Team (MDT) conference of Peking University People's Hospital between January 2025 and May 2026;
2. The current version of the NCCN guidelines does not provide an explicit recommendation covering the management of the case;
3. Does not overlap with the GAPS evaluation set;
4. The case is presented as plain text in a structured format, with all direct and indirect identifiers removed and complete de-identification performed prior to inclusion;
5. From the pool of eligible cases, 12 cases will be randomly drawn using Python (numpy.random, with a fixed and archived seed) to serve as the main study cases. The cases will cover 6 themes (chest mass of undetermined diagnosis, early-stage lung cancer, locally advanced lung cancer, oligometastatic/oligoprogressive disease, special intraoperative situations, and tumor recurrence), with 2 cases per theme.
Adjudication Expert Panel:
1. Holds a valid and legally effective Physician Practice License of the People's Republic of China;
2. Currently holds the rank of attending physician or above in a thoracic surgery department at a tertiary Class A hospital;
3. Chairs or regularly participates in lung cancer multidisciplinary team (MDT) work in their department.
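
The case draw described under Study Cases (12 cases, 2 per theme, drawn with numpy.random under a fixed seed) could be sketched as follows. This is a minimal illustration, not the study's actual script: it assumes the draw is stratified by theme, and the function name `draw_cases`, the pool layout, and the seed value used in the usage note are all hypothetical.

```python
import numpy as np

# The six themes listed in the inclusion criteria.
THEMES = [
    "chest mass of undetermined diagnosis",
    "early-stage lung cancer",
    "locally advanced lung cancer",
    "oligometastatic/oligoprogressive disease",
    "special intraoperative situations",
    "tumor recurrence",
]


def draw_cases(pool, seed, per_theme=2):
    """Draw `per_theme` cases per theme without replacement.

    pool: dict mapping each theme to a list of eligible case IDs
          (hypothetical layout, assumed for this sketch).
    seed: fixed integer seed, to be archived with the result.
    """
    rng = np.random.default_rng(seed)
    drawn = {}
    for theme in THEMES:
        # Sort first so the result depends only on the seed and the pool contents,
        # not on the order in which cases were entered.
        candidates = sorted(pool[theme])
        if len(candidates) < per_theme:
            raise ValueError(f"not enough eligible cases for theme: {theme}")
        picks = rng.choice(len(candidates), size=per_theme, replace=False)
        drawn[theme] = sorted(candidates[i] for i in picks)
    return drawn
```

Calling `draw_cases(pool, seed=20260610)` twice on the same pool returns the same 12 cases, which is the property that archiving the seed is meant to guarantee. The seed shown here is a placeholder.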

Exclusion Criteria:

Resident Physician Subjects:
1. Has previously participated in the construction of the GAPS evaluation set or the development of GAPS-Agent;
2. Unable to complete the tasks of the study phase.
Study Cases:
1. Key case information is missing, such as text-form data on pathology (including IHC/NGS), imaging, laboratory tests, prior medical history, comorbidities, or PS score;
2. Decision-making for the case is strictly dependent on non-text information.
Adjudication Expert Panel:
1. Participated in the construction of the GAPS evaluation set, the content validity verification, or the development of GAPS-Agent for this study;
2. Has a direct conflict of interest with any product used as a tool in either arm of this study.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Other
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm

Intervention / Treatment

Experimental: test arm

GAPS-Agent

Other: GAPS-Agent

The research group has previously developed the GAPS evaluation framework for complex clinical decision-making in lung cancer. In this framework, G (Grounding) characterizes the cognitive depth of decision-making (ranging from knowledge retrieval to decisions that go beyond clinical guidelines), A (Authority) corresponds to the grading of evidence strength, P (Perturbation) describes the identification and management of real-world clinical confounding factors, and S (Strength) corresponds to the calibration of recommendation strength. Within this framework, the research group has completed the construction of a 100-item complex lung cancer decision-making evaluation set along with its corresponding rubrics, and has invited multiple thoracic oncology experts to complete content validity validation. Based on this, the research group developed GAPS-Agent, which uses an open-source large language model as its foundation and integrates functional modules such as guideline and evidence retri

Active Comparator: control arm

LLM

Other: LLM

An open-source large language model that is not specifically enhanced for the medical domain.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Overall plan Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments.
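
The Win Ratio and its bootstrap interval could be computed along the following lines. This is a sketch under stated assumptions, not the study's analysis code: the record does not spell out the resampling scheme, so the example assumes that physicians and cases are each resampled with replacement and crossed, that every judgment carries one physician label and one case label, and that outcomes are coded +1 (win), 0 (tie), -1 (loss). Replicates with zero wins or zero losses are skipped, which is also an assumption. The function names are hypothetical.

```python
import numpy as np


def win_ratio(wins, losses):
    """Win ratio as defined in the record: Wins / Losses (ties are not counted)."""
    if losses == 0:
        raise ValueError("win ratio is undefined when there are no losses")
    return wins / losses


def bootstrap_win_ratio_ci(physician, case, outcome, n_boot=10_000, seed=0, alpha=0.05):
    """Point estimate and percentile CI (taken on the log scale) for the win ratio.

    physician, case: cluster labels, one per judgment (assumed layout).
    outcome: +1 win, 0 tie, -1 loss for the experimental arm (assumed coding).
    """
    outcome = np.asarray(outcome)
    _, p_idx = np.unique(np.asarray(physician), return_inverse=True)
    _, c_idx = np.unique(np.asarray(case), return_inverse=True)
    n_p, n_c = p_idx.max() + 1, c_idx.max() + 1

    # Win and loss counts per (physician, case) cell.
    wins = np.zeros((n_p, n_c))
    losses = np.zeros((n_p, n_c))
    np.add.at(wins, (p_idx, c_idx), (outcome == 1).astype(float))
    np.add.at(losses, (p_idx, c_idx), (outcome == -1).astype(float))
    point = win_ratio(wins.sum(), losses.sum())

    rng = np.random.default_rng(seed)
    log_wr = []
    for _ in range(n_boot):
        # Resample both cluster levels with replacement and cross them.
        p = rng.integers(0, n_p, size=n_p)
        c = rng.integers(0, n_c, size=n_c)
        w = wins[np.ix_(p, c)].sum()
        l = losses[np.ix_(p, c)].sum()
        if w > 0 and l > 0:  # skip replicates where log(w / l) is undefined
            log_wr.append(np.log(w / l))
    if not log_wr:
        raise ValueError("no usable bootstrap replicates")
    lo, hi = np.quantile(log_wr, [alpha / 2, 1 - alpha / 2])
    return point, float(np.exp(lo)), float(np.exp(hi))
```

Aggregating counts per cell before resampling keeps each of the B replicates to a small matrix lookup, so B = 10,000 stays cheap even with all judgments from 10 judges included.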

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Inter-rater agreement	For the ternary preference judgment results of 10 expert judges across 192 paired comparisons and 6 evaluation domains, Fleiss' kappa was used to assess inter-rater agreement. The kappa value and its 95% confidence interval are reported for each evaluation domain.	Measured at the time when experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments.
Redundancy Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments.
Evidence-based medicine adherence Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments.
Actionability Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments.
Completeness Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments.
Safety Win Ratio	A total of 10 blinded expert judges made Win/Tie/Loss ternary preference judgments on 192 paired scheme comparisons in terms of overall scheme quality. The win ratio was calculated as Wins ÷ Losses, and the 95% confidence interval was estimated using a two-level (physician × case) cluster bootstrap resampling method (B = 10,000, quantile method on the log scale).	Measured at the time when experts completed their preference judgments. Calculated up to 3 weeks after the preference judgments.
GAPS automated rubric score	A third-party large language model, independent of the two study arms' base models, served as the judge model and automatically scored all 96 plans according to the GAPS rubric.	Generated up to 3 weeks after residents finished their plan generation.
Subject physician's self-confidence score	After submitting each case plan, the participating physicians self-rated their confidence in their own plan using a 1-5 point Likert scale.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Tool satisfaction score	After submitting each case plan, the participating physicians rated their satisfaction with the tool using a 1-5 point Likert scale.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Tool trustworthiness score	After submitting each case plan, the participating physicians rated the tool's credibility using a 1-5 point Likert scale.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
Decision-making time	The time taken (in minutes) by each participating physician to complete the production of each case plan was automatically recorded by the evaluation platform. Differences between groups were analyzed using a linear mixed-effects model.	Completed at the time when residents submitted their plans. Calculated up to 3 weeks after the submission.
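
For the inter-rater agreement measure, Fleiss' kappa can be computed directly from a table of category counts. The sketch below uses the standard textbook formula and assumes a layout with one row per paired comparison and one column per category (Win, Tie, Loss), each row summing to the number of judges. The function name is hypothetical, and the confidence interval mentioned in the record is not covered because the record does not say how it is obtained.

```python
import numpy as np


def fleiss_kappa(counts):
    """Fleiss' kappa for a table of shape (n_items, n_categories).

    counts[i, j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    if n_raters < 2 or not np.all(counts.sum(axis=1) == n_raters):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Overall proportion of ratings falling in each category.
    p_j = counts.sum(axis=0) / (n_items * n_raters)
    # Observed agreement per item, then averaged over items.
    p_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Agreement expected by chance.
    p_e = np.square(p_j).sum()
    return float((p_bar - p_e) / (1 - p_e))
```

With 10 judges and three categories, a row such as `[7, 1, 2]` would mean seven judges preferred the experimental plan, one judged a tie, and two preferred the control plan. One such table would be built per evaluation domain.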

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Peking University People's Hospital

Study Record Dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

June 10, 2026

Primary Completion (Estimated)

June 21, 2026

Study Completion (Estimated)

June 21, 2026

Study Registration Dates

First Submitted

June 10, 2026

First Submitted That Met QC Criteria

June 13, 2026

First Posted (Actual)

June 17, 2026

Study Record Updates

Last Update Posted (Actual)

June 17, 2026

Last Update Submitted That Met QC Criteria

June 13, 2026

Last Verified

June 1, 2026

More Information

Terms related to this study

Keywords

Additional Relevant MeSH Terms

Other Study ID Numbers

2026PHB458-001

Plan for Individual Participant Data (IPD)

Plan to Share Individual Participant Data (IPD)?

No

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

