
Improving the Reliability of LLMs as Medical Assistants for the General Public (LAMP-1)

Updated June 11, 2026 by: Ji Xunming, MD, PhD, Capital Medical University

Improving the Reliability of LLMs as Medical Assistants for the General Public: a Proof of Concept Simulation Trial

This study will evaluate whether three-minute six-dimensions education (3M-6D education) can improve the reliability of large language models (LLMs) as medical assistants for the general public. Participants will be randomly assigned to receive or not receive 3M-6D education and will then use ChatGPT, Gemini, or non-AI information resources. The study will assess relevant condition identification, disposition concordance, red-flag identification, and NASA Task Load Index (NASA-TLX) score.

Study Overview

Status

Not yet recruiting

Conditions

Relevant Conditions Identification

Intervention / Treatment

Detailed Description

This randomized, controlled, proof-of-concept simulation trial will evaluate whether three-minute six-dimensions education (3M-6D education) can improve the reliability of large language models as medical assistants for the general public.

Eligible participants will be randomly assigned in a 1:1:1:1:1 ratio to one of five study groups: the 3M-6D education GPT group, the GPT group, the 3M-6D education Gemini group, the Gemini group, or the control group. Participants in the 3M-6D education GPT and 3M-6D education Gemini groups will receive approximately three minutes of education before using ChatGPT or Gemini. Each participant will be randomly assigned one of 10 standardized clinical scenarios and will complete a simulated counseling task in unrestricted natural language within approximately 10 minutes. The study will assess relevant condition identification, disposition concordance, red-flag identification, and NASA-TLX score.
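The allocation described above can be illustrated with a minimal sketch. The record states only the 1:1:1:1:1 ratio, the five groups, and the 10 scenarios; the permuted-block method, the block size of five, the fixed seed, the uniform scenario draw, and the names `ARMS` and `allocate` are all assumptions made for illustration, not the trial's actual randomization procedure.

```python
import random
from collections import Counter

# Arm labels taken from the study description.
ARMS = [
    "3M-6D education GPT",
    "GPT",
    "3M-6D education Gemini",
    "Gemini",
    "Control",
]
N_SCENARIOS = 10  # standardized clinical scenarios, numbered 1..10 here (assumed numbering)


def allocate(n_participants, seed=0):
    """Return one (arm, scenario) pair per participant.

    Assumption: permuted blocks that contain each arm exactly once, so the
    allocation is balanced at 1:1:1:1:1 after every complete block.
    """
    rng = random.Random(seed)
    allocations = []
    while len(allocations) < n_participants:
        block = ARMS[:]
        rng.shuffle(block)
        for arm in block:
            if len(allocations) == n_participants:
                break
            # Assumption: each scenario is equally likely for every participant.
            scenario = rng.randint(1, N_SCENARIOS)
            allocations.append((arm, scenario))
    return allocations


if __name__ == "__main__":
    plan = allocate(525)  # estimated enrollment from the record
    print(Counter(arm for arm, _ in plan))
```

With the estimated enrollment of 525, complete blocks of five give 105 participants per arm. Scenario counts are not forced to balance in this sketch; stratifying by scenario would be a separate design choice.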

Study Type

Interventional

Enrollment (Estimated)

525

Phase

Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study and information on where the study is being conducted.

Study Contact

Name: Xunming Ji
Phone Number: 01083198962
Email: jixm@ccmu.edu.cn

Study Contact Backup

Name: Chuanjie Wu
Phone Number: 01083199439
Email: wuchuanjie@ccmu.edu.cn

Study Locations

China
- Beijing Municipality
  - Beijing, Beijing Municipality, China, 100053
    - Xuanwu Hospital, Capital Medical University
    - Contact:
      Chuanjie Wu
      Phone Number: 010-83199439
      Email: wuchuanjie@ccmu.edu.cn

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

Yes

Description

Inclusion Criteria:

Age 18 years or older, male or female;
Completed primary school or higher education;
Able to use a smartphone or computer to complete online interaction;
No history of acute ischemic stroke, systemic lupus erythematosus, gastric ulcer, pneumonia, acute myocardial infarction, urinary tract infection, uterine fibroids, diabetes, osteoarthritis, or migraine;
Able to understand and comply with study procedures and to provide written informed consent.

Exclusion Criteria:

Currently or previously employed as a healthcare worker;
Previously received systematic medical training;
Currently involved in concurrent research that may interfere with the results of the present trial;
Any other condition that, in the investigator's judgment, might affect compliance or preclude participation.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Health Services Research
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Experimental: 3M-6D education GPT Group. Participants will first be trained in 3M-6D education, then use ChatGPT to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Behavioral: three minutes six dimensions education. 3M-6D education is designed based on cognitive load theory to reduce the cognitive burden on patients during medical interactions with AI and to improve the clarity and completeness of symptom reporting. Guided by cognitive load theory and the natural process physicians use to take medical histories, we identified candidate information dimensions and developed a structured expression framework with six dimensions for public health queries through a Delphi expert consensus process. Participants are instructed to use the framework to describe their symptoms across these six dimensions; this process can typically be completed within three minutes, so we call this approach three minutes six dimensions education (3M-6D education). Other: ChatGPT. Participants use ChatGPT to complete a standardized simulated clinical scenario in unrestricted natural language.
Experimental: 3M-6D education Gemini Group. Participants will first be trained in 3M-6D education, then use Gemini to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Behavioral: three minutes six dimensions education. 3M-6D education is designed based on cognitive load theory to reduce the cognitive burden on patients during medical interactions with AI and to improve the clarity and completeness of symptom reporting. Guided by cognitive load theory and the natural process physicians use to take medical histories, we identified candidate information dimensions and developed a structured expression framework with six dimensions for public health queries through a Delphi expert consensus process. Participants are instructed to use the framework to describe their symptoms across these six dimensions; this process can typically be completed within three minutes, so we call this approach three minutes six dimensions education (3M-6D education). Other: Gemini. Participants use Gemini to complete a standardized simulated clinical scenario in unrestricted natural language.
Active Comparator: GPT Group. Participants will use ChatGPT to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: ChatGPT. Participants use ChatGPT to complete a standardized simulated clinical scenario in unrestricted natural language.
Active Comparator: Gemini Group. Participants will use Gemini to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: Gemini. Participants use Gemini to complete a standardized simulated clinical scenario in unrestricted natural language.
No Intervention: Control group. Participants will use non-AI tools such as internet searches and medical websites to complete a consultation task in unrestricted natural language in approximately 10 minutes.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Relevant conditions identification of the 3M-6D education GPT group compared with the GPT group	Relevant conditions identification is defined as the proportion of participants whose final response includes the expert-defined final diagnosis or a relevant differential diagnosis.	Usually within 1 hour.
Disposition concordance of the 3M-6D education GPT group compared with the GPT group	Disposition concordance is defined as the proportion of participants whose final care recommendation matches the expert-defined level. The five levels are self-care, routine outpatient care, urgent outpatient care, emergency department visit, and emergency medical services.	Usually within 1 hour.
Relevant conditions identification of the 3M-6D education Gemini group compared with the Gemini group		Usually within 1 hour.
Disposition concordance of the 3M-6D education Gemini group compared with the Gemini group		Usually within 1 hour.
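The disposition concordance definition above lends itself to a short worked sketch. It assumes that the five levels are listed in ascending order of urgency, which the record implies but does not state outright; the names `LEVELS`, `classify`, and `disposition_summary` and the example data are hypothetical. Besides concordance, the sketch reports recommendations that fall below the expert-defined level, which the record lists among its other outcome measures as underestimation of disposition.

```python
# Five disposition levels from the record, assumed to run from least to most urgent.
LEVELS = [
    "self-care",
    "routine outpatient care",
    "urgent outpatient care",
    "emergency department visit",
    "emergency medical services",
]
RANK = {name: i for i, name in enumerate(LEVELS)}


def classify(recommended, expert):
    """Compare a participant's final recommendation with the expert-defined level."""
    diff = RANK[recommended] - RANK[expert]
    if diff == 0:
        return "concordant"
    return "under" if diff < 0 else "over"


def disposition_summary(pairs):
    """Return the proportion of participants in each category.

    `pairs` is a list of (recommended, expert) level names, one per participant.
    """
    labels = [classify(recommended, expert) for recommended, expert in pairs]
    n = len(labels)
    return {key: labels.count(key) / n for key in ("concordant", "under", "over")}


if __name__ == "__main__":
    # Hypothetical data for four participants, not study results.
    example = [
        ("self-care", "self-care"),
        ("routine outpatient care", "emergency department visit"),
        ("emergency medical services", "urgent outpatient care"),
        ("urgent outpatient care", "urgent outpatient care"),
    ]
    print(disposition_summary(example))
```

Treating the levels as an ordinal scale lets one function serve both the concordance outcome and the underestimation outcome, since both compare the same pair of levels.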

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Relevant conditions identification of the 3M-6D education GPT group compared with the control group		Usually within 1 hour.
Relevant conditions identification of the 3M-6D education Gemini group compared with the control group		Usually within 1 hour.
Disposition concordance of the 3M-6D education GPT group compared with the control group		Usually within 1 hour.
Disposition concordance of the 3M-6D education Gemini group compared with the control group		Usually within 1 hour.
Red-flag identification in the 3M-6D education GPT group compared with the GPT group	Red-flag identification is defined as the proportion of participants whose final response includes the key warning signs that experts defined for the assigned scenario.	Usually within 1 hour.
Red-flag identification in the 3M-6D education GPT group compared with the control group		Usually within 1 hour.
Red-flag identification in the 3M-6D education Gemini group compared with the Gemini group		Usually within 1 hour.
Red-flag identification in the 3M-6D education Gemini group compared with the control group		Usually within 1 hour.
NASA Task Load Index score of the 3M-6D education GPT group compared with the GPT group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	Usually within 1 hour.
NASA Task Load Index score of the 3M-6D education GPT group compared with the control group		Usually within 1 hour.
NASA Task Load Index score of the 3M-6D education Gemini group compared with the Gemini group		Usually within 1 hour.
NASA Task Load Index score of the 3M-6D education Gemini group compared with the control group		Usually within 1 hour.
Relevant conditions identification of the 3M-6D education GPT group compared with the 3M-6D education Gemini group		Usually within 1 hour.
Disposition concordance of the 3M-6D education GPT group compared with the 3M-6D education Gemini group		Usually within 1 hour.
Red-flag identification in the 3M-6D education GPT group compared with the 3M-6D education Gemini group		Usually within 1 hour.
NASA Task Load Index score of the 3M-6D education GPT group compared with the 3M-6D education Gemini group		Usually within 1 hour.
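The NASA-TLX total described in the table above is an unweighted mean, which the following minimal sketch makes concrete. The function name `nasa_tlx_total`, the dictionary input, and the example ratings are assumptions for illustration; the sketch takes the six domain ratings as given and applies no weighting or reverse scoring, since the record mentions neither.

```python
# The six NASA-TLX domains named in the record.
DOMAINS = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "effort",
    "frustration",
    "performance",
)


def nasa_tlx_total(ratings):
    """Return the mean of the six domain ratings, each expected in the range 0..100."""
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    for domain in DOMAINS:
        if not 0 <= ratings[domain] <= 100:
            raise ValueError(f"{domain} rating must be between 0 and 100")
    return sum(ratings[d] for d in DOMAINS) / len(DOMAINS)


if __name__ == "__main__":
    # Hypothetical ratings for one participant, not study data.
    example = {
        "mental demand": 60,
        "physical demand": 10,
        "temporal demand": 40,
        "effort": 50,
        "frustration": 30,
        "performance": 20,
    }
    print(nasa_tlx_total(example))  # (60 + 10 + 40 + 50 + 30 + 20) / 6 = 35.0
```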

Other Outcome Measures

Outcome Measure	Measure Description	Time Frame
Failure to identify red flags in the 3M-6D education GPT group compared with the GPT group	Failure to identify red flags is defined as the proportion of participants whose final response does not include the expert-defined red-flag symptoms or warning signs for the assigned standardized simulated clinical scenario.	Usually within 1 hour.
Failure to identify red flags in the 3M-6D education GPT group compared with the control group		Usually within 1 hour.
Failure to identify red flags in the 3M-6D education Gemini group compared with the Gemini group		Usually within 1 hour.
Failure to identify red flags in the 3M-6D education Gemini group compared with the control group		Usually within 1 hour.
Underestimation of disposition in the 3M-6D education GPT group compared with the GPT group	Underestimation of disposition is defined as the proportion of participants whose final care recommendation is lower than the expert-defined disposition level for the assigned standardized simulated clinical scenario.	Usually within 1 hour.
Underestimation of disposition in the 3M-6D education GPT group compared with the control group		Usually within 1 hour.
Underestimation of disposition in the 3M-6D education Gemini group compared with the Gemini group		Usually within 1 hour.
Underestimation of disposition in the 3M-6D education Gemini group compared with the control group		Usually within 1 hour.

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Capital Medical University

Collaborators

Xuanwu Hospital, Beijing

Study Record Dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Estimated)

June 20, 2026

Primary Completion (Estimated)

July 20, 2026

Study Completion (Estimated)

July 20, 2026

Study Registration Dates

First Submitted

June 11, 2026

First Submitted That Met QC Criteria

June 11, 2026

First Posted (Actual)

June 16, 2026

Study Record Updates

Last Update Posted (Actual)

June 16, 2026

Last Update Submitted That Met QC Criteria

June 11, 2026

Last Verified

June 1, 2026

More Information

Terms related to this study

Keywords

Other Study ID Numbers

LAMP-1

Plan for Individual Participant Data (IPD)

Plan to Share Individual Participant Data (IPD)?

Undecided

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

Studies a U.S. FDA-regulated device product

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove, or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, it will be updated automatically on our website as well.
