Improving the Reliability of LLMs as Medical Assistants for the General Public (LAMP-1)

Updated July 3, 2026 by: Ji Xunming, MD, PhD, Capital Medical University

Improving the Reliability of LLMs as Medical Assistants for the General Public: a Proof of Concept Simulation Trial

This study will evaluate whether three-minute six-dimensions education (3M-6D education) can improve the reliability of large language models as medical assistants for the general public. Participants will be randomly assigned to receive or not receive 3M-6D education and will then use ChatGPT, Gemini, or non-AI information resources. The study will assess relevant condition identification, disposition concordance, red-flag identification, and NASA-TLX score.

Study Overview

Status

Recruiting

Conditions

Relevant Conditions Identification

Intervention / Treatment

Detailed Description

This randomized, controlled, proof-of-concept simulation trial will evaluate whether three-minute six-dimensions education (3M-6D education) can improve the reliability of large language models as medical assistants for the general public.

Eligible participants will be randomly assigned in a 1:1:1:1:1 ratio to one of five study groups: the 3M-6D education GPT group, the GPT group, the 3M-6D education Gemini group, the Gemini group, or the control group. Participants in the 3M-6D education GPT and 3M-6D education Gemini groups will receive approximately three minutes of education before using ChatGPT or Gemini. Each participant will be randomly assigned one of 10 standardized clinical scenarios and will complete a simulated counseling task in unrestricted natural language within approximately 10 minutes. The study will assess relevant condition identification, disposition concordance, red-flag identification, and NASA-TLX score.
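The allocation described above can be sketched in a few lines. The record does not state the randomization method, so this is a minimal illustration that assumes permuted blocks of five (one slot per arm) and an independent uniform draw of one of the 10 scenarios; the function name `allocate` and the seed are hypothetical.

```python
import random

# The five study groups named in the record.
ARMS = [
    "3M-6D education GPT",
    "GPT",
    "3M-6D education Gemini",
    "Gemini",
    "Control",
]
N_SCENARIOS = 10  # standardized clinical scenarios


def allocate(n_participants, seed=0):
    """Return a list of (arm, scenario) pairs.

    Assumption: permuted-block randomization with block size 5, which
    keeps the arms balanced 1:1:1:1:1 after every complete block.
    Assumption: the scenario is drawn uniformly and independently.
    """
    rng = random.Random(seed)
    allocations = []
    while len(allocations) < n_participants:
        block = ARMS[:]      # one slot per arm
        rng.shuffle(block)   # random order within the block
        for arm in block:
            if len(allocations) == n_participants:
                break
            scenario = rng.randint(1, N_SCENARIOS)  # inclusive bounds
            allocations.append((arm, scenario))
    return allocations


allocations = allocate(525)
```

With the estimated enrollment of 525, complete blocks give 105 participants per arm under this scheme.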

Study Type

Interventional

Enrollment (Estimated)

525

Phase

Not Applicable

Contacts and Locations

This section provides the contact details of those conducting the study and information on where this study is being conducted.

Study Contact

Name: Xunming Ji
Phone Number: 01083198962
Email: jixm@ccmu.edu.cn

Study Contact Backup

Name: Chuanjie Wu
Phone Number: 01083199439
Email: wuchuanjie@ccmu.edu.cn

Study Locations

China
- Beijing Municipality
  - Beijing, Beijing Municipality, China
    - Recruiting
    - Beijing City
    - Contact: Chuanjie Wu
      - Phone Number: 010-83199439
      - Email: wuchuanjie@ccmu.edu.cn

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

Description

Inclusion Criteria:

Age 18 years or greater, male or female;
Completed primary school or higher education;
Able to use a smartphone or computer to complete online interaction;
No history of acute ischemic stroke, systemic lupus erythematosus, gastric ulcer, pneumonia, acute cardiac infarction, urinary tract infection, uterine fibroids, diabetes, osteoarthritis, or migraine;
Able to understand and comply with study procedures and to provide written informed consent.

Exclusion Criteria:

Currently or previously employed as a healthcare worker;
Previously received systematic medical training;
Currently involved in concurrent research that may interfere with the results of the present trial;
Any other condition that, in the investigator's judgment, might affect compliance or preclude participation.

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Health Services Research
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Experimental: 3M-6D education GPT Group. Participants will first be trained in 3M-6D education, then use ChatGPT to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: ChatGPT. Participants use ChatGPT to complete a standardized simulated clinical scenario in unrestricted natural language. Behavioral: three minutes six dimensions education. 3M-6D education is designed based on cognitive load theory to reduce the cognitive burden on patients during medical interactions with AI and to improve the clarity and completeness of symptom reporting. Guided by cognitive load theory and the natural process physicians use to take medical histories, the investigators identified candidate information dimensions and developed a structured expression framework with six dimensions for public health queries through a Delphi expert consensus process. Participants are instructed to use the framework to describe their symptoms across these six dimensions; this process can typically be completed within three minutes, so the investigators call this approach three minutes six dimensions education (3M-6D education).
Experimental: 3M-6D education Gemini Group. Participants will first be trained in 3M-6D education, then use Gemini to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: Gemini. Participants use Gemini to complete a standardized simulated clinical scenario in unrestricted natural language. Behavioral: three minutes six dimensions education. 3M-6D education is designed based on cognitive load theory to reduce the cognitive burden on patients during medical interactions with AI and to improve the clarity and completeness of symptom reporting. Guided by cognitive load theory and the natural process physicians use to take medical histories, the investigators identified candidate information dimensions and developed a structured expression framework with six dimensions for public health queries through a Delphi expert consensus process. Participants are instructed to use the framework to describe their symptoms across these six dimensions; this process can typically be completed within three minutes, so the investigators call this approach three minutes six dimensions education (3M-6D education).
Active Comparator: GPT Group. Participants will use ChatGPT to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: ChatGPT. Participants use ChatGPT to complete a standardized simulated clinical scenario in unrestricted natural language.
Active Comparator: Gemini Group. Participants will use Gemini to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: Gemini. Participants use Gemini to complete a standardized simulated clinical scenario in unrestricted natural language.
No Intervention: Control group. Participants will use non-AI tools such as internet searches and medical websites to complete a consultation task in unrestricted natural language in approximately 10 minutes.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Relevant conditions identification of the 3M-6D education GPT group compared with the GPT group	Relevant conditions identification is defined as the proportion of participants whose final response includes the expert-defined final diagnosis or a relevant differential diagnosis.	1 hour
Disposition concordance of the 3M-6D education GPT group compared with the GPT group	Disposition concordance is defined as the proportion of participants whose final care recommendation matches the expert-defined level. The five levels are self-care, routine outpatient care, urgent outpatient care, emergency department visit, and emergency medical services.	1 hour
Relevant conditions identification of the 3M-6D education Gemini group compared with the Gemini group		1 hour
Disposition concordance of the 3M-6D education Gemini group compared with the Gemini group		1 hour
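As a minimal sketch of the disposition concordance definition above: the proportion of participants whose final care recommendation equals the expert-defined level. The function name and the `(participant_level, expert_level)` pair layout are hypothetical; the record does not describe how responses are coded.

```python
# The five disposition levels named in the record.
DISPOSITION_LEVELS = (
    "self-care",
    "routine outpatient care",
    "urgent outpatient care",
    "emergency department visit",
    "emergency medical services",
)


def disposition_concordance(pairs):
    """Proportion of (participant_level, expert_level) pairs that match.

    Assumption: each participant's final recommendation has already been
    coded as exactly one of the five levels.
    """
    if not pairs:
        raise ValueError("at least one participant is required")
    for chosen, expert in pairs:
        if chosen not in DISPOSITION_LEVELS or expert not in DISPOSITION_LEVELS:
            raise ValueError(f"unknown disposition level in {(chosen, expert)}")
    matches = sum(1 for chosen, expert in pairs if chosen == expert)
    return matches / len(pairs)


# Illustrative data only, not study results.
example = [
    ("self-care", "self-care"),
    ("urgent outpatient care", "emergency department visit"),
    ("emergency medical services", "emergency medical services"),
    ("routine outpatient care", "self-care"),
]
concordance = disposition_concordance(example)  # 2 of 4 match
```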

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Relevant conditions identification of the 3M-6D education GPT group compared with the control group		1 hour
Relevant conditions identification of the 3M-6D education Gemini group compared with the control group		1 hour
Disposition concordance of the 3M-6D education GPT group compared with the control group		1 hour
Disposition concordance of the 3M-6D education Gemini group compared with the control group		1 hour
Red-flag identification in the 3M-6D education GPT group compared with the GPT group	Red-flag identification is defined as the proportion of participants whose final response includes the key warning signs that experts defined for the assigned scenario.	1 hour
Red-flag identification in the 3M-6D education GPT group compared with the control group		1 hour
Red-flag identification in the 3M-6D education Gemini group compared with the Gemini group		1 hour
Red-flag identification in the 3M-6D education Gemini group compared with the control group		1 hour
NASA Task Load Index score of the 3M-6D education GPT group compared with the GPT group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
NASA Task Load Index score of the 3M-6D education GPT group compared with the control group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
NASA Task Load Index score of the 3M-6D education Gemini group compared with the Gemini group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
NASA Task Load Index score of the 3M-6D education Gemini group compared with the control group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
Relevant conditions identification of the 3M-6D education GPT group compared with the 3M-6D education Gemini group		1 hour
Disposition concordance of the 3M-6D education GPT group compared with the 3M-6D education Gemini group		1 hour
Red-flag identification in the 3M-6D education GPT group compared with the 3M-6D education Gemini group		1 hour
NASA Task Load Index score of the 3M-6D education GPT group compared with the 3M-6D education Gemini group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
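The NASA-TLX total described in the table is the unweighted mean of six domain ratings, each from 0 to 100. A minimal sketch follows; the function name and the dictionary layout are hypothetical, and the ratings shown are illustrative, not study data.

```python
# The six NASA-TLX domains listed in the record.
DOMAINS = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "effort",
    "frustration",
    "performance",
)


def nasa_tlx_total(ratings):
    """Return the mean of the six domain ratings (each 0 to 100).

    Higher totals indicate greater perceived task load.
    """
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    for domain in DOMAINS:
        if not 0 <= ratings[domain] <= 100:
            raise ValueError(f"{domain} rating must be between 0 and 100")
    return sum(ratings[d] for d in DOMAINS) / len(DOMAINS)


# Illustrative ratings for one participant.
example_ratings = {
    "mental demand": 60,
    "physical demand": 20,
    "temporal demand": 50,
    "effort": 70,
    "frustration": 40,
    "performance": 30,
}
total = nasa_tlx_total(example_ratings)  # (60+20+50+70+40+30) / 6
```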

Other Outcome Measures

Outcome Measure	Measure Description	Time Frame
Failure to identify red flags in the 3M-6D education GPT group compared with the GPT group	Failure to identify red flags is defined as the proportion of participants whose final response does not include the expert-defined red-flag symptoms or warning signs for the assigned standardized simulated clinical scenario.	1 hour
Failure to identify red flags in the 3M-6D education GPT group compared with the control group		1 hour
Failure to identify red flags in the 3M-6D education Gemini group compared with the Gemini group		1 hour
Failure to identify red flags in the 3M-6D education Gemini group compared with the control group		1 hour
Underestimation of disposition in the 3M-6D education GPT group compared with the GPT group	Underestimation of disposition is defined as the proportion of participants whose final care recommendation is lower than the expert-defined disposition level for the assigned standardized simulated clinical scenario.	1 hour
Underestimation of disposition in the 3M-6D education GPT group compared with the control group		1 hour
Underestimation of disposition in the 3M-6D education Gemini group compared with the Gemini group		1 hour
Underestimation of disposition in the 3M-6D education Gemini group compared with the control group		1 hour
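Underestimation of disposition requires an ordering of the five care levels. The sketch below assumes that the order in which the record lists them (self-care through emergency medical services) is ascending urgency; the record does not state this explicitly. The function name and the pair layout are hypothetical.

```python
# Assumption: listed order equals ascending urgency.
LEVELS_BY_URGENCY = (
    "self-care",
    "routine outpatient care",
    "urgent outpatient care",
    "emergency department visit",
    "emergency medical services",
)
URGENCY_RANK = {level: rank for rank, level in enumerate(LEVELS_BY_URGENCY)}


def underestimation_rate(pairs):
    """Proportion of (participant_level, expert_level) pairs in which the
    participant's recommendation ranks below the expert-defined level."""
    if not pairs:
        raise ValueError("at least one participant is required")
    under = sum(
        1
        for chosen, expert in pairs
        if URGENCY_RANK[chosen] < URGENCY_RANK[expert]
    )
    return under / len(pairs)


# Illustrative data only, not study results.
sample = [
    ("self-care", "self-care"),                               # concordant
    ("urgent outpatient care", "emergency department visit"),  # underestimated
    ("emergency medical services", "emergency medical services"),
    ("routine outpatient care", "self-care"),                  # overestimated
]
rate = underestimation_rate(sample)  # 1 of 4 underestimated
```

Overestimation is counted neither as concordant nor as underestimated here, which matches the two separate definitions in the record.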

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Capital Medical University

Collaborators

Xuanwu Hospital, Beijing

Study Record Dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

July 3, 2026

Primary Completion (Estimated)

July 20, 2026

Study Completion (Estimated)

July 20, 2026

Study Registration Dates

First Submitted

June 11, 2026

First Submitted That Met QC Criteria

June 11, 2026

First Posted (Actual)

June 16, 2026

Study Record Updates

Last Update Posted (Actual)

July 7, 2026

Last Update Submitted That Met QC Criteria

July 3, 2026

Last Verified

July 1, 2026

More Information

Terms related to this study

Keywords

Other Study ID Numbers

LAMP-1

Plan for Individual Participant Data (IPD)

Plan to share individual participant data (IPD)?

UNDECIDED

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you would like to change, remove, or update your study details, please contact register@clinicaltrials.gov. Once a change is implemented on clinicaltrials.gov, it will be updated automatically on our website as well.

