
Improving the Reliability of LLMs as Medical Assistants for the General Public (LAMP-1)

Updated July 3, 2026 by: Ji Xunming, MD, PhD, Capital Medical University

Improving the Reliability of LLMs as Medical Assistants for the General Public: a Proof of Concept Simulation Trial

This study will evaluate whether three-minute six-dimensions education (3M-6D education) can improve the reliability of large language models as medical assistants for the general public. Participants will be randomly assigned to receive or not receive 3M-6D education and will then use ChatGPT, Gemini, or non-AI information resources. The study will assess relevant condition identification, disposition concordance, red-flag identification, and NASA-TLX score.

Study Overview

Status

Recruiting

Conditions

Relevant Conditions Identification

Intervention / Treatment

Detailed Description

This randomized, controlled, proof-of-concept simulation trial will evaluate whether three-minute six-dimensions education (3M-6D education) can improve the reliability of large language models as medical assistants for the general public.

Eligible participants will be randomly assigned in a 1:1:1:1:1 ratio to one of five study groups: the 3M-6D education GPT group, the GPT group, the 3M-6D education Gemini group, the Gemini group, or the control group. Participants in the 3M-6D education GPT and 3M-6D education Gemini groups will receive approximately three minutes of education before using ChatGPT or Gemini. Each participant will be randomly assigned one of 10 standardized clinical scenarios and will complete a simulated counseling task in unrestricted natural language within approximately 10 minutes. The study will assess relevant condition identification, disposition concordance, red-flag identification, and NASA-TLX score.
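The allocation described above can be illustrated with a short Python sketch. The record does not state how the randomization sequence is generated, so the permuted-block approach, the block size of five, the fixed seed, and the names `ARMS` and `allocate` are all assumptions made for illustration only.

```python
import random
from collections import Counter

# Arm labels follow the five study groups named in the record.
ARMS = [
    "3M-6D education GPT",
    "GPT",
    "3M-6D education Gemini",
    "Gemini",
    "Control",
]
N_SCENARIOS = 10  # the record states 10 standardized clinical scenarios


def allocate(n_participants, seed=0):
    """Return one (arm, scenario) pair per participant.

    Assumption: arms come from shuffled blocks of five, which keeps the
    groups balanced at 1:1:1:1:1. The scenario is an independent uniform
    draw from 1..N_SCENARIOS.
    """
    rng = random.Random(seed)
    allocations = []
    while len(allocations) < n_participants:
        block = ARMS[:]
        rng.shuffle(block)
        for arm in block:
            if len(allocations) == n_participants:
                break
            allocations.append((arm, rng.randint(1, N_SCENARIOS)))
    return allocations


# 525 is the estimated enrollment given in the record.
schedule = allocate(525)
arm_counts = Counter(arm for arm, _ in schedule)
```

With the estimated enrollment of 525, complete blocks of five give 105 participants per arm.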

Study Type

Interventional

Enrollment (Estimated)

525

Phase

Not Applicable

Contacts and Locations

This section provides the contact details for those conducting the study, and information on where this study is being conducted.

Study Contact

Name: Xunming Ji
Phone Number: 01083198962
Email: jixm@ccmu.edu.cn

Study Contact Backup

Name: Chuanjie Wu
Phone Number: 01083199439
Email: wuchuanjie@ccmu.edu.cn

Study Locations

China
- Beijing Municipality
  - Beijing, Beijing Municipality, China
    - Recruiting
    - Beijing City
    - Contact: Chuanjie Wu
      Phone Number: 010-83199439
      Email: wuchuanjie@ccmu.edu.cn

Participation Criteria

Researchers look for people who fit a certain description, called eligibility criteria. Some examples of these criteria are a person's general health condition or prior treatments.

Eligibility Criteria

Ages Eligible for Study

Adult
Older Adult

Accepts Healthy Volunteers

Description

Inclusion Criteria:

Age 18 years or greater, male or female;
Completed primary school or higher education;
Able to use a smartphone or computer to complete online interaction;
No history of acute ischemic stroke, systemic lupus erythematosus, gastric ulcer, pneumonia, acute cardiac infarction, urinary tract infection, uterine fibroids, diabetes, osteoarthritis, or migraine;
Able to understand and comply with study procedures and to provide written informed consent.

Exclusion Criteria:

Currently or previously employed as a healthcare worker;
Previously received systematic medical training;
Currently involved in concurrent research that may interfere with the results of the present trial;
Any other condition that, in the investigator's judgment, might affect compliance or preclude participation.
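The criteria above can be read as a checklist. The following Python sketch shows one possible encoding; the `Candidate` fields and the `screening_failures` helper are hypothetical names introduced here and are not part of the study record.

```python
from dataclasses import dataclass, field

# Conditions named in the fourth inclusion criterion of the record.
EXCLUDED_HISTORY = {
    "acute ischemic stroke",
    "systemic lupus erythematosus",
    "gastric ulcer",
    "pneumonia",
    "acute cardiac infarction",
    "urinary tract infection",
    "uterine fibroids",
    "diabetes",
    "osteoarthritis",
    "migraine",
}


@dataclass
class Candidate:
    # Hypothetical screening form; the field names are illustrative only.
    age: int
    completed_primary_school: bool
    can_use_device: bool
    gave_written_consent: bool
    medical_history: set = field(default_factory=set)
    healthcare_worker_ever: bool = False
    systematic_medical_training: bool = False
    in_interfering_study: bool = False
    investigator_concern: bool = False


def screening_failures(c):
    """Return the unmet criteria; an empty list means eligible."""
    reasons = []
    if c.age < 18:
        reasons.append("under 18 years of age")
    if not c.completed_primary_school:
        reasons.append("has not completed primary school")
    if not c.can_use_device:
        reasons.append("cannot use a smartphone or computer")
    listed = sorted(c.medical_history & EXCLUDED_HISTORY)
    if listed:
        reasons.append("history of " + ", ".join(listed))
    if not c.gave_written_consent:
        reasons.append("no written informed consent")
    if c.healthcare_worker_ever:
        reasons.append("current or former healthcare worker")
    if c.systematic_medical_training:
        reasons.append("received systematic medical training")
    if c.in_interfering_study:
        reasons.append("enrolled in concurrent interfering research")
    if c.investigator_concern:
        reasons.append("investigator judgment")
    return reasons


eligible = Candidate(age=34, completed_primary_school=True,
                     can_use_device=True, gave_written_consent=True)
ineligible = Candidate(age=17, completed_primary_school=True,
                       can_use_device=True, gave_written_consent=True,
                       medical_history={"diabetes"})
```

Returning the list of unmet criteria, rather than a single yes or no, keeps the reason for each screening failure available for reporting.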

Study Plan

This section provides details of the study plan, including how the study is designed and what the study is measuring.

How is the study designed?

Design Details

Primary Purpose: Health Services Research
Allocation: Randomized
Interventional Model: Parallel Assignment
Masking: Single

Number of Arms

Arms and Interventions

Participant Group / Arm	Intervention / Treatment
Experimental: 3M-6D education GPT Group. Participants will first be trained in 3M-6D education, then use ChatGPT to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: ChatGPT. Participants use ChatGPT to complete a standardized simulated clinical scenario in unrestricted natural language. Behavioral: Three-minute six-dimensions education (3M-6D education). 3M-6D education is designed based on cognitive load theory to reduce the cognitive burden on patients during medical interactions with AI and to improve the clarity and completeness of symptom reporting. Guided by cognitive load theory and the natural process physicians use to take medical histories, the investigators identified candidate information dimensions and developed a structured expression framework with six dimensions for public health queries through a Delphi expert consensus process. Participants are instructed to use the framework to describe their symptoms across these six dimensions; this process can typically be completed within three minutes, so the investigators call this approach three-minute six-dimensions education (3M-6D education).
Experimental: 3M-6D education Gemini Group. Participants will first be trained in 3M-6D education, then use Gemini to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: Gemini. Participants use Gemini to complete a standardized simulated clinical scenario in unrestricted natural language. Behavioral: Three-minute six-dimensions education (3M-6D education). 3M-6D education is designed based on cognitive load theory to reduce the cognitive burden on patients during medical interactions with AI and to improve the clarity and completeness of symptom reporting. Guided by cognitive load theory and the natural process physicians use to take medical histories, the investigators identified candidate information dimensions and developed a structured expression framework with six dimensions for public health queries through a Delphi expert consensus process. Participants are instructed to use the framework to describe their symptoms across these six dimensions; this process can typically be completed within three minutes, so the investigators call this approach three-minute six-dimensions education (3M-6D education).
Active Comparator: GPT Group. Participants will use ChatGPT to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: ChatGPT. Participants use ChatGPT to complete a standardized simulated clinical scenario in unrestricted natural language.
Active Comparator: Gemini Group. Participants will use Gemini to complete a consultation task in unrestricted natural language in approximately 10 minutes.	Other: Gemini. Participants use Gemini to complete a standardized simulated clinical scenario in unrestricted natural language.
No Intervention: Control group. Participants will use non-AI tools such as internet searches and medical websites to complete a consultation task in unrestricted natural language in approximately 10 minutes.

What is the study measuring?

Primary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Relevant conditions identification of the 3M-6D education GPT group compared with the GPT group	Relevant conditions identification is defined as the proportion of participants whose final response includes the expert-defined final diagnosis or a relevant differential diagnosis.	1 hour
Disposition concordance of the 3M-6D education GPT group compared with the GPT group	Disposition concordance is defined as the proportion of participants whose final care recommendation matches the expert-defined level. The five levels are self-care, routine outpatient care, urgent outpatient care, emergency department visit, and emergency medical services.	1 hour
Relevant conditions identification of the 3M-6D education Gemini group compared with the Gemini group		1 hour
Disposition concordance of the 3M-6D education Gemini group compared with the Gemini group		1 hour
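Both primary outcomes are proportions over participants. As a minimal sketch of how the definitions above could be computed for one study group, consider the following Python; the data layout (pairs of participant answer and expert reference) and the function names are assumptions, and the record does not describe how responses are actually scored.

```python
# The five disposition levels, in the order the record lists them.
DISPOSITION_LEVELS = (
    "self-care",
    "routine outpatient care",
    "urgent outpatient care",
    "emergency department visit",
    "emergency medical services",
)


def proportion(flags):
    """Share of True values in a non-empty sequence of booleans."""
    flags = list(flags)
    if not flags:
        raise ValueError("no participants to score")
    return sum(flags) / len(flags)


def disposition_concordance(pairs):
    """pairs: (participant_level, expert_level) for each participant."""
    return proportion(chosen == expert for chosen, expert in pairs)


def relevant_conditions_identification(pairs):
    """pairs: (conditions_named, conditions_accepted) as sets.

    A response counts if it names at least one accepted condition, that
    is, the expert-defined final diagnosis or a relevant differential.
    """
    return proportion(bool(named & accepted) for named, accepted in pairs)


# Invented example data for four participants in one group.
dispositions = [
    ("self-care", "self-care"),
    ("routine outpatient care", "urgent outpatient care"),
    ("emergency department visit", "emergency department visit"),
    ("emergency medical services", "emergency medical services"),
]
conditions = [
    ({"migraine"}, {"migraine", "tension headache"}),
    ({"sinusitis"}, {"migraine", "tension headache"}),
]
concordance = disposition_concordance(dispositions)
identification = relevant_conditions_identification(conditions)
```

In this invented example, three of four recommendations match the expert level and one of two responses names an accepted condition.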

Secondary Outcome Measures

Outcome Measure	Measure Description	Time Frame
Relevant conditions identification of the 3M-6D education GPT group compared with the control group		1 hour
Relevant conditions identification of the 3M-6D education Gemini group compared with the control group		1 hour
Disposition concordance of the 3M-6D education GPT group compared with the control group		1 hour
Disposition concordance of the 3M-6D education Gemini group compared with the control group		1 hour
Red-flag identification in the 3M-6D education GPT group compared with the GPT group	Red-flag identification is defined as the proportion of participants whose final response includes the key warning signs that experts defined for the assigned scenario.	1 hour
Red-flag identification in the 3M-6D education GPT group compared with the control group		1 hour
Red-flag identification in the 3M-6D education Gemini group compared with the Gemini group		1 hour
Red-flag identification in the 3M-6D education Gemini group compared with the control group		1 hour
NASA Task Load Index score of the 3M-6D education GPT group compared with the GPT group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
NASA Task Load Index score of the 3M-6D education GPT group compared with the control group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
NASA Task Load Index score of the 3M-6D education Gemini group compared with the Gemini group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
NASA Task Load Index score of the 3M-6D education Gemini group compared with the control group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
Relevant conditions identification of the 3M-6D education GPT group compared with the 3M-6D education Gemini group		1 hour
Disposition concordance of the 3M-6D education GPT group compared with the 3M-6D education Gemini group		1 hour
Red-flag identification in the 3M-6D education GPT group compared with the 3M-6D education Gemini group		1 hour
NASA Task Load Index score of the 3M-6D education GPT group compared with the 3M-6D education Gemini group	NASA-TLX score is a self-reported task-load score measured after the simulated consultation with a physician. It includes six domains: mental demand, physical demand, temporal demand, effort, frustration, and performance. Each domain is scored from 0 to 100. The total score is the mean of the six domains. Higher scores indicate greater perceived task load.	1 hour
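The NASA-TLX total described in the table is an unweighted mean of six domain ratings. A minimal Python sketch of that calculation follows, assuming each participant's ratings arrive as a dictionary keyed by domain name; the `nasa_tlx_total` helper and the example ratings are invented for illustration.

```python
# The six domains named in the measure description.
DOMAINS = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "effort",
    "frustration",
    "performance",
)


def nasa_tlx_total(ratings):
    """Mean of the six domain ratings, each required to lie in 0..100."""
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError("missing domains: " + ", ".join(missing))
    for domain in DOMAINS:
        if not 0 <= ratings[domain] <= 100:
            raise ValueError(domain + " rating must be between 0 and 100")
    return sum(ratings[d] for d in DOMAINS) / len(DOMAINS)


# Invented ratings for one participant.
example_ratings = {
    "mental demand": 70,
    "physical demand": 10,
    "temporal demand": 50,
    "effort": 60,
    "frustration": 40,
    "performance": 40,
}
example_total = nasa_tlx_total(example_ratings)  # (70+10+50+60+40+40) / 6 = 45.0
```

A higher total indicates greater perceived task load, as the measure description states.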

Other Outcome Measures

Outcome Measure	Measure Description	Time Frame
Failure to identify red flags in the 3M-6D education GPT group compared with the GPT group	Failure to identify red flags is defined as the proportion of participants whose final response does not include the expert-defined red-flag symptoms or warning signs for the assigned standardized simulated clinical scenario.	1 hour
Failure to identify red flags in the 3M-6D education GPT group compared with the control group		1 hour
Failure to identify red flags in the 3M-6D education Gemini group compared with the Gemini group		1 hour
Failure to identify red flags in the 3M-6D education Gemini group compared with the control group		1 hour
Underestimation of disposition in the 3M-6D education GPT group compared with the GPT group	Underestimation of disposition is defined as the proportion of participants whose final care recommendation is lower than the expert-defined disposition level for the assigned standardized simulated clinical scenario.	1 hour
Underestimation of disposition in the 3M-6D education GPT group compared with the control group		1 hour
Underestimation of disposition in the 3M-6D education Gemini group compared with the Gemini group		1 hour
Underestimation of disposition in the 3M-6D education Gemini group compared with the control group		1 hour
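Underestimation of disposition depends on an ordering of the five care levels. The Python sketch below assumes the order in which the record lists the levels (self-care through emergency medical services) runs from lowest to highest; that ordering, and the names `RANK`, `is_underestimate`, and `underestimation_rate`, are assumptions for illustration.

```python
# Assumed ordering, lowest to highest level of care.
LEVELS = (
    "self-care",
    "routine outpatient care",
    "urgent outpatient care",
    "emergency department visit",
    "emergency medical services",
)
RANK = {name: index for index, name in enumerate(LEVELS)}


def is_underestimate(chosen, expert):
    """True when the participant's level ranks below the expert level."""
    return RANK[chosen] < RANK[expert]


def underestimation_rate(pairs):
    """pairs: (participant_level, expert_level) for each participant."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no participants to score")
    return sum(is_underestimate(c, e) for c, e in pairs) / len(pairs)


# Invented example data for four participants.
example_pairs = [
    ("self-care", "emergency department visit"),               # under
    ("emergency medical services", "urgent outpatient care"),  # over
    ("routine outpatient care", "routine outpatient care"),    # match
    ("urgent outpatient care", "emergency medical services"),  # under
]
example_rate = underestimation_rate(example_pairs)
```

Here two of the four invented recommendations fall below the expert level, giving a rate of 0.5. Overestimates and exact matches both count as not underestimated.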

Collaborators and Investigators

This is where you will find people and organizations involved with this study.

Sponsor

Capital Medical University

Collaborators

Xuanwu Hospital, Beijing

Study Record Dates

These dates track the progress of study record and summary results submissions to ClinicalTrials.gov. Study records and reported results are reviewed by the National Library of Medicine (NLM) to make sure they meet specific quality control standards before being posted on the public website.

Study Major Dates

Study Start (Actual)

July 3, 2026

Primary Completion (Estimated)

July 20, 2026

Study Completion (Estimated)

July 20, 2026

Study Registration Dates

First Submitted

June 11, 2026

First Submitted That Met QC Criteria

June 11, 2026

First Posted (Actual)

June 16, 2026

Study Record Updates

Last Update Posted (Actual)

July 7, 2026

Last Update Submitted That Met QC Criteria

July 3, 2026

Last Verified

July 1, 2026

More Information

Terms related to this study

Keywords

Other Study ID Numbers

LAMP-1

Plan for Individual Participant Data (IPD)

Plan to share individual participant data (IPD)?

Undecided

Drug and device information, study documents

Studies a U.S. FDA-regulated drug product

No

Studies a U.S. FDA-regulated device product

No

This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove, or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, it will also be updated automatically on our website.

