- Clinical trial NCT07647159
Voice Assistant for Outpatient Neurology Visits
Voice Assistant for Outpatient Neurology Visits Based on Artificial Intelligence Technologies: A Multicenter Prospective Pilot Study
Documentation duties account for a substantial portion of an outpatient physician's working time and reduce time available for direct patient interaction. Voice assistants based on automatic speech recognition and large language models are being developed to automate medical documentation across clinical specialties. However, such ambient AI-based services have not been systematically validated in Russian-language outpatient neurology practice integrated with a regional electronic health record platform.
This pilot, multicenter, prospective, before-after study evaluated the feasibility and preliminary effectiveness of a voice assistant Service designed to automatically pre-fill the structured outpatient neurology visit protocol in the Moscow regional medical information system (EMIAS). The Service implements a pipeline of streaming speech-to-text transcription, two-speaker diarization, and large language model-based mapping of the dialogue between the physician and the patient onto the fields of the standardized neurology examination protocol.
Five neurologists at five outpatient clinics in Moscow participated. The study comprised three stages: (1) baseline timing of consultations without the Service; (2) timing of consultations with the Service after a two-week adaptation period, with parallel evaluation of transcription and pre-fill quality and of physician and patient satisfaction; and (3) statistical analysis. Three hundred twenty consultations were timed (160 per stage). A stratified random sample of 30 audio-recording / generated-protocol pairs was used to evaluate Service quality; free-text fields were rated on a 5-domain Likert questionnaire and on a 10-point visual analogue scale, and binary fields were rated dichotomously to derive sensitivity, specificity, accuracy, Jaccard index, and false-positive rate. Patient satisfaction was assessed by the modified Patient Satisfaction Questionnaire 8 (PSQ-8); physician feedback was assessed by a custom questionnaire (including the Net Promoter Score) and by semi-structured in-depth interviews with thematic analysis using grounded theory.
The primary outcomes were the change between stages in (a) the time of focused physician attention to the patient and (b) the time spent filling and editing the protocol. Secondary outcomes addressed total consultation time, transcription quality (Word Error Rate), expert-rated quality of pre-filled fields, patient satisfaction, and physician satisfaction.
The study was conducted under the framework of the Moscow Healthcare Department experiment on the use of digital innovation technologies in health care (Order No. 153 of 21 February 2025), and was approved by the local independent ethics committee.
Study Overview
Status
Conditions
Detailed Description
Rationale and evidence gap. Documentation burden is widely recognized as a leading driver of professional burnout among physicians and reduces the time available for direct patient interaction. International evidence on ambient AI scribes - systems that combine automatic speech recognition (ASR) and large language models (LLM) - shows consistent reductions in documentation time and stable or improved patient satisfaction. However, evidence specific to Russian-language outpatient neurology practice integrated with a regional electronic health record is lacking. Neurologists are among the medical specialties with the highest reported burnout rates internationally; the specifics of Russian-language medical terminology, the structured neurological examination, and integration with a regional medical information system preclude direct extrapolation of international findings.
Technical implementation of the Service. The Service implements a distributed pipeline: streaming audio capture by an EMIAS client module; REST-API transmission of audio chunks to the speech recognition service with two-speaker diarization (physician/patient) and contextual re-clarification of streaming transcripts; asynchronous routing of transcription results through Apache Kafka topics to a pre-fill subsystem; large-language-model-based mapping of the full transcript onto a JSON schema of the structured outpatient neurology examination protocol (document class code 23951) using clinical reference dictionaries (including ICD-10 via the MKB10_ACTIVE_SHORT_INFO_V2 reference table); and return of the structured pre-filled protocol to the physician's user interface for review, editing, and signing. Audio is recorded via an active HD-capsule microphone placed at the physician's workstation (OGG Vorbis, 16 kHz, ≥ 64 kbit/s, mono). A priori quality targets fixed in the technical documentation before the pilot were: Word Error Rate of streaming transcription ≤ 10 %; sensitivity, specificity, accuracy, and Jaccard index for binary protocol fields ≥ 0.9; mean expert rating on a Visual Analogue Scale ≥ 8 out of 10; protocol generation time ≤ 35 seconds in ≥ 90 % of consultations; physician Net Promoter Score ≥ 0 %.
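The Word Error Rate target can be made concrete with a short sketch. WER is the word-level edit distance (substitutions, deletions, and insertions) between the automatic transcript and the reference transcript, divided by the number of reference words. The study reports using the jiwer library; the dependency-free function below is only an illustration under stated assumptions, not the study's code. The name `word_error_rate` and the sample sentences are hypothetical, and the only normalization assumed here is lowercasing and whitespace splitting.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word in a six-word reference.
reference_text = "patient reports headache for three days"
hypothesis_text = "patient reports headaches for three days"
wer = word_error_rate(reference_text, hypothesis_text)
meets_target = wer <= 0.10  # a priori target stated above: WER <= 10 %
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference. Results also depend on the normalization applied before comparison, so values from this sketch are not directly comparable with values obtained under a different preprocessing setup.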
Methodological design choices. A non-randomized single-group before-after design was selected as appropriate for a first-line feasibility evaluation in real clinical practice: the same physicians and clinics provide their own baseline reference in Stage 1, eliminating between-physician confounding while preserving ecological validity. A two-week post-deployment adaptation period was prespecified to mitigate learning-curve effects on Stage 2 metrics. Timing of consultation stages was performed by trained experts of the Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department on the basis of three synchronized data sources for each consultation: (i) automatic EMIAS event logs from protocol creation to signing; (ii) screen recording of the physician's workstation (VocoScreen NG, version 4.01.1); and (iii) webcam recording of the physician's face and gaze direction. The tri-source synchronization reduces systematic bias common to single-source or self-reported timing studies.
Quality evaluation framework. A stratified random sample of 30 paired audio recordings and Service-generated protocols (six per physician) was used to evaluate transcription and pre-fill quality. Reference (gold-standard) transcripts were prepared manually by an expert with more than five years of experience. Free-text fields of the protocol were rated by four independent neurology experts on a five-domain Likert questionnaire (Relevance, Accuracy, Completeness, Conciseness, Linguistic correctness) and on a 10-point Visual Analogue Scale; binary fields were rated dichotomously to construct a confusion matrix and derive sensitivity, specificity, accuracy, the Jaccard index, and the false-positive (hallucination) rate. Inter-rater agreement among the four experts was quantified by Gwet's AC2, preferred over Cohen's κ for categorical classifications with asymmetric prevalence. Word Error Rate was computed in Python with the jiwer library against the reference transcripts.
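The confusion-matrix metrics named above can be sketched as follows. This is a minimal illustration, assuming each binary protocol field has one expert reference value and one Service-generated value. The names `BinaryFieldMetrics` and `binary_field_metrics` and the sample data are hypothetical. The false-positive rate is taken here as FP / (FP + TN), which is an assumption, since the record does not spell out its exact definition of the hallucination rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryFieldMetrics:
    sensitivity: float
    specificity: float
    accuracy: float
    jaccard: float
    false_positive_rate: float


def binary_field_metrics(reference: list[bool], predicted: list[bool]) -> BinaryFieldMetrics:
    """Derive the metrics from paired expert (reference) and Service (predicted) values."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # field correctly set
    tn = sum(1 for r, p in pairs if not r and not p)  # field correctly left unset
    fp = sum(1 for r, p in pairs if not r and p)      # field set without support
    fn = sum(1 for r, p in pairs if r and not p)      # field missed

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference value")

    return BinaryFieldMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / len(pairs),
        jaccard=tp / (tp + fp + fn),
        false_positive_rate=fp / (fp + tn),
    )


# Hypothetical ratings for ten binary fields of one protocol.
expert_values = [True, True, True, True, False, False, False, False, False, False]
service_values = [True, True, True, False, True, False, False, False, False, False]
metrics = binary_field_metrics(expert_values, service_values)
```

The Jaccard index ignores true negatives, so it is stricter than accuracy when most fields are legitimately empty, which is a common situation in a structured examination protocol.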
Analytical approach. Quantitative between-stage comparisons used Student's t-test or the Mann-Whitney U test depending on distribution, with Pearson's χ² for categorical variables and Cohen's d as the effect-size measure (α = 0.05, two-sided). Qualitative analysis of semi-structured in-depth interviews with each participating neurologist followed the grounded theory approach of Strauss and Corbin, with open, axial, and selective coding.
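As an illustration of the effect-size measure, the sketch below computes Cohen's d for two independent samples using the pooled standard deviation. The record does not state which variant of Cohen's d was used, so the pooled-SD form is an assumption, and the function name `cohens_d` and the timing values are hypothetical rather than study data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(stage_1: list[float], stage_2: list[float]) -> float:
    """Cohen's d = (mean of Stage 2 - mean of Stage 1) / pooled standard deviation."""
    n1, n2 = len(stage_1), len(stage_2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")

    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_variance = (
        (n1 - 1) * variance(stage_1) + (n2 - 1) * variance(stage_2)
    ) / (n1 + n2 - 2)
    if pooled_variance == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")

    return (mean(stage_2) - mean(stage_1)) / sqrt(pooled_variance)


# Hypothetical protocol-filling times in minutes (not study data).
baseline_minutes = [10.0, 12.0, 14.0]
with_service_minutes = [7.0, 9.0, 11.0]
effect_size = cohens_d(baseline_minutes, with_service_minutes)
```

With this sign convention a negative value means the Stage 2 mean is lower than the Stage 1 mean, for example a shorter protocol-filling time with the Service.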
Study Type
Enrollment (Actual)
Phase
- Not Applicable
Contacts and Locations
Study Locations
Moscow, Russia, 125051
- Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
Participation Criteria
Eligibility Criteria
Ages Eligible for Study
- Adult
- Older Adult
Accepts Healthy Volunteers
Description
Inclusion Criteria:
- First-time outpatient neurology visit conducted by a participating physician according to the approved consultation script.
- Age 18 years or older.
- Signed informed consent to participate in the study.
- Signed consent to processing of personal data.
- Clinical condition that allowed the patient to complete the modified Patient Satisfaction Questionnaire 8 (PSQ-8).
Exclusion Criteria:
- Consultation longer than 45 minutes.
- Consultation shorter than 5 minutes, not requiring a full examination (e.g., prescription renewal, clarification of follow-up questions only).
- Technical failures preventing valid interpretation of the consultation or its timing (microphone failure, malfunction of the recording software, malfunction of the Service).
- Deviation from the approved consultation script.
Study Plan
How is the study designed?
Design Details
- Primary Purpose: Health Services Research
- Allocation: N/A
- Interventional Model: Single Group Assignment
- Masking: None (Open Label)
Arms and Interventions
| Participant Group / Arm | Intervention / Treatment |
|---|---|
| Experimental: Outpatient Neurology Visits with AI Voice Assistant. All participating neurologists conducted outpatient consultations using the Service after a two-week adaptation period. Each consultation followed the approved script; the Service captured the audio, transcribed it in real time, generated a pre-filled visit protocol via a large language model, and returned it for physician review and signing. | The Service is software using artificial intelligence technologies that automatically pre-fills the structured outpatient neurology visit protocol in the Moscow regional medical information system (EMIAS) based on the audio-recorded dialogue between physician and patient. The pipeline comprises: (1) streaming audio capture by the EMIAS audio-recording client module, (2) chunk-wise transmission to the speech recognition service via REST API, (3) routing of transcription results through Apache Kafka topics to the pre-fill subsystem, (4) large language model-based mapping of the full transcript onto the JSON schema of the target document using clinical reference dictionaries, and (5) return of the structured pre-filled protocol to the physician's user interface. Audio is recorded via an active HD-capsule microphone placed at the physician's workstation. The physician reviews each pre-filled field, edits it when necessary, and signs the protocol. |
What is the study measuring?
Primary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Change in physician focused-attention time on the patient | Time (in minutes) of focused physician attention on the patient, including visual contact and active interaction during examination, measured from three synchronized sources (EMIAS logs, screen video, webcam video). | Up to 14 weeks: pre-intervention measurements collected per consultation during study Weeks 1-2 (Stage 1); post-intervention measurements collected per consultation during study Weeks 5-14 (Stage 2, after a 2-week adaptation period). |
| Change in protocol-filling time | Time (in minutes) spent filling and editing the visit protocol, including keyboard input, mouse clicks, copying and pasting, measured from the three synchronized sources. | Up to 14 weeks: pre-intervention measurements collected per consultation during study Weeks 1-2 (Stage 1); post-intervention measurements collected per consultation during study Weeks 5-14 (Stage 2, after a 2-week adaptation period). |
Secondary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| Change in total consultation time | Total time (in minutes) from creation to signing of the visit protocol, derived from EMIAS event logs. | Up to 14 weeks: pre-intervention measurements collected per consultation during study Weeks 1-2 (Stage 1); post-intervention measurements collected per consultation during study Weeks 5-14 (Stage 2, after a 2-week adaptation period). |
| Transcription quality (Word Error Rate, WER) | WER (%) of automatically generated transcripts versus expert reference transcripts on a stratified random sample of 30 consultations (6 per physician), computed with the jiwer Python library. | Up to 14 weeks: assessed once per consultation on a stratified random sample of 30 audio recordings collected during study Weeks 5-14 (post-intervention period). |
| Expert-rated quality of free-text protocol fields | Mean rating of free-text protocol fields by four independent neurology experts using a modified expert evaluation questionnaire ("E-5"), scored on a 5-point Likert scale ranging from 1 ("completely disagree") to 5 ("completely agree"). Each of the "Complaints", "History of present illness", and "Recommendations" sections of 30 Service-generated protocols is rated across five domains: Relevance, Accuracy, Completeness, Conciseness (scored on an inverted scale, where higher values denote less redundant content), and Linguistic correctness. Each domain score ranges from 1 to 5; for all five domains, higher scores indicate better field quality. | Up to 14 weeks: assessed once per protocol on a stratified random sample of 30 Service-generated protocols from consultations conducted during study Weeks 5-14 (post-intervention period). |
| Quality of pre-filling for binary protocol fields | Sensitivity, specificity, accuracy, Jaccard index, and false-positive (hallucination) rate for binary fields of 30 auto-generated protocols rated by four independent experts. | Up to 14 weeks: assessed once per protocol on a stratified random sample of 30 Service-generated protocols from consultations conducted during study Weeks 5-14 (post-intervention period). |
| Overall protocol quality (Visual Analogue Scale) | Mean expert rating of the overall quality of each Service-generated visit protocol on a Visual Analogue Scale (VAS) ranging from 1 to 10. The scale anchors are 1 = "lowest possible quality" and 10 = "highest possible quality"; higher scores indicate better protocol quality. Each of the 30 sampled Service-generated protocols is rated independently by four neurology experts; the mean of all expert ratings across protocols is reported. | Up to 14 weeks: assessed once per protocol on a stratified random sample of 30 Service-generated protocols from consultations conducted during study Weeks 5-14 (post-intervention period). |
| Patient satisfaction (modified PSQ-8) | Mean scores on the modified Patient Satisfaction Questionnaire 8 (PSQ-8), an 8-item self-administered patient satisfaction instrument adapted to outpatient neurology, across seven reported domains: Comfort during the consultation; Perceived physician attention (focus on the patient versus on documentation); Understanding of the physician's explanations; Patient involvement in clinical decision-making; Trust in the prescribed treatment; Understanding of subsequent management tactics; and Knowledge of actions to take in emergency situations. Each item is rated on a 5-point Likert scale ranging from 1 ("absolutely not") to 5 ("yes, completely"). Each domain score ranges from 1 to 5; higher scores indicate greater patient satisfaction. | Up to 14 weeks: pre-intervention questionnaires collected per consultation during study Weeks 1-2 (Stage 1); post-intervention questionnaires collected per consultation during study Weeks 5-14 (Stage 2). |
| Physician satisfaction (Net Promoter Score) | Net Promoter Score (NPS) for physician satisfaction with the Service, calculated from a single 0-to-10 likelihood-to-recommend question administered to each participating neurologist at the end of Stage 2. The item anchors are 0 = "would never recommend" and 10 = "would definitely recommend". Respondents are classified as Detractors (score 0-6), Passives (score 7-8), or Promoters (score 9-10). The Net Promoter Score is calculated as (% Promoters) - (% Detractors). The NPS ranges from -100 (all respondents are Detractors) to +100 (all respondents are Promoters); higher scores indicate greater physician willingness to recommend the Service. | At the end of study (study Week 14): single measurement collected from each participating physician upon completion of Stage 2. |
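The Net Promoter Score rule described in the secondary outcome measures can be sketched in a few lines. The function name `net_promoter_score` and the example responses are hypothetical and are not study data; the cut-offs (0-6 Detractors, 7-8 Passives, 9-10 Promoters) are the ones stated in the record.

```python
def net_promoter_score(scores: list[int]) -> float:
    """Return NPS = % Promoters - % Detractors for 0-to-10 recommendation scores."""
    if not scores:
        raise ValueError("at least one response is required")
    if any(not 0 <= score <= 10 for score in scores):
        raise ValueError("scores must lie between 0 and 10")

    promoters = sum(1 for score in scores if score >= 9)
    detractors = sum(1 for score in scores if score <= 6)
    # Passives (7-8) count toward the denominator only.
    return 100.0 * (promoters - detractors) / len(scores)


# Hypothetical responses from five physicians:
# two Promoters, two Passives, one Detractor.
example_scores = [10, 9, 8, 7, 6]
example_nps = net_promoter_score(example_scores)
```

With only five respondents, each physician shifts the score by 20 points, so the result is best read alongside the interview findings and not as a precise estimate.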
Collaborators and Investigators
Investigators
- Study Director: Yuriy Vasilev, MD, PhD, Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
Study Record Dates
Study Major Dates
Study Start (Actual)
Primary Completion (Actual)
Study Completion (Actual)
Study Registration Dates
First Submitted
First Submitted That Met QC Criteria
First Posted (Actual)
Study Record Updates
Last Update Posted (Actual)
Last Update Submitted That Met QC Criteria
Last Verified
More Information
Terms related to this study
Keywords
Additional Relevant MeSH Terms
Other Study ID Numbers
- 2025-9
Plan for Individual Participant Data (IPD)
Plan to Share Individual Participant Data (IPD)?
IPD Plan Description
Drug and device information, study documents
Studies a U.S. FDA-regulated drug product
Studies a U.S. FDA-regulated device product