Effect of Artificial Intelligence Tutoring vs Expert Instruction on Learning Simulated Surgical Skills Among Medical Students: A Randomized Clinical Trial

Ali M Fazlollahi, Mohamad Bakhaidar, Ahmad Alsayegh, Recai Yilmaz, Alexander Winkler-Schwartz, Nykan Mirchi, Ian Langleben, Nicole Ledwos, Abdulrahman J Sabbagh, Khalid Bajunaid, Jason M Harley, Rolando F Del Maestro

Abstract

Importance: To better understand the emerging role of artificial intelligence (AI) in surgical training, efficacy of AI tutoring systems, such as the Virtual Operative Assistant (VOA), must be tested and compared with conventional approaches.

Objective: To determine how VOA and remote expert instruction compare in learners' skill acquisition and in affective and cognitive outcomes during surgical simulation training.

Design, setting, and participants: This instructor-blinded randomized clinical trial included medical students (undergraduate years 0-2) from 4 institutions in Canada during a single simulation training at McGill Neurosurgical Simulation and Artificial Intelligence Learning Centre, Montreal, Canada. Cross-sectional data were collected from January to April 2021. Analysis was conducted based on intention-to-treat. Data were analyzed from April to June 2021.

Interventions: The interventions included 5 feedback sessions, 5 minutes each, during a single 75-minute training, including 5 practice sessions followed by 1 realistic virtual reality brain tumor resection. The 3 intervention arms included 2 treatment groups, AI audiovisual metric-based feedback (VOA group) and synchronous verbal scripted debriefing and instruction from a remote expert (instructor group), and a control group that received no feedback.

Main outcomes and measures: The coprimary outcomes were change in procedural performance, quantified as Expertise Score by a validated assessment algorithm (Intelligent Continuous Expertise Monitoring System [ICEMS]; range, -1.00 to 1.00) for each practice resection, and learning and retention, measured from performance in realistic resections by ICEMS and blinded Objective Structured Assessment of Technical Skills (OSATS; range, 1-7). Secondary outcomes included strength of emotions before, during, and after the intervention and cognitive load after intervention, measured in self-reports.

Results: A total of 70 medical students (41 [59%] women and 29 [41%] men; mean [SD] age, 21.8 [2.3] years) from 4 institutions were randomized, including 23 students in the VOA group, 24 students in the instructor group, and 23 students in the control group. All participants were included in the final analysis. ICEMS assessed 350 practice resections, and ICEMS and OSATS evaluated 70 realistic resections. VOA significantly improved practice Expertise Scores by 0.66 (95% CI, 0.55 to 0.77) points compared with the instructor group and by 0.65 (95% CI, 0.54 to 0.77) points compared with the control group (P < .001). Realistic Expertise Scores were significantly higher for the VOA group compared with instructor (mean difference, 0.53 [95% CI, 0.40 to 0.67] points; P < .001) and control (mean difference, 0.49 [95% CI, 0.34 to 0.61] points; P < .001) groups. Mean global OSATS ratings did not differ significantly among the VOA (4.63 [95% CI, 4.06 to 5.20] points), instructor (4.40 [95% CI, 3.88 to 4.91] points), and control (3.86 [95% CI, 3.44 to 4.27] points) groups. However, on the OSATS subscores, VOA significantly enhanced the mean OSATS overall subscore compared with the control group (mean difference, 1.04 [95% CI, 0.13 to 1.96] points; P = .02), whereas expert instruction significantly improved OSATS subscores for instrument handling vs control (mean difference, 1.18 [95% CI, 0.22 to 2.14]; P = .01). No significant differences in cognitive load, positive activating, and negative emotions were found.
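
The counts reported in the Results can be cross-checked with simple arithmetic. The short Python sketch below only restates figures from the abstract (group sizes, sex distribution, and 5 practice sessions per student); the variable names are illustrative and are not taken from the study.

```python
# Figures copied from the abstract; no new data is introduced here.
group_sizes = {"VOA": 23, "instructor": 24, "control": 23}
women, men = 41, 29
practice_sessions_per_student = 5  # stated under Interventions

total_students = sum(group_sizes.values())
practice_resections = total_students * practice_sessions_per_student
pct_women = round(100 * women / total_students)
pct_men = round(100 * men / total_students)

print(total_students, practice_resections, pct_women, pct_men)
# prints: 70 350 59 41
```

The totals agree with the 70 randomized students, the 350 practice resections assessed by ICEMS, and the reported 59% and 41% split.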

Conclusions and relevance: In this randomized clinical trial, VOA feedback demonstrated superior performance outcome and skill transfer, with equivalent OSATS ratings and cognitive and emotional responses compared with remote expert instruction, indicating advantages for its use in simulation training.

Trial registration: ClinicalTrials.gov Identifier: NCT04700384.

Conflict of interest statement

Conflict of Interest Disclosures: Mr Mirchi, Dr Yilmaz, Dr Winkler-Schwartz, Ms Ledwos, and Dr Del Maestro have a US patent for “A Framework For Transparent Artificial Intelligence In Simulation: The Virtual Operative Assistant” application No. PCT/CA2020/050353, international patent No. WO 2020/186348. Dr Mirchi reported receiving grants from Di Giovanni Foundation outside the submitted work. No other disclosures were reported.

Figures

Figure 1. Participant Recruitment Flowchart
Figure 2. Performance Assessment in the Practice Tumor Resections
A, Negative scores indicate a novice; and a positive score, a more expert performance. Scores in each trial are the mean of all estimations made for every 200 milliseconds of the simulated procedure (approximately 1500 predictions for a 5-minute practice scenario). B, Maximum bipolar force application is a recording of the highest amount of force applied with the bipolar during the entire operation. C, Mean instrument tip separation distance measured as the mean distance between the aspirator and the bipolar tips. D, Mean bipolar acceleration measured as the rate of change in the bipolar instrument’s velocity. Error bars indicate 95% CIs; and VOA, Virtual Operative Assistant.
Figure 3. Performance Assessment in the Realistic Tumor Resection
Error bars indicate 95% CIs; OSATS, Objective Structured Assessment of Technical Skills; and VOA, Virtual Operative Assistant.
Figure 4. Emotions and Cognitive Load Throughout the Simulation Training
Positive activating emotions include happy, hopeful, grateful (A), and negative activating emotions include confusion and anxiety (B). Error bars indicate 95% CIs; and VOA, Virtual Operative Assistant.
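
The Figure 2 caption describes each trial's score as the mean of predictions made every 200 milliseconds, which gives about 1500 predictions for a 5-minute scenario. The Python sketch below works through that arithmetic and the averaging step. It is a minimal illustration, not the ICEMS implementation: the function names and the sample prediction values are hypothetical.

```python
from statistics import mean

PREDICTION_INTERVAL_MS = 200  # one prediction per 200 ms, per the Figure 2 caption


def predictions_per_scenario(duration_min: int) -> int:
    """Number of predictions made over a scenario of the given length."""
    return duration_min * 60 * 1000 // PREDICTION_INTERVAL_MS


def trial_score(predictions: list[float]) -> float:
    """Trial-level score: the mean of all per-interval predictions."""
    return mean(predictions)


# Hypothetical per-interval predictions on the -1.00 to 1.00 scale.
sample_predictions = [-0.5, -0.25, 0.25, 1.0]

print(predictions_per_scenario(5))      # prints: 1500
print(trial_score(sample_predictions))  # prints: 0.125
```

A positive mean, as in the made-up sample, would correspond to a more expert performance on the scale described in the caption.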

Source: PubMed
