Reliability of the American Academy of Sleep Medicine Rules for Assessing Sleep Depth in Clinical Practice

Magdy Younes, Samuel T Kuna, Allan I Pack, James K Walsh, Clete A Kushida, Bethany Staley, Grace W Pien

Abstract

Study objectives: The American Academy of Sleep Medicine has published manuals for scoring polysomnograms that recommend time spent in non-rapid eye movement sleep stages (stage N1, N2, and N3 sleep) be reported. Given the well-established large interrater variability in scoring stage N1 and N3 sleep, we determined the range of time in stage N1 and N3 sleep scored by a large number of technologists when compared to reasonably estimated true values.

Methods: Polysomnograms of 70 females were scored by 10 highly trained sleep technologists, two each from five different academic sleep laboratories. Range and confidence interval (CI = difference between the 5th and 95th percentiles) of the 10 times spent in stage N1 and N3 sleep assigned in each polysomnogram were determined. Average values of times spent in stage N1 and N3 sleep generated by the 10 technologists in each polysomnogram were considered representative of the true values for the individual polysomnogram. Accuracy of different technologists in estimating delta wave duration was determined by comparing their scores to digitally determined durations.
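The per-polysomnogram summary described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the example durations, and the use of linear interpolation between ranked scores to obtain percentiles are all assumptions, since the abstract does not state how the 5th and 95th percentiles were computed.

```python
from statistics import mean


def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a list.

    The interpolation scheme is an assumption; the abstract does not specify one.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def summarize_psg(scores):
    """Summarize one stage's duration (% TST) across technologists for one PSG."""
    p5 = percentile(scores, 5)
    p95 = percentile(scores, 95)
    return {
        "mean": mean(scores),                # taken as the estimate of the true value
        "range": max(scores) - min(scores),  # full spread of the scores
        "ci": p95 - p5,                      # CI as defined in the Methods
    }


# Hypothetical N1 durations (% TST) from 10 technologists for a single PSG.
example_n1 = [4.0, 6.5, 7.0, 8.2, 9.1, 10.0, 11.3, 12.8, 15.0, 21.0]
summary = summarize_psg(example_n1)
print(summary)
```

Applying `summarize_psg` to each polysomnogram in turn would give one CI per polysomnogram, whose mean and standard deviation could then be reported as in the Results.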

Results: The CI of the 10 N1 scores ranged from 4 to 39 percent of total sleep time (% TST) across polysomnograms (mean CI ± standard deviation = 11.1 ± 7.1 % TST). The corresponding range for N3 was 1 to 28 % TST (14.4 ± 6.1 % TST). For stage N1 and N3 sleep, very low or very high values were reported for virtually all polysomnograms by different technologists. Technologists varied widely in their assignment of stage N3 sleep, scoring that stage when the digitally determined time of delta waves ranged from 3 to 17 seconds.

Conclusions: Manual scoring of non-rapid eye movement sleep stages is highly unreliable among highly trained, experienced technologists. Measures of sleep continuity and depth that are reliable and clinically relevant should be a focus of clinical research.

Keywords: digital sleep analysis; interrater variability; sleep depth.

© 2018 American Academy of Sleep Medicine

Figures

Figure 1. Frequency of agreement among the 10 technologists when a given stage is scored by at least one of the technologists.
Bars are standard deviations. Continuous line is the cumulative percentage. Note that only 1.5 ± 1.2 % of epochs scored by any technologist as stage N1 sleep received a unanimous (10 of 10) N1 score and a majority N1 (6 of 10) score was reached in only 18% of such cases. A similar pattern was observed for stage N3 sleep.
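The agreement tally behind a plot of this kind can be sketched as below. This is a hedged illustration only: the function name, the data layout (one list of stage labels per epoch, one label per technologist), and the four-scorer example are hypothetical and are not taken from the study.

```python
from collections import Counter


def agreement_distribution(epoch_scores, stage):
    """Percent of epochs at each agreement level for `stage`.

    epoch_scores: list of per-epoch lists, one stage label per technologist.
    Only epochs in which at least one technologist scored `stage` are counted.
    """
    counts = Counter()
    for labels in epoch_scores:
        n_agree = sum(1 for label in labels if label == stage)
        if n_agree > 0:
            counts[n_agree] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: 100.0 * v / total for k, v in sorted(counts.items())}


# Hypothetical epochs scored by four hypothetical technologists.
epochs = [
    ["N1", "N2", "N2", "W"],   # one N1 score
    ["N1", "N1", "N2", "N2"],  # two N1 scores
    ["N1", "N1", "N1", "N1"],  # unanimous N1
    ["N2", "N2", "N2", "N2"],  # no N1 score, so excluded
    ["N1", "W", "W", "W"],     # one N1 score
]
n1_distribution = agreement_distribution(epochs, "N1")
print(n1_distribution)
```

The key of the largest agreement level corresponds to a unanimous score; summing the percentages from the majority level upward gives the share of epochs with majority agreement.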
Figure 2. Duration of different sleep stages scored by individual technologists as a function of the average (of 10 scores) duration of the stage in individual PSGs.
Each PSG is represented by 10 points aligned at the average of the 10 durations. Technologists are represented by different symbols. Upper and lower irregular lines join the 5th and 95th percentiles of the individual PSGs. Solid diagonal line is the line of identity. Note that the confidence interval (difference between the upper and lower lines) is quite variable and generally quite wide for stage N1 and stage N3 sleep. Three PSGs are not represented in the N1 panel because their results would nearly double both axes, compressing most of the data in one corner. The averages (confidence intervals) of N1 durations in these three PSGs were 20.7 (13.3, 25.4), 28.0 (15.2, 49.3), and 36.5 (22.9, 45.1). PSG = polysomnogram, TRT = total recording time, TST = total sleep time.
Figure 3. Frequency of scoring stage N3 sleep by 10 technologists in epochs with different total delta wave duration.
Delta wave duration is the sum of durations of all delta waves identified digitally in each 30-second epoch. Arrows represent ideal scoring; no N3 until delta duration exceeds 6 seconds and N3 scored in all epochs with delta duration > 6 seconds. Numbers in the upper section represent the number of epochs examined within each delta duration range.
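The ideal scoring rule indicated by the arrows, and the binned N3 frequency plotted against delta wave duration, can be sketched as follows. The function names, the 3-second bin width, and the example epochs are assumptions made for illustration; only the 6-second threshold (20% of a 30-second epoch) comes from the caption.

```python
def ideal_n3(delta_seconds, threshold=6.0):
    """Ideal rule from the caption: N3 only when delta duration exceeds 6 s."""
    return delta_seconds > threshold


def n3_frequency_by_bin(epochs, bin_width=3.0):
    """Percent of epochs scored N3 within each delta-duration bin.

    epochs: list of (delta_seconds, scored_n3) pairs for 30-second epochs.
    Returns {bin_start: (percent_scored_n3, n_epochs)}.
    bin_width is a hypothetical choice, not taken from the paper.
    """
    totals = {}
    for delta_seconds, scored_n3 in epochs:
        bin_start = int(delta_seconds // bin_width) * bin_width
        n3_count, n = totals.get(bin_start, (0, 0))
        totals[bin_start] = (n3_count + (1 if scored_n3 else 0), n + 1)
    return {
        start: (100.0 * n3_count / n, n)
        for start, (n3_count, n) in sorted(totals.items())
    }


# Hypothetical (delta duration in seconds, technologist scored N3?) pairs.
scored = [
    (1.0, False), (2.5, False),
    (4.0, True), (5.0, False),
    (7.0, True), (8.0, False),
    (10.0, True),
]
frequency = n3_frequency_by_bin(scored)
disagreements = [d for d, s in scored if s != ideal_n3(d)]
print(frequency)
print(disagreements)
```

Under ideal scoring every bin below the threshold would show 0% and every bin above it 100%; epochs listed in `disagreements` are those where the hypothetical technologist departed from that rule.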

Source: PubMed
