Statistical Decision-Making Accuracies for Some Overlap- and Distance-based Measures for Single-Case Experimental Designs

Michael T Carlin, Mack S Costello

Abstract

Selecting a quantitative measure to guide decision making in single-case experimental designs (SCEDs) is complicated. Many measures exist, and all have been rightly criticized. The two general classes of measure are overlap-based (e.g., percentage of nonoverlapping data) and distance-based (e.g., Cohen's d). We compared several measures from each class on Type I error rate and power across a range of designs using equal numbers of observations (i.e., 3-10) in each phase. Results showed that Tau and the distance-based measures (i.e., RD and g) provided the highest decision accuracies. Other overlap-based measures (e.g., PND, the dual-criterion method) did not perform as well. It is recommended that Tau be used to guide decision making about the presence/absence of a treatment effect, and that RD or g be used to quantify the magnitude of the treatment effect.
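To make the compared measures concrete, the following is a minimal sketch (our own illustrative implementations and hypothetical data, not the authors' simulation code) of one overlap-based measure (PND), nonoverlap Tau, and a distance-based measure (Hedges' g) for a single AB data set:

```python
from statistics import mean, stdev

def pnd(baseline, treatment):
    """Percentage of Nonoverlapping Data: share of treatment-phase
    points above the baseline maximum (assumes an increase is desired)."""
    ceiling = max(baseline)
    return 100.0 * sum(y > ceiling for y in treatment) / len(treatment)

def tau(baseline, treatment):
    """Nonoverlap Tau: (improving pairs - deteriorating pairs) divided
    by the total number of baseline-treatment pairs."""
    pos = sum(b > a for a in baseline for b in treatment)
    neg = sum(b < a for a in baseline for b in treatment)
    return (pos - neg) / (len(baseline) * len(treatment))

def hedges_g(baseline, treatment):
    """Standardized mean difference with Hedges' small-sample correction."""
    n_a, n_b = len(baseline), len(treatment)
    pooled_sd = (((n_a - 1) * stdev(baseline) ** 2 +
                  (n_b - 1) * stdev(treatment) ** 2) / (n_a + n_b - 2)) ** 0.5
    d = (mean(treatment) - mean(baseline)) / pooled_sd
    return d * (1 - 3 / (4 * (n_a + n_b) - 9))  # correction factor J

# Hypothetical AB data set with five observations per phase
A = [2, 3, 2, 4, 3]
B = [5, 6, 4, 7, 6]
print(pnd(A, B))  # 80.0 -- 4 of 5 treatment points exceed max(A) = 4
print(tau(A, B))  # 0.96 -- 24 of 25 pairs improve, none deteriorate
print(hedges_g(A, B))
```

Note how PND discards all baseline information except the single most extreme point, while Tau and g use every observation; this is one reason the simpler overlap measures can lose power with the short phases (3-10 points) examined here.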

Supplementary information: The online version contains supplementary material available at 10.1007/s40614-021-00317-8.

Keywords: Decision making; Ratio of distances; Statistical analysis; Tau.

Conflict of interest statement

Conflicts of Interest: None

© Association for Behavior Analysis International 2021.

Figures

Fig. 1
Cumulative percentages of data sets with particular numbers of non-overlapping data points for each of the AB design structures

References

    1. Allison DB, Gorman BS. Calculating effect sizes for meta-analysis: The case of the single case. Behaviour Research & Therapy. 1993;31:621–631. doi: 10.1016/0005-7967(93)90115-B.
    2. Bar-Hillel M, Wagenaar WA. The perception of randomness. Advances in Applied Mathematics. 1991;12:428–454. doi: 10.1016/0196-8858(91)90029-I.
    3. Branch MN. Statistical inference in behavior analysis: Some things significance testing does and does not do. The Behavior Analyst. 1999;22:87–92. doi: 10.1007/BF03391984.
    4. Branch M. Malignant side effects of null-hypothesis significance testing. Theory & Psychology. 2014;24:256–277. doi: 10.1177/0959354314525282.
    5. Carlin MT, Costello MS. Development of a distance-based effect size metric for single-case research: Ratio of distances. Behavior Therapy. 2018;49:981–994. doi: 10.1016/j.beth.2018.02.005.
    6. Carter M. Reconsidering overlap-based measures for quantitative synthesis of single-subject data: What they tell us and what they don't. Behavior Modification. 2013;37:378–390. doi: 10.1177/0145445513476609.
    7. Cohen J. Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum; 1988.
    8. DeProspero A, Cohen S. Inconsistent visual analysis of intrasubject data. Journal of Applied Behavior Analysis. 1979;12:573–579. doi: 10.1901/jaba.1979.12-573.
    9. Fisher WW, Kelley ME, Lomas JE. Visual aids and structured criteria for improving visual inspection and interpretation of single-case designs. Journal of Applied Behavior Analysis. 2003;36:387–406. doi: 10.1901/jaba.2003.36-387.
    10. Hahn U, Warren PA. Perceptions of randomness: Why three heads is better than four. Psychological Review. 2009;116:454–461. doi: 10.1037/a0015241.
    11. Lakens D. Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Frontiers in Psychology. 2013;4:Article 863. doi: 10.3389/fpsyg.2013.00863.
    12. Lanovaz MJ, Giannakakos AR, Destras O. Machine learning to analyze single-case data: A proof of concept. Perspectives on Behavior Science. 2020;43:21–38. doi: 10.1007/s40614-020-00244-0.
    13. Ma H. An alternative method for quantitative synthesis of single-subject researches: Percentage of data points exceeding the median. Behavior Modification. 2006;30:598–617. doi: 10.1177/0145445504272974.
    14. Manolov R, Solanas A. Comparing n = 1 effect size indices in presence of autocorrelation. Behavior Modification. 2008;32:860–875. doi: 10.1177/0145445508318866.
    15. McKnight SD, McKean JW, Huitema BE. A double bootstrap method to analyze linear models with autoregressive error terms. Psychological Methods. 2000;5:87–101. doi: 10.1037/1082-989X.5.1.87.
    16. Nickerson R. The production and perception of randomness. Psychological Review. 2002;109:330–357. doi: 10.1037/0033-295X.109.2.330.
    17. Ninci J, Vannest KJ, Wilson V, Zhang N. Interrater agreement between visual analysts of single-case data: A meta-analysis. Behavior Modification. 2015;39:510–541. doi: 10.1177/0145445515581327.
    18. Parker RI, Vannest KJ, Davis JL, Sauber SB. Combining non-overlap and trend for single-case research: Tau-U. Behavior Therapy. 2011;42:284–299. doi: 10.1016/j.beth.2010.08.006.
    19. Pustejovsky JE. Procedural sensitivities of effect sizes for single-case designs with directly observed behavioral outcome measures. Psychological Methods. 2019;24:217–235. doi: 10.1037/met0000179.
    20. Scruggs TE, Mastropieri MA. Summarizing single-subject research: Issues and applications. Behavior Modification. 1998;22:221–242. doi: 10.1177/01454455980223001.
    21. Scruggs TE, Mastropieri MA. PND at 25: Past, present, and future trends in summarizing single-subject research. Remedial & Special Education. 2013;34:9–19. doi: 10.1177/0741932512440730.
    22. Scruggs TE, Mastropieri MA, Casto G. The quantitative synthesis of single-case research: Methodology and validation. Remedial & Special Education. 1987;8:24–33. doi: 10.1177/074193258700800206.
    23. Skinner BF. The behavior of organisms: An experimental analysis. Appleton-Century; 1938.
    24. Voss JL, Federmeier KD, Paller KA. The potato chip really does look like Elvis! Neural hallmarks of conceptual processing associated with finding novel shapes subjectively meaningful. Cerebral Cortex. 2012;22:2354–2364. doi: 10.1093/cercor/bhr315.
    25. Wolery M, Busick M, Reichow B, Barton EE. Comparison of overlap methods for quantitatively synthesizing single-subject data. Journal of Special Education. 2010;44:18–28. doi: 10.1177/0022466908328009.
    26. Wolfe K, Seaman MA, Drasgow E. Interrater agreement on the visual analysis of individual tiers and functional relations in multiple baseline designs. Behavior Modification. 2016;40:852–873. doi: 10.1177/0145445516644699.
    27. Wolfe K, Seaman MA, Drasgow E, Sherlock P. An evaluation of the agreement between the conservative dual-criterion method and expert visual analysis. Journal of Applied Behavior Analysis. 2018;51:345–351. doi: 10.1002/jaba.453.

Source: PubMed
