Small is beautiful: In defense of the small-N design

Philip L Smith, Daniel R Little

Abstract

The dominant paradigm for inference in psychology is null-hypothesis significance testing. Recently, the foundations of this paradigm have been shaken by several notable replication failures. One recommendation to remedy the replication crisis is to collect larger samples of participants. We argue that this recommendation misses a critical point, which is that increasing sample size will not remedy psychology's lack of strong measurement, lack of strong theories and models, and lack of effective experimental control over error variance. In contrast, there is a long history of research in psychology employing small-N designs that treats the individual participant as the replication unit, which addresses each of these failings, and which produces results that are robust and readily replicated. We illustrate the properties of small-N and large-N designs using a simulated paradigm investigating the stage structure of response times. Our simulations highlight the high power and inferential validity of the small-N design, in contrast to the lower power and inferential indeterminacy of the large-N design. We argue that, if psychology is to be a mature quantitative science, then its primary theoretical aim should be to investigate systematic, functional relationships as they are manifested at the individual participant level and that, wherever possible, it should use methods that are optimized to identify relationships of this kind.

Keywords: Inference; Mathematical psychology; Methodology; Replication.

Figures

Fig. 1
Simulated results for N = 4 participants. The dotted line shows the average power estimate, and the shaded region shows the bootstrapped 95% confidence interval for each analysis. The x-axis is the magnitude of the interaction parameter in the generating model (Equation A4); the y-axis is the proportion of times a significant interaction was identified in the simulations.
Fig. 2
Individual- and group-level analysis as a function of increasing the sample size (from top to bottom). The x-axis is the magnitude of the interaction parameter and the y-axis is the proportion of significant interactions found in the simulation
Fig. 3
Sampling distributions for each level of δ. The vertical dotted line indicates the average value for the group. The panels show heterogeneous nonnormal populations in which some proportion of the participants exhibit an interaction and some proportion do not. The proportion progressively increases from top to bottom
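
The power curves described in the captions above can be illustrated with a minimal sketch. This is not the paper's generating model (Equation A4 is not reproduced here); it assumes a hypothetical 2 × 2 factorial design in which each simulated participant contributes normally distributed response times, with the interaction parameter `delta` added to one cell. The base mean, main effects, trial count, standard deviation, and the z-test on the interaction contrast are all illustrative assumptions. Power is estimated as the proportion of simulated participants showing a significant interaction, with a percentile bootstrap for the 95% confidence interval.

```python
import math
import random
import statistics


def cell_stats(rng, mean_ms, sd_ms, n_trials):
    """Simulate one design cell; return (sample mean, sample variance)."""
    rts = [rng.gauss(mean_ms, sd_ms) for _ in range(n_trials)]
    m = statistics.fmean(rts)
    var = sum((x - m) ** 2 for x in rts) / (n_trials - 1)
    return m, var


def interaction_significant(rng, delta, n_trials=200, sd_ms=100.0, z_crit=1.96):
    """Test the 2 x 2 interaction contrast for one simulated participant.

    Assumed (hypothetical) model: base RT of 500 ms, two additive main
    effects of 50 ms, and an interaction of `delta` ms in the (1, 1) cell.
    """
    base, a, b = 500.0, 50.0, 50.0
    m00, v00 = cell_stats(rng, base, sd_ms, n_trials)
    m10, v10 = cell_stats(rng, base + a, sd_ms, n_trials)
    m01, v01 = cell_stats(rng, base + b, sd_ms, n_trials)
    m11, v11 = cell_stats(rng, base + a + b + delta, sd_ms, n_trials)
    contrast = m11 - m10 - m01 + m00
    se = math.sqrt((v00 + v10 + v01 + v11) / n_trials)
    return abs(contrast / se) > z_crit  # two-sided z-test, alpha = .05


def estimate_power(delta, n_sims=400, seed=1):
    """Proportion of simulated participants with a significant interaction."""
    rng = random.Random(seed)
    hits = [interaction_significant(rng, delta) for _ in range(n_sims)]
    return sum(hits) / n_sims, hits


def bootstrap_ci(hits, n_boot=1000, seed=2):
    """Percentile bootstrap 95% interval for the power estimate."""
    rng = random.Random(seed)
    n = len(hits)
    props = sorted(sum(rng.choices(hits, k=n)) / n for _ in range(n_boot))
    return props[int(0.025 * n_boot)], props[int(0.975 * n_boot) - 1]


if __name__ == "__main__":
    for delta in (0.0, 20.0, 40.0, 60.0):
        power, hits = estimate_power(delta)
        lo, hi = bootstrap_ci(hits)
        print(f"delta={delta:5.1f}  power={power:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

Under these assumptions, the many trials per participant keep the standard error of the interaction contrast small, so a participant-level test can reach high power without a large sample of participants. The sketch does not model the heterogeneous populations of Fig. 3.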

Source: PubMed
