Repeated Measures Correlation

Jonathan Z Bakdash, Laura R Marusich

Abstract

Repeated measures correlation (rmcorr) is a statistical technique for determining the common within-individual association for paired measures assessed on two or more occasions for multiple individuals. Simple regression/correlation is often applied to non-independent observations or aggregated data; this may produce biased, specious results due to violation of independence and/or differing patterns between participants versus within participants. Unlike simple regression/correlation, rmcorr does not violate the assumption of independence of observations. Also, rmcorr tends to have much greater statistical power because neither averaging nor aggregation is necessary for an intra-individual research question. Rmcorr estimates the common regression slope, the association shared among individuals. To make rmcorr accessible, we provide background information on its assumptions and equations, visualization, power, and the tradeoffs of rmcorr compared to multilevel modeling. We introduce the R package (rmcorr) and demonstrate its use for inferential statistics and visualization with two example datasets. The examples illustrate research questions at two levels of analysis: intra-individual and inter-individual. Rmcorr is well-suited for research questions regarding the common linear association in paired repeated measures data. All results are fully reproducible.

Keywords: correlation; individual differences; intra-individual; multilevel modeling; repeated measures; statistical power.
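
The core idea in the abstract, a common within-individual association that can differ from the between-individual one, can be sketched in a few lines. The following is a minimal illustration, not the rmcorr R package itself: it assumes the coefficient can be computed as the Pearson correlation of subject-mean-centered values (equivalent to the ANCOVA sums-of-squares ratio when a single common slope is fitted), and the function name `rmcorr_sketch` and the toy data are hypothetical.

```python
import numpy as np

def rmcorr_sketch(subject, x, y):
    """Common within-subject correlation via subject-mean centering.

    Hypothetical helper for illustration; not the rmcorr package API.
    Returns (r, common slope, error degrees of freedom).
    """
    subject = np.asarray(subject)
    xc = np.asarray(x, dtype=float).copy()
    yc = np.asarray(y, dtype=float).copy()
    ids = np.unique(subject)
    for s in ids:
        mask = subject == s
        # Remove each subject's mean so only within-subject variation remains.
        xc[mask] -= xc[mask].mean()
        yc[mask] -= yc[mask].mean()
    sxy = np.sum(xc * yc)
    r = sxy / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))
    slope = sxy / np.sum(xc ** 2)      # slope shared by all subjects
    dof = len(xc) - len(ids) - 1       # N(k - 1) - 1 for balanced data
    return r, slope, dof

# Toy data (made up): three subjects, three paired measures each.
# Within each subject y falls as x rises, but subjects with higher x
# also have higher y on average.
subject = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [3, 2, 1, 6, 5, 4, 9, 8, 7]

r_rm, slope, dof = rmcorr_sketch(subject, x, y)

# Averaging by participant answers a different (inter-individual) question.
x_means = [np.mean(x[i:i + 3]) for i in (0, 3, 6)]
y_means = [np.mean(y[i:i + 3]) for i in (0, 3, 6)]
r_between = np.corrcoef(x_means, y_means)[0, 1]

print(r_rm, slope, dof, r_between)
```

In this toy case the within-individual association is perfectly negative while the correlation of participant averages is perfectly positive, which is the kind of divergence between levels of analysis the abstract warns about.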

Figures

**Figure 1**
**(A)** Rmcorr plot: rmcorr plot for a set of hypothetical data and **(B)** simple regression plot: the corresponding regression plot for the same data averaged by participant.

**Figure 2**
These notional plots illustrate the range of potential similarities and differences in the intra-individual association assessed by rmcorr and the inter-individual association assessed by ordinary least squares (OLS) regression. Rmcorr-values depend only on the intra-individual association between variables and will be the same across different patterns of inter-individual variability. **(A)** r_rm = −1: depicts notional data with a perfect negative intra-individual association between variables, **(B)** r_rm = 0: depicts data with no intra-individual association, and **(C)** r_rm = 1: depicts data with a perfect positive intra-individual association. In each column, the relationship *between* subjects (inter-individual variability) is different, which does not change the rmcorr-values within a column. However, this does change the association that would be predicted by OLS regression (black lines) if the data were treated as IID or averaged by participant.

**Figure 3**
Rmcorr-values (and corresponding p-values) do not change with linear transformations of the data, illustrated here with three examples: **(A)** original, **(B)** x/2 + 1, and **(C)** y − 1.
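
The invariance described in this caption can be checked numerically. The sketch below is an illustration under stated assumptions, not the package's implementation: it assumes the rmcorr coefficient equals the Pearson correlation of subject-mean-centered values, and the helper `within_corr` and the simulated data are hypothetical.

```python
import numpy as np

def within_corr(subject, x, y):
    """Correlation of subject-mean-centered x and y (hypothetical helper)."""
    subject = np.asarray(subject)
    xc = np.asarray(x, dtype=float).copy()
    yc = np.asarray(y, dtype=float).copy()
    for s in np.unique(subject):
        m = subject == s
        xc[m] -= xc[m].mean()
        yc[m] -= yc[m].mean()
    return np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))

# Simulated data (made up): 5 subjects with 4 paired measures each.
rng = np.random.default_rng(0)
subject = np.repeat(np.arange(5), 4)
x = rng.normal(size=20) + subject
y = 0.5 * x + rng.normal(size=20) + 2 * subject

r_original = within_corr(subject, x, y)           # panel (A): original data
r_x_scaled = within_corr(subject, x / 2 + 1, y)   # panel (B): x/2 + 1
r_y_shifted = within_corr(subject, x, y - 1)      # panel (C): y - 1

print(r_original, r_x_scaled, r_y_shifted)
```

Shifting a variable changes only the subject means, which centering removes, and a positive rescaling cancels between numerator and denominator, so all three values agree. Because the p-value depends only on the coefficient and the degrees of freedom, it is unchanged as well. Multiplying by a negative constant would flip the sign of the coefficient.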

**Figure 4**
Power curves for **(A)** small (r_rm and r = 0.10), **(B)** medium (r_rm and r = 0.30), and **(C)** large (r_rm and r = 0.50) effect sizes. X-axis is sample size. Note the sample size range differs among the panels. Y-axis is power. k denotes the number of repeated paired measures. Eighty percent power is indicated by the dotted black line. For rmcorr, the power of k = 2 is asymptotically equivalent to k = 1. A comparison to the power for a Pearson correlation with one data point per participant (k = 1) is also shown.

**Figure 5**
**Comparison of rmcorr and simple regression/correlation results for age and brain structure volume data**. Each dot represents one of two separate observations of age and CBH for a participant. **(A)** Separate simple regressions/correlations by time: each observation is treated as independent, represented by shading all the data points black. The red line is the fit of the simple regression/correlation. **(B)** Rmcorr: observations from the same participant are given the same color, with corresponding lines to show the rmcorr fit for each participant. **(C)** Simple regression/correlation: averaged by participant. Note that the effect size is greater (stronger negative relationship) using rmcorr **(B)** than with either use of simple regression models **(A)** and **(C)**. This figure was created using data from Raz et al. (2005).

**Figure 6**
The x-axis is reaction time (seconds) and the y-axis is accuracy in visual search. **(A)** Rmcorr: each dot represents the average reaction time and accuracy for a block, color identifies participant, and colored lines show rmcorr fits for each participant. **(B)** Simple regression/correlation (averaged data): each dot represents a block, (improperly) treated as an independent observation. The red line is the fit to the simple regression/correlation. **(C)** Simple regression/correlation (aggregated data): improperly treating each dot as independent. This figure was created using data from Gilden et al. (2010).

References

1. Aarts E., Verhage M., Veenvliet J. V., Dolan C. V., van der Sluis S. (2014). A solution to dependency: using multilevel analysis to accommodate nested data. Nat. Neurosci. 17, 491–496. 10.1038/nn.3648
1. Babyak M. A. (2004). What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models. Psychosom. Med. 66, 411–421.
1. Bates D., Kliegl R., Vasishth S., Baayen H. (2015). Parsimonious Mixed Models. ArXiv150604967 Stat. Available online at: (Accessed February 7, 2016).
1. Bland J. M., Altman D. G. (1995a). Calculating correlation coefficients with repeated observations: part 1, correlation within subjects. BMJ 310:446. 10.1136/bmj.310.6977.446
1. Bland J. M., Altman D. G. (1995b). Calculating correlation coefficients with repeated observations: part 2, correlation between subjects. BMJ 310:633. 10.1136/bmj.310.6980.633
1. Bolker B. M. (2015). Linear and generalized linear mixed models, in Ecological Statistics: Contemporary Theory and Application, eds Fox G. A., Negrete-Yankelevich S., Sosa V. J. (Oxford: Oxford University Press), 309–334. 10.1093/acprof:oso/9780199672547.003.0014
1. Button K. S., Ioannidis J. P. A., Mokrysz C., Nosek B. A., Flint J., Robinson E. S. J., et al. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nat. Rev. Neurosci. 14, 365–376. 10.1038/nrn3475
1. Canty A., Ripley B. (2015). boot: Bootstrap Functions (Originally by Angelo Canty for S). Available online at: (Accessed October 28, 2015).
1. Cohen J., Cohen P., West S. G., Aiken L. S. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, 3rd Edn. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
1. Cumming G. (2014). The new statistics: why and how. Psychol. Sci. 25, 7–29. 10.1177/0956797613504966
1. DiCiccio T. J., Efron B. (1996). Bootstrap confidence intervals. Stat. Sci. 189–212.
1. Efron B., Tibshirani R. J. (1994). An Introduction to the Bootstrap. New York, NY: CRC Press.
1. Estes W. K. (1956). The problem of inference from curves based on group data. Psychol. Bull. 53, 134. 10.1037/h0045156
1. Faul F., Erdfelder E., Buchner A., Lang A.-G. (2009). Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses. Behav. Res. Methods 41, 1149–1160. 10.3758/BRM.41.4.1149
1. Gelman A. (2005). Analysis of variance? Why it is more important than ever. Ann. Stat. 33, 1–53. 10.1214/009053604000001048
1. Gelman A., Hill J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. New York, NY: Cambridge University Press.
1. Gilden D. L., Thornton T. L., Marusich L. R. (2010). The serial process in visual search. J. Exp. Psychol. Hum. Percept. Perform. 36, 533. 10.1037/a0016464
1. Gueorguieva R., Krystal J. H. (2004). Move over ANOVA: progress in analyzing repeated-measures data and its reflection in papers published in the Archives of General Psychiatry. Arch. Gen. Psychiatry 61, 310–317. 10.1001/archpsyc.61.3.310
1. Howell D. (1997). Statistical Methods for Psychology, 4th Edn. Belmont, CA: Wadsworth Publishing Company.
1. John O. P., Benet-Martinez V. (2014). Measurement: reliability, construction validation, and scale construction, in Handbook of Research Methods in Social and Personality Psychology, eds Reis H. T., Judd C. M. (New York, NY: Cambridge University Press), 473–503.
1. Johnston J., DiNardo J. E. (1997). Econometric Methods, 3rd Edn. New York, NY: McGraw-Hill Companies, Inc.
1. Kenny D. A., Judd C. M. (1986). Consequences of violating the independence assumption in analysis of variance. Psychol. Bull. 99:422. 10.1037/0033-2909.99.3.422
1. Kievit R. A., Frankenhuis W. E., Waldorp L. J., Borsboom D. (2013). Simpson's paradox in psychological science: a practical guide. Front. Psychol. 4:513. 10.3389/fpsyg.2013.00513
1. Kreft I., de Leeuw J. (1998). Introducing Multilevel Modeling. Thousand Oaks, CA: SAGE Publications.
1. Matuschek H., Kliegl R., Vasishth S., Baayen H., Bates D. (2015). Balancing Type I Error and Power in Linear Mixed Models. ArXiv Prepr. ArXiv151101864.
1. Miller G. A., Chapman J. P. (2001). Misunderstanding analysis of covariance. J. Abnorm. Psychol. 110:40. 10.1037/0021-843X.110.1.40
1. Molenaar P. C. (2004). A manifesto on psychology as idiographic science: bringing the person back into scientific psychology, this time forever. Measurement 2, 201–218. 10.1207/s15366359mea0204_1
1. Molenaar P. C., Campbell C. G. (2009). The new person-specific paradigm in psychology. Curr. Dir. Psychol. Sci. 18, 112–117. 10.1111/j.1467-8721.2009.01619.x
1. Myung I. J., Kim C., Pitt M. A. (2000). Toward an explanation of the power law artifact: insights from response surface analysis. Mem. Cognit. 28, 832–840. 10.3758/BF03198418
1. Quené H., van den Bergh H. (2004). On multilevel modeling of data from repeated measures designs: a tutorial. Speech Commun. 43, 103–121. 10.1016/j.specom.2004.02.004
1. Raz N., Lindenberger U., Rodrigue K. M., Kennedy K. M., Head D., Williamson A., et al. (2005). Regional brain changes in aging healthy adults: general trends, individual differences and modifiers. Cereb. Cortex 15, 1676–1689. 10.1093/cercor/bhi044
1. R Core Team (2017). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; Available online at:
1. Robinson W. (2009). Ecological correlations and the behavior of individuals. Int. J. Epidemiol. 38, 337–341. 10.1093/ije/dyn357
1. Rogosa D. (1980). Comparing nonparallel regression lines. Psychol. Bull. 88, 307–321. 10.1037/0033-2909.88.2.307
1. Salthouse T. A. (2011). Cognitive correlates of cross-sectional differences and longitudinal changes in trail making performance. J. Clin. Exp. Neuropsychol. 33, 242–248. 10.1080/13803395.2010.509922
1. Schönbrodt F. D., Perugini M. (2013). At what sample size do correlations stabilize? J. Res. Personal. 47, 609–612. 10.1016/j.jrp.2013.05.009
1. Singer J. D., Willett J. B. (2003). Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford: Oxford University Press.
1. Tabachnick B. G., Fidell L. S. (2007). Using Multivariate Statistics, 4th Edn. New York, NY: Pearson Education.
1. Tu Y.-K., Gunnell D., Gilthorpe M. S. (2008). Simpson's Paradox, Lord's Paradox, and Suppression Effects are the same phenomenon: the reversal paradox. Emerg. Themes Epidemiol. 5:2. 10.1186/1742-7622-5-2
1. Tukey J. W. (1977). Exploratory Data Analysis. New York, NY: Addison-Wesley Publishing Company.
1. Underwood B. J. (1975). Individual differences as a crucible in theory construction. Am. Psychol. 30, 128. 10.1037/h0076759
1. Vogel E. K., Awh E. (2008). How to exploit diversity for scientific gain: using individual differences to constrain cognitive theory. Curr. Dir. Psychol. Sci. 17, 171–176. 10.1111/j.1467-8721.2008.00569.x
1. Wickelgren W. A. (1977). Speed-accuracy tradeoff and information processing dynamics. Acta Psychol. 41, 67–85. 10.1016/0001-6918(77)90012-9
1. Wilkinson L. (1999). Statistical methods in psychology journals: guidelines and explanations. Am. Psychol. 54, 594–604. 10.1037/0003-066X.54.8.594

Source: PubMed
