Interrater and Test-Retest Reliability of the Beery Visual-Motor Integration in Schoolchildren

Erin M Harvey, Tina K Leonard-Green, Kathleen M Mohan, Marjean Taylor Kulp, Amy L Davis, Joseph M Miller, J Daniel Twelker, Irene Campus, Leslie K Dennis

Abstract

Purpose: To assess interrater and test-retest reliability of the 6th Edition Beery-Buktenica Developmental Test of Visual-Motor Integration (VMI) and test-retest reliability of the VMI Visual Perception Supplemental Test (VMIp) in school-age children.

Methods: Subjects were 163 Native American third- to eighth-grade students with no significant refractive error (astigmatism <1.00 D, myopia <0.75 D, hyperopia <2.50 D, anisometropia <1.50 D) or ocular abnormalities. The VMI and VMIp were administered twice, on separate days. All VMI tests were scored by two trained scorers, and a subset of 50 tests was also scored by an experienced scorer. Scorers strictly applied objective scoring criteria. Analyses included interrater and test-retest assessments of bias, 95% limits of agreement, and intraclass correlation analysis.
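The bias and 95% limits of agreement named in the Methods can be illustrated with a minimal sketch. It assumes the limits are the mean of the paired differences ± 1.96 times their standard deviation, as the figure captions state. The function name and the six pairs of scores are hypothetical and are not study data.

```python
from statistics import mean, stdev

def limits_of_agreement(first, second, z=1.96):
    """Return (bias, lower, upper) for paired scores.

    bias is the mean of the paired differences (second - first);
    the limits are bias -/+ z * SD of those differences.
    """
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd

# Hypothetical standardized scores for six children (illustration only).
test_scores = [95, 102, 88, 110, 99, 93]
retest_scores = [98, 97, 92, 108, 104, 90]

bias, lower, upper = limits_of_agreement(test_scores, retest_scores)
print(f"bias = {bias:.2f}, 95% limits of agreement = {lower:.2f} to {upper:.2f}")
```

A bias near zero means that scores did not shift systematically between sessions. The width of the limits shows how far an individual child's retest score may plausibly differ from the first score.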

Results: Trained scorers had no significant scoring bias compared with the experienced scorer. One of the two trained scorers tended to provide higher scores than the other (mean difference in standardized scores = 1.54). Interrater correlations were strong (0.75 to 0.88). VMI and VMIp test-retest comparisons indicated no significant bias (subjects did not tend to score better on retest). Test-retest correlations were moderate (0.54 to 0.58). The 95% limits of agreement for the VMI were -24.14 to 24.67 (scorer 1) and -26.06 to 26.58 (scorer 2), and the 95% limits of agreement for the VMIp were -27.11 to 27.34.
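Because the limits are symmetric about the mean difference, the reported intervals can be converted back into an approximate bias and standard deviation of the test-retest differences. The sketch below assumes the limits were computed as mean ± 1.96 SD, as in the figure captions. The function and variable names are hypothetical, and the derived values are only as precise as the rounded limits allow.

```python
def bias_and_sd_from_limits(lower, upper, z=1.96):
    """Recover the mean difference and SD implied by 95% limits of agreement."""
    bias = (lower + upper) / 2       # midpoint of the interval
    sd = (upper - lower) / (2 * z)   # half-width divided by z
    return bias, sd

# Limits of agreement as reported in the Results.
reported_limits = {
    "VMI, scorer 1": (-24.14, 24.67),
    "VMI, scorer 2": (-26.06, 26.58),
    "VMIp": (-27.11, 27.34),
}

for label, (lower, upper) in reported_limits.items():
    bias, sd = bias_and_sd_from_limits(lower, upper)
    print(f"{label}: mean difference {bias:.2f}, SD of differences {sd:.2f}")
```

Under that assumption, the implied mean differences are all below 0.3 points, which matches the finding of no significant retest bias. The implied standard deviations are roughly 12 to 14 points.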

Conclusions: The 95% limits of agreement for test-retest differences will be useful for determining whether the VMI and VMIp are sensitive enough to detect change with treatment in both clinical and research settings. Further research on test-retest reliability that reports 95% limits of agreement for children across different age ranges is recommended, particularly if the test is to be used to detect changes due to intervention or treatment.

Figures

Figure 1
Bland-Altman (difference vs. mean) plots of VMI interrater agreement for the first test administered to subjects (top) and the second test administered to subjects (bottom). Data are plotted in terms of both raw scores (left) and standardized scores (right). Reference lines are the mean and 95% limits of agreement (mean ± 1.96 SD).
Figure 2
Bland-Altman (difference vs. mean) plots of test-retest agreement for VMI tests scored by scorer 1 (top), VMI tests scored by scorer 2 (middle), and VMIp tests (bottom). Data are plotted in terms of both raw scores (left) and standardized scores (right). Reference lines are the mean and 95% limits of agreement (mean ± 1.96 SD).

Source: PubMed
