Multi-reader multi-case analysis of variance software for diagnostic performance comparison of imaging modalities

Brian J Smith, Stephen L Hillis

Abstract

A common study design for comparing the diagnostic performance of imaging modalities is to obtain modality-specific ratings from multiple readers of multiple cases whose true statuses are known. Typically, the modalities, readers, and/or cases overlap, so the resulting ratings are correlated and special analytical methods are needed to perform statistical comparisons. We describe our new R software package MRMCaov, which is designed for multi-reader multi-case comparisons of two or more imaging modalities. The software allows for the comparison of reader performance metrics, such as area under the receiver operating characteristic curve (ROC AUC), with analysis of variance methods originally proposed by Obuchowski and Rockette (1995) and later unified and improved by Hillis and colleagues (2005, 2007, 2008, 2018). MRMCaov is an open-source package with an integrated command-line interface for performing multi-reader multi-case statistical analysis, plotting, and presenting results. Features of the package include (1) ROC curves estimated parametrically or non-parametrically; (2) reader-specific ROC curves and performance metrics; (3) user-definable performance metrics; (4) modality-specific estimates of mean performance along with confidence intervals and p-values for statistical comparisons; (5) support for factorial, nested, or partially paired study designs; (6) inference for random readers and cases, random readers and fixed cases, or fixed readers and random cases; (7) DeLong, jackknife, or unbiased covariance estimation; and (8) compatibility with Microsoft Windows, Mac OS, and Linux.
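
As a rough illustration of the reader-specific performance metrics mentioned above, the following Python sketch computes the non-parametric (empirical, Mann-Whitney) ROC AUC for each modality-reader pair and averages it over readers within each modality. It is not the MRMCaov interface: the function names (`empirical_auc`, `reader_aucs`, `modality_means`) and the toy ratings are hypothetical, and the sketch omits the covariance estimation and analysis of variance inference that the package performs.

```python
from collections import defaultdict
from statistics import mean


def empirical_auc(truth, rating):
    """Non-parametric ROC AUC: the fraction of (positive, negative) case
    pairs in which the positive case is rated higher; ties count 1/2."""
    pos = [r for t, r in zip(truth, rating) if t == 1]
    neg = [r for t, r in zip(truth, rating) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def reader_aucs(rows):
    """rows: iterable of (modality, reader, truth, rating) tuples.
    Returns a dict mapping (modality, reader) to its empirical AUC."""
    groups = defaultdict(lambda: ([], []))
    for modality, reader, truth, rating in rows:
        truths, ratings = groups[(modality, reader)]
        truths.append(truth)
        ratings.append(rating)
    return {key: empirical_auc(t, r) for key, (t, r) in groups.items()}


def modality_means(aucs):
    """Average the reader-specific AUCs within each modality."""
    by_modality = defaultdict(list)
    for (modality, _reader), auc in aucs.items():
        by_modality[modality].append(auc)
    return {m: mean(values) for m, values in by_modality.items()}


if __name__ == "__main__":
    # Hypothetical toy data: two modalities, two readers, four cases each.
    toy_rows = [
        ("A", "r1", 0, 1), ("A", "r1", 0, 2), ("A", "r1", 1, 3), ("A", "r1", 1, 4),
        ("A", "r2", 0, 1), ("A", "r2", 0, 3), ("A", "r2", 1, 2), ("A", "r2", 1, 4),
        ("B", "r1", 0, 2), ("B", "r1", 0, 2), ("B", "r1", 1, 2), ("B", "r1", 1, 5),
        ("B", "r2", 0, 1), ("B", "r2", 0, 4), ("B", "r2", 1, 3), ("B", "r2", 1, 5),
    ]
    per_reader = reader_aucs(toy_rows)
    print(per_reader)
    print(modality_means(per_reader))
```

Because the same readers rate the same cases under each modality, the reader-specific AUCs are correlated. A simple difference of the modality means therefore does not by itself give valid confidence intervals or p-values, which is the gap the analysis of variance methods in the package address.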

Keywords: ANOVA; ROC analysis; diagnostic radiology; multi-reader multi-case; software.

Figures

Figure 1. R Example: Plots of ROC estimates from mrmc function call.

References

    1. Dorfman DD, Berbaum KS, and Metz CE, “Receiver operating characteristic rating analysis. Generalization to the population of readers and patients with the jackknife method,” Investigative Radiology 27, 723–731 (1992).
    2. Obuchowski NA and Rockette HE, “Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: an ANOVA approach with dependent observations,” Communications in Statistics–Simulation and Computation 24, 285–308 (1995).
    3. Hillis SL, Obuchowski NA, Schartz KM, and Berbaum KS, “A comparison of the Dorfman-Berbaum-Metz and Obuchowski-Rockette methods for receiver operating characteristic (ROC) data,” Statistics in Medicine 24, 1579–1607 (2005).
    4. Hillis SL, “A comparison of denominator degrees of freedom methods for multiple observer ROC analysis,” Statistics in Medicine 26, 596–619 (2007).
    5. Hillis SL, Berbaum KS, and Metz CE, “Recent developments in the Dorfman-Berbaum-Metz procedure for multireader ROC study analysis,” Academic Radiology 15, 647–661 (2008).
    6. Hillis SL, “Relationship between Roe and Metz simulation model for multireader diagnostic data and Obuchowski-Rockette model parameters,” Statistics in Medicine 37, 2067–2093 (2018).
    7. Smith BJ and Hillis SL, MRMCaov: Multi-Reader Multi-Case Analysis of Variance (2020). R package version 0.1.3.
    8. Schartz KM, Hillis SL, Pesce LL, Berbaum KS, and Metz CE, OR-DBM MRMC (2019). version 2.52.
    9. R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria: (2019).
    10. VanDyke CW, White RD, Obuchowski NA, Geisinger MA, Lorig RJ, and Meziane MA, “Cine MRI in the diagnosis of thoracic aortic dissection,” 79th Radiological Society of North America Meetings (1993).
    11. Pepe MS, [The Statistical Evaluation of Medical Tests for Classification and Prediction], Oxford University Press, New York: (2003).
    12. DeLong ER, DeLong DM, and Clarke-Pearson DL, “Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach,” Biometrics 44, 837–845 (1988).
    13. Efron B, [The Jackknife, the bootstrap and other resampling plans], SIAM, Philadelphia: (1982).

Source: PubMed
