Fisher's exact approach for post hoc analysis of a chi-squared test

Guogen Shan, Shawn Gerstenberger

Abstract

This research is motivated by one of our survey studies assessing the potential influence of introducing zebra mussels to the Lake Mead National Recreation Area, Nevada. One research question in this study concerns the association between boating activity type and awareness of zebra mussels. A chi-squared test is often used to test independence between two factors with nominal levels. When the null hypothesis of independence between the two factors is rejected, we are often left wondering where the significance comes from. Cell residuals, including standardized residuals and adjusted residuals, are traditionally used to test for cell significance, a procedure often known as a post hoc test after a statistically significant chi-squared test. In practice, the limiting distributions of these residuals are used for statistical inference. However, the two types of residual may lead to different conclusions based on the calculated p-values, and their p-values could be over- or under-estimated because asymptotic approaches perform unsatisfactorily with regard to type I error control. In this article, we propose new exact p-values that use Fisher's approach, with three commonly used test statistics ordering the sample space. We prove theoretically that the proposed exact p-values based on these three test statistics are identical. Our extensive simulation studies show that the existing asymptotic approach based on the adjusted residual is often more likely to reject the null hypothesis than the exact approach, owing to its inflated family-wise error rates. We recommend the proposed exact p-value for use in practice as a valuable post hoc analysis technique for chi-squared analysis.
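The contrast between residual-based and exact cell tests can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that, with row and column totals fixed, a single cell count follows a hypergeometric distribution, and it orders the sample space by the absolute deviation of the cell count from its expected value. With fixed margins, the standardized and adjusted residuals are both increasing functions of the cell count, so ordering by either residual would select the same tables under this assumption. The function names and the example counts are hypothetical.

```python
from math import comb, sqrt


def cell_residuals(table, i, j):
    """Return (expected count, standardized residual, adjusted residual) for cell (i, j)."""
    n = sum(sum(row) for row in table)      # grand total
    r = sum(table[i])                       # row total of the cell
    c = sum(row[j] for row in table)        # column total of the cell
    expected = r * c / n
    diff = table[i][j] - expected
    standardized = diff / sqrt(expected)
    adjusted = diff / sqrt(expected * (1 - r / n) * (1 - c / n))
    return expected, standardized, adjusted


def exact_cell_p_value(table, i, j):
    """Two-sided exact p-value for cell (i, j), conditioning on the margins.

    Assumption of this sketch: the cell count is hypergeometric given the
    margins, and a count is "at least as extreme" when its absolute deviation
    from the expected count is no smaller than the observed one.
    """
    n = sum(sum(row) for row in table)
    r = sum(table[i])
    c = sum(row[j] for row in table)
    expected = r * c / n
    observed_dev = abs(table[i][j] - expected)
    total = comb(n, c)
    p = 0.0
    # Feasible cell counts given the fixed row and column totals.
    for x in range(max(0, r + c - n), min(r, c) + 1):
        if abs(x - expected) >= observed_dev - 1e-12:
            p += comb(r, x) * comb(n - r, c - x) / total
    return min(p, 1.0)


# Hypothetical 2 x 3 table of counts (not data from the study).
example = [[12, 5, 8], [4, 11, 10]]
for col in range(3):
    _, std_res, adj_res = cell_residuals(example, 0, col)
    p_exact = exact_cell_p_value(example, 0, col)
    print(f"cell (0, {col}): std={std_res:.3f} adj={adj_res:.3f} exact p={p_exact:.4f}")
```

Because several cells are tested after one chi-squared test, the per-cell p-values would still need a multiplicity adjustment (for example, Bonferroni) before being compared with the nominal level.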

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Actual family-wise error rates of the proposed exact approach and the existing asymptotic approach based on the adjusted residual at the nominal level of 0.05.

Source: PubMed
