Exact correction factor for estimating the OR in the presence of sparse data with a zero cell in 2 × 2 tables.
RMSE
correction factor
coverage probability
odds ratio
sparsity
Journal
The international journal of biostatistics
ISSN: 1557-4679
Abbreviated title: Int J Biostat
Country: Germany
NLM ID: 101313850
Publication information
Publication date: 10 May 2023
History:
received: 05 Jan 2022
accepted: 27 Mar 2023
medline: 9 May 2023
pubmed: 9 May 2023
entrez: 9 May 2023
Status: aheadofprint
Abstract
In case-control studies, odds ratios (ORs) are calculated from 2 × 2 tables, and in some instances one of the cells has a small or zero count. Corrections for calculating ORs in the presence of empty cells are available in the literature, including the Yates continuity correction and the Agresti and Coull correction. However, the available methods provide different corrections, and the situations in which each should be applied are not clear. The current research therefore proposes an iterative algorithm for estimating an exact (optimum) correction factor for a given sample size. The algorithm was evaluated by simulating data with varying proportions and sample sizes. The estimated correction factor was assessed using the bias, the standard error of the odds ratio, the root mean square error (RMSE) and the coverage probability. We also present a linear function for identifying the exact correction factor from the sample size and proportion.
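The abstract does not spell out the iterative algorithm, so the following is only a minimal sketch of the kind of simulation it describes, not the authors' method. It assumes that a constant k is added to all four cells whenever any cell is zero, that cases and controls each have n subjects, that p1 and p0 are the exposure probabilities among cases and controls, that a Wald interval on the log odds ratio is used for coverage, and that the "optimum" k is the candidate with the smallest RMSE. All function names and parameters are hypothetical.

```python
import math
import random


def corrected_log_or(a, b, c, d, k):
    """Log OR and its Wald SE; k (> 0) is added to every cell only if a cell is zero."""
    if min(a, b, c, d) == 0:
        a, b, c, d = a + k, b + k, c + k, d + k
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def evaluate_k(k, n, p1, p0, reps=2000, seed=0):
    """Simulate reps tables and return bias, RMSE and 95% coverage on the log-OR scale."""
    rng = random.Random(seed)  # same seed for every k, so candidates see the same tables
    true_log_or = math.log(p1 / (1 - p1)) - math.log(p0 / (1 - p0))
    errors, covered = [], 0
    for _ in range(reps):
        a = sum(rng.random() < p1 for _ in range(n))  # exposed cases
        c = sum(rng.random() < p0 for _ in range(n))  # exposed controls
        log_or, se = corrected_log_or(a, n - a, c, n - c, k)
        errors.append(log_or - true_log_or)
        covered += abs(log_or - true_log_or) <= 1.96 * se
    bias = sum(errors) / reps
    rmse = math.sqrt(sum(e * e for e in errors) / reps)
    return {"bias": bias, "rmse": rmse, "coverage": covered / reps}


def best_k(candidates, n, p1, p0, **kwargs):
    """Assumed selection rule: the candidate k with the smallest simulated RMSE."""
    return min(candidates, key=lambda k: evaluate_k(k, n, p1, p0, **kwargs)["rmse"])


if __name__ == "__main__":
    # Illustrative settings only; they are not taken from the paper.
    print(best_k([0.1, 0.25, 0.5, 1.0], n=20, p1=0.05, p0=0.3))
```

Reusing one seed across candidate values of k means every candidate is scored on identical simulated tables, so differences in RMSE reflect the correction factor rather than sampling noise. A different selection criterion, for example coverage closest to the nominal 95%, could be substituted in `best_k`.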
Identifiers
pubmed: 37159838
pii: ijb-2022-0040
doi: 10.1515/ijb-2022-0040
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Copyright information
© 2023 Walter de Gruyter GmbH, Berlin/Boston.
References
George, J, Thomas, K, Jeyaseelan, L, Peter, JV, Cherian, AM. Hyponatraemia and hiccups. Natl Med J India 1996;9:107–9.
Sangeetha, U, Subbiah, M, Srinivasan, MR. Estimation of confidence intervals for Multinomial proportions of sparse contingency tables using Bayesian methods. Int J Sci Eng Res Pub 2013;3:7.
Agresti, A. Introduction to categorical data analysis, 2nd ed. Hoboken: John Wiley & Sons, Inc; 2007:394 p.
Sweeting, MJ, Sutton, AJ, Lambert, PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Stat Med 2004;23:1351–75. https://doi.org/10.1002/sim.1761.
Yates, F. Contingency tables involving small numbers and the χ² test. Supplement to the J Roy Stat Soc 1934;1:217. https://doi.org/10.2307/2983604.
Agresti, A, Coull, BA. Approximate is better than “exact” for interval estimation of binomial proportions. Am Stat 1998;52:119–26.
Haviland, MG. Yates’s correction for continuity and the analysis of 2 × 2 contingency tables. Stat Med 1990;9:363–7. https://doi.org/10.1002/sim.4780090403.
Subbiah, M, Srinivasan, MR. Classification of 2 × 2 sparse data sets with zero cells. Stat Probabil Lett 2008;78:3212–5. https://doi.org/10.1016/j.spl.2008.06.023.
Lyles, RH, Guo, Y, Greenland, S. Reducing bias and mean squared error associated with regression-based odds ratio estimators. J Stat Plan Inference 2012;142:3235–41. https://doi.org/10.1016/j.jspi.2012.05.005.
Agresti, A, Hitchcock, DB. Bayesian inference for categorical data analysis. JISS 2005;14:297–330. https://doi.org/10.1007/s10260-005-0121-y.
Greenland, S. Bayesian perspectives for epidemiological research. II. Regression analysis. Int J Epidemiol 2007;36:195–202. https://doi.org/10.1093/ije/dyl289.
Galindo-Garre, F, Vermunt, JK, Ato-García, M. Bayesian approaches to the problem of sparse tables in log-linear modeling. In: Proceedings of the fifth International conference on logic and methodology; 2011.
Greenland, S, Schwartzbaum, JA, Finkle, WD. Problems due to small samples and sparse data in conditional logistic regression analysis. Am J Epidemiol 2000;151:531–9. https://doi.org/10.1093/oxfordjournals.aje.a010240.
Efron, B. Empirical Bayes methods for combining likelihoods. J Am Stat Assoc 1996;91:538–50. https://doi.org/10.1080/01621459.1996.10476919.
Xie, M, Singh, K, Strawderman, WE. Confidence distributions and a unifying framework for meta-analysis. J Am Stat Assoc 2011;106:320–33. https://doi.org/10.1198/jasa.2011.tm09803.
Walter, SD, Cook, RJ. A Comparison of several point estimators of the odds ratio in a single 2 × 2 contingency table. Biometrics 1991;47:795. https://doi.org/10.2307/2532640.
Walter, SD. The distribution of Levin’s measure of attributable risk. Biometrika 1975;62:371–2. https://doi.org/10.1093/biomet/62.2.371.
Efron, B, Tibshirani, RJ. An introduction to the bootstrap [Internet]. Boston, MA: Springer US; 1993. Available from: http://link.springer.com/10.1007/978-1-4899-4541-9 [Accessed 19 Apr 2021].
Nair, BR, Rajshekhar, V. Factors predicting the need for prolonged (>24 Months) antituberculous treatment in patients with Brain tuberculomas. World Neurosurg 2019;125:e236–47. https://doi.org/10.1016/j.wneu.2019.01.053.
Puhr, R, Heinze, G, Nold, M, Lusa, L, Geroldinger, A. Firth’s logistic regression with rare events: accurate effect estimates and predictions? Stat Med 2017;36:2302–17.