Classical tests, linear models, and their extensions for the analysis of 2x2 contingency tables

Nagel, Rebecca; Ruxton, Graeme Douglas; Morrissey, Michael Blair

Files in this item

Name: Nagel-2024-Classical-tests-linear-models-MEE-CCBYNC.pdf
Size: 1.238Mb
Format: PDF

Item metadata

dc.contributor.author	Nagel, Rebecca
dc.contributor.author	Ruxton, Graeme Douglas
dc.contributor.author	Morrissey, Michael Blair
dc.date.accessioned	2024-04-02T09:30:05Z
dc.date.available	2024-04-02T09:30:05Z
dc.date.issued	2024-05
dc.identifier	300302523
dc.identifier	1a5887c7-3dbe-4ee6-aeb3-4d95146209dc
dc.identifier	85189557656
dc.identifier.citation	Nagel, R, Ruxton, G D & Morrissey, M B 2024, 'Classical tests, linear models, and their extensions for the analysis of 2x2 contingency tables', Methods in Ecology and Evolution, vol. 15, no. 5, pp. 843-855. https://doi.org/10.1111/2041-210x.14318	en
dc.identifier.issn	2041-210X
dc.identifier.other	ORCID: /0000-0001-8943-6609/work/157140660
dc.identifier.uri	https://hdl.handle.net/10023/29580
dc.description	Funding: Deutsche Forschungsgemeinschaft - 515410943; Royal Society London - University Research Fellowship.	en
dc.description.abstract	1. Ecologists and evolutionary biologists are regularly tasked with the comparison of binary data across groups. There is, however, some discussion in the biostatistics literature about the best methodology for the analysis of data comprising binary explanatory and response variables forming a 2 × 2 contingency table. 2. We assess several methodologies for the analysis of 2 × 2 contingency tables using a simulation scheme of different sample sizes with outcomes evenly or unevenly distributed between groups. Specifically, we assess the commonly recommended logistic (generalised linear model [GLM]) regression analysis, the classical Pearson chi-squared test and four conventional alternatives (Yates' correction, Fisher's exact, exact unconditional and mid-p), as well as the widely discouraged linear model (LM) regression. 3. We found that both LM and GLM analyses provided unbiased estimates of the difference in proportions between groups. LM and GLM analyses also provided accurate standard errors and confidence intervals when the experimental design was balanced. When the experimental design was unbalanced, sample size was small, and one of the two groups had a probability close to 1 or 0, LM analysis could substantially over- or under-represent statistical uncertainty. For null hypothesis significance testing, the performance of the chi-squared test and that of the LM analysis were almost identical. Across all scenarios, both had high power to detect non-null effects and reject false positives. By contrast, the GLM analysis was underpowered when using z-based p-values, in particular when one of the two groups had a probability near 1 or 0. The GLM using the likelihood ratio test (LRT) had better power to detect non-null results. 4. Our simulation results suggest that, wherever a chi-squared test would be recommended, a linear regression is a suitable alternative for the analysis of 2 × 2 contingency table data. When researchers opt for more sophisticated procedures, we provide R functions to calculate the standard error of a difference between two probabilities from a Bernoulli GLM output using the delta method. We also explore approaches to complement GLM analysis of 2 × 2 contingency tables with credible intervals on the probability scale. These additional operations should help researchers to make valid assessments of both statistical and practical significance.
dc.format.extent	13
dc.format.extent	1298897
dc.language.iso	eng
dc.relation.ispartof	Methods in Ecology and Evolution	en
dc.subject	2 x 2 contingency table	en
dc.subject	Chi-squared test	en
dc.subject	Linear models	en
dc.subject	Logistic GLMs	en
dc.subject	Uncertainty estimates	en
dc.subject	QH301 Biology	en
dc.subject	DAS	en
dc.subject.lcc	QH301	en
dc.title	Classical tests, linear models, and their extensions for the analysis of 2x2 contingency tables	en
dc.type	Journal article	en
dc.contributor.sponsor	The Royal Society	en
dc.contributor.institution	University of St Andrews. School of Biology	en
dc.contributor.institution	University of St Andrews. Centre for Biological Diversity	en
dc.contributor.institution	University of St Andrews. Institute of Behavioural and Neural Sciences	en
dc.contributor.institution	University of St Andrews. St Andrews Bioinformatics Unit	en
dc.identifier.doi	https://doi.org/10.1111/2041-210x.14318
dc.description.status	Peer reviewed	en
dc.identifier.grantnumber	UF130398	en
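
Illustrative note on the delta-method calculation mentioned in the abstract. The sketch below is a minimal Python illustration and is not the authors' R code. It assumes a logistic GLM with an intercept b0 (log-odds in group 1) and a single group coefficient b1 (log-odds difference for group 2), and it needs the 2 × 2 covariance matrix of those two coefficients. The function names `delta_se_diff` and `saturated_logit_fit` are hypothetical. The second function uses the well-known closed-form maximum-likelihood fit for a 2 × 2 table with no zero cells, so the example runs without a model-fitting library.

```python
import math


def expit(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def delta_se_diff(b0, b1, vcov):
    """Delta-method estimate and SE of p2 - p1 on the probability scale.

    b0   : intercept (log-odds of success in group 1)
    b1   : group coefficient (log-odds difference, group 2 vs group 1)
    vcov : 2x2 covariance matrix of (b0, b1), as nested lists
    """
    p1 = expit(b0)
    p2 = expit(b0 + b1)
    # Gradient of g(b0, b1) = expit(b0 + b1) - expit(b0)
    g0 = p2 * (1.0 - p2) - p1 * (1.0 - p1)  # d g / d b0
    g1 = p2 * (1.0 - p2)                    # d g / d b1
    # Quadratic form gradient' * vcov * gradient
    var = (g0 * g0 * vcov[0][0]
           + 2.0 * g0 * g1 * vcov[0][1]
           + g1 * g1 * vcov[1][1])
    return p2 - p1, math.sqrt(var)


def saturated_logit_fit(s1, f1, s2, f2):
    """Closed-form logistic fit for a 2x2 table (assumes no zero cells).

    s1, f1 : successes and failures in group 1
    s2, f2 : successes and failures in group 2
    Returns (b0, b1, vcov) for the model logit(p) = b0 + b1 * group.
    """
    b0 = math.log(s1 / f1)
    b1 = math.log(s2 / f2) - b0
    v0 = 1.0 / s1 + 1.0 / f1            # Var(b0)
    v1 = v0 + 1.0 / s2 + 1.0 / f2       # Var(b1)
    return b0, b1, [[v0, -v0], [-v0, v1]]  # Cov(b0, b1) = -Var(b0)


# Made-up counts for illustration only: 12/20 successes vs 5/20 successes.
b0, b1, vcov = saturated_logit_fit(12, 8, 5, 15)
diff, se = delta_se_diff(b0, b1, vcov)
print(diff, se)
```

For this saturated two-group model the delta-method variance reduces algebraically to p1(1 - p1)/n1 + p2(1 - p2)/n2, which gives a quick sanity check on the output. With coefficients and a covariance matrix taken from a fitted GLM instead, only `delta_se_diff` is needed.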

This item appears in the following Collection(s)

University of St Andrews Research
