Multiple regressions: the meaning of multiple regression and the non-problem of collinearity
Simple regression (regression analysis with a single explanatory variable) and multiple regression (regression models with multiple explanatory variables) typically correspond to very different biological questions. The former uses a regression line to describe a univariate association. The latter describes the partial, or direct, effects of multiple variables, conditioned on one another. We suspect that the superficial similarity of simple and multiple regression leads to confusion in their interpretation. A clear understanding of these methods is essential, as they underlie a large range of procedures in common use in biology. Beyond simple and multiple regression in their most basic forms, understanding the key principles of these procedures is critical to understanding, and properly applying, many methods, such as mixed models, generalised models, and causal inference using graphs (including path analysis and its extensions). A simple, but careful, look at the distinction between these two analyses is valuable in its own right, and can also be used to clarify widely held misconceptions about collinearity (correlations among explanatory variables). There is no general sense in which collinearity is a problem. We suspect that the perception of collinearity as a hindrance to analysis stems from misconceptions about the interpretation of multiple regression models, and so we discuss these misconceptions in this light. In particular, collinearity causes multiple regression coefficients to be less precisely estimated than the corresponding simple regression coefficients. This should not be interpreted as a problem, as it is perfectly natural that direct effects should be harder to characterise than univariate associations. Purported solutions to the perceived problems of collinearity are detrimental to most biological analyses.
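The contrast drawn in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article; the function names (`ols`, `simulate`), the effect sizes, the sample size, and the correlation values are all assumptions chosen for illustration. It fits a simple and a multiple regression to the same simulated data and reports the slope for the first explanatory variable with its standard error. Under these assumptions the simple regression slope should track the univariate association (roughly `b1 + rho * b2`), while the multiple regression slope should track the direct effect `b1`, with a standard error that grows as the correlation `rho` between the explanatory variables increases.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares; X must already contain an intercept column.

    Returns the coefficient estimates and their standard errors.
    """
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)          # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))   # coefficient standard errors
    return beta, se


def simulate(rho, n=500, b1=1.0, b2=0.5, seed=1):
    """Simulate y = b1*x1 + b2*x2 + noise with corr(x1, x2) = rho.

    All parameter values are hypothetical. Returns (simple, multiple),
    each a (coefficients, standard errors) pair.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = b1 * x[:, 0] + b2 * x[:, 1] + rng.normal(size=n)
    ones = np.ones(n)
    simple = ols(np.column_stack([ones, x[:, 0]]), y)             # y ~ x1
    multiple = ols(np.column_stack([ones, x[:, 0], x[:, 1]]), y)  # y ~ x1 + x2
    return simple, multiple


for rho in (0.0, 0.5, 0.9):
    (b_s, se_s), (b_m, se_m) = simulate(rho)
    print(f"rho={rho:.1f}  simple slope={b_s[1]:.2f} (SE {se_s[1]:.3f})  "
          f"multiple slope={b_m[1]:.2f} (SE {se_m[1]:.3f})")
```

Neither fit is wrong in this sketch: the two slopes answer different questions, and the wider standard error of the multiple regression coefficient at high `rho` reflects how much harder the direct effect is to characterise than the univariate association.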
Morrissey, M B & Ruxton, G D 2018, 'Multiple regressions: the meaning of multiple regression and the non-problem of collinearity', Philosophy, Theory and Practice in Biology, vol. 10, 3. https://doi.org/10.3998/ptpbio.16039257.0010.003
Philosophy, Theory and Practice in Biology
© 2018 Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license, which permits anyone to download, copy, distribute, or display the full text without asking for permission, provided that the creator(s) are given full credit, no derivative works are created, and the work is not used for commercial purposes.
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.