Substantially inflated type I error rates if propensity score method is not fixed in advance

Neuhäuser, Markus; Kraechter, Julia M.; Thielmann, Matthias; Ruxton, Graeme D.

Date

19/05/2020

Abstract

Propensity scores are often used to adjust for between-group variation in covariates when individuals cannot be randomized to groups. There is great flexibility in how these scores can be appropriately used. This flexibility might encourage p-value hacking, where several alternative uses of propensity scores are explored and the one yielding the lowest p-value is selectively reported. Such unreported multiple testing must inevitably inflate type I error rates; our focus is on exploring how strong this inflation might be. Across three different scenarios, we compared the performance of four different methods. Each method taken individually gave type I error rates near the nominal (5%) value, but taking the minimum p-value of the four tests led to actual error rates between 150% and 200% of the nominal value (that is, roughly 7.5% to 10%). Hence, we strongly recommend pre-specifying the details of the statistical treatment of propensity scores to avoid the risk of seriously inflated type I error rates.
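
The inflation mechanism described in the abstract can be illustrated with a minimal simulation sketch. This is not the authors' simulation and does not involve propensity scores at all: it assumes, purely for illustration, that the competing analyses yield standard normal test statistics under the null hypothesis with a common pairwise correlation `rho`, and it reports how often the smallest of the resulting p-values falls below the nominal level. The function names, the equicorrelation structure, and all parameter values are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def min_p_rejection_rate(n_tests, rho, n_sims=20000, alpha=0.05, seed=1):
    """Estimate P(min p-value < alpha) under the null hypothesis.

    Assumption (not from the paper): the n_tests statistics are
    standard normal with common pairwise correlation rho, built
    from one shared component plus independent noise.
    """
    rng = random.Random(seed)
    shared_weight = math.sqrt(rho)
    own_weight = math.sqrt(1.0 - rho)
    rejections = 0
    for _ in range(n_sims):
        shared = rng.gauss(0.0, 1.0)
        p_values = [
            two_sided_p(shared_weight * shared + own_weight * rng.gauss(0.0, 1.0))
            for _ in range(n_tests)
        ]
        # Selective reporting: only the smallest p-value is kept.
        if min(p_values) < alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("one pre-specified test:", min_p_rejection_rate(1, 0.8))
    print("min of 4, rho = 0.8:   ", min_p_rejection_rate(4, 0.8))
    print("min of 4, independent: ", min_p_rejection_rate(4, 0.0))
```

For four independent tests the rejection probability is 1 - 0.95^4, about 18.5%. Tests computed on the same data are positively correlated, so the inflation is smaller than that but still well above the 5% obtained when a single method is fixed in advance.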

Citation

Neuhäuser, M, Kraechter, J M, Thielmann, M & Ruxton, G D 2020, 'Substantially inflated type I error rates if propensity score method is not fixed in advance', Communications in Statistics: Case Studies, Data Analysis and Applications. https://doi.org/10.1080/23737484.2020.1763219

Publication

Communications in Statistics: Case Studies, Data Analysis and Applications

Status

Peer reviewed

DOI

10.1080/23737484.2020.1763219

ISSN

2373-7484

Type

Journal article

Collections

University of St Andrews Research

URI

https://hdl.handle.net/10023/23220