Missing Data in Causal Analyses - Implications and Solutions
Causal inference can be attempted using different statistical methods, each of which requires some (untestable) assumptions. Common methods include multivariable regression, propensity scores and g-methods (which assume no unmeasured confounding), and instrumental variables (which assume no association between instrument and outcome, other than via the exposure). Less attention has been given to the impact of selection (e.g. selection into a study, analysis of cases only) or missing data (e.g. dropout from a study, missing baseline data) on different methods for causal inference.
Using directed acyclic graphs (DAGs), Kate Tilling will discuss some of the ways in which bias can occur due to missing data, and methods that might be used to detect or mitigate this bias.
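As an informal illustration of how missing data can introduce bias (not taken from the seminar material), the following sketch simulates an exposure X with a true effect of 1 on an outcome Y, then compares a complete-case regression when the chance of being observed depends only on X with one where it depends on Y. The data-generating model, the sample size and the missingness mechanisms are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2024)
n = 200_000  # assumed sample size, large enough to make sampling noise small

# Hypothetical data-generating model: exposure X affects outcome Y with slope 1.
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)


def expit(z):
    """Inverse logit, used to turn a linear predictor into a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def slope(x_obs, y_obs):
    """Ordinary least squares slope of y on x (intercept included)."""
    return np.polyfit(x_obs, y_obs, deg=1)[0]


# Scenario A: the probability of being observed depends only on the exposure X.
observed_a = rng.random(n) < expit(-3.0 * x)

# Scenario B: the probability of being observed depends on the outcome Y itself.
observed_b = rng.random(n) < expit(-3.0 * y)

slope_full = slope(x, y)                          # no missing data
slope_a = slope(x[observed_a], y[observed_a])     # complete cases, scenario A
slope_b = slope(x[observed_b], y[observed_b])     # complete cases, scenario B

print(f"full data:                 {slope_full:.3f}")
print(f"missingness depends on X:  {slope_a:.3f}")
print(f"missingness depends on Y:  {slope_b:.3f}")
```

In this set-up the complete-case estimate should stay close to 1 when missingness depends only on the exposure, and should be noticeably attenuated when missingness depends on the outcome. On a DAG, the second scenario corresponds to an arrow from Y into the missingness indicator.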
Multiple imputation (MI) is one method used to account for missing data in variables. The missing at random (MAR) assumption is sufficient for valid (unbiased) MI. Paul Madley-Dowd will discuss a new algorithm, designed to be accessible to applied researchers, that explores 1) whether a version of the MAR assumption has been met, on the basis of a missingness-DAG, and 2) whether the MAR assumption may be met in a subsample of the data, restricted to participants with observed values for some incomplete variables (i.e. variables with missing values). The algorithm therefore identifies whether an unbiased estimate of an exposure-outcome association can be obtained in 1) the whole sample or 2) a subsample based on observed values for some incomplete variables.
Finally, Elinor Curnow will discuss common pitfalls in MI, including how to specify the imputation model correctly and how to decide which variables to include. She will demonstrate an R package, midoc (https://cran.r-project.org/web/packages/midoc/index.html), which aims to support researchers in the careful application of MI.
Prerequisites: This seminar is for students with a background in statistical analysis. The presentation uses directed acyclic graphs (DAGs) and associated terminology throughout. Please familiarize yourself with an introduction to DAGs (e.g. https://pmc.ncbi.nlm.nih.gov/articles/PMC8821727/) before the session.
*This seminar is available for RECR Credit (1.5 hours). Attendance will be verified, and a survey must be completed afterwards with well-thought-out responses to receive RECR Credit.