Background: Three-level data structures, arising from repeated measures on individuals who are clustered within larger units, are common in health research studies. Missing data are prominent in such studies and are often handled via multiple imputation (MI). In analyses with interactions or non-linear terms, conventional MI approaches have been shown to produce biased parameter estimates. Substantive-model-compatible (SMC) MI has shown great promise for tackling these issues in the context of single-level data. While SMC MI has been extended to multilevel data, to date only one available approach explicitly handles incomplete three-level data. Imputing incomplete three-level data compatibly with complex analysis models can therefore be a challenge in applications, and it is unclear how this specialised routine compares to extensions of single- and two-level imputation models.
Aims: In this study we aimed to extend and evaluate pragmatic adaptations of single-level and two-level MI methods, using dummy indicators and/or a ‘just another variable’ approach, and to compare their performance with that of the only available implementation of three-level SMC MI.
Methods: We considered two analysis models that are of interest to many researchers in the longitudinal data setting: a multilevel model with an interaction between a time-varying exposure and time, and a multilevel model with a non-linear effect of the time-varying exposure. The various MI methods were evaluated via simulations and illustrated using empirical data from a longitudinal cohort case study estimating the effect of depressive symptoms on the academic performance of students over time, with students clustered by school.
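For concreteness, the two analysis models can be sketched as follows. This is an illustrative formulation only: the notation (school $i$, student $j$, wave $k$, outcome $y_{ijk}$, time-varying exposure $x_{ijk}$, time $t_{ijk}$), the random-intercept-only structure, and the quadratic form of the non-linear effect are assumptions made for this sketch, and the models fitted in the study may include further covariates or random effects.

```latex
% Assumed notation: i = school, j = student, k = wave.
% Model 1: interaction between the time-varying exposure and time.
\begin{align}
y_{ijk} &= \beta_0 + \beta_1 x_{ijk} + \beta_2 t_{ijk}
          + \beta_3 x_{ijk} t_{ijk} + u_i + v_{ij} + \varepsilon_{ijk} \\
% Model 2: non-linear effect of the exposure (a quadratic term is assumed here).
y_{ijk} &= \gamma_0 + \gamma_1 x_{ijk} + \gamma_2 x_{ijk}^2
          + \gamma_3 t_{ijk} + u_i + v_{ij} + \varepsilon_{ijk}
\end{align}
% Assumed random intercepts and residual:
% u_i ~ N(0, \sigma_u^2) for schools, v_{ij} ~ N(0, \sigma_v^2) for students,
% \varepsilon_{ijk} ~ N(0, \sigma_e^2) for repeated measures.
```

Under this sketch, the product $x_{ijk} t_{ijk}$ and the square $x_{ijk}^2$ are the derived terms with which an imputation model for an incomplete $x_{ijk}$ needs to be compatible.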
Results: The three-level SMC MI approach and the adaptations of single- and two-level SMC MI approaches performed well in terms of bias and precision when the target analysis involved an interaction with time, while the three-level SMC MI approach performed better in the presence of non-linear terms.
Conclusions: Researchers may use extensions to standard single- and two-level MI approaches to adequately handle incomplete interactions with time in three-level data, while for non-linear terms, MI approaches that explicitly model three-level variation may be more appropriate.