Multilevel data with three levels of hierarchy are common in health research, for example when repeated measures (longitudinal data) are collected from individuals who are themselves clustered within larger units. A common problem in such studies is missing data, and multiple imputation (MI) is a popular approach for handling it. While many MI approaches for multilevel data have been developed recently, to our knowledge only two implementations are specialized for imputing missing data in a three-level setting: one within R and the other in the stand-alone software Blimp. Alternatively, more general MI approaches can be extended pragmatically to accommodate three levels. However, practitioners have little guidance on the settings in which each of these approaches is appropriate. This study evaluates the performance of the available MI approaches for handling incomplete three-level data under a range of scenarios, using simulations and an empirical application based on a case study from the Childhood to Adolescence Transition Study (CATS), which consisted of repeated measures on students who are clustered within schools.