Methods for analysis of complex survey data: an application using the Tanzanian 2015 Demographic and Health Survey and Service Provision Assessment
Authors: Sheffel A, Wilson E, Munos M, and Zeger S
Source: Journal of Global Health, 9(2): 020902; DOI: 10.7189/jogh.09.020902
Topic(s): Data models
Data use
Country: Africa
Published: DEC 2019
Abstract: Background: Low-income and middle-income countries (LMICs) seek to better utilize household and health facility survey data for monitoring and evaluation, as well as for health program planning. However, analysis of these complex survey data is complicated. In Tanzania, the National Evaluation Platform project sought to analyze Demographic and Health Survey (DHS) data and Service Provision Assessment (SPA) data as part of an evaluation of the national One Plan for Maternal and Child Health. To support this evaluation, we used these survey data to answer two key methodological questions: 1) what are the benefits and costs of using sampling weights in rate estimation; and 2) what is the best method for calculating standard errors in these two surveys?
Methods: We conducted a simulation study for each methodological question. The first simulation study assessed the benefits and costs of using sampling weights in rate estimation. This simulation used weighted and unweighted estimates and examined bias, variance, and the mean squared error (MSE). The second simulation study assessed the best method for calculating standard errors, comparing cluster bootstrapped variance estimation, design-based asymptotic variance with one level (svy1), and design-based asymptotic variance with three levels (svy3). We compared coverage probability and confidence interval length.
Results: Our results showed that although weighted estimates were less biased, unweighted estimates were less variable. The weighted estimates had a lower MSE, indicating that the reduction in bias from weighting outweighed the accompanying increase in variance for most indicators assessed. The best performer for variance estimation was the cluster bootstrap method, followed by the svy3 method. The svy1 method was the worst performer for most indicators assessed.
Conclusions: As complex survey data become more widely used for policymaking in LMICs, there is a need for guidance on the best methods for analyzing these data. The standard of practice has been a design-based analysis using survey weights and the single-level svy method for calculating standard errors. This study puts forth an alternative approach to analysis. In addition, this study offers practical guidance on determining the best method for analysis of complex survey data.
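
Illustrative sketch (not from the paper): the two simulation questions in the abstract can be illustrated with a small, self-contained Python example. It is only a toy under stated assumptions and does not reproduce the authors' code, data, or survey design. The two-stratum population, its rates (0.30 and 0.60), the sample sizes, and the function names `make_population`, `draw_estimates`, `simulate`, and `cluster_bootstrap_se` are all hypothetical. The first part compares weighted and unweighted rate estimates on bias, variance, and MSE when one stratum is oversampled; the second shows a simplified, unweighted cluster bootstrap that resamples whole clusters with replacement to obtain a standard error.

```python
import random
import statistics


def make_population(seed=0):
    """Hypothetical population of 0/1 outcomes in two strata of unequal size."""
    rng = random.Random(seed)
    stratum_a = [1 if rng.random() < 0.30 else 0 for _ in range(8000)]
    stratum_b = [1 if rng.random() < 0.60 else 0 for _ in range(2000)]
    return stratum_a, stratum_b


def draw_estimates(stratum_a, stratum_b, n_a, n_b, rng):
    """Draw one stratified sample; return (weighted, unweighted) rate estimates."""
    sample_a = rng.sample(stratum_a, n_a)
    sample_b = rng.sample(stratum_b, n_b)
    # Sampling weight = inverse of the selection probability within each stratum.
    w_a = len(stratum_a) / n_a
    w_b = len(stratum_b) / n_b
    weighted = (w_a * sum(sample_a) + w_b * sum(sample_b)) / (w_a * n_a + w_b * n_b)
    unweighted = (sum(sample_a) + sum(sample_b)) / (n_a + n_b)
    return weighted, unweighted


def simulate(n_reps=2000, n_a=100, n_b=100, seed=1):
    """Repeat the sampling and summarize bias, variance, and MSE per estimator."""
    stratum_a, stratum_b = make_population()
    truth = (sum(stratum_a) + sum(stratum_b)) / (len(stratum_a) + len(stratum_b))
    rng = random.Random(seed)
    draws = [draw_estimates(stratum_a, stratum_b, n_a, n_b, rng) for _ in range(n_reps)]
    results = {}
    for name, idx in (("weighted", 0), ("unweighted", 1)):
        estimates = [d[idx] for d in draws]
        bias = statistics.fmean(estimates) - truth
        variance = statistics.pvariance(estimates)
        # MSE decomposes as squared bias plus variance.
        results[name] = {"bias": bias, "variance": variance, "mse": bias ** 2 + variance}
    return truth, results


def cluster_bootstrap_se(clusters, n_boot=500, seed=2):
    """Simplified cluster bootstrap: resample whole clusters with replacement.

    `clusters` is a list of lists of 0/1 outcomes, one inner list per cluster.
    Sampling weights and stratification are ignored here for brevity.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        resampled = rng.choices(clusters, k=len(clusters))
        total = sum(sum(c) for c in resampled)
        count = sum(len(c) for c in resampled)
        rates.append(total / count)
    return statistics.stdev(rates)


if __name__ == "__main__":
    truth, results = simulate()
    print("true rate:", round(truth, 4))
    for name, summary in results.items():
        print(name, {k: round(v, 5) for k, v in summary.items()})
    toy_clusters = [[1, 1, 0], [0, 0, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0]]
    print("cluster bootstrap SE:", round(cluster_bootstrap_se(toy_clusters), 4))
```

In this toy setup the smaller stratum is deliberately oversampled, so the unweighted estimator is biased toward that stratum's rate while the weighted estimator pays for its lower bias with somewhat higher variance. Whether the same trade-off holds for a given real indicator depends on the survey design and the data.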