Alfredo Aliaga and Ruilin Ren, ORC Macro, Calverton, Maryland, USA
July 2006
Optimal Sample Sizes for Two-stage Cluster Sampling in Demographic and Health Surveys
This paper examines the optimal sample sizes in two-stage cluster sampling, a sampling procedure used in most Demographic and Health Surveys (DHS), which are interview surveys of household members in a certain age group. Determining the optimal sample size is a critical step in a DHS survey because it requires a trade-off between the available budget and the desired survey precision.
The households in a survey area are stratified according to type of residence (urban-rural) crossed by administrative/geographical regions. In the first stage, a number of primary sampling units (PSUs), or clusters, are selected from a sampling frame independently in each stratum. The sampling frame is usually a complete list of enumeration areas (EAs) created in a recent population census. After the selection of EAs and before the second-stage selection, a household listing and mapping operation is conducted in each of the selected EAs. This operation updates the outdated population information in the sampling frame and provides a list of all of the households residing in each EA, together with a location map. In the second stage, a fixed number of households are selected from the newly constructed household list in each of the selected EAs, and all household members in a certain age group (e.g., all women age 15-49 and all men age 15-59) in the selected households are selected for the survey. This two-stage sampling procedure has several advantages: it provides good coverage, is simple to implement, and allows for control of fieldwork quality.
To achieve both economy and good precision, the sample sizes at both stages of the survey must be determined in such a way that they minimize the sampling error for a given sampling cost. This paper investigates the optimal sample sizes in different situations in DHS surveys, based on experience from actual surveys.
The results show that for an average cluster size of 100-300 households, with moderate intracluster correlation and cost ratio, the optimal second-stage sample size is about 20 women per cluster. The results also show that for most DHS surveys the sample sizes met the optimal standard or were within tolerable limits of relative precision loss.
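The cost-precision trade-off described above can be illustrated with the textbook linear cost model for two-stage sampling. The sketch below is not taken from the paper: the cost model (a fixed cost c1 per cluster plus a cost c2 per interview), the approximate design effect 1 + (m - 1) * rho, and all function and parameter names are assumptions made for illustration, and the numeric inputs are hypothetical. Under these assumptions, the number of interviews per cluster that minimizes the variance for a fixed budget is sqrt((c1 / c2) * (1 - rho) / rho).

```python
import math


def optimal_cluster_take(cost_ratio, rho):
    """Variance-minimizing interviews per cluster under a linear cost model.

    cost_ratio: c1 / c2, cost per cluster divided by cost per interview
                (hypothetical inputs, not values from the paper).
    rho:        intracluster correlation, strictly between 0 and 1.
    """
    if cost_ratio <= 0 or not 0 < rho < 1:
        raise ValueError("need cost_ratio > 0 and 0 < rho < 1")
    return math.sqrt(cost_ratio * (1.0 - rho) / rho)


def design_effect(m, rho):
    """Approximate design effect for m interviews per cluster."""
    return 1.0 + (m - 1.0) * rho


def relative_variance(m, rho, c1, c2, budget=1.0):
    """Variance of a mean, up to a constant, for a fixed total budget.

    The budget buys n = budget / (c1 + c2 * m) clusters, giving n * m
    interviews whose variance is inflated by the design effect.
    """
    n_clusters = budget / (c1 + c2 * m)
    return design_effect(m, rho) / (n_clusters * m)


def precision_loss(m, rho, c1, c2):
    """Ratio of the variance at m to the variance at the optimum (>= 1)."""
    m_opt = optimal_cluster_take(c1 / c2, rho)
    return relative_variance(m, rho, c1, c2) / relative_variance(m_opt, rho, c1, c2)


if __name__ == "__main__":
    # Hypothetical scenario: a cluster costs 10 times as much as an interview.
    c1, c2, rho = 10.0, 1.0, 0.02
    m_opt = optimal_cluster_take(c1 / c2, rho)
    print(f"optimal interviews per cluster: {m_opt:.1f}")
    for m in (10, 20, 30, 40):
        print(f"m = {m}: relative precision loss = {precision_loss(m, rho, c1, c2):.3f}")
```

The loss ratio printed for each m shows how flat the variance curve is near the optimum, which is why a survey can deviate somewhat from the optimal cluster take and still stay within a tolerable precision loss.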