Self-Weighting Sample Definition Essays



Design, data weighting and design effects in Dutch regional health surveys

Previously published in Dutch as: Uitenbroek DG. Design, wegen en het designeffect in GGD gezondheidsenquêtes. Tijdschrift voor Gezondheidswetenschappen (TSG). 2009(2): 64-8.

Health surveys carried out by regional public health authorities in the Netherlands frequently use a stratified design (Ten Brinke and Verhagen, 2003; GGD Hollands Midden, 2006; Heemskerk and Poort, 2007; Acker, 2005). The population in the health authority's working area is divided into groups (strata), and in each group a predetermined number of individuals is surveyed (Cochran, 1977). One reason for this design is that health authorities often want to compare the local authority areas within their working area, which calls for roughly equal numbers of cases in each local authority area. Stratified sampling designs that aim to provide reliable statistics at both the local and the overall regional level are common internationally. Stratification can also serve other purposes: the Amsterdam health survey (Uitenbroek, 2006) was stratified by age and ethnicity, to enable the study of health service needs among older minority groups compared with the majority ethnic Dutch population, while also providing information on health in the city of Amsterdam in general.

To provide statistics for the full health authority area the data have to be weighted, to account for differences in population size and sampling fraction between the strata. When the data are weighted, statistics and estimates at the health authority level are less reliable than with unweighted data from the same number of cases. This shows up as wider confidence intervals and as differences between groups reaching statistical significance less easily (Kish, 1995). Although weighting reduces the reliability of statistics, not weighting is often not an option, because unweighted statistics from a stratified design can be seriously biased compared with those from a simple random sample (Kish, 1957).

When analyzing weighted data, the decrease in reliability due to the weighting must therefore be taken into account. Basic statistical packages mostly do not do this, but more complex methods are now available in a number of dedicated applications. This paper discusses the cell weighting procedure, with specific attention to the design effect it causes; the design effect is the statistic that measures the change in reliability caused (among other things) by data weighting. A number of simple formulas are introduced and the size of the design effect in Dutch regional health surveys is studied.


For this article a secondary analysis was carried out of health authority reports on local health surveys. The health authorities were asked for additional information when required. Weighting follows the cell weighting procedure described by Kalton and Flores-Cervantes (2003); the formulas by Kish (1992) are used to estimate the design effect for the mean.


The purpose of most weighting is to restore in the sample the distribution observed in the population on a number of (socio-demographic) variables. There are several methods to do this (Kalton & Flores-Cervantes, 2003); this paper uses two, both based on the cell weighting procedure. Both produce the same result in the weighted sample, but the weights they produce are interpreted differently. In the first method the weight Wi for each of the k strata is the reciprocal of the sampling fraction. The sampling fraction is the number of sampled individuals ni in stratum i divided by the number of people Ni in the same stratum in the population. Thus:

Wi = 1 / (sampling fraction) = 1 / (ni / Ni) = Ni / ni    (1)

After weighting according to this method the sum of all individual weights in the sample equals the total population size: Σ(Wi × ni) = N+. The weight Wi can be interpreted as the number of individuals in the population that each respondent in the i-th stratum represents. A practical advantage is that many statistical programs for complex designs use these weights as the basis of their calculations.

The second method divides the proportion of the population in each stratum (Pi) by the proportion of the sample in that stratum (pi), thus:

wi = Pi / pi    (2)

For these weights the sum of all individual weights equals the sample size: Σ(wi × ni) = n+. These weights give the factor by which each group in the sample counts more or less heavily after weighting. They also give an impression of the effect of weighting on the design effect: weights (much) larger than one in particular increase the design effect. For this reason weights above a certain value are often trimmed (Potter, 1999), which introduces some bias.
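The two weighting methods can be illustrated with a short Python sketch. The three strata and their sizes below are hypothetical and chosen only for illustration; they do not come from any of the surveys discussed here.

```python
import math

# Hypothetical strata: population sizes N_i and sample sizes n_i (illustrative only).
population = {"A": 10_000, "B": 30_000, "C": 60_000}
sample = {"A": 500, "B": 500, "C": 500}

N_total = sum(population.values())  # N+
n_total = sum(sample.values())      # n+

# Method 1: reciprocal of the sampling fraction, W_i = N_i / n_i.
W = {s: population[s] / sample[s] for s in population}

# Method 2: population proportion over sample proportion, w_i = P_i / p_i.
w = {s: (population[s] / N_total) / (sample[s] / n_total) for s in population}

# Summed over all respondents, method 1 weights add up to the population size
# and method 2 weights add up to the sample size.
sum_W = sum(W[s] * sample[s] for s in sample)
sum_w = sum(w[s] * sample[s] for s in sample)

# The two sets of weights differ only by the constant factor N+ / n+.
ratios = [W[s] / w[s] for s in population]
```

Because the two sets of weights differ only by a constant factor, they give the same weighted percentages and means.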


Weighting reduces the reliability of the sample: the real reliability is lower than the number of cases collected suggests. The design effect (DEFF) measures this and is defined as the factor by which the variance calculated under the assumption of a simple random sample (SRS) changes:

Variance after weighting (v^) = variance SRS (v) * DEFF    (3)

The design effect is sometimes also defined as the factor by which the observed number of cases must be divided to obtain the effective number of cases:

Effective n^ = observed n / DEFF    (4)

The DEFF in formula 3 is the same as the DEFF in formula 4. As the design effect is almost always larger than one, weighting increases the sampling variance and decreases the effective number of cases.

If the data file includes the weights as a separate variable, with one weight for each respondent, formula 5 can be used to calculate the design effect for a sample mean:

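$$\mathrm{DEFF} = \frac{n \sum_{j=1}^{n} w_j^{2}}{\left( \sum_{j=1}^{n} w_j \right)^{2}} \qquad (5)$$

where n is the number of respondents and wj is the weight of the j-th respondent (Kish, 1992).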
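A minimal Python sketch of this calculation, assuming formula 5 is Kish's approximation DEFF = n × Σ(wj²) / (Σ wj)². The function name `kish_deff` and the example weights are hypothetical and serve only as an illustration.

```python
def kish_deff(weights):
    """Design effect due to weighting for a sample mean (Kish's approximation)."""
    n = len(weights)
    sum_w = sum(weights)
    sum_w_squared = sum(w * w for w in weights)
    return n * sum_w_squared / (sum_w * sum_w)

# Hypothetical respondent-level weights: three strata of 500 respondents each.
weights = [0.3] * 500 + [0.9] * 500 + [1.8] * 500
deff = kish_deff(weights)

# With equal weights there is no design effect due to weighting.
deff_equal = kish_deff([2.0] * 10)
```

The result does not depend on the scale of the weights, so weights that sum to the population size and weights that sum to the sample size give the same design effect.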

If totals for the strata in the sample and in the population are available, the following formula can be used to calculate the design effect for the sample mean. This approach is particularly practical in the design phase of a study, before data collection, for example to take design effects into account when calculating sample sizes.
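As a sketch of a stratum-level calculation: if every respondent in stratum i carries the weight Wi = Ni / ni, grouping the respondent-level sums by stratum gives DEFF = n+ × Σ(Ni² / ni) / N+². This is a derivation under that assumption, not necessarily the exact form used in the original paper. The function name and the stratum sizes below are hypothetical.

```python
def deff_from_strata(pop_sizes, sample_sizes):
    """Design effect for a mean from stratum totals, assuming weights N_i / n_i."""
    n_total = sum(sample_sizes)   # n+
    pop_total = sum(pop_sizes)    # N+
    # Sum over strata of n_i * W_i^2 = N_i^2 / n_i.
    sum_sq = sum(N * N / n for N, n in zip(pop_sizes, sample_sizes))
    return n_total * sum_sq / (pop_total * pop_total)

# Hypothetical design: 500 respondents planned in each of three unequal strata.
planned_deff = deff_from_strata([10_000, 30_000, 60_000], [500, 500, 500])
```

Because only planned stratum sizes are needed, the calculation can be run before any data are collected.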


Besides the design effect there is the design factor (DEFT), defined as the square root of the design effect:

Design factor (DEFT) = √DEFF    (7)

The design factor is the factor by which the standard error of estimates changes due to the sample design. It also gives the multiplication factor by which the confidence interval around an estimate changes due to the sample design.

The design effect can be used to correct the t-test and the F-test for weighted data (Hahs-Vaughn, 2005).

Example and results


Table 1 gives an overview of the health survey carried out by the Amstelland de Meerlanden regional public health authority (Ten Brinke and Verhagen, 2003 & 2004). The design was to sample 750 individuals from each of the smaller local authorities and 1500 from each of the larger ones, resulting in a total sample of 5250 persons.

Table 1. Design for the health survey of the Amstelland de Meerlanden public health authority, 2002. Calculation of sample weights, design effect, and the effect of weighting and design on the estimated percentage of citizens reporting noise disturbance due to overflying airplanes.

The design effect of the Amstelland de Meerlanden survey is estimated at 1.21. The effective n for calculating the variance and confidence interval around a mean for this design equals 5250/1.21=4338. In 2002 the health authority collected data from 3264 respondents, which gives an effective n for analysis of 3264/1.21=2698 respondents. The design factor is the square root of the design effect, thus √1.21=1.1. After weighting, the confidence interval around a mean is therefore about 10% wider than the corresponding simple random sample confidence interval.
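The arithmetic in this paragraph can be retraced with a few lines of Python. The figures are those quoted in the text; the variable names are chosen here for illustration.

```python
import math

deff = 1.21          # design effect reported for the survey
designed_n = 5250    # planned sample size
achieved_n = 3264    # respondents in 2002

effective_designed_n = designed_n / deff   # a little under 4339
effective_achieved_n = achieved_n / deff   # about 2698
deft = math.sqrt(deff)                     # design factor, about 1.1

# Relative widening of the confidence interval compared with SRS.
ci_widening_pct = (deft - 1) * 100         # about 10 percent
```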

The health authority area is near Schiphol-Amsterdam international airport, and airplane noise is an important problem in the area. Data on airplane noise are used here to demonstrate the effect of weighting and of the design effect. The fifth column of the table shows the number of respondents reporting disturbance due to airplane noise. On the basis of the unweighted data about 16.2% (528/3264*100) of citizens in the area are disturbed by airplane noise, with a 95% confidence interval, calculated according to a basic method (Blalock, 1960), ranging from 14.9 to 17.4%. Using the sample weights in the sixth column, the numbers of respondents are converted into the estimated number of people in the population disturbed by noise. The result is used to estimate the weighted percentage disturbed by airplane noise in the overall health authority area, which is calculated to be 15.4% (28982/187367*100), with a confidence interval ranging from 14.0 to 16.8%. The weighted percentage is lower than the unweighted percentage because the larger local authority areas, which have larger weights, generally have less airplane noise than the smaller areas.
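The confidence intervals in this example can be approximated with the following Python sketch. It assumes a normal-approximation interval for a proportion and applies the design effect by dividing the number of respondents by DEFF; the function name `proportion_ci` is hypothetical, and the basic method cited in the text may differ in detail.

```python
import math

def proportion_ci(p, n, deff=1.0, z=1.96):
    """Normal-approximation CI for a proportion, using the effective n = n / DEFF."""
    effective_n = n / deff
    se = math.sqrt(p * (1 - p) / effective_n)
    return p - z * se, p + z * se

# Unweighted estimate: 528 of 3264 respondents report disturbance.
unweighted_low, unweighted_high = proportion_ci(528 / 3264, 3264)

# Weighted estimate: 28982 of 187367 people in the population, DEFF = 1.21.
weighted_low, weighted_high = proportion_ci(28982 / 187367, 3264, deff=1.21)
```

The unweighted interval reproduces the 14.9 to 17.4% quoted above. The weighted interval comes out at roughly 14.1 to 16.8%, close to the published 14.0 to 16.8%; the small difference presumably reflects rounding in the published figures.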

Table 2. Examples of health survey designs used by Dutch regional health authorities.

Table 2 gives the design effect in the health surveys as published by a number of health authorities in the Netherlands. In two health surveys (Terpstra, 2006; Broer and Spijkers, 2006) there is no design effect, because the designs are self-weighting: a fixed proportion is sampled from each stratum. Weighting is not required, so there is no design effect due to weighting. In the Groningen health survey of 2002 (Broer & Spijkers, 2002) and the Amstelland de Meerlanden survey, also of 2002 (Ten Brinke & Verhagen, 2003), higher numbers of cases were selected from the larger local authorities. The design effects for these studies are 1.14 and 1.21 respectively. The design effect is larger in surveys where a fixed number of cases is taken from strata whose populations differ in size: it ranges from 1.71 in the Noord Kennemerland health survey of 2006 (Heemskerk & Poort, 2007) to 1.85 in the Amsterdam health survey of 2004 (Uitenbroek, 2006).
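The contrast between a self-weighting design and a fixed-number-per-stratum design can be illustrated with hypothetical figures. The sketch below assumes stratum weights Ni / ni and Kish's approximation for the design effect of a mean; the stratum sizes are invented and do not correspond to any survey in Table 2.

```python
def stratum_deff(pop_sizes, sample_sizes):
    """Kish-type design effect for a mean, computed from stratum totals."""
    n_total = sum(sample_sizes)
    pop_total = sum(pop_sizes)
    return n_total * sum(N * N / n for N, n in zip(pop_sizes, sample_sizes)) / pop_total ** 2

pop_sizes = [10_000, 30_000, 60_000]   # hypothetical local authority populations
total_sample = 1_500

# Self-weighting design: the same sampling fraction in every stratum.
proportional = [total_sample * N // sum(pop_sizes) for N in pop_sizes]

# Fixed number of cases per stratum, regardless of population size.
equal = [total_sample // len(pop_sizes)] * len(pop_sizes)

deff_proportional = stratum_deff(pop_sizes, proportional)
deff_equal = stratum_deff(pop_sizes, equal)
```

In this illustration the proportional allocation gives a design effect of 1, while the equal allocation gives a design effect of about 1.38 for the same total sample size.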


This article addresses the design, data weighting and design effects in health surveys carried out by regional public health authorities in the Netherlands. Mostly stratified designs are used, whereby the population in the region is divided into groups and a predetermined number of cases is sampled from each group. A number of the designs are self-weighting; in these cases there is no design effect due to weighting, as the data do not have to be weighted. The largest design effects were observed where fixed numbers of cases were taken from strata whose populations differed in size. In many health surveys the design effect will not cause serious problems, as these surveys tend to be very large. However, the design effect needs to be considered when surveys are smaller, and also in subgroup analyses where weighting is used to make the sampled subgroup representative of the same subgroup in the population.

Careful planning of the design is therefore important. Adjusting the number of individuals sampled in the different groups may prevent overly large design effects. Attention must also be given to any intention to study particular groups, if these groups have to be made representative of the same groups in the population. Combining design effect calculations as suggested in this paper with sample size calculations therefore seems necessary.
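One common way to combine the two calculations is to compute the sample size needed under simple random sampling and multiply it by the expected design effect. The sketch below does this for a proportion, using the normal-approximation formula n = z² p (1 - p) / e². The function names, the margin of error of 3 percentage points and the design effect of 1.21 are assumptions chosen for illustration.

```python
import math

def srs_sample_size(p, margin, z=1.96):
    """Cases needed under SRS to estimate proportion p within +/- margin."""
    return z ** 2 * p * (1 - p) / margin ** 2

def design_adjusted_sample_size(p, margin, deff, z=1.96):
    """SRS sample size inflated by the expected design effect, rounded up."""
    return math.ceil(srs_sample_size(p, margin, z) * deff)

n_srs = math.ceil(srs_sample_size(0.5, 0.03))
n_design = design_adjusted_sample_size(0.5, 0.03, deff=1.21)
```

With these assumptions about 1068 cases are needed under simple random sampling and about 1292 once the design effect is taken into account.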

In this paper a relatively basic method is used to calculate the design effect; the method is valid for estimating the design effect due to weighting for the variance of a simple mean. Design effects for the variance of other estimators, and for measures of correlation between two variables, need to be calculated in other ways. In these cases it seems best to use one of the dedicated packages, such as SPSS Complex Samples, Epi Info complex samples, the "survey" package in R, WesVar or AM. Some of these packages are freely available. They are particularly useful for estimating the critical p-values for statistical tests and can be used to carry out a weighted multivariate analysis correctly. Given that these packages are available, there is no excuse for simply using the "weight" command in a general statistical package without considering the design effect. The confidence interval around a mean calculated by the method suggested in this paper is mostly very similar to the confidence interval calculated with one of the packages mentioned above.
