Survey design effect - R programming

Manuela Alcañiz, Montserrat Guillén & Zaida Vicente

We present an example of how to calculate standard errors and design effects in complex survey designs. The computations are shown in SAS, R and Stata.

Information on a typical survey can be found in:


DATA DESCRIPTION


Name               Content description
ilustra.xls        RS_7 = id for stratum level 1, SEXEDAT = id for stratum level 2, PES_1 = sample weight, variable = a 0/1 indicator, municrec = id for cluster.
ilustra.RData      Same data set in R workspace format.
ilustra.sas7bdat   Same data set in SAS format.
ilustra.dta        Same data set in Stata format.





SIMPLE RANDOM SAMPLE vs MORE COMPLEX DESIGNS


In this example we use a sample data set to show how to calculate standard errors for a proportion under the simple random sampling approach.


We also show how to account for sample weights, stratification, clustering, and all of these features together.
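
The design features listed above can be sketched in R as follows. This is a minimal illustration, not the authors' program. It assumes the `survey` package is installed, uses simulated stand-in data that only borrows the column names from the data description, and assumes that the two stratum levels are crossed into a single stratum identifier (the column `stratum` is hypothetical and not part of the data set).

```r
library(survey)  # assumption: the 'survey' package is installed

set.seed(1)

# Simulated stand-in data (NOT the real ilustra data set); only the
# column names follow the data description.
n_obs <- 600
toy <- data.frame(
  RS_7     = rep(1:3, each = 200),                  # stratum level 1
  SEXEDAT  = rep(rep(1:2, each = 100), times = 3),  # stratum level 2
  municrec = rep(1:30, each = 20),                  # cluster id, 5 clusters per stratum
  PES_1    = runif(n_obs, 50, 150),                 # sample weight
  variable = rbinom(n_obs, 1, 0.3)                  # 0/1 indicator
)

# Assumption: both stratum levels are crossed into one stratum id.
toy$stratum <- interaction(toy$RS_7, toy$SEXEDAT, drop = TRUE)

# One design object per feature, plus the full design.
designs <- list(
  weights = svydesign(ids = ~1, weights = ~PES_1, data = toy),
  strata  = svydesign(ids = ~1, strata = ~stratum, weights = ~PES_1, data = toy),
  cluster = svydesign(ids = ~municrec, weights = ~PES_1, data = toy),
  full    = svydesign(ids = ~municrec, strata = ~stratum, weights = ~PES_1, data = toy)
)

# Estimated proportion, standard error and design effect for each design.
results <- t(sapply(designs, function(d) {
  est <- svymean(~variable, d, deff = TRUE)
  c(p_hat = as.numeric(coef(est)),
    se    = as.numeric(SE(est)),
    deff  = as.numeric(deff(est)))
}))
print(round(results, 4))
```

Because every design uses the same weights, the point estimate is identical across rows. Only the standard error and the design effect change with the design.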


The 95% confidence interval for the proportion p of elements that have some condition identified by a binary variable, like the one in our example, is:
\begin{equation} \hat{p} \pm 1.96 \cdot \sqrt{\left( \displaystyle\frac{N-n}{N}\right) \cdot \left( \displaystyle\frac{\hat{p}(1 - \hat{p})}{n-1}\right)} \end{equation}
where p̂ is the estimated proportion, n is the sample size and N is the population size.
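
As a hedged illustration of the simple random sample interval, the sketch below computes it in base R. The function name `srs_prop_ci`, the toy vector `y` and the population size of 10000 are hypothetical and chosen only for the example. The code uses `qnorm()` for the normal quantile, which gives approximately 1.96 at the 95% level.

```r
# Confidence interval for a proportion under simple random sampling
# without replacement, including the finite population correction (N - n) / N.
# y: 0/1 vector of sampled values; N: population size (assumed known).
srs_prop_ci <- function(y, N, level = 0.95) {
  n     <- length(y)
  p_hat <- mean(y)
  z     <- qnorm(1 - (1 - level) / 2)  # about 1.96 for level = 0.95
  se    <- sqrt(((N - n) / N) * (p_hat * (1 - p_hat) / (n - 1)))
  c(estimate = p_hat, se = se, lower = p_hat - z * se, upper = p_hat + z * se)
}

# Hypothetical data: 200 sampled units, 60 of them with the condition,
# drawn from a population of 10000.
y <- c(rep(1, 60), rep(0, 140))
result <- srs_prop_ci(y, N = 10000)
print(round(result, 4))
```

When n is small relative to N, the factor (N - n) / N is close to 1 and has little effect on the interval.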







  • Universitat de Barcelona - Last Updated: 06-20-2014