José F. Zea (josezea@usantotomas.edu.co)
Santo Tomás University, Bogota, Colombia
Felipe Ortiz (andresortiz@usantotomas.edu.co)
Santo Tomás University, Bogota, Colombia
ABSTRACT
Small Area Estimation (SAE) is a methodology widely used by statistical offices in several countries to reduce sampling errors with the help of auxiliary information. Countries such as the USA, Canada, England and Israel, as well as the European Community, have offices within their statistical institutes dedicated to applying SAE in several studies. So far, the National Administrative Department of Statistics of Colombia (DANE) has not published official statistics that involve this methodology. The present work illustrates the advantages of using SAE to estimate living conditions. Specifically, the unemployment rate and the average income of the municipalities of Cundinamarca are estimated. For this purpose, data from the 2014 Multipurpose Survey are used, complemented with socio-demographic and economic auxiliary information. A Fay & Herriot (1979) mixed model is used to obtain the estimates.
We use the R ecosystem to implement the SAE methodology. R is used for data wrangling, model fitting, parameter estimation and, finally, visualization, with the aid of well-known packages such as tidyr, forcats, sae and ggplot2, among others.
We show the R implementation and some notable results: first, a good fit of the model to the data; second, a reduction in the sampling errors of the small area estimates compared with the direct estimates from the Bogota Multipurpose Survey (EMB); and finally, acceptable estimates for municipalities that were not covered by the survey.
Keywords: Small Area Estimation, Survey Sampling, Tidyverse, Household Survey, Colombia, Cundinamarca Municipalities.
JEL classification: O54, C63, C81, C83, C88