Bogdan OANCEA (bogdan.oancea@insse.ro, bogdan.oancea@faa.unibuc.ro)
National Institute of Statistics Romania / University of Bucharest, Romania
David SALGADO (david.salgado.fernandez@ine.es)
National Institute of Statistics, Spain, Madrid
Luis SANGUIAO SANDE (luis.sanguiao.sande@ine.es)
National Institute of Statistics, Spain, Madrid
Sandra BARRAGAN (sandra.barragan.andres@ine.es)
National Institute of Statistics, Spain, Madrid
Abstract
In this paper, we describe the software implementation of the methodological framework designed during the ESSnet Big Data II project to incorporate mobile phone data into the current production chain of official statistics. We present an overview of the architecture of the software stack, its components, and the interfaces between them, and show how they can be used. Our software implementation consists of four R packages: destim, for estimating the spatial distribution of the mobile devices; deduplication, for classifying devices as being in 1:1 or 2:1 correspondence with their owners; aggregation, for estimating the number of individuals detected by the network from the geolocation probabilities and the duplicity probabilities; and inference, which combines the number of individuals provided by the aggregation package with other information, such as population counts from an official register and mobile operator penetration rates, to estimate the target population counts.
Keywords: R, mobile phone data, population count, geolocation, deduplication, aggregation, inference
JEL Classification: C88, C89