Online Job Advertisements for Labour Market Statistics using R

Andrea ASCHERI
Eurostat, European Commission, Luxembourg
Gabriele MARCONI
Sogeti, Luxembourg
Matyas MESZAROS
Eurostat, European Commission, Luxembourg
Fernando REIS
Eurostat, European Commission, Luxembourg

Abstract

This paper presents the R implementation of the methodology used to calculate a labour market concentration index (the Herfindahl-Hirschman index) for European urban areas, based on a database of over 100 million online job advertisements. After introducing the broader context and the motivation for the analysis, the authors describe the overall processing workflow. The paper then presents in more detail the solutions to the two main challenges encountered: improving computational efficiency through parallel computing and cloud data querying, and building a custom machine learning model to classify a variable that is important for the study (company name). Finally, the paper discusses the main reasons for using R and for sharing the code in a public repository.
Keywords: R, Big data, Online Job Advertisements, Labour market
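
As an illustration of the index named in the abstract, the following is a minimal sketch in base R of a Herfindahl-Hirschman index computed per urban area. It is not the authors' code: the data frame `ads`, its columns `urban_area` and `company`, and the function `hhi` are hypothetical, and the sketch assumes that market shares are measured as each company's share of the advertisements posted in an urban area.

```r
# Hypothetical toy data: one row per online job advertisement.
ads <- data.frame(
  urban_area = c("A", "A", "A", "A", "B", "B"),
  company    = c("x", "x", "y", "z", "x", "y")
)

# HHI for one area: the sum of squared company shares of advertisements.
# Assumption: a company's market share is its share of the area's ads.
hhi <- function(companies) {
  shares <- table(companies) / length(companies)
  sum(shares^2)
}

# Apply the index separately to each urban area.
hhi_by_area <- tapply(ads$company, ads$urban_area, hhi)
print(hhi_by_area)
```

On this scale the index ranges from just above 0 (many companies with small shares) to 1 (a single company posting every advertisement in the area).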

Romanian Statistical Review 1/2022