Marius Jula
“Nicolae Titulescu” University of Bucharest
Abstract
Before a dataset is used for analysis, it should be tested for particular situations such as the existence of outliers or possible data errors. Outliers may indicate bad data, and the results may be affected if these points are not identified and/or explained. In addition, some data are sensitive, such as electoral datasets, which may be subject to suspicion of fraud. This paper describes methods for identifying outliers and data errors, illustrated in R on electoral data.
Keywords: outlier, Z-score, Benford’s law, R