Oyebayo Ridwan Olaniran (rid4stat@yahoo.com)
Universiti Tun Hussein Onn Malaysia
Mohd Asrul Affendi Bin Abdullah (afendi@uthm.edu.my)
Universiti Tun Hussein Onn Malaysia
Abstract
Random Forest (RF) is a popular method for regression analysis of low- or high-dimensional data. RF is often used with the latter because it relaxes the dimensionality assumption. RF's major weakness is that it is not governed by a statistical model, hence a probabilistic interpretation of its predictions is not possible. RF's major strengths are its distribution-free property and its wide applicability to most real-life problems. Bayesian Additive Regression Trees (BART), implemented in R via the packages BayesTree or bartMachine, offers a Bayesian interpretation of random forest, but it suffers from high computational time as well as low efficiency compared to RF in some specific situations. In this paper, we propose a new probabilistic interpretation of random forest, called Bayesian Random Forest (BRF), for regression analysis of high-dimensional data. In addition, we present an implementation of BRF in R called BayesRandomForest. We also demonstrate the applicability of BRF using simulated datasets of varying dimensions. Results from the simulation experiment show that BRF has improved efficiency over its competitors.
Keywords: Random Forest, Bayesian Additive Regression Trees, High-dimensional, R
JEL Classification: C11, C39