A comparison of random forest and logistic regression model in credit scoring of rural households
Abstract
Many banks currently use logistic regression for credit scoring when deciding whether to grant loans to customers. This paper compares the random forest and logistic regression methods as predictive tools that support financial analysis in credit scoring. We use data from the Vietnam Access to Resources Household Survey (VARHS), covering 3,530 households in 12 provinces of Vietnam in 2014. The results show that random forest is a more accurate predictive tool than logistic regression. This suggests that banks could use random forest to identify potential borrowers from their existing client data, saving the time and cost of finding new clients.