L2 Boosting technique for nonlinear spatial models
Abstract
This article proposes an L2 Boosting technique for nonlinear spatial models. The procedure uses kernel smoothers as base-learners, combined through a scheme of multiple decreasing bandwidths. The resulting boosting algorithm accounts for spatial heterogeneity and spatial autocorrelation as well as non-linearity. The proposed technique is new to spatial econometrics, a research community that has largely ignored boosting techniques. The notable exception is Kostov (2010, 2013), who relies on this type of method to build spatial interaction matrices by combining a large set of candidate matrices. Beyond accounting for spatial autocorrelation or spatial heterogeneity, which substantially improves the estimates when either is present, the proposed algorithm is well suited to high-dimensional data: it performs just as well when the number of potential explanatory variables increases.
Moreover, our technique captures discontinuities and threshold effects better than the usual base-learners, such as spline functions or nearest-neighbor kernels.
We first detail the boosting algorithm and assess its properties in terms of prediction accuracy and selection of explanatory variables. We then illustrate its strong performance on both synthetic data and real geo-localized data (land prices, the number of new dwellings per plot, local genetic diversity of the Salsifi virus) by comparing its predictive accuracy with that of existing gradient boosting algorithms (xgboost, glmboost, gamboost) and of the usual spatial econometric models (SAR with BLUP).
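To make the idea of kernel-smoother base-learners with decreasing bandwidths concrete, the following is a minimal sketch of componentwise L2 boosting, not the algorithm of the article. Everything in it is an assumption made for illustration: the Gaussian Nadaraya-Watson smoother, the function names `nw_smoother_matrix` and `l2_boost_kernel`, the bandwidth sequence, the learning rate `nu`, and the synthetic data. Spatial weighting, which the article's method handles, is left out.

```python
import numpy as np

def nw_smoother_matrix(x, bandwidth):
    """Nadaraya-Watson smoother matrix (Gaussian kernel) for one covariate."""
    d = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)
    return w / w.sum(axis=1, keepdims=True)  # each row sums to 1

def l2_boost_kernel(X, y, bandwidths, steps_per_bandwidth=20, nu=0.1):
    """Componentwise L2 boosting; assumed scheme: bandwidths visited in decreasing order."""
    n, p = X.shape
    fit = np.full(n, y.mean())            # start from the mean of y
    selected = np.zeros(p, dtype=int)     # how often each covariate is chosen
    for h in bandwidths:
        smoothers = [nw_smoother_matrix(X[:, j], h) for j in range(p)]
        for _ in range(steps_per_bandwidth):
            resid = y - fit               # negative gradient of the L2 loss
            cand = [S @ resid for S in smoothers]
            sse = [np.sum((resid - c) ** 2) for c in cand]
            j = int(np.argmin(sse))       # base-learner that best fits the residuals
            fit = fit + nu * cand[j]      # shrunken update
            selected[j] += 1
    return fit, selected

# Hypothetical synthetic data: a smooth effect plus a threshold in the first
# covariate, and three pure-noise covariates.
rng = np.random.default_rng(0)
n = 200
X = rng.uniform(-1.0, 1.0, size=(n, 4))
y = np.sin(3 * X[:, 0]) + (X[:, 0] > 0.3) + 0.1 * rng.normal(size=n)

fit, selected = l2_boost_kernel(X, y, bandwidths=[1.0, 0.5, 0.25, 0.1])
mse = np.mean((y - fit) ** 2)
print("in-sample MSE:", mse)
print("selection counts per covariate:", selected)
```

Large bandwidths capture the broad trend first, and the smaller ones refine local features such as the threshold. The selection counts give a rough view of which covariates the boosting keeps returning to.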