Combining random forests and class-balancing to discriminate between three classes of avalanche activity in the French Alps
Abstract
Determining the avalanche activity that corresponds to given snow and meteorological conditions is a long-standing problem of high practical relevance. To address it, numerous statistical forecasting models have been developed, but intercomparisons of their efficiency on very large datasets are rare. In this work, an approach combining random forests with class-balancing is presented and systematically compared with competing methods currently described in the avalanche literature. Using more than 50 years of daily avalanche observations in the 23 massifs of the French Alps, the competing classifiers are evaluated on their ability to distinguish three classes of avalanche activity: non-avalanche days, days with moderate activity, and days with high activity. Moreover, the most important variables in the random forest classifiers are shown to be consistent with the current avalanche literature, and a clustering based on these variable importances separates massifs that are known to have different avalanche activity. Our approach opens perspectives for supporting operational avalanche forecasting.
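As a purely illustrative aside, the notion of class-balancing for three unevenly populated classes can be sketched in a few lines. The abstract does not specify which balancing scheme is used, so the two shown here (inverse-frequency class weights and a bootstrap that draws equally from every class before a tree would be grown) are assumptions, and the label counts are hypothetical toy values rather than data from the study.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def balanced_bootstrap(labels, rng):
    """Draw, with replacement, as many indices from each class as the
    rarest class contains, so every class is equally represented."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    sample = []
    for idx in by_class.values():
        sample.extend(rng.choices(idx, k=n_min))
    return sample


# Hypothetical toy labels: most days have no avalanche, few have high activity.
labels = ["none"] * 90 + ["moderate"] * 8 + ["high"] * 2

weights = balanced_class_weights(labels)
sample = balanced_bootstrap(labels, random.Random(0))
sample_counts = Counter(labels[i] for i in sample)
```

With either scheme, the rare high-activity days carry as much total influence as the abundant non-avalanche days. In a random forest, such a balanced sample would typically be redrawn for each tree.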