European soil bulk density and organic carbon stock database using machine learning based pedotransfer function
Abstract. Soil bulk density (BD) is a fundamental indicator of soil health and quality that strongly influences plant growth, nutrient availability, and water retention. Because BD is often missing from soil databases, pedotransfer functions (PTFs) have become a powerful tool for predicting it from other, more easily measured soil properties. However, the impact of PTF accuracy on soil organic carbon (SOC) stock calculation has rarely been explored. In this study, we proposed a local modelling approach for predicting BD across Europe using the recently released BD data from LUCAS Soil 2018 (0–20 cm). The approach combined a neighbour sample search, Forward Recursive Feature Selection (FRFS), and a Random Forest (RF) model (local-RF_FRFS). The results showed that local-RF_FRFS predicted BD well (R² of 0.58, RMSE of 0.19 g cm⁻³), outperforming the traditional PTFs (R² of 0.40–0.45, RMSE of 0.22 g cm⁻³) and the global PTFs using RF with and without FRFS (R² of 0.56–0.57, RMSE of 0.19 g cm⁻³). Interestingly, when the BD predictions were used to calculate SOC stock, the best traditional PTF (R² = 0.84, RMSE = 1.39 kg m⁻²) performed close to local-RF_FRFS (R² = 0.85, RMSE = 1.32 kg m⁻²). However, local-RF_FRFS still performed better (ΔR² > 0.2 and ΔRMSE > 0.1 g cm⁻³) for soil samples with low SOC stock (< 3 kg m⁻²). We therefore suggest that local-RF_FRFS is a promising method for BD prediction, while traditional PTFs are more efficient when BD is subsequently used to calculate SOC stock. Finally, we produced two BD and SOC stock datasets (18,945 and 15,389 soil samples) for LUCAS Soil 2018 using the best traditional PTF and local-RF_FRFS, respectively. The datasets are archived on the Zenodo platform at https://zenodo.org/records/10211884 (Chen et al., 2023).
The outcomes of this study improve the predictive accuracy of BD, and the resulting BD and SOC stock datasets across Europe enable more precise soil hydrological and biological modelling.
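To illustrate how a BD prediction feeds into a SOC stock estimate, the following is a minimal sketch using the commonly applied stock equation (SOC content × BD × layer depth, optionally corrected for coarse fragments). The abstract does not state the exact formula used in the study, so the function name `soc_stock_kg_m2`, the coarse-fragment correction, and the example values are assumptions made for illustration only.

```python
def soc_stock_kg_m2(soc_g_per_kg, bd_g_cm3, depth_cm=20.0, coarse_fraction=0.0):
    """Return SOC stock in kg m^-2 for a single soil layer.

    Assumed (not taken from the study) formula:
        stock = SOC [g kg^-1] * BD [g cm^-3] * depth [cm] * (1 - CF) * 0.01
    The factor 0.01 converts g kg^-1 * g cm^-3 * cm to kg m^-2.
    """
    if not 0.0 <= coarse_fraction < 1.0:
        raise ValueError("coarse_fraction must be in [0, 1)")
    return soc_g_per_kg * bd_g_cm3 * depth_cm * (1.0 - coarse_fraction) * 0.01


# Hypothetical sample: 20 g kg^-1 SOC in the 0-20 cm layer.
soc = 20.0
bd_measured = 1.20   # g cm^-3, hypothetical measured value
bd_predicted = 1.39  # g cm^-3, hypothetical prediction off by 0.19 g cm^-3

stock_measured = soc_stock_kg_m2(soc, bd_measured)
stock_predicted = soc_stock_kg_m2(soc, bd_predicted)

print(f"stock from measured BD:  {stock_measured:.2f} kg m^-2")
print(f"stock from predicted BD: {stock_predicted:.2f} kg m^-2")
print(f"stock error:             {stock_predicted - stock_measured:.2f} kg m^-2")
```

Under this assumed formula the stock is linear in BD, so a given absolute BD error translates into a stock error proportional to the sample's SOC content and layer depth.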