BNPdensity: Bayesian nonparametric mixture modelling in R - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, Australian and New Zealand Journal of Statistics, Year: 2021

BNPdensity: Bayesian nonparametric mixture modelling in R

Abstract

Robust statistical data modelling under potential model misspecification often requires leaving the parametric world for the nonparametric. In the latter, parameters are infinite-dimensional objects such as functions, probability distributions or infinite vectors. In the Bayesian nonparametric approach, prior distributions are designed for these parameters and provide a handle to manage the complexity of nonparametric models in practice. However, most modern Bayesian nonparametric models often seem out of reach of practitioners, as inference algorithms need careful design to deal with the infinite number of parameters. The aim of this work is to facilitate the journey by providing computational tools for Bayesian nonparametric inference. The article describes a set of functions available in the R package BNPdensity for density estimation with an infinite mixture model, including support for all types of censored data. The package provides access to a large class of such models based on normalized random measures, which represent a generalization of the popular Dirichlet process mixture. One striking advantage of this generalization is that it offers much more robust priors on the number of clusters than the Dirichlet process. Another crucial advantage is the complete flexibility in specifying the prior for the scale and location parameters of the clusters, because conjugacy is not required. Inference is performed using a theoretically grounded approximate sampling methodology known as the Ferguson & Klass algorithm. The package also offers several goodness-of-fit diagnostics, such as QQ-plots, and a cross-validation criterion, the conditional predictive ordinate. The proposed methodology is illustrated on a classical ecological risk assessment method, the Species Sensitivity Distribution (SSD), showcasing the benefits of the Bayesian nonparametric framework.
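The conditional predictive ordinate (CPO) named in the abstract is a leave-one-out predictive density, and a standard Monte Carlo estimate of it is the harmonic mean, over posterior draws, of the likelihood of each observation. The sketch below illustrates that estimator in Python under strong simplifying assumptions: a single Gaussian likelihood stands in for the mixture model, and the data, the posterior draws and the function names `normal_pdf` and `cpo_harmonic_mean` are all hypothetical. It is not the BNPdensity implementation and does not reproduce the package's API.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a Normal(mean, sd) distribution evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def cpo_harmonic_mean(likelihoods):
    """Estimate one CPO value per observation.

    likelihoods[t][i] is the likelihood of observation i under posterior
    draw t. The estimate for observation i is the harmonic mean of column i.
    """
    n_draws = len(likelihoods)
    n_obs = len(likelihoods[0])
    cpo = []
    for i in range(n_obs):
        mean_inverse = sum(1.0 / likelihoods[t][i] for t in range(n_draws)) / n_draws
        cpo.append(1.0 / mean_inverse)
    return cpo


# Hypothetical data and hypothetical posterior draws of (mean, sd).
data = [-0.3, 0.1, 2.5]
draws = [(0.0, 1.0), (0.2, 0.9), (-0.1, 1.1)]

likelihoods = [[normal_pdf(y, m, s) for y in data] for (m, s) in draws]
cpo_values = cpo_harmonic_mean(likelihoods)

# Summing the log CPO values gives a single cross-validation score.
log_pseudo_marginal = sum(math.log(c) for c in cpo_values)
```

A low CPO value flags an observation that the model predicts poorly once that observation is left out; here the point at 2.5 receives a smaller value than the point at 0.1.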
Main file
BNPdensity_self_contained.pdf (568.96 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03433254, version 1 (17-11-2021)

Licence

Attribution

Cite

Julyan Arbel, Guillaume Kon Kam King, Antonio Lijoi, Luis E. Nieto‐Barajas, Igor Prünster. BNPdensity: Bayesian nonparametric mixture modelling in R. Australian and New Zealand Journal of Statistics, 2021, 63 (3), pp.542-564. ⟨10.1111/anzs.12342⟩. ⟨hal-03433254⟩