Journal article, Australian and New Zealand Journal of Statistics, 2021

BNPdensity: Bayesian nonparametric mixture modelling in R

Abstract

Robust statistical data modelling under potential model mis-specification often requires leaving the parametric world for the nonparametric. In the latter, parameters are infinite-dimensional objects such as functions, probability distributions or infinite vectors. In the Bayesian nonparametric approach, prior distributions are designed for these parameters, which provide a handle to manage the complexity of nonparametric models in practice. However, most modern Bayesian nonparametric models often seem out of reach of practitioners, as inference algorithms need careful design to deal with the infinite number of parameters. The aim of this work is to facilitate the journey by providing computational tools for Bayesian nonparametric inference. The article describes a set of functions available in the R package BNPdensity for carrying out density estimation with an infinite mixture model, including for all types of censored data. The package provides access to a large class of such models based on normalized random measures, which represent a generalization of the popular Dirichlet process mixture. One striking advantage of this generalization is that it offers much more robust priors on the number of clusters than the Dirichlet process. Another crucial advantage is the complete flexibility in specifying the prior for the scale and location parameters of the clusters, because conjugacy is not required. Inference is performed using a theoretically grounded approximate sampling methodology known as the Ferguson & Klass algorithm. The package also offers several goodness-of-fit diagnostics, such as QQ-plots and a cross-validation criterion, the conditional predictive ordinate. The proposed methodology is illustrated on a classical ecological risk assessment method, the Species Sensitivity Distribution (SSD), showcasing the benefits of the Bayesian nonparametric framework.
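
The conditional predictive ordinate mentioned in the abstract can be estimated from posterior draws as the harmonic mean of the likelihood of each observation. The following is a minimal sketch of that standard Monte Carlo estimator, written in Python for illustration only; it is not the BNPdensity implementation. The helper names (`normal_logpdf`, `log_cpo`), the single-normal model standing in for the infinite mixture, and the simulated posterior draws are all assumptions made for this example.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log-density of a Normal(mu, sigma^2) distribution evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_cpo(x_i, draws):
    """Monte Carlo estimate of log CPO_i from posterior draws (hypothetical helper).

    CPO_i is estimated by the harmonic mean of f(x_i | theta_t) over the draws:
        CPO_i ~= [ (1/T) * sum_t 1 / f(x_i | theta_t) ]^(-1)
    The computation is done on the log scale with a log-sum-exp for stability.
    """
    neg_logs = [-normal_logpdf(x_i, mu, sigma) for mu, sigma in draws]
    m = max(neg_logs)
    mean_of_inverses = sum(math.exp(v - m) for v in neg_logs) / len(neg_logs)
    return -(m + math.log(mean_of_inverses))


# Assumed stand-in for MCMC output: draws of (mu, sigma) scattered near (0, 1).
rng = random.Random(0)
draws = [(rng.gauss(0.0, 0.1), 1.0 + 0.1 * rng.random()) for _ in range(2000)]

data = [-0.3, 0.1, 0.8, 4.0]  # the last point is a deliberate outlier
log_cpos = [log_cpo(x, draws) for x in data]

# The sum of the log CPOs is often reported as a single model-comparison score.
lpml = sum(log_cpos)
```

A low CPO value flags an observation that the model predicts poorly when that observation is left out, so in this sketch the outlier at 4.0 receives a smaller log CPO than the points near the centre.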
Main file
BNPdensity_self_contained.pdf (568.96 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03433254, version 1 (17-11-2021)

Licence

Attribution - CC BY 4.0

Cite

Julyan Arbel, Guillaume Kon Kam King, Antonio Lijoi, Luis E. Nieto‐Barajas, Igor Prünster. BNPdensity: Bayesian nonparametric mixture modelling in R. Australian and New Zealand Journal of Statistics, 2021, 63 (3), pp.542-564. ⟨10.1111/anzs.12342⟩. ⟨hal-03433254⟩