Variational inference for sparse network reconstruction from count data
Abstract
The problem of network reconstruction from continuous data has been extensively studied, and most state-of-the-art methods rely on variants of Gaussian Graphical Models (GGMs). Unfortunately, GGMs are poorly suited to sparse count data spanning several orders of magnitude. Most inference methods for count data (SparCC, REBACCA, SPIEC-EASI, gCoda, etc.) first transform the counts into pseudo-Gaussian observations and then apply a GGM. We rely instead on a PoissonLogNormal (PLN) model, in which counts follow Poisson distributions whose parameters are sampled from a latent multivariate Gaussian variable, and we infer the network in the latent space using a variational inference procedure. This model allows us to (i) control for confounding covariates and differences in sampling effort and (ii) integrate data sets from different origins. It is also competitive with state-of-the-art methods in terms of speed and accuracy.
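To make the generative side of the PLN model concrete, the following is a minimal simulation sketch in Python with NumPy, not the inference procedure itself. It assumes a log link between the latent Gaussian layer and the Poisson means, and it assumes that sampling effort and covariates enter additively on the log scale; the function name `simulate_pln` and its parameters (`precision`, `offsets`, `covariates`, `coef`) are hypothetical names introduced for illustration only.

```python
import numpy as np


def simulate_pln(n, precision, offsets=None, covariates=None, coef=None, seed=0):
    """Draw n count vectors from a Poisson log-normal model (illustrative sketch).

    precision  : (p, p) positive definite precision matrix of the latent
                 Gaussian layer; its nonzero off-diagonal entries are the
                 edges of the network to be reconstructed.
    offsets    : optional (n, 1) or (n, p) array of log sampling efforts
                 (assumed to act additively on the log scale).
    covariates : optional (n, d) design matrix of confounding covariates.
    coef       : optional (d, p) regression coefficients for the covariates.
    """
    rng = np.random.default_rng(seed)
    p = precision.shape[0]

    # Latent layer: Z_i ~ N(0, Sigma) with Sigma the inverse of the precision.
    sigma = np.linalg.inv(precision)
    z = rng.multivariate_normal(np.zeros(p), sigma, size=n)

    # Log of the Poisson means: latent variable plus assumed fixed effects.
    log_mean = z
    if offsets is not None:
        log_mean = log_mean + offsets
    if covariates is not None:
        log_mean = log_mean + covariates @ coef

    # Observed layer: counts are Poisson given the latent layer.
    return rng.poisson(np.exp(log_mean))


# A 3-node chain network: nodes 1 and 3 are conditionally independent given node 2.
omega = np.array([[1.0, 0.4, 0.0],
                  [0.4, 1.0, 0.4],
                  [0.0, 0.4, 1.0]])
counts = simulate_pln(n=100, precision=omega, offsets=np.log(np.full((100, 1), 50.0)))
```

In this sketch the network lives entirely in the zero pattern of `omega`, which is why inference targets the latent space rather than the raw counts.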