Statistics of biological network motifs: A compound Poisson approximation for their count in random graphs? - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Conference paper. Year: 2009

Statistics of biological network motifs: A compound Poisson approximation for their count in random graphs?

Abstract

Obtaining and analyzing biological interaction networks is at the core of systems biology. To help understand these complex networks, many recent works have suggested focusing on motifs that occur more frequently than expected at random (Milo et al., 2002; Shen-Orr et al., 2002; Prill et al., 2005). Such motifs indeed seem to reflect functional or computational units that combine to regulate cellular behavior as a whole. The method commonly used so far to detect significantly over-represented motifs relies on heavy simulations: random graphs are first generated, then the p-value is derived either from the empirical distribution of the count or from a Gaussian approximation of the z-score computed from the empirical mean and variance of the count. To identify exceptional motifs in a given network, we propose an analytical statistical method that does not require any simulation (Picard et al., 2008). To this end, we first provide an analytical expression for the mean and variance of the count under any stationary random graph model. We then approximate the motif count distribution by a compound Poisson distribution whose parameters are derived from the mean and variance of the count. Using simulations, we show that the quality of this compound Poisson approximation is very good and much better than that of a Gaussian or Poisson approximation. The compound Poisson distribution can then be used to obtain an approximate p-value and to decide whether an observed count is significantly high. Beyond the p-value calculation, assessing the exceptionality of a motif in a given network relies on the choice of a suitable random graph model. This model should fit some relevant characteristics of the observed network. The degree sequence is usually an important feature to take into account. Unfortunately, the well-known and well-studied Erdős-Rényi model does not fit biological networks well; in particular, it does not account for heterogeneity.
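The step described above, deriving compound Poisson parameters from the mean and variance of the count and then reading off a p-value, can be sketched as follows. The abstract does not say which compound Poisson family is used, so this sketch assumes a geometric-Poisson (Pólya-Aeppli) distribution, a two-parameter family that can be matched to a mean and a variance. The function names and the parameterisation (`lam`, `a`) are hypothetical, and the mean and variance are taken as given inputs rather than computed from a random graph model.

```python
import math


def polya_aeppli_params(mean, var):
    """Match a geometric-Poisson (Polya-Aeppli) law to a given mean and variance.

    Assumed parameterisation (not taken from the abstract): a Poisson(lam)
    number of clumps, each of size K with P(K = k) = (1 - a) * a**(k - 1),
    k >= 1. Then mean = lam / (1 - a) and var = lam * (1 + a) / (1 - a)**2.
    """
    if mean <= 0 or var < mean:
        raise ValueError("need mean > 0 and var >= mean (overdispersed count)")
    a = (var - mean) / (var + mean)
    lam = 2.0 * mean**2 / (var + mean)
    return lam, a


def compound_poisson_pmf(lam, a, n_max):
    """Return [P(N = 0), ..., P(N = n_max)] using the Panjer recursion."""
    # clump[k] = P(K = k); index 0 is unused because clumps have size >= 1.
    clump = [0.0] + [(1 - a) * a ** (k - 1) for k in range(1, n_max + 1)]
    pmf = [math.exp(-lam)]  # P(N = 0): no clump at all
    for n in range(1, n_max + 1):
        s = sum(k * clump[k] * pmf[n - k] for k in range(1, n + 1))
        pmf.append(lam / n * s)
    return pmf


def upper_tail_pvalue(n_obs, mean, var):
    """Approximate P(N >= n_obs) for a count with the given mean and variance."""
    if n_obs <= 0:
        return 1.0
    lam, a = polya_aeppli_params(mean, var)
    pmf = compound_poisson_pmf(lam, a, n_obs - 1)
    return max(0.0, 1.0 - sum(pmf))
```

A call such as `upper_tail_pvalue(25, mean=10.0, var=30.0)` returns the approximate probability of observing at least 25 occurrences. When the variance equals the mean, the sketch reduces to a plain Poisson tail. Computing the tail as one minus a sum loses precision for very small p-values, and `math.exp(-lam)` underflows for very large `lam`, so a production version would need a more careful numerical scheme.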
We then highlight the recent and promising mixture model for random graphs proposed by Daudin et al. (2008). This model assumes that nodes are divided into several connectivity classes and that the probability that two nodes are connected depends on their classes. The goodness of fit of this model to real biological networks is very satisfactory.
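To make the generative assumption of the mixture model concrete, here is a minimal sampling sketch. It assumes an undirected graph without self-loops, and the names `class_props` (class proportions) and `connect_prob` (class-by-class connection probabilities) are hypothetical, since the abstract does not name the parameters. It only illustrates how a graph could be drawn from such a model, not how the model is fitted to an observed network.

```python
import random


def sample_mixture_graph(n, class_props, connect_prob, seed=None):
    """Draw one undirected graph from a mixture model for random graphs.

    n            -- number of nodes
    class_props  -- class_props[q] is the proportion of nodes in class q
    connect_prob -- connect_prob[q][l] is the probability that a node of
                    class q and a node of class l are connected (symmetric)
    Returns (classes, edges): the class of each node and a list of (i, j)
    pairs with i < j.
    """
    rng = random.Random(seed)
    n_classes = len(class_props)
    # Each node receives a hidden class, independently of the others.
    classes = rng.choices(range(n_classes), weights=class_props, k=n)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends only on the two classes.
            if rng.random() < connect_prob[classes[i]][classes[j]]:
                edges.append((i, j))
    return classes, edges
```

With a single class, this reduces to the Erdős-Rényi model. Using several classes with different connection probabilities is what lets the model produce heterogeneous connectivity.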
Main file
singapour_1.html (18.23 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02816548, version 1 (06-06-2020)

Identifiers

  • HAL Id: hal-02816548, version 1
  • PRODINRA: 183419

Cite

Sophie S. Schbath. Statistics of biological network motifs: A compound Poisson approximation for their count in random graphs? Progress in Stein's method, Jan 2009, Singapore, Singapore. 1 p. ⟨hal-02816548⟩

Collections

INRA INRAE MATHNUM
15 Views
9 Downloads
