Statistics of biological network motifs : A compound Poisson approximation for their count in random graphs?

Sophie S. Schbath

Conference paper, Year: 2009

Role: Author
PersonId: 183444
IdHAL: sophie-schbath
ORCID: 0000-0003-3574-8222
IdRef: 07553424X

Unité Mathématique Informatique et Génome

Abstract

Obtaining and analyzing biological interaction networks is at the core of systems biology. To help understand these complex networks, many recent works have suggested focusing on motifs that occur more frequently than expected at random (Milo et al., 2002; Shen-Orr et al., 2002; Prill et al., 2005). Such motifs indeed seem to reflect functional or computational units that combine to regulate cellular behavior as a whole. The common method used so far to detect significantly over-represented motifs relies on heavy simulations: random graphs are first generated, then the p-value is derived either from the empirical distribution of the count or via a Gaussian approximation of the z-score computed from the empirical mean and variance of the count. To identify exceptional motifs in a given network, we propose a statistical and analytical method that does not require any simulation (Picard et al., 2008). To this end, we first provide an analytical expression of the mean and variance of the count under any stationary random graph model. We then approximate the motif count distribution by a compound Poisson distribution whose parameters are derived from the mean and variance of the count. Using simulations, we show that the quality of this compound Poisson approximation is very good and much better than that of a Gaussian or a Poisson approximation. The compound Poisson distribution can then be used to obtain an approximate p-value and to decide whether an observed count is significantly high. Beyond the p-value calculation, assessing the exceptionality of a motif in a given network relies on the choice of a suitable random graph model, which should fit relevant characteristics of the observed network. The degree sequence is usually an important feature to take into account. Unfortunately, the well-known and well-studied Erdős-Rényi model does not fit biological networks well; in particular, it does not account for heterogeneities.
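As an illustration of the moment-matching step described above, here is a minimal sketch of how a compound Poisson p-value could be obtained from a mean and a variance. The abstract does not name the compound Poisson family, so the sketch assumes geometric clump sizes (the Pólya-Aeppli, or geometric-Poisson, distribution) as one plausible choice. The function names and the numerical inputs in the usage note are hypothetical, and the probability mass function is evaluated with the standard Panjer recursion for compound Poisson sums.

```python
import math


def fit_polya_aeppli(mean, var):
    """Match the first two moments of a geometric-Poisson distribution.

    Assumed parametrisation: mean = lam / (1 - a), var = lam * (1 + a) / (1 - a)**2,
    where lam is the Poisson rate of clumps and a the geometric parameter.
    """
    if mean <= 0 or var < mean:
        raise ValueError("requires mean > 0 and var >= mean")
    a = (var - mean) / (var + mean)
    lam = mean * (1.0 - a)
    return lam, a


def compound_poisson_pmf(lam, a, n_max):
    """Return [P(N = 0), ..., P(N = n_max)] using the Panjer recursion."""
    # Clump-size distribution: q[j] = (1 - a) * a**(j - 1) for j >= 1.
    q = [0.0] + [(1.0 - a) * a ** (j - 1) for j in range(1, n_max + 1)]
    p = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        s = sum(j * q[j] * p[n - j] for j in range(1, n + 1))
        p.append(lam / n * s)
    return p


def upper_tail_pvalue(observed, mean, var):
    """Approximate P(N >= observed) under the fitted distribution."""
    if observed <= 0:
        return 1.0
    lam, a = fit_polya_aeppli(mean, var)
    pmf = compound_poisson_pmf(lam, a, observed - 1)
    return max(0.0, 1.0 - sum(pmf))
```

For instance, `upper_tail_pvalue(15, mean=4.0, var=12.0)` would give the approximate probability of observing at least 15 occurrences of a motif whose count has (hypothetical) mean 4 and variance 12. Computing the tail as one minus a partial sum loses precision for extremely small p-values, so a production version would sum the tail directly.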
We therefore emphasize the recent and promising mixture model for random graphs proposed by Daudin et al. (2008). This model assumes that nodes are spread across several connectivity classes and that the probability that two nodes are connected depends on their classes. The goodness of fit of this model to real biological networks is very satisfactory.
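The mixture model described above can be sketched as follows, assuming an undirected simple graph, independent class labels drawn with proportions `alpha`, and a symmetric matrix `pi` of between-class connection probabilities. These names, and the choice of the triangle as the example motif, are illustrative assumptions and do not come from the abstract. Under these assumptions the expected triangle count has a closed form, which the sketch computes alongside a sampler and a direct counter.

```python
import itertools
import math
import random


def sample_mixture_graph(n, alpha, pi, rng):
    """Draw node classes, then each edge independently given the classes."""
    classes = rng.choices(range(len(alpha)), weights=alpha, k=n)
    edges = set()
    for i, j in itertools.combinations(range(n), 2):
        # pi is assumed symmetric: pi[q][l] == pi[l][q].
        if rng.random() < pi[classes[i]][classes[j]]:
            edges.add((i, j))
    return classes, edges


def expected_triangles(n, alpha, pi):
    """Expected triangle count: C(n, 3) times the probability that a
    given triple of nodes forms a triangle, summed over class triples."""
    q = range(len(alpha))
    p_tri = sum(
        alpha[a] * alpha[b] * alpha[c] * pi[a][b] * pi[b][c] * pi[a][c]
        for a, b, c in itertools.product(q, repeat=3)
    )
    return math.comb(n, 3) * p_tri


def count_triangles(n, edges):
    """Count triangles in a graph given as a set of pairs (i, j) with i < j."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    # Each triangle i < j < k is counted once, through its edge (i, j).
    return sum(1 for i, j in edges for k in adj[i] & adj[j] if k > j)
```

A quick sanity check is to draw many graphs with `sample_mixture_graph(n, alpha, pi, random.Random(seed))` and compare the average of `count_triangles` with `expected_triangles(n, alpha, pi)`. With a single class the model reduces to the Erdős-Rényi model.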

Keywords

biological networks, network motif, count of subgraph, compound Poisson approximation

Subject areas

Mathematics [math], Computer Science [cs], Life Sciences [q-bio]

Main file

singapour_1.html (18.23 KB)

Origin: Files produced by the author(s)

https://hal.inrae.fr/hal-02816548

Submitted on: Saturday, June 6, 2020, 14:00:35

Last modified on: Thursday, March 14, 2024, 03:13:53

Dates and versions

hal-02816548, version 1 (06-06-2020)

Identifiers

HAL Id: hal-02816548, version 1
PRODINRA: 183419

Cite

Sophie S. Schbath. Statistics of biological network motifs : A compound Poisson approximation for their count in random graphs?. Progress in Stein's method, Jan 2009, Singapore, Singapore. 1p. ⟨hal-02816548⟩

Collections

INRA INRAE MATHNUM