Conference paper

The statistical world of motif occurrences along DNA sequences

Abstract: The statistics of motifs have been widely revisited over the last 15 years owing to the increasing availability of genomic sequences. Identifying DNA motifs with biological functions remains a major challenge in genome analysis. Many functional and essential motifs are characteristically very frequent all along the chromosome, concentrated in particular regions (e.g. upstream of genes), or co-oriented with the replication direction. The prediction of functional motifs therefore relies mostly on the statistical properties of pattern occurrences in Markovian sequences. This lecture is mostly devoted to these properties, with a special focus on pattern frequency. How can the count distribution be computed or approximated to assess motif exceptionality? How can one test whether a motif is significantly unbalanced between two (sets of) sequences? How should more complex motifs be handled? What is the distribution of the waiting time between occurrences? How can motif occurrences be modeled to find regions significantly enriched with a given pattern? These questions, among others, are illustrated with examples of functional motifs, and we will see how the Chi motif was identified in Staphylococcus aureus thanks to its statistical properties.
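As a rough illustration of the pattern-frequency question raised in the abstract, the sketch below compares the observed count of a word with a plug-in estimate of its expected count under a first-order Markov model fitted to the same sequence. The function names (`count_overlapping`, `expected_count_markov1`) and the toy sequence are assumptions made for this example and are not taken from the lecture; the estimator shown is one standard choice, not necessarily the one presented there.

```python
def count_overlapping(seq, word):
    """Count occurrences of `word` in `seq`, allowing overlaps."""
    h = len(word)
    return sum(1 for i in range(len(seq) - h + 1) if seq[i:i + h] == word)


def expected_count_markov1(seq, word):
    """Plug-in estimate of the expected count of `word` under an
    order-1 Markov model whose parameters are estimated from `seq`:

        N(w1w2) * N(w2w3) * ... * N(w_{h-1}w_h)
        ---------------------------------------
              N(w2) * ... * N(w_{h-1})

    where N(.) is the observed (overlapping) count in `seq`.
    """
    if len(word) < 2:
        raise ValueError("word must contain at least two letters")
    numerator = 1.0
    for i in range(len(word) - 1):
        numerator *= count_overlapping(seq, word[i:i + 2])
    denominator = 1.0
    for letter in word[1:-1]:  # interior letters only
        denominator *= seq.count(letter)
    # An absent interior letter implies an absent dinucleotide, so the
    # numerator is zero as well; report an expected count of zero.
    return numerator / denominator if denominator else 0.0


# Toy input, purely illustrative (not real genomic data).
sequence = "ACGTACGTAC"
motif = "ACG"
observed = count_overlapping(sequence, motif)
expected = expected_count_markov1(sequence, motif)
print(observed, expected)
```

A motif whose observed count is far above or below this expectation is a candidate for being exceptional; judging significance additionally requires the distribution (or an approximation of it) of the count, which this sketch does not attempt.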
Submitted by: Migration Prodinra
Submitted on: Saturday, 6 June 2020 - 12:19:23
Last modified on: Friday, 12 June 2020 - 10:43:26


  • HAL Id : hal-02814645, version 1
  • PRODINRA : 183420



Sophie Schbath. The statistical world of motif occurrences along DNA sequences. Workshop Hitting, returning and matching in dynamical systems, information theory and mathematical biology, Nov 2008, Eindhoven, Netherlands. 1p. ⟨hal-02814645⟩


