The statistical world of motif occurrences along DNA sequences

Sophie S. Schbath

Conference paper, Year: 2008

Role: Author
PersonId : 183444
IdHAL : sophie-schbath
ORCID : 0000-0003-3574-8222
IdRef : 07553424X

Unité Mathématique Informatique et Génome

Abstract

The statistics of motifs have been widely revisited over the last 15 years, driven by the increasing availability of genomic sequences. Identifying DNA motifs with a biological function remains a major challenge in genome analysis. Many functional and essential motifs share one of the following properties: they are very frequent all along the chromosome, they are concentrated in particular regions (e.g. upstream of genes), or they are co-oriented with the direction of replication. The prediction of functional motifs therefore relies mostly on the statistical properties of pattern occurrences in Markovian sequences. This lecture is mostly devoted to such properties, with a special focus on pattern frequency. How can the count distribution be computed or approximated to assess motif exceptionality? How can one test whether a motif is significantly unbalanced between two sequences, or two sets of sequences? How should more complex motifs be handled? What is the distribution of the waiting time between occurrences? How can motif occurrences be modeled to find regions significantly enriched with a given pattern? Examples of functional motifs will illustrate all these questions, and we will see how the Chi motif was identified in Staphylococcus aureus thanks to its statistical properties.
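As a rough illustration of the first question (assessing whether a word count is exceptional), the sketch below compares the observed count of a word with its expected count under an order-1 Markov model fitted on the same sequence. It is not taken from the lecture: the function names, the toy sequence and the use of a plain Poisson tail are assumptions made for this example. The Poisson tail is only a crude approximation, reasonable for rare words that cannot overlap themselves; self-overlapping words generally call for a compound Poisson or other approximation.

```python
from collections import Counter
from math import exp


def count_overlapping(seq: str, word: str) -> int:
    """Count occurrences of `word` in `seq`, overlapping occurrences included."""
    h = len(word)
    return sum(1 for i in range(len(seq) - h + 1) if seq[i:i + h] == word)


def expected_count_m1(seq: str, word: str) -> float:
    """Plug-in estimate of the expected count of `word` under an order-1
    Markov model whose parameters are estimated from `seq` itself:
    N(w1w2) * N(w2w3) * ... * N(w_{h-1}w_h) / (N(w2) * ... * N(w_{h-1})).
    """
    if len(word) < 2:
        raise ValueError("word must have at least two letters")
    letters = Counter(seq)
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    expected = float(pairs[word[:2]])
    for i in range(1, len(word) - 1):
        if letters[word[i]] == 0:
            return 0.0
        expected *= pairs[word[i:i + 2]] / letters[word[i]]
    return expected


def poisson_upper_tail(observed: int, mean: float) -> float:
    """P(X >= observed) for X ~ Poisson(mean); crude stand-in for the
    true count distribution (assumption of this sketch)."""
    if mean <= 0:
        return 1.0 if observed <= 0 else 0.0
    term = exp(-mean)  # P(X = 0)
    cdf = 0.0
    for k in range(observed):
        cdf += term            # add P(X = k)
        term *= mean / (k + 1)  # move to P(X = k + 1)
    return max(0.0, 1.0 - cdf)


# Toy data, purely illustrative (not a real genome or a real motif).
toy_seq = "ACGTACGTACGT"
toy_word = "ACG"
obs = count_overlapping(toy_seq, toy_word)
exp_m1 = expected_count_m1(toy_seq, toy_word)
print(obs, exp_m1, poisson_upper_tail(obs, exp_m1))
```

A small tail probability would flag the word as over-represented relative to what its dinucleotide composition predicts; on real genomes the choice of Markov order and of the approximation matters and should be checked against the literature.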

Keywords

motif statistics, genomics, word occurrences

Domains

Mathematics [math], Computer Science [cs], Life Sciences [q-bio]


https://hal.inrae.fr/hal-02814645

Submitted on: Saturday, June 6, 2020, 12:19:23

Last modified on: Thursday, March 14, 2024, 03:13:53

Dates and versions

hal-02814645 , version 1 (06-06-2020)

Identifiers

HAL Id : hal-02814645 , version 1
PRODINRA : 183420

Cite

Sophie S. Schbath. The statistical world of motif occurrences along DNA sequences. Workshop Hitting, returning and matching in dynamical systems, information theory and mathematical biology, Nov 2008, Eindhoven, Netherlands. 1p. ⟨hal-02814645⟩


Collections

INRA INRAE MATHNUM

