Journal article, Journal of Computational Biology, Year: 2000

An overview on the distribution of word counts in Markov chains

Abstract

In this paper, we give an overview of the existing results on the statistical distribution of word counts in a Markovian sequence of letters. Results concerning the number of overlapping occurrences, the number of renewals, and the number of clumps will be presented. Counts of single words and also multiple words are considered. Most of the results are approximations as the length of the sequence tends to infinity. We will see that Gaussian approximations switch to (compound) Poisson approximations for rare words. When DNA sequences or proteins are modeled by stationary Markov chains, these results can be used to study the statistical frequency of motifs in a given sequence.
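
The three counts named in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used in the word-count literature, which this record does not spell out: every occurrence is counted for the overlapping count, a renewal is an occurrence that does not overlap a previously counted renewal, and a clump is a maximal run of mutually overlapping occurrences. The function names and the example sequence are hypothetical and chosen only for illustration.

```python
def occurrence_starts(seq, word):
    """Start positions of all (possibly overlapping) occurrences of word in seq."""
    if not word:
        raise ValueError("word must be non-empty")
    return [i for i in range(len(seq) - len(word) + 1) if seq.startswith(word, i)]


def count_overlapping(seq, word):
    """Count every occurrence, overlapping or not."""
    return len(occurrence_starts(seq, word))


def count_renewals(seq, word):
    """Count occurrences scanning left to right, skipping any occurrence
    that overlaps the previously counted one (assumed renewal definition)."""
    count = 0
    next_free = 0  # first position not covered by the last counted renewal
    for start in occurrence_starts(seq, word):
        if start >= next_free:
            count += 1
            next_free = start + len(word)
    return count


def count_clumps(seq, word):
    """Count maximal runs of overlapping occurrences (assumed clump definition):
    a new clump begins when an occurrence does not overlap the previous one."""
    count = 0
    prev = None  # start of the previous occurrence, counted or not
    for start in occurrence_starts(seq, word):
        if prev is None or start >= prev + len(word):
            count += 1
        prev = start
    return count


if __name__ == "__main__":
    seq, word = "ATATATAGGATA", "ATA"  # hypothetical example sequence
    print(count_overlapping(seq, word), count_renewals(seq, word), count_clumps(seq, word))
```

For a word that cannot overlap itself, such as `GG` in the same sequence, the three counts coincide; they differ only for self-overlapping words such as `ATA`.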
No file deposited

Dates and versions

hal-02692799, version 1 (01-06-2020)

Identifiers

  • HAL Id: hal-02692799, version 1
  • PRODINRA: 36381
  • WOS: 000087833300011

Cite

Sophie S. Schbath. An overview on the distribution of word counts in Markov chains. Journal of Computational Biology, 2000, 7 (1-2), pp. 193-201. ⟨hal-02692799⟩

Collections

INRA INRAE MATHNUM