Journal article. Journal of Computational Biology, 1997.

An efficient statistic to detect over- and under-represented words in DNA sequences

Abstract

In this note, we point out a very efficient statistic to detect over- and under-represented words in DNA sequences when Markov chain models are used to represent the sequences. This statistic is missing from the recent review of this important problem, and it appears to be a better measure of the rarity and abundance of words in DNA sequences.

Dates and versions

hal-02685031, version 1 (01-06-2020)

Identifiers

  • HAL Id: hal-02685031, version 1
  • PRODINRA : 135892
  • WOS : A1997XH95200008

Cite

Sophie Schbath. An efficient statistic to detect over- and under-represented words in DNA sequences. Journal of Computational Biology, 1997, 4 (2), pp.189-192. ⟨hal-02685031⟩

Collections

INRA INRAE MATHNUM