An efficient statistic to detect over- and under-represented words in DNA sequences

Sophie Schbath

Journal article, Journal of Computational Biology, Year: 1997

Role: Author
PersonId : 183444
IdHAL : sophie-schbath
ORCID : 0000-0003-3574-8222
IdRef : 07553424X

Unité de biométrie et intelligence artificielle de Jouy

Abstract

In this note, we point out a very efficient statistic to detect over- and under-represented words in DNA sequences when Markov chain models are used to represent the sequences. This statistic is missing from the recent review of this important problem, and appears to be a better measure of the rarity and abundance of words in DNA sequences.

Domains

Bioinformatics, Systems Biology [q-bio.QM]

https://hal.inrae.fr/hal-02685031

Submitted on: Monday, 1 June 2020, 04:17:08

Last modified on: Tuesday, 12 March 2024, 10:43:55

Dates and versions

hal-02685031 , version 1 (01-06-2020)

Identifiers

HAL Id : hal-02685031 , version 1
PRODINRA : 135892
WOS : A1997XH95200008

Cite

Sophie Schbath. An efficient statistic to detect over- and under-represented words in DNA sequences. Journal of Computational Biology, 1997, 4 (2), pp.189-192. ⟨hal-02685031⟩

Collections

INRA INRAE MATHNUM
