Comparison of mapping softwares for next generation sequencing data - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Conference poster, Year: 2010

Comparison of mapping softwares for next generation sequencing data

Abstract

Recent DNA sequencers, usually called "next generation" sequencers, produce reads that are shorter and far more numerous than those of previous sequencers. New alignment tools have been developed for this new type of read. Our study evaluates the efficiency, strengths and weaknesses of these tools. We have identified about 40 software tools currently used to map reads produced by next generation sequencers (NGS) onto known genomes. Our study focuses on reads produced by Illumina sequencers, but also considers the specificities of SOLiD reads (color code).

Methodology: We simulate two sets of 40 bp reads, drawn uniformly from a dataset. To reflect the diversity of genomic data, we use two kinds of datasets: the human genome and a concatenation of 1000 bacterial genomes. The sets contain 10M reads, close to the actual amount produced by NGS instruments. In the first set the reads are error-free; in the second, three mismatches are added at random positions. We use 11 of the most widely used tools (BWA, Novoalign, Bowtie, MOSAIK, MOM, Probematch, SOAP2, Bfast, SHRiMP, maq, and ZOOM) to align the simulated reads on the genome. We monitor several indicators of the performance of each tool: CPU time, memory usage, whether the read is matched at its "initial" position, the number of match positions found for a given read, and the number of match positions that are in the original genome. We also take into account the usability, flexibility, output format and documentation of the tools.
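The simulation and the "initial position" indicator described in the methodology can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the poster: the function names (`simulate_reads`, `hamming`, `match_positions`) are hypothetical, the three mismatches are assumed to be substitutions applied to each read, the genome is a small random toy sequence rather than the human or bacterial datasets, and the exhaustive scan stands in for a real mapper without reflecting how any of the listed tools works.

```python
import random

BASES = "ACGT"


def simulate_reads(genome, n_reads, read_len=40, n_mismatches=0, seed=0):
    """Draw reads uniformly from `genome`; return (start, read) pairs.

    Assumption: mismatches are substitutions at distinct random
    positions within each read.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        # Every valid start position is equally likely.
        start = rng.randrange(len(genome) - read_len + 1)
        read = list(genome[start:start + read_len])
        for pos in rng.sample(range(read_len), n_mismatches):
            # Replace the base with a different one, so it is a true mismatch.
            read[pos] = rng.choice([b for b in BASES if b != read[pos]])
        reads.append((start, "".join(read)))
    return reads


def hamming(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def match_positions(genome, read, max_mismatches):
    """Naive stand-in for a mapper: scan every genome position and keep
    those where the read aligns with at most `max_mismatches` mismatches."""
    k = len(read)
    return [
        i
        for i in range(len(genome) - k + 1)
        if hamming(genome[i:i + k], read) <= max_mismatches
    ]


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy genome; the study uses the human genome and 1000 bacterial genomes.
    genome = "".join(rng.choice(BASES) for _ in range(5000))
    reads = simulate_reads(genome, n_reads=20, n_mismatches=3, seed=1)
    recovered = sum(
        start in match_positions(genome, read, 3) for start, read in reads
    )
    print(f"{recovered}/{len(reads)} reads matched at their initial position")
```

Keeping the start position alongside each simulated read is what makes the "initial position" indicator checkable afterwards; the length of the list returned by `match_positions` corresponds to the "number of match positions found for a given read".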
Main file
PosterJOBIM_FAYOLLE_1.pdf (209.54 KB)
actes_jobim_2010-1_2.pdf (18.2 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02751434, version 1 (03-06-2020)

Identifiers

  • HAL Id: hal-02751434, version 1
  • PRODINRA: 188069

Cite

Julien Fayolle, Jean-François Gibrat, Valentin Loux, Sophie S. Schbath. Comparison of mapping softwares for next generation sequencing data. JOBIM 2010, Sep 2010, Montpellier, France. MABLI: Methods Algorithmes Bio-Informatique LIRMM, pp.176, 2010, Proceedings of JOBIM 2010 - Journées Ouvertes en Biologie, Informatique et Mathématiques - Montpellier. ⟨hal-02751434⟩

Collections

INRA INRAE MATHNUM
10 views
7 downloads
