Conference papers

GnpAnnot community annotation system: features, qualifiers, values

Abstract : By January 2009, 991 complete genomes had been published and 3376 genome sequencing projects were ongoing, leading to an explosion of data that needs to be stored, curated and analyzed. GnpAnnot is a green genomics project that aims to develop a system for structural and functional annotation, supported by comparative genomics and dedicated to plant and bio-aggressor genomes, allowing both automatic prediction and manual curation of genomic objects. The core of GnpAnnot is a community annotation system (CAS) based on GMOD components: Chado / GBrowse / Apollo / Artemis. The system should also make it possible to browse comparative genomics results, build queries, and export gene lists and gene reports in various formats. It should support annotation reconciliation, history, integrity, consistency and updates, as well as the management of public and private projects. To facilitate the work of curators, four steps are crucial: 1. Provide homogeneous features, qualifiers and values for genomic objects; 2. Share a robust CAS: run high-quality combiners / pipelines to automatically predict genomic objects, which are stored in a relational database management system and then made available through fast graphical and textual browsers and powerful editors; 3. Define annotation rules, train the annotators and organize annotation jamborees; 4. Submit the results to public sequence knowledge bases in an easy way. In this work we focus on the first and third steps. We describe a mapping between different known sources: the Sequence Ontology, the DDBJ / EMBL / GenBank feature definition, GFF3, Chado, gene nomenclatures, transposable element classification, and annotation guidelines from various genome project consortia. We propose homogeneous feature keys, qualifiers and value formats, using controlled vocabularies wherever possible, for genes and transposable elements. We also propose rules to annotate, in a coherent way, the structure and function of genes and the structure and classification of transposable elements. These rules could be useful for both automatic prediction and manual curation. Examples of annotations on a BAC sequence of a monocot are presented.

https://hal.inrae.fr/hal-02758372
Contributor : Migration Prodinra
Submitted on : Thursday, June 4, 2020 - 3:33:00 AM
Last modification on : Sunday, April 11, 2021 - 6:58:02 PM

Identifiers

  • HAL Id : hal-02758372, version 1
  • PRODINRA : 33460

Citation

Stéphanie Sidibe-Bocs, Fabrice Legeai, Gaëtan Droc, Mathieu Rouard, Michael Alaux, et al. GnpAnnot community annotation system: features, qualifiers, values. 3rd International Biocuration Conference, Apr 2009, Berlin, Germany. ⟨hal-02758372⟩

