Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Preprint, Working Paper Year: 2022

Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach

Abstract

The dramatic increase in the number of microbe descriptions in databases, reports and papers presents a two-fold challenge for accessing the information: integrating heterogeneous data into a standard ontology-based representation, and normalizing the textual descriptions by semantic analysis. Recent text mining methods offer powerful ways to extract textual information and generate ontology-based representations. This paper describes the design of the Omnicrobe application, which gathers comprehensive information on the habitats, phenotypes and uses of microbes from scientific sources of high interest to the microbiology community. The Omnicrobe database contains around 1 million descriptions of microbe properties, created by analyzing and combining six information sources of various kinds, i.e. biological resource catalogues, sequence database and scientific literature. The microbe properties are indexed by the Ontobiotope ontology, and their taxa are indexed by an extended version of the taxonomy maintained by the National Center for Biotechnology Information. The Omnicrobe application covers all domains of microbiology. It offers easy-to-use support for answering scientific questions about the habitats, phenotypes and uses of microbes through simple and complex ontology-based queries. We illustrate the potential of Omnicrobe with a use case from the food innovation domain.
Main file
2022.07.21.500958v1.full.pdf (1.49 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03828105, version 1 (25-10-2022)

License

Attribution

Identifiers

Cite

Sandra Dérozier, Robert Bossy, Louise Deléger, Mouhamadou Ba, Estelle Chaix, et al. Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach. 2022. ⟨hal-03828105⟩
77 views
45 downloads
