Creating a comprehensive framework for leveraging meta-omic data in environmental biotechnology processes
Abstract
The scientific community needs tools to harness the potential of meta-omic data in environmental biotechnology while ensuring that these data adhere to the FAIR (Findable, Accessible, Interoperable, and Reusable) principles of data management. Although public repositories for sequencing data exist, sequencing data from environmental biotechnology processes remain very difficult to leverage. The MEMOS project focuses on the interoperability of two information systems, DeepOmics and EnviBIS, to create a comprehensive framework for leveraging meta-omic data and metadata.
Firstly, DeepOmics (INRAE-PROSE and DSI-SOLAPP) is an information system designed to store meta-omic data pertaining to environmental biotechnology processes. It enables storing and, soon, querying amplicon sequencing data cross-referenced with the metadata associated with samples from lab-scale reactors, pilots, or industrial sites.
Secondly, EnviBIS (INRAE-LBE) is an information system driven by a set of ontologies for managing data related to environmental biorefinery, utilizing the open-source software suite OpenSILEX developed by UMR MISTEA.
Ensuring interoperability between these two tools will empower researchers to seamlessly access and analyze data from heterogeneous sources, enabling interdisciplinary knowledge aggregation.
A conversion of the relational database within the DeepOmics tool into a structured ontology has been undertaken. This conversion will facilitate alignment with EnviBIS ontologies and promote interoperability between the two tools.
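As an illustration only, the general idea behind a relational-to-ontology conversion can be sketched as a direct mapping in which each table becomes a class, each row an individual, and each column a property. The namespace, table name, and column names below are hypothetical and are not taken from DeepOmics or EnviBIS; the actual MEMOS conversion may follow a different approach.

```python
# Minimal sketch of a direct relational-to-RDF mapping (standard library only).
# BASE, the "Sample" table, and its columns are hypothetical placeholders.
BASE = "http://example.org/deepomics#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_to_triples(table, pk, row):
    """Return N-Triples lines for one row.

    The row becomes an individual typed by its table, and every non-null,
    non-key column becomes a property with a plain string literal value.
    """
    subject = f"<{BASE}{table}/{row[pk]}>"
    triples = [f"{subject} <{RDF_TYPE}> <{BASE}{table}> ."]
    for column, value in row.items():
        if column == pk or value is None:
            continue  # skip the key itself and missing values
        # Escape backslashes and double quotes inside the literal.
        literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{BASE}{column}> "{literal}" .')
    return triples


sample_row = {"id": 7, "reactor_type": "lab-scale", "ph": None}
for line in row_to_triples("Sample", "id", sample_row):
    print(line)
```

Once data are expressed as triples of this kind, aligning them with another ontology amounts to declaring correspondences between the classes and properties on each side.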
In conclusion, the MEMOS project will provide valuable resources to the scientific community, enabling it to harness the potential of meta-omic data in environmental biotechnology. By championing the principles of FAIR data management and facilitating the integration of tools like DeepOmics and EnviBIS, MEMOS paves the way for enhanced research collaborations, knowledge sharing, and data-driven discoveries.