Feedback on a comparative metatranscriptomic analysis

Advances in next-generation sequencing favor the development of more comprehensive ecosystem studies through metatranscriptomic approaches, which provide access to functional information at a good depth of analysis. Through a study of anaerobic digesters treating wastewater contaminated with an anionic surfactant, linear alkylbenzene sulfonate (LAS) [1], we developed a bioinformatics pipeline for the analysis of shotgun metatranscriptomic RNA-seq data. In this pipeline, the raw data are cleaned and pre-processed. Reads corresponding to rRNA are detected and discarded from the datasets. After a normalization step based on k-mer counts, the mRNA reads from all datasets are co-assembled de novo using the Trinity software. Coding regions of the metatranscriptomic assembly are then predicted and annotated. For functional annotation, sequences with matches in the eggNOG and KEGG GENES databases are retrieved to establish functional categories and reconstruct metabolic pathways. For taxonomic classification, the sequences are assigned by comparison to an NCBI nr database. The reads of each dataset are then mapped back individually to the co-assembled contigs. Finally, a count table is constructed; for each predicted gene, it contains the counts obtained in each sample together with the associated taxonomic and functional annotations. After aggregation and statistical analysis, this study made it possible to detect active genes likely involved in each step of LAS biodegradation and to explore the active microbial core associated with LAS degradation.
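The final steps of the pipeline (count table construction, then aggregation) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary layout, and the "unassigned" label are assumptions made for illustration, and the per-sample counts are assumed to have already been extracted from the read mappings.

```python
from collections import defaultdict


def build_count_table(sample_counts, annotations):
    """Merge per-sample gene counts with gene annotations.

    sample_counts: {sample_name: {gene_id: read_count}} (hypothetical layout)
    annotations:   {gene_id: {"taxon": str, "function": str}} (hypothetical layout)
    Returns one row (dict) per predicted gene, sorted by gene id.
    """
    samples = sorted(sample_counts)
    genes = sorted({g for counts in sample_counts.values() for g in counts})
    table = []
    for gene in genes:
        ann = annotations.get(gene, {})
        row = {
            "gene": gene,
            # Genes without annotation are kept and labeled explicitly.
            "taxon": ann.get("taxon", "unassigned"),
            "function": ann.get("function", "unassigned"),
        }
        for sample in samples:
            # A gene with no mapped read in a sample gets a count of 0.
            row[sample] = sample_counts[sample].get(gene, 0)
        table.append(row)
    return table


def aggregate_by(table, key, samples):
    """Sum counts per sample over all genes sharing the same value of `key`."""
    totals = defaultdict(lambda: dict.fromkeys(samples, 0))
    for row in table:
        for sample in samples:
            totals[row[key]][sample] += row[sample]
    return {group: dict(counts) for group, counts in totals.items()}
```

Calling `aggregate_by(table, "function", samples)` or `aggregate_by(table, "taxon", samples)` gives the per-category or per-taxon totals on which a statistical comparison between samples could then be run.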

Domains

Mathematics [math] Computer Science [cs] Life Sciences [q-bio]

Main file

2019_Midoux_JOBIM 2019_Poster_1 (1.34 MB)

Origin	Files produced by the author(s)

https://hal.inrae.fr/hal-02735859

Submitted on: Tuesday, June 2, 2020, 16:44:37

Last modified on: Wednesday, November 13, 2024, 17:02:04

Long-term archiving on: Wednesday, December 2, 2020, 15:36:08

Dates and versions

hal-02735859, version 1 (02-06-2020)

Identifiers

HAL Id: hal-02735859, version 1
PRODINRA: 481552

Cite

Cedric Midoux, Tiago P. Delforno, Thais Z. Macedo, Gileno V. Lacerda, Olivier Rué, et al. Feedback on a comparative metatranscriptomic analysis. JOBIM 2019: Journées Ouvertes Biologie, Informatique et Mathématiques, Jul 2019, Nantes, France. 2019, JOBIM 2019. ⟨hal-02735859⟩

Collections

IRSTEA INRA UNIV-PARIS-SACLAY INRAE GS-COMPUTER-SCIENCE MAIAGE
