ReClustOR: a re‐clustering tool using an open‐reference method that improves operational taxonomic unit definition - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article. Methods in Ecology and Evolution. Year: 2020

ReClustOR: a re‐clustering tool using an open‐reference method that improves operational taxonomic unit definition

Christophe Djemiel
Battle Karimi
Samuel Dequiedt
Pierre‐Alain Maron
Lionel Ranjard

Abstract

1. Environmental microbial communities are now widely studied using metabarcoding approaches, thanks to the democratization of high-throughput DNA sequencing technologies. The massive number of reads produced by these technologies requires bioinformatic solutions for processing. A key step in the analysis is to cluster reads into Operational Taxonomic Units (OTUs) and thus reduce the amount of data for downstream analyses. Because the clustering method strongly affects the quantity and quality of OTUs, balancing the reliability of the chosen strategy against its computing time is a real challenge. The present article proposes a new post-clustering tool called ReClustOR, aimed at improving the stability and reliability of OTUs whatever the initial clustering method.
2. We compared several clustering methods: a homemade de novo method, VSEARCH, Swarm, ReClustOR associated with each of these three clustering methods, and the exact sequence variant (ESV) definition, using two datasets (a simulated one and an environmental one). All methods were analyzed for their ability to efficiently describe microbial diversity in terms of alpha-diversity, beta-diversity and phylogeny.
3. Dataset analysis showed that post-clustering with ReClustOR improved OTU detection not only in terms of diversity, but also in terms of reliability and stability compared to the initial clustering methods. More precisely, the post-clustering step improved the congruence of the results (alpha-diversity, beta-diversity, composition) whatever the initial clustering method. Moreover, by defining a database of centroids, ReClustOR removes the need to re-cluster all the reads each time new reads are generated.
4. ReClustOR is a new post-clustering method that overcomes problems (OTU stability and reliability) associated with classical clustering methods and thereby increases the quality and the congruence of the reconstructed OTUs.
Moreover, the OTU database defined with ReClustOR can be used as a reference gradually enriched by merging new studies and samples. In this way, huge datasets (e.g. the Earth Microbiome Project or the Tara Oceans project) can be used as references for other projects within their range of application, and increase the quality of comparisons among studies and datasets.
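
The open-reference idea summarized in the abstract (reads are matched against a database of centroids, and only unmatched reads seed new OTUs) can be illustrated with a minimal Python sketch. This is not the ReClustOR implementation: the function names, the 0.9 threshold, the `OTU_n` naming scheme and the use of `difflib.SequenceMatcher` as a stand-in for a real alignment-based identity are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; an assumed stand-in for a pairwise alignment identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def assign_open_reference(reads, centroids, threshold=0.9):
    """Assign each read to its best-matching centroid, or seed a new OTU.

    `centroids` maps a hypothetical OTU id to its centroid sequence and is
    extended in place, so it can be reused when new reads arrive.
    Returns a dict mapping OTU id to the number of reads assigned.
    """
    counts = {}
    for read in reads:
        best_id, best_score = None, 0.0
        for otu_id, seq in centroids.items():
            score = identity(read, seq)
            if score > best_score:
                best_id, best_score = otu_id, score
        if best_id is None or best_score < threshold:
            # No centroid is close enough: the read becomes a new centroid.
            # Assumes ids are sequential, so this name does not collide.
            best_id = f"OTU_{len(centroids) + 1}"
            centroids[best_id] = read
        counts[best_id] = counts.get(best_id, 0) + 1
    return counts


# Hypothetical toy data: one known centroid, then two successive batches of reads.
centroid_db = {"OTU_1": "ACGTACGTACGTACGTACGT"}
first_batch = ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA", "TTTTGGGGCCCCAAAATTTT"]
first_counts = assign_open_reference(first_batch, centroid_db)

# A later batch is matched against the enriched database only;
# the first batch is not re-clustered.
second_counts = assign_open_reference(["TTTTGGGGCCCCAAAATTTT"], centroid_db)
```

In this toy run the third read of the first batch matches no centroid and is added to the database, so the identical read in the second batch is assigned to it directly. A real pipeline would replace `identity` with an alignment-based measure and would need a more careful rule for choosing centroids.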

Dates and versions

hal-03129732, version 1 (03-02-2021)

Identifiers

HAL Id: hal-03129732
DOI: 10.1111/2041-210x.13316

Cite

Sébastien Terrat, Christophe Djemiel, Corentin Journay, Battle Karimi, Samuel Dequiedt, et al. ReClustOR: a re‐clustering tool using an open‐reference method that improves operational taxonomic unit definition. Methods in Ecology and Evolution, 2020, 11 (1), pp.168-180. ⟨10.1111/2041-210x.13316⟩. ⟨hal-03129732⟩