Robust estimation of phylogenetic diversity: steer clear of rare species

Abstract : It is seldom possible to obtain an exhaustive census of an ecological community. Instead, most communities, especially microbial ones, are studied by taking a sample that is small relative to the size of the whole community. Here we systematically study the effect of sampling on the estimation of phylogenetic diversity (PD). Typically, most rare species in the community are unseen in the sample, so that the PD observed in the sample is smaller than the PD of the whole community. We show that for some PD measures the missing diversity can be estimated quite accurately, while for other PD measures this estimation is impossible. To do so, we study the sensitivity of the PD measures to assumptions about the abundance distribution and the branching structure of the rare species in the community that are unseen in the sample. We estimate the number of such rare species using a refinement of the non-parametric Good-Turing estimator. We show that Faith’s PD, the most widely used PD measure, is highly sensitive to the assumptions about the rare species; hence, it cannot be reliably estimated. We also show that Rao’s PD is only weakly sensitive to these assumptions; hence, it can be reliably estimated. Therefore, we advise ecologists to use Rao’s PD instead of Faith’s PD, especially when estimating PD from a (possibly large) sample that corresponds to a small fraction of the community.
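
The contrast drawn in the abstract can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration and is not taken from the paper: the four-species toy tree, the abundances, and the function names are hypothetical. Faith's PD is computed as the total branch length joining the species to the root, Rao's PD as the abundance-weighted mean patristic distance between two individuals (other conventions include a factor 1/2), and the Good-Turing function is the classic estimate of the total relative abundance of unseen species, not the refinement used by the authors.

```python
from itertools import combinations

# Hypothetical toy tree (not from the paper): child -> (parent, branch length).
TREE = {
    "A": ("X", 1.0), "B": ("X", 1.0),
    "C": ("Y", 1.0), "D": ("Y", 1.0),
    "X": ("root", 1.0), "Y": ("root", 1.0),
}


def edges_to_root(node):
    """Set of edges (each named by its child node) from `node` up to the root."""
    edges = set()
    while node in TREE:
        edges.add(node)
        node = TREE[node][0]
    return edges


def faith_pd(species):
    """Faith's PD: total branch length of the subtree joining the species to the root."""
    spanned = set()
    for s in species:
        spanned |= edges_to_root(s)
    return sum(TREE[e][1] for e in spanned)


def patristic(a, b):
    """Path length between two tips: edges on exactly one of the two root paths."""
    return sum(TREE[e][1] for e in edges_to_root(a) ^ edges_to_root(b))


def rao_pd(abundance):
    """Rao's quadratic entropy: sum over ordered pairs of p_i * p_j * d_ij."""
    total = sum(abundance.values())
    return sum(
        2 * (abundance[a] / total) * (abundance[b] / total) * patristic(a, b)
        for a, b in combinations(abundance, 2)
    )


def good_turing_unseen_mass(counts):
    """Classic Good-Turing estimate: (number of singletons) / (sample size)."""
    counts = list(counts)
    singletons = sum(1 for c in counts if c == 1)
    return singletons / sum(counts)


# Whole community (relative abundances) with one rare species, D.
community = {"A": 0.50, "B": 0.30, "C": 0.19, "D": 0.01}
# A sample of 100 individuals (counts) in which D went unseen.
sample = {"A": 50, "B": 30, "C": 20}

print("Faith:", faith_pd(community), "vs sample", faith_pd(sample))
print("Rao:  ", rao_pd(community), "vs sample", rao_pd(sample))
print("Unseen mass:", good_turing_unseen_mass([50, 30, 18, 1, 1]))
```

In this toy case, missing the single rare species D removes one of six branch-length units from Faith's PD, while Rao's PD changes only in the third significant digit, because each term involving D is weighted by its small abundance. This is only a qualitative illustration of the sensitivity argument, not a reproduction of the authors' analysis.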
Document type :
Conference papers

https://hal.inrae.fr/hal-02739285
Contributor : Migration Prodinra
Submitted on : Tuesday, June 2, 2020 - 9:47:50 PM
Last modification on : Friday, October 15, 2021 - 6:52:03 PM

Identifiers

  • HAL Id : hal-02739285, version 1
  • PRODINRA : 306223

Citation

Thibault Latrille, Bart Haegeman, Jérôme Hamelin. Robust estimation of phylogenetic diversity: steer clear of rare species. Bioinformatique pour la Génomique Environnementale (BGE2014), May 2014, Lyon, France. ⟨hal-02739285⟩
