High substitution rate estimates from temporally sampled sequences: are they biased or biologically meaningful?
Abstract
Genetic typing of subfossil remains is becoming a standard technique, allowing ancient DNA (aDNA) to be compared with contemporaneous modern DNA (mDNA) sequences. Alongside this, statistical tools have been developed for the analysis of serially sampled sequences. These methods use the time intervals between samples to make inferences about biologically meaningful parameters. A number of recent studies have applied these analyses, as implemented in the software BEAST, to data sets containing aDNA samples (e.g. Adélie penguin, bison, cave bear, aurochs, Neanderthal), and the substitution rate estimates obtained were substantially higher than those traditionally recognized. These high rate estimates have been attributed to deleterious transient polymorphisms (destined to be eliminated eventually by purifying selection) persisting for longer than previously believed, and have been put forward as evidence for the time dependency of molecular rate estimates (Ho et al., 2005, 2007). In this work we explore alternative explanations for the high substitution rate estimates obtained in studies of aDNA. We argue that purely demographic processes can change the proportions of shared lineages, increasing the dissimilarity between aDNA and mDNA and thus biasing substitution rate estimates upward. To test this hypothesis, sequence data were simulated under three scenarios: (1) constant population size, (2) a bottlenecked population and (3) structured populations; the simulated data were then analyzed with BEAST. The substitution rate estimates obtained under constant population size were reasonably similar to the simulated mutation rate. Under the other two scenarios, however, substitution rate estimates were biased upward, by up to an order of magnitude in some cases.
Although this result points to a bias in estimates of the molecular rate of change, the estimates also reflect change at the population level (higher dissimilarity between aDNA and mDNA) under these scenarios.
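The demographic argument can be illustrated with a small expected-value calculation. This is only a toy sketch and not the BEAST analysis described above: it assumes a haploid population, a standard two-lineage coalescent, and a naive moment-style rate estimator (ancient-to-modern dissimilarity minus modern diversity, divided by sample age) chosen purely for illustration. All function names and parameter values are hypothetical.

```python
import math


def expected_modern_coalescence(n_recent, n_ancient, age):
    """Expected coalescence time (generations) of two modern lineages.

    Assumed model: haploid size n_recent from the present back to `age`
    generations ago, and n_ancient before that.
    """
    return n_recent + (n_ancient - n_recent) * math.exp(-age / n_recent)


def naive_rate_estimate(mu, n_recent, n_ancient, age):
    """Naive rate estimate from expected pairwise dissimilarities.

    d_am: expected ancient-to-modern differences per site. The two lineages
          can only coalesce before the ancient sample's age, so the path
          between them has expected length age + 2 * n_ancient.
    d_mm: expected differences per site between two modern sequences.
    The estimator (d_am - d_mm) / age is an illustrative assumption.
    """
    d_am = mu * (age + 2.0 * n_ancient)
    d_mm = 2.0 * mu * expected_modern_coalescence(n_recent, n_ancient, age)
    return (d_am - d_mm) / age


MU = 1e-7    # hypothetical mutation rate per site per generation
AGE = 5000   # hypothetical age of the ancient sample, in generations

# Scenario 1: constant size, so the estimator recovers the mutation rate.
constant = naive_rate_estimate(MU, 10_000, 10_000, AGE)

# Scenario 2: a tenfold reduction in size after the ancient sample was taken.
bottleneck = naive_rate_estimate(MU, 1_000, 10_000, AGE)

print(f"constant size: {constant / MU:.2f} x true rate")
print(f"bottleneck:    {bottleneck / MU:.2f} x true rate")
```

Under these assumptions, the constant-size case returns the simulated rate, while the bottleneck case returns a rate several times higher. The bottleneck reduces modern diversity without reducing the ancient-to-modern dissimilarity, which is the kind of upward bias the simulations were designed to test.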