A first step towards the characterization of endogenous retroviruses’ evolutionary history and impact on small ruminant genomes
Abstract
Endogenous retroviruses (ERVs) are LTR retrotransposons derived from ancient retroviral infections and represent up to 10% of mammalian genomes. In small ruminants, unlike in most species, an ERV family coexists with its exogenous counterparts. Apart from this particular family, no comprehensive study has been carried out on sheep or goat ERVs, and their dynamics within these genomes remain largely unknown.
To characterize the small ruminant ERV families, an automated pipeline for de novo ERV identification, followed by manual curation steps, was applied to four reference assemblies of domestic and wild sheep and goats and one cattle assembly in order to generate consensus sequences. Class I and class II ERV families were established, whereas no class III ERV family could be confidently identified. Among these families, six are shared by all small ruminants and four are also observed in cattle, suggesting that some ERVs integrated into ruminant genomes before speciation within the Bovidae family 17 million years ago. The newly described consensus sequences were used to annotate 10 additional domestic sheep and goat assemblies, yielding an estimated ERV content of 0.5 to 1% of the genome depending on the assembly analyzed. The insertional landscape highlights significant copy number differences between ERV families, both within and between species, suggesting lineage-specific insertion dynamics.
In this study, we established the first set of small ruminant ERV consensus sequences. This is an important step towards deciphering ERV insertion polymorphism at the population scale, and it opens the way to transcriptomic studies aimed at elucidating the functional impact of ERVs on sheep and goat genomes.