ON THE LIMITS IN USING A SINGLE REFERENCE GENOME FOR POPULATION GENOMICS: TOWARDS PAN-GENOMES
Abstract
Since the advent of next-generation sequencing (NGS) methods, whole-genome sequencing has been used for population genomics, including in honey bees. In this approach, relatively cheap parallel sequencing generates large quantities of short reads (usually 150 bp long) for each individual sample, and these reads must be aligned to a reference genome. Genetic markers, usually SNPs, are then detected as sequence differences between the aligned short reads and the reference genome. The limitation of this approach is that it depends strongly (i) on the quality of the reference genome and (ii) on the fact that the reference is built from a single individual. Quality issues have now largely been addressed thanks to the great progress made in long-read sequencing technologies, but reliance on a single individual for the reference remains a major issue. We have produced a reference genome for the black bee Apis mellifera mellifera by PacBio long-read sequencing and used it in a small population genomics study. The results show that the origin of the reference genome influences the quality of the results obtained. This is especially the case when studying mitochondrial DNA, as reads can be misaligned because of the presence of nuclear mitochondrial (NUMT) segments. In the future, such problems may be solved by pan-genome approaches.
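The dependence of SNP detection on the chosen reference can be illustrated with a minimal sketch. This is not the pipeline used in the study: the sequences are made up, the reads are assumed to be already aligned without gaps, and the function name `call_snps` and its thresholds `min_depth` and `min_fraction` are hypothetical choices for illustration only.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_depth=2, min_fraction=0.8):
    """Toy pileup caller (illustrative assumption, not a real variant caller).

    aligned_reads is a list of (start, sequence) pairs: 0-based start
    positions of ungapped alignments on the reference.
    Returns a list of (position, reference_base, observed_base).
    """
    # Count the bases observed at each reference position.
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        base, n = counts.most_common(1)[0]
        # Report a SNP when the dominant read base differs from the reference.
        if base != reference[pos] and n / depth >= min_fraction:
            snps.append((pos, reference[pos], base))
    return snps


# Two made-up references that differ from each other at position 5.
ref_a = "ACGTACGTACGT"
ref_b = "ACGTATGTACGT"

# A made-up sample that matches ref_b at position 5 and differs
# from both references at position 9.
sample = "ACGTATGTAAGT"
reads = [(s, sample[s:s + 8]) for s in (0, 2, 4)]

snps_vs_a = call_snps(ref_a, reads)
snps_vs_b = call_snps(ref_b, reads)
print(snps_vs_a)
print(snps_vs_b)
```

In this toy case the same reads yield two SNPs against `ref_a` but only one against `ref_b`, because a site counts as a variant only relative to whichever individual happened to supply the reference.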
| Origin | Files produced by the author(s) |
|---|---|
