Prioritizing Regulatory Variants in a Quantitative Trait Locus (QTL)
Abstract
Better functional characterization of livestock genomes is a lever for linking genome to phenome. However, data describing gene regulation mechanisms and chromatin state across experimental conditions are scarce for livestock species compared with common model organisms. To overcome this bottleneck, predictive biology is a promising alternative. Human and mouse are phylogenetically close to pig, so we assume that the underlying molecular mechanisms are similar. In addition, far more data are available for these two species, which is a prerequisite for training powerful deep learning algorithms.
We focused our analysis on a genomic region known to be associated with production traits in pigs. The combination of experimental design, genetic parameters and high-throughput sequencing led to a first reduction of the candidate variants from 7146 to 1616. Variant effect predictions using VEP[1] suggested regulatory variants as the likely causal mutations. To go further, two neural networks, DeepBind[2] and Enformer[3], were evaluated for their ability to use human and mouse data for predictions on the pig genome. Preliminary results showed that the two models appear complementary, both in their capacity to predict correctly and in the diversity of their predictions. They were then used to calculate scores estimating the impact of each allele. These impact scores will be used to prioritize variants for functional validation.
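The abstract does not specify how the impact scores are computed. One common way to score the impact of an allele with a sequence-based model is to compare the model's predictions for the reference and the alternate sequence. The sketch below illustrates that idea only; it is not the authors' method. The names `apply_snv`, `impact_score`, `rank_variants` and `toy_predict` are hypothetical, the choice of the largest absolute change as the score is an assumption, and `toy_predict` is a toy stand-in for a trained network such as DeepBind or Enformer.

```python
from typing import Callable, List, Sequence, Tuple

# A predictor maps a DNA sequence to a vector of predicted signals
# (for example, one value per assay or per transcription factor).
Predictor = Callable[[str], Sequence[float]]
# A single-nucleotide variant: (0-based position, reference allele, alternate allele).
Variant = Tuple[int, str, str]


def apply_snv(sequence: str, variant: Variant) -> str:
    """Return the sequence carrying the alternate allele."""
    pos, ref, alt = variant
    if sequence[pos] != ref:
        raise ValueError(f"expected {ref!r} at position {pos}, found {sequence[pos]!r}")
    return sequence[:pos] + alt + sequence[pos + 1:]


def impact_score(predict: Predictor, sequence: str, variant: Variant) -> float:
    """ASSUMPTION: score = largest absolute change across predicted signals."""
    ref_pred = predict(sequence)
    alt_pred = predict(apply_snv(sequence, variant))
    return float(max(abs(a - r) for a, r in zip(alt_pred, ref_pred)))


def rank_variants(
    predict: Predictor, sequence: str, variants: Sequence[Variant]
) -> List[Tuple[Variant, float]]:
    """Sort variants from highest to lowest impact score."""
    scored = [(v, impact_score(predict, sequence, v)) for v in variants]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def toy_predict(sequence: str) -> List[float]:
    """ASSUMPTION: toy stand-in for a trained model, not a real predictor.

    Signal 0 counts GATA motifs; signal 1 is the fraction of G bases.
    """
    return [float(sequence.count("GATA")), sequence.count("G") / len(sequence)]


if __name__ == "__main__":
    window = "AAGATAAA"
    candidates = [(0, "A", "T"), (2, "G", "C")]
    for variant, score in rank_variants(toy_predict, window, candidates):
        print(variant, score)
```

With the toy predictor, the variant that disrupts the GATA motif ranks first. In a real analysis, `toy_predict` would be replaced by a wrapper around the trained network, and the sequence window would be centered on each variant.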
To conclude, the neural networks showed a good capacity to predict experimental results from pig DNA sequences although they were trained on human data. We hope this approach will help us associate variants with the molecular mechanisms responsible for the variability of the phenotypes of interest. Once the method has been validated on this region as a proof of concept, the analysis will be extended to the whole pig genome, and the resulting predictions will enrich a database accessible to the scientific community.
This work was supported by French government funding managed by the Agence Nationale de la Recherche under the France 2030 program, reference ANR-22-PEAE-0015.