Development of a knowledge graph framework to ease and empower translational approaches in plant research: a use-case on grain legumes
Abstract
Legumes, and especially pulses, are an important source of protein for food and feed, and are valued for their positive impact on “One Health”. However, their unstable yields and their susceptibility to biotic and abiotic stresses highlight the need for varietal improvement in order to increase cultivated areas and productivity. With the advent of sequencing technologies, a large pool of genetic and -omics resources, heterogeneous at both the inter- and intra-species scales, is emerging. It is therefore important to capitalize on these scattered, heterogeneous data to develop translational research that can boost breeding projects and crop diversification. To meet this need, we undertook the development of the Orthology-driven knowledge base framework for translational research (Ortho_KB). For a set of species of interest, it infers orthologous relationships between genes, proposes associated syntenic blocks between chromosomes and creates a graph database linking genetic and RNA-seq data. To explore the possibilities of this framework, we populated Ortho_KB to obtain OrthoLegKB, an instance dedicated to legumes. This database includes four cultivated crops, namely Pisum sativum, Vicia faba, Lens culinaris and Vigna radiata, and the model legume Medicago truncatula. Available information on quantitative trait loci (QTL) for multiple traits is being integrated, as well as expression data. The proposed database model was evaluated by studying the conservation of a flowering-promoting gene