INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, BMC Proceedings. Year: 2009

Predicting qualitative phenotypes from microarray data - the Eadgene pig data set

Abstract

Background: The aim of this work was to study the performance of 2 predictive statistical tools on a data set that was given to all participants of the Eadgene-SABRE Post Analyses Working Group, namely the Pig data set of Hazard et al. (2008). The data consisted of the expression levels of 3686 genes measured on 24 animals partitioned into 2 genotypes and 2 treatments. The objective was to find, within the whole set of genes, biomarkers that characterized the genotypes and the treatments. Methods: We first considered the Random Forest approach, which enables the selection of predictive variables. We then compared classical Partial Least Squares regression (PLS) with a novel approach called sparse PLS, a variant of PLS that applies a lasso penalty and thereby selects a subset of variables. Results: All methods performed well on this data set. Sparse PLS outperformed PLS in terms of prediction performance and improved the interpretability of the results. Conclusion: We recommend the use of machine learning methods such as Random Forest and multivariate methods such as sparse PLS for prediction purposes. Both approaches are well suited to transcriptomic data, where the number of features is much greater than the number of individuals.
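The variable selection that the abstract attributes to sparse PLS can be illustrated with a minimal sketch. This is not the authors' implementation and it does not use the Eadgene pig data: it runs on synthetic data, codes two classes as +1 and -1, and computes only the first loading vector for a single response. Under those assumptions, the ordinary PLS weights are proportional to the covariances between each feature and the response, and the lasso penalty amounts to soft-thresholding those weights so that most become exactly zero. The function names, the `n_keep` parameter, and the choice of 500 features with 5 informative ones are all hypothetical.

```python
import math
import random


def soft_threshold(value, penalty):
    """Lasso-style shrinkage: move value toward zero, setting small ones to 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def center(column):
    """Subtract the mean from every entry of a column."""
    mean = sum(column) / len(column)
    return [v - mean for v in column]


def sparse_pls_loading(X, y, n_keep):
    """First sparse PLS loading vector for a single response y.

    X is a list of samples (rows), each a list of feature values.
    The penalty is set so that at most n_keep loadings stay non-zero.
    """
    n, p = len(X), len(X[0])
    yc = center(y)
    cols = [center([X[i][j] for i in range(n)]) for j in range(p)]
    # Ordinary PLS weights: one covariance-like score per feature (X^T y).
    w = [sum(c[i] * yc[i] for i in range(n)) for c in cols]
    # Use the (n_keep + 1)-th largest |weight| as the penalty.
    ranked = sorted((abs(v) for v in w), reverse=True)
    penalty = ranked[n_keep] if n_keep < p else 0.0
    sparse_w = [soft_threshold(v, penalty) for v in w]
    norm = math.sqrt(sum(v * v for v in sparse_w)) or 1.0
    return [v / norm for v in sparse_w]


# Synthetic example (assumption): 24 samples, 500 features, two classes.
# Only the first 5 features carry a class effect; the rest are pure noise.
rng = random.Random(0)
labels = [1.0] * 12 + [-1.0] * 12
data = [
    [(2.0 * lab if j < 5 else 0.0) + rng.gauss(0.0, 1.0) for j in range(500)]
    for lab in labels
]

loading = sparse_pls_loading(data, labels, n_keep=5)
selected = [j for j, v in enumerate(loading) if v != 0.0]
print("selected feature indices:", selected)
```

The features with non-zero loadings play the role of the selected biomarkers. In practice the number of variables to keep would be tuned, for example by cross-validation, and further components would be extracted after deflation; neither step is shown here.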
Main file
2009_Robert_Granie_BMC_Proceedings_1.pdf (494.64 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02653385, version 1 (29-05-2020)

Cite

Christèle Robert-Granié, Kim-Anh Lê Cao, Magali San Cristobal. Predicting qualitative phenotypes from microarray data - the Eadgene pig data set. BMC Proceedings, 2009, 3, online (Suppl. 4), unpaginated. ⟨10.1186/1753-6561-3-S4-S13⟩. ⟨hal-02653385⟩