To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, Journal of Experimental Botany, Year: 2019

To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?

Abstract

Based on case studies, we discuss the extent to which genome-wide association studies (GWAS) are affected by outlier plants, i.e. those deviating from the expected distribution on a multi-criteria basis. Using a raw dataset consisting of daily measurements of leaf area, biomass, and plant height for thousands of plants, we tested three different cleaning methods for their effects on genetic analyses. No-cleaning resulted in the highest number of dubious quantitative trait loci, especially at loci with highly unbalanced allelic frequencies. A trade-off was identified between the risk of false-positives (with no-cleaning and/or a low threshold for minor allele frequency) and the risk of missing interesting rare alleles. Cleaning can lower the risk of the latter by making it possible to choose a higher threshold in GWAS.
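The role of the minor allele frequency (MAF) threshold mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 0/1/2 genotype coding, the toy marker data, and the threshold values are all assumptions made for illustration. It only shows that a stricter threshold discards a rare-allele marker that a more lenient one keeps.

```python
from typing import Dict, List


def minor_allele_frequency(genotypes: List[int]) -> float:
    """MAF for one marker; genotypes are assumed to be diploid calls
    coded as 0, 1 or 2 copies of the alternate allele."""
    if not genotypes:
        raise ValueError("no genotype calls for this marker")
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def filter_markers_by_maf(
    markers: Dict[str, List[int]], threshold: float
) -> Dict[str, float]:
    """Return the MAF of every marker whose MAF is at least `threshold`."""
    mafs = {name: minor_allele_frequency(calls) for name, calls in markers.items()}
    return {name: maf for name, maf in mafs.items() if maf >= threshold}


# Hypothetical toy data: ten plants, two markers.
markers = {
    "snp_common": [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],
    "snp_rare": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
}

kept_strict = filter_markers_by_maf(markers, threshold=0.10)   # drops the rare marker
kept_lenient = filter_markers_by_maf(markers, threshold=0.05)  # keeps both markers
```

In this toy case the rare marker is tested only under the lenient threshold, which is exactly where a single outlier plant carrying the rare allele has the most influence on the association test.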
Main file
Alvarez-Prado-JEB-2019-CC-BY-NC_1.pdf (1.36 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02619451 , version 1 (25-05-2020)

License

Attribution - NonCommercial (CC BY-NC)

Identifiers

HAL Id: hal-02619451
DOI: 10.1093/jxb/erz191

Cite

Santiago Alvarez Prado, Isabelle Sanchez, Llorenç Cabrera Bosquet, Antonin Grau, Claude Welcker, et al. To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? Journal of Experimental Botany, 2019, 70 (15), pp.3693-3698. ⟨10.1093/jxb/erz191⟩. ⟨hal-02619451⟩
