Behaviour of the LR (linear regression) method for measuring bias and accuracy
Abstract
Cross-validation is the most widespread method for estimating predictive ability in animal selection schemes; however, the quality of its results can be problematic. The Linear Regression (LR) method compares estimated breeding values (EBVs) obtained with old (‘partial’) and old+new (‘whole’) data to infer biases and accuracies. In this work, we present preliminary results on the behaviour of the LR method using simulated data. For heritabilities (h²) of 0.5 and 0.1, 20 populations were simulated with the QMSim software. In the simulation, individuals were selected on BLUP evaluations and matings were arranged to reduce the average inbreeding and kinship of the population. Only sires born in generation 5 with at least 5 daughters in generation 6 were used. Pedigree-based BLUP evaluations were performed using a partial data set (without the daughters’ information) and a whole data set (with the daughters’ information). Statistics were computed between the EBVs from the partial (EBVp) and whole (EBVw) data sets, and between EBVp and the true breeding values (TBV) obtained from the simulation. Five statistics were considered: (1) bias, the difference between the average EBVp and the average EBVw, with an expected value of 0; (2) slope, the regression of EBVw on EBVp, with an expected value of 1; and three accuracy-related statistics: (3) the correlation between EBVp and EBVw (ρpw), proportional to the increase in accuracy; (4) the covariance between EBVp and EBVw, proportional to the accuracy of the partial evaluation; and (5) the regression of EBVp on EBVw, proportional to the increase in reliabilities. All statistics were also calculated with TBV substituted for EBVw, to ascertain whether statistics based on EBVw predict the results obtained with TBV. The statistics from the EBVp-EBVw and EBVp-TBV comparisons were almost identical for both values of h². The largest differences were observed for ρpw (h² of 0.1: EBVp-EBVw=0.36 and EBVp-TBV=0.27; h² of 0.5: EBVp-EBVw=0.32 and EBVp-TBV=0.27), which was always overestimated in the EBVp-EBVw comparison.
In short, biases and accuracies were correctly estimated using statistics from ‘whole’ and ‘partial’ genetic evaluations. The similarity of the results obtained using EBVw or TBV suggests that the statistics proposed in the LR method could be useful for measuring bias and accuracy in breeding schemes. Further work will include genomic information.
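The five statistics described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function name `lr_statistics`, the dictionary keys and the toy numbers are assumptions introduced here, and the inputs are assumed to be EBVs of the same focal animals (here, the selected sires) in the same order. Passing TBV in place of EBVw gives the EBVp-TBV version of each statistic.

```python
import numpy as np


def lr_statistics(ebv_partial, ebv_whole):
    """Compute the five partial-vs-whole statistics for one set of focal animals.

    Hypothetical helper: names and return format are illustrative only.
    """
    p = np.asarray(ebv_partial, dtype=float)
    w = np.asarray(ebv_whole, dtype=float)
    if p.shape != w.shape or p.ndim != 1 or p.size < 2:
        raise ValueError("inputs must be 1-D arrays of equal length (>= 2)")

    # Sample covariance between partial and whole EBVs
    cov_pw = np.cov(p, w, ddof=1)[0, 1]

    return {
        # (1) bias: mean(EBVp) - mean(EBVw), expected value 0
        "bias": p.mean() - w.mean(),
        # (2) slope: regression of EBVw on EBVp, expected value 1
        "slope_w_on_p": cov_pw / p.var(ddof=1),
        # (3) correlation between EBVp and EBVw
        "corr_pw": np.corrcoef(p, w)[0, 1],
        # (4) covariance between EBVp and EBVw
        "cov_pw": cov_pw,
        # (5) regression of EBVp on EBVw
        "slope_p_on_w": cov_pw / w.var(ddof=1),
    }


if __name__ == "__main__":
    # Toy values for illustration only; these are NOT data from the study.
    ebv_p = [0.10, -0.25, 0.40, 0.05, -0.30]
    ebv_w = [0.20, -0.35, 0.55, -0.05, -0.40]
    for name, value in lr_statistics(ebv_p, ebv_w).items():
        print(f"{name}: {value:.3f}")
```

In a simulation such as the one described, the function would be called once per replicate with (EBVp, EBVw) and once with (EBVp, TBV), and the two sets of statistics averaged over replicates for comparison.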