Massive spectral data analysis for plant breeding using parSketch-PLSDA method: Discrimination of sunflower genotypes

In precision agriculture and plant breeding, data volumes keep growing and the data are becoming increasingly complex, which makes them difficult to manage and analyse. Optical instruments such as near-infrared (NIR) spectroscopy and hyperspectral imaging are increasingly deployed directly in the field, enlarging spectral databases. These tools provide rapid, non-destructive measurements that can be used to classify new varieties according to breeding objectives, but processing such massive amounts of spectral data is challenging. In the context of genotype discrimination, we propose applying a method called parSketch-PLSDA to analyse this kind of data. ParSketch-PLSDA combines an indexing strategy (parSketch) with PLSDA, the reference method for predicting classes from multivariate data. For this purpose, a spectral database of 1,300,000 spectra was built from hyperspectral images of leaves of four different sunflower genotypes. ParSketch-PLSDA is compared with PLSDA alone, with both methods using the same calibration and test sets. The prediction model obtained by PLSDA has a classification error close to 23% on average across all genotypes. ParSketch-PLSDA outperforms PLSDA, improving prediction quality by 10%. Indeed, the model built with parSketch-PLSDA is able to take into account non-linearities in the data. These results are encouraging and help anticipate the future bottleneck created by the large amounts of data generated by phenotyping.
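The abstract reports a classification error "on average across all genotypes". One plausible reading, which is an assumption here because the abstract does not define the metric, is a per-genotype error rate averaged over the classes. The minimal sketch below illustrates that reading on toy labels; the function names and the genotype labels `G1` to `G4` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def per_class_error(y_true, y_pred):
    """Return {class: fraction of that class's samples that were misclassified}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if pred != truth:
            errors[truth] += 1
    return {c: errors[c] / totals[c] for c in totals}


def mean_class_error(y_true, y_pred):
    """Average the per-class error rates (assumed reading of 'on average across all genotypes')."""
    rates = per_class_error(y_true, y_pred)
    return sum(rates.values()) / len(rates)


# Toy labels for four hypothetical genotypes (not data from the study).
y_true = ["G1", "G1", "G2", "G2", "G3", "G3", "G4", "G4"]
y_pred = ["G1", "G2", "G2", "G2", "G3", "G1", "G4", "G4"]

print(per_class_error(y_true, y_pred))
print(mean_class_error(y_true, y_pred))
```

Averaging per class rather than over all spectra keeps a genotype with fewer spectra from being swamped by the others; whether the authors computed the figure this way is not stated in the abstract.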

Keywords

Digital Agriculture, Massive data, Spectroscopy, Chemometrics, Precision Agriculture

Domains

Environmental sciences, Earth sciences

Main file

OptipAg_Elsevier__ACCEPTED_ (1).pdf (4.94 MB)

Origin: Files produced by the author(s)

Contributor: Isabelle NAULT

https://hal.inrae.fr/hal-03329674

Submitted on: Tuesday, 7 June 2022, 13:34:59

Last modified on: Tuesday, 12 March 2024, 10:45:15

Long-term archiving on: Thursday, 8 September 2022, 18:46:42

Dates and versions

hal-03329674 , version 1 (07-06-2022)

Licence

Attribution - NonCommercial - NoDerivatives

Identifiers

HAL Id : hal-03329674 , version 1
DOI : 10.1016/j.biosystemseng.2021.08.005
WOS : 000697698700008

Cite

Maxime Ryckewaert, Maxime Metz, Daphné Héran, Pierre George, Bruno Grèzes-Besset, et al. Massive spectral data analysis for plant breeding using parSketch-PLSDA method: Discrimination of sunflower genotypes. Biosystems Engineering, 2021, 210, pp.69-77. ⟨10.1016/j.biosystemseng.2021.08.005⟩. ⟨hal-03329674⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA ZENITH LIRMM INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC MIPS UNIV-MONTPELLIER UNIV-RENNES ITAP INSTITUT-AGRO-MONTPELLIER INRAE INRAEOCCITANIEMONTPELLIER ANR UR1-MATH-NUM MATHNUM RESEAU-EAU
