Knowledge discovery and unsupervised detection of within-field yield defective observations - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, Computers and Electronics in Agriculture, Year: 2019

Knowledge discovery and unsupervised detection of within-field yield defective observations

Abstract

Suspicious observations, commonly called outliers, are always present, to a greater or lesser extent, in agronomic and environmental datasets, and within-field yield datasets are no exception. While most filtering approaches rely on expert thresholds and dedicated filters to remove these defective observations, more general and unsupervised methods will be required to process a growing number of yield maps. Such methods, however, only identify outliers and leave them unlabelled. This study proposes a methodology for labelling these defective observations so that users can better characterize the harvest process (e.g. functioning of the machine, driving of the operator) and derive guidelines for improving equipment and operations. It is assumed that outliers have already been detected by a published non-parametric, unsupervised approach. Clusters of outliers are first identified in the data to group outliers with similar yield outlying characteristics. Each cluster is then given a first-order label that describes the general yield outlying characteristics of its observations. Next, within each cluster, each outlier is given a second-order label that provides more information on the origin of the defective observation. The methodology was tested on simulated yield datasets with known characteristics and labelled outliers, and then applied to real yield datasets with unlabelled outliers. The results show that it is possible to label outliers detected with an unsupervised approach, but that some labels are more accurate than others, especially those related to an unknown cutting width of the harvester or to narrow finishes within the fields. Outlying observations behaved similarly in simulated and real datasets, which made it possible to infer the labels of defective observations more precisely.
By labelling outlying observations, it was possible to apply an appropriate correction to one of the real yield datasets and to restore almost 15% of the outlying observations instead of removing them. This study is a first attempt to label yield outliers detected in an unsupervised manner.
Main file
pub00059563.pdf (2.15 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02608323, version 1 (16-05-2020)

Cite

C. Leroux, H. Jones, A. Clenet, B. Tisseyre. Knowledge discovery and unsupervised detection of within-field yield defective observations. Computers and Electronics in Agriculture, 2019, 156, pp.645-659. ⟨10.1016/j.compag.2018.12.024⟩. ⟨hal-02608323⟩