Learning to Reason: Predict-then-Optimize vs Predict-and-Optimize

Marianne Defresne; Thomas Schiex; Sophie Barbe

Communication Dans Un Congrès Année : 2023

Learning to Reason: Predict-then-Optimize vs Predict-and-Optimize

(1, 2) , (2) , (1)

1
2

Marianne Defresne

Fonction : Auteur
PersonId : 1129857

Toulouse Biotechnology Institute

Unité de Mathématiques et Informatique Appliquées de Toulouse

Thomas Schiex

Fonction : Auteur
PersonId : 734846
IdHAL : thomas-schiex
ORCID : 0000-0001-6049-3415
IdRef : 057033935

Unité de Mathématiques et Informatique Appliquées de Toulouse

Sophie Barbe

Fonction : Auteur
PersonId : 180952
IdHAL : sophie-barbe
ORCID : 0000-0003-2581-5022
IdRef : 109362543

Toulouse Biotechnology Institute

Résumé

Many real-life decision making problems involve reasoning or optimizing over discrete variables on ill-defined problems, where exact constraints or parameters are unknown and only indirect correlated variables are observed together with solutions. In these situations, one may want to machine learn how to predict a solution directly from the observed variables, learning effectively "how to reason" or "how to optimize". A promising research direction involves the combination of Machine Learning (ML) and discrete reasoning (DR) [1]: the problem is formulated as the discrete optimization/reasoning problem whose parameters are predicted from data. Because of the various types of observed variables and of the complexity of their relationship to hidden constraints and parameters, Deep Learning (DL) is often considered as a suitable learning tool here. However, DL requires differentiable loss functions, which are either zero or unavailable for discrete reasoning [2]. Two approaches have emerged to tackle such problems. On the one hand, the Predict-thenoptimize approach learns how to predict the discrete problem to solve and then solves it. The basic components are well understood, learning can be efficient and exact DR can be performed during inference. But the effect of errors on parameters on the final solution are not accounted for [3] which makes the approach sub-optimal in low-data regimes. For this reason, much of the attention has been focused on the direct integration of optimization in the loss function, an approach that is often denoted as Predict-and-optimize or Decision-focused learning [4]. The challenges here is to provide efficient end-to-end differentiable training. Some approaches rely on efficient differentiable optimization layers that capture the DR problem through a continuous relaxation [5, 6, 7]. They can tackle complex NP-hard problems at the cost of approximate solving during inference: even optimally parameterized relaxations will fail to offer exact solutions. Another family of approaches extracts meaningful gradients out of exact DR solvers during training [2, 8, 9], enabling exact solving during inference. This however limits their application to very small instances of NP-hard DR problems (e.g., scheduling with 3 tasks only [3]) because of the tremendous optimization cost during training, a cost which is worsened by the essentially randomly parameterized nature of predicted problems in the first training epochs. We compare a Predict-then-optimize and a Predict-and-optimize approach on a classical Constraint Programming (CP) benchmark, the NP-complete sudoku problem. Our goal is to learn how to solve new grids of sudoku from examples of solved sudoku grids (as we already did using ML technology [10]), described here as a sequence of 81 numbers, each with an associated grid coordinate. To relax the Boolean nature of constraints (satisfied or violated), we learn the parameters of a weighted CP model (a binary Cost Function Network or CFN [11]) using the CFN solver toulbar2 [12] during inference and training (in Predict-and-optimize mode). For Predict-then-optimize, we introduce the Gangster-PLL loss, a variant of the well-established Pseudo-loglikelihood loss [13] and use the Hinge Loss [9] for Predict-and-optimize, both with regularization on the CFN parameters learned. We observe that the Predict-then-optimize approach combines efficient learning with exact reasoning during inference, providing totally accurate solutions even in relatively low regime situations. The far more expensive Predict-and-optimize approach struggles at learning how to solve a sudoku grid, requiring more tuning of the exact

Domaines

Informatique [cs] Sciences du Vivant [q-bio]

Fichier principal

12.pdf (104.04 Ko)

Origine	Fichiers produits par l'(les) auteur(s)

Sophie BARBE : Connectez-vous pour contacter le contributeur

https://hal.inrae.fr/hal-04222883

Soumis le : vendredi 29 septembre 2023-14:26:51

Dernière modification le : mardi 5 novembre 2024-13:35:28

Archivage à long terme le : samedi 30 décembre 2023-19:07:52

Dates et versions

hal-04222883 , version 1 (29-09-2023)

Identifiants

HAL Id : hal-04222883 , version 1

Citer

Marianne Defresne, Thomas Schiex, Sophie Barbe. Learning to Reason: Predict-then-Optimize vs Predict-and-Optimize. AAAI 2023 Bridge on Constraint Programming and Machine Learning, Feb 2023, Washington DC, United States. ⟨hal-04222883⟩

Exporter

BibTeX XML-TEI Dublin Core DC Terms EndNote DataCite

Collections

CNRS INSA-TOULOUSE INSA-GROUPE TBI INRAE ANR ANITI INRAEOCCITANIETOULOUSE MICA-UNITES MATHNUM MIAT

30 Consultations

17 Téléchargements

Learning to Reason: Predict-then-Optimize vs Predict-and-Optimize

Résumé

Domaines

Dates et versions

Identifiants

Citer

Exporter

Collections

Partager