HADAD: A Lightweight Approach for Optimizing Hybrid Complex Analytics Queries
Conference paper, Year: 2021

Abstract

Hybrid complex analytics workloads typically include (i) data management tasks (joins, filters, etc.), easily expressed using relational algebra (RA)-based languages, and (ii) complex analytics tasks (regressions, matrix decompositions, etc.), mostly expressed as linear algebra (LA) expressions. Such workloads are common in a number of areas, including scientific computing, web analytics, business recommendation, natural language processing, and speech recognition. Existing solutions for evaluating hybrid complex analytics queries, ranging from LA-oriented systems, to relational systems (extended to handle LA operations), to hybrid systems, fail to provide a unified optimization framework for such a hybrid setting. These systems either optimize data management and complex analytics tasks separately, or exploit only RA properties while leaving LA-specific optimization opportunities unexplored. Finally, they cannot exploit precomputed (materialized) results to avoid recomputing all or part of a given mixed (LA and RA) computation. We describe HADAD, an extensible lightweight approach for optimizing hybrid complex analytics queries, based on a common abstraction that facilitates unified reasoning: a relational model endowed with integrity constraints, which can be used to express the properties of the two computation formalisms. Our approach enables full exploration of LA properties and rewrites, as well as semantic query optimization. Importantly, our approach does not require modifying the internals of existing systems. Our experimental evaluation shows significant performance gains on diverse workloads, from LA-centered ones to hybrid ones.
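
To make the notion of an LA-specific rewrite concrete, the following minimal sketch compares two equivalent evaluation orders of a matrix chain, (A B) v and A (B v), which are interchangeable by associativity of matrix multiplication. It is not HADAD's algorithm or API: the helper names (`matmul`, `matmul_cost`, `shape`), the input values, and the naive cost measure (number of scalar multiplications) are all assumptions chosen for illustration.

```python
# Illustrative only: not HADAD's algorithm or API. All names are hypothetical.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    inner = len(y)
    assert all(len(row) == inner for row in x), "inner dimensions must match"
    cols = len(y[0])
    return [[sum(x[i][k] * y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(x))]

def shape(x):
    """Return (rows, columns) of a matrix given as a list of rows."""
    return (len(x), len(x[0]))

def matmul_cost(shape_x, shape_y):
    """Scalar multiplications in a naive (n x k) by (k x m) product."""
    n, k = shape_x
    k2, m = shape_y
    assert k == k2, "inner dimensions must match"
    return n * k * m

# Small integer inputs so both plans can be compared exactly.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
v = [[1], [2], [3]]

# Plan 1: (A B) v builds a full 3x3 intermediate matrix.
AB = matmul(A, B)
plan1 = matmul(AB, v)
cost1 = matmul_cost(shape(A), shape(B)) + matmul_cost(shape(AB), shape(v))

# Plan 2: A (B v) only ever builds 3x1 intermediates.
Bv = matmul(B, v)
plan2 = matmul(A, Bv)
cost2 = matmul_cost(shape(B), shape(v)) + matmul_cost(shape(A), shape(Bv))

print(plan1 == plan2, cost1, cost2)
```

Both plans return the same vector, but under this cost measure the second needs half as many scalar multiplications on these 3x3 inputs, and the gap widens as the matrices grow. An optimizer that knows the associativity property can therefore pick the cheaper form without changing the result.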
Main file
main.pdf (6.47 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03347677, version 1 (17-09-2021)

Identifiers

  • HAL Id: hal-03347677, version 1

Cite

Rana Alotaibi, Bogdan Cautis, Alin Deutsch, Ioana Manolescu. HADAD: A Lightweight Approach for Optimizing Hybrid Complex Analytics Queries. ACM SIGMOD 2021 - International Conference on Management of Data, Jun 2021, Xi'an / Online, China. ⟨hal-03347677⟩