Spatial embedding: a generic machine learning model for spatial query optimization

Belussi, Alberto; Migliorini, Sara
2022-01-01

Abstract

Machine learning and deep learning techniques are increasingly applied to produce efficient query optimizers, particularly for big data systems. Optimizing spatial operations is even more challenging because of the inherent complexity of operations such as spatial joins and range queries, and because of the peculiarities of spatial data. Although a few ML-based spatial query optimizers have been proposed in the literature, their design limits their use, since each one is tailored to a specific collection of datasets, a specific operation, or specific hardware. A change to any of these requires building and training a completely new model, which entails collecting a new, very large training set to obtain a good model. This paper proposes a new approach to ML-based query optimization that exploits the novel notion of spatial embedding to overcome these limitations. In particular, a preliminary model is defined that captures the relevant features of spatial datasets in an unsupervised manner, independently of the operation to be optimized. A specialized model for the optimization of each spatial operation can then be trained using spatial embeddings as input, so the cost of building the first model is amortized and a smaller training set is required for the specialized ones.
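The two-stage idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the grid histogram as the raw dataset representation, PCA as a stand-in for the unsupervised embedding model, a linear regressor as the specialized model, the synthetic datasets, and the made-up "cost" label. The function names (`grid_histogram`, `fit_embedder`, `fit_cost_model`) are hypothetical.

```python
import numpy as np

def grid_histogram(points, bins=8):
    """Summarize 2-D points in the unit square as a flattened, normalized grid histogram."""
    hist, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                bins=bins, range=[[0, 1], [0, 1]])
    return (hist / len(points)).ravel()

def fit_embedder(histograms, dim=4):
    """Unsupervised stage (assumption: PCA via SVD stands in for a learned encoder)."""
    mean = histograms.mean(axis=0)
    _, _, vt = np.linalg.svd(histograms - mean, full_matrices=False)
    components = vt[:dim]
    return lambda h: (h - mean) @ components.T

def fit_cost_model(embeddings, costs):
    """Operation-specific stage (assumption: linear least squares on embeddings)."""
    design = np.hstack([embeddings, np.ones((len(embeddings), 1))])
    weights, *_ = np.linalg.lstsq(design, costs, rcond=None)
    return lambda e: np.hstack([e, np.ones((len(e), 1))]) @ weights

rng = np.random.default_rng(0)
# Synthetic datasets with varying skew; the exponent keeps points inside [0, 1).
datasets = [rng.random((500, 2)) ** rng.uniform(1, 3) for _ in range(40)]
histograms = np.stack([grid_histogram(d) for d in datasets])

# Stage 1: built once from unlabeled datasets, independent of any operation.
embed = fit_embedder(histograms, dim=4)
embeddings = embed(histograms)

# Stage 2: one small model per operation, trained on few labeled examples.
# The label here is a made-up proxy (peak cell density), not a measured query cost.
costs = histograms.max(axis=1)
predict = fit_cost_model(embeddings[:10], costs[:10])
predicted = predict(embeddings)
```

In this sketch the embedder is fitted once and reused, so only the small second-stage model would need retraining for a different operation, which mirrors the amortization argument above.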
ISBN: 9781450395298
Keywords: query optimizer; machine learning; big data; range query
Files in this item:
File: 22_SIGSPATIAL_Spatial_Embeddings_ShortPaper_published.pdf
Access: open access
Type: Publisher's version
License: Publisher's copyright
Size: 323.39 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1079888