Spatial embedding: a generic machine learning model for spatial query optimization
Belussi, Alberto; Migliorini, Sara
2022-01-01
Abstract
Machine learning and deep learning techniques are increasingly applied to produce efficient query optimizers, particularly for big data systems. The optimization of spatial operations is even more challenging due to the inherent complexity of such operations, like spatial joins and range queries, and the peculiarities of spatial data. Even though a few ML-based spatial query optimizers have been proposed in the literature, their design limits their use, since each one is tailored to a specific collection of datasets, a specific operation, or specific hardware. A change to any of these requires building and training a completely new model, which entails collecting a new, very large training set to obtain a good model. This paper proposes a new approach to ML-based query optimization which exploits the novel notion of spatial embedding to overcome these limitations. In particular, a preliminary model is defined which captures the relevant features of spatial datasets, independently of the operation to be optimized and in an unsupervised manner. A specialized model for the optimization of each spatial operation can then be trained using spatial embeddings as input, so the cost of building the first model is amortized and a smaller training set is required for the specialized ones.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
22_SIGSPATIAL_Spatial_Embeddings_ShortPaper_published.pdf | Open access | Publisher's version | Publisher's copyright | 323.39 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.