Graphs, Geometry, and Learning Representations: Navigating the Non-Euclidean Landscape in Computer Vision and Beyond

Geri Skenderi
2024-01-01

Abstract

Artificial Intelligence (AI) requires machines capable of learning and generalizing from data without being explicitly programmed to do so, giving rise to the field of Machine Learning (ML). ML systems follow an inductive process: they learn to make predictions from data, guided by an objective function that defines which choices are correct or incorrect. This thesis deals with aspects of the subfield of Deep Learning (DL) through Neural Networks (NNs), which embody the philosophy of emulating the human brain's computational processes. Most modern NNs excel at solving problems involving data that lives on grids with Euclidean properties, such as images, text, and waveforms. However, non-Euclidean data is ubiquitous. A general representative of such data is the graph, i.e., a data structure in which pairwise relationships between entities, called nodes, are modeled through their connectivity, given by edges. Social networks, molecules, road networks, human-body poses, and 3D point clouds are common examples of data that can be represented through the structure of a graph. To maximize the effectiveness of NNs on data such as graphs, it is imperative to leverage the inherent geometric properties of the given structure. Moreover, geometry can serve as a tool not only to properly understand the input data but also to alter the latent representation space of NNs, giving rise to desirable properties that are useful for particular tasks. This thesis follows two main paths within the realm of Geometric Deep Learning. The first explores the application of DL to graph-structured data to solve challenging problems in Computer Vision. The second delves into the use of geometric constraints to shape latent-space representations, showcasing how altering latent geometry can give rise to unique and superior solutions in contexts such as Graph Self-Supervised Learning and Multi-Task Learning. In summary, this thesis navigates the intersection of DL, graphs, and geometry, offering new solutions that enhance the capabilities of NNs in handling non-Euclidean data structures and in learning representations that go beyond the commonly assumed Euclidean latent space. Insights from our research reveal both the potential and the challenges that lie beyond intuitive geometry, and show how we can enable ML systems to effectively learn and generalize to a broader range of data types and tasks.
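As an illustrative aside (not taken from the thesis itself), the following minimal Python sketch shows the kind of graph structure described in the abstract: entities as nodes, pairwise relationships as edges, stored here as an adjacency list. The function and variable names are hypothetical and chosen only for illustration.

    # Minimal sketch of a graph: nodes connected by edges,
    # stored as an adjacency list (one neighbor set per node).
    from collections import defaultdict

    def build_adjacency_list(num_nodes, edges):
        """Build an undirected graph from (u, v) edge pairs."""
        adjacency = defaultdict(set)
        for u, v in edges:
            adjacency[u].add(v)
            adjacency[v].add(u)
        # Ensure isolated nodes also appear in the structure.
        for node in range(num_nodes):
            adjacency.setdefault(node, set())
        return adjacency

    # Example: a tiny "social network" with 4 people and 3 friendships.
    edges = [(0, 1), (1, 2), (2, 3)]
    graph = build_adjacency_list(num_nodes=4, edges=edges)
    print(graph[1])  # neighbors of node 1 -> {0, 2}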
Keywords: Geometric Deep Learning, Representation Learning, Computer Vision
Files in this item:

File: Geri_Skenderi_PhD_Thesis_Final.pdf
Access: open access
Type: Doctoral thesis
License: Creative Commons
Size: 15.95 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1136606