Graphs, Geometry, and Learning Representations: Navigating the Non-Euclidean Landscape in Computer Vision and Beyond
Geri Skenderi
2024-01-01
Abstract
Artificial Intelligence (AI) requires machines capable of learning and generalizing from data without being explicitly programmed to do so, giving rise to the field of Machine Learning (ML). ML systems follow an inductive process in which they learn to make predictions from data, guided by an objective function that defines correct and incorrect choices. This thesis deals with aspects of the subfield of Deep Learning (DL) through Neural Networks (NNs), which embody the philosophy of emulating the human brain's computational processes. Most modern NNs excel at solving problems involving data that lives on grids with Euclidean properties, such as images, text, and waveforms. However, non-Euclidean data is ubiquitous. A general representative of such data is the graph, i.e., a data structure in which pairwise relationships between entities, called nodes, are modeled through their connectivity, given by edges. Social networks, molecules, road networks, human-body poses, and 3D point clouds are common examples of data that can be represented through the structure of a graph. To maximize the effectiveness of NNs on data such as graphs, it is imperative to leverage the inherent geometric properties of the given structure. Moreover, geometry can serve as a tool not only to properly understand the input data but also to alter the latent representation space of NNs, giving rise to desirable properties that are useful for particular tasks. This thesis follows two main paths within the realm of Geometric Deep Learning. The first path explores the application of DL to graph-structured data to solve challenging problems in Computer Vision. The second path delves into the use of geometric constraints to shape latent space representations, showcasing how altering latent geometry can give rise to unique and superior solutions in contexts such as Graph Self-Supervised Learning and Multi-Task Learning. In summary, this thesis navigates the intersection of DL, graphs, and geometry, offering new solutions that enhance the capabilities of NNs in handling non-Euclidean data structures and in learning representations that go beyond the commonly assumed Euclidean latent space. Insights from our research reveal both the potential and the challenges that lie beyond intuitive geometry, and how we can enable ML systems to effectively learn and generalize to a broader range of data types and tasks.
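To make the graph terminology in the abstract concrete, the short sketch below is an illustrative example only (it is not code from the thesis): it stores a toy undirected graph as an adjacency matrix and performs one neighborhood-averaging step, the basic way a neural network can exploit a graph's connectivity. All node counts, feature sizes, and variable names are arbitrary assumptions.

```python
# Minimal illustrative sketch (not from the thesis): a toy undirected graph
# stored as an adjacency matrix, plus one neighborhood-averaging step of the
# kind used by graph neural networks to exploit connectivity.
import numpy as np

# 4 nodes, edges: (0,1), (0,2), (1,2), (2,3)
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

X = np.random.randn(4, 8)              # one 8-dimensional feature vector per node

A_hat = A + np.eye(4)                   # add self-loops so each node keeps its own signal
deg = A_hat.sum(axis=1, keepdims=True)  # number of neighbors (including self) per node
X_agg = (A_hat @ X) / deg               # each node averages itself and its neighbors

print(X_agg.shape)                      # (4, 8): same nodes, neighborhood-smoothed features
```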
| File | Size | Format |
|---|---|---|
| Geri_Skenderi_PhD_Thesis_Final.pdf | 15.95 MB | Adobe PDF |

Access: Open access
Type: Doctoral thesis
License: Creative Commons