Artificial Intelligence Techniques Integrate Biological Omics into Graphs for the Prediction and Pathway Analysis of Patient’s Disease

Luca Giudice
In press

Abstract

A person's phenotype is the set of observable physical properties of the organism. The phenotype is determined by the genotype, which is the set of the organism's genes. Genes do not act alone but are regulated by other molecules in the same cell. The correlation between genotype and phenotype is the statistical relationship that links one or more genes and their regulators to an observable physical property. When the phenotype is the state induced by a disease in a patient, finding such a correlation helps diagnosis, prognosis, and treatment. However, this task is not trivial, because the dysfunctional genes and regulators differ among patients even when they share the same disease and clinical conditions. One approach is to look for altered cellular functions (i.e., pathways) instead of single dysfunctional actors. Standard pathway analysis requires data describing the cellular molecular profiles of two classes of patients: one class composed of patients with the disease under study, and one composed of controls. A molecular profile is captured from a patient with wet-lab protocols and describes the activity of the patient's molecules. A pathway is considered significantly deregulated if the molecules performing that cellular function are coordinately more active or less active in the diseased profiles than in the control ones. Enrichment methods for pathway analysis are easy to use and understand, and they provide biologically meaningful results. However, they have two limitations. First, they do not consider the interconnected nature of the molecules, which are involved both inside and outside their own pathway. Second, they cannot learn the pathways found in a patient class and use this information to recognise or test the profiles of new patients whose class is unknown or uncertain.
In contrast, artificial intelligence techniques, and more precisely machine learning algorithms, can overcome these limitations, but they attract little development or interest because they are difficult to use, tune, and understand, and difficult to design so that they provide insights beyond those of the simpler enrichment counterpart. This thesis focuses on pathway-based patient classifiers built on patient similarity networks, a recent concept with two distinguishing characteristics: it learns pathway information to predict the patient phenotype, and it uses patient similarity networks as the features for classification. These characteristics make it possible to build a classifier that is interpretable, accepts different types of biological data, and provides new insights about the patients and phenotypes under study, resulting in a valid alternative to enrichment methods, or even a better one. Finally, this thesis describes a state-of-the-art artificial intelligence algorithm for predicting the effect of an altered molecule on the rest of the cell, and how this strategy is integrated into both enrichment and machine learning pathway analysis methods to account for the interconnected nature of the molecules.
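
The idea of a pathway-specific patient similarity network can be illustrated with a minimal sketch. It is not taken from the thesis: the gene names, patient identifiers, activity values, the choice of Pearson correlation as the similarity measure, and the helper names (`pearson`, `pathway_psn`, `class_separation`) are all assumptions made for illustration. The sketch builds one network from the genes of a single hypothetical pathway and scores how well that network separates diseased patients from controls.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as "no similarity".
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def pathway_psn(profiles, pathway_genes):
    """Build a patient similarity network restricted to one pathway's genes.

    Returns {(patient_a, patient_b): similarity} for every unordered pair.
    """
    patients = sorted(profiles)
    vectors = {p: [profiles[p][g] for g in pathway_genes] for p in patients}
    return {
        (a, b): pearson(vectors[a], vectors[b])
        for i, a in enumerate(patients)
        for b in patients[i + 1:]
    }


def class_separation(psn, labels):
    """Mean within-class similarity minus mean between-class similarity.

    Assumes the network has at least one within-class and one
    between-class pair; otherwise statistics.mean raises an error.
    """
    within = [w for (a, b), w in psn.items() if labels[a] == labels[b]]
    between = [w for (a, b), w in psn.items() if labels[a] != labels[b]]
    return mean(within) - mean(between)


# Hypothetical toy data: activity of four genes in four patients.
profiles = {
    "d1": {"g1": 5.0, "g2": 4.5, "g3": 1.0, "g4": 1.2},
    "d2": {"g1": 4.8, "g2": 5.1, "g3": 0.9, "g4": 1.1},
    "c1": {"g1": 1.0, "g2": 1.3, "g3": 4.9, "g4": 5.0},
    "c2": {"g1": 1.2, "g2": 0.8, "g3": 5.2, "g4": 4.7},
}
labels = {"d1": "disease", "d2": "disease", "c1": "control", "c2": "control"}
pathway = ["g1", "g2", "g3", "g4"]  # hypothetical pathway membership

psn = pathway_psn(profiles, pathway)
separation = class_separation(psn, labels)
print(f"class separation for this pathway: {separation:.2f}")
```

Under these assumptions, a pathway whose network yields a high separation score groups patients of the same class together, which is the kind of signal a network-based classifier could use as a feature. A real analysis would need many more patients, a validated similarity measure, and a proper train/test split.
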
Bioinformatics
Artificial Intelligence
Files in this item:
File: Luca_Giudice_Thesis.pdf
Access: Open Access since 02/07/2022
Type: Doctoral thesis
License: Creative Commons
Size: 5.87 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1053301