Integrating Clinical Data and Molecular Profiling of Hepato-Pancreato-Biliary Cancers: a Surgical-pathological Approach

Fabio Bagante
2020-01-01

Abstract

Background and Aims: The Cancer Genome Atlas (TCGA) project recently published a flagship paper reporting that Cell-of-Origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer, including hepato-pancreato-biliary (HPB) malignancies. The aim of the current project was to investigate the molecular landscape of HPB cancers in order to apply the molecular classifications resulting from the TCGA analyses in clinical practice.

Patients and Methods: Machine learning models (artificial neural networks, ANNs) were trained to predict the molecular subtypes and Cell-of-Origin clusters (iClusters) of HPB cancers. Survival analyses were performed using Cox survival models and a machine learning model (Random Survival Forest, RSF) to investigate the impact of the molecular subtype and iCluster classifications on the prognosis of HPB patients. Whole exome sequencing (WES) data of TCGA patients with cholangiocarcinoma (CHOL), liver hepatocellular carcinoma (LIHC), and pancreatic adenocarcinoma (PAAD) were used to develop the ANNs. Two control groups, comprising patients with gastrointestinal cancers and patients with other types of cancer, were also used to train the ANNs. WES data of patients who underwent surgery for HPB cancers at the Ohio State University (OSU) and of patients participating in the International Cancer Genome Consortium (ICGC) were used to validate the ANNs.

Results: The ANNs predicting the iClusters (i.e. from iCluster1 to iCluster28) achieved an accuracy of 99% in the training set versus 74% in the test set. The ANNs predicting the molecular subtypes achieved an accuracy of 99% in the training set versus 81% in the test set. The survival data of 362 (34 TCGA, 17 OSU, and 311 ICGC) CHOL patients were investigated using the RSF algorithm. The model identified AJCC stage, TP53 pathway status, molecular subtype, lymph node status, and iCluster as the most important variables. In the multivariable Cox model, AJCC stage, TP53 pathway status, molecular subtype, and iCluster were associated with patients' survival. Compared with METH-3 patients, patients in the IDH and METH-2 subgroups had an almost 2.5- and 5-fold increased risk of death, respectively (IDH, HR 2.47, p=0.037; METH-2, HR 4.85, p<0.001). The c-index of the final model integrating clinical and molecular data was 0.72. A total of 598 (341 TCGA, 30 OSU, and 227 ICGC) LIHC patients were investigated using the RSF algorithm. The model identified AJCC stage, molecular subtype, AJCC T stage, TP53 pathway status, and TGF-beta pathway status as the most important variables. In the multivariable Cox model, AJCC stage, TP53 pathway status, and molecular subtype were associated with patients' survival. Compared with patients with other molecular subtypes, patients in iCluster2 had an almost 2.2-fold increased risk of death (iCluster2, HR 2.18, p<0.001). The c-index of the final model was 0.63. The survival data of 1,022 (155 TCGA, 66 OSU, and 999 ICGC) PAAD patients were investigated using the RSF algorithm. The model identified age, AJCC stage, molecular subtype, iCluster, TP53 pathway status, MYC pathway status, and Cell-cycle pathway status as the most important variables. In the multivariable Cox model, AJCC stage, TP53 pathway status, and molecular subtype were associated with patients' survival. Compared with patients with the KRAS_wt molecular subtype, patients with the KRAS_mut PAAD subtype had an almost 1.4-fold increased risk of death (KRAS_mut, HR 1.38, p=0.031). The c-index of the final model integrating clinical and molecular data was 0.61.

Conclusion: The TCGA project has reported a complex and interconnected landscape describing the molecular biology of HPB cancers. In this preliminary work, WES data of patients with HPB cancers were used to predict the molecular classifications proposed in the TCGA papers. Moreover, the molecular classifications of HPB malignancies, when integrated with the clinical staging system, improved the ability to predict the prognosis of HPB patients.
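
The c-index values reported in the abstract (0.72, 0.63, and 0.61) measure how often a model ranks patients correctly by risk: among pairs of patients whose survival order is known, it is the fraction in which the patient who died earlier was assigned the higher predicted risk. The following is a minimal sketch of Harrell's concordance index in plain Python. It is not the thesis's implementation; the function name and the toy follow-up data are hypothetical and serve only to illustrate the definition.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored survival data (illustrative sketch).

    times:       observed follow-up time for each patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied follow-up times are skipped in this simplified version
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.6, 0.7, 0.1]
c_index = concordance_index(times, events, risk_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.61 to 0.72 should be read.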
2020
Keywords: Machine learning, Hepato-Pancreato-Biliary Cancers, Surgery
Files in this item:
tesi_dottorato_fabio_bagante_finale.pdf

Open access

Type: Doctoral thesis
License: Creative Commons
Size: 953.03 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1024296