Machine learning algorithms based on proteomic data mining accurately predicting the recurrence of hepatitis B-related hepatocellular carcinoma

Targher, Giovanni (Writing – Review & Editing)
2022-01-01

Abstract

Background and Aim: More than 10% of hepatocellular carcinoma (HCC) cases recur each year, even after surgical resection. The causes of recurrence and effective means of preventing it remain poorly understood. Predicting HCC recurrence requires diagnostic markers with high sensitivity and specificity. This study aims to identify new key proteins associated with HCC recurrence and to build machine learning models for predicting it.

Methods: Proteomic data were obtained from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) database. We identified proteins that were differentially expressed between cases with and without HCC recurrence. Survival analysis, Cox regression analysis, and the area under the receiver operating characteristic curve (AUROC > 0.7) were then used to narrow these to the most significant differential proteins. Predictive models for HCC recurrence were developed using four machine learning algorithms.

Results: A total of 690 proteins were differentially expressed between 50 relapsed and 77 non-relapsed patients with hepatitis B-related HCC. Seven of these proteins had an AUROC > 0.7 for 5-year survival in HCC: BAHCC1, ESF1, RAP1GAP, RUFY1, SCAMP3, STK3, and TMEM230. Among the machine learning algorithms, random forest showed the highest AUROC for identifying HCC recurrence (0.991, 95% CI 0.962-0.999), followed by the support vector machine (0.893, 95% CI 0.824-0.956), logistic regression (0.774, 95% CI 0.672-0.868), and the multi-layer perceptron (0.571, 95% CI 0.459-0.682).

Conclusions: Our study identifies seven novel proteins for predicting HCC recurrence and finds random forest to be the most suitable of the four predictive models tested.
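The AUROC screening step and the 95% confidence intervals reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how the AUROC or its interval was computed, so the rank-based (Mann-Whitney) estimate and the percentile bootstrap below are assumptions, and the names `protein_A`, `protein_B` and all toy values are hypothetical, not CPTAC data.

```python
import random

def auroc(scores, labels):
    """AUROC as the Mann-Whitney probability that a recurrent case (label 1)
    scores higher than a non-recurrent case (label 0); ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    rng = random.Random(seed)
    idx = list(range(len(scores)))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        # Skip resamples that contain only one class; AUROC is undefined there.
        if 0 < sum(ys) < len(ys):
            stats.append(auroc([scores[i] for i in sample], ys))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def screen(features, labels, threshold=0.7):
    """Keep features whose AUROC exceeds the threshold.
    Assumes higher values point toward recurrence."""
    kept = {}
    for name, values in features.items():
        a = auroc(values, labels)
        if a > threshold:
            kept[name] = a
    return kept

# Hypothetical toy data: 3 recurrent (1) and 3 non-recurrent (0) cases.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_features = {
    "protein_A": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "protein_B": [0.5, 0.2, 0.6, 0.4, 0.7, 0.3],
}
selected = screen(toy_features, toy_labels)
```

On this toy input only `protein_A` passes the 0.7 cutoff: 8 of its 9 recurrent/non-recurrent pairs are ordered correctly. A protein that is lower in recurrent cases would score below 0.5 here, so a real screen would need to account for direction.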
2022
Keywords: CPTAC database; machine learning models; proteomics; recurrence of hepatocellular carcinoma

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1078466