Endometrial cancer, which is the most common gynaecological cancer in women after breast, colorectal and lung cancer, can be diagnosed at an early stage. The first aim of this study is to classify age, tumor grade, myometrial invasion and tumor size, which play an important role in the diagnosis and prognosis of endometrial cancer, with machine learning methods combined with explainable artificial intelligence. 20 endometrial cancer patients proteomic data obtained from tumor biopsies taken from different regions of EC tissue were used. The data obtained were then classified according to age, tumor size, tumor grade and myometrial invasion. Then, by using three different machine learning methods, explainable artificial intelligence was applied to the model that best classifies these groups and possible protein biomarkers that can be used in endometrial prognosis were evaluated. The optimal model for age classification was XGBoost with AUC (98.8%), for tumor grade classification was XGBoost with AUC (98.6%), for myometrial invasion classification was LightGBM with AUC (95.1%), and finally for tumor size classification was XGBoost with AUC (94.8%). By combining the optimal models and the SHAP approach, possible protein biomarkers and their expressions were obtained for classification. Finally, EWRS1 protein was found to be common in three groups (age, myometrial invasion, tumor size). This article's findings indicate that models have been developed that can accurately classify factors including age, tumor grade, and myometrial invasion all of which are critical for determining the prognosis of endometrial cancer as well as potential protein biomarkers associated with these factors. Furthermore, we were able to provide an analysis of how the quantities of the proteins suggested as biomarkers varied throughout the classes by combining the SHAP values with these ideal models.

Integrating proteomics and explainable artificial intelligence: a comprehensive analysis of protein biomarkers for endometrial cancer diagnosis and prognosis

Ardigò, Luca Paolo
2024-01-01

Abstract

Endometrial cancer, which is the most common gynaecological cancer in women after breast, colorectal and lung cancer, can be diagnosed at an early stage. The first aim of this study is to classify age, tumor grade, myometrial invasion and tumor size, which play an important role in the diagnosis and prognosis of endometrial cancer, with machine learning methods combined with explainable artificial intelligence. 20 endometrial cancer patients proteomic data obtained from tumor biopsies taken from different regions of EC tissue were used. The data obtained were then classified according to age, tumor size, tumor grade and myometrial invasion. Then, by using three different machine learning methods, explainable artificial intelligence was applied to the model that best classifies these groups and possible protein biomarkers that can be used in endometrial prognosis were evaluated. The optimal model for age classification was XGBoost with AUC (98.8%), for tumor grade classification was XGBoost with AUC (98.6%), for myometrial invasion classification was LightGBM with AUC (95.1%), and finally for tumor size classification was XGBoost with AUC (94.8%). By combining the optimal models and the SHAP approach, possible protein biomarkers and their expressions were obtained for classification. Finally, EWRS1 protein was found to be common in three groups (age, myometrial invasion, tumor size). This article's findings indicate that models have been developed that can accurately classify factors including age, tumor grade, and myometrial invasion all of which are critical for determining the prognosis of endometrial cancer as well as potential protein biomarkers associated with these factors. Furthermore, we were able to provide an analysis of how the quantities of the proteins suggested as biomarkers varied throughout the classes by combining the SHAP values with these ideal models.
2024
biomarker
endometrium cancer
explainable artificial intelligence
machine learning
proteomic
File in questo prodotto:
File Dimensione Formato  
fmolb-11-1389325.pdf

accesso aperto

Descrizione: CC BY 4.0 publisher version
Tipologia: Versione dell'editore
Licenza: Creative commons
Dimensione 1.52 MB
Formato Adobe PDF
1.52 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/1130530
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact