Precision Enhanced Bioactivity Prediction of Tyrosine Kinase Inhibitors by Integrating Deep Learning and Molecular Fingerprints Towards Cost-Effective and Targeted Cancer Therapy

Ardigò, Luca Paolo
2025-01-01

Abstract

Background and Objective: Dysregulated tyrosine kinase signaling is a central driver of tumorigenesis, metastasis, and therapeutic resistance. Although tyrosine kinase inhibitors (TKIs) have revolutionized targeted cancer treatment, identifying compounds with optimal bioactivity remains a critical bottleneck. This study presents a robust machine learning framework that combines deep artificial neural networks (dANNs), convolutional neural networks (CNNs), and structural molecular fingerprints to accurately predict TKI bioactivity, with the aim of accelerating the preclinical phase of drug development.

Methods: A curated dataset of 28,314 small molecules from the ChEMBL database, targeting 11 tyrosine kinases, was analyzed. Using Morgan fingerprints and physicochemical descriptors (e.g., molecular weight, LogP, hydrogen bonding), ten supervised models, including dANN, SVM, CatBoost, and CNN, were trained and optimized through a randomized hyperparameter search. Model performance was evaluated using F1-score, ROC-AUC, precision-recall curves, and log loss.

Results: SVM achieved the highest F1-score (87.9%) and accuracy (85.1%), while dANNs yielded the lowest log loss (0.25096), indicating superior probabilistic reliability. CatBoost excelled in ROC-AUC and precision-recall metrics. The integration of Morgan fingerprints significantly improved bioactivity prediction across all models by enhancing structural feature recognition.

Conclusions: This work highlights the transformative role of machine learning, particularly dANNs and SVM, in rational drug discovery. By enabling accurate bioactivity prediction, the model pipeline can reduce experimental burden, optimize compound selection, and support personalized cancer treatment design. The proposed framework advances kinase inhibitor screening pipelines and provides a scalable foundation for translational applications in precision oncology. By enabling early identification of bioactive compounds with favorable pharmacological profiles, the results of this study may support more efficient candidate selection for clinical drug development, particularly with regard to cancer therapy and kinase-associated disorders.
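The abstract reports that one model leads on F1-score while another has the lowest log loss. The sketch below illustrates how this can happen: F1 depends only on the thresholded labels, whereas log loss also penalizes poorly calibrated probabilities. It is a minimal illustration in plain Python, not the authors' evaluation code. The toy labels, the two probability vectors, the 0.5 threshold, and the helper names `f1_score`, `log_loss`, and `to_labels` are all assumptions made for this example.

```python
import math


def f1_score(y_true, y_pred):
    """F1 for binary labels (1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


def to_labels(y_prob, threshold=0.5):
    """Threshold predicted probabilities into hard labels (assumed cutoff 0.5)."""
    return [1 if p >= threshold else 0 for p in y_prob]


# Hypothetical toy data: two models that produce identical labels at 0.5
# but differ in how confident their probabilities are.
y_true = [1, 0, 1, 1, 0, 0]
confident = [0.95, 0.05, 0.90, 0.60, 0.10, 0.40]
hesitant = [0.60, 0.40, 0.55, 0.51, 0.45, 0.49]

for name, probs in (("confident", confident), ("hesitant", hesitant)):
    labels = to_labels(probs)
    print(name, "F1:", f1_score(y_true, labels), "log loss:", log_loss(y_true, probs))
```

Both toy models classify every compound correctly at the chosen threshold, so their F1-scores are equal, yet the model with sharper probabilities has the lower log loss. This is why the two metrics are reported side by side rather than treated as interchangeable.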
Keywords: bioactivity modeling; cheminformatics-based drug screening; deep learning; precision oncology; tyrosine kinase inhibitors
Files in this record:
pharmaceuticals-18-00975.pdf (open access; publisher's version, CC BY 4.0, Creative Commons license; Adobe PDF, 1.2 MB)

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1179750