Ten quick tips for biomarker discovery and validation analyses using machine learning

Giugno, Rosalba
2022-01-01

Abstract

High-throughput experimental methods for biosample profiling and growing collections of clinical and health record data provide ample opportunities for biomarker discovery and medical decision support. However, many of the new data types, including single-cell omics and high-resolution cellular imaging data, also pose particular challenges for data analysis. High dimensionality of the data relative to the small number of available samples (often referred to as the p >> n problem), additive and multiplicative noise, large numbers of uninformative or redundant data features, outliers, confounding factors and imbalanced sample group sizes are all common characteristics of current biomedical data collections. While first successes have been achieved in developing clinical decision support tools using multifactorial omics data, e.g., resulting in FDA-approved omics-based biomarker signatures for common cancer indications [1], there is still an unmet need and great potential for earlier, more accurate and robust diagnostic and prognostic tools for many complex diseases.
biomarker discovery, machine learning

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1080775