Expression microarray classification using topic models

BICEGO, Manuele;LOVATO, PIETRO;OLIBONI, Barbara;PERINA, Alessandro
2010-01-01

Abstract

Classifying samples in expression microarray experiments is a crucial task in bioinformatics and biomedicine. This paper addresses the task with a class of statistical approaches called topic models. These models, first introduced in the text mining community, extract from a set of objects (typically documents) a rich and interpretable description based on an intermediate representation called topics (or processes). The paper casts expression microarray classification into this probabilistic framework, drawing a parallel with the text mining domain and providing an interpretation. Two topic models are investigated: Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA). An experimental evaluation on three standard datasets confirms the effectiveness of the proposed methodologies, also in comparison with other classification methodologies.
Microarray; Classification; Topic models
Use this identifier to cite or link to this document: https://hdl.handle.net/11562/343053