Biclustering refers to the problem of simultaneously clustering the rows and columns of a given data matrix, with the goal of obtaining submatrices where the selected rows present a coherent behaviour in the selected columns, and vice-versa. To face this intrinsically difficult problem, we propose a novel generative model, where biclustering is approached from a sparse low-rank matrix factorization perspective. The main idea is to design a probabilistic model describing the factorization of a given data matrix in two other matrices, from which information about rows and columns belonging to the sought for biclusters can be obtained. One crucial ingredient in the proposed model is the use of a spike and slab sparsity inducing prior, thus we term the approach spike and slab biclustering (SSBi). To estimate the parameters of the SSBi model, we propose an expectation-maximization (EM) algorithm, termed SSBiEM, which solves a low-rank factorization problem at each iteration, using a recently proposed augmented Lagrangian algorithm. Experiments with both synthetic and real data show that the SSBi approach compares favorably with the state-of-the-art. (C) 2017 Elsevier Ltd. All rights reserved.

Spike and slab biclustering

Denitto, M.;Bicego, M.;Farinelli, A.;
2017-01-01

Abstract

Biclustering refers to the problem of simultaneously clustering the rows and columns of a given data matrix, with the goal of obtaining submatrices where the selected rows present a coherent behaviour in the selected columns, and vice-versa. To face this intrinsically difficult problem, we propose a novel generative model, where biclustering is approached from a sparse low-rank matrix factorization perspective. The main idea is to design a probabilistic model describing the factorization of a given data matrix in two other matrices, from which information about rows and columns belonging to the sought for biclusters can be obtained. One crucial ingredient in the proposed model is the use of a spike and slab sparsity inducing prior, thus we term the approach spike and slab biclustering (SSBi). To estimate the parameters of the SSBi model, we propose an expectation-maximization (EM) algorithm, termed SSBiEM, which solves a low-rank factorization problem at each iteration, using a recently proposed augmented Lagrangian algorithm. Experiments with both synthetic and real data show that the SSBi approach compares favorably with the state-of-the-art. (C) 2017 Elsevier Ltd. All rights reserved.
2017
Biclustering; Spike and slab; Probabilistic graphical models; Expectation-maximization
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/974269
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 7
  • ???jsp.display-item.citation.isi??? 6
social impact