K-Random Forests: a K-means style algorithm for Random Forest clustering

Bicego, Manuele
2019-01-01

Abstract

In this paper we present a novel clustering approach based on Random Forests, a popular classification and regression technique whose use for clustering has been investigated to a lesser extent. In the clustering context, the most widely used class of approaches exploits a single Random Forest to derive a proximity measure between points, which can then be used with any distance-based clustering technique. In contrast, our scheme exploits a set of Random Forests, each one devoted to modelling one cluster, in a spirit similar to that of the mixture models approach. These Random Forests, which provide flexible cluster descriptors, are iteratively updated using a K-means-like clustering algorithm. The proposed scheme, which we call K-Random Forests (K-RF), has been evaluated on five datasets: the obtained results suggest that it represents a valid alternative to classic Random Forest clustering algorithms as well as to other established clustering approaches.
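The alternating scheme described in the abstract (fit one forest per cluster, reassign each point to the best-matching forest, repeat) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how each forest is trained or how membership is scored, so the sketch assumes isolation-style random trees and uses the average path length as a hypothetical membership score (a longer path is read as "more typical of that cluster"). All names (`build_tree`, `fit_forest`, `forest_score`, `k_forests`) and parameters (`n_trees`, `sample_size`, `max_depth`) are illustrative assumptions.

```python
import math
import random


def avg_path_length(n):
    """Expected path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, rng, depth=0, max_depth=8):
    """Grow one random tree (assumed isolation-style) over feature tuples."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on this feature: stop here
        return {"size": len(points)}
    cut = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return {
        "dim": dim,
        "cut": cut,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }


def path_length(tree, p):
    """Depth at which p lands, plus a correction for unsplit leaves."""
    depth = 0
    while "dim" in tree:
        tree = tree["left"] if p[tree["dim"]] < tree["cut"] else tree["right"]
        depth += 1
    return depth + avg_path_length(tree["size"])


def fit_forest(members, rng, n_trees=25, sample_size=16):
    """One forest per cluster; fixed-size resampling keeps scores comparable."""
    return [build_tree(rng.choices(members, k=sample_size), rng)
            for _ in range(n_trees)]


def forest_score(forest, p):
    """Hypothetical membership score: mean path length across the forest."""
    return sum(path_length(t, p) for t in forest) / len(forest)


def k_forests(data, k, n_iter=10, seed=0):
    """K-means-like loop: refit the k forests, then reassign every point."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in data]
    for _ in range(n_iter):
        forests = []
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if not members:  # empty cluster: reseed from random points
                members = rng.sample(data, min(len(data), 16))
            forests.append(fit_forest(members, rng))
        new_labels = [max(range(k), key=lambda c: forest_score(forests[c], p))
                      for p in data]
        if new_labels == labels:  # assignments stable: stop early
            break
        labels = new_labels
    return labels
```

Calling `k_forests(data, k=2)` on a list of equal-length numeric tuples returns one integer label per point. The sketch only mirrors the structure of a K-means loop with forests in place of centroids; clustering quality depends heavily on the assumed scoring rule and on the random initialisation.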
Publication year: 2019
ISBN: 978-1-7281-1985-4
Keywords: pattern recognition, clustering, random forests

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1017202