K-Random Forests: a K-means style algorithm for Random Forest clustering
Bicego, Manuele
2019-01-01
Abstract
In this paper we present a novel clustering approach based on Random Forests, a popular classification and regression technique whose usability in the clustering scenario has been investigated to a lesser extent. In the clustering context, the most widely used class of approaches exploits a single Random Forest to derive a proximity measure between points, which can then be used with any distance-based clustering technique. In contrast, our scheme exploits a set of Random Forests, each one devoted to modelling one cluster, in a spirit similar to that of the mixture models approach. These Random Forests, which provide flexible cluster descriptors, are iteratively updated using a K-means-like clustering algorithm. The proposed scheme, which we call K-Random Forests (K-RF), has been evaluated on five datasets: the obtained results suggest that it represents a valid alternative to classic Random Forest clustering algorithms as well as to other established clustering approaches.