This paper presents TSRF-Dist, a novel distance between time series based on Random Forests (RFs). We extend to the time-series domain concepts and tools of RF distances, a recent class of robust data-dependent distances defined for vectorial representations, thus proposing the firstRF distance for time series. The distance is determined by (i) creating an RF to model a set of time series, and (ii) exploiting the trained RF to quantify the similarity between time series. As for the first step, we introduce in this paper the Extremely Randomized Canonical Interval Forest (ERCIF), a novel extension of Canonical Interval Forests that can model time series and can be trained without labels. We then exploit three different schemes, following ideas already employed in the vectorial case. The proposed distance, in different variants, has been thoroughly evaluated with 128 datasets from the UCR Time Series archive, showing promising results compared with literature alternatives.
TSRF-Dist: a novel time series distance based on extremely randomized canonical interval forests
Azzari, Alberto;Bicego, Manuele;Combi, Carlo;Cracco, Andrea;Sala, Pietro
2025-01-01
Abstract
This paper presents TSRF-Dist, a novel distance between time series based on Random Forests (RFs). We extend to the time-series domain concepts and tools of RF distances, a recent class of robust data-dependent distances defined for vectorial representations, thus proposing the firstRF distance for time series. The distance is determined by (i) creating an RF to model a set of time series, and (ii) exploiting the trained RF to quantify the similarity between time series. As for the first step, we introduce in this paper the Extremely Randomized Canonical Interval Forest (ERCIF), a novel extension of Canonical Interval Forests that can model time series and can be trained without labels. We then exploit three different schemes, following ideas already employed in the vectorial case. The proposed distance, in different variants, has been thoroughly evaluated with 128 datasets from the UCR Time Series archive, showing promising results compared with literature alternatives.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.