Authors:
Victor Ionescu; Rodica Potolea and Mihaela Dinsoreanu
Affiliation:
Technical University of Cluj-Napoca, Romania
Keyword(s):
Time Series, Similarity Search, Structural Similarity, Linear Approximation, Data Adaptive.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Business Analytics; Clustering and Classification Methods; Data Analytics; Data Engineering; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Pre-Processing and Post-Processing for Data Mining; Symbolic Systems
Abstract:
Much effort has been invested in recent years in the problem of detecting similarity in time series. Most
work focuses on the identification of exact matches through point-by-point comparisons, although in many
real-world problems recurring patterns match each other only approximately. We introduce a new approach
for identifying patterns in time series that evaluates similarity by comparing the overall structure of
candidate sequences instead of focusing on their local shapes, and we propose a new distance
measure, ABC (Area Between Curves), to achieve this goal. The approach is based on a data-driven
linear approximation method that is intuitive, offers a high compression ratio and adapts to the
overall shape of the sequence. The similarity of candidate sequences is quantified by means of the novel
distance measure, applied directly to the linear approximation of the time series. Our evaluations performed
on multiple data sets show that our proposed technique outperforms similarity search approaches based on
the commonly referenced Euclidean Distance in the majority of cases. The most significant improvements
are obtained when applying our method to domains and data sets where matching sequences are indeed
primarily determined based on the similarity of their higher-level structures.
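
The abstract does not give the formal definition of ABC, so the following is only a rough illustration of the general idea and not the authors' method. It assumes that both sequences have already been reduced to piecewise-linear approximations, given as sorted (time, value) breakpoints over an overlapping time range, and that the distance is the plain geometric area enclosed between the two approximations. The function names `interpolate` and `area_between_curves` are hypothetical.

```python
from bisect import bisect_right


def interpolate(points, x):
    """Evaluate a piecewise-linear curve, given as sorted (x, y) breakpoints, at x."""
    xs = [p[0] for p in points]
    i = bisect_right(xs, x) - 1
    i = max(0, min(i, len(points) - 2))  # clamp to a valid segment index
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def area_between_curves(a, b):
    """Area between two piecewise-linear curves over their shared x-range.

    Assumption: this is plain geometric area, used here as a stand-in
    for the ABC measure, whose exact definition is not in the abstract.
    """
    lo = max(a[0][0], b[0][0])
    hi = min(a[-1][0], b[-1][0])
    # Merge the breakpoints of both curves so that the difference
    # between them is linear on every resulting interval.
    xs = sorted({x for x, _ in a + b if lo <= x <= hi} | {lo, hi})
    diffs = [interpolate(a, x) - interpolate(b, x) for x in xs]

    area = 0.0
    for k in range(len(xs) - 1):
        w = xs[k + 1] - xs[k]
        d0, d1 = diffs[k], diffs[k + 1]
        if d0 * d1 >= 0:
            # No crossing inside the interval: area of a trapezoid.
            area += 0.5 * w * (abs(d0) + abs(d1))
        else:
            # The curves cross: sum of the two triangles on either side.
            area += 0.5 * w * (d0 * d0 + d1 * d1) / (abs(d0) + abs(d1))
    return area


# Two approximations that cross in the middle of the range.
rising = [(0, 0), (2, 2)]
falling = [(0, 2), (2, 0)]
print(area_between_curves(rising, falling))
```

Because the computation works on breakpoints rather than on every sample, its cost depends on the number of linear segments and not on the length of the raw series, which is consistent with applying the measure directly to a compressed approximation.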