
Figure 10: Seasonal and cyclical effects are self-similar.
Detecting these properties within time series can provide
new insights.
5 CONCLUSION
The presented time series analysis extends the state of
the art by several improvements that have been imple-
mented and tested on real data.
5.1 Contribution
Our main contribution is the extension of the TS-Index
approach to unbalanced trees, which prioritises
similarity over performance-optimised tree structures.
Especially in visual analytics use cases with many
similar elements, the similarity-based tree structure
can benefit analysis performance, because many
identical elements can be quickly excluded when
searching the tree.
The issue of self-similarity is also the subject
of our second suggestion for improvement: self-
similarity may be inherent in the time series or an
indication of an external, application-specific factor.
Detected self-similarity can therefore be either a
desirable finding or unnecessary noise in the analysis. The
difference between the two possibilities is reflected in
the time difference: the shorter the shift between two
similar time series, the less interesting the similarity;
large distances, on the other hand, indicate interesting
patterns, especially seasonal or cyclical patterns. Ap-
propriate filters, such as those we have implemented,
offer the possibility of separation.
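
The separation by temporal shift can be sketched as follows. This is only an illustrative sketch, not the filter implemented in this work: the match representation (start_a, start_b, distance), the function name split_by_shift, and the threshold min_shift are assumptions introduced for the example.

```python
from typing import List, Tuple

# Hypothetical match record: start index of each subsequence and their distance.
Match = Tuple[int, int, float]


def split_by_shift(matches: List[Match], min_shift: int) -> Tuple[List[Match], List[Match]]:
    """Split similarity matches into (trivial, interesting) by temporal shift.

    Matches whose start positions are closer than min_shift are treated as
    noise caused by overlapping or neighbouring windows; matches with a larger
    shift are kept as candidates for seasonal or cyclical patterns.
    """
    trivial: List[Match] = []
    interesting: List[Match] = []
    for start_a, start_b, distance in matches:
        if abs(start_a - start_b) < min_shift:
            trivial.append((start_a, start_b, distance))
        else:
            interesting.append((start_a, start_b, distance))
    return trivial, interesting


# Example: with daily data, a shift of 365 samples hints at a yearly pattern.
example_matches = [(0, 2, 0.10), (0, 365, 0.20), (10, 11, 0.05)]
trivial, interesting = split_by_shift(example_matches, min_shift=30)
```

A suitable value for min_shift depends on the sampling rate and the application; the value 30 above is arbitrary.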
In addition to filtering out unwanted self-
similarities, the efficient and effective handling of
data problems (gaps, outliers, etc.) is a tedious but
important issue. Such cases receive far less attention in
theory than they demand in practice. Here we have
extended the existing algorithms with Not-a-Number
and Out-Of-Range mapping filters. These make it possible
to use even short valid subsequences of a time series
instead of filtering or discarding them. Furthermore, the
subsequent analysis steps do not need to consider spe-
cial cases (NaN, PosInf, . . . ).
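
One way such a mapping step could look is sketched below. It is a simplified illustration under stated assumptions, not the implementation described here: the names map_invalid and valid_runs, the use of None as the single "invalid" marker, and the bounds lower and upper are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def map_invalid(values: Sequence[Optional[float]], lower: float, upper: float) -> List[Optional[float]]:
    """Map NaN, +/-Inf and out-of-range values to one marker (None)."""
    mapped: List[Optional[float]] = []
    for v in values:
        if v is None or math.isnan(v) or math.isinf(v) or not (lower <= v <= upper):
            mapped.append(None)
        else:
            mapped.append(v)
    return mapped


def valid_runs(mapped: Sequence[Optional[float]], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open index ranges (start, end) of valid runs of at least min_len."""
    runs: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, v in enumerate(mapped):
        if v is not None and start is None:
            start = i                      # a valid run begins
        elif v is None and start is not None:
            if i - start >= min_len:       # a valid run ends; keep it if long enough
                runs.append((start, i))
            start = None
    if start is not None and len(mapped) - start >= min_len:
        runs.append((start, len(mapped)))  # run reaching the end of the series
    return runs


raw = [1.0, 2.0, float("nan"), 3.0, 4.0, 5.0, float("inf"), 999.0, 6.0]
mapped = map_invalid(raw, lower=0.0, upper=100.0)
runs = valid_runs(mapped, min_len=2)
```

After the mapping, later steps only need to test for the single marker, and short valid stretches between gaps remain usable instead of the whole series being discarded.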
We have also added imprecision and uncertainty
handling to the existing algorithms. Many time se-
ries (measurements, surveys, etc.) have precision in-
formation (range of variation, measurement precision,
etc.) that usually goes unnoticed. In our implementation,
this information is taken into account throughout and can
optionally be displayed in the visualisations.
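
To illustrate how precision information can be carried through a similarity computation, the following sketch derives lower and upper bounds for the Euclidean distance between two series. It assumes a constant symmetric tolerance per series (tol_a, tol_b) and simple interval arithmetic; the function name distance_bounds and this error model are assumptions for the example and are not taken from the implementation described here.

```python
import math
from typing import Sequence, Tuple


def distance_bounds(a: Sequence[float], b: Sequence[float],
                    tol_a: float, tol_b: float) -> Tuple[float, float]:
    """Bounds of the Euclidean distance when each value is known only
    to within +/- tol_a (series a) or +/- tol_b (series b)."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    slack = tol_a + tol_b                       # worst-case error of one difference
    lo_sq = 0.0
    hi_sq = 0.0
    for x, y in zip(a, b):
        diff = abs(x - y)
        lo = max(0.0, diff - slack)             # intervals may overlap: distance can be 0
        hi = diff + slack
        lo_sq += lo * lo
        hi_sq += hi * hi
    return math.sqrt(lo_sq), math.sqrt(hi_sq)


exact = distance_bounds([0.0, 0.0], [3.0, 4.0], tol_a=0.0, tol_b=0.0)
uncertain = distance_bounds([0.0, 0.0], [3.0, 4.0], tol_a=0.5, tol_b=0.5)
```

The width of the resulting interval indicates how far a reported similarity can be trusted; a visualisation could show it as an error band.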
5.2 Benefit
The lessons learned from this study are particularly
valuable because the effects described are indepen-
dent of the application domain and can occur in many
different contexts.
The examples and the effects that occurred illus-
trate the problem of self-similarity and its ambiguous
nature: self-similarities due to short shifts should be
interpreted and filtered as noise; with increasing tem-
poral distance, self-similarity gains importance and
should be taken into account.
Another important benefit is the realisation that
cluster analysis can also be used for data inspection
and data transformation: similar errors often produce
similar indications, an example of the feedback loop
from knowledge back into data preparation.
ACKNOWLEDGMENT
The work was partially funded by the Austrian Re-
search Promotion Agency (FFG) within the frame-
work of the flagship project ICT of the Future
PRESENT, grant FO999899544.
REFERENCES
Aghabozorgi, S., Shirkhorshidi, A. S., and Wah, T. Y.
(2015). Time-series clustering – A decade review. In-
formation Systems, 53:16–38.
Bhatt, U., Antorán, J., Zhang, Y., Liao, Q. V., Sattigeri, P.,
Fogliato, R., Melançon, G., Krishnan, R., Stanley, J.,
Tickoo, O., Nachman, L., Chunara, R., Srikumar, M.,
Weller, A., and Xiang, A. (2021). Uncertainty as a
Form of Transparency: Measuring, Communicating,
and Using Uncertainty. AAAI/ACM Conference on AI,
Ethics, and Society, 4:401–413.
Chatzigeorgakidis, G., Skoutas, D., Patroumpas, K., Pal-
panas, T., Athanasiou, S., and Skiadopoulos, S.
(2021). Twin Subsequence Search in Time Series. In-
ternational Conference on Extending Database Tech-
nology (EDBT), 24:475–480.
Scale and Time Independent Clustering of Time Series Data