changes of the subject's body position near the start
of the “lying” activity. This results in a lower-than-expected
number of matches near the start, causing
the transition to be guessed too early. Note that both
techniques have trouble with subject 1.
Table 3 lists the results for the transition from
“sitting” to “standing”. Again, we see similar or
improved results when applying noise elimination,
except for 3 subjects using the first sensor series.
Though results are worse than in Table 2, the
gain from using noise elimination is more significant.
When using a single sensor, results on average im-
prove from 21.12 to 15.35, corresponding to a gain of
about 27 seconds. If segmentation uses all 3 sensors,
results change from 16.9 to 10.3 on average, a gain of
around 31 seconds.
6 CONCLUSION
In this paper we discussed techniques and applica-
tions related to the Matrix Profile. Our contribution
consists of a method to remove the effects of Gaussian
noise on the time series when calculating the Matrix
Profile, without affecting the complexity of the un-
derlying algorithm. This method is based on a sta-
tistical analysis of the effects of z-normalized noisy
sequences on the Euclidean distance. The only re-
quirement for this technique is to know the variance
of the noise, which is an intuitive measure and can
be easily estimated by manually extracting a flat but
noisy segment from the time series.
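The estimation step described above can be sketched as follows. This is a minimal illustration, not the implementation used in this paper: the helper name `estimate_noise_variance`, the segment bounds, and the synthetic series are all assumptions introduced for the example. It assumes the chosen segment is flat, so that any variation within it can be attributed to noise.

```python
import numpy as np


def estimate_noise_variance(series, start, end):
    """Estimate the noise variance from a hand-picked flat segment.

    Hypothetical helper: `start` and `end` delimit a segment [start, end)
    that the user has visually judged to be flat apart from noise.
    """
    segment = np.asarray(series[start:end], dtype=float)
    if segment.size < 2:
        raise ValueError("segment must contain at least two samples")
    # Unbiased sample variance; the signal itself is assumed constant here,
    # so the spread of the segment reflects the noise only.
    return float(np.var(segment, ddof=1))


# Synthetic series (assumption): a flat part followed by a sine wave,
# with Gaussian noise of standard deviation 0.2 (variance 0.04) added.
rng = np.random.default_rng(0)
flat_part = np.full(500, 3.0)
active_part = 3.0 + np.sin(np.linspace(0.0, 20.0 * np.pi, 500))
series = np.concatenate([flat_part, active_part]) + rng.normal(0.0, 0.2, 1000)

# The first 500 samples play the role of the manually extracted segment.
noise_var = estimate_noise_variance(series, 0, 500)
print(f"estimated noise variance: {noise_var:.4f}")
```

On real data the segment would be selected by inspecting the series. A longer flat segment gives a more stable estimate, while a segment that still contains signal will overestimate the noise.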
As the Matrix Profile is widely usable for a va-
riety of problems and across various domains, so
is our technique. In this paper, we showed gains
for anomaly detection and time series segmentation.
Both cases were evaluated on public datasets containing
real-world data and showed an improvement of
the results. On the Numenta Anomaly Benchmark,
we were able to retrieve more anomalies with fewer
guesses, saving an operator valuable time. On the
PAMAP2 dataset, we were able to more accurately
predict transitions between passive activities. We focused
on datasets containing flat and noisy segments,
a subject that has not yet been tackled in other
Matrix Profile related literature.
Further work can be done on using a dynamic
value for the noise estimation, for series where noise
does not originate from measurement, but rather from
the underlying process.
ACKNOWLEDGEMENTS
This work has been carried out in the framework
of the Z-BRE4K project, which received funding
from the European Union’s Horizon 2020 research
and innovation programme under grant agreement no.
768869.
REFERENCES
Dau, H. A. and Keogh, E. (2017). Matrix Profile V: A
Generic Technique to Incorporate Domain Knowledge
into Motif Discovery. In Proceedings of the 23rd
ACM SIGKDD International Conference on Knowl-
edge Discovery and Data Mining - KDD ’17, pages
125–134, New York, New York, USA. ACM Press.
Gharghabi, S., Ding, Y., Yeh, C.-C. M., Kamgar, K.,
Ulanova, L., and Keogh, E. (2017). Matrix Pro-
file VIII: Domain Agnostic Online Semantic Seg-
mentation at Superhuman Performance Levels. In
2017 IEEE International Conference on Data Mining
(ICDM), pages 117–126. IEEE.
Keogh, E. and Kasetty, S. (2002). On the need for time
series data mining benchmarks. Proceedings of the
eighth ACM SIGKDD international conference on
Knowledge discovery and data mining - KDD ’02,
page 102.
Lavin, A. and Ahmad, S. (2015). Evaluating Real-
Time Anomaly Detection Algorithms – The Numenta
Anomaly Benchmark. In 2015 IEEE 14th Interna-
tional Conference on Machine Learning and Applica-
tions (ICMLA), pages 38–44. IEEE.
Mueen, A., Keogh, E., Zhu, Q., Cash, S., and Westover, B.
(2009). Exact Discovery of Time Series Motifs. In
Proceedings of the 2009 SIAM International Confer-
ence on Data Mining.
Mueen, A., Viswanathan, K., Gupta, C., and Keogh,
E. (2015). The fastest similarity search algorithm
for time series subsequences under Euclidean distance.
URL: www.cs.unm.edu/~mueen/FastestSimilaritySearch.html
(accessed 24 May, 2016).
Papadimitriou, S. and Faloutsos, C. (2005). Streaming Pat-
tern Discovery in Multiple Time-Series. International
Conference on Very Large Data Bases (VLDB), pages
697–708.
Reiss, A. and Stricker, D. (2012). Introducing a new bench-
marked dataset for activity monitoring. In Proceed-
ings - International Symposium on Wearable Comput-
ers, ISWC, pages 108–109.
Vandewiele, G., Colpaert, P., Janssens, O., Van Herwe-
gen, J., Verborgh, R., Mannens, E., Ongenae, F., and
De Turck, F. (2017). Predicting train occupancies
based on query logs and external data sources. In Pro-
ceedings of the 7th International Workshop on Loca-
tion and the Web.
ICPRAM 2019 - 8th International Conference on Pattern Recognition Applications and Methods