tions. The CT-PFMI algorithm was also applied to real
stock price data to show the sparse PFMI effect between
pairs of real-world time series, and it was demonstrated
how the CT-PFMI algorithm can be used for in-depth
analyses of interactions between time series.
ACKNOWLEDGEMENTS
This research was funded by the Koret Foundation
grant for Smart Cities and Digital Living 2030.
Past-future Mutual Information Estimation in Sparse Information Conditions