Figure 3: Effect of the support threshold on τ.
Figure 4: Effect of the number of clusters on r.
5 CONCLUSION
The proposed ISsC is an incremental algorithm that
discovers clusters in multiple data streams. By
exploiting the monotonicity property, ISsC reduces
the number of subsequences it processes: it excludes
non-frequent subsets (that is, non-frequent
subsequences), which do not contribute to finding
the clusters of subsequences. By applying a decay
factor to subsequences, ISsC can remove older,
uninteresting subsequences. In our experiments, the
total purity of the clustering improved as the
cluster density increased. Moreover, using the
monotonicity property improved performance compared
with not using it.
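The two ideas summarized above, pruning by monotonicity and fading by a decay factor, can be illustrated with a minimal sketch. This is not the ISsC implementation: it uses generic itemsets as a stand-in for subsequences, and the names frequent_itemsets, decayed_weight, min_support, and decay are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search using monotonicity (downward closure):
    a set can be frequent only if all of its subsets are frequent.
    Illustrative stand-in; not the ISsC algorithm itself."""
    def support(itemset):
        # Number of transactions that contain every item of the set.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]
    frequent = list(level)
    k = 2
    while level:
        prev = set(level)
        # Join frequent (k-1)-sets into k-set candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop a candidate if any (k-1)-subset is not frequent,
        # so its support never has to be counted.
        candidates = [c for c in candidates
                      if all(frozenset(s) in prev
                             for s in combinations(c, k - 1))]
        level = [c for c in candidates if support(c) >= min_support]
        frequent.extend(level)
        k += 1
    return frequent


def decayed_weight(weight, age, decay=0.9):
    """Exponentially fade a weight with its age (assumed decay model)."""
    return weight * decay ** age
```

In this sketch, a candidate whose subsets are not all frequent is discarded before its support is counted, which is where the saving in processed candidates comes from. An item whose decayed weight falls below a chosen cutoff could then be dropped as outdated; the exponential form used here is an assumption, since the decay function is not restated in this section.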
[Plot for Figure 3: arrival rate (ms), 0 to 25, versus support threshold, 2 to 6; series: with MP, without MP.]
[Plot for Figure 4: arrival rate (ms), 0 to 25, versus number of clusters, 2 to 10; series: with MP, without MP.]