A Wireless Data Stream Mining Model

Mohamed Medhat Gaber, Shonali Krishnaswamy, Arkady Zaslavsky

2004

Abstract

Sensor networks, web click streams, and astronomical applications generate continuous data streams. Such streams are most likely to be generated in a wireless environment. They challenge our ability to store and process data in real time with the limited computing capabilities of that environment. Querying and mining data streams have attracted attention over the past two years. The main idea behind the techniques proposed for mining data streams is to develop efficient approximate algorithms with acceptable accuracy. Recently, we proposed algorithm output granularity as an approach to mining data streams. This approach has the advantage of being resource-aware in addition to being general. In this paper, we propose a model for mining data streams in a wireless environment. The model makes two novel contributions: a ubiquitous data mining system architecture and the algorithm output granularity approach to mining data streams.
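
The abstract does not spell out how a resource-aware, approximate stream algorithm behaves, so the following is only an illustrative sketch and not the authors' method. It assumes that "output granularity" can be read as the number of clusters a one-pass clusterer keeps, and that a fixed cluster budget stands in for the memory of a wireless device. The names `ThresholdClusterer`, `max_clusters`, and `growth` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    center: float
    weight: int = 1

    def absorb(self, x: float) -> None:
        # Incremental mean update: no past items need to be stored.
        self.weight += 1
        self.center += (x - self.center) / self.weight


@dataclass
class ThresholdClusterer:
    threshold: float      # distance within which an item joins a cluster
    max_clusters: int     # hypothetical stand-in for a memory budget
    growth: float = 2.0   # hypothetical factor used to coarsen the output
    clusters: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # These guards ensure the coarsening loop in add() terminates.
        if self.threshold <= 0 or self.growth <= 1 or self.max_clusters < 1:
            raise ValueError("need threshold > 0, growth > 1, max_clusters >= 1")

    def add(self, x: float) -> None:
        # Single pass: each item is examined once and then discarded.
        nearest = min(self.clusters, key=lambda c: abs(c.center - x), default=None)
        if nearest is not None and abs(nearest.center - x) <= self.threshold:
            nearest.absorb(x)
        else:
            self.clusters.append(Cluster(center=x))
        # Over budget: trade accuracy for space by widening the threshold
        # and merging neighbouring clusters until the output fits again.
        while len(self.clusters) > self.max_clusters:
            self.threshold *= self.growth
            self._merge()

    def _merge(self) -> None:
        merged = []
        for c in sorted(self.clusters, key=lambda c: c.center):
            if merged and abs(merged[-1].center - c.center) <= self.threshold:
                last = merged[-1]
                total = last.weight + c.weight
                # Weighted mean of the two centers.
                last.center = (last.center * last.weight + c.center * c.weight) / total
                last.weight = total
            else:
                merged.append(c)
        self.clusters = merged


if __name__ == "__main__":
    clusterer = ThresholdClusterer(threshold=0.5, max_clusters=3)
    for value in [1.0, 1.1, 5.0, 5.2, 9.0, 9.1, 13.0]:
        clusterer.add(value)
    for c in clusterer.clusters:
        print(f"center={c.center:.3f} weight={c.weight}")
```

The design choice in this sketch is to adapt the output rather than drop input: when the budget is exceeded, the clusterer produces a coarser summary instead of failing or sampling the stream. A real deployment would presumably drive the adjustment from measured memory and data rate, which this sketch does not model.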


Paper Citation


in Harvard Style

Medhat Gaber M., Krishnaswamy S. and Zaslavsky A. (2004). A Wireless Data Stream Mining Model. In Proceedings of the 3rd International Workshop on Wireless Information Systems - Volume 1: WIS, (ICEIS 2004) ISBN 972-8865-02-3, pages 152-160. DOI: 10.5220/0002676301520160


in Bibtex Style

@conference{wis04,
author={Mohamed Medhat Gaber and Shonali Krishnaswamy and Arkady Zaslavsky},
title={A Wireless Data Stream Mining Model},
booktitle={Proceedings of the 3rd International Workshop on Wireless Information Systems - Volume 1: WIS, (ICEIS 2004)},
year={2004},
pages={152-160},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002676301520160},
isbn={972-8865-02-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Workshop on Wireless Information Systems - Volume 1: WIS, (ICEIS 2004)
TI - A Wireless Data Stream Mining Model
SN - 972-8865-02-3
AU - Medhat Gaber M.
AU - Krishnaswamy S.
AU - Zaslavsky A.
PY - 2004
SP - 152
EP - 160
DO - 10.5220/0002676301520160