Authors:
Zhang Longbo; Li Zhanhuai; Yu Min; Wang Yong and Jiang Yun
Affiliation:
School of Computer Science, Northwestern Polytechnical University, China
Keyword(s):
Data stream, Landmark Window, Approximate algorithm, Random Sampling.
Related Ontology Subjects/Areas/Topics:
Databases and Information Systems Integration; Deductive, Active, Temporal and Real-Time Databases; Enterprise Information Systems
Abstract:
In many applications, including sensor networks, telecommunications data management, network monitoring and financial applications, data arrives as a stream. Interest in algorithms over data streams has grown in recent years. This paper introduces the problem of sampling from landmark windows of recent data items in a data stream and presents a random sampling algorithm for this problem. The presented algorithm, called the SMS Algorithm, is a stratified multistage sampling algorithm for landmark windows. It applies a different sampling fraction to each stratum of the landmark window, and it works even when the number of data items in the landmark window varies dramatically over time. Theoretical analysis and experiments show that the algorithm is effective and efficient for continuous data stream processing.
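The abstract does not give the details of the SMS Algorithm, so the following Python sketch is not the authors' method. It only illustrates the general idea of stratified sampling over a landmark window with a different sampling fraction per stratum. Everything in it is an assumption made for illustration: the class name `LandmarkStratifiedSampler`, fixed-size strata, the `newest_fraction` and `decay` parameters, and the rule that older strata are thinned geometrically.

```python
import random


class LandmarkStratifiedSampler:
    """Illustrative sketch only; NOT the SMS Algorithm from the paper.

    Items arriving since the landmark are cut into fixed-size strata
    (an assumption). Each closed stratum keeps a uniform random sample
    whose size shrinks with the stratum's age.
    """

    def __init__(self, stratum_size, newest_fraction, decay, seed=None):
        self.stratum_size = stratum_size        # items per stratum (hypothetical)
        self.newest_fraction = newest_fraction  # sampling fraction of the newest stratum
        self.decay = decay                      # fraction multiplier per step of age
        self.rng = random.Random(seed)
        self.strata = []                        # samples of closed strata, oldest first
        self.current = []                       # buffer for the stratum being filled

    def _target(self, age):
        # Sample size for a stratum that is `age` strata old (at least 1 item).
        return max(1, round(self.stratum_size * self.newest_fraction * self.decay ** age))

    def add(self, item):
        self.current.append(item)
        if len(self.current) == self.stratum_size:
            self._close_stratum()

    def _close_stratum(self):
        # Draw a uniform sample without replacement from the full stratum.
        self.strata.append(self.rng.sample(self.current, self._target(0)))
        self.current = []
        # Thin older samples; a uniform subsample of a uniform sample
        # is still a uniform sample of the original stratum.
        newest = len(self.strata) - 1
        for i, kept in enumerate(self.strata):
            target = self._target(newest - i)
            if len(kept) > target:
                self.strata[i] = self.rng.sample(kept, target)

    def reset_landmark(self):
        # Start a new landmark window: discard everything seen so far.
        self.strata = []
        self.current = []

    def sample(self):
        # Flattened sample over all closed strata, oldest first.
        return [x for kept in self.strata for x in kept]


sampler = LandmarkStratifiedSampler(stratum_size=100, newest_fraction=0.2, decay=0.5, seed=1)
for value in range(300):
    sampler.add(value)
sizes = [len(kept) for kept in sampler.strata]
```

With these hypothetical settings, three strata are closed after 300 items and the retained sample sizes are 5, 10 and 20 from oldest to newest. The sketch buffers the open stratum in memory for simplicity; a per-stratum reservoir would avoid that buffer, and items in the open stratum are not reported by `sample()` until the stratum closes.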