A New Kernel for Outlier Detection in WSNs Minimizing MISE
Rohit Jain, C. P. Gupta, Seema Sharma
2015
Abstract
In a sensor network, data generated by sensors deployed at different locations must be analyzed to identify interesting events, which correspond to outliers. The presence of outliers may also distort the information the data contain. To ensure that this information is extracted correctly, the outliers must be identified and isolated during the knowledge extraction phase. In this paper, we propose a novel unsupervised, density-based algorithm for detecting outliers that couples two steps: first, kernel density estimation, and second, assigning an outlier score to each object. A new kernel function that yields a smoother density estimate is proposed. The outlier score of each object is obtained by comparing its local density estimate with those of its neighbors. Together, the two steps provide a framework for outlier detection that can be readily applied to discover new or unusual types of outliers. Experiments on synthetic and real datasets suggest that the proposed approach detects outliers with high precision and high recall, which indicates the effectiveness of the approach.
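The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the proposed kernel, so a standard Gaussian kernel is used as a stand-in, and the function names, the neighborhood size `k`, the bandwidth `h`, and the ratio-based score are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(u, d):
    """Standard d-dimensional Gaussian kernel of a scaled distance u.

    Assumption: this is a stand-in, NOT the new kernel proposed in the paper.
    """
    return np.exp(-0.5 * u ** 2) / (2.0 * np.pi) ** (d / 2.0)


def local_density(points, k, h):
    """Step 1: kernel density estimate of each point over its k nearest neighbors."""
    pts = np.asarray(points, dtype=float)
    if pts.ndim == 1:
        pts = pts[:, None]  # treat a flat list as n one-dimensional readings
    n, d = pts.shape
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nn_idx = np.argsort(dist, axis=1)[:, :k]
    nn_dist = np.take_along_axis(dist, nn_idx, axis=1)
    dens = gaussian_kernel(nn_dist / h, d).sum(axis=1) / (k * h ** d)
    return dens, nn_idx


def outlier_scores(points, k=3, h=1.0):
    """Step 2: score = mean density of the neighbors / own density.

    Scores near 1 suggest an inlier; large scores suggest an outlier.
    The ratio form is a hypothetical choice, not taken from the paper.
    """
    dens, nn_idx = local_density(points, k, h)
    neighbor_mean = dens[nn_idx].mean(axis=1)
    return neighbor_mean / (dens + 1e-12)  # small constant avoids division by zero


if __name__ == "__main__":
    readings = [1.0, 1.1, 0.9, 1.05, 0.95, 8.0]  # made-up sensor readings
    scores = outlier_scores(readings, k=3, h=1.0)
    print("most outlying index:", int(np.argmax(scores)))
```

In this sketch the isolated reading receives by far the largest score, because its own density is much lower than that of the points it is compared against. Both `k` and `h` strongly affect the result and would need to be tuned for real data.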
Paper Citation
in Harvard Style
Jain R., Gupta C. and Sharma S. (2015). A New Kernel for Outlier Detection in WSNs Minimizing MISE. In Proceedings of the 4th International Conference on Sensor Networks - Volume 1: SENSORNETS, ISBN 978-989-758-086-4, pages 169-175. DOI: 10.5220/0005318401690175
in Bibtex Style
@conference{sensornets15,
author={Rohit Jain and C. P. Gupta and Seema Sharma},
title={A New Kernel for Outlier Detection in WSNs Minimizing MISE},
booktitle={Proceedings of the 4th International Conference on Sensor Networks - Volume 1: SENSORNETS},
year={2015},
pages={169-175},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005318401690175},
isbn={978-989-758-086-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 4th International Conference on Sensor Networks - Volume 1: SENSORNETS
TI - A New Kernel for Outlier Detection in WSNs Minimizing MISE
SN - 978-989-758-086-4
AU - Jain R.
AU - Gupta C.
AU - Sharma S.
PY - 2015
SP - 169
EP - 175
DO - 10.5220/0005318401690175