# ANALYSIS OF THE ITERATED PROBABILISTIC WEIGHTED K NEAREST NEIGHBOR METHOD, A NEW DISTANCE-BASED ALGORITHM

### J. M. Martínez-Otzeta, B. Sierra

#### Abstract

The k-Nearest Neighbor (k-NN) classification method assigns to an unclassified point the class of the nearest of a set of previously classified points. A problem that arises when applying this technique is that each labeled sample is given equal importance in deciding the class membership of the pattern to be classified, regardless of the typicalness of each neighbor. We report on the application of a new hybrid version, the Iterated Probabilistic Weighted k Nearest Neighbor algorithm (IPW-k-NN), which classifies new cases based on the probability distribution each case has of belonging to each class. These probabilities are computed for each case in the training database according to its k nearest neighbors in that database; this is a new way to measure the typicalness of a given case with regard to every class. Experiments have been carried out on well-known databases from the UCI Machine Learning Repository, using 10-fold cross-validation to validate the results obtained on each of them. Three different distances (Euclidean, Canberra and Chebyshev) are used in the comparison.
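The abstract does not spell out the iteration scheme or the exact weighting rule, so the following is only a minimal sketch of the two ideas it does describe: estimating, for each training case, a class probability distribution from its k nearest neighbors, and classifying a new case from the distributions of its own neighbors. All function names (`class_probabilities`, `classify`, `k_nearest`) are hypothetical, and summing the neighbors' probability vectors is an assumed combination rule, not necessarily the one used in IPW-k-NN. The three distances are the standard Euclidean, Canberra and Chebyshev definitions.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canberra(a, b):
    # Terms whose denominator is zero are skipped (treated as 0),
    # which is a common convention for this distance.
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def k_nearest(query, points, k, dist, skip=None):
    """Indices of the k points closest to `query`; `skip` excludes one index."""
    candidates = [i for i in range(len(points)) if i != skip]
    candidates.sort(key=lambda i: dist(query, points[i]))
    return candidates[:k]


def class_probabilities(points, labels, k, dist):
    """Assumed typicalness estimate: for each training case, the class
    frequencies among its k nearest *other* training cases."""
    classes = sorted(set(labels))
    probs = []
    for i, p in enumerate(points):
        neighbours = k_nearest(p, points, k, dist, skip=i)
        counts = Counter(labels[j] for j in neighbours)
        probs.append({c: counts[c] / len(neighbours) for c in classes})
    return probs


def classify(query, points, probs, k, dist):
    """Assumed decision rule: sum the probability vectors of the k nearest
    training cases and return the class with the largest total."""
    totals = Counter()
    for j in k_nearest(query, points, k, dist):
        for c, p in probs[j].items():
            totals[c] += p
    # Iterating over sorted class names makes tie-breaking deterministic.
    return max(sorted(totals), key=lambda c: totals[c])
```

A typical call would first compute `probs = class_probabilities(points, labels, k, euclidean)` once on the training set, then call `classify(query, points, probs, k, euclidean)` for each new case; swapping in `canberra` or `chebyshev` changes only the `dist` argument.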

#### Paper Citation

#### in Harvard Style

Martínez-Otzeta, J. M. and Sierra, B. (2004). **ANALYSIS OF THE ITERATED PROBABILISTIC WEIGHTED K NEAREST NEIGHBOR METHOD, A NEW DISTANCE-BASED ALGORITHM**. In *Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS,* ISBN 972-8865-00-7, pages 233-240. DOI: 10.5220/0002605402330240

#### in Bibtex Style

@conference{iceis04,

author={J. M. Martínez-Otzeta and B. Sierra},

title={ANALYSIS OF THE ITERATED PROBABILISTIC WEIGHTED K NEAREST NEIGHBOR METHOD, A NEW DISTANCE-BASED ALGORITHM},

booktitle={Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS,},

year={2004},

pages={233-240},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0002605402330240},

isbn={972-8865-00-7},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS,

TI - ANALYSIS OF THE ITERATED PROBABILISTIC WEIGHTED K NEAREST NEIGHBOR METHOD, A NEW DISTANCE-BASED ALGORITHM

SN - 972-8865-00-7

AU - Martínez-Otzeta, J. M.

AU - Sierra, B.

PY - 2004

SP - 233

EP - 240

DO - 10.5220/0002605402330240