Authors:
Wagner Francisco Castilho
1
;
Hércules Antônio do Prado
2
and
Marcelo Ladeira
3
Affiliations:
1
Federal Savings Bank and Catholic University of Brasília, Brazil
;
2
Brazilian Enterprise for Agricultural Research and Catholic University of Brasília, Brazil
;
3
University of Brasília, Brazil
Keyword(s):
Knowledge Discovery in Databases (KDD), Clustering Analysis, Dactyloscopy.
Related
Ontology
Subjects/Areas/Topics:
Artificial Intelligence
;
Biomedical Engineering
;
Business Analytics
;
Data Engineering
;
Data Mining
;
Databases and Information Systems Integration
;
Datamining
;
Enterprise Information Systems
;
Health Information Systems
;
Sensor Networks
;
Signal Processing
;
Soft Computing
Abstract:
Knowledge Discovery in Databases (KDD) is the process by which unknown and useful knowledge and information are extracted, by automatic or semi-automatic methods, from large amounts of data. Along the evolution of Information Technology and the rapid growth in the number and size of databases, the development of methodologies, techniques, and tools for data mining has become a major concern for researchers, and has led, in turn, to the development of applications in a variety of areas of human activity. About 1997, the processes and techniques associated with cluster analysis had begun to be researched with increasing intensity by the KDD community. Within the context of a model intended to support decisions based on cluster analysis, prior knowledge about the data structure and the application domain can be used as important constraints that lead to better results in the clusters’ configurations. This paper presents an application of cluster analysis in the area of public safety usi
ng a schema that takes into account the burden of prior knowledge acquired from statistical analysis on the data. Such an information was used as a bias for the k-means algorithm that was applied to identify the dactyloscopic (fingerprint) profile of criminals in the Brazilian capital, also known as Federal District. These results was then compared with a similar analysis that disregarded the prior knowledge. It is possible to observe that the analysis using prior knowledge generated clusters that are more coherent with the expert knowledge.
(More)