
 
Preliminary results show an accuracy above 
91% using the entire set of attributes, without 
feature selection. We expect that feature selection 
will further improve the classification accuracy. We 
also reduced the classification time by 
using a smaller number of instances per class (14).  
The results also show that peak 
performance is obtained on a 14 instances/class 
dataset using 8 clusters and on a 20 instances/class 
dataset using 7 clusters.  
Our current work focuses on connecting the two 
steps of the training process and on addressing the 
classification stage. In addition, to generalize the scope 
of the system, several issues in the training process 
need to be considered. 
The first is that the classes are not split uniformly 
into clusters (instances from the same class are 
distributed among at most 4 clusters). At present, we 
solve this issue by adding all the instances of a class to the 
cluster that contains the maximum number of instances 
from that particular class. However, a global 
model requires a dedicated approach for such 
situations. A possible solution is to distribute all the 
instances of a class to every cluster which contains a 
number of instances from that class above a given 
threshold. We need to investigate how this approach 
influences the complexity, the performance and the 
time of the induced sub-models, as it may 
require an additional clustering step.  
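
The two assignment strategies discussed above can be illustrated with a minimal Python sketch. It is not the implementation used in our system: the function names, the toy labels and the threshold value are hypothetical, and the fallback to the majority cluster when no cluster exceeds the threshold is an assumption made only to keep every class assigned.

```python
from collections import Counter, defaultdict


def class_cluster_counts(labels, clusters):
    """Count, for every class, how many of its instances fell into each cluster."""
    counts = defaultdict(Counter)
    for label, cluster in zip(labels, clusters):
        counts[label][cluster] += 1
    return counts


def assign_majority(labels, clusters):
    """Current strategy: map each class to the cluster holding most of its instances."""
    counts = class_cluster_counts(labels, clusters)
    # Ties are broken by the smallest cluster id so the result is deterministic.
    return {
        label: min(per_cluster, key=lambda c: (-per_cluster[c], c))
        for label, per_cluster in counts.items()
    }


def assign_threshold(labels, clusters, threshold):
    """Possible alternative: map each class to every cluster holding more than
    `threshold` of its instances (assumed fallback: the majority cluster)."""
    counts = class_cluster_counts(labels, clusters)
    majority = assign_majority(labels, clusters)
    assignment = {}
    for label, per_cluster in counts.items():
        chosen = sorted(c for c, n in per_cluster.items() if n > threshold)
        assignment[label] = chosen or [majority[label]]
    return assignment


# Toy data: class "A" is spread over clusters 0, 1 and 2; class "B" sits in cluster 1.
labels = ["A", "A", "A", "A", "A", "A", "B", "B", "B"]
clusters = [0, 0, 0, 1, 1, 2, 1, 1, 1]

majority = assign_majority(labels, clusters)
shared = assign_threshold(labels, clusters, threshold=1)
```

With the threshold strategy a class may belong to several clusters, so the sub-model trained for each of those clusters has to include that class. This is the source of the additional complexity mentioned above.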
A second issue which needs addressing is the 
time required for the SimpleKMeans method to split 
the dataset into clusters. We experimentally 
observed that the clustering time increases with the 
number of clusters. While for 2-5 clusters it takes 
several minutes to build the clusters, for 
8 or 9 clusters the time required is up to 2-3 days.  
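
A measurement of this kind can be reproduced with a small timing harness such as the Python sketch below. It is only an illustration under stated assumptions: it uses a plain Lloyd's k-means written for this example, not the SimpleKMeans implementation, and the synthetic points, the iteration count and the function names are hypothetical. Absolute timings will therefore differ from the figures reported above.

```python
import random
import time


def kmeans(points, k, iterations=10, seed=0):
    """Plain Lloyd's k-means; returns the cluster index of every point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def time_clustering(points, cluster_counts):
    """Return {k: seconds} for one k-means run per requested cluster count."""
    timings = {}
    for k in cluster_counts:
        start = time.perf_counter()
        kmeans(points, k)
        timings[k] = time.perf_counter() - start
    return timings


# Synthetic stand-in for the feature vectors: 200 points with 5 attributes.
rng = random.Random(42)
points = [[rng.random() for _ in range(5)] for _ in range(200)]
timings = time_clustering(points, cluster_counts=range(2, 10))
for k, seconds in timings.items():
    print(f"{k} clusters: {seconds:.4f} s")
```

Recording the time per cluster count in this way makes it possible to decide, before a full run, whether a given number of clusters is affordable for the dataset at hand.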
Moreover, as the number of classes increases, we 
might need to introduce additional clustering steps. 
We are currently evaluating a methodology for 
automatically establishing the parameters of the 
hierarchical structure: number of clustering levels, 
number of clusters per level, optimal size (in terms 
of number of classes) of the training subset 
submitted to the Naïve Bayes classifiers.  
ACKNOWLEDGEMENTS  
The research described in this paper was supported by 
the IBM Faculty Award received in 2009 by Rodica 
Potolea from the Computer Science Department at 
the Technical University of Cluj-Napoca, Romania. 
 
A HIERARCHICAL HANDWRITTEN OFFLINE SIGNATURE RECOGNITION SYSTEM