Authors: Elias Oliveira (1); Howard Roatti (1); Matheus de Araujo Nogueira (2); Henrique Gomes Basoni (1) and Patrick Marques Ciarelli (1)
        
        
Affiliations: (1) Universidade Federal do Espírito Santo, Brazil; (2) Fundação de Assistência e Educação FAESA, Brazil
                
        
        
        
        
        
Keyword(s): Text Classification, Social Network, Text Mining.
        
        
            
Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Clustering and Classification Methods; Computational Intelligence; Concept Mining; Evolutionary Computing; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Mining Text and Semi-Structured Data; Soft Computing; Symbolic Systems
                    
            
        
        
            
Abstract: The usual practice in classification is to create a set of labeled data for training and then use it to tune a classifier that predicts the classes of the remaining items in the dataset. However, labeling data demands great human effort, and classification by specialists is normally expensive and time-consuming. In this paper, we discuss how a cluster-based tree kNN structure can be used to quickly build a training dataset from scratch. We evaluated the proposed method on some classification datasets, and the results are promising: the specialists' labeling work was reduced to 4% of the number of documents in the evaluated datasets. Furthermore, we achieved an average accuracy of 72.19% on the tested datasets, versus 77.12% when using 90% of the dataset for training.
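
The idea of labeling only a small fraction of the documents and letting a nearest-neighbour rule cover the rest can be sketched in a few lines. The code below is a minimal illustration, not the cluster-based tree kNN structure described in the paper. It assumes documents are already numeric vectors, substitutes farthest-point sampling for the clustering step, and uses hypothetical names (`pick_representatives`, `knn_predict`, `label_with_budget`, `oracle`) that do not come from the original work.

```python
import math
from collections import Counter


def pick_representatives(points, budget):
    """Farthest-point sampling (an assumption, standing in for clustering):
    choose `budget` indices that spread out across the data."""
    chosen = [0]
    while len(chosen) < budget:
        candidates = (i for i in range(len(points)) if i not in chosen)
        # Pick the point whose nearest already-chosen representative is farthest.
        best = max(
            candidates,
            key=lambda i: min(math.dist(points[i], points[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen


def knn_predict(point, labeled, k=1):
    """Majority vote among the k labeled vectors closest to `point`.
    `labeled` is a list of (vector, label) pairs."""
    nearest = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def label_with_budget(points, oracle, budget, k=1):
    """Ask the specialist (`oracle`, a hypothetical callback taking an index)
    to label only `budget` representatives, then predict every point by kNN."""
    reps = pick_representatives(points, budget)
    labeled = [(points[i], oracle(i)) for i in reps]
    predictions = [knn_predict(p, labeled, k) for p in points]
    return predictions, reps


if __name__ == "__main__":
    # Toy data: two well-separated groups of 2-D vectors.
    points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    truth = ["a", "a", "a", "b", "b", "b"]
    predictions, reps = label_with_budget(points, lambda i: truth[i], budget=2)
    print("labeled by specialist:", reps)
    print("predicted labels:", predictions)
```

In this toy setting the specialist labels two of six items. On real text collections, the labeling budget, the vector representation, and the choice of `k` would all need tuning, and the accuracy figures reported in the abstract apply to the authors' method, not to this sketch.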