Authors:
            Simone D’Amico (1); Lorenzo Malandri (2, 3); Fabio Mercorio (2, 3) and Mario Mezzanzanica (2, 3)
        
        
            Affiliations:
            (1) Department of Economics, Management and Statistics, University of Milano-Bicocca, Milan, Italy;
            (2) Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Milan, Italy;
            (3) CRISP Research Centre, University of Milan-Bicocca, Milan, Italy
        
        
        
        
        
             Keyword(s):
            Keyphrases Extraction, Keyphrases Evaluation, Keyphrases Benchmark Evaluation, Word Embeddings, Natural Language Processing.
        
        
            
                
                
            
        
        
            
                Abstract: 
                Keyphrase extraction is a research area of NLP that aims to identify the words and expressions in a text that comprehensively represent its content. In this study, we introduce a new approach called KRAKEN (Keyphrase extRAction maKing use of EmbeddiNgs). Our method uses widely adopted NLP techniques to extract keyphrases from a text in an unsupervised manner. The main contribution of this work is a novel approach to keyphrase extraction: natural language preprocessing techniques are combined with distributional semantics techniques, such as word embeddings, to obtain a vector representation of the texts that preserves their semantic meaning. KRAKEN exploits these word embeddings to identify keyphrases, taking into account the relationships among the words in the text. To evaluate KRAKEN, we employ well-known benchmark datasets from the literature and compare our approach with state-of-the-art methods. A further contribution is a metric for ranking the identified keyphrases that considers both the relatedness of the words within each phrase and the relatedness among all the phrases extracted from the same text.
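
                The ranking idea described in the abstract can be illustrated with a small sketch. This is not the KRAKEN algorithm or its metric, which the abstract does not specify; it only shows one plausible way to combine the two kinds of relatedness. The toy embedding values, the function names, the mean-pooling of word vectors, and the `alpha` weighting are all assumptions made for illustration.

```python
import math
from itertools import combinations

# Toy 3-dimensional embeddings: hypothetical values, not from the paper.
TOY_EMBEDDINGS = {
    "keyphrase": [0.9, 0.1, 0.0],
    "extraction": [0.8, 0.2, 0.1],
    "word": [0.7, 0.3, 0.0],
    "embeddings": [0.6, 0.4, 0.1],
    "coffee": [0.0, 0.1, 0.9],
    "break": [0.1, 0.0, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def phrase_vector(phrase, emb):
    """Represent a phrase as the mean of its word vectors (an assumption)."""
    vectors = [emb[w] for w in phrase.split()]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def internal_relatedness(phrase, emb):
    """Mean pairwise similarity of the words inside one phrase."""
    pairs = list(combinations(phrase.split(), 2))
    if not pairs:
        return 1.0  # a single word is trivially coherent with itself
    return sum(cosine(emb[a], emb[b]) for a, b in pairs) / len(pairs)


def external_relatedness(phrase, candidates, emb):
    """Mean similarity of a phrase to the other candidates from the same text."""
    others = [p for p in candidates if p != phrase]
    if not others:
        return 0.0
    vec = phrase_vector(phrase, emb)
    return sum(cosine(vec, phrase_vector(o, emb)) for o in others) / len(others)


def rank_phrases(candidates, emb, alpha=0.5):
    """Rank candidates by a weighted mix of the two scores (alpha is hypothetical)."""
    scored = [
        (
            p,
            alpha * internal_relatedness(p, emb)
            + (1 - alpha) * external_relatedness(p, candidates, emb),
        )
        for p in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    phrases = ["keyphrase extraction", "word embeddings", "coffee break"]
    for phrase, score in rank_phrases(phrases, TOY_EMBEDDINGS):
        print(f"{score:.3f}  {phrase}")
```

                With these toy vectors, "coffee break" is internally coherent but unrelated to the other candidates, so it ranks last. In practice the dictionary would be replaced by pretrained word embeddings, and the candidate phrases would come from a preprocessing step.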