Authors: Jacob Søgaard Larsen and Line Katrine Harder Clemmensen

Affiliation: Technical University of Denmark, Denmark

Keyword(s): Non-negative Matrix Factorization, Binary Data, Binary Matrix Factorization, Text Modelling.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Business Analytics; Computational Intelligence; Data Analytics; Data Engineering; Evolutionary Computing; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Mining Text and Semi-Structured Data; Soft Computing; Symbolic Systems

Abstract: We propose the Logistic Non-negative Matrix Factorization for decomposition of binary data. Binary data are frequently generated in, for example, text analysis, sensory data and market basket data. A common method for analysing non-negative data is the Non-negative Matrix Factorization, though this is in theory not appropriate for binary data; we therefore propose a novel Non-negative Matrix Factorization based on the logistic link function. Furthermore, we generalize the method to handle missing data. The formulation of the method is compared to a previously proposed logistic matrix factorization without a non-negativity constraint on the features. We compare the performance of the Logistic Non-negative Matrix Factorization to Least Squares Non-negative Matrix Factorization and Kullback-Leibler (KL) Non-negative Matrix Factorization on three sets of binary data: a synthetic dataset, a set of student comments on their professors collected in a binary term-document matrix, and a sensory dataset. We find that choosing the number of components is an essential part of the modelling and interpretation, and that it remains unresolved.
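
To make the idea in the abstract concrete, the following is a minimal sketch of a logistic matrix factorization with a non-negativity constraint on the features and a mask for missing entries. It is not the authors' algorithm: the projected gradient updates, the choice to leave the coefficient matrix W unconstrained, the step size, and the names logistic_nmf and masked_nll are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_nmf(X, mask, k, steps=500, lr=0.05, seed=0):
    """Fit X ~ Bernoulli(sigmoid(W @ H)) on observed entries (mask == 1).

    Assumption: only the feature matrix H is constrained to be
    non-negative; W is left unconstrained. Plain projected gradient
    ascent is used here purely for illustration.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.normal(scale=0.1, size=(n, k))   # coefficients (unconstrained)
    H = rng.uniform(0.0, 0.1, size=(k, m))   # features (non-negative)
    for _ in range(steps):
        # Gradient of the Bernoulli log-likelihood w.r.t. the logits;
        # missing entries contribute nothing because mask is 0 there.
        R = mask * (X - sigmoid(W @ H))
        gW = R @ H.T
        gH = W.T @ R
        W = W + lr * gW
        H = np.maximum(H + lr * gH, 0.0)     # project onto H >= 0
    return W, H

def masked_nll(X, mask, W, H):
    """Mean negative log-likelihood over the observed entries."""
    P = np.clip(sigmoid(W @ H), 1e-12, 1.0 - 1e-12)
    ll = X * np.log(P) + (1.0 - X) * np.log(1.0 - P)
    return -(mask * ll).sum() / mask.sum()
```

Calling logistic_nmf(X, mask, k) with a binary matrix X and a 0/1 mask of the same shape returns W and H; masked_nll can then be compared across values of k, though, as the abstract notes, choosing the number of components remains an open question.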