Jacob Søgaard Larsen
Line Katrine Harder Clemmensen
Technical University of Denmark, Denmark
Non-negative Matrix Factorization, Binary Data, Binary Matrix Factorization, Text Modelling.
Artificial Intelligence
Business Analytics
Computational Intelligence
Data Analytics
Data Engineering
Evolutionary Computing
Knowledge Discovery and Information Retrieval
Knowledge-Based Systems
Machine Learning
Mining Text and Semi-Structured Data
Soft Computing
Symbolic Systems
We propose Logistic Non-negative Matrix Factorization for the decomposition of binary data. Binary data
arise frequently in, for example, text analysis, sensory analysis and market basket analysis. Non-negative
Matrix Factorization is a common method for analysing non-negative data, but it is in theory not appropriate
for binary data; we therefore propose a novel Non-negative Matrix Factorization based on the logistic link
function. Furthermore, we generalize the method to handle missing data. We compare the formulation of the
method with a previously proposed logistic matrix factorization that has no non-negativity constraint on the
features. We compare the performance of Logistic Non-negative Matrix Factorization with that of Least Squares
Non-negative Matrix Factorization and Kullback-Leibler (KL) Non-negative Matrix Factorization on three
binary datasets: a synthetic dataset, a binary term-document matrix of student comments on their professors,
and a sensory dataset.
We find that choosing the number of components is an essential part
of the modelling and interpretation, and that it remains an unresolved problem.
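
To make the idea of a non-negative factorization with a logistic link and missing-data handling concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a Bernoulli likelihood with probabilities sigmoid(WH + b), fitted by projected gradient descent. The scalar offset b, the mask M (1 = observed, 0 = missing), the step size and the function names are all illustrative assumptions that do not come from the abstract.

```python
import numpy as np

def sigmoid(z):
    # Logistic function written via tanh to avoid overflow for large |z|.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def masked_nll(X, M, W, H, b):
    # Bernoulli negative log-likelihood, summed over observed entries only.
    Z = W @ H + b
    return float(np.sum(M * (np.logaddexp(0.0, Z) - X * Z)))

def logistic_nmf(X, M, k, steps=500, lr=0.01, seed=0):
    """Hypothetical sketch: fit X ~ Bernoulli(sigmoid(W @ H + b)) with W, H >= 0.

    X: binary matrix; entries at missing positions must be finite placeholders.
    M: mask of the same shape, 1 where X is observed and 0 where it is missing.
    b: unconstrained scalar offset (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.uniform(0.0, 0.1, size=(n, k))
    H = rng.uniform(0.0, 0.1, size=(k, m))
    b = 0.0
    history = [masked_nll(X, M, W, H, b)]
    for _ in range(steps):
        # Gradient of the loss w.r.t. the linear predictor; zero where missing.
        R = M * (sigmoid(W @ H + b) - X)
        grad_W, grad_H, grad_b = R @ H.T, W.T @ R, R.sum() / M.sum()
        # Gradient step, then projection onto the non-negative orthant.
        W = np.maximum(W - lr * grad_W, 0.0)
        H = np.maximum(H - lr * grad_H, 0.0)
        b -= lr * grad_b
        history.append(masked_nll(X, M, W, H, b))
    return W, H, b, history

# Toy usage: a block-structured binary matrix with about 10% missing entries.
rng = np.random.default_rng(1)
X = np.zeros((20, 15))
X[:10, :7] = 1
X[10:, 7:] = 1
M = (rng.uniform(size=X.shape) > 0.1).astype(float)
W, H, b, history = logistic_nmf(X, M, k=2)
```

Because the residual is multiplied by the mask, missing entries contribute nothing to the loss or the gradients. The number of components k is supplied by hand here, which is the choice the abstract identifies as unresolved.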