ular, starting from a baseline with direct links, we
tried learning models that included rules for the di-
rect links and rules for the co-citation links. The non-
MLN algorithms used the extra information to improve
accuracy (e.g., from 52.7% to 69.5% for ICA), but the
best MLN algorithm (MCMC) actually lost accuracy
(from 56.6% to 45.0%). Thus, given data with com-
plex links, MLNs may sometimes outperform other
techniques, but at other times may struggle with learn-
ing based on a complex rule set.
These observations suggest that MLN learning re-
mains a challenging problem, at least for CC and
likely for other tasks as well. Future work should fur-
ther consider the impact of training set size on MLN
learning, explore the effects of MLN structure learn-
ing, and evaluate other recent weight learning algo-
rithms (e.g., Huynh and Mooney, 2009) for MLNs.
Thanks to David Aha, Bryan Auslander, Ryan Rossi,
and the anonymous referees for comments that helped
to improve this work. Portions of this analysis were
conducted using Proximity, an open-source software
environment developed by the Knowledge Discov-
ery Laboratory at the University of Massachusetts
Amherst (http://kdl.cs.umass.edu/proximity/). This
work was supported in part by the U.S. National Sci-
ence Foundation under award number 1116439.
