Figure 7: ROC curve of the leave-one-out experiment.
to modify the method itself. In this work we derived a
version of PLSA that works with k-partite graphs as well
as with bipartite graphs, thus extending the potential of
the method. Indeed, k-partite graphs have proven to be a
valuable model for many different types of data, in
particular in bioinformatics, healthcare and recommender
systems. Specifically, we first derived a new Bayesian
model that extends the classical PLSA to k-partite graphs;
then we derived the Expectation-Maximization (EM)
algorithm to train the model from the data. Our proposed
algorithm can handle k-partite graphs of any size.
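As a point of reference for the EM training summarized above, the following is a minimal sketch of the classical bipartite PLSA in its symmetric form, P(u, v) = sum_z P(z) P(u|z) P(v|z). It is not the k-partite extension derived in this work. The function names plsa_em and link_score, the toy adjacency matrix, the random initialization, the fixed iteration count and the small smoothing constant are all illustrative assumptions.

```python
import random


def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """Fit classical symmetric PLSA on a bipartite count matrix (list of lists).

    Illustrative sketch only: bipartite case, fixed number of iterations.
    """
    rng = random.Random(seed)
    n_u, n_v = len(counts), len(counts[0])

    def normalize(vec):
        total = sum(vec)
        return [x / total for x in vec]

    # Random positive initialization; every distribution sums to 1.
    p_z = normalize([rng.random() + 0.1 for _ in range(n_topics)])
    p_u_z = [normalize([rng.random() + 0.1 for _ in range(n_u)])
             for _ in range(n_topics)]
    p_v_z = [normalize([rng.random() + 0.1 for _ in range(n_v)])
             for _ in range(n_topics)]

    eps = 1e-12  # assumed smoothing constant, avoids division by zero
    for _ in range(n_iter):
        new_z = [0.0] * n_topics
        new_u = [[0.0] * n_u for _ in range(n_topics)]
        new_v = [[0.0] * n_v for _ in range(n_topics)]
        for u in range(n_u):
            for v in range(n_v):
                n_uv = counts[u][v]
                if n_uv == 0:
                    continue
                # E-step: posterior over topics for the observed edge (u, v).
                joint = [p_z[z] * p_u_z[z][u] * p_v_z[z][v]
                         for z in range(n_topics)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(n_topics):
                    r = n_uv * joint[z] / total
                    new_z[z] += r
                    new_u[z][u] += r
                    new_v[z][v] += r
        # M-step: renormalize the expected counts into distributions.
        p_z = normalize([x + eps for x in new_z])
        p_u_z = [normalize([x + eps for x in row]) for row in new_u]
        p_v_z = [normalize([x + eps for x in row]) for row in new_v]
    return p_z, p_u_z, p_v_z


def link_score(model, u, v):
    """Model probability of the pair (u, v): sum_z P(z) P(u|z) P(v|z)."""
    p_z, p_u_z, p_v_z = model
    return sum(p_z[z] * p_u_z[z][u] * p_v_z[z][v] for z in range(len(p_z)))


# Toy bipartite graph (hypothetical data): rows and columns are the two
# node sets, and a 1 marks an observed link.
toy = [[1, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
toy_model = plsa_em(toy, n_topics=2, n_iter=30)
candidate = link_score(toy_model, 1, 1)  # score of an unobserved pair
```

In a link-prediction setting, unobserved pairs would be ranked by link_score, with higher values read as stronger candidate links. The k-partite model would additionally draw on the auxiliary layers, which this sketch does not attempt.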
Through evaluation on two datasets, we showed how the
extended PLSA is able to exploit information from
auxiliary graph layers to enhance link prediction in the
main bipartite graph.
Future work includes a more thorough evaluation of the
method and, in particular, of its variants (e.g., different
initializations and stopping criteria), as well as a study
of how to identify the number of topics that yields the
best performance. Furthermore, we will apply the method
to real-world scenarios, such as the problem of drug
repurposing.
Finally, we plan to further investigate the probabilistic
properties of the method in order to propose an extension
that is able to provide an explanation for each predicted
link, thus contributing to explainable artificial
intelligence.
KDIR 2022 - 14th International Conference on Knowledge Discovery and Information Retrieval