Positive-Unlabeled Learning Using Pairwise Similarity and Parametric Minimum Cuts
Torpong Nitayanont, Dorit Hochbaum
2024
Abstract
Positive-unlabeled (PU) learning is a binary classification problem in which the labeled set contains only positive-class samples. Most PU learning methods rely on a prior π on the true fraction of positive samples. We propose a method based on Hochbaum’s Normalized Cut (HNC), a network flow-based method that partitions samples, both labeled and unlabeled, into two sets to achieve high intra-similarity and low inter-similarity, with a tradeoff parameter to balance these two goals. HNC is solved, for all tradeoff values, as a parametric minimum cut problem on an associated graph, producing multiple optimal partitions that are nested for increasing tradeoff values. Our PU learning method, called 2-HNC, runs in two stages. Stage 1 identifies optimal data partitions for all tradeoff values, using only positive labeled samples. Stage 2 first ranks unlabeled samples by their likelihood of being negative, according to the sequential order of partitions from stage 1, and then uses the likely-negative samples along with the positive samples to run HNC. Among all partitions generated in both stages, the partition whose positive fraction is closest to the prior π is selected. An experimental study demonstrates that 2-HNC is highly competitive compared to state-of-the-art methods.
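The final selection step described in the abstract, choosing the candidate partition whose positive fraction is closest to the prior π, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `positive_fraction` and `select_partition_by_prior` are hypothetical, partitions are assumed to be given as lists of booleans (True meaning the sample is placed in the positive set), and the positive fraction is assumed to be computed over all samples in the partition. The parametric minimum cut computation that produces the candidate partitions is not shown.

```python
def positive_fraction(partition):
    """Fraction of samples that a partition places in the positive set."""
    return sum(partition) / len(partition)


def select_partition_by_prior(partitions, prior):
    """Return (index, partition) of the candidate closest to the prior.

    partitions: non-empty list of equal-length lists of booleans
                (assumed representation; True = predicted positive).
    prior:      assumed fraction of positive samples (pi), in [0, 1].
    """
    if not partitions:
        raise ValueError("need at least one candidate partition")
    best_index = min(
        range(len(partitions)),
        key=lambda i: abs(positive_fraction(partitions[i]) - prior),
    )
    return best_index, partitions[best_index]


# Toy example: three nested candidate partitions over six samples,
# with positive sets of size 1, 2, and 4.
candidates = [
    [True, False, False, False, False, False],
    [True, True, False, False, False, False],
    [True, True, True, True, False, False],
]
best_index, best_partition = select_partition_by_prior(candidates, prior=0.3)
```

In this toy example the positive fractions are 1/6, 2/6, and 4/6, so with an assumed prior of 0.3 the second candidate is selected. When two candidates are equally close to the prior, this sketch keeps the earlier one; the abstract does not say how such ties are handled.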
Paper Citation
in Harvard Style
Nitayanont T. and Hochbaum D. (2024). Positive-Unlabeled Learning Using Pairwise Similarity and Parametric Minimum Cuts. In Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR; ISBN 978-989-758-716-0, SciTePress, pages 60-71. DOI: 10.5220/0012948100003838
in Bibtex Style
@conference{kdir24,
author={Torpong Nitayanont and Dorit Hochbaum},
title={Positive-Unlabeled Learning Using Pairwise Similarity and Parametric Minimum Cuts},
booktitle={Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2024},
pages={60-71},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012948100003838},
isbn={978-989-758-716-0},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Positive-Unlabeled Learning Using Pairwise Similarity and Parametric Minimum Cuts
SN - 978-989-758-716-0
AU - Nitayanont T.
AU - Hochbaum D.
PY - 2024
SP - 60
EP - 71
DO - 10.5220/0012948100003838
PB - SciTePress