A Gradient Descent based Heuristic for Solving Regression Clustering Problems
Enis Kayış
2020
Abstract
Regression analysis is the method of quantifying the effects of a set of independent variables on a dependent variable. In regression clustering problems, data points with similar regression estimates are grouped into the same cluster, either due to a business need or to increase the statistical significance of the resulting regression estimates. In this paper, we consider an extension of this problem in which data points belonging to the same level of another partitioning categorical variable must belong to the same partition. Due to the combinatorial nature of this problem, an exact solution is computationally prohibitive. We provide an integer programming formulation and offer a gradient descent based heuristic to solve this problem. Using simulated datasets, we analyze the performance of our heuristic across a variety of settings. In our computational study, we find that our heuristic provides remarkably better solutions than the benchmark method within a reasonable time. Despite a slight decrease in performance as the number of levels increases, our heuristic provides good solutions when each of the true underlying partitions has a similar number of levels.
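To make the problem statement concrete, the following is a minimal sketch of how a candidate partition of levels could be scored: every level is assigned to one cluster, a separate least-squares regression is fitted per cluster, and the squared errors are summed. The function name `partition_sse`, the dictionary encoding of the assignment, and the toy data are assumptions made for illustration; this is neither the paper's integer programming formulation nor its heuristic.

```python
import numpy as np

def partition_sse(X, y, levels, assignment):
    """Total sum of squared errors for a given level-to-cluster assignment.

    X: (n, p) design matrix (include a column of ones for an intercept).
    y: (n,) response vector.
    levels: (n,) level of the partitioning categorical variable per data point.
    assignment: dict mapping each level to a cluster id (hypothetical encoding).
    """
    # All points sharing a level inherit that level's cluster.
    clusters = np.array([assignment[level] for level in levels])
    total = 0.0
    for c in np.unique(clusters):
        mask = clusters == c
        # Ordinary least squares fit restricted to this cluster's points.
        beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        resid = y[mask] - X[mask] @ beta
        total += float(resid @ resid)
    return total

# Toy data (assumed): levels a and b share one regression line, c and d another.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=80)
levels = np.repeat(["a", "b", "c", "d"], 20)
slope = np.where(np.isin(levels, ["a", "b"]), 2.0, -1.0)
y = 1.0 + slope * x  # noise-free, so the true partition fits exactly
X = np.column_stack([np.ones_like(x), x])

true_partition = {"a": 0, "b": 0, "c": 1, "d": 1}
mixed_partition = {"a": 0, "c": 0, "b": 1, "d": 1}
sse_true = partition_sse(X, y, levels, true_partition)
sse_mixed = partition_sse(X, y, levels, mixed_partition)
print(sse_true < sse_mixed)
```

Enumerating every assignment and keeping the one with the lowest score would solve small instances exactly, but the number of assignments grows exponentially with the number of levels, which is the combinatorial difficulty the abstract refers to.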
Paper Citation
in Harvard Style
Kayış E. (2020). A Gradient Descent based Heuristic for Solving Regression Clustering Problems. In Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-440-4, pages 102-108. DOI: 10.5220/0009836701020108
in Bibtex Style
@conference{data20,
author={Enis Kayış},
title={A Gradient Descent based Heuristic for Solving Regression Clustering Problems},
booktitle={Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2020},
pages={102-108},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009836701020108},
isbn={978-989-758-440-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - A Gradient Descent based Heuristic for Solving Regression Clustering Problems
SN - 978-989-758-440-4
AU - Kayış E.
PY - 2020
SP - 102
EP - 108
DO - 10.5220/0009836701020108