A Gradient Descent based Heuristic for Solving Regression Clustering Problems

Enis Kayış

2020

Abstract

Regression analysis is the method of quantifying the effects of a set of independent variables on a dependent variable. In regression clustering problems, data points with similar regression estimates are grouped into the same cluster, either due to a business need or to increase the statistical significance of the resulting regression estimates. In this paper, we consider an extension of this problem in which data points belonging to the same level of another partitioning categorical variable must belong to the same partition. Due to the combinatorial nature of this problem, an exact solution is computationally prohibitive. We provide an integer programming formulation and offer a gradient descent based heuristic to solve this problem. Using simulated datasets, we analyze the performance of our heuristic across a variety of settings. In our computational study, we find that our heuristic provides remarkably better solutions than the benchmark method within a reasonable time. Despite a slight decrease in performance as the number of levels increases, our heuristic provides good solutions when each of the true underlying partitions has a similar number of levels.
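To make the problem setting concrete, the following is a minimal sketch of how a candidate partition of levels could be scored. It is not the paper's heuristic or its integer programming formulation. It assumes the objective is the total sum of squared errors of one ordinary least squares model per cluster, which the abstract does not state; the function name `partition_sse`, the dictionary encoding of a partition, and the synthetic data are all hypothetical.

```python
import numpy as np

def partition_sse(X, y, levels, assignment):
    """Total sum of squared errors when one OLS model is fit per cluster.

    X: (n, p) array of independent variables (an intercept column is added here).
    y: (n,) array of the dependent variable.
    levels: (n,) array with each data point's level of the categorical variable.
    assignment: dict mapping each level to a cluster id (hypothetical encoding).
    """
    X1 = np.column_stack([np.ones(len(y)), X])
    # All data points of a level inherit that level's cluster.
    clusters = np.array([assignment[level] for level in levels])
    total = 0.0
    for c in np.unique(clusters):
        mask = clusters == c
        beta, *_ = np.linalg.lstsq(X1[mask], y[mask], rcond=None)
        resid = y[mask] - X1[mask] @ beta
        total += float(resid @ resid)
    return total

# Synthetic data (assumption): levels a and b share slope 2, levels c and d share slope -1.
rng = np.random.default_rng(0)
levels = np.repeat(np.array(["a", "b", "c", "d"]), 50)
x = rng.normal(size=200)
slope = np.where(np.isin(levels, ["a", "b"]), 2.0, -1.0)
y = slope * x + rng.normal(scale=0.1, size=200)
X = x.reshape(-1, 1)

matching = {"a": 0, "b": 0, "c": 1, "d": 1}   # groups levels with the same slope
mixed = {"a": 0, "c": 0, "b": 1, "d": 1}      # groups levels with different slopes
sse_matching = partition_sse(X, y, levels, matching)
sse_mixed = partition_sse(X, y, levels, mixed)
```

Under these assumptions, the partition that groups levels with the same underlying slope yields a much smaller total error than the mixed one. Scoring every possible assignment of levels to clusters this way grows combinatorially with the number of levels, which is why a heuristic search is needed.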

Paper Citation


in Harvard Style

Kayış E. (2020). A Gradient Descent based Heuristic for Solving Regression Clustering Problems. In Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-440-4, pages 102-108. DOI: 10.5220/0009836701020108


in Bibtex Style

@conference{data20,
author={Enis Kayış},
title={A Gradient Descent based Heuristic for Solving Regression Clustering Problems},
booktitle={Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2020},
pages={102-108},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009836701020108},
isbn={978-989-758-440-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 9th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - A Gradient Descent based Heuristic for Solving Regression Clustering Problems
SN - 978-989-758-440-4
AU - Kayış E.
PY - 2020
SP - 102
EP - 108
DO - 10.5220/0009836701020108