Authors:
Arthur Yosef (1); Moti Schneider (2); Eli Shnaider (3); Amos Baranes (3) and Rimona Palas (4)
Affiliations:
(1) Tel Aviv-Yaffo Academic College, Israel; (2) Netanya Academic College, Israel; (3) Peres Academic Center, Israel; (4) College of Law and Business, Israel
Keyword(s):
Data Mining, Fuzzy Logic, Intervals, Central Tendency, Data Preparation.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Pre-Processing and Post-Processing for Data Mining; Process Mining; Symbolic Systems
Abstract:
Model-building professionals often face the difficult choice of selecting the relevant variables from a set of several similar variables. All of these variables supposedly represent the same factor but are measured differently. They are based on different methodologies, baselines, conversion/comparability methods, etc., which leads to substantial differences in numerical values for essentially the same things. In this study we introduce a method that utilizes intervals to capture all the relevant variables that represent the same factor. First, we discuss the advantages of utilizing intervals of values from the standpoint of reliability, better and more efficient data utilization, and a substantial reduction in complexity, which in turn improves our ability to interpret the results. In addition, we introduce an interval (range) reduction algorithm, designed to reduce the excessive size of intervals, thus bringing them closer to their central tendency cluster. Following the theoretical component, we present a case study. The case study demonstrates the process of converting the data into intervals for two broad economic variables (each consisting of several data series) and two broad financial variables. Furthermore, it demonstrates the practical application of the procedures addressed in this study and their effectiveness.
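The idea of replacing several measurements of the same factor with one interval can be illustrated with a minimal Python sketch. The abstract does not specify the authors' reduction algorithm, so `shrink_toward_median` below is a hypothetical stand-in: it trims the value farthest from the median until the interval is no wider than an assumed threshold `max_width`. The function names and the sample figures are illustrative assumptions, not taken from the paper.

```python
from statistics import median


def to_interval(values):
    """Collapse several measurements of one factor into a (low, high) interval."""
    return (min(values), max(values))


def shrink_toward_median(values, max_width):
    """Hypothetical reduction (not the authors' algorithm): drop the endpoint
    farthest from the median until the interval width is at most max_width."""
    kept = sorted(values)
    while len(kept) > 1 and kept[-1] - kept[0] > max_width:
        m = median(kept)
        if kept[-1] - m >= m - kept[0]:
            kept.pop()      # upper endpoint is the farther one
        else:
            kept.pop(0)     # lower endpoint is the farther one
    return (kept[0], kept[-1])


# Made-up figures: four series reporting the same factor for one observation.
series_values = [2.0, 2.1, 2.3, 5.0]
full = to_interval(series_values)
reduced = shrink_toward_median(series_values, max_width=0.5)
print(full, reduced)
```

In this sketch the full interval spans every reported value, while the reduced interval discards the outlying measurement and stays close to the cluster where most series agree.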