data marts must finally be merged in order to
provide the end-user/application with the extracted
knowledge. This is a relevant task in our proposed
framework because, in real-life DM scenarios, end-
users/applications are very often interested in
extracting useful knowledge by means of correlated,
cross-comparative KDD tasks rather than a single
KDD task.
Combining results coming from different DM
algorithms is a non-trivial research issue, as
recognized in the literature. In fact, as highlighted in
Section 4.2, the output of a DM algorithm depends
on the nature of that algorithm, so that in some cases
MR coming from very different algorithms cannot
be combined directly.
In MRE-KDD+, we address this issue by making use
of OLAP technology again. We build
multidimensional views over the MR provided by
execution schemes of KDF, thus supporting a
unified way of exploring and analyzing final
results. This approach is well motivated: end-
users/applications are usually interested in analyzing
final results on the basis of certain mining metrics
provided by KDD processes (e.g., confidence interval
of association rules, density of clusters, recall of IR-
style tasks, etc.), and this kind of analysis is well
suited to OLAP data cubes where (i) the data source
is the output of DM algorithms (e.g., item sets),
(ii) the (OLAP) dimensions are user-selected features
of the output of DM algorithms, and (iii) the (OLAP)
measures are the above-mentioned mining metrics.
Furthermore, this approach has the benefit of
efficiently supporting the visualization of final
results by means of attractive, user-friendly graphical
formats/tools such as multidimensional bars, charts,
plots, etc., similarly to the functionalities supported
by DBMiner and WEKA.
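The mapping in points (i)-(iii) can be illustrated with a minimal sketch. It assumes a handful of hypothetical mining-result records; the field names (`algorithm`, `region`, `confidence`) and the `roll_up` helper are illustrative assumptions and are not part of MRE-KDD+.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mining results (MR): each record is one output item of a DM
# algorithm, carrying descriptive features and one mining metric.
mining_results = [
    {"algorithm": "apriori", "region": "EU", "confidence": 0.80},
    {"algorithm": "apriori", "region": "US", "confidence": 0.60},
    {"algorithm": "fp-growth", "region": "EU", "confidence": 0.90},
    {"algorithm": "fp-growth", "region": "EU", "confidence": 0.70},
]


def roll_up(records, dimensions, measure, aggregate=mean):
    """Group records by the chosen dimensions and aggregate the measure.

    dimensions: user-selected features of the DM output (OLAP dimensions).
    measure:    the mining metric to aggregate (OLAP measure).
    """
    cells = defaultdict(list)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        cells[key].append(record[measure])
    return {key: aggregate(values) for key, values in cells.items()}


# A two-dimensional view, and a coarser view obtained by dropping "region".
by_algorithm_region = roll_up(mining_results, ["algorithm", "region"], "confidence")
by_algorithm = roll_up(mining_results, ["algorithm"], "confidence")
```

Dropping a dimension from the list corresponds to an OLAP roll-up over the same mining results; a different aggregate (e.g., `max`) can be passed when an average of the metric is not meaningful.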
The Multidimensional Ensembling Function MEF is
the component of MRE-KDD+ in charge of
supporting the above-described knowledge
presentation/delivery task. It takes as input a
collection of Q output results
$O = \{O_0, O_1, \ldots, O_{Q-1}\}$ provided by
KDF-formatted execution schemes and the definition
of a data mart Z, and returns as output a data mart L,
which we name Knowledge Visualization Data Mart
(KVDM), built over the data in O according to Z.
Formally, MEF is defined as follows:

$$\mathit{MEF}: \langle O, Z \rangle \rightarrow D \qquad (5)$$
Note that the KVDM L becomes part of the set of
data marts D of MRE-KDD+ but, contrary to the
previous data marts, which are used for knowledge
processing purposes, it is used for knowledge
exploration/visualization purposes.
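As a minimal sketch of how MEF could be realized, the following code builds a KVDM L from a collection of outputs O according to a definition Z and returns the set of data marts extended with L. The `DataMartDefinition` class, its fields, and the dictionary-based representation of data marts are hypothetical choices made for illustration; the paper does not prescribe them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Cube = Dict[Tuple, float]


@dataclass(frozen=True)
class DataMartDefinition:
    """Hypothetical stand-in for Z: which features and metric to expose."""
    name: str
    dimensions: Tuple[str, ...]
    measure: str
    aggregate: Callable[[Sequence[float]], float]


def mef(outputs: List[List[dict]], z: DataMartDefinition,
        data_marts: Dict[str, Cube]) -> Dict[str, Cube]:
    """Build the KVDM L over the outputs O according to Z; return D with L added."""
    cells: Dict[Tuple, List[float]] = {}
    for output in outputs:  # O_0, ..., O_{Q-1}
        for record in output:
            if z.measure not in record:  # skip records lacking the metric
                continue
            key = tuple(record.get(d) for d in z.dimensions)
            cells.setdefault(key, []).append(record[z.measure])
    kvdm = {key: z.aggregate(values) for key, values in cells.items()}
    # The input set of data marts is left unmodified; a new set is returned.
    return {**data_marts, z.name: kvdm}


# Usage with two hypothetical clustering outputs and one pre-existing data mart.
O = [
    [{"task": "clustering", "k": 3, "density": 0.42}],
    [{"task": "clustering", "k": 3, "density": 0.58},
     {"task": "clustering", "k": 5, "density": 0.31}],
]
Z = DataMartDefinition("KVDM", ("task", "k"), "density", max)
D = mef(O, Z, {"processing_mart": {}})
```

Returning the extended set of data marts, rather than L alone, mirrors the codomain D in (5); the existing knowledge-processing data marts are carried over unchanged.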
4 CONCLUSIONS AND FUTURE WORK
Starting from successful OLAM technologies, in this
paper we have presented MRE-KDD+, a model
for supporting advanced knowledge discovery from
large databases and data warehouses, which is useful
for any data-intensive setting.
Future work is oriented along two main
directions: (i) testing the performance of
MRE-KDD+ against real-life scenarios such as those
posed by distributed corporate data warehousing
environments in B2B and B2C e-commerce systems,
and (ii) extending the current capabilities of
MRE-KDD+ so as to embed novel functionalities for
supporting the prediction of events in new DM
activities edited by users/applications on the basis of
the “history” given by logs of previous KDD processes
implemented in similar or correlated application
scenarios.
REFERENCES
Chaudhuri, S., and Dayal, U., 1997. An Overview of Data
Warehousing and OLAP Technology. In SIGMOD
Record, Vol. 26, No. 1, pp. 65-74.
Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P., 1996.
From Data Mining to Knowledge Discovery: An
Overview. In Fayyad, U., Piatetsky-Shapiro, G.,
Smyth, P., and Uthurusamy, R. (eds.), “Advances in
Knowledge Discovery and Data Mining”, AAAI/MIT
Press, Menlo Park, CA, USA, pp. 1-35.
Gray, J., Chaudhuri, S., Bosworth, A., Layman, A.,
Reichart, D., Venkatrao, M., Pellow, F., and Pirahesh,
H., 1997. Data Cube: A Relational Aggregation
Operator Generalizing Group-By, Cross-Tabs, and
Sub-Totals. In Data Mining and Knowledge
Discovery, Vol. 1, No. 1, pp. 29-54.
Goebel, M., and Gruenwald, L., 1999. A Survey of Data
Mining and Knowledge Discovery Software Tools. In
SIGKDD Explorations, Vol. 1, No. 1, pp. 20-33.
Han, J., 1997. OLAP Mining: An Integration of OLAP
with Data Mining. In Proc. of the 7th IFIP 2.6 DS
Work. Conf., pp. 1-9.
Han, J., Fu, Y., Wang, W., Chiang, J., Gong, W.,
Koperski, K., Li, D., Lu, Y., Rajan, A., Stefanovic, N.,
MRE-KDD+: A MULTI-RESOLUTION, ENSEMBLE-BASED MODEL FOR ADVANCED KNOWLEDGE DISCOVERY