
clickstream. Next, a ROLAP database is used for
storing the clickstreams. Once the clickstreams
are available in the database, it is possible to perform
exploratory analysis using data cubes. However,
before any kind of exploration over data cubes, a
multidimensional structure needs to be built. This is
supported by the data cube definition module. The
structure includes the following attributes: page_dimension,
time_dimension, date_dimension, user_agent,
referrer_dimension, request_dimension, and
session_dimension.
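As a rough illustration of how such a multidimensional structure might be laid out in a relational (ROLAP) store, the sketch below creates one table per attribute listed above and a fact table that references them. Only the dimension names come from the text; the use of SQLite, the table name clickstream_fact, the column names id and label, and the hits measure are assumptions made for the example, not the schema of the data cube definition module.

```python
import sqlite3

# Dimension names are taken from the text; everything else is hypothetical.
DIMENSIONS = [
    "page_dimension", "time_dimension", "date_dimension", "user_agent",
    "referrer_dimension", "request_dimension", "session_dimension",
]


def create_cube_schema(conn):
    """Create one table per dimension plus a fact table referencing them."""
    for dim in DIMENSIONS:
        # Assumed layout: a surrogate key and a single descriptive label.
        conn.execute(
            f"CREATE TABLE {dim} (id INTEGER PRIMARY KEY, label TEXT NOT NULL)"
        )
    fk_columns = ", ".join(
        f"{dim}_id INTEGER REFERENCES {dim}(id)" for dim in DIMENSIONS
    )
    # 'hits' is an assumed measure; the paper does not name its measures.
    conn.execute(
        f"CREATE TABLE clickstream_fact ({fk_columns}, "
        "hits INTEGER NOT NULL DEFAULT 1)"
    )


conn = sqlite3.connect(":memory:")
create_cube_schema(conn)
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Keeping one foreign key per dimension in the fact table is the usual star-schema arrangement, which lets each OLAP operation be expressed as a join and a GROUP BY over the chosen dimensions.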
Building this clickstream data cube allows OLAP
operations to be applied in order to view and
analyze the clickstreams from different angles,
derive ratios, or compute measures across many
dimensions. The data cube structure offers analytical
modelling capabilities, including a calculation
engine for deriving various statistics, and a highly
interactive and powerful data retrieval and analysis
environment. It is possible to use this engine to
discover implicit knowledge in the clickstream data
cube. The knowledge that can be discovered is
represented in the form of rules, tables, charts,
graphs, and other visual presentation forms for
associating or classifying data from the clickstream
data cube (Figure 2).
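To make the OLAP operations mentioned above more concrete, the following minimal sketch shows a roll-up, a slice, and a derived ratio over a handful of fact rows. The rows are invented for illustration and are not data from the paper; the function names roll_up and slice_cube and the hits measure are likewise hypothetical, and the code is not the calculation engine described here.

```python
from collections import defaultdict

# Illustrative rows only (not data from the paper).
FACTS = [
    {"page": "/home", "date": "2004-01-01", "user_agent": "Mozilla", "hits": 3},
    {"page": "/home", "date": "2004-01-02", "user_agent": "Opera", "hits": 1},
    {"page": "/cart", "date": "2004-01-01", "user_agent": "Mozilla", "hits": 2},
    {"page": "/cart", "date": "2004-01-02", "user_agent": "Mozilla", "hits": 2},
]


def roll_up(facts, dims, measure="hits"):
    """Aggregate the measure over every dimension not listed in dims."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)


def slice_cube(facts, dim, value):
    """Keep only the rows where one dimension has a fixed value."""
    return [row for row in facts if row[dim] == value]


# Roll-up: total hits per page, summed over dates and user agents.
by_page = roll_up(FACTS, ["page"])

# Derived ratio: each page's share of all hits.
total_hits = sum(by_page.values())
page_share = {key[0]: hits / total_hits for key, hits in by_page.items()}

# Slice on one user agent, then roll up by date.
mozilla_by_date = roll_up(slice_cube(FACTS, "user_agent", "Mozilla"), ["date"])
```

In a ROLAP setting the same operations would normally be pushed down to the database as GROUP BY and WHERE clauses; the in-memory version is used here only to keep the example self-contained.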
4 CONCLUSIONS AND FUTURE WORK
In this paper, we have discussed some
implementation issues of on-line analytical mining,
specifically data cube engines and their desired
functions. In fact, the observations made in
(Han et al., 1998), (Chen et al., 1999), (Chen et al.,
2000) and (Han et al., 1997) motivated us to study the
desired way to perform data cube mining and its
efficient implementation. As a result, we present our
data cube mining engine proposal, which contains
some guidelines and research perspectives for
applying data cube techniques to the analysis of
clickstreams.
REFERENCES
Chen, M. S., Han, J. and Yu, P. S., 1996. Data Mining:
An overview from a database perspective. IEEE Trans.
Knowledge and Data Engineering, 8:866-883.
Chen, Q., Dayal, U. and Hsu, M., 2000. An OLAP-based
Scalable Web Access Analysis Engine. HP Labs,
Hewlett-Packard, 1501 Page Mill Road, MS 1U4, Palo
Alto, CA 94303, USA.
Chen, Q., Dayal, U. and Hsu, M., 1999. A Distributed
OLAP Infrastructure for E-Commerce. Proc. Fourth
IFCIS Conference on Cooperative Information
Systems (CoopIS’99).
Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P. and
Uthurusamy, R., 1998. Advances in Knowledge
Discovery and Data Mining. AAAI/MIT Press.
Han, J., 1998. Towards on-line analytical mining in large
databases. ACM SIGMOD Record, 27:97-107.
Han, J., Chee, S. and Chiang, J. Y., 1998. Issues for On-
line Analytical Mining of Data Warehouses.
SIGMOD’98 Workshop on Research Issues on Data
Mining and Knowledge Discovery (DMKD’98).
Han, J., Chiang, J., Chee, S., Chen, J., Chen, Q., Cheng,
S., Gong, W., Kamber, M., Liu, G., Koperski, K., Lu,
Y., Stefanovic, N., Winstone, L., Xia, B., Zaiane, O.
R., Zhang, S. and Zhu, H., 1997. DBMiner: A system
for data mining in relational databases and data
warehouses. In Proc. CASCON'97.
Figure 2: Discovering implicit knowledge on
clickstream data cubes.
Kimball, R., 2000. The Data Webhouse Toolkit. Wiley.
Zaiane, O., Xin, M. and Han, J., 1998. Discovering web
access patterns and trends by applying OLAP and data
mining technology on web logs. In Proceedings of
Advances in Digital Libraries Conference (ADL),
pages 19-29.
ICEIS 2004 - DATABASES AND INFORMATION SYSTEMS INTEGRATION