consideration the semantic aspect while comparing
the elements of the schemas. The result of this
algorithm is a set of clusters, each containing
schemas that are semantically close.
The new algorithm can handle the requirements
of users who have different skills and belong to
different departments.
As future work, we will merge the schemas
within each cluster using the schema integration
technique to generate data mart schemas.
We also propose extending this work to deal
with other schema structures, namely those
corresponding to database schemas.
ACKNOWLEDGEMENTS
We would like to thank everyone.