the proposed model does not handle the flexibility and
the scalability of social network data.
(N. U. Rehman and Scholl, 2012) provide a DW
solution for hosting the public data stream of Twit-
ter messaging. The authors enrich the multidimen-
sional analysis of such data via content-driven discov-
ery of dimensions and classifying hierarchies. In the
first step, data mining algorithms are applied to clus-
ter dimensional data. In the second step, the acquired
classification is added as a new aggregation path to
the respective dimension, leading to the third step of
enabling this new aggregation path in OLAP queries.
Nevertheless, this work is limited to the addition of
granularity levels and ignores the other MC, such as facts
and dimensions. Moreover, the proposed model is
inflexible and not scalable.
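The three steps described above can be sketched as follows. This is only an illustrative sketch, not the implementation of (N. U. Rehman and Scholl, 2012): the clustering algorithm (a plain 1-D k-means), the follower counts, the fact rows and all names such as kmeans_1d and rollup_by_group are assumptions introduced for the example.

```python
from collections import defaultdict

def kmeans_1d(values, centroids, iterations=20):
    """Plain 1-D k-means; returns a cluster index for each value."""
    centroids = list(centroids)
    assignment = []
    for _ in range(iterations):
        # Assign each value to its nearest centroid.
        assignment = [min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
                      for v in values]
        # Move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, assignment) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignment

# Hypothetical user dimension: user id -> follower count.
followers = {"u1": 12, "u2": 30, "u3": 25, "u4": 9000, "u5": 12000}
# Hypothetical fact rows: (user id, number of tweets).
facts = [("u1", 5), ("u2", 7), ("u3", 1), ("u4", 40), ("u5", 60), ("u1", 2)]

# Step 1: cluster the dimension members.
users = list(followers)
labels = kmeans_1d([followers[u] for u in users], centroids=[0, 10000])

# Step 2: store the cluster as a new level above the user level.
user_to_group = {u: f"group_{label}" for u, label in zip(users, labels)}

# Step 3: roll the measure up along the new aggregation path.
def rollup_by_group(rows, level):
    totals = defaultdict(int)
    for user, tweets in rows:
        totals[level[user]] += tweets
    return dict(totals)

tweets_per_group = rollup_by_group(facts, user_to_group)
```

In this sketch only the mapping user_to_group changes when the clustering is re-run, so the new level can be refreshed without touching the fact rows.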
(E. Gallinucci and Rizzi, 2013) propose a methodology
called meta-stars to model topic hierarchies
in ROLAP systems. Its basic idea is to use meta-modeling
coupled with navigation tables and with traditional
dimension tables. The navigation tables support
hierarchy instances with different lengths and
with non-leaf facts, and allow different roll-up semantics
to be explicitly annotated. The meta-modeling
enables hierarchy heterogeneity and dynamics to be
accommodated. However, this work is based on a relational
approach, which has limitations regarding
schema scalability.
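To make the role of a navigation table more concrete, the sketch below stores every (ancestor, descendant, distance) pair of a topic hierarchy, which is one common way to support branches of different lengths and facts attached to non-leaf topics. It is an assumption-laden illustration rather than the meta-stars design itself: the topic names, the fact rows and the functions build_navigation_table and rollup are hypothetical, and the annotation of roll-up semantics is not modeled.

```python
from collections import defaultdict

# Hypothetical topic hierarchy (child -> parent); branches have different depths.
parent = {
    "smartphones": "electronics",
    "android": "smartphones",
    "laptops": "electronics",
    "electronics": "all",
    "politics": "all",
}

def build_navigation_table(parent):
    """Return (ancestor, descendant, distance) rows, including self-pairs."""
    topics = set(parent) | set(parent.values())
    rows = []
    for topic in topics:
        node, distance = topic, 0
        while node is not None:
            rows.append((node, topic, distance))
            node = parent.get(node)
            distance += 1
    return rows

# Hypothetical facts: (topic, number of messages). "smartphones" is a non-leaf topic.
facts = [("android", 10), ("smartphones", 4), ("laptops", 3), ("politics", 8)]

def rollup(facts, navigation, target_topics):
    """Sum the measure of every fact whose topic descends from a target topic."""
    ancestors_of = defaultdict(set)
    for ancestor, descendant, _ in navigation:
        ancestors_of[descendant].add(ancestor)
    totals = defaultdict(int)
    for topic, count in facts:
        for target in target_topics:
            if target in ancestors_of[topic]:
                totals[target] += count
    return dict(totals)

navigation = build_navigation_table(parent)
per_branch = rollup(facts, navigation, ["electronics", "politics"])
```

Because the table is keyed by topic pairs rather than by fixed level columns, adding a deeper branch only adds rows and does not change the table layout.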
(Moalla and Nabli, 2014) present a method for
constructing a multidimensional schema from unstructured
data extracted from SN. The schema is built
from a Facebook page in order to analyze
customers' opinions. A real case study has been developed
to illustrate the proposed method and to confirm
that SN analysis can predict the success prospects
of products. Nevertheless, the dynamic discovery
of MC is not supported, and the proposed model is neither
flexible nor adaptable to the huge amount of social
data.
Based on the previous study, most of the works
give no indication of a dynamic determination of
MC, despite the velocity of SN data. In addition, the DW
schemas are generally fixed at the design stage.
2.2 Dynamic Discovery of
Multidimensional Concepts
Nowadays, we are experiencing a rapid growth of social
structures supported by communication technologies
and various Web-based services. Due to their scale,
complexity and dynamicity, user-generated data from
SN are very difficult to store and analyze with
traditional data warehousing methods (N. U. Rehman
and Scholl, 2012). To overcome these problems,
many authors have worked on the dynamic discovery of
MC and have used data mining to build a DW.
In this context, (Usman and Pears, 2011) provide
a methodology to semi-automatically design a DW
schema with hierarchical clustering. The latter is
used to pre-process the data. After
that, the system identifies both facts and dimensions
in the clustered data.
Rehman proposes a system to dynamically build
hierarchies based on data from Twitter (N. U. Rehman
and Scholl, 2012). This paper has two points of interest: a)
the cube is built on original data, namely the messages
of users on a SN; b) data mining is used to
dynamically build hierarchies. Thanks to data mining,
the categories of network users described in the hierarchies
are updated automatically. On the other
hand, Ceci uses hierarchical clustering to integrate
continuous variables as dimensions in a DW schema
(M. Ceci and Malerba, 2011). It discretizes a continuous
dimension so that the user can apply the existing
cube querying operations: Roll-up and Drill-down.
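The idea of discretizing a continuous dimension with hierarchical clustering can be illustrated with the minimal sketch below. It is not the algorithm of (M. Ceci and Malerba, 2011): we assume a simple single-linkage merge of neighbouring intervals on sorted values, and the age data and the functions agglomerate_1d and bucket are hypothetical.

```python
def agglomerate_1d(values):
    """Merge adjacent intervals bottom-up; return one partition per level.

    levels[0] is the finest partition (one interval per distinct value),
    levels[-1] is the coarsest (a single interval).
    """
    clusters = [[v] for v in sorted(set(values))]
    levels = [[(c[0], c[-1]) for c in clusters]]
    while len(clusters) > 1:
        # Single linkage on sorted data: merge the two neighbours with the smallest gap.
        gaps = [clusters[i + 1][0] - clusters[i][-1] for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
        levels.append([(c[0], c[-1]) for c in clusters])
    return levels

def bucket(value, partition):
    """Return the (low, high) interval of the partition that contains value."""
    for low, high in partition:
        if low <= value <= high:
            return (low, high)
    raise ValueError(f"{value} is outside the partition")

# Hypothetical continuous dimension: customer age.
ages = [18, 19, 21, 40, 42, 70]
levels = agglomerate_1d(ages)

coarse = levels[-2]   # two intervals: a roll-up view
fine = levels[-3]     # three intervals: a drill-down view
```

Each entry of levels plays the role of one hierarchy level: moving towards the end of the list corresponds to a Roll-up, moving towards the start to a Drill-down.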
More recently, (L. Sautot and Molin,
2014) propose using hierarchical agglomerative clustering
with a metric that comes from ecological studies
to semi-automatically build hierarchical dimensions
in an OLAP cube. The authors perform hierarchical
clustering on heterogeneous data sets that contain
qualitative and quantitative variables. They offer
a prototype of an automatic system which builds dimensions
for an OLAP cube, and measure the performance
of this system according to the number of clustered
individuals and the number of variables
used for clustering.
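As an illustration of agglomerative clustering on mixed qualitative and quantitative variables, the sketch below combines a mixed-type dissimilarity with a naive average-linkage procedure. The text above does not name the metric used by the authors, so Gower's dissimilarity is chosen here purely as an assumption; the records and the functions gower and average_linkage are likewise hypothetical.

```python
def gower(a, b, ranges):
    """Gower dissimilarity between two records with mixed variable types.

    ranges[i] is the value range of a numeric variable, or None for a categorical one.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            total += 0.0 if x == y else 1.0        # categorical: simple mismatch
        else:
            total += abs(x - y) / r if r else 0.0  # numeric: range-normalised difference
    return total / len(ranges)

def average_linkage(records, ranges, n_clusters):
    """Naive agglomerative clustering; returns clusters as lists of record indices."""
    clusters = [[i] for i in range(len(records))]

    def distance(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(gower(records[i], records[j], ranges) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        _, i, j = min((distance(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

# Hypothetical records: (habitat, altitude in metres).
records = [("forest", 200), ("forest", 250), ("meadow", 900), ("meadow", 1000)]
ranges = [None, 800]   # habitat is categorical; altitude spans 200..1000
clusters = average_linkage(records, ranges, n_clusters=2)
```

Calling average_linkage with several values of n_clusters would give nested groupings that could serve as candidate levels of a dimension hierarchy.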
Table 1 summarizes the literature review,
which is based on seven criteria (Concept M.:
Conceptual Model, D. MC: multidimensional concepts,
Methodology, SN: Social Network, Ontology,
Flexibility, Scalability).
All the mentioned works present several interesting
uses of mining. It has been recognized that mining techniques
such as clustering can help in designing a DW
schema. That is why we adopt this orientation for
the dynamic discovery of MC. However, none of the
reviewed works has dealt with semantic heterogeneity. Moreover,
none has followed a mixed approach
(data/demand driven approach). Furthermore, it is
worth noting that only one work has addressed the scalability
of the schema. At the same time, the heterogeneity
and the growth of social data need to be
considered in order to properly retrieve the needed data.
The frequent arrival of new needs requires that the
system be adaptable to changes.
Based on the above discussion, there is a strong
need of a significant methodology that allows a dy-
ICEIS 2015 - 17th International Conference on Enterprise Information Systems