
such as Twitter (TW), Facebook (FB), and
YouTube (YT) used by the DW and ETL models.
• ETL Design. It verifies whether the approach proposes
an ETL process model to map UGC into sentiment.
• ETL concepts. It checks whether the approach de-
fines or formalizes the ETL concepts.
• SA method. It determines if the sentiment is gen-
erated based on a valid sentiment analysis method.
Although some approaches have proposed solutions
to integrate opinions from textual UGC, we note that
ETL processes that transform UGC text into DW data
still need to be modeled. Our contribution addresses
this problem at a conceptual level. The models
(cf. Sections 3.2 and 4) serve as design patterns
for opinion data integration and simplify the ETL
designer’s task. Moreover, the
ETL4Social-Sentiment-Process and operation models
are designed to apply to all social media types for
sentiment analysis. They provide ETL designers with
a standardized approach to optimizing social ETL
processes using formalized concepts.
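
To make the ETL Design and SA method criteria more concrete, the
following minimal Python sketch maps UGC text to a member of a
sentiment dimension. It is an illustration under stated assumptions,
not the models proposed in this paper: the toy lexicon, the `UGC`
record, the surrogate keys in `SENTIMENT_DIM`, and all function names
are hypothetical, and a real design would rely on a validated
sentiment analysis method.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; a real pipeline would plug in a validated SA method.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}

# Hypothetical members of a SentimentDim-like dimension (surrogate key -> label).
SENTIMENT_DIM = {1: "negative", 2: "neutral", 3: "positive"}


@dataclass
class UGC:
    source: str  # social medium code, e.g. "TW", "FB", "YT"
    text: str    # raw user-generated text


def score(text: str) -> int:
    """Sum lexicon polarities over lowercased tokens stripped of punctuation."""
    tokens = (tok.strip(".,!?;:").lower() for tok in text.split())
    return sum(LEXICON.get(tok, 0) for tok in tokens)


def to_sentiment_key(polarity: int) -> int:
    """Map a polarity score to a surrogate key of SENTIMENT_DIM."""
    if polarity < 0:
        return 1
    if polarity > 0:
        return 3
    return 2


def run_etl(records):
    """Extract each record, transform its text to a key, load a fact row."""
    facts = []
    for rec in records:
        key = to_sentiment_key(score(rec.text))
        facts.append({"source": rec.source, "sentiment_key": key})
    return facts


if __name__ == "__main__":
    sample = [
        UGC("TW", "I love this phone, great battery!"),
        UGC("YT", "Awful sound. I hate it."),
        UGC("FB", "It arrived on Monday."),
    ]
    for row in run_etl(sample):
        print(row["source"], SENTIMENT_DIM[row["sentiment_key"]])
```

Keeping the scoring step (`score`) separate from the dimension lookup
(`to_sentiment_key`) mirrors the idea of treating the SA method as a
replaceable operation inside the ETL process.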
7 CONCLUSION
Integrating opinion data from unstructured text
sources into a decisional system is challenging
when designing ETL processes. A social data
warehouse can help, but user-generated content must
be handled carefully to identify sentiment. Our
research aimed to develop practical approaches for
sentiment analysis on social media. We proposed
design models for the ETL4Social-Sentiment process
and operations. These models handle the activities
and data needed to match UGC text with the
SentimentDim dimension of the SDW. The models are
generic and can be customized based on the
formalized ETL concepts. Big data sources require
powerful ETL tools that are efficient in execution
cost, transformations, and parallel data processing.
To improve our proposal, we plan to use MapReduce
as a distributed execution framework to process big
data in parallel, saving time and reducing the risk
of errors.
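
The MapReduce direction mentioned above can be pictured with a small
single-machine sketch in Python. It only imitates the map, shuffle,
and reduce phases with a thread pool and is not a Hadoop job or the
implementation planned by the authors; the word sets, function names,
and the `(source, text)` record layout are assumptions made for
illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy word sets standing in for a real sentiment analysis method.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def map_phase(record):
    """Emit one ((source, label), 1) pair for a (source, text) record."""
    source, text = record
    tokens = {tok.strip(".,!?;:").lower() for tok in text.split()}
    polarity = len(tokens & POSITIVE) - len(tokens & NEGATIVE)
    if polarity > 0:
        label = "positive"
    elif polarity < 0:
        label = "negative"
    else:
        label = "neutral"
    return (source, label), 1


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values of each key to obtain per-source sentiment counts."""
    return {key: sum(values) for key, values in groups.items()}


def count_sentiments(records, workers=4):
    # Threads only imitate parallel mappers; a cluster would distribute input splits.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pairs = list(pool.map(map_phase, records))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    data = [
        ("TW", "I love it"),
        ("TW", "Great phone"),
        ("YT", "awful, I hate it"),
        ("FB", "It arrived"),
    ]
    print(count_sentiments(data))
```

Because each record is mapped independently and the reduce step is a
plain sum per key, the same decomposition could in principle be handed
to a distributed framework without changing the per-record logic.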
ENASE 2024 - 19th International Conference on Evaluation of Novel Approaches to Software Engineering