Warehousing Data for Brand Health and Reputation with AI-Driven Scores in NewSQL Architectures: Opportunities and Challenges
Paulo Siqueira, Rodrigo Dias, João Silva-Leite, Paulo Mann, Rodrigo Salvador, Daniel de Oliveira, Marcos Bedo
2025
Abstract
This study explores the use of NewSQL systems for brand health and reputation analysis, focusing on multidimensional modeling and Data Warehouses. While row-based and relational OLAP systems (ROLAP) struggle to ingest large volumes of data and NoSQL alternatives rely on physically coupled models, NewSQL solutions enable Data Warehouses to maintain their multidimensional schemas, which can be seamlessly implemented across various physical models, including columnar and key-value structures. Additionally, NewSQL provides ACID guarantees for data updates, which is instrumental when data curation involves human supervision. To address these challenges, we propose a Star schema model to analyze brand health and reputation, focusing on the ingestion of large volumes of data from social media and news sources. The ingestion process also includes rapid data labeling through a large language model (GPT-4o), which is later refined by human experts through updates. To validate this approach, we implemented the Star schema in a system called RepSystem and tested it across four NewSQL systems: Google Spanner, CockroachDB, Snowflake, and Amazon Aurora. An extensive evaluation revealed that NewSQL systems significantly outperformed the baseline ROLAP (a multi-sharded PostgreSQL instance) in terms of: (i) data ingestion time, (ii) query performance, and (iii) maintenance and storage. Results also indicated that the primary bottleneck of RepSystem lies in the classification process, which may hinder data ingestion. These findings highlight how NewSQL can overcome the drawbacks of row-based systems while maintaining the logical model, and suggest the potential for integrating AI-driven strategies into data management to optimize both data curation and ingestion.
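As a rough illustration of the ideas in the abstract, the sketch below builds a tiny Star schema, stores model-assigned scores in a fact table, and lets a human reviewer overwrite a score inside a transaction. It is not the RepSystem schema: every table name, column name, and function (`dim_brand`, `dim_source`, `fact_mention`, `refine_label`, `average_score`) is a hypothetical placeholder, the scores are made-up sample values, and SQLite stands in for a NewSQL engine purely so the example runs locally.

```python
import sqlite3


def build_schema(conn):
    # Hypothetical Star schema: two dimensions and one fact table.
    conn.executescript("""
        CREATE TABLE dim_brand (
            brand_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL
        );
        CREATE TABLE dim_source (
            source_id INTEGER PRIMARY KEY,
            kind      TEXT NOT NULL
        );
        CREATE TABLE fact_mention (
            mention_id INTEGER PRIMARY KEY,
            brand_id   INTEGER NOT NULL REFERENCES dim_brand(brand_id),
            source_id  INTEGER NOT NULL REFERENCES dim_source(source_id),
            score      REAL NOT NULL,
            labeled_by TEXT NOT NULL CHECK (labeled_by IN ('model', 'human'))
        );
    """)


def refine_label(conn, mention_id, new_score):
    # Human correction of a model-assigned score, applied atomically:
    # the connection context manager commits on success and rolls back
    # if an exception is raised inside the block.
    with conn:
        cur = conn.execute(
            "UPDATE fact_mention SET score = ?, labeled_by = 'human' "
            "WHERE mention_id = ?",
            (new_score, mention_id),
        )
        if cur.rowcount != 1:
            raise KeyError(mention_id)


def average_score(conn, brand_name):
    # Typical Star-schema query: join the fact table to a dimension
    # and aggregate the measure.
    row = conn.execute(
        "SELECT AVG(f.score) FROM fact_mention f "
        "JOIN dim_brand b ON b.brand_id = f.brand_id "
        "WHERE b.name = ?",
        (brand_name,),
    ).fetchone()
    return row[0]


def demo():
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    conn.execute("INSERT INTO dim_brand VALUES (1, 'ExampleBrand')")
    conn.executemany(
        "INSERT INTO dim_source VALUES (?, ?)",
        [(1, "social"), (2, "news")],
    )
    # Sample scores standing in for labels produced by a language model.
    conn.executemany(
        "INSERT INTO fact_mention VALUES (?, ?, ?, ?, ?)",
        [(1, 1, 1, 0.5, "model"), (2, 1, 2, -0.5, "model")],
    )
    conn.commit()
    before = average_score(conn, "ExampleBrand")
    refine_label(conn, 2, 0.5)  # a reviewer overrides mention 2
    after = average_score(conn, "ExampleBrand")
    return before, after


if __name__ == "__main__":
    print(demo())
```

The logical schema here is independent of how rows are physically stored, which is the property the abstract attributes to NewSQL systems; the transactional update mirrors the role ACID guarantees play when human experts revise model-generated labels.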
Paper Citation
in Harvard Style
Siqueira P., Dias R., Silva-Leite J., Mann P., Salvador R., de Oliveira D. and Bedo M. (2025). Warehousing Data for Brand Health and Reputation with AI-Driven Scores in NewSQL Architectures: Opportunities and Challenges. In Proceedings of the 27th International Conference on Enterprise Information Systems - Volume 1: ICEIS; ISBN 978-989-758-749-8, SciTePress, pages 51-62. DOI: 10.5220/0013279800003929
in Bibtex Style
@conference{iceis25,
author={Paulo Siqueira and Rodrigo Dias and João Silva-Leite and Paulo Mann and Rodrigo Salvador and Daniel de Oliveira and Marcos Bedo},
title={Warehousing Data for Brand Health and Reputation with AI-Driven Scores in NewSQL Architectures: Opportunities and Challenges},
booktitle={Proceedings of the 27th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2025},
pages={51-62},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013279800003929},
isbn={978-989-758-749-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 27th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Warehousing Data for Brand Health and Reputation with AI-Driven Scores in NewSQL Architectures: Opportunities and Challenges
SN - 978-989-758-749-8
AU - Siqueira P.
AU - Dias R.
AU - Silva-Leite J.
AU - Mann P.
AU - Salvador R.
AU - de Oliveira D.
AU - Bedo M.
PY - 2025
SP - 51
EP - 62
DO - 10.5220/0013279800003929
PB - SciTePress