Authors:
Muhammad Arslan and Christophe Cruz
Affiliation:
Laboratoire Interdisciplinaire Carnot de Bourgogne (ICB), Université de Bourgogne, Dijon, France
Keyword(s):
Business Intelligence (BI), Decision-Making, Information Extraction (IE), Large Language Models (LLMs), Natural Language Processing (NLP), Retrieval-Augmented Generation (RAG).
Abstract:
Enterprises depend on diverse data such as invoices, news articles, legal documents, and financial records to operate. Efficient Information Extraction (IE) is essential for deriving valuable insights from this data for decision-making. Natural Language Processing (NLP) has transformed IE, enabling rapid and accurate analysis of vast datasets. Tasks such as Named Entity Recognition (NER), Relation Extraction (RE), Event Extraction (EE), Term Extraction (TE), and Topic Modeling (TM) are vital across sectors. Yet implementing these methods individually can be resource-intensive, especially for smaller organizations lacking Research and Development (R&D) capabilities. Large Language Models (LLMs), powered by Generative Artificial Intelligence (GenAI), offer a cost-effective alternative, handling multiple IE tasks within a single model. Despite their capabilities, LLMs may struggle with domain-specific queries, leading to inaccuracies. To overcome this challenge, Retrieval-Augmented Generation (RAG) complements LLMs by grounding IE in external data retrieval, improving accuracy and relevance. While the adoption of RAG with LLMs is increasing, comprehensive business applications utilizing this integration remain limited. This paper addresses this gap by introducing a novel application named Business-RAG, showcasing its potential and encouraging further research in this domain.
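The abstract describes RAG as a retrieve-then-generate pattern: external business documents are retrieved and used to ground the LLM's answer. The following is a minimal, self-contained Python sketch of that general pattern, not the paper's Business-RAG implementation; the document collection, the overlap-based retriever, and the `call_llm` placeholder are illustrative assumptions standing in for a real vector store and LLM API.

```python
# Minimal sketch of the retrieve-then-generate (RAG) pattern described above.
# All names (business_documents, call_llm, etc.) are illustrative placeholders.

def tokenize(text: str) -> set[str]:
    """Lowercase word-level tokens used for simple overlap-based retrieval."""
    return set(text.lower().split())


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a stand-in for a vector store)."""
    query_tokens = tokenize(query)
    scored = sorted(documents, key=lambda d: len(query_tokens & tokenize(d)), reverse=True)
    return scored[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Augment the query with retrieved enterprise context before generation."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\nContext:\n{joined}\nQuestion: {query}"


def call_llm(prompt: str) -> str:
    """Placeholder for a call to any LLM API; replace with a real client."""
    return f"[LLM response conditioned on prompt of {len(prompt)} characters]"


if __name__ == "__main__":
    business_documents = [
        "Invoice 2024-017: ACME Corp, total 12,400 EUR, due 30 June 2024.",
        "Legal notice: supplier contract with ACME Corp renewed for 24 months.",
        "Financial record: Q1 revenue grew 8% year over year.",
    ]
    question = "What is the total amount of the ACME Corp invoice?"
    context = retrieve(question, business_documents)
    print(call_llm(build_prompt(question, context)))
```

In a production setting the overlap-based retriever would typically be replaced by embedding similarity over an indexed document store, but the control flow (retrieve, augment the prompt, then generate) is the same.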