Manipulating Prompts and Retrieval-Augmented Generation for LLM Service Providers

Aditya Kuppa, Jack Nicholls, Nhien-An Le-Khac

2024

Abstract

The emergence of large language models (LLMs) has revolutionized the field of AI, introducing a new era of generative models applied across diverse use cases. Within this evolving AI application ecosystem, numerous stakeholders, including LLM and AI application service providers, use these models to cater to user needs. A significant challenge arises because end-users have little visibility into, or understanding of, the inner workings of these models. This lack of transparency can lead to concerns about how the models are being used, how outputs are generated, the nature of the data they are trained on, and the potential biases they may harbor. User trust therefore becomes a critical aspect of deploying and managing these advanced AI applications. This paper highlights the safety and integrity issues associated with service providers who may introduce covert, unsafe policies into their systems. Our study focuses on two attacks: the injection of biased content in generative AI search services, and the manipulation of LLM outputs during inference by altering attention heads. Through empirical experiments, we show that malicious service providers can covertly inject malicious content into the outputs generated by LLMs without the awareness of the end-user. This study reveals the subtle yet significant ways LLM outputs can be compromised, highlighting the importance of vigilance and advanced security measures in AI-driven applications. We demonstrate empirically that it is possible to increase the citation score of LLM output so that it includes erroneous or unnecessary sources of information, redirecting a reader to a desired source of information.

Paper Citation


in Harvard Style

Kuppa A., Nicholls J. and Le-Khac N. (2024). Manipulating Prompts and Retrieval-Augmented Generation for LLM Service Providers. In Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT; ISBN 978-989-758-709-2, SciTePress, pages 777-785. DOI: 10.5220/0012803100003767


in Bibtex Style

@conference{secrypt24,
author={Aditya Kuppa and Jack Nicholls and Nhien-An Le-Khac},
title={Manipulating Prompts and Retrieval-Augmented Generation for LLM Service Providers},
booktitle={Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2024},
pages={777-785},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012803100003767},
isbn={978-989-758-709-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - Manipulating Prompts and Retrieval-Augmented Generation for LLM Service Providers
SN - 978-989-758-709-2
AU - Kuppa A.
AU - Nicholls J.
AU - Le-Khac N.
PY - 2024
SP - 777
EP - 785
DO - 10.5220/0012803100003767
PB - SciTePress