Agent-Centric Projection of Prompting Techniques and Implications for Synthetic Training Data for Large Language Models
Dhruv Dhamani (https://orcid.org/0009-0003-8226-7621) and Mary Lou Maher (https://orcid.org/0000-0002-4150-0322)
University of North Carolina, Charlotte, U.S.A.
Keywords:
Large Language Models (LLMs), Task-Oriented LLM System, Prompt Engineering, Large Language
Model-Based Agent, LLM-Based Multi-Agent System, Synthetic Training Data, Artificial Intelligence in
Problem Solving.
Abstract:
Recent advances in prompting techniques and multi-agent systems for Large Language Models (LLMs) have
produced increasingly complex approaches. However, we lack a framework for characterizing and comparing
prompting techniques or understanding their relationship to multi-agent LLM systems. This position paper
introduces and explains the concepts of linear contexts (a single, continuous sequence of interactions) and
non-linear contexts (branching or multi-path) in LLM systems. These concepts enable the development of
an agent-centric projection of prompting techniques, a framework that can reveal deep connections between
prompting strategies and multi-agent systems. We propose three conjectures based on this framework: (1)
results from non-linear prompting techniques can predict outcomes in equivalent multi-agent systems, (2)
multi-agent system architectures can be replicated through single-LLM prompting techniques that simulate
equivalent interaction patterns, and (3) these equivalences suggest novel approaches for generating synthetic
training data. We argue that this perspective enables systematic cross-pollination of research findings between
prompting and multi-agent domains, while providing new directions for improving both the design and training
of future LLM systems.
1 INTRODUCTION
Large Language Models (LLMs) are a recent devel-
opment in Generative Artificial Intelligence that can
mimic human-like behavior (Park et al., 2023), es-
pecially in conversations (Cai et al., 2023). LLMs
have also shown a kind of general intelligence (Rad-
ford et al., 2019; Yogatama et al., 2019). Central
to harnessing the capabilities of LLMs is the con-
cept of prompting, a strategy that significantly influ-
ences task performance by instructing LLMs in spe-
cific ways (Chen et al., 2023).
HYPOTHESIS AND GOALS: In this position pa-
per, we hypothesize that viewing prompting tech-
niques through a proposed agent-centric lens can help
uncover structural equivalences between single-LLM
prompting and multi-agent approaches. Our goal is
to (1) introduce a unified framework for comparing
these techniques, (2) develop and examine conjec-
tures about their relationship, and (3) outline how
this perspective can inform the generation of synthetic
training data.
Consider a simple math problem. When we di-
rectly prompt an LLM, “What is 13 × 27?”, we might
receive a single numeric answer. However, when we
ask, “Let’s solve this step by step: what is 13 × 27?”,
we explicitly prompt for intermediate reasoning plus
the final result (Kojima et al., 2023; Yu et al., 2023).
While both prompts seek the same final answer, are the two still the same problem if one of them has a different “correct” answer?
Another approach to improving end-task perfor-
mance when using LLMs has been to incorporate
“reasoning” (OpenAI, 2024). The model outputs ar-
bitrarily long “reasoning traces” before responding to
the prompts. These traces are sequences of natural
language statements like “Let’s first understand the
input and output formats”. OpenAI o1 is a single large
language model or agent.
If we simply added role identifiers before each statement (“Analyst: Let’s first understand the input and output formats”), would it suddenly qualify as a multi-agent system?
What about approaches where an LLM analyzes
problems from multiple perspectives in separate con-
versations before merging all perspectives together in
another conversation (Saha et al., 2023)? Is each con-
versation a different “agent” performing a subtask? Is
this a multi-agent system?
We argue that these questions can be systemat-
ically addressed by viewing prompting techniques
through an agent-centric lens. By developing the
concepts of linear and non-linear contexts in LLM
systems, we shed light on possible connections be-
tween single-LLM prompting techniques and multi-
agent systems. We discuss implications for the future
of LLM systems, from enabling cross-pollination of
research findings between prompting techniques and
multi-agent systems to suggesting novel approaches
for generating synthetic training data that could en-
hance capabilities in both domains.
To develop this argument, we first establish foundational definitions and examine previous work on prompting techniques and task-oriented LLM systems (§ 2). Building on these foundations, we present our framework for agent-centric projection and explore its implications for both system design and training (§ 3). We conclude by discussing the larger impact of this perspective on future research in LLM systems (§ 4).
2 DEFINITIONS AND PRIOR WORK
A task-oriented LLM system is a Large Language Model (LLM) system configured to perform specific tasks rather than open-ended conversations.¹ Such systems have shown promise in complex tasks such as software development (Hong et al., 2023), where the system must manage multiple rounds of interaction, maintain context throughout iterations, and often collaborate with other systems or agents to complete the task.

We begin by defining a minimal task-oriented LLM system (§ 2.1), with particular attention to how such systems manage context across multiple interactions. We then examine prompting techniques (§ 2.2), focusing on how different approaches to prompting lead to different patterns of context creation and management. These patterns form the basis for our novel concepts of linear and non-linear contexts, which enable an agent-centric projection of prompting techniques.

¹In (Xi et al., 2023), the authors describe task-oriented deployments of LLM-based agents, which we generalize to simply task-oriented LLM systems.
2.1 Minimal Task-Oriented LLM System
A minimal task-oriented LLM system is a minimal
LLM system that can be instructed to solve tasks.
Thus, we start by defining a minimal LLM system.
Large Language Models are auto-regressive models that accept input tokens and use them as history (often referred to as context) to compute probabilities of all tokens in their vocabulary as the next token. We can sample from this probability distribution using a sampling/decoding algorithm to generate text. This process is then repeated until the LLM predicts a special token, or a special sequence of tokens, that marks the end of the text (Feuerriegel et al., 2023).²

²The Generative AI system description in (Feuerriegel et al., 2023) includes any UI components as part of the Generative AI system; we use a modified definition here that includes only the language model and the sampling/decoding procedure.
We call this a bare-bones LLM system (see Fig-
ure 1) as it contains the minimal components needed
for text generation, without additional components to
help with context management. Every time an LLM is prompted with context $C_n$, it generates a response $R_n$ that would need to be stored in context $C_{n+1}$ for the next prompt, assuming multiple rounds of instruction and response generation are required.
Figure 1: A bare-bones LLM system.
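The generation loop just described can be made concrete with a minimal sketch. The toy bigram table below merely stands in for a real LLM's forward pass and is not part of the definition; only the sampling loop mirrors the description above.

```python
# A minimal sketch of the bare-bones LLM system; the bigram table is a
# toy stand-in for a real model's next-token distribution.
import random

VOCAB = ["<eos>", "the", "cat", "sat"]
BIGRAM = {  # toy "model": P(next token | last token)
    None:  [0.0, 0.7, 0.2, 0.1],
    "the": [0.1, 0.0, 0.6, 0.3],
    "cat": [0.2, 0.1, 0.0, 0.7],
    "sat": [0.9, 0.05, 0.05, 0.0],
}

def next_token_probs(tokens):
    # A real LLM would compute this distribution from the whole context.
    last = tokens[-1] if tokens else None
    return BIGRAM[last]

def generate(context, max_tokens=20):
    out = list(context)
    for _ in range(max_tokens):
        probs = next_token_probs(out)                    # distribution over the vocabulary
        token = random.choices(VOCAB, weights=probs)[0]  # sampling/decoding step
        if token == "<eos>":                             # special end-of-text token
            break
        out.append(token)
    return out[len(context):]

print(generate(["the"]))  # e.g. ['cat', 'sat']
```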
For systems oriented towards solving even moderately complex tasks, context management quickly becomes cumbersome. For example, in (Saha et al., 2023), the authors describe a system in which $m$ branches are created from an LLM response to a prompt $C_n$, producing a set of $m$ responses $r = \{R_n^1, R_n^2, \ldots, R_n^m\}$. Then, they take all of $r$ and transform it into a prompt $C_{n+1}$, in which they instruct the LLM to merge all responses in $r$ into a single response $R_{n+1}$, which potentially needs to be stored in context $C_{n+2}$ for the next prompt. A similarly complex system is described in (Ning et al., 2023), and we will examine more examples in our discussion of prompting techniques.
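A minimal sketch of this branch-and-merge flow follows; `llm` stands in for any callable from prompt text to response text, and the prompt wording is our own illustrative assumption rather than the one used in (Saha et al., 2023).

```python
# A sketch of the m-branch merge flow described above.
def branch_and_merge(llm, c_n, m=3):
    # Create m branches from variations of the prompt C_n.
    r = [llm(f"{c_n}\n\n(Consider this from perspective {i + 1}.)")
         for i in range(m)]
    # Transform all of r into the next prompt C_{n+1}...
    c_next = "Merge the following responses into one:\n" + "\n---\n".join(r)
    # ...and instruct the LLM to merge them into a single response R_{n+1}.
    return llm(c_next)
```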
If we define a minimal LLM system without de-
scribing how context is managed, it would be too dif-
ficult to compare different systems and apply learn-
ings from one researched system to another. Because
of this, we include a description of a minimal con-
text management subsystem within our definition of a
minimal task-oriented LLM system.
Specifically, we include a context store $CS$. Initially, the context store $CS$ is empty, and the first time an LLM is provided with context $C_1$ to generate $R_1$, both the prompt and the response are permanently appended to $CS$. For all future requests, the LLM is first provided with a sliding window of content from $CS$ as context, to which it appends the prompt $C_n$ to generate $R_n$. Once the response $R_n$ is generated, both the prompt and the response are permanently appended to $CS$. The model then closely matches messaging: the context store $CS$ acts as chat history, users send messages, and the LLM responds.
Figure 2: A minimal LLM system that includes a context
store.
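A minimal context store of this kind can be sketched as follows; the window size and message format are our own assumptions, as the definition above only requires permanent appends and a sliding window.

```python
# A minimal sketch of the context store CS described above.
class ContextStore:
    def __init__(self, window=8):
        self.history = []      # (C_n, R_n) pairs, appended permanently
        self.window = window   # how many recent pairs to expose as context

    def build_context(self, prompt):
        # Sliding window of recent pairs, with the new prompt C_n appended.
        recent = self.history[-self.window:]
        return [text for pair in recent for text in pair] + [prompt]

    def commit(self, prompt, response):
        # Both the prompt and the response are permanently appended to CS.
        self.history.append((prompt, response))

def chat_turn(llm, store, prompt):
    # llm: any callable from a list of context strings to a response string.
    response = llm(store.build_context(prompt))
    store.commit(prompt, response)
    return response
```

Repeated calls to `chat_turn` then reproduce the messaging pattern described above, with `store.history` acting as the chat history.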
We discuss the implications of this composition of
a minimal LLM system in § 3.2.
2.2 Prompting Techniques
Prompting refers to the act of constructing and pro-
viding input text (a prompt) to an LLM. In the con-
text of task-oriented LLM systems, prompt engineer-
ing can be defined as iteratively creating and adapting
a prompt for a given LLM and task pair.
The way an LLM is prompted significantly affects
task performance (Nori et al., 2023; Savage et al.,
2023). There are many surprising results in this area; for example, letting an LLM know that solving a task “is very important to my career” can improve task performance (Li et al., 2023).
Such results can be explained by research such
as (Hendel et al., 2023), which shows that in-context
learning creates task vectors or representations within
the LLM that increase the probability of correct task
completion. Other research has shown that it is possi-
ble to “search” for prompts that are more likely to lead
to success, analogous to finding task vectors that are
more likely to lead to success. In (Zou et al., 2023),
the authors were able to procedurally find adversarial suffixes, which, when appended to prompts, result in LLMs breaking their alignment and engaging in unsafe behavior.
All of these are examples of modifying the prompt
without changing the actual task/problem definition,
to make the successful completion of the intended
task more likely. However, researchers often modify prompts in a manner that changes the task itself, rather than in a manner that merely improves performance on the same task.
For example, when using Chain-of-Thought prompting (Wei et al., 2023),³ or when asking LLMs to think step-by-step (Kojima et al., 2023), the task meaningfully changes. It goes from instructing the LLM to give me an answer now to asking it to first plan out a solution, and then share an answer. This is a different task being solved, even though the final deliverable (the answer) is the same. It should be a given that LLMs have different capabilities for different tasks.

³As described in the paper, one is also to provide in-context examples, but this is unnecessary, as shown in (Kojima et al., 2023).
This is not to say that we should not solve equivalent tasks that LLMs are better suited to, but that it is problematic to place prompt modification (which leaves the instructions/task definition intact) and instruction modification in the same category. Thus, we make the distinction between prompt engineering and instruction engineering:
PROMPT ENGINEERING: The act of modify-
ing the prompt without changing the actual
task/problem definition or adding relevant knowl-
edge/information, to make the successful comple-
tion of the intended task more likely. We restrict
the addition of relevant knowledge/information to
LLM augmentation to avoid an overlap.
INSTRUCTION ENGINEERING: The act of mod-
ifying the prompt in a manner that changes the
task/problem to an equivalent task/problem that
the LLMs are more suitable for, such that the final
deliverable (the answer) is the same.
In (Besta et al., 2023), the authors describe a taxonomy of techniques to improve reliability in task-oriented text generation:

INPUT-OUTPUT. The LLM is directly instructed to respond with the result of a prompted task.

INPUT-OUTPUT WITH ADDITIONAL STEPS. The LLM is instructed to perform additional steps before or after generating a result for a prompted task, such as reflecting on its response and refining it, or creating a plan (Madaan et al., 2023; Wei et al., 2023).

SINGLE INPUT-MANY OUTPUT.⁴ The LLM is passed the same input prompt multiple times, and elaborate mechanisms are used to choose the final answer (Wang et al., 2022).

INPUT WITH NON-LINEAR INTERMEDIARY STEPS.⁵ The LLM branches into multiple paths (through variations of an input prompt), generating multiple responses as additional steps, and then merges them into a single response (Saha et al., 2023; Ning et al., 2023).

TREE OF THOUGHTS. An elaborate method described in (Yao et al., 2023), where many intermediate thought branches are explored, backtracked, and pruned until a final answer is settled on.

GRAPH OF THOUGHTS. An elaborate method described in (Besta et al., 2023), where intermediate thoughts are modeled as a connected graph, and the LLM traverses the graph to settle on a final answer.

⁴Referred to as Multiple CoTs in (Besta et al., 2023).
⁵This is not described in (Besta et al., 2023).
The way these techniques manage context dif-
fers significantly, from linear interactions to branch-
ing paths of thought. In the next section, we intro-
duce the concepts of Linear and Non-Linear contexts
to formalize these differences, and show how this
formalization enables an Agent-centric projection of
prompting techniques with potential implications for
synthetic training data generation.
3 FRAMEWORK AND CONJECTURES
In research and practice, LLM systems exhibit different patterns in how they manage context and generate responses. We argue that these patterns can be understood through a theoretical framework that connects prompting techniques with multi-agent systems, revealing opportunities for improving both system design and training. In this section, we first introduce a formal categorization of context management patterns in LLM systems (§ 3.1). Building on this foundation, we develop an agent-centric projection of prompting techniques (§ 3.2) that reveals deep connections between seemingly disparate areas. Finally, we explore how this unified perspective suggests novel approaches to synthetic training data generation (§ 3.3), with potentially far-reaching implications for improving LLM capabilities.
3.1 Linear and Non-Linear Context in LLM Systems
To formally characterize how task-oriented LLM sys-
tems manage context and generate responses, we de-
velop a framework based on message flow patterns.
Building on the minimal task-oriented LLM system concept (§ 2.1), we analyze how the context store maintains sequences of messages $M = \{(C_n, R_n)\}_{n=1}^{N}$, where each response $R_n$ is generated using all previous context-response pairs.
Using this foundation, we propose a method for
classifying prompting techniques and their resulting
task-oriented LLM systems into two categories based
on their context management patterns.
PROMPTING TECHNIQUES WITH LINEAR CONTEXT – where there exists exactly one continuous sequence of messages $M = \{(C_n, R_n)\}_{n=1}^{N}$ that contains all generated messages and input contexts in the correct chronological order.
All Input-Output and Input-Output with additional
steps techniques (as described in § 2.2) can be classi-
fied as having a linear context, as they all involve a
single continuous sequence of messages.
For example, consider Self-Refine (Madaan et al.,
2023), where each response is iteratively refined using
all previous context-response pairs until a stop condi-
tion is met.
PROMPTING TECHNIQUES WITH NON-LINEAR CONTEXT – where there cannot always be one continuous sequence of messages that contains all input contexts and generated messages in the correct chronological order. Instead, there can be multiple branches of conversation possible, each with its own continuous sequence of messages $\{M_1, M_2, \ldots, M_n\}$.
All single input-many output, input with non-linear intermediary steps, tree of thoughts, and graph of thoughts techniques (as described in § 2.2) can be classified as having a non-linear context, as they all potentially involve multiple sequences of conversation $M$.
Figure 3: An example of a prompting technique with non-linear context.

For example, consider a simplified version of BRANCH-SOLVE-MERGE, first described in (Saha et al., 2023) and visualized in Figure 3. The figure depicts a task-oriented LLM system that helps the user (the human) make decisions. First, the human instructs the system to make a decision. The system uses the instructions to create an input context for an LLM (context $C_1$) and uses it to generate a response $R_1$. $R_1$ is then used to create two new prompts (via an algorithmic transformation depicted in the figure as $\mapsto_1$): one in which the LLM is tasked with reflecting on the drawbacks of this decision (in context $C_2$), and another where the LLM is tasked with reflecting on the benefits of this decision (in context $C_3$). Finally, another prompt is created in which both reflections (responses $R_2$ and $R_3$) are considered (using another algorithmic transformation $\mapsto_2$) to create a new prompt (context $C_4$), which is used to generate a final decision within response $R_4$. $R_4$ is then reported to the user as the final decision.
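This flow can be written down as a short sketch; here `llm` stands in for any callable from prompt text to response text, and the prompt wording is our own illustrative assumption rather than the prompts used in (Saha et al., 2023).

```python
# A sketch of the Figure 3 flow; the transformations |->1 and |->2 are
# the f-string prompt constructions below.
def branch_solve_merge_decision(llm, instructions):
    c1 = f"Make a decision: {instructions}"
    r1 = llm(c1)                                         # M1 = {C1, R1}
    r2 = llm(f"Reflect on the drawbacks of: {r1}")       # M2 = {C2, R2}, via |->1
    r3 = llm(f"Reflect on the benefits of: {r1}")        # M3 = {C3, R3}, via |->1
    c4 = (f"Proposed decision: {r1}\nDrawbacks: {r2}\n"  # C4, via |->2
          f"Benefits: {r3}\nNow state the final decision.")
    return llm(c4)                                       # M4 = {C4, R4}
```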
As long as a task-oriented system uses LLMs, it will always have one or more continuous sequences of messages $M$ as described. This means that all task-oriented LLM systems and all prompting techniques can be classified as having either linear or non-linear contexts.
This fundamental dichotomy between linear and
non-linear contexts provides a powerful lens through
which to analyze LLM systems. As we will show in
the next section, it reveals surprising connections be-
tween prompting techniques and multi-agent systems
that can inform both system design and training ap-
proaches.
3.2 Agent-Centric Projection of Prompting Techniques
In the previous section (§ 3.1), we classified all prompting techniques, and all task-oriented LLM systems they give rise to, as having either a linear or a non-linear context. This classification and the overall definition have the following implications:
Research on techniques for reliable, task-oriented
text generation that involve linear context can be
modeled as a kind of two-agent system (the hu-
man instructing the LLM being the second agent,
as we also see in (Xi et al., 2023; Wu et al., 2023)).
Research on techniques for reliable, task-oriented text generation that involve non-linear context can be modeled as a kind of multi-agent system, where each “branch” of conversation $M$ can be considered to have occurred with a different agent.
Figure 4: The prompting technique from Figure 3 is mod-
eled as a multi-agent system.
For example, in Figure 4, we show how the prompting technique from Figure 3 can be modeled as a multi-agent system. Each continuous linear sequence of messages $M_1 = \{C_1, R_1\}$, $M_2 = \{C_2, R_2\}$, $M_3 = \{C_3, R_3\}$, and $M_4 = \{C_4, R_4\}$ can be considered to have occurred with a different agent. Using this approach, we can model any prompting technique with non-linear context as a multi-agent system (a minimal code sketch of this projection follows this list).
It also helps to note that each continuous sequence of messages $M_n$ in Figure 4 is essentially a minimal task-oriented LLM system, as described in § 2.1. This means that we can substitute each such minimal system with a more complicated task-oriented system if needed.
In Figure 5, we show a more realistic example of a multi-agent system, designed to replicate the behavior of the prompting technique in Figure 3. Here, the major changes are that the agents communicate with each other using tools, meaning all communication is bidirectional (say, if an agent wants to ask a clarifying question), and that the algorithmic transformations $\mapsto_1$ and $\mapsto_2$ are now each present as a tool available to Agents $A_1$ and $A_4$, respectively. This system may behave exactly like the system in Figure 3 most of the time, but may prove to be more resilient to unexpected circumstances, as each component is more “intelligent”.
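The projection referenced in the list above can be sketched minimally as follows: each continuous message sequence $M_i$ becomes an agent holding its own private linear history. The role names, prompt wording, and the `llm` callable (taking a role and a history) are our own illustrative assumptions.

```python
# A sketch of the agent-centric projection: each continuous message
# sequence M_i is held by its own agent as a private linear history.
class Agent:
    def __init__(self, llm, role):
        self.llm, self.role, self.history = llm, role, []

    def send(self, message):
        self.history.append(("user", message))
        reply = self.llm(self.role, self.history)  # one linear context per agent
        self.history.append(("assistant", reply))
        return reply

def decide_with_agents(llm, instructions):
    decider, critic, advocate, merger = (
        Agent(llm, role) for role in ("decider", "critic", "advocate", "merger"))
    r1 = decider.send(f"Make a decision: {instructions}")    # agent for M1
    r2 = critic.send(f"Reflect on the drawbacks of: {r1}")   # agent for M2
    r3 = advocate.send(f"Reflect on the benefits of: {r1}")  # agent for M3
    return merger.send(                                      # agent for M4
        f"Decision: {r1}\nDrawbacks: {r2}\nBenefits: {r3}\n"
        "Merge these into a final decision.")
```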
Figure 5: A more realistic projection of the prompting tech-
nique from Figure 3 as a multi-agent system.
As all prompting techniques can be projected to such
multi-agent systems, we can conjecture that:
Conjecture 1. Results from prompting techniques in-
volving non-linear context can predict similar results
from multi-agent systems designed to replicate the
same behavior.
This projection or view allows us to generalize all such techniques and apply learnings from one technique to another, and even learnings from multi-agent systems. For example, if new LLM-based multi-agent collaboration research shows that “A Process Supervising Agent is All You Need”, then we can immediately apply that result to the prompting technique described in BRANCH-SOLVE-MERGE (Saha et al., 2023) by viewing the “process supervising agent” work through a non-linear context lens as illustrated in Figure 3, and then adding the BRANCH-SOLVE-MERGE “nodes and connections”.
But it does not end there: because of how flexible natural language is, all non-linear contexts can also be projected to linear contexts. For example, say four agents engage in adversarial interaction as described in (Xi et al., 2023, § 4.2.2), where they argue about a decision until they reach a consensus. The benefit of this interaction paradigm is that each agent can be instructed to look at the problem from a different perspective.
This interaction can be elicited within a lin-
ear context, where the LLM is prompted with the
same decision-making problem but with additional
instructions to share a turn-by-turn dialogue where
four individuals argue about the decision until they
reach a consensus. This has been demonstrated
in (Wang et al., 2024), where a single LLM in-
stance is prompted to produce a transcript of mul-
tiple personas (agents) interacting with each other
to solve a task. The authors call this “Solo Per-
formance Prompting”. Their results show that this
technique—essentially converting non-linear context
(multiple agents collaborating) into linear context (a
dialogue transcript)—shows performance gains com-
parable to those achieved by multi-agent systems on
other tasks ((Wang et al., 2024) does not directly com-
pare to multi-agent systems).
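As a concrete sketch, such a linearizing prompt might look as follows; the template wording is our own assumption and not the actual prompt used in (Wang et al., 2024).

```python
# A sketch of linearizing a four-agent debate into one prompt, in the
# spirit of Solo Performance Prompting; persona names are illustrative.
def solo_debate_prompt(problem, personas=("Analyst", "Skeptic",
                                          "Optimist", "Moderator")):
    roles = ", ".join(personas)
    return (
        f"Problem: {problem}\n\n"
        f"Write a turn-by-turn dialogue in which {roles} argue about the "
        "decision, each from their own perspective, until they reach a "
        "consensus. End with the line: FINAL CONSENSUS: <decision>."
    )
```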
In (Dong et al., 2024, § 2.2, Eq. 1), the authors describe a similar approach minus the “dialogue”, where multiple roles (analyst, coder, tester, etc.) are simulated by a single LLM with linear context. The paper shows how this approach outperforms baseline and advanced prompting techniques (such as CoT).
Conjecture 2. Performance improvements achieved through multi-agent system architectures can be at least partially replicated using single-LLM prompting techniques⁶ that simulate equivalent multi-agent interaction patterns within a linear context.

⁶Prompting techniques such as Solo Performance Prompting (Wang et al., 2024) and Self-Collaboration (Dong et al., 2024).
3.3 Implications for Synthetic Training Data
Recent work has demonstrated that synthetic data
can effectively enhance model capabilities in various
applications, from structured information extraction
(Josifoski et al., 2023) to visual question answering
(Su et al., 2024).
A key insight stems from an apparent paradox
in LLM systems: while all LLMs are trained on
“linear context” (sequential text), research and prac-
tice show that “non-linear context” approaches—such
as advanced prompting techniques and multi-agent
interactions—are of significant interest and demon-
strate superior task performance (Saha et al., 2023;
Ning et al., 2023; Wei et al., 2023; Hong et al., 2023;
Wu et al., 2023).
The previous subsection presents an argument for how techniques involving non-linear context can be projected to an equivalent technique utilizing linear context. This has profound implications when one considers that all LLMs are trained on “linear context”, i.e., on continuous sequences of text. If intermediate steps from advanced prompting techniques like BRANCH-SOLVE-MERGE are projected to linear contexts, similar to Solo Performance Prompting (Wang et al., 2024) and Self-Collaboration (Dong et al., 2024), then they can also be used as synthetic training data.
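For instance, a successful run of the branch-and-merge flow from Figure 3 could be serialized into a single linear training example along these lines; the role names and transcript format are our own assumptions.

```python
# A sketch of projecting a non-linear trace into one linear training
# example; role names and transcript format are illustrative.
def to_training_example(instructions, r1, r2, r3, r4):
    return (
        f"Task: {instructions}\n"
        f"Decider: {r1}\n"
        f"Critic (drawbacks): {r2}\n"
        f"Advocate (benefits): {r3}\n"
        f"Moderator: FINAL DECISION: {r4}\n"
    )
```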
Interestingly, a recent approach called Stream of Search (SoS) (Gandhi et al., 2024) further underscores our perspective on using non-linear or suboptimal reasoning traces for training. SoS demonstrates that when LLMs are trained on branching, backtracking search trajectories, serialized into a linear textual format, they acquire stronger problem-solving capabilities and can even discover new strategies. These findings support Conjecture 3 below, illustrating how self-generated, “messy” intermediate steps can serve as valuable synthetic data to improve the performance of an LLM.
Conjecture 3. Synthetically generated “self-
collaboration” transcripts of successful task-solving
attempts—whether derived from non-linear prompt-
ing techniques or multi-agent collaboration—when
used as training data, improve LLM performance in
both multi-agent systems and advanced prompting
techniques targeting similar tasks.
This idea can be extended further by using existing problems and their real-world deliverables, both intermediate and final, and generating simulated interactions between collaborators as synthetic data. Consider taking the requirements of a completed software project on GitHub, along with pull request/issue commentary, commit messages, and commit diffs in chronological order, and using LLMs to fabricate the communication between collaborators: wouldn't the resulting manuscript, perhaps made to resemble a theater play script, be effective training data?
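A minimal sketch of this idea, under our own assumptions about how the repository artifacts are represented:

```python
# A sketch of the GitHub idea above: interleave real artifacts in
# chronological order and ask an LLM to fabricate the connecting
# dialogue. The event format and prompt wording are assumptions.
def collaboration_script(llm, requirements, events):
    # events: chronological (author, kind, text) tuples, e.g.
    # ("alice", "commit", "fix: handle empty input").
    log = "\n".join(f"[{kind}] {author}: {text}"
                    for author, kind, text in events)
    return llm(
        f"Project requirements:\n{requirements}\n\n"
        f"Artifacts in chronological order:\n{log}\n\n"
        "Write a theater-play-style script of the conversation between "
        "the collaborators that plausibly produced these artifacts.")
```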
Thus, our proposed framework of linear and non-linear context (§ 3.1), along with the agent-centric projection of prompting techniques (§ 3.2), presents a lens that could lead to significant advancements in synthetic data generation.
4 CONCLUSIONS
4.1 Core Arguments
PROMPT ENGINEERING AND INSTRUCTION ENGINEERING: Clearly differentiating between adjusting the prompt without altering the actual task or problem definition (prompt engineering) and modifying the task to an equivalent task (one with the same final deliverable) more suitable for LLM systems (instruction engineering) is essential for precise communication and understanding of research in this area.
LINEAR AND NON-LINEAR CONTEXT: Prompting techniques and the resulting task-oriented LLM systems can be classified as having either linear or non-linear context.
AGENT-CENTRIC PROJECTION OF PROMPTING
TECHNIQUES: We demonstrate approaches that
allow prompting techniques with non-linear con-
text to be understood as multi-agent systems and
vice versa. This projection provides a frame-
work for analyzing, comparing, and improving
both prompting techniques and multi-agent sys-
tem architectures.
4.2 Implications for Future Research
CROSS-POLLINATION IN PROMPTING AND
LLM-BASED MULTI-AGENT SYSTEMS: The
agent-centric projection of prompting techniques
may allow us to cross-pollinate research findings
in these areas.
SYNTHETIC TRAINING DATA GENERATION:
Our core arguments suggest two novel approaches
for generating high-quality synthetic training data
for LLMs: (1) converting successful non-linear
prompting traces into linear training data and (2)
augmenting real-world task traces with synthetic
agent collaboration artifacts. These approaches
could provide structured, high-quality data specif-
ically suited for training LLMs in multi-agent and
complex reasoning tasks.
REAL-WORLD APPLICATIONS AND ETHICAL
CONSIDERATIONS: As these systems become
more capable, their deployment in real-world sce-
narios becomes more feasible. With this comes
the need for rigorous ethical considerations, es-
pecially concerning autonomy, decision making,
and human-AI interaction.
By establishing the fundamental distinction between linear and non-linear contexts in prompting techniques, we develop an agent-centric projection that reveals deep connections between prompting techniques and multi-agent systems. This framework leads to three key conjectures about the relationship between prompting techniques and multi-agent systems, suggesting that results from one domain can inform the other. Furthermore, we demonstrate how this unified perspective opens up novel approaches to synthetic training data generation, both through the conversion of non-linear prompting traces and through the augmentation of real-world task traces. Our position highlights the untapped potential of viewing prompting techniques through an agent-centric lens, providing concrete directions for improving both the design and training of future LLM systems.
REFERENCES
Besta, M., Blach, N., Kubicek, A., Gerstenberger, R., Gi-
aninazzi, L., Gajda, J., Lehmann, T., Podstawski, M.,
Niewiadomski, H., Nyczyk, P., and Hoefler, T. (2023).
Graph of Thoughts: Solving Elaborate Problems with
Large Language Models. arXiv:2308.09687 [cs].
Cai, Z. G., Haslett, D. A., Duan, X., Wang, S., and Picker-
ing, M. J. (2023). Does ChatGPT resemble humans in
language use? arXiv:2303.08014 [cs].
Chen, B., Zhang, Z., Langrené, N., and Zhu, S. (2023). Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review. arXiv.org.
Dong, Y., Jiang, X., Jin, Z., and Li, G. (2024). Self-
Collaboration Code Generation via ChatGPT. ACM
Trans. Softw. Eng. Methodol., 33(7):189:1–189:38.
Feuerriegel, S., Hartmann, J., Janiesch, C., and Zschech,
P. (2023). Generative AI. Business & Information
Systems Engineering. arXiv:2309.07930 [cs].
Gandhi, K., Lee, D., Grand, G., Liu, M., Cheng, W.,
Sharma, A., and Goodman, N. D. (2024). Stream
of Search (SoS): Learning to Search in Language.
arXiv:2404.03683.
Hendel, R., Geva, M., and Globerson, A. (2023). In-Context
Learning Creates Task Vectors. arXiv:2310.15916
[cs].
Hong, S., Zhuge, M., Chen, J., Zheng, X., Cheng, Y.,
Zhang, C., Wang, J., Wang, Z., Yau, S. K. S., Lin, Z.,
Zhou, L., Ran, C., Xiao, L., Wu, C., and Schmidhuber,
J. (2023). MetaGPT: Meta Programming for A Multi-
Agent Collaborative Framework. arXiv:2308.00352
[cs].
Josifoski, M., Sakota, M., Peyrard, M., and West, R. (2023).
Exploiting Asymmetry for Synthetic Training Data
Generation: SynthIE and the Case of Information Ex-
traction. arXiv:2303.04132.
Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., and Iwasawa, Y.
(2023). Large Language Models are Zero-Shot Rea-
soners. arXiv:2205.11916 [cs].
Li, C., Wang, J., Zhang, Y., Zhu, K., Hou, W., Lian, J.,
Luo, F., Yang, Q., and Xie, X. (2023). Large Lan-
guage Models Understand and Can be Enhanced by
Emotional Stimuli. arXiv:2307.11760 [cs].
Madaan, A., Tandon, N., Gupta, P., Hallinan, S., Gao, L.,
Wiegreffe, S., Alon, U., Dziri, N., Prabhumoye, S.,
Yang, Y., Welleck, S., Majumder, B. P., Gupta, S.,
Yazdanbakhsh, A., and Clark, P. (2023). Self-Refine:
Iterative Refinement with Self-Feedback. arXiv.org.
Ning, X., Lin, Z., Zhou, Z., Wang, Z., Yang, H., and Wang,
Y. (2023). Skeleton-of-Thought: Large Language
Models Can Do Parallel Decoding. arXiv:2307.15337
[cs].
Nori, H., Lee, Y. T., Zhang, S., Carignan, D., Edgar,
R., Fusi, N., King, N., Larson, J., Li, Y., Liu, W.,
Luo, R., McKinney, S. M., Ness, R. O., Poon, H.,
Qin, T., Usuyama, N., White, C., and Horvitz, E.
(2023). Can Generalist Foundation Models Out-
compete Special-Purpose Tuning? Case Study in
Medicine. arXiv:2311.16452 [cs].
OpenAI (2024). Learning to Reason with LLMs.
Park, J., O’Brien, J. C., Cai, C. J., Morris, M., Liang, P., and
Bernstein, M. S. (2023). Generative Agents: Interac-
tive Simulacra of Human Behavior. arXiv.org.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., and
Sutskever, I. (2019). Language Models are Unsuper-
vised Multitask Learners.
Saha, S., Levy, O., Celikyilmaz, A., Bansal, M., Weston,
J., and Li, X. (2023). Branch-Solve-Merge Improves
Large Language Model Evaluation and Generation.
arXiv:2310.15123 [cs].
Savage, T., Nayak, A., Gallo, R., Rangan, E., and Chen,
J. H. (2023). Diagnostic Reasoning Prompts Reveal
the Potential for Large Language Model Interpretabil-
ity in Medicine. arXiv:2308.06834 [cs].
Su, X., Luo, M., Pan, K. W., Chou, T. P., Lal, V., and
Howard, P. (2024). SK-VQA: Synthetic Knowledge
Generation at Scale for Training Context-Augmented
Multimodal LLMs. arXiv:2406.19593.
Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., and Zhou, D. (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv.org.
Wang, Z., Mao, S., Wu, W., Ge, T., Wei, F., and
Ji, H. (2024). Unleashing the Emergent Cog-
nitive Synergy in Large Language Models: A
Task-Solving Agent through Multi-Persona Self-
Collaboration. arXiv:2307.05300.
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B.,
Xia, F., Chi, E., Le, Q., and Zhou, D. (2023). Chain-
of-Thought Prompting Elicits Reasoning in Large
Language Models. arXiv:2201.11903.
Wu, Q., Bansal, G., Zhang, J., Wu, Y., Li, B., Zhu, E.,
Jiang, L., Zhang, X., Zhang, S., Liu, J., Awadallah,
A. H., White, R. W., Burger, D., and Wang, C. (2023).
AutoGen: Enabling Next-Gen LLM Applications via
Multi-Agent Conversation. arXiv:2308.08155 [cs].
Xi, Z., Chen, W., Guo, X., He, W., Ding, Y., Hong, B.,
Zhang, M., Wang, J., Jin, S., Zhou, E., Zheng, R.,
Fan, X., Wang, X., Xiong, L., Zhou, Y., Wang, W.,
Jiang, C., Zou, Y., Liu, X., Yin, Z., Dou, S., Weng, R.,
Cheng, W., Zhang, Q., Qin, W., Zheng, Y., Qiu, X.,
Huang, X., and Gui, T. (2023). The Rise and Potential
of Large Language Model Based Agents: A Survey.
arXiv:2309.07864 [cs].
Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T., Cao, Y.,
and Narasimhan, K. (2023). Tree of Thoughts: Delib-
erate Problem Solving with Large Language Models.
arXiv.org.
Yogatama, D., d’Autume, C. d. M., Connor, J., Kocisky,
T., Chrzanowski, M., Kong, L., Lazaridou, A., Ling,
W., Yu, L., Dyer, C., and Blunsom, P. (2019). Learn-
ing and Evaluating General Linguistic Intelligence.
arXiv:1901.11373 [cs, stat].
Yu, Z., He, L., Wu, Z., Dai, X., and Chen, J. (2023). To-
wards Better Chain-of-Thought Prompting Strategies:
A Survey. arXiv.org.
Zou, A., Wang, Z., Kolter, J. Z., and Fredrikson, M. (2023).
Universal and Transferable Adversarial Attacks on
Aligned Language Models. arXiv:2307.15043 [cs].