• Shared memory layer
• Governance layer, including explainability, privacy,
security, etc.
Given a user task, the goal of an AI agent platform
is to identify (compose) an agent (or group of agents)
capable of executing the given task. So the first
component that we need is a reasoning layer capable of
decomposing a task into sub-tasks, with execution of
the respective agents orchestrated by an orchestration
engine.
Chain of Thought (CoT) (Wei et al., 2022) is
the most widely used decomposition framework today
to transform complex tasks into multiple manageable
tasks and shed light on the model’s thinking process.
CoT can be implemented using two approaches: user
prompting and automated prompting.
• User Prompting: Here, the user provides in the
prompt the logic for approaching a certain problem;
the LLM then solves similar problems using the
same logic and returns the output along with
the logic.
• Automating Chain of Thought Prompting: Manually
handcrafting CoT can be time-consuming and
yield sub-optimal solutions. Automatic Chain of
Thought (Auto-CoT) (Zhang et al., 2022) can be
leveraged to generate the reasoning chains
automatically, thus eliminating the human
intervention.
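
As a rough illustration of the user-prompting approach, the sketch below assembles a few-shot prompt in which each worked example carries its reasoning before the answer. The function name `build_cot_prompt`, the prompt layout, and the example questions are hypothetical; the call to an actual LLM is deliberately left out.

```python
from typing import List, Tuple


def build_cot_prompt(examples: List[Tuple[str, str, str]], question: str) -> str:
    """Assemble a few-shot CoT prompt from (question, reasoning, answer) triples.

    The layout ("Q:" / "A:") is an assumed convention, not a required format.
    """
    parts = []
    for q, reasoning, answer in examples:
        # Each demonstration shows the reasoning first, then the final answer.
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    # The new question ends with an open "A:" so the model continues the pattern.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Hypothetical worked example supplied by the user.
examples = [(
    "Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?",
    "Roger starts with 5 balls. 2 cans of 3 balls each is 6 balls. 5 + 6 = 11.",
    "11",
)]

prompt = build_cot_prompt(
    examples,
    "A baker has 3 trays of 4 rolls and sells 5. How many rolls remain?",
)
```

The resulting `prompt` string would then be sent to the LLM, which is expected to imitate the demonstrated reasoning style for the new question.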
Tree of Thoughts (Yao et al., 2023) extends CoT
by exploring multiple decomposition possibilities in
a structured way. From each thought, it can branch
out and generate multiple next-level thoughts, creating
a tree-like structure that can be explored by BFS
(breadth-first search) or DFS (depth-first search), with
each state evaluated by a classifier (via a prompt) or
majority vote.
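
The breadth-first variant can be sketched as follows. This is a minimal illustration, not the implementation from Yao et al.: in a real system `propose` and `score` would be LLM prompts, whereas here they are toy stand-ins (hypothetical names) that build a digit string whose digits should sum to a target. The `beam_width` pruning step is also an assumption about how the frontier is kept small.

```python
from typing import Callable, List


def tree_of_thoughts_bfs(
    root: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    depth: int,
    beam_width: int,
) -> str:
    """Explore thoughts level by level, keeping the best `beam_width` states."""
    frontier = [root]
    for _ in range(depth):
        # Branch: every state in the frontier generates its next-level thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        # Evaluate each candidate state and prune to the most promising ones.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=score)


TARGET = 7  # toy goal: digits of the final state should sum to 7


def propose(state: str) -> List[str]:
    # Toy proposer: extend the partial solution by one digit.
    return [state + d for d in "123"]


def score(state: str) -> float:
    # Toy evaluator: higher is better, 0 means the target is met exactly.
    return -abs(sum(int(c) for c in state) - TARGET)


best = tree_of_thoughts_bfs("", propose, score, depth=3, beam_width=2)
```

A DFS variant would instead follow one branch to full depth and backtrack when the evaluator rejects a state.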
Agent composition implies the existence of an
agent marketplace / registry of agents, with a
well-defined description of the agent capabilities and
constraints. For example, let us consider a house painting
agent C whose services can be reserved online (via
credit card). Given this, the fact that the user requires
a valid credit card is a constraint, and the fact that
the user’s house will be painted within a certain
timeframe is its capability. In addition, we also need
to consider any constraints of C during the actual
execution phase, e.g., the fact that C can only provide
the service on weekdays (and not on weekends). In
general, constraints refer to the conditions that need
to be satisfied to initiate an execution, and capabilities
reflect the expected outcome after the execution
terminates.
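
One way to picture such a registry entry is sketched below, using the house painting agent C as the example. All class, field, and key names (`AgentSpec`, `AgentRegistry`, `has_valid_credit_card`, `weekday`) are hypothetical; constraints are modeled as named predicates over a request context, which is only one of several possible encodings.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


@dataclass
class AgentSpec:
    """Registry entry: what the agent delivers and what must hold to start it."""
    name: str
    capabilities: List[str]
    constraints: Dict[str, Callable[[Context], bool]] = field(default_factory=dict)

    def unmet_constraints(self, ctx: Context) -> List[str]:
        # Return the labels of all constraints that the context does not satisfy.
        return [label for label, check in self.constraints.items() if not check(ctx)]


class AgentRegistry:
    def __init__(self) -> None:
        self._agents: List[AgentSpec] = []

    def register(self, spec: AgentSpec) -> None:
        self._agents.append(spec)

    def discover(self, capability: str, ctx: Context) -> List[AgentSpec]:
        # An agent matches if it offers the capability and all constraints hold.
        return [
            a for a in self._agents
            if capability in a.capabilities and not a.unmet_constraints(ctx)
        ]


painter = AgentSpec(
    name="C",
    capabilities=["house_painting"],
    constraints={
        "valid credit card": lambda ctx: ctx.get("has_valid_credit_card") is True,
        # weekday uses the datetime.weekday() convention: 0 = Monday, 6 = Sunday.
        "weekday service only": lambda ctx: ctx.get("weekday", 7) < 5,
    },
)

registry = AgentRegistry()
registry.register(painter)

wednesday = {"has_valid_credit_card": True, "weekday": 2}
saturday = {"has_valid_credit_card": True, "weekday": 5}
matches_wednesday = registry.discover("house_painting", wednesday)
matches_saturday = registry.discover("house_painting", saturday)
```

Keeping the constraint labels separate from the predicates lets the platform report which condition blocked a match, rather than only returning an empty result.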
In the context of MAS, specifically, previous
works (Capezzuto et al., 2021; Trabelsi et al., 2022;
Veit et al., 2001) have considered agent limitations
during the discovery process. Veit et al. (2001)
propose a configurable XML-based framework called
GRAPPA (Generic Request Architecture for Passive
Provider Agents) for agent matchmaking. Capezzuto
et al. (2021) specify a compact formulation for
multi-agent task allocation with spatial and temporal
constraints. Trabelsi et al. (2022) consider agent
constraints in the form of incompatibility with
resources. The authors then propose an optimal
matchmaking algorithm that allows the agents to relax their
restrictions, within a budget. Refer to (Biswas, 2024)
for a detailed discussion on the discovery aspect of AI
agents.
Given the need to orchestrate multiple agents, we
also need an integration layer supporting different
agent interaction patterns, e.g., agent-to-agent API,
agent API providing output for human consumption,
human triggering an AI agent, AI agent-to-agent with
human in the loop. The integration patterns need to
be supported by the underlying AgentOps platform.
To accommodate multiple long-running agents,
we also need a shared long-term memory layer
enabling data transfer between agents and storing
interaction data such that it can be used to personalize future
interactions. The standard approach here is to save the
embedding representation of agent information into
a vector database that can support maximum inner
product search (MIPS). For fast retrieval, an
approximate nearest neighbors (ANN) algorithm is used, which
returns approximately the top-k nearest neighbors,
trading some accuracy for a large speed gain.
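
To make the retrieval step concrete, the sketch below implements exact (brute-force) maximum inner product search over a small in-memory store. It is not an ANN index: a production vector database would replace the linear scan with an approximate index to gain speed at some cost in accuracy. The class name, the keys, and the hand-written three-dimensional embeddings are hypothetical; real embeddings would come from an embedding model.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class InMemoryVectorStore:
    """Toy shared memory: stores embeddings with a text payload per key."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[List[float], str]] = {}

    def add(self, key: str, embedding: Sequence[float], payload: str) -> None:
        self._items[key] = (list(embedding), payload)

    def search(self, query: Sequence[float], k: int) -> List[Tuple[float, str, str]]:
        # Exact MIPS: score every stored vector, then keep the k largest scores.
        scored = (
            (dot(query, emb), key, payload)
            for key, (emb, payload) in self._items.items()
        )
        return heapq.nlargest(k, scored)


store = InMemoryVectorStore()
store.add("painter-booking", [0.9, 0.1, 0.0], "User booked agent C for house painting")
store.add("flight-search", [0.0, 0.2, 0.9], "User searched flights to Lisbon")
store.add("invoice", [0.4, 0.8, 0.1], "Invoice issued for painting job")

# Hypothetical query embedding, close to the painting-related memories.
results = store.search([1.0, 0.0, 0.1], k=2)
top_keys = [key for _, key, _ in results]
```

Each result is a `(score, key, payload)` tuple ordered from highest to lowest inner product, so an agent can read back the most relevant prior interactions first.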
Finally, the governance layer. We need to ensure
that data shared by the user specific to a task, or user
profile data that cuts across tasks, is only shared with
the relevant agents (authentication and access
control). We further consider the different responsible
AI dimensions in terms of data quality, privacy,
reproducibility and explainability to enable a
well-governed AI agent platform.
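
The access-control requirement can be illustrated with a minimal field-level filter: each agent only receives the subset of user data that its scope allows. The policy table, agent names, and field names below are hypothetical, and authentication of the calling agent is assumed to have happened already.

```python
from typing import Dict, Set

# Hypothetical policy: which user-data fields each agent is allowed to read.
AGENT_SCOPES: Dict[str, Set[str]] = {
    "house_painter": {"address", "preferred_dates"},
    "payment_agent": {"credit_card_token"},
}


def scoped_view(
    user_data: Dict[str, str],
    agent: str,
    scopes: Dict[str, Set[str]] = AGENT_SCOPES,
) -> Dict[str, str]:
    """Return only the fields that `agent` is permitted to see.

    Unknown agents get an empty view (deny by default).
    """
    allowed = scopes.get(agent, set())
    return {k: v for k, v in user_data.items() if k in allowed}


user_data = {
    "address": "12 Example Street",
    "preferred_dates": "next week",
    "credit_card_token": "tok_hypothetical",
}

painter_view = scoped_view(user_data, "house_painter")
unknown_view = scoped_view(user_data, "unregistered_agent")
```

Denying by default keeps a newly registered agent from seeing any user data until a scope has been granted explicitly.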
3 STATEFUL AGENT
MONITORING
Stateful execution (Lu et al., 2024) is an inherent
characteristic of any distributed systems platform, and can
be considered a critical requirement to materialize
the orchestration layer of an AI agent platform. Given
this, we envision that agent monitoring together with
failure recovery will become more and more
critical as AI agent platforms become enterprise ready,