The agentic stack, 2026: what solidified and what's still soft
A year ago the agentic stack was a pile of demos. Today it's an emerging set of standards. A technical read on what has solidified — protocols, memory, tool-use — and where the real work still is: evaluation, orchestration, and the trust layer.
Teleperson Team · June 16, 2026 · 4 min read
A year ago, "the agentic stack" mostly meant a clever prompt, a loop, and a handful of tool calls held together with hope. In 2026 it is starting to look like infrastructure — with layers that are genuinely standardizing and layers that are still improvised. This is a technical read on which is which, because the gap between them is where most agent projects still fail.
The protocol layer has solidified
The most durable change is that agents now have standard ways to reach tools and reach each other. Anthropic's Model Context Protocol (MCP), released in late 2024, gave agents a common interface for tools and data sources — a way to plug into a system without a bespoke integration each time. Google's Agent2Agent (A2A) protocol, released in 2025, did the analogous thing for agent-to-agent communication: discovery, capability advertisement, and task delegation across heterogeneous agents.
What matters is not any single protocol but that the layer now exists. Interoperable action — invoking a tool or handing a task to another agent without a one-off adapter — has moved from research to plumbing. That is the connective tissue of agent-to-agent commerce, and it is the part of the stack you can now more or less assume.
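To make "without a one-off adapter" concrete: MCP messages are JSON-RPC 2.0, and invoking a tool is a `tools/call` request carrying a tool name and its arguments. The sketch below only builds a request and parses a response; the transport is omitted, and the tool name `lookup_order` and its `order_id` argument are hypothetical, not part of any real server.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests need an id so responses can be matched

def build_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request in the shape MCP uses for tool invocation."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

def parse_tool_result(raw: str, expected_id: int) -> list:
    """Return the result's content items, raising on a mismatched id or an error."""
    msg = json.loads(raw)
    if msg.get("id") != expected_id:
        raise ValueError("response id does not match request id")
    if "error" in msg:
        raise RuntimeError(f"tool call failed: {msg['error'].get('message')}")
    return msg["result"].get("content", [])

# Hypothetical tool and argument names, for illustration only.
request = build_tool_call("lookup_order", {"order_id": "A-1001"})
wire = json.dumps(request)  # what would be sent over the chosen transport
```

The point of the shared envelope is that the same two functions work for any tool a server advertises; only the name and arguments change.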
Memory and tool-use are table stakes
Two design choices separate this agent generation from the last: persistent memory across sessions, and reliable tool-use. Both are now expected rather than novel. Retrieval over a knowledge base with sentence-aware chunking and vector search; structured tool-calling with typed arguments; cross-session memory that survives a reload — these are solved enough to build on. The interesting engineering has moved up a level, to when to remember, which tool to trust, and how to recover when a tool call fails.
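As a rough illustration of "sentence-aware chunking and vector search", the sketch below packs whole sentences into chunks and ranks chunks by cosine similarity. It assumes a toy bag-of-words vector in place of a real embedding model, and a simple regex in place of a proper sentence splitter; both are stand-ins chosen to keep the example self-contained.

```python
import math
import re
from collections import Counter

def sentence_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole sentences into chunks of roughly max_chars.

    Sentences are never split, so a single oversize sentence stays whole.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the chunk at a sentence boundary
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

Swapping `embed` for a real model changes retrieval quality but not the shape of the pipeline, which is why this layer is easy to build on.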
Orchestration and evaluation are still soft
Here is where the stack gets improvised. Orchestrating a multi-step agent (deciding when to plan, when to act, when to ask, when to stop) is still more craft than science. And evaluating an agent is genuinely unsolved. You can benchmark a model on a fixed test set; an agent that takes a different path every run, calls live tools, and produces side effects resists the same treatment. Teams that have shipped agents to production will tell you the same thing: the demo works on the first try, and the next three months are spent on the long tail of ways it fails. The evaluation problem is the single biggest thing standing between a working agent and a deployable one.
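One partial workaround, sketched here under heavy assumptions, is to stop scoring the path and score the outcome over many runs: report a pass rate and keep the seeds of failing runs so they can be replayed. The `stub_agent` below is a hypothetical stand-in that fails at random; a real harness would also need to sandbox tools and side effects, which this sketch does not attempt.

```python
import random
from dataclasses import dataclass
from typing import Callable

@dataclass
class Trial:
    passed: bool
    steps: int

def evaluate(agent: Callable[[str, random.Random], dict],
             task: str,
             check: Callable[[dict], bool],
             runs: int = 20,
             seed: int = 0) -> dict:
    """Run the agent repeatedly and score the final outcome, not the path taken."""
    trials = []
    for i in range(runs):
        rng = random.Random(seed + i)  # one seed per run so a failure can be replayed
        outcome = agent(task, rng)
        trials.append(Trial(passed=check(outcome), steps=outcome["steps"]))
    passes = sum(t.passed for t in trials)
    return {
        "pass_rate": passes / runs,
        "failed_seeds": [seed + i for i, t in enumerate(trials) if not t.passed],
        "max_steps": max(t.steps for t in trials),
    }

def stub_agent(task: str, rng: random.Random) -> dict:
    """Hypothetical agent: takes a varying number of steps and sometimes fails."""
    steps = rng.randint(2, 6)
    refunded = rng.random() > 0.2
    return {"steps": steps, "refund_issued": refunded}

report = evaluate(stub_agent, "refund order A-1001", lambda o: o["refund_issued"])
```

A pass rate with replayable failures does not solve evaluation, but it turns "it worked in the demo" into a number that can regress visibly.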
The trust layer is the new frontier
The layer getting the most attention in 2026 is the one that lets an agent act across a boundary safely. It has four parts, and they are maturing roughly in the order listed:
- Verifiable identity — credentials an agent can present and a counterparty can check, so a brand agent knows it is talking to this customer's agent and vice versa.
- Scoped authority — granular, time-bound permission (read an order, refund up to a limit) delegated explicitly, enforced with mechanisms like OAuth 2.1 and DPoP-bound tokens.
- Interoperable action — the protocol layer above, so scoped authority can actually be exercised across systems.
- Auditable record and gates — every consequential action logged and reproducible, with high-risk steps gated for human hand-off.
Reasoning and orchestration get the headlines, but this trust layer is what determines whether an agent can be trusted to do something consequential on someone else's behalf. It is the least glamorous and most decisive part of the stack.
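The scoped-authority and audit-and-gate parts of the list above can be sketched together: a grant names one action with a ceiling and an expiry, every decision is appended to a hash-chained log, and amounts above a review threshold are routed to a human. This is an illustration only; `Grant`, `AuditLog`, `authorize`, and the `review_above` threshold are hypothetical names, and nothing here implements OAuth 2.1 or DPoP, which would carry such a grant in a real deployment.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass(frozen=True)
class Grant:
    """Hypothetical delegated permission: one action, a ceiling, and an expiry."""
    action: str
    max_amount: float
    expires_at: float  # Unix timestamp

@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def append(self, record: dict) -> str:
        """Chain each entry to the previous hash so later edits are detectable."""
        prev = self.entries[-1]["hash"] if self.entries else ""
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({"record": record, "prev": prev, "hash": digest})
        return digest

def authorize(grant: Grant, action: str, amount: float, log: AuditLog,
              now: Optional[float] = None, review_above: float = 100.0) -> str:
    """Return 'allow', 'needs_human', or 'deny', and log the decision either way."""
    now = time.time() if now is None else now
    if action != grant.action or now >= grant.expires_at or amount > grant.max_amount:
        decision = "deny"
    elif amount > review_above:
        decision = "needs_human"  # high-risk step gated for human hand-off
    else:
        decision = "allow"
    log.append({"action": action, "amount": amount, "decision": decision, "at": now})
    return decision
```

Denials are logged as well as approvals, since an audit trail that records only successes cannot show what an agent attempted.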
The remaining gap: deployment
Add it up and a pattern emerges. The lower layers (protocols, memory, tool-use) have standardized to the point where a capable prototype is a weekend's work. The upper layers (evaluation, orchestration, and the trust machinery) are where production lives, and where most of the effort now goes. The distance between a working agent and a truly deployable one is far larger than most teams anticipate. It is closing, but not because models got smarter; it is closing because teams are building the boring, essential layer around them: bounded authority, signed receipts, and reproducible traces.
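A "signed receipt" can be as small as a keyed hash over a canonical serialization of what the agent did. The sketch below assumes a shared-secret HMAC for brevity; a production system would more likely use asymmetric signatures so the verifier cannot forge receipts, and the receipt fields shown in the usage are hypothetical.

```python
import hashlib
import hmac
import json

def sign_receipt(receipt: dict, key: bytes) -> str:
    """Sign a canonical JSON form so key order cannot change the signature."""
    payload = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_receipt(receipt: dict, signature: str, key: bytes) -> bool:
    """Constant-time comparison avoids leaking how much of the signature matched."""
    return hmac.compare_digest(sign_receipt(receipt, key), signature)

# Hypothetical receipt fields and demo key, for illustration only.
key = b"demo-key-not-for-production"
receipt = {"action": "refund", "order_id": "A-1001", "amount": 40.0}
signature = sign_receipt(receipt, key)
```

Any change to the receipt after signing, such as a different amount, makes `verify_receipt` return `False`, which is what lets a counterparty check what an agent claims it did.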
For 2026, the honest summary is this: the agentic stack is real, it is standardizing from the bottom up, and the frontier has moved from "can it act?" to "can you prove what it did, and keep it inside the lines?" The teams that treat that as the core engineering problem — not an afterthought bolted on before launch — are the ones getting agents past the pilot stage.