Agent-washing is here: a field guide to spotting it

Every CX vendor now says "agentic." Most of them shipped a chatbot with a new label. Five questions that tell the difference before you sign.

Teleperson Team · May 8, 2026 · 4 min read

In 2023 the word was "AI." In 2024 it was "copilot." In 2026 it is "agent," and the same thing is happening to it that happened to the other two: the word has detached from the product. Vendors who shipped a retrieval chatbot eighteen months ago have not rebuilt it. They have re-labeled it. The deck says "agentic AI platform." The software is the chatbot.

We call this agent-washing, and if you are evaluating CX tooling this year you will see a lot of it. Here is how to tell, in a first call, whether you are looking at an agent or a costume.

1. Does it take actions, or only produce text?

This is the dividing line, and it is binary. A chatbot's output is a message. An agent's output is a changed state in the world — a ticket created, a refund issued, a plan downgraded, a record updated.

Ask: "Show me the last thing it did that wasn't sending a message." If the answer is a longer message, a better-formatted message, or a message with a link in it, you are looking at a chatbot. The demo will be fluent. Fluency is not the question.

2. What is its authority scope — and who set it?

A real agent operates inside an explicit, bounded scope: these actions, up to these limits, on behalf of this principal. Ask the vendor to show you the scope object. Ask how it's defined, who can change it, and what happens at the boundary.

If there's no answer, and "scope" turns out to mean a system prompt that says "please be careful," the product has no authority model. It either can't take real actions, or it takes them with no governing limit. Both are disqualifying for anything that touches money or account state.
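For a sense of what a scope object could look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Scope` fields, the action names, and the allow/escalate/deny outcomes are hypothetical, not any vendor's schema. The point is only that the scope is data you can inspect, with a defined behavior at the boundary.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Scope:
    """Hypothetical authority scope: these actions, up to this limit, for this principal."""
    principal: str                    # the customer the agent acts for
    allowed_actions: frozenset[str]   # illustrative action names
    max_amount: Decimal               # per-action monetary ceiling
    set_by: str                       # who defined the scope


def check(scope: Scope, principal: str, action: str, amount: Decimal) -> str:
    """Return 'allow', 'escalate', or 'deny' for a proposed action."""
    if principal != scope.principal or action not in scope.allowed_actions:
        return "deny"       # outside the scope entirely
    if amount > scope.max_amount:
        return "escalate"   # at the boundary: hand off instead of acting
    return "allow"


scope = Scope(
    principal="cust_123",
    allowed_actions=frozenset({"issue_refund", "downgrade_plan"}),
    max_amount=Decimal("50.00"),
    set_by="support-ops",
)

print(check(scope, "cust_123", "issue_refund", Decimal("20.00")))
print(check(scope, "cust_123", "issue_refund", Decimal("75.00")))
print(check(scope, "cust_123", "close_account", Decimal("0")))
```

A vendor's real answer will look different. What matters is that they can show you something of this shape and tell you who is allowed to edit it.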

3. Does it confirm before consequential actions?

Genuine agents exhibit confirmation behavior: before an action that binds the principal to something — a purchase, a cancellation, a commitment — they check back. This isn't a UX nicety. It's the mechanism that makes a too-broad scope survivable, because it catches the error before it commits.

Ask: "Walk me through what it does right before it spends money." A good answer is specific and describes a checkpoint. A bad answer is "it's been trained not to make mistakes." Training reduces error rates. It does not produce a checkpoint.
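To make "checkpoint" concrete, here is a rough Python sketch, assuming a hypothetical `execute` wrapper and an illustrative list of consequential actions. It is not a description of any product. It shows the structural property to listen for: the confirmation sits in the code path, so the action cannot run unless the principal says yes.

```python
from typing import Callable

# Hypothetical set of actions that bind the principal; names are illustrative.
CONSEQUENTIAL = frozenset({"purchase", "cancel_subscription", "issue_refund"})


def execute(
    action: str,
    summary: str,
    confirm: Callable[[str], bool],   # asks the principal, returns their answer
    perform: Callable[[], object],    # the side effect itself
) -> str:
    """Run `perform`, but only after confirmation if the action is consequential."""
    if action in CONSEQUENTIAL:
        prompt = f"About to {action}: {summary}. Proceed?"
        if not confirm(prompt):
            return "aborted"          # nothing was performed
    perform()
    return "done"


performed = []
result = execute(
    "issue_refund",
    "$20.00 to the original payment method",
    confirm=lambda prompt: False,     # the principal declines
    perform=lambda: performed.append("refund"),
)
print(result, performed)
```

In this run the principal declines, so `perform` is never called. No amount of training gives you that guarantee; a gate in the code path does.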

4. Is there an audit trail, and can you read it?

After the agent acts, can you reconstruct what it did, on what evidence, under whose authorization? Ask to see a real one — a transaction, end to end, as the system recorded it.

If the record is a chat transcript, that's not an audit trail. A transcript shows what was said. An audit trail shows what was done and why: the structured, attributable record that holds up when something is disputed. We've written about why this layer is load-bearing in "Trust at Machine Speed"; the short version is that an agent you can't audit is an agent you can't deploy.
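The difference is easier to see in data. Below is a minimal sketch of one audit entry, assuming a hypothetical `AuditRecord` shape; the field names and sample values are invented for illustration. Unlike a transcript line, each entry states the action, the evidence it rested on, and the authorization it ran under.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry: what was done, on what evidence, under whose authority."""
    action: str          # what the agent did
    params: dict         # the arguments it acted with
    evidence: list       # what it relied on when deciding
    scope_id: str        # the scope it acted under
    authorized_by: str   # the principal who authorized it
    timestamp: str       # when it happened (ISO 8601, UTC)


def answers_dispute(rec: AuditRecord) -> bool:
    """True if the record covers action, evidence, scope, and authorization."""
    return all([rec.action, rec.evidence, rec.scope_id, rec.authorized_by])


record = AuditRecord(
    action="issue_refund",
    params={"order": "ord_42", "amount": "20.00"},
    evidence=["order ord_42 reported damaged on arrival"],
    scope_id="scope_7",
    authorized_by="cust_123",
    timestamp="2026-05-08T14:03:00Z",
)

# Serialize with sorted keys so the same record always produces the same line.
line = json.dumps(asdict(record), sort_keys=True)
print(line)
print(answers_dispute(record))
```

When the vendor shows you a real transaction, check it against those same four things. If any of them can only be inferred from the conversation, you are reading a transcript.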

5. What happens when it's wrong?

Every agent will be wrong sometimes. The question is whether the product was designed for that. Ask: "A customer says the agent did something they didn't authorize. What do you hand me?"

A serious vendor describes a process: the signed record, the scope it acted under, the contestation path. A non-serious vendor describes a hope: that it won't happen, that the model is very good. We laid out the framework for this in "The Liability Question." You don't need the vendor to have solved liability. You need them to have thought about it, because if they haven't, you've just bought the liability.
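As one illustration of what "signed record" can mean, here is a small Python sketch using an HMAC from the standard library. This is an assumption about how signing might be done, not a claim about any vendor's method: the key, the record fields, and the `sign` and `verify` helpers are all hypothetical, and a production system might well use asymmetric signatures instead.

```python
import hashlib
import hmac
import json


def sign(record: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(record: dict, signature: str, key: bytes) -> bool:
    """Check the signature in constant time."""
    return hmac.compare_digest(sign(record, key), signature)


key = b"demo-key-not-for-production"   # hypothetical; real keys live in a secrets store
record = {"action": "issue_refund", "amount": "20.00", "scope_id": "scope_7"}
signature = sign(record, key)

tampered = dict(record, amount="200.00")
print(verify(record, signature, key))
print(verify(tampered, signature, key))
```

The original record verifies and the altered one does not. That property is what lets a record settle a dispute instead of becoming one more thing to argue about.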

The pattern

Four of these five questions are about what happens around the model, not the model itself. That's deliberate. The conversational quality of every serious vendor's product is, by now, roughly the same — good enough. The model is not where the differences are.

The differences are in whether anyone built the boring infrastructure: scope, confirmation, audit, contestation. That infrastructure is what separates a system you can let act on a customer's behalf from a chatbot wearing the word "agent." Ask the five questions. The costume comes off fast.