Your AI agents will get worse over time. Make sure someone is watching.

An AI customer service agent does not stay the same after launch. The model behind it changes, your product changes, and your customers ask new things. An agent that did well on day one will slowly get worse, and the decline is easy to miss. The fix is plain: give one named person the job of watching for it.

Most teams treat an AI agent like a finished build. You test it, it passes, you launch it, you move on. That works for ordinary software, which does the same thing today that it did last month. An AI agent is different. Its behaviour can change even when nobody on your team has touched it.

This slow change has a name: drift. This article explains what drift is, why it is so easy to miss, and who on your team should own the job of catching it.

What drift means for an AI agent

Drift is when an AI agent's answers get less accurate or less helpful over time, without anyone changing it on purpose. Nothing in your setup looks broken. The agent still replies, still sounds confident, still closes tickets. The answers are just a little more wrong than they used to be.

Three things cause it. The model gets updated by the provider, and the new version behaves a little differently. Your own product changes, so the agent's knowledge falls out of date. And customers start asking things in new ways, or asking about new problems the agent was never good at. None of these is dramatic. Together they pull the agent off course.

Why drift is so easy to miss

The common belief is that a launch test settles the question. The agent was tested, it passed, so it works. But a launch test only tells you how the agent did on launch day, with launch-day customers and the launch-day model. It says nothing about next month.

Drift is also quiet. A broken integration throws an error you can see. A drifting agent throws no error. It just gets a few more answers wrong each week. Your headline numbers, tickets closed and average satisfaction, can stay flat while the quality underneath them slips. By the time those numbers move, the agent has been getting worse for a while.

The job nobody owns

Nick Clark, who writes the Service Matters newsletter, makes the point that once you run AI agents, you are managing them, not just installing them. Managing means watching them over time. In most contact centres, though, no single person has that job. The team that built the agent moved on. The operations team runs the shift. The vendor sells the platform. When the agent starts to drift, the question of whose job it is to notice has no answer.

This is the gap to close. Pick one person, in operations, and make agent health part of their role. Not the person who built the agent, and not the vendor. Someone who works the floor, sees the tickets, and can be held responsible for catching the slide.

What the watcher actually does

The job is concrete. Each week, the watcher reads a fresh sample of the agent's closed conversations. Not the numbers, the actual conversations. They score each one against a simple standard: was the customer's problem solved, and was the tone right. They chart that score over time. They watch for the warning signs Clark describes: questions the agent failed to understand, answers that were wrong, a change in behaviour right after a model update.

When the score drops, the watcher has the authority to act: pull a task back to closer human review, flag the agent for re-prompting or a rollback, and raise it with the vendor. The point is that someone is allowed to pull the brake.

What to set up this quarter: name one person in operations as the owner of AI agent health. Give them a weekly routine: read 20 to 30 closed conversations, score each on problem solved and tone, and chart the score. Agree in advance what size of drop triggers a response, and give that person the authority to act on it. Drift is normal. Catching it late is the avoidable part.
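The weekly routine above can be kept in a spreadsheet, but a minimal Python sketch shows how little is involved. Everything named here is an illustrative assumption, not part of any tool or of Clark's advice: the `Review` record, the pass rule (a conversation counts only if the problem was solved and the tone was right), the four-week baseline, and the 10-point drop threshold. Your team should agree its own numbers.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Review:
    """One closed conversation, as scored by the watcher (hypothetical record)."""
    solved: bool   # was the customer's problem solved?
    tone_ok: bool  # was the tone right?


def weekly_score(reviews: list[Review]) -> float:
    """Share of sampled conversations that pass both checks (0.0 to 1.0)."""
    if not reviews:
        raise ValueError("no reviews to score")
    return mean(1.0 if r.solved and r.tone_ok else 0.0 for r in reviews)


def drop_triggered(history: list[float], current: float,
                   baseline_weeks: int = 4, max_drop: float = 0.10) -> bool:
    """True if this week's score sits more than max_drop below the
    average of the last baseline_weeks scores (both values are assumptions)."""
    recent = history[-baseline_weeks:]
    if not recent:
        return False  # first week: nothing to compare against yet
    return mean(recent) - current > max_drop


# Example: 25 conversations read this week, 5 of them failed a check.
this_week = [Review(True, True)] * 20 + [Review(False, True)] * 5
score = weekly_score(this_week)
needs_action = drop_triggered([0.90, 0.90, 0.90, 0.90], score)
```

Comparing against a short rolling average rather than last week alone is one way to avoid reacting to a single noisy sample. With only 20 to 30 conversations a week, small swings are expected.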

Where to start

Start small and start now. One agent, one watcher, one weekly sample, one chart. You do not need a monitoring platform to begin. You need a person and a habit. As you add more agents, the habit becomes a system and you can automate the sampling. But the automation supports the watcher. It does not replace the judgement of a person reading real conversations.
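When you do reach the point of automating the sampling, the automated part can be very small. The sketch below assumes you can export a list of closed conversation IDs from your platform; the function name `weekly_sample` and the default sample size of 25 are hypothetical choices for illustration. It only picks which conversations to read. The reading and scoring stay with the watcher.

```python
import random
from typing import Optional


def weekly_sample(closed_ids: list[str], size: int = 25,
                  seed: Optional[int] = None) -> list[str]:
    """Pick a random sample of closed conversation IDs for manual review.

    Random selection avoids reading only the tickets that happen to be
    at the top of the queue. Pass a seed to make the draw reproducible.
    """
    rng = random.Random(seed)
    k = min(size, len(closed_ids))  # a quiet week may have fewer than `size`
    return rng.sample(closed_ids, k)


# Example: 100 closed conversations this week, draw 25 to read.
ids = [f"conv-{i}" for i in range(100)]
to_read = weekly_sample(ids, size=25, seed=7)
```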

An AI agent is not a finished build. It needs someone watching it over time, because the decline that person is there to catch is normal and quiet. The avoidable mistake is leaving that job unassigned.
