Running a fleet of AI agents, not just one

A company rarely ends up with one AI agent. It ends up with several: one on the website, one in the help centre, one inside a billing tool, one a vendor switched on as a feature. Once you have a few, the work changes. You are running a fleet, and a fleet breaks in ways a single agent never does.

A separate field note covers how a single AI agent slowly gets worse after launch. That is real, and it still happens to every agent in a fleet. This article is about a different problem: what happens when you have many agents at the same time, and nobody is looking at them as a group.

How a fleet appears without anyone deciding to build one

Most fleets are not planned. They grow. The support team launches a chat agent. Three months later the marketing team adds an agent to the website. The billing software ships an update with its own assistant built in. Each decision made sense on its own. Nobody added them up.

This is common enough to have a name. Analysts call it agent sprawl: agents deployed across an organisation without a shared inventory, owner, or set of controls. Microsoft, describing its Agent 365 product, frames the core problem plainly: leaders cannot manage agents they cannot see, including agents other teams or vendors switched on. If you cannot list your agents, you have a fleet and you are not running it.

What breaks once you have several

The first thing that breaks is monitoring. Watching one agent is a habit one person can hold. Watching six, each with its own dashboard or no dashboard at all, is not. The agents that drift quietly are the ones nobody is assigned to check.

The second is conflicting answers. Two agents trained at different times, on different documents, will answer the same question two different ways. A customer who asks the website agent and then the help-centre agent gets told two things. To the customer, your company contradicted itself.

The third is version sprawl. Each agent has its own prompt, its own knowledge source, its own model version. When your refund policy changes, you now have to find every agent that mentions refunds and update each one. Miss one, and it keeps quoting the old policy for months.

The fourth is ownership. With one agent it is at least clear which team is embarrassed when it fails. With six, spread across support, marketing, and a vendor's product, a bad answer has no obvious owner, so it gets investigated by nobody.

The job of running the fleet

A fleet needs an owner: one person, or a small group, whose job is the agents as a set. Not the team that built each one, and not each vendor. Someone who can answer three questions at any time: How many agents do we have? What does each one do? Who is accountable when one of them fails?

That sounds basic, and it is the part most companies skip. The platform vendors are converging on the same answer from the tooling side. Microsoft's Agent 365 is built around a registry, a single inventory of every agent, plus shared observability across the whole group. You do not need to buy that product to take the lesson: the fleet needs a list and a named owner before it needs anything clever.

A simple operating routine

Start with a register. A single document or spreadsheet with one row per agent: what it does, which channel it lives on, which knowledge source it uses, which model version, and who owns it. The rule is that no agent goes live without a row. That one rule stops most sprawl.
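For teams that would rather keep the register in code than in a spreadsheet, the rule can be sketched as below. This is a minimal illustration, not a product: the names AgentRow, Register, and can_go_live are invented for this example, and the sample values are made up. The fields mirror the columns listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AgentRow:
    """One row of the register. Field names are illustrative assumptions."""
    name: str
    purpose: str           # what it does
    channel: str           # where it lives
    knowledge_source: str  # which content it answers from
    model_version: str
    owner: str             # a named person or team


class Register:
    """A hypothetical in-memory register with one row per agent."""

    def __init__(self):
        self._rows = {}

    def add(self, row):
        # Refuse incomplete rows: a row with a blank owner is not a row.
        missing = [f.name for f in fields(row) if not getattr(row, f.name).strip()]
        if missing:
            raise ValueError(f"row for {row.name!r} is missing: {', '.join(missing)}")
        self._rows[row.name] = row

    def can_go_live(self, agent_name):
        # The one rule: no agent goes live without a row.
        return agent_name in self._rows

    def rows(self):
        return list(self._rows.values())


# Example values below are made up for illustration.
register = Register()
register.add(AgentRow(
    name="help-centre-chat",
    purpose="Answers support questions",
    channel="help centre",
    knowledge_source="support knowledge base",
    model_version="example-model-v1",
    owner="support team lead",
))

print(register.can_go_live("help-centre-chat"))   # registered agent
print(register.can_go_live("website-assistant"))  # no row yet
```

The check itself is trivial on purpose. What matters is that something, whether a script, a launch checklist, or a person, asks the question before an agent ships.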

Then run a regular fleet review. Once a month, the fleet owner goes down the register and checks each agent is still doing its job, still on a current model, and still pointed at current content. The same review is where you catch conflicting answers: ask the same handful of common questions to every customer-facing agent and read the replies side by side. If two disagree, that is the work for the month.
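The side-by-side comparison can be partly mechanised. The sketch below assumes you have already collected each agent's reply by hand or through whatever interface each agent offers; the function names, agent names, and sample replies are all invented for illustration. It groups replies that match after lowercasing and whitespace cleanup, which is a crude test: two answers that say the same thing in different words will still be flagged, so treat the output as a reading list for a person, not a verdict.

```python
def normalise(answer):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(answer.lower().split())


def find_disagreements(replies):
    """Return the questions that drew more than one distinct answer.

    replies maps question -> {agent name -> answer text}.
    The result maps question -> {normalised answer -> [agent names]}.
    """
    flagged = {}
    for question, by_agent in replies.items():
        groups = {}
        for agent, answer in by_agent.items():
            groups.setdefault(normalise(answer), []).append(agent)
        if len(groups) > 1:
            flagged[question] = groups
    return flagged


# Made-up replies from two hypothetical customer-facing agents.
replies = {
    "How long do refunds take?": {
        "website-assistant": "Refunds take 5 working days.",
        "help-centre-chat": "Refunds take 14 days.",
    },
    "Can I change my plan mid-month?": {
        "website-assistant": "Yes, at any time.",
        "help-centre-chat": "yes,  at any time.",
    },
}

flagged = find_disagreements(replies)
for question, groups in flagged.items():
    print(question)
    for answer, agents in groups.items():
        print(f"  {', '.join(agents)}: {answer}")
```

In this made-up data only the refund question is flagged, because the two plan-change replies differ only in capitalisation and spacing.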

Finally, tie content changes to the register. When a policy or price changes, whoever makes the change checks the register for every agent that touches that topic. The register turns a vague worry into a short, finite checklist.
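Producing that checklist is a simple lookup, provided each register row records which topics the agent talks about. That topics column is an assumption added for this sketch; it is not among the fields listed earlier. The function name, agent names, and owners are likewise invented.

```python
def checklist_for_change(rows, topic):
    """List every agent whose register row mentions the changed topic."""
    topic = topic.lower()
    return sorted(
        f"{row['agent']} (owner: {row['owner']})"
        for row in rows
        if topic in (t.lower() for t in row["topics"])
    )


# Made-up register rows; 'topics' is a hypothetical extra column.
rows = [
    {"agent": "website-assistant", "owner": "marketing lead",
     "topics": ["pricing", "plans"]},
    {"agent": "help-centre-chat", "owner": "support team lead",
     "topics": ["refunds", "shipping"]},
    {"agent": "billing-assistant", "owner": "finance ops",
     "topics": ["Refunds", "invoices"]},
]

checklist = checklist_for_change(rows, "refunds")
for item in checklist:
    print("update:", item)
```

The lookup is only as good as the topics recorded, so the monthly review is a sensible place to keep that column current.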

What to do this month: build the register. List every AI agent in the company, including ones inside vendor tools and ones other teams launched. For each, write down what it does, its channel, its knowledge source, its model version, and a named owner. Name one person as the fleet owner. Then ask three common customer questions to every customer-facing agent and compare the answers. The disagreements you find are your first month of work.

Where this leaves you

One AI agent is a project. Several agents are an operation, and an operation needs someone running it. The practical move is small: keep a register, name a fleet owner, and review the group on a schedule. Do that and a fleet stays a fleet. Skip it and you have a set of agents that drift apart, contradict each other, and quote last year's policy, with nobody assigned to notice.

