Governance for agentic AI in customer service has to cover three things at minimum: scope (what actions the AI is allowed to take), thresholds (the value or risk level above which a human must approve), and an audit trail (a log of every action and who or what initiated it). Regulators are moving from "what does the bot say" toward "what does the bot do," and your governance needs to do the same.
An old support chatbot told the customer to follow the refund policy. An agentic AI issues the refund itself. The first needs only a policy document. The second needs guardrails on which contact types it can refund, up to what value, and with what verification, plus a record of every refund it issued.
What people in the field are saying
Nick Clark's "Get excited by guardrails and governance" in Service Matters makes the case that governance is not the boring part; it is the part that decides whether agentic AI is shippable in regulated industries. EU AI Act Article 14 requires human oversight for high-risk AI, and that frame is now mainstream in CX governance discussions.
What is the minimum set of guardrails?
Three components. First, an explicit scope: a list of contact types the AI may resolve on its own, with the rest defaulting to a human. Second, thresholds: above a set refund value or risk score, or whenever a sensitive field is involved, the AI gets human approval before acting. Third, an audit trail: every action the AI took, when, on whose record, with which inputs, retrievable later.
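As a rough illustration, the three components can be sketched as a single decision function. The contact types, the 50.00 refund limit, the sensitive field names, and every identifier below are hypothetical placeholders, not values taken from any real policy or product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical policy values, chosen only for illustration.
AUTONOMOUS_CONTACT_TYPES = {"refund", "address_change"}  # scope
REFUND_APPROVAL_LIMIT = 50.00                            # threshold
SENSITIVE_FIELDS = {"email", "bank_account"}             # threshold

@dataclass
class ProposedAction:
    contact_type: str
    customer_id: str
    refund_value: float = 0.0
    fields_changed: tuple = ()

# In-memory stand-in for a durable, append-only audit store.
audit_log = []

def decide(action):
    """Return 'act', 'needs_approval', or 'route_to_human' and log the decision."""
    touches_sensitive = bool(SENSITIVE_FIELDS & set(action.fields_changed))
    if action.contact_type not in AUTONOMOUS_CONTACT_TYPES:
        # Out of scope: anything not explicitly listed defaults to a human.
        decision = "route_to_human"
    elif action.refund_value > REFUND_APPROVAL_LIMIT or touches_sensitive:
        # In scope, but over a threshold: a human approves before the AI acts.
        decision = "needs_approval"
    else:
        decision = "act"
    # Audit trail: record every decision, including the ones that were allowed.
    audit_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": "ai_agent",
        "customer_id": action.customer_id,
        "contact_type": action.contact_type,
        "refund_value": action.refund_value,
        "fields_changed": list(action.fields_changed),
        "decision": decision,
    })
    return decision
```

The scope check runs first so that an unlisted contact type can never reach the threshold logic, which mirrors the "rest defaulting to a human" rule.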
How do regulated industries approach this?
Insurance, banking, and healthcare already have a "two pairs of eyes" mindset for material actions. The same mindset applies to agentic AI: the second pair of eyes is either a human reviewer or a separate validation step that runs before the action goes through. The audit trail is what lets you verify afterward that the second check actually happened.
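When the second pair of eyes is a validation step rather than a person, it might look something like the sketch below. The record shapes, the three checks, and the function names are assumptions made for illustration; a real deployment would define its own.

```python
def validate_refund(refund, ticket, order):
    """Independent check that runs before a refund executes.

    Returns a list of problems; an empty list means the refund may proceed.
    All record shapes here are hypothetical.
    """
    problems = []
    if refund["customer_id"] != ticket["customer_id"]:
        problems.append("refund targets a different customer than the ticket")
    if order["customer_id"] != refund["customer_id"]:
        problems.append("order does not belong to the refunded customer")
    if refund["amount"] > order["total"]:
        problems.append("refund exceeds the order total")
    return problems

def submit_refund(refund, ticket, order, issue_refund):
    """Issue the refund only if the validation step finds no problems."""
    problems = validate_refund(refund, ticket, order)
    if problems:
        # Blocked refunds are returned for human review instead of executing.
        return {"status": "blocked", "problems": problems}
    issue_refund(refund)
    return {"status": "issued", "problems": []}
```

The point of keeping `validate_refund` separate from whatever proposed the refund is that it rechecks the facts from source records instead of trusting the AI's own account of them.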
What do governance failures look like?
The AI processes a refund on the wrong customer's account. The AI quietly changes a sensitive field in production. The AI resolves a ticket and there is no record of how it decided. The AI follows the rules and still produces a bad outcome the customer wants redress for. Each one is a real failure mode in deployed systems, and each one can be prevented, or at least caught and traced, with the three components above.
Where do most teams start?
With the audit trail. It is the cheapest piece, it is required for the other two, and it lets you ask "what would governance even have caught?" against real production data. Scope and thresholds come next, informed by what the audit trail shows.
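One way to ask "what would governance even have caught?" is to replay logged actions against a candidate threshold before enforcing it. The sketch below assumes each audit record is a dictionary with `id`, `action`, and `amount` keys; that shape and the function name are illustrative, not a standard.

```python
def replay_refund_limit(records, refund_limit):
    """Count how many logged refunds a candidate approval limit would have flagged."""
    refunds = [r for r in records if r["action"] == "refund"]
    flagged = [r for r in refunds if r["amount"] > refund_limit]
    return {
        "refunds_seen": len(refunds),
        "flagged": len(flagged),
        "flagged_ids": [r["id"] for r in flagged],
    }

def compare_limits(records, candidate_limits):
    """Map each candidate limit to the number of refunds it would send to a human."""
    return {
        limit: replay_refund_limit(records, limit)["flagged"]
        for limit in candidate_limits
    }
```

Running `compare_limits` over a few weeks of audit records gives a rough estimate of the human review load each threshold would create, which is the kind of evidence that can inform where scope and thresholds are set.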
Related: AI guardrails in customer service, when AI is compliant but customers are harmed, and the glossary explainer on human-in-the-loop.