Deflection rate tells you how much work the AI took off your team. It does not tell you whether customers were helped. Four other metrics do: first-contact resolution by channel, re-contact rate, escalation quality, and downstream churn.
Most AI support reporting leads with deflection rate, and most teams have learned, often the hard way, that it can look excellent while customers are quietly leaving. The problem is not that deflection rate lies. It is that it measures the wrong thing: volume absorbed, not problems solved.
This article gives you four metrics that measure whether the AI actually helped. They come from Michael Howlett, who writes Customer Experience Decoded, and his work on why contact centre numbers look great while customers leave.
1. First-contact resolution, split by channel
First-contact resolution is the share of contacts solved without the customer needing to come back. It is a better question than deflection, because it asks about the outcome, not the handoff.
The important word is split. A blended FCR number across all channels hides the AI's real performance. Measure FCR for the AI channel on its own, and compare it to the human channels. If the AI's FCR is well below your agents', the AI is closing contacts without resolving them, and the blended number was covering for it.
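A minimal sketch of the split, assuming each contact record carries a channel label and a flag for whether it was resolved on first contact. The field names (`channel`, `resolved_first_contact`) and the sample data are hypothetical, not taken from any particular helpdesk export.

```python
from collections import defaultdict


def fcr_by_channel(contacts):
    """Return {channel: share of contacts resolved on first contact}."""
    totals = defaultdict(int)
    resolved = defaultdict(int)
    for contact in contacts:
        totals[contact["channel"]] += 1
        if contact["resolved_first_contact"]:
            resolved[contact["channel"]] += 1
    return {channel: resolved[channel] / totals[channel] for channel in totals}


def blended_fcr(contacts):
    """Return the single FCR figure across all channels combined."""
    return sum(1 for c in contacts if c["resolved_first_contact"]) / len(contacts)


# Hypothetical sample: the AI channel resolves 5 of 10, phone agents 9 of 10.
sample_contacts = (
    [{"channel": "ai_chat", "resolved_first_contact": True}] * 5
    + [{"channel": "ai_chat", "resolved_first_contact": False}] * 5
    + [{"channel": "phone", "resolved_first_contact": True}] * 9
    + [{"channel": "phone", "resolved_first_contact": False}] * 1
)

per_channel = fcr_by_channel(sample_contacts)  # ai_chat: 0.5, phone: 0.9
blended = blended_fcr(sample_contacts)         # 0.7
```

In this made-up sample the blended figure of 0.7 looks acceptable, while the per-channel view shows the AI channel at 0.5 against 0.9 for agents.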
2. Re-contact rate
Re-contact rate is the share of customers who come back about the same issue, in any channel, within a short window. It is the direct test of whether a closed contact stayed closed.
This metric catches the failure deflection rate cannot see: a customer the AI marked as handled who reopens the problem an hour later by email or phone. Measure re-contact within 24 to 72 hours. A high deflection rate paired with a high re-contact rate is not success. It is the same problem being counted twice and solved zero times.
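One way to compute this, sketched under assumptions: every contact has a customer ID, an issue label, a timestamp, and a flag for whether the AI closed it. All field names are hypothetical, and matching "the same issue" by an exact label is a simplification; real data usually needs a looser match.

```python
from datetime import datetime, timedelta


def recontact_rate(contacts, window=timedelta(hours=72)):
    """Share of AI-closed contacts followed by another contact from the same
    customer about the same issue, in any channel, within the window."""
    ai_closed = [c for c in contacts if c["closed_by_ai"]]
    if not ai_closed:
        return 0.0
    recontacted = 0
    for first in ai_closed:
        for later in contacts:
            same_issue = (
                later["customer_id"] == first["customer_id"]
                and later["issue"] == first["issue"]
            )
            gap = later["at"] - first["at"]
            # Count only contacts strictly after the AI contact and inside the window.
            if same_issue and timedelta(0) < gap <= window:
                recontacted += 1
                break
    return recontacted / len(ai_closed)


t0 = datetime(2024, 1, 1, 9, 0)
sample = [
    # Customer 1 reopens the same issue by email an hour later: a re-contact.
    {"customer_id": 1, "issue": "refund", "at": t0, "channel": "ai_chat", "closed_by_ai": True},
    {"customer_id": 1, "issue": "refund", "at": t0 + timedelta(hours=1), "channel": "email", "closed_by_ai": False},
    # Customer 2 comes back ten days later: outside the 72-hour window.
    {"customer_id": 2, "issue": "login", "at": t0, "channel": "ai_chat", "closed_by_ai": True},
    {"customer_id": 2, "issue": "login", "at": t0 + timedelta(days=10), "channel": "phone", "closed_by_ai": False},
    # Customer 3 never comes back.
    {"customer_id": 3, "issue": "billing", "at": t0, "channel": "ai_chat", "closed_by_ai": True},
    # Customer 4 comes back about a different issue: not a re-contact.
    {"customer_id": 4, "issue": "refund", "at": t0, "channel": "ai_chat", "closed_by_ai": True},
    {"customer_id": 4, "issue": "shipping", "at": t0 + timedelta(hours=2), "channel": "phone", "closed_by_ai": False},
]

rate = recontact_rate(sample)  # 1 of 4 AI-closed contacts was reopened
```

The window is a parameter so the same function can be run at 24, 48, and 72 hours to see how the figure moves.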
3. Escalation quality
Every AI deployment escalates some contacts to humans. Escalation is not a failure; sending the right contacts to a person is the system working. The metric that matters is whether the escalation was clean.
A clean escalation hands the human agent the full context: who the customer is, what they asked, what the AI already tried. A poor one dumps the customer on an agent with nothing, so the customer repeats the whole story and the agent starts cold. Score a sample of escalations on whether the context travelled. Poor escalation quality means the AI is not reducing work at the boundary; it is doubling it.
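Scoring a sample could look like the sketch below. It assumes each escalation record exposes the three pieces of context named above as fields; the field names and the all-or-nothing definition of "clean" are illustrative choices, not a standard rubric.

```python
# Hypothetical field names for the three pieces of context an agent needs.
REQUIRED_CONTEXT = ("customer_identity", "customer_request", "ai_steps_tried")


def context_score(escalation):
    """Fraction of required context fields that arrived non-empty."""
    present = sum(1 for field in REQUIRED_CONTEXT if escalation.get(field))
    return present / len(REQUIRED_CONTEXT)


def clean_escalation_rate(escalations):
    """Share of sampled escalations where every required field travelled."""
    if not escalations:
        return 0.0
    clean = sum(1 for e in escalations if context_score(e) == 1.0)
    return clean / len(escalations)


sampled_escalations = [
    # Full context travelled with the handoff.
    {"customer_identity": "acct-1001", "customer_request": "refund for duplicate charge",
     "ai_steps_tried": ["verified charge", "offered credit"]},
    # The AI's attempted steps were lost, so the agent has to retrace them.
    {"customer_identity": "acct-1002", "customer_request": "cannot log in",
     "ai_steps_tried": []},
    # Nothing travelled: the customer repeats the whole story.
    {},
]

scores = [context_score(e) for e in sampled_escalations]
clean_rate = clean_escalation_rate(sampled_escalations)  # 1 of 3 is clean
```

A field-presence check only confirms that something was passed along. Whether the context was accurate or useful still needs a human reviewer reading the sample.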
4. Downstream churn
The last metric looks past the contact itself. Downstream churn is the rate at which customers who had an AI interaction reduce spend, cancel, or fail to renew in the weeks after.
This is the slowest signal and the most important one. A contact can be closed, never re-contacted, and still have damaged the relationship enough that the customer leaves later. Compare the later behaviour of customers who went through AI against those who reached a human. If the AI cohort churns more, the efficiency on the dashboard is buying you lost customers, and no faster handle time makes that a good trade.
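The cohort comparison can be sketched as follows, assuming each customer record notes who handled their contact and whether they later churned. The field names (`handled_by`, `churned`) and the numbers are hypothetical, and the definition of churn (reduced spend, cancellation, non-renewal) is left to whatever your own data supports.

```python
def churn_by_cohort(customers):
    """Return {cohort: share of customers in that cohort who later churned}."""
    totals = {}
    churned = {}
    for customer in customers:
        cohort = customer["handled_by"]
        totals[cohort] = totals.get(cohort, 0) + 1
        churned[cohort] = churned.get(cohort, 0) + (1 if customer["churned"] else 0)
    return {cohort: churned[cohort] / totals[cohort] for cohort in totals}


# Hypothetical sample: 3 of 10 AI-handled customers churned, 1 of 10 human-handled.
sample_customers = (
    [{"handled_by": "ai", "churned": True}] * 3
    + [{"handled_by": "ai", "churned": False}] * 7
    + [{"handled_by": "human", "churned": True}] * 1
    + [{"handled_by": "human", "churned": False}] * 9
)

rates = churn_by_cohort(sample_customers)  # ai: 0.3, human: 0.1
churn_gap = rates["ai"] - rates["human"]   # positive means the AI cohort churns more
```

A raw gap like this is a starting point, not proof. If the AI and human channels receive different kinds of contacts or customers, compare like with like before drawing conclusions.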
Why these four, together
No single metric is enough, which is why deflection rate fails on its own. Together, the four cover the whole arc of a contact: was it resolved (FCR), did it stay resolved (re-contact), was the handoff clean (escalation quality), and did the customer stay (downstream churn).
Deflection rate answers how busy the AI was. These four answer whether it actually helped the customer. That is the difference between a workload number and a performance number.