When a customer's case cannot be resolved in one contact (a regulatory complaint, a long-running dispute, a multi-party issue), an AI agent manages the case over weeks: scheduling follow-ups, chasing internal owners, updating the customer at the agreed cadence, and escalating when an SLA is at risk. The new ingredient on this rung is durability: the case persists in the AI's working state for as long as it takes.

A customer files a complaint that involves a third party, a missing document, and a regulatory window. Before AI, the case lived in a ticketing system with manual reminders. The internal owner forgot. The customer chased. A week passed without progress because nobody noticed. AI changes the part that depends on someone noticing.

How this used to be a decision tree

Long-running cases were not actually trees so much as a queue with manual reminders. A ticket was created. SLAs were set. A human watched (sometimes). Follow-ups happened when an agent had time. Customers were updated when someone remembered. The structure depended on attention, and attention is a finite resource the customer's case did not always win.

Why AI doesn't make this a decision tree anymore

The AI watches the case clock continuously. It schedules the next follow-up at the agreed time. It prompts the internal owner when their step is due. It updates the customer at the cadence the customer agreed to at intake. It escalates to a human when any party misses their step. The case is not waiting for someone to remember it; the case remembers itself.

What people in the field are saying

Gradient Flow's "Agentic AI: challenges and opportunities" argues that durable, multi-step agency is the next real test for AI: the model that can hold a case open for weeks and act when its time comes is a different system from the one that answers a single question well.

How does long-running case management look today?

At intake, the AI captures the case, sets the SLA, agrees the cadence with the customer, and assigns the internal owners. From there, the AI runs the case calendar. It nudges the internal owner when their step is due, sends the customer an update when the cadence calls for it, and escalates when an owner is late. The customer never has to chase, because the AI is doing the chasing for them.

What does it take to make this work?

A case data model that can hold weeks of state: agreed cadence, internal owners, SLA clocks, pending steps. Write access to the calendar and notification systems. A clear distinction between "AI chases" (acceptable for low-risk internal nudges) and "AI escalates" (for breaches that need human visibility). Audit logs of every action on every case. An owner of the long-running case discipline overall, because the policy decisions stack up.

Where does this go wrong?

Cadence drift: the AI under- or over-contacts the customer because the agreed cadence was never written down. Internal-owner fatigue: the AI nudges the same owner several times a week and the owner stops reading. Escalation noise: too many SLA breaches escalate, and the human team stops reading the escalations too. Silent drops: the case clock ran out and the AI did not flag it.

Which tools handle long-running cases?

  • Sierra: case-state-aware autonomous handling.
  • Kustomer: customer-history-first helpdesk built for long cases.
  • DevRev: ties customer cases to internal work items.
  • Pylon: B2B support with multi-week case patterns.
  • Decagon: action-taking with persistent state.

How would I start doing this?

Pick one case type that already takes more than one contact to resolve (regulated complaints, refund disputes with a third party, account recovery). Set the cadence and SLA explicitly with the customer at intake. Let the AI run the calendar for one cohort. Measure how often the AI chased, how often the customer chased, and how often the case broke the SLA. The reduction in customer-initiated chasing is the metric that matters.

Next: a single event affects many customers at once. Responding to a mass event.