Autonomous Email Reply Agent

Cold outreach at scale means a lot of replies. Most of them follow the same patterns: interested, not interested, wrong person, out of office, asking for more info. Handling them manually is repetitive and slows down the top of the funnel.

So I built something that handles it. The agent reads every inbound reply, figures out what the person means, writes back, and follows up if they go quiet. No human in the loop unless it actually needs one.

How it works

A webhook fires on every inbound reply and pushes it into a BullMQ queue. From there:
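A minimal sketch of that intake step, under stated assumptions. `InboundQueue` mirrors only the `add(name, data, opts)` shape of a BullMQ `Queue`, so the sketch runs without Redis. The payload field names, the `inbound-reply` job name, and the reuse of the message ID as `jobId` are illustrative assumptions, not details from the actual build.

```typescript
// Hypothetical webhook payload; field names are assumptions for illustration.
interface InboundReply {
  messageId: string;
  threadId: string;
  from: string;
  body: string;
}

// Only the part of a BullMQ Queue this sketch needs: add(name, data, opts).
interface InboundQueue {
  add(jobName: string, data: InboundReply, opts?: { jobId?: string }): Promise<unknown>;
}

const REQUIRED_FIELDS = ["messageId", "threadId", "from", "body"] as const;

// Validate the raw payload, then enqueue it. Returns false for payloads that
// should be rejected (a webhook handler would typically answer 400 here).
async function enqueueInboundReply(queue: InboundQueue, payload: unknown): Promise<boolean> {
  if (typeof payload !== "object" || payload === null) return false;
  const p = payload as Record<string, unknown>;
  for (const field of REQUIRED_FIELDS) {
    const value = p[field];
    if (typeof value !== "string" || value === "") return false;
  }
  const reply: InboundReply = {
    messageId: String(p["messageId"]),
    threadId: String(p["threadId"]),
    from: String(p["from"]),
    body: String(p["body"]),
  };
  // Assumption: the message ID doubles as the job ID, so a retried webhook
  // for the same message does not enqueue the reply twice.
  await queue.add("inbound-reply", reply, { jobId: reply.messageId });
  return true;
}
```

Keeping the handler this thin means the webhook can acknowledge quickly and leave the slow Claude calls to the queue worker.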

A Claude call classifies the reply into a fixed intent set: POSITIVE, QUESTION, SOFT_NO, REFERRAL, UNSUBSCRIBE, OOO, and a few others. It also returns a confidence score. Anything below 70% confidence is routed to a human, whatever the intent. The model’s uncertainty is useful information.

A second Claude call drafts the reply, given the classified intent, the full thread history, and the sender’s CRM context.

A third Claude call reviews the draft and scores it 0-100 on tone, accuracy, and whether it actually addresses what was asked. 80+ sends automatically. 60-79 goes to Slack for the operator to approve or rewrite. Below 60, the human writes from scratch.
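The three score bands above reduce to a small routing function. This is a sketch, not the production code: the route names are made up for illustration, and treating an out-of-range or non-numeric score as a human rewrite is an assumption about how to fail safely.

```typescript
// Route names are hypothetical labels for the three outcomes described above.
type ReviewRoute = "auto_send" | "slack_approval" | "human_rewrite";

function routeByReviewScore(score: number): ReviewRoute {
  // Assumption: a malformed score (NaN, negative, above 100) fails closed.
  if (!Number.isFinite(score) || score < 0 || score > 100) return "human_rewrite";
  if (score >= 80) return "auto_send";      // 80+ sends automatically
  if (score >= 60) return "slack_approval"; // 60-79 goes to the operator in Slack
  return "human_rewrite";                   // below 60, the human writes from scratch
}
```

With these comparisons a fractional score such as 79.5 lands in the Slack band rather than auto-sending.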

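import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment by default.
const claude = new Anthropic();

// Classifier call. CLASSIFIER_SYSTEM_PROMPT and emailBody are defined elsewhere.
const result = await claude.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 256,
  system: CLASSIFIER_SYSTEM_PROMPT,
  messages: [{ role: "user", content: `Classify this reply:\n\n${emailBody}` }],
});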
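The classifier call above returns text, so something has to turn it into a routing decision. The sketch below assumes the system prompt asks for a JSON object like `{"intent": "QUESTION", "confidence": 0.91}` with confidence on a 0 to 1 scale; that output format is an assumption, as are the type and function names. Only the six intents named in this post are listed.

```typescript
// The six intents named in the post; the real set has a few more not listed here.
const KNOWN_INTENTS = ["POSITIVE", "QUESTION", "SOFT_NO", "REFERRAL", "UNSUBSCRIBE", "OOO"] as const;
type Intent = (typeof KNOWN_INTENTS)[number];

const CONFIDENCE_THRESHOLD = 0.7; // the 70% cutoff

type Classification =
  | { route: "continue"; intent: Intent; confidence: number }
  | { route: "human"; reason: string };

// Parse the classifier's raw text output. Anything malformed, unknown, or
// below the threshold goes to a human instead of the write/review steps.
function parseClassification(raw: string): Classification {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { route: "human", reason: "unparseable classifier output" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { route: "human", reason: "unexpected classifier output" };
  }
  const { intent, confidence } = parsed as { intent?: unknown; confidence?: unknown };
  const known = KNOWN_INTENTS.find((candidate) => candidate === intent);
  if (known === undefined) {
    return { route: "human", reason: "unknown intent" };
  }
  // The negated >= also rejects NaN.
  if (typeof confidence !== "number" || !(confidence >= CONFIDENCE_THRESHOLD)) {
    return { route: "human", reason: "low confidence" };
  }
  return { route: "continue", intent: known, confidence };
}
```

In the Anthropic SDK, `result.content` is a list of content blocks; under the assumption above, the JSON would be the `text` of the first block whose `type` is `"text"`.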

Nudge system

If a lead doesn’t reply, the agent follows up. After the first AI reply goes out, a 9-message sequence fires over roughly three weeks. Each nudge is a BullMQ job with a delay baked in. When one fires, it queues the next before returning. Self-chaining.
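Self-chaining can be sketched as follows. `NudgeQueue` mirrors BullMQ's `Queue.add(name, data, { delay })`, where `delay` is in milliseconds, so the example runs without Redis. The per-nudge gaps are hypothetical (chosen to sum to 21 days); the real schedule isn't given here, and the function names are illustrative.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical gaps, in days, before each of the 9 nudges. They sum to 21.
const NUDGE_DELAYS_DAYS = [1, 1, 2, 2, 2, 3, 3, 3, 4];

interface NudgeJobData {
  threadId: string;
  position: number; // 0-based index into the sequence
}

// Same shape as BullMQ's Queue.add(name, data, opts) with a delay in ms.
interface NudgeQueue {
  add(jobName: string, data: NudgeJobData, opts: { delay: number }): Promise<unknown>;
}

// Queue the nudge at `position`, delayed by its sequence-defined gap.
// Returns false when the sequence has no nudge at that position.
async function scheduleNudge(queue: NudgeQueue, threadId: string, position: number): Promise<boolean> {
  const days = NUDGE_DELAYS_DAYS[position];
  if (days === undefined) return false;
  await queue.add("nudge", { threadId, position }, { delay: days * DAY_MS });
  return true;
}

// Worker-side handler: send this nudge, then queue the next before returning.
async function handleNudge(
  queue: NudgeQueue,
  job: NudgeJobData,
  send: (job: NudgeJobData) => Promise<void>,
): Promise<void> {
  await send(job);
  await scheduleNudge(queue, job.threadId, job.position + 1);
}
```

Because only one nudge per thread is ever queued at a time, stopping a sequence means removing a single job rather than nine.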

The sequence stops when the lead replies, a booking webhook fires, the operator pauses it from Slack, or the intent comes back as UNSUBSCRIBE or a hard no.
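One way to enforce those stop conditions is a check that runs when a nudge job fires, just before anything is sent. The sketch below is an assumption about structure, not the actual code: the `ThreadState` fields are hypothetical, and `HARD_NO` is a made-up label for the "hard no" intent, whose real name isn't given here.

```typescript
type StopReason = "lead_replied" | "meeting_booked" | "operator_paused" | "unsubscribe" | "hard_no";

// Hypothetical per-thread state; in practice this would be loaded from the CRM or Redis.
interface ThreadState {
  leadRepliedSinceLastSend: boolean;
  meetingBooked: boolean;   // set by the booking webhook
  pausedByOperator: boolean; // set from Slack
  lastIntent: string | null; // most recent classified intent, if any
}

// Returns why the sequence must stop, or null if the next nudge may go out.
function stopReason(state: ThreadState): StopReason | null {
  if (state.leadRepliedSinceLastSend) return "lead_replied";
  if (state.meetingBooked) return "meeting_booked";
  if (state.pausedByOperator) return "operator_paused";
  if (state.lastIntent === "UNSUBSCRIBE") return "unsubscribe";
  if (state.lastIntent === "HARD_NO") return "hard_no"; // assumed label
  return null;
}
```

Checking at fire time covers the case where a delayed job was queued days before the lead replied or booked.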

Nudge bodies are generated at send time with the full thread as context, not from static templates. If the lead asked about pricing two messages ago, the next nudge knows that. Early nudges are short check-ins, later ones carry soft urgency, the final one closes cleanly. Static templates are kept as a fallback if the AI call fails.
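The generate-then-fall-back behaviour can be sketched like this. The `generate` callback stands in for the Claude call, and the stage boundaries (first three nudges as check-ins) are an assumption; only "final one closes" comes from the description above.

```typescript
type NudgeStage = "check_in" | "soft_urgency" | "closing";

const TOTAL_NUDGES = 9;

// Map a 0-based position to a tone. The cutoff at 3 is hypothetical.
function stageFor(position: number): NudgeStage {
  if (position >= TOTAL_NUDGES - 1) return "closing";
  if (position < 3) return "check_in";
  return "soft_urgency";
}

interface NudgeBody {
  body: string;
  source: "ai" | "template";
}

// Try the AI-generated body first; fall back to the static template for this
// stage if the call throws or returns only whitespace.
async function buildNudgeBody(
  position: number,
  thread: string[],
  generate: (stage: NudgeStage, thread: string[]) => Promise<string>,
  fallbackTemplates: Record<NudgeStage, string>,
): Promise<NudgeBody> {
  const stage = stageFor(position);
  try {
    const body = (await generate(stage, thread)).trim();
    if (body.length > 0) return { body, source: "ai" };
  } catch {
    // Swallow the error on purpose: a failed AI call should not skip the nudge.
  }
  return { body: fallbackTemplates[stage], source: "template" };
}
```

Returning `source` alongside the body makes it easy to log how often the fallback is actually used.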

Multi-turn

The agent isn’t limited to one reply per lead. When a lead follows up with a question after getting an AI reply, the classifier routes it back through the same write/review loop with the full thread injected as context, rather than handing off to a human immediately.

Turn 1: classify → write → review → send → nudge sequence starts
Turn 2: classify → thread history fetched → write → review → send → nudge resumes from position
Turn N > maxAiTurns: route to operator with thread summary

Some intents always go to the operator regardless of turn count: REFERRAL, pricing pushback above a certain signal strength, anything below the confidence threshold. The agent shouldn’t be handling complex negotiation.
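Put together, the per-turn decision looks roughly like the sketch below. The values of `MAX_AI_TURNS` and `PRICING_PUSHBACK_LIMIT` are placeholders, and the `pricingPushback` field (a 0 to 1 signal strength) is a hypothetical stand-in for however that signal is actually measured.

```typescript
const MAX_AI_TURNS = 3;             // placeholder; the real value is still being calibrated
const CONFIDENCE_THRESHOLD = 0.7;   // the 70% cutoff
const PRICING_PUSHBACK_LIMIT = 0.6; // hypothetical signal-strength cutoff

interface InboundTurn {
  turn: number;            // 1 for the lead's first reply, 2 for the next, ...
  intent: string;
  confidence: number;      // 0..1
  pricingPushback: number; // 0..1, hypothetical signal
}

type TurnRoute = { to: "agent" } | { to: "operator"; reason: string };

function routeTurn(t: InboundTurn): TurnRoute {
  // Always-operator cases come first, regardless of turn count.
  if (t.confidence < CONFIDENCE_THRESHOLD) return { to: "operator", reason: "low confidence" };
  if (t.intent === "REFERRAL") return { to: "operator", reason: "referral" };
  if (t.pricingPushback > PRICING_PUSHBACK_LIMIT) return { to: "operator", reason: "pricing pushback" };
  // Turn N > maxAiTurns: hand off (with a thread summary, generated elsewhere).
  if (t.turn > MAX_AI_TURNS) return { to: "operator", reason: "turn limit reached" };
  return { to: "agent" };
}
```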

After a successful re-reply, the nudge sequence resumes from its current position rather than restarting from 1.

Slack controls

Every AI action posts to Slack. The operator controls live there:

“Pause AI” cancels all queued nudge jobs for a thread. “Resume” re-queues the next nudge from the current position, using the sequence-defined delay rather than a hardcoded 1 day. Later nudges have 2-4 day gaps, and compressing them looks bad.
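A sketch of pause and resume under stated assumptions: `JobScheduler` is a hypothetical wrapper around the queue (scheduling a delayed job and removing one by ID), the thread state shape is invented for illustration, and the gaps are the same placeholder schedule rather than the real one.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical per-position gaps in days (not the real schedule).
const GAPS_DAYS = [1, 1, 2, 2, 2, 3, 3, 3, 4];

interface ThreadNudgeState {
  threadId: string;
  nextPosition: number;   // 0-based position of the next nudge to send
  paused: boolean;
  queuedJobIds: string[]; // IDs of delayed jobs currently waiting
}

// Hypothetical wrapper over the queue: schedule returns the new job's ID.
interface JobScheduler {
  schedule(threadId: string, position: number, delayMs: number): Promise<string>;
  cancel(jobId: string): Promise<void>;
}

// Pause: cancel everything queued for the thread and mark it paused.
async function pauseNudges(scheduler: JobScheduler, thread: ThreadNudgeState): Promise<void> {
  for (const jobId of thread.queuedJobIds) {
    await scheduler.cancel(jobId);
  }
  thread.queuedJobIds = [];
  thread.paused = true;
}

// Resume: re-queue the next nudge using the gap defined for its position,
// not a fixed one-day delay. Returns false if the sequence is already done.
async function resumeNudges(scheduler: JobScheduler, thread: ThreadNudgeState): Promise<boolean> {
  thread.paused = false;
  const gapDays = GAPS_DAYS[thread.nextPosition];
  if (gapDays === undefined) return false;
  const jobId = await scheduler.schedule(thread.threadId, thread.nextPosition, gapDays * DAY_MS);
  thread.queuedJobIds.push(jobId);
  return true;
}
```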

Drafts scoring 60-79 post to Slack with the draft body and two buttons: “Send this” and “Decline”. The draft is stored in Redis with a 30-minute TTL; the button value is just the Redis key, which avoids Slack’s 2000-character button value limit.
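The key-in-the-button pattern might look like the sketch below. `DraftStore` is a hypothetical interface standing in for Redis (with ioredis, `set` would map to `set(key, value, "EX", ttlSeconds)`); the key format and `action_id` values are made up. Treating a missing key as an expired draft is an assumption about what should happen once the TTL passes.

```typescript
const DRAFT_TTL_SECONDS = 30 * 60;   // 30-minute TTL
const SLACK_BUTTON_VALUE_MAX = 2000; // Slack's cap on a button's `value`

// Hypothetical stand-in for Redis SET with expiry and GET.
interface DraftStore {
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  get(key: string): Promise<string | null>;
}

// Store the draft body and return the short key that goes into the buttons.
async function stashDraft(store: DraftStore, draftId: string, body: string): Promise<string> {
  const key = `draft:${draftId}`;
  if (key.length > SLACK_BUTTON_VALUE_MAX) {
    throw new Error("draft key too long for a Slack button value");
  }
  await store.set(key, body, DRAFT_TTL_SECONDS);
  return key;
}

interface SlackButton {
  type: "button";
  action_id: string;
  text: { type: "plain_text"; text: string };
  value: string;
}

// Block Kit actions block carrying only the key, never the draft body.
function approvalButtons(key: string): { type: "actions"; elements: SlackButton[] } {
  return {
    type: "actions",
    elements: [
      { type: "button", action_id: "draft_send", text: { type: "plain_text", text: "Send this" }, value: key },
      { type: "button", action_id: "draft_decline", text: { type: "plain_text", text: "Decline" }, value: key },
    ],
  };
}

type LoadedDraft = { ok: true; body: string } | { ok: false; reason: "expired" };

// On "Send this": look the draft up again. A missing key means the TTL passed.
async function loadDraftForSend(store: DraftStore, key: string): Promise<LoadedDraft> {
  const body = await store.get(key);
  return body === null ? { ok: false, reason: "expired" } : { ok: true, body };
}
```

The draft body itself can still be shown in the message text; only the button payload needs to stay small.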

If the operator edits a draft and sends it manually from their inbox, there’s an “I sent it manually, start nudges” button that kicks off the sequence without touching the CRM.

What’s still unfinished

Meeting no-shows leave threads paused indefinitely. If a lead books a call and ghosts it, nothing re-engages them automatically.

Cold re-engagements (lead goes quiet, comes back months later) correctly route to the operator for the first reply back. But there’s no automated path to restart the nudge sequence after that.

maxAiTurns is set conservatively right now. It needs calibration against real thread data to find where reply quality actually starts slipping.

Full build breakdown across two posts:

→ Part 1: Core Pipeline

→ Part 2: Nudges, Edge Cases & Multi-Turn