Auto-Responders That Actually Read the Email

MultiMail's inbound webhooks and gated_send oversight let AI agents draft contextual replies that a human reviews before anything reaches the customer.


Why this matters

Generic auto-responders erode trust. A customer emailing about a billing dispute doesn't want 'Thanks for contacting us — we'll respond within 2 business days.' They want an answer. But fully autonomous AI replies carry real risk: hallucinated policy details, wrong account information, or a tone-deaf response to an already-frustrated customer. Most teams are stuck between 'useless template' and 'fully autonomous AI' with no intermediate option that's actually safe to deploy.


How MultiMail solves this

MultiMail's inbound webhook delivers the full email to your agent the moment it arrives. The agent calls get_thread to load the complete conversation history, drafts a reply using your LLM of choice, then queues it via reply_email with gated_send mode. A human reviewer sees the draft alongside the original thread, approves or edits it, and MultiMail delivers. The agent handles reading and drafting; the human handles last-mile judgment. As you build confidence in specific email categories, you can shift them to monitored or autonomous mode without changing your agent's code.

1. Inbound webhook triggers the agent

Configure your MultiMail mailbox to POST to your webhook endpoint on every inbound message. The payload includes message_id, thread_id, sender, subject, and body text. Your agent receives this event and begins processing immediately, with no polling required.

2. Agent loads full thread context

The agent calls get_thread with the thread_id from the webhook payload. This returns the complete conversation history (all prior messages, timestamps, and directions), so the draft accounts for everything that has already been said and avoids repeating questions already answered.

3. Agent drafts a contextual reply

The agent passes the thread history and incoming message to your LLM with a system prompt that includes your product knowledge and any verified account data you've fetched from your own database. The model generates a reply scoped to what it actually knows.

4. Draft queued for human review

The agent calls reply_email with the drafted response and oversight_mode set to gated_send. The message is held in the pending queue rather than delivered. Reviewers access the queue via list_pending and see the original email, full thread, and proposed reply side by side.

5. Human approves, edits, or rejects

The reviewer calls decide_email with one of three outcomes: decision set to approve, approve with a revised_body, or reject with a rejection_note. Approved messages deliver immediately. Rejected drafts can trigger a revision loop in your agent.

6. Delivery webhook closes the loop

MultiMail fires a delivery event when the message sends. Your agent logs the outcome, calls tag_email to mark the thread resolved, and optionally updates your CRM or support system with the resolution details.
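Step 1 names the fields carried by the inbound payload. Before calling get_thread, a handler might validate them so that a malformed event fails early. The following is a minimal sketch, assuming the payload is a flat JSON object keyed message_id, thread_id, sender, subject, and body_text (the last key name is taken from the drafting code in the Implementation section). InboundEvent and parse_inbound are hypothetical names, not part of the MultiMail client.

```python
from dataclasses import dataclass

# Assumed key names for the fields listed in step 1.
REQUIRED_FIELDS = ("message_id", "thread_id", "sender", "subject", "body_text")


@dataclass(frozen=True)
class InboundEvent:
    """Hypothetical container for the inbound webhook fields."""
    message_id: str
    thread_id: str
    sender: str
    subject: str
    body_text: str


def parse_inbound(payload: dict) -> InboundEvent:
    """Validate the payload and return a typed event, or raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"inbound payload missing fields: {', '.join(missing)}")
    # Extra keys are ignored so that new payload fields do not break the handler.
    return InboundEvent(**{name: str(payload[name]) for name in REQUIRED_FIELDS})
```

A handler could call parse_inbound(request.json) and return a 400 response on ValueError instead of letting a KeyError surface later.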


Implementation

Inbound webhook handler
python
import hmac
import hashlib
from flask import Flask, request, jsonify
from multimail import MultimailClient

app = Flask(__name__)
client = MultimailClient(api_key="$MULTIMAIL_API_KEY")
WEBHOOK_SECRET = b"your_webhook_secret"

@app.route("/webhooks/inbound", methods=["POST"])
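def handle_inbound():
    # Verify the HMAC-SHA256 signature over the raw request body before trusting it.
    sig = request.headers.get("X-MultiMail-Signature", "")
    expected = hmac.new(WEBHOOK_SECRET, request.data, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return jsonify({"error": "invalid signature"}), 401

    payload = request.json
    message_id = payload["message_id"]
    thread_id = payload["thread_id"]

    # Load the full conversation so the draft reflects everything said so far.
    thread = client.get_thread(thread_id=thread_id)
    # draft_reply is defined in the drafting example that follows.
    draft = draft_reply(thread=thread, incoming=payload)

    # gated_send holds the reply in the pending queue until a reviewer decides.
    client.reply_email(
        message_id=message_id,
        body=draft,
        oversight_mode="gated_send"
    )

    return jsonify({"status": "queued"}), 200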

Receive an inbound email event, verify the signature, fetch the thread, and queue a draft for review.
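The signature check in the handler above can be pulled into a small function that is easy to unit test without Flask. This is a sketch under the same assumption the handler makes: the X-MultiMail-Signature header carries the hex HMAC-SHA256 digest of the raw request body. The helper name verify_signature is hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(secret: bytes, body: bytes, header_value: Optional[str]) -> bool:
    """Return True if header_value is the hex HMAC-SHA256 of body under secret."""
    if not header_value:
        return False
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    try:
        # compare_digest is constant-time; it raises TypeError for non-ASCII str input.
        return hmac.compare_digest(header_value.strip(), expected)
    except TypeError:
        return False
```

Treating a missing or malformed header as a failed check keeps the handler's 401 path the only outcome for unsigned requests.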

Draft contextual reply with thread history
python
import anthropic

client_ai = anthropic.Anthropic()

def draft_reply(thread: dict, incoming: dict) -> str:
    history = []
    for msg in thread["messages"]:
        role = "assistant" if msg["direction"] == "outbound" else "user"
        history.append({"role": role, "content": msg["body_text"]})

    system_prompt = (
        "You are a support agent for Acme SaaS. "
        "Draft a helpful, accurate reply to the customer's email. "
        "Be concise. Do not make promises you cannot verify. "
        "If you are unsure about account-specific details, write "
        "[REVIEWER: please fill in X] so the human reviewer knows where to complete the draft."
    )

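    # get_thread may already include the incoming message; append it only if
    # it is not already the last entry, so the model does not see it twice.
    messages = list(history)
    if not messages or messages[-1]["content"] != incoming["body_text"]:
        messages.append({"role": "user", "content": incoming["body_text"]})

    response = client_ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        system=system_prompt,
        messages=messages
    )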

    return response.content[0].text

Build a prompt from the full thread and use an LLM to generate a reply grounded in the actual conversation.
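Real threads do not always alternate neatly: a customer may send two emails in a row before anyone replies. Some chat APIs expect alternating turns, so it is worth checking your model provider's documentation. The sketch below merges back-to-back messages from the same side into one turn. It operates on the role/content dictionaries built in draft_reply; normalize_history is a hypothetical helper name.

```python
def normalize_history(history: list) -> list:
    """Merge consecutive same-role messages into single turns."""
    merged = []
    for msg in history:
        if merged and merged[-1]["role"] == msg["role"]:
            # Join back-to-back messages from the same side with a blank line.
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            # Copy the dict so the caller's history is not mutated.
            merged.append({"role": msg["role"], "content": msg["content"]})
    return merged
```

In draft_reply, the result of normalize_history(history) could be passed as the messages argument in place of the raw list.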

Review queue — list and decide
python
from multimail import MultimailClient

client = MultimailClient(api_key="$MULTIMAIL_API_KEY")

def process_review_queue(mailbox_id: str):
    pending = client.list_pending(mailbox_id=mailbox_id)

    for item in pending["messages"]:
        print(f"From: {item['from']}")
        print(f"Subject: {item['subject']}")
        print(f"Draft reply:\n{item['draft_body']}\n")

        action = input("[a]pprove / [e]dit / [r]eject: ").strip().lower()

        if action == "a":
            client.decide_email(
                message_id=item["message_id"],
                decision="approve"
            )
        elif action == "e":
            revised = input("Enter revised reply: ")
            client.decide_email(
                message_id=item["message_id"],
                decision="approve",
                revised_body=revised
            )
        elif action == "r":
            note = input("Rejection note for agent: ")
            client.decide_email(
                message_id=item["message_id"],
                decision="reject",
                rejection_note=note
            )

process_review_queue(mailbox_id="[email protected]")

Build a simple approval loop using list_pending and decide_email. As written it prompts on the command line; the same calls can sit behind a dashboard backend for your support team.

Tag resolved threads after delivery
python
from flask import Flask, request, jsonify
from multimail import MultimailClient

app = Flask(__name__)
client = MultimailClient(api_key="$MULTIMAIL_API_KEY")

@app.route("/webhooks/delivery", methods=["POST"])
def handle_delivery():
    payload = request.json

    if payload.get("event") == "message.delivered":
        client.tag_email(
            message_id=payload["original_message_id"],
            tags=["auto-replied", "resolved"]
        )

    return jsonify({"ok": True}), 200

Use the delivery webhook to mark the original message resolved via tag_email, keeping your inbox organized without manual work.

MCP tool equivalent (Claude Desktop / Cursor)
text
# Check for new inbound emails
check_inbox(mailbox_id="[email protected]", filter="unread")

# Load full thread context
get_thread(thread_id="thread_01abc123")

# Read the specific inbound message
read_email(message_id="msg_01xyz789")

# Queue a draft reply for human review
reply_email(
  message_id="msg_01xyz789",
  body="Hi Alex, you can cancel your subscription from Settings > Billing > Cancel Plan. The cancellation takes effect at the end of your current billing period. Let me know if you need help finding it.",
  oversight_mode="gated_send"
)

# Check what's pending approval
list_pending(mailbox_id="[email protected]")

# Approve a draft
decide_email(message_id="msg_01xyz789", decision="approve")

# Tag original thread resolved
tag_email(message_id="msg_01xyz789", tags=["resolved"])

If your agent runs inside an MCP-compatible client, use these tool calls directly — no webhook handler required.


What you get

Contextual replies, not templates

The agent reads the full conversation thread via get_thread before drafting. Replies reference what the customer actually said, not a generic acknowledgment that ignores the question entirely.

Human review before every send

gated_send mode holds every draft in the pending queue. Your team reviews the agent's work and approves, edits, or rejects before anything reaches the customer. You get drafting speed without losing control over what goes out.

Incremental path to autonomy

Start with gated_send across all email categories. As you build confidence in specific types — shipping status, password reset instructions, plan upgrade confirmations — switch those categories to monitored or autonomous. The oversight_mode is set per-send, so you can mix modes without infrastructure changes.

Structured audit trail

Every draft, every approval decision, and every delivery event is logged. You can query the full history of any thread, see exactly what the agent drafted, and review what a human changed before approving. This matters for support quality audits.

Model-agnostic drafting

MultiMail handles inbound routing, thread stitching, and delivery. Your agent can use any model for the drafting step — Claude, GPT-4o, Gemini, or a fine-tuned model trained on your historical support data. The API does not care which LLM you use.


Recommended oversight mode

Recommended: gated_send
Auto-responders handle high-volume, customer-facing email where errors are visible and trust-damaging. gated_send is the right default: agents draft quickly, but a human reviewer catches hallucinated policy details, wrong account information, or inappropriate tone before the message reaches the customer. Once you've reviewed enough drafts in a given email category to trust the agent's output quality, switch that category to monitored or autonomous — neither requires changes to your webhook handler or agent logic.
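One way to make "reviewed enough drafts" concrete is to track, per category, how often reviewers approve a draft without editing it. The sketch below assumes you keep your own decision log as a list of dictionaries with category, decision, and an optional edited flag. That record shape, the function names, and the default thresholds are all illustrative assumptions, not MultiMail features.

```python
from collections import defaultdict


def unedited_approval_rates(decisions):
    """Return {category: (rate, count)}; rate is the share approved without edits."""
    totals = defaultdict(int)
    clean = defaultdict(int)
    for d in decisions:
        totals[d["category"]] += 1
        if d["decision"] == "approve" and not d.get("edited", False):
            clean[d["category"]] += 1
    return {cat: (clean[cat] / totals[cat], totals[cat]) for cat in totals}


def ready_to_graduate(decisions, min_reviews=200, min_rate=0.98):
    """List categories with enough reviews and a high enough unedited approval rate."""
    rates = unedited_approval_rates(decisions)
    return sorted(
        cat for cat, (rate, count) in rates.items()
        if count >= min_reviews and rate >= min_rate
    )
```

Categories returned by ready_to_graduate are candidates for monitored mode; the decision to switch still belongs to your team.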

Common questions

How does the agent access the full conversation, not just the latest message?
The inbound webhook payload includes a thread_id. Your agent passes this to get_thread, which returns all messages in the conversation in chronological order — body text, sender, timestamp, and direction (inbound or outbound). The LLM receives the full history before drafting, so replies don't repeat questions already answered or ignore prior commitments.
What happens when a reviewer rejects a draft?
Calling decide_email with decision: 'reject' removes the message from the pending queue without delivering it. You can include a rejection_note that your agent logs or uses to trigger a revision workflow — for example, re-prompting the LLM with the reviewer's feedback and queuing a new draft. The original inbound message remains readable via read_email so the thread is not lost.
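A revision workflow might re-prompt the model with the rejected draft and the reviewer's note. The sketch below only builds the message list; it assumes your agent already has the rejection_note in hand (how it is delivered to the agent is not covered here) and uses the same role/content message shape as the drafting example. build_revision_messages is a hypothetical helper name.

```python
def build_revision_messages(history, rejected_draft, rejection_note):
    """Return a message list asking the model to redraft using reviewer feedback."""
    feedback = (
        "A human reviewer rejected the previous draft.\n"
        f"Reviewer note: {rejection_note}\n"
        "Write a revised reply that addresses the note."
    )
    # The rejected draft is replayed as the assistant's last turn so the
    # model can see exactly what it is being asked to fix.
    return list(history) + [
        {"role": "assistant", "content": rejected_draft},
        {"role": "user", "content": feedback},
    ]
```

The revised text would then be queued again through reply_email with gated_send, so the second draft receives the same review as the first. Capping the number of revision attempts avoids an endless loop on hard cases.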
Can I use different oversight modes for different email categories?
Yes. The oversight_mode parameter is set at send time, not at the mailbox level. Your agent can inspect the incoming email's subject line, sender domain, or body content, then choose gated_send for ambiguous cases and autonomous for well-defined categories where the response is deterministic — like confirming a support ticket was received with a ticket number from your system.
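A rule-based router is one simple way to choose the mode per send. This is a sketch under stated assumptions: the keyword lists and the domain set are hypothetical examples to be tuned against your own traffic, and choose_oversight_mode is not a MultiMail function. Only the mode names gated_send and autonomous come from the product.

```python
from email.utils import parseaddr

# All three collections are hypothetical examples.
ESCALATION_WORDS = ("refund", "dispute", "chargeback", "cancel", "lawyer")
AUTONOMOUS_SUBJECTS = ("password reset", "shipping status")
ALWAYS_GATED_DOMAINS = {"bigcustomer.example"}


def choose_oversight_mode(sender: str, subject: str, body_text: str) -> str:
    """Pick an oversight_mode string for one reply; default to gated_send."""
    # parseaddr handles both "a@b.com" and "Name <a@b.com>" forms.
    domain = parseaddr(sender)[1].rpartition("@")[2].lower()
    text = f"{subject}\n{body_text}".lower()

    if domain in ALWAYS_GATED_DOMAINS:
        return "gated_send"
    # Anything that looks like a dispute always goes to a human.
    if any(word in text for word in ESCALATION_WORDS):
        return "gated_send"
    if any(phrase in subject.lower() for phrase in AUTONOMOUS_SUBJECTS):
        return "autonomous"
    return "gated_send"
```

The return value can be passed straight to reply_email as oversight_mode. Falling through to gated_send means an unrecognized email never skips review.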
How do I prevent the agent from making up account details it doesn't have access to?
Two approaches work well together. First, inject verified account data into the system prompt before calling your LLM — look up the sender's email in your database and include their plan, account status, and relevant history. Second, instruct the agent to write [REVIEWER: please fill in X] for any detail it cannot verify, so reviewers know exactly where to complete the draft before approving.
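Both approaches can be sketched in a few lines. The first function folds verified account fields into the system prompt; the second lets a review tool detect placeholders that are still unresolved before a draft is approved. The account dictionary stands in for whatever your own database lookup returns, and both function names are hypothetical.

```python
import re
from typing import Optional

PLACEHOLDER_RULE = (
    "If a detail is not listed under VERIFIED ACCOUNT DATA, do not state it. "
    "Write [REVIEWER: please fill in X] instead."
)
PLACEHOLDER_PATTERN = re.compile(r"\[REVIEWER:[^\]]*\]")


def build_system_prompt(base_prompt: str, account: Optional[dict]) -> str:
    """Append verified account facts and the placeholder rule to the base prompt."""
    if not account:
        facts = "(no account found for this sender)"
    else:
        facts = "\n".join(f"- {key}: {value}" for key, value in sorted(account.items()))
    return f"{base_prompt}\n\nVERIFIED ACCOUNT DATA:\n{facts}\n\n{PLACEHOLDER_RULE}"


def find_placeholders(draft: str) -> list:
    """Return every unresolved [REVIEWER: ...] marker left in a draft."""
    return PLACEHOLDER_PATTERN.findall(draft)
```

A review tool could refuse to approve any draft for which find_placeholders returns a non-empty list, so a marker never reaches the customer.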
What email volume can this handle?
MultiMail processes inbound webhooks synchronously with no per-mailbox rate limit on inbound processing. Outbound sending is subject to your plan's monthly limit: Builder (5,000/mo), Pro (30,000/mo), Scale (150,000/mo). High-volume support queues should use the Pro or Scale plan. Drafts that are rejected or expire do not count against your send limit.
Can I build this with the MCP server instead of the REST API?
Yes. The MCP server exposes equivalent tools: check_inbox, read_email, get_thread, reply_email, list_pending, decide_email, and tag_email. If your agent runs inside Claude Desktop, Cursor, Windsurf, or another MCP-compatible client, you can use these tools directly without writing a webhook handler. Oversight mode behavior is identical to the REST API.
How do I notify my support team when drafts are waiting for review?
MultiMail fires a message.queued webhook event when a gated_send draft enters the pending queue. Subscribe to this event in your webhook endpoint and use it to post a Slack notification, send an internal email, or update your support team's dashboard. The event payload includes the message_id, thread_id, and sender so reviewers can triage before opening the queue.
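A notification hook for this event might look like the sketch below. It assumes the event name arrives under an event key, as in the delivery handler shown earlier, and that you post to a Slack incoming webhook, which accepts a JSON body with a text field. The function names and the webhook URL you pass in are your own; nothing here is part of the MultiMail client.

```python
import json
import urllib.request
from typing import Optional


def format_queue_notice(event: dict) -> Optional[str]:
    """Return a one-line notice for message.queued events, or None for other events."""
    if event.get("event") != "message.queued":
        return None
    return (
        f"Draft awaiting review from {event.get('sender', 'unknown sender')} "
        f"(thread {event.get('thread_id', '?')}, message {event.get('message_id', '?')})"
    )


def post_to_slack(webhook_url: str, text: str) -> None:
    """POST the notice to a Slack incoming webhook URL."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        response.read()
```

Keeping the formatting separate from the network call makes the message easy to test and to reuse for an internal email or dashboard update.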
