Email Threading That Keeps Agents in Context

MultiMail tracks thread identity, conversation history, and reply headers so your AI agent never loses context or forks a conversation incorrectly.


Why this matters

Email agents fail in subtle, trust-destroying ways when thread context is missing. An agent that replies to the wrong message in a chain, repeats information already addressed three emails ago, or sends a response that omits the References header — breaking thread grouping in every major email client — looks broken even if the underlying logic is correct. These failures are hard to catch in testing because they depend on real conversation state that unit tests do not model. The root cause is usually the same: the agent fetched the most recent message and replied to it in isolation, without access to the thread as a whole.


How MultiMail solves this

MultiMail exposes email conversations as first-class objects. Each thread has a stable ID, an ordered message list, and pre-computed In-Reply-To and References header values ready to attach to your reply. When your agent calls get_thread, it receives the full message history — not just the latest email — along with the correct Message-ID of the message being replied to. The reply_email endpoint sets RFC 2822 threading headers automatically, so downstream mail clients group the conversation correctly regardless of which client receives the reply.
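To make the header mechanics concrete, here is a minimal sketch of how In-Reply-To and References values are conventionally derived for a reply under RFC 5322, section 3.6.4. It uses only the Python standard library. The helper name build_reply_headers and the example Message-IDs are hypothetical, and the source does not describe how MultiMail computes these values internally, so treat this as an illustration of the convention rather than the service's implementation.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_reply_headers(parent_message_id: str, parent_references: str = "") -> dict:
    """Return In-Reply-To and References values for a reply to one parent message."""
    # References carries the whole ancestry: the parent's own References
    # (if any) followed by the parent's Message-ID.
    chain = parent_references.split() + [parent_message_id]
    return {"In-Reply-To": parent_message_id, "References": " ".join(chain)}


# Hypothetical thread; these Message-IDs are made up for illustration.
first_id = "<msg-1@example.com>"
second_id = "<msg-2@example.com>"

# The second message replies to the first; the third replies to the second.
second_headers = build_reply_headers(first_id)
third_headers = build_reply_headers(second_id, second_headers["References"])

# Attach the computed values to an outgoing reply.
reply = EmailMessage()
reply["Message-ID"] = make_msgid(domain="example.com")
reply["In-Reply-To"] = third_headers["In-Reply-To"]
reply["References"] = third_headers["References"]
reply.set_content("Thanks, the deployment window works for us.")
```

Mail clients generally rely on this chain to group messages, which is why a reply that omits References tends to appear as a separate conversation.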

1. Fetch the thread

Call get_thread with the thread ID to retrieve all messages in the conversation, ordered chronologically. Each message includes its Message-ID, sender, timestamp, and body. The response surfaces the thread subject, participant list, and the message_id needed to construct a well-formed reply.

2. Load conversation history

The get_thread response includes a messages array with every message in the thread. Pass this ordered history to your LLM so it can draft a reply with full awareness of what has already been said, committed to, or asked, not just the most recent message.

3. Draft a contextual reply

Your agent generates a reply grounded in the full thread history. Because the complete conversation is available, the agent avoids repeating resolved items, surfaces only what is genuinely new, and matches the tone and commitments established earlier in the thread.

4. Send with correct reply headers

Call reply_email with the thread_id and the message_id of the specific message you are replying to. MultiMail constructs and attaches the In-Reply-To and References headers automatically, so no manual header manipulation is required in your agent code.

5. Track delivery and handle inbound replies

Webhooks fire on delivery confirmation, bounces, and inbound replies. Each inbound webhook payload includes the thread_id already resolved, so your agent can process the next message in the conversation reactively without polling check_inbox.


Implementation

Fetch a full email thread
python
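import os

import multimail

# Read the API key from the environment; Python does not expand "$MULTIMAIL_API_KEY" inside a string
client = multimail.Client(api_key=os.environ["MULTIMAIL_API_KEY"])

# Fetch the full thread, not just the latest message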
thread = client.get_thread(thread_id="thrd_01HXYZ1234ABCD")

print(f"Subject: {thread.subject}")
print(f"Messages in thread: {len(thread.messages)}")

for msg in thread.messages:
    print(f"  [{msg.sent_at}] From: {msg.from_address}")
    print(f"  Message-ID: {msg.message_id}")
    print(f"  Preview: {msg.body_text[:120]}...")

# The last message is what we will reply to
latest = thread.messages[-1]
print(f"Replying to message: {latest.message_id}")

Retrieve all messages in a conversation by thread ID. The response includes ordered messages with bodies, participant list, and the Message-ID needed to construct a reply.

Reply with thread context preserved
python
import os

import multimail
from anthropic import Anthropic

# Read the API key from the environment; Python does not expand "$MULTIMAIL_API_KEY" inside a string
client = multimail.Client(api_key=os.environ["MULTIMAIL_API_KEY"])
anthropic = Anthropic()

# Load the thread
thread = client.get_thread(thread_id="thrd_01HXYZ1234ABCD")

# Build conversation history for the model
messages = []
for msg in thread.messages:
    role = "assistant" if msg.from_address == "[email protected]" else "user"
    messages.append({"role": role, "content": msg.body_text})

# Add the drafting instruction
messages.append({
    "role": "user",
    "content": "Draft a reply covering the deployment details and remaining action items from this thread."
})

# Draft the reply with full thread context
response = anthropic.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are an assistant drafting email replies. Be concise and address only what is new.",
    messages=messages
)
reply_body = response.content[0].text

# Send; threading headers are set automatically
result = client.reply_email(
    thread_id=thread.thread_id,
    reply_to_message_id=thread.messages[-1].message_id,
    from_address="[email protected]",
    body_text=reply_body,
)

print(f"Sent: {result.message_id}")
print(f"Thread position: {result.position_in_thread} of {result.thread_length}")

Send a reply anchored to the correct thread. MultiMail sets In-Reply-To and References headers automatically based on the thread_id and reply_to_message_id you provide — no raw header construction in your agent.
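The role mapping in the example above collapses every non-agent sender into a single "user" role, so in a thread with several participants the model cannot tell who said what. One possible refinement, sketched below, labels each turn with its sender and timestamp and merges consecutive messages from the same side into one turn. The ThreadMessage dataclass is a stand-in for the SDK's message objects, using the field names shown in the examples (from_address, sent_at, body_text); the function name build_history, the separator, and the sample addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThreadMessage:
    """Stand-in for an SDK message; field names follow the examples above."""
    from_address: str
    sent_at: str
    body_text: str


def build_history(messages, agent_address):
    """Convert thread messages into chat turns labelled with sender and time."""
    turns = []
    for msg in messages:
        role = "assistant" if msg.from_address == agent_address else "user"
        text = f"From: {msg.from_address}\nSent: {msg.sent_at}\n\n{msg.body_text}"
        if turns and turns[-1]["role"] == role:
            # Merge consecutive messages from the same side into one turn.
            turns[-1]["content"] += "\n\n---\n\n" + text
        else:
            turns.append({"role": role, "content": text})
    return turns


# Hypothetical three-message thread with two external participants.
thread_messages = [
    ThreadMessage("alice@example.com", "2025-01-06T09:00:00Z", "Can we deploy Friday?"),
    ThreadMessage("bob@example.com", "2025-01-06T09:30:00Z", "Friday works for ops."),
    ThreadMessage("agent@example.com", "2025-01-06T10:00:00Z", "Confirmed for Friday."),
]
history = build_history(thread_messages, agent_address="agent@example.com")
```

Here the two external messages become a single user turn that still names both senders, followed by one assistant turn.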

REST API — get_thread and reply_email
bash
# Step 1: Fetch the full thread
curl -s https://api.multimail.dev/v1/threads/thrd_01HXYZ1234ABCD \
  -H "Authorization: Bearer $MULTIMAIL_API_KEY" | jq .

# Response shape:
# {
#   "thread_id": "thrd_01HXYZ1234ABCD",
#   "subject": "Re: Next steps on your deployment",
#   "messages": [
#     { "message_id": "<[email protected]>", "from_address": "[email protected]", "body_text": "..." },
#     { "message_id": "<[email protected]>", "from_address": "[email protected]", "body_text": "..." }
#   ],
#   "participant_addresses": ["[email protected]", "[email protected]"]
# }

# Step 2: Reply — MultiMail sets In-Reply-To and References automatically
curl -s -X POST https://api.multimail.dev/v1/emails/reply \
  -H "Authorization: Bearer $MULTIMAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: reply-$(uuidgen)" \
  -d '{
    "thread_id": "thrd_01HXYZ1234ABCD",
    "reply_to_message_id": "<[email protected]>",
    "from_address": "[email protected]",
    "body_text": "Following up on the prior thread, here are the deployment details we discussed and the remaining action items."
  }' | jq .

Direct REST calls for teams not using the Python SDK. The reply endpoint requires an Idempotency-Key to prevent duplicate sends on retry.
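An idempotency key only prevents duplicates when a retry carries the same key as the original attempt. In the curl example, $(uuidgen) runs on each invocation, so rerunning the command produces a fresh key. The sketch below shows one way to generate the key once and reuse it across attempts. The send callable is a hypothetical transport function that performs the actual POST; the retry count, backoff values, and the choice to retry only on ConnectionError are illustrative assumptions, not documented MultiMail behavior.

```python
import time
import uuid


def send_with_retries(send, payload, attempts=3, backoff_seconds=0.5):
    """Call send(payload, headers) until it succeeds, reusing one Idempotency-Key."""
    # Generate the key once, outside the retry loop, so every attempt
    # is recognisable as the same logical send.
    headers = {"Idempotency-Key": f"reply-{uuid.uuid4()}"}
    for attempt in range(1, attempts + 1):
        try:
            return send(payload, headers)
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            # Exponential backoff before the next attempt.
            time.sleep(backoff_seconds * 2 ** (attempt - 1))
```

In use, send would wrap whatever HTTP client you already have and pass the headers through to the reply endpoint alongside the Authorization header.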

MCP server — get_thread and reply_email in Claude Desktop
text
// Tool: get_thread
// Available in Claude Desktop, Cursor, Windsurf, and any MCP client

Tool call: get_thread
Arguments:
  thread_id: "thrd_01HXYZ1234ABCD"

Returns:
  subject: "Re: Next steps on your deployment"
  messages: [ ... ordered list of all messages with bodies and Message-IDs ... ]
  participant_addresses: ["[email protected]", "[email protected]"]

---

// Tool: reply_email
// MultiMail sets In-Reply-To and References — do not construct headers manually

Tool call: reply_email
Arguments:
  thread_id: "thrd_01HXYZ1234ABCD"
  reply_to_message_id: "<[email protected]>"
  from_address: "[email protected]"
  body_text: "Following up on the prior thread, here are the deployment details we discussed and the remaining action items."

Returns:
  message_id: "<[email protected]>"
  position_in_thread: 3
  headers_set: ["In-Reply-To", "References"]

For agents using the MultiMail MCP server, get_thread and reply_email are available as named tools. Header management is handled server-side, identical to the REST and SDK paths.

Webhook handler — process inbound reply with thread_id resolved
python
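import os

from fastapi import FastAPI, Request
import multimail

app = FastAPI()
# Read the API key from the environment; Python does not expand "$MULTIMAIL_API_KEY" inside a string
client = multimail.Client(api_key=os.environ["MULTIMAIL_API_KEY"])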

@app.post("/webhooks/multimail")
async def handle_inbound(request: Request):
    payload = await request.json()

    # MultiMail resolves thread_id on inbound, so no header parsing is needed
    if payload["event"] != "email.inbound":
        return {"ok": True}

    thread_id = payload["thread_id"]
    message_id = payload["message_id"]

    # Load the full thread to give the agent complete context
    thread = client.get_thread(thread_id=thread_id)

    # your_agent is a placeholder for your own drafting logic
    reply_body = your_agent.process_thread(thread.messages)

    # Reply anchored to the triggering message
    client.reply_email(
        thread_id=thread_id,
        reply_to_message_id=message_id,
        from_address="[email protected]",
        body_text=reply_body,
    )

    return {"ok": True}

Inbound email webhooks include the thread_id already matched. Use this to trigger your agent reactively when a customer replies, without polling check_inbox.
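The handler above sends a reply every time it is invoked. If the same inbound event were ever delivered more than once (many webhook systems redeliver after a timeout; whether MultiMail does is an assumption here), the agent would reply twice. A minimal guard, sketched below, remembers which message_id values have already been handled. The class name InboundDeduper is hypothetical, the payload fields follow the handler above, and the in-memory set is for illustration only; a multi-process deployment would need a shared store.

```python
class InboundDeduper:
    """Remembers which inbound message_ids have already been handled."""

    def __init__(self):
        self._seen = set()

    def should_process(self, payload):
        """Return True only for the first email.inbound delivery of a message."""
        if payload.get("event") != "email.inbound":
            return False
        message_id = payload.get("message_id")
        if message_id is None or message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

A handler would call should_process(payload) before loading the thread and return early when it yields False.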


What you get

RFC 2822 threading headers set automatically

MultiMail computes and attaches In-Reply-To and References headers on every reply_email call. Replies group correctly in Gmail, Outlook, Apple Mail, and any RFC 2822-compliant client without any header construction in your agent code.

Full conversation history in a single API call

get_thread returns every message in the thread ordered chronologically, with bodies, timestamps, and sender addresses. Your LLM receives the complete context it needs to write a coherent reply — not just the most recent message.

Stable thread IDs across inbound and outbound messages

Thread identity is tracked server-side. Whether the next message arrives from the customer or is sent by your agent, it is appended to the same thread object under the same thread_id. No client-side state or header parsing required.

Reactive processing via webhooks

Inbound reply webhooks include the thread_id already resolved. For high-volume deployments handling thousands of conversations, your agent processes new replies reactively rather than polling check_inbox on an interval.

Works across all MCP clients and the REST API

get_thread and reply_email are available as MCP tools (Claude Desktop, Cursor, Windsurf), as Python SDK methods, and as direct REST endpoints. Threading header logic is handled server-side identically across all three paths.


Recommended oversight mode

Recommended: monitored

Thread-aware reply agents handle high volumes of conversational email where per-message approval creates unacceptable latency. Monitored mode lets agents send autonomously while routing every outbound message to a human reviewer who can intervene if the agent drifts off-context or misreads thread history. Start with gated_send during the first week of deployment to validate that thread context is being used correctly across varied thread lengths, then promote to monitored once reply quality is consistent.

Common questions

Does MultiMail thread emails sent from external clients correctly?

Yes. When an inbound email arrives with a References or In-Reply-To header matching an existing thread, MultiMail appends it to that thread automatically. The thread_id in the inbound webhook payload is already resolved, so your agent does not need to parse headers to identify the conversation.

What happens if a customer reply accidentally starts a new thread?

If an inbound message has no matching thread reference, MultiMail creates a new thread for it. You can flag the new thread for human review by calling tag_email with a label like needs-review, or implement logic to detect suspiciously short threads that may be orphaned replies.

How many messages can a thread hold?

There is no hard cap on thread length. get_thread returns all messages in the thread. For long threads, consider passing only the most recent N messages to your LLM to stay within context window limits. The messages array is ordered chronologically, so you can slice from the end.

Can I use get_thread with LangChain or CrewAI?

Yes. The multimail-sdk exposes get_thread and reply_email as callables that wrap cleanly as LangChain tools or CrewAI task actions. Pass the thread.messages list directly as the agent's conversation memory.

What oversight mode should I use while testing thread handling?

Use gated_send during initial development. This lets you inspect each reply before it reaches the recipient, making it easy to catch cases where the agent missed context from earlier in the thread or repeated already-resolved items. Promote to monitored once quality is consistent across threads of five or more messages.

Does the MCP server set threading headers correctly?

Yes. The reply_email MCP tool accepts thread_id and reply_to_message_id and delegates all header construction to the MultiMail API. You do not specify raw headers in MCP tool arguments. Threading is managed server-side, identical to the REST and SDK paths.

How do I find the thread_id for an existing email?

Inbound email webhooks always include thread_id. For emails sent via send_email, the response includes the thread_id assigned to the new conversation. You can also call check_inbox and inspect the thread_id field on each message in the results.
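For the long-thread case discussed above, slicing from the end of the messages array can be combined with a rough size budget. The sketch below keeps the newest messages that fit both a message-count limit and a character limit, and always retains at least the latest message. The function name recent_messages and both default limits are hypothetical. Characters are used as a crude proxy for tokens, and the SimpleNamespace objects stand in for SDK messages with a body_text field.

```python
from types import SimpleNamespace


def recent_messages(messages, max_messages=10, max_chars=20_000):
    """Return the newest messages that fit both limits, oldest first."""
    kept = []
    used = 0
    # Walk backwards from the newest message and stop at either limit.
    for msg in reversed(messages):
        size = len(msg.body_text)
        if len(kept) >= max_messages or (kept and used + size > max_chars):
            break
        kept.append(msg)
        used += size
    kept.reverse()  # restore chronological order
    return kept


# Hypothetical 50-message thread.
thread_messages = [SimpleNamespace(body_text=f"message {i}") for i in range(50)]
window = recent_messages(thread_messages, max_messages=5)
```

Dropping older messages can hide earlier commitments, so a summary of the omitted part of the thread may be worth passing alongside the window.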
