Email infrastructure built for AI agents, not adapted from human email

MultiMail gives your engineering team purpose-built primitives for agent email: identity, graduated oversight, and deployment patterns that fit API and MCP-based architectures.


Why this matters

Generic email APIs were designed for humans sending transactional messages. Adapting them for AI agents means bolting on approval workflows, building your own audit trails, and hoping the agent does not send something it should not. CTOs evaluating agent email infrastructure need to understand what the system actually guarantees — not what a marketing page implies. That means knowing where identity is enforced, where approval checkpoints sit in the call graph, how the formal security model was verified, and how to roll out gradually without committing your entire email stack on day one.


How MultiMail solves this

MultiMail is a purpose-built email API for AI agents. Identity is scoped per mailbox, not per account — each agent gets its own mailbox and credentials. Oversight is a first-class primitive: every outbound action passes through a configurable approval checkpoint before delivery. The security model covering identity, oversight, and authorization is formally verified in Lean 4 and the proofs are published and checked in CI. Integration follows standard patterns: REST API, Python SDK, or MCP server depending on your agent framework. You can start with read-only access, graduate to gated sends requiring human approval, and move to monitored or autonomous operation once your team has established confidence in the agent's behavior.

1

Review the architecture

MultiMail's API surface is a set of discrete, auditable operations: send_email, reply_email, check_inbox, read_email, get_thread, tag_email, decide_email, manage_contacts, create_mailbox, list_pending, and cancel_message. Each operation is separately authorized. Agents authenticate with per-mailbox Bearer tokens (mm_live_... for production, mm_test_... for integration tests), and credentials are never shared across mailboxes.
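
Because the environment is encoded in the token prefix, a deployment can check which kind of token it holds before doing anything else. The following is a minimal sketch that assumes only the mm_live_ and mm_test_ prefixes named above; the function names are hypothetical and not part of any MultiMail SDK.

```python
def token_environment(token: str) -> str:
    """Classify a token by the prefixes described above."""
    if token.startswith("mm_live_"):
        return "production"
    if token.startswith("mm_test_"):
        return "test"
    raise ValueError("unrecognized token prefix")


def assert_test_token(token: str) -> None:
    """Fail fast if a CI job is about to run with a non-test token."""
    if token_environment(token) != "test":
        raise RuntimeError("refusing to run integration tests with a non-test token")
```

Calling assert_test_token at the top of a test suite is one way to keep a production credential from leaking into CI by accident.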

2

Evaluate the oversight controls

Every outbound action can be configured to require human approval before delivery. The oversight_mode field on each mailbox controls this: gated_send (reads are autonomous, sends require approval), gated_all (all actions require approval), monitored (autonomous with human notifications), or autonomous. Pending approvals are queryable via list_pending and cancellable via cancel_message. Approval checkpoints are enforced server-side — they cannot be bypassed by the agent.
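
The four modes can be read as a small decision table. The sketch below models that table locally so the behavior is easy to reason about; it is not how the server is implemented. Treating send_email and reply_email as the outbound actions is an assumption, and the names OversightMode, OUTBOUND_ACTIONS, and requires_approval are hypothetical.

```python
from enum import Enum


class OversightMode(str, Enum):
    GATED_SEND = "gated_send"
    GATED_ALL = "gated_all"
    MONITORED = "monitored"
    AUTONOMOUS = "autonomous"


# Assumption: these are the operations that count as "outbound".
OUTBOUND_ACTIONS = {"send_email", "reply_email"}


def requires_approval(mode: OversightMode, action: str) -> bool:
    """Return True if the action waits for a human under the given mode."""
    if mode is OversightMode.GATED_ALL:
        return True  # every action is gated
    if mode is OversightMode.GATED_SEND:
        return action in OUTBOUND_ACTIONS  # reads stay autonomous
    return False  # monitored and autonomous do not block on approval
```

A table like this is useful in client-side tests and dashboards, but the authoritative check remains the server-side checkpoint.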

3

Test the integration

Use mm_test_... tokens to run against the live API without delivering real email. Test tokens have identical behavior to production tokens — same approval flows, same audit logs, same error responses — but outbound messages are intercepted at the delivery layer. Integration tests can cover the full send → approval → deliver path without a live inbox.

4

Deploy a pilot

Create a dedicated mailbox for the pilot agent via create_mailbox, set oversight_mode to gated_send, and assign one human approver. Route a narrow slice of agent email volume through that mailbox. Monitor the approval queue via list_pending and check the audit log via the REST API to verify the agent is operating within expected parameters.

5

Expand usage

Once the pilot establishes a behavioral baseline, create additional mailboxes per agent role, adjust oversight_mode per mailbox based on observed trust levels, and wire up webhooks for inbound email, delivery status, and approval events. The MCP server exposes all 50 tools for MCP-compatible clients (Claude Desktop, Cursor, Windsurf) without requiring direct API integration code.
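
When you add mailboxes per agent role, it can help to keep the plan as data and generate the create_mailbox request bodies from it. The sketch below builds payloads with the same four fields used in the curl example under Implementation. The AgentRole class, the example.com addresses in the usage note, and the rule that gated modes need at least one approver are illustrative assumptions, not documented API behavior.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

VALID_MODES = {"gated_send", "gated_all", "monitored", "autonomous"}
GATED_MODES = {"gated_send", "gated_all"}


@dataclass(frozen=True)
class AgentRole:
    address: str
    display_name: str
    oversight_mode: str
    approver_emails: Tuple[str, ...] = ()


def mailbox_payload(role: AgentRole) -> Dict[str, object]:
    """Build a create_mailbox request body for one agent role."""
    if role.oversight_mode not in VALID_MODES:
        raise ValueError(f"unknown oversight_mode: {role.oversight_mode!r}")
    # Local policy (an assumption): a gated mailbox with no approver would stall.
    if role.oversight_mode in GATED_MODES and not role.approver_emails:
        raise ValueError("gated modes need at least one approver")
    return {
        "address": role.address,
        "display_name": role.display_name,
        "oversight_mode": role.oversight_mode,
        "approver_emails": list(role.approver_emails),
    }
```

For example, mailbox_payload(AgentRole("procurement@example.com", "Procurement Agent", "gated_send", ("ops@example.com",))) returns a dictionary that can be sent as the JSON body of the mailbox creation request.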


Implementation

Create a scoped agent mailbox
bash
curl -X POST https://api.multimail.dev/v1/mailboxes \
  -H "Authorization: Bearer $MULTIMAIL_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "address": "[email protected]",
    "display_name": "Procurement Agent",
    "oversight_mode": "gated_send",
    "approver_emails": ["[email protected]"]
  }'

Provision a mailbox for a specific agent with gated_send oversight. Each agent should have its own mailbox and token — never share credentials across agents.

Send with approval checkpoint
python
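import os

import multimail

# Read the key from the environment; the literal string "$MULTIMAIL_API_KEY" is not expanded in Python
client = multimail.Client(api_key=os.environ["MULTIMAIL_API_KEY"])

# Returns immediately with status='pending' when oversight_mode='gated_send'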
result = client.send_email(
    from_address="[email protected]",
    to=["[email protected]"],
    subject="PO #8821 — amended delivery schedule",
    body="Please confirm the updated delivery date of 2026-05-01 for order #8821.",
    metadata={"workflow": "procurement", "po_number": "8821"}
)

print(result.status)        # 'pending'
print(result.message_id)    # 'msg_01JXK...'
print(result.approval_url)  # approval link sent to the assigned human approver

Under gated_send, send_email returns a pending message ID rather than delivering immediately. The message sits in the approval queue until a human approves or rejects it, the agent cancels it, or it expires.

Query and manage the approval queue
python
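import os

import multimail

# Read the key from the environment; the literal string "$MULTIMAIL_API_KEY" is not expanded in Python
client = multimail.Client(api_key=os.environ["MULTIMAIL_API_KEY"])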

pending = client.list_pending(
    mailbox="[email protected]",
    limit=50
)

for msg in pending.messages:
    print(f"{msg.message_id} | to={msg.to[0]} | {msg.subject} | queued {msg.created_at}")

# Cancel a message before it is approved
client.cancel_message(message_id="msg_01JXK_specific_id")

list_pending returns the messages awaiting human approval for a mailbox, up to the requested limit. It is useful for monitoring agent activity and building internal dashboards.
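
One thing a dashboard might surface is which pending messages are about to lapse, since pending messages expire after a configurable TTL (72 hours by default, per Common questions below). The sketch below works on plain dictionaries rather than SDK objects. It assumes each entry carries message_id and a timezone-aware created_at, and the function name and the 12-hour warning window are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

DEFAULT_TTL = timedelta(hours=72)  # default from this page; configurable per deployment


def expiring_soon(
    messages: List[Dict[str, object]],
    now: datetime,
    ttl: timedelta = DEFAULT_TTL,
    warn_within: timedelta = timedelta(hours=12),
) -> List[Tuple[str, timedelta]]:
    """Return (message_id, time_left) for pending messages close to expiry, soonest first."""
    flagged = []
    for msg in messages:
        time_left = msg["created_at"] + ttl - now
        # Skip messages already past the TTL and those with plenty of time left.
        if timedelta(0) < time_left <= warn_within:
            flagged.append((msg["message_id"], time_left))
    return sorted(flagged, key=lambda item: item[1])
```

Feeding this the fields printed in the loop above gives approvers a short list of messages that need a decision first.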

Integration tests with test tokens
python
import multimail
import pytest

@pytest.fixture
def client():
    return multimail.Client(api_key="mm_test_your_test_token")

def test_gated_send_creates_pending(client):
    result = client.send_email(
        from_address="[email protected]",
        to=["[email protected]"],
        subject="Test: contract renewal",
        body="Renewing contract #4492 for another 12 months."
    )
    assert result.status == "pending"
    assert result.message_id.startswith("msg_")

def test_cancel_pending_message(client):
    msg = client.send_email(
        from_address="[email protected]",
        to=["[email protected]"],
        subject="Cancellation test",
        body="This message will be cancelled before delivery."
    )
    cancelled = client.cancel_message(message_id=msg.message_id)
    assert cancelled.status == "cancelled"

mm_test_... tokens run the full API stack, including approval flows and audit logging, but intercept messages at delivery. Use them in CI: behavior matches production, and no real email is sent.

Inbound webhook handler
python
from fastapi import FastAPI, Request, HTTPException
import hmac
import hashlib

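app = FastAPI()
WEBHOOK_SECRET = "your_webhook_signing_secret"  # placeholder; load from a secret store in practice


async def route_to_agent(mailbox: str, message_id: str) -> None:
    """Placeholder so the handler runs; replace with your agent dispatch logic."""
    ...
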
@app.post("/webhooks/multimail")
async def handle_email_event(request: Request):
    body = await request.body()
    sig = request.headers.get("X-MultiMail-Signature", "")
    expected = hmac.new(
        WEBHOOK_SECRET.encode(),
        body,
        hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise HTTPException(status_code=401, detail="invalid signature")

    event = await request.json()
    if event["type"] == "email.received":
        mailbox = event["data"]["mailbox"]
        message_id = event["data"]["message_id"]
        await route_to_agent(mailbox, message_id)

    return {"ok": True}

Configure a webhook endpoint to receive inbound email events. Each event includes the full message payload, mailbox address, and a signature for verification.
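
To exercise a handler like the one above without waiting for a real event, you can sign a payload locally with the same HMAC-SHA256 hex scheme the handler verifies. This sketch mirrors only what the handler checks; the exact signing scheme MultiMail uses on the wire is an assumption here, and the helper names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_payload(secret: str, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 digest the handler above compares against."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign_payload(secret, body), signature)


def build_test_event(mailbox: str, message_id: str) -> bytes:
    """Serialize an email.received event shaped like the one the handler reads."""
    event = {
        "type": "email.received",
        "data": {"mailbox": mailbox, "message_id": message_id},
    }
    return json.dumps(event, separators=(",", ":")).encode()
```

In a test, post the bytes from build_test_event with the X-MultiMail-Signature header set to sign_payload(secret, body). Sign the exact bytes you send, because any re-serialization changes the digest.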

MCP server configuration for Claude Desktop
json
{
  "mcpServers": {
    "multimail": {
      "command": "npx",
      "args": ["-y", "@multimail/mcp-server"],
      "env": {
        "MULTIMAIL_API_KEY": "$MULTIMAIL_API_KEY",
        "MULTIMAIL_MAILBOX": "[email protected]"
      }
    }
  }
}

The MCP server exposes the full MultiMail tool set, so you do not need to write API integration code. The same oversight_mode configured on the mailbox applies regardless of whether the agent uses REST, SDK, or MCP.


What you get

Identity is scoped per agent, not per account

Each agent mailbox has its own credentials and authorization scope. An agent with access to [email protected] cannot read or send from [email protected]. Token compromise is contained to a single mailbox.

Oversight is enforced server-side

Approval checkpoints are implemented in the API, not in client code the agent controls. An agent cannot bypass gated_send by modifying its own requests — the server rejects unapproved outbound messages regardless of what the agent sends.

Formal verification of the security model

The identity, oversight, and authorization models are proven correct in Lean 4. The proofs cover core invariants: a gated agent cannot deliver without approval, identity cannot be forged across mailboxes, and oversight mode changes require account-level credentials rather than the agent's scoped token. Proofs are published and checked in CI on every commit.
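
To give a feel for what an invariant such as "a gated agent cannot deliver without approval" looks like as a theorem, here is a toy Lean 4 model. It is a deliberately simplified illustration, not an excerpt from the published proofs. The types Mode and Outbound, the function canDeliver, and the theorem name are all hypothetical.

```lean
-- Toy model: the four oversight modes described on this page.
inductive Mode where
  | gatedSend
  | gatedAll
  | monitored
  | autonomous

-- An outbound message together with its approval state.
structure Outbound where
  mode : Mode
  approved : Bool

-- Delivery rule: gated modes deliver only once approved.
def canDeliver (m : Outbound) : Bool :=
  match m.mode with
  | .gatedSend => m.approved
  | .gatedAll => m.approved
  | .monitored => true
  | .autonomous => true

-- If a gated_send message is deliverable, it must have been approved.
theorem gated_send_requires_approval (m : Outbound)
    (hmode : m.mode = .gatedSend) (hdeliver : canDeliver m = true) :
    m.approved = true := by
  simpa [canDeliver, hmode] using hdeliver
```

The real proofs would have to model tokens, mailboxes, and the request path. The point of the sketch is only that the guarantee is stated as a theorem a proof checker accepts or rejects.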

Graduated rollout without re-architecting

Start a pilot agent on gated_send with one human approver. Move it to monitored once you have behavioral confidence. Expand to autonomous when your team is ready. oversight_mode is a per-mailbox field — changing it does not require code changes or redeployment.
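
If you want promotions to happen one step at a time, the progression above can be encoded as a small ladder that your own tooling consults before changing a mailbox. This is a local policy sketch under that assumption; MultiMail does not necessarily enforce any ordering, and GRADUATION_ORDER and both function names are hypothetical.

```python
from typing import Optional

# The progression described above, from most to least supervised.
GRADUATION_ORDER = ["gated_send", "monitored", "autonomous"]


def next_mode(current: str) -> Optional[str]:
    """Return the next mode on the ladder, or None if already at the top."""
    idx = GRADUATION_ORDER.index(current)  # raises ValueError for modes off the ladder
    if idx + 1 < len(GRADUATION_ORDER):
        return GRADUATION_ORDER[idx + 1]
    return None


def is_single_step_promotion(old: str, new: str) -> bool:
    """True only when new is exactly one rung above old."""
    return next_mode(old) == new
```

A change-management script could refuse any oversight_mode update for which is_single_step_promotion returns False, which forces a monitored period before a mailbox becomes autonomous.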

Test tokens with full API fidelity

mm_test_... tokens run the complete API stack including approval flows, audit logging, and webhook delivery, but intercept outbound messages before delivery. Integration tests cover real code paths — not mocked behavior — without touching live inboxes.

No framework lock-in on the agent side

MultiMail exposes a REST API, a Python SDK (multimail-sdk), and an MCP server. Agents built on LangChain, CrewAI, AutoGen, Semantic Kernel, or any MCP-compatible client integrate without framework-specific adapters.


Recommended oversight mode

gated_send
For initial evaluation and pilot deployments, gated_send gives agents autonomous read access to their inbox while requiring human approval before any outbound message is delivered. This lets you observe real agent behavior in a live email environment without risk of unintended external communications. Once the pilot establishes a behavioral baseline and your team has confidence in the agent's judgment, individual mailboxes can graduate to monitored or autonomous without changing the integration code.

Common questions

Where is the Lean 4 source for the formal proofs?

The proofs live in the Proofs/ directory of the MultiMail repository and are compiled and verified in CI on every commit. They cover core security invariants: a mailbox in gated_send mode cannot deliver outbound messages without explicit approval, tokens cannot be used across mailbox boundaries, and oversight mode changes require account-level credentials. You can inspect the theorems directly rather than relying on a compliance attestation.

How does agent identity work when multiple agents share a domain?

Each agent gets its own mailbox address and its own API token. A token issued for [email protected] cannot send from or read the inbox of [email protected]. Authorization is enforced at the API layer on every request; there is no shared session or ambient identity that spans mailboxes.

Can an agent modify its own oversight mode?

No. oversight_mode is a mailbox configuration field controlled by account-level credentials, not by the agent's scoped token. An agent token authorizes send and receive operations on its assigned mailbox only, so it cannot call mailbox configuration endpoints. Oversight mode changes require an account owner credential.

What happens to a pending message if no one approves it?

Pending messages expire after a configurable TTL (default 72 hours). You can also cancel a pending message programmatically via cancel_message using the message_id returned by the original send_email call. The agent can query its own pending queue via list_pending at any time to check current status.

How do we audit what the agent has done?

Every API call (inbound reads, outbound sends, approval events, cancellations) is recorded in the audit log with timestamp, mailbox, action type, and outcome. The log is queryable via the REST API. Webhooks deliver real-time events for inbound email, delivery status changes, and approval or rejection events, so you can route events into your existing observability stack.

Does the MCP server apply the same oversight controls as the REST API?

Yes. The MCP server is an adapter over the same REST API. When an MCP-connected agent calls the send_email tool, it goes through the same server-side oversight checkpoint as a direct API call. The oversight_mode configured on the mailbox applies regardless of whether the agent is using the REST API, Python SDK, or MCP server.

How do we handle data residency or retention requirements?

MultiMail stores email data on Cloudflare's global network with configurable region pinning. Retention policies are configurable per mailbox. Audit logs are append-only and exportable. If your compliance requirements mandate specific certifications such as SOC 2 or a HIPAA BAA, contact the MultiMail team to discuss your requirements before making an architectural commitment.
