AI Email Infrastructure for CTOs

Why this matters

Generic email APIs were designed for humans sending transactional messages. Adapting them for AI agents means bolting on approval workflows, building your own audit trails, and hoping the agent does not send something it should not. CTOs evaluating agent email infrastructure need to understand what the system actually guarantees — not what a marketing page implies. That means knowing where identity is enforced, where approval checkpoints sit in the call graph, how the formal security model was verified, and how to roll out gradually without committing your entire email stack on day one.

How MultiMail solves this

MultiMail is a purpose-built email API for AI agents. Identity is scoped per mailbox, not per account — each agent gets its own mailbox and credentials. Oversight is a first-class primitive: every outbound action passes through a configurable approval checkpoint before delivery. The security model covering identity, oversight, and authorization is formally verified in Lean 4 and the proofs are published and checked in CI. Integration follows standard patterns: call the REST API directly from any HTTP client or use the MCP server depending on your agent framework. You can start with read-only access, graduate to gated sends requiring human approval, and move to monitored or autonomous operation once your team has established confidence in the agent's behavior.

Review the architecture

MultiMail's API surface is a set of discrete, auditable operations:

POST /v1/mailboxes/{mailbox_id}/send
POST /v1/mailboxes/{mailbox_id}/reply/{email_id}
GET /v1/mailboxes/{mailbox_id}/emails
GET /v1/mailboxes/{mailbox_id}/emails/{email_id}
GET /v1/mailboxes/{mailbox_id}/threads/{thread_id}
PUT /v1/mailboxes/{mailbox_id}/emails/{email_id}/tags
POST /v1/oversight/decide
POST /v1/contacts
POST /v1/mailboxes
GET /v1/oversight/pending
POST /v1/mailboxes/{mailbox_id}/emails/{email_id}/cancel

Each operation is separately authorized. Agents authenticate with per-mailbox Bearer tokens (mm_live_... for production, mm_test_... for integration tests). No shared credentials across mailboxes.
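
As a rough illustration of the send operation and Bearer authentication described above, the sketch below builds (but does not send) a request with Python's standard library. The base URL, the mailbox id, and the payload fields (to, subject, body) are assumptions made for the example; the page does not state the API host or the request schema.

```python
import json
import urllib.request

# Assumption: placeholder host; the page does not state the real API base URL.
BASE_URL = "https://api.multimail.example"


def build_send_request(token: str, mailbox_id: str, payload: dict) -> urllib.request.Request:
    """Build (without sending) a POST /v1/mailboxes/{mailbox_id}/send request."""
    # The page names two token prefixes: mm_live_ and mm_test_.
    if not token.startswith(("mm_live_", "mm_test_")):
        raise ValueError("expected a token starting with mm_live_ or mm_test_")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/mailboxes/{mailbox_id}/send",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical mailbox id and payload fields, for illustration only.
req = build_send_request(
    "mm_test_example",
    "mbx_123",
    {"to": ["cto@example.com"], "subject": "Pilot status", "body": "Draft for review."},
)
```

Passing the request to urllib.request.urlopen (or rebuilding it with requests or httpx) would perform the call; it is left unsent here because the host is a placeholder.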

Evaluate the oversight controls

Every outbound action can be configured to require human approval before delivery. The oversight_mode field on each mailbox controls this: gated_send (reads are autonomous, sends require approval), gated_all (all actions require approval), monitored (autonomous with human notifications), or autonomous. Pending approvals are queryable via GET /v1/oversight/pending and a still-pending send is cancellable via POST /v1/mailboxes/{mailbox_id}/emails/{email_id}/cancel. Approval checkpoints are enforced server-side — they cannot be bypassed by the agent.
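
The four modes can be summarized as a small decision table. The sketch below is a client-side mirror for reasoning about the modes, assuming only two action kinds ("read" and "send"); it is not how the server implements the checkpoint, and enforcement stays server-side as described above.

```python
from enum import Enum


class OversightMode(str, Enum):
    GATED_SEND = "gated_send"
    GATED_ALL = "gated_all"
    MONITORED = "monitored"
    AUTONOMOUS = "autonomous"


def requires_approval(mode: OversightMode, action: str) -> bool:
    """Return True if `action` ("read" or "send") waits for a human under `mode`."""
    if action not in ("read", "send"):
        raise ValueError(f"unknown action: {action}")
    if mode is OversightMode.GATED_ALL:
        return True  # all actions require approval
    if mode is OversightMode.GATED_SEND:
        return action == "send"  # reads are autonomous, sends are gated
    return False  # monitored and autonomous proceed without an approval step
```

A mailbox's oversight_mode string maps directly onto the enum, for example OversightMode("gated_send").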

Test the integration

Use mm_test_... tokens to run against the live API without delivering real email. Test tokens behave the same as production tokens (same approval flows, same audit logs, same error responses), but outbound messages are intercepted at the delivery layer. Integration tests can therefore cover the full send → approval → deliver path without a live inbox.
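
One way to keep a test suite from accidentally using a production credential is a prefix guard. This is a minimal sketch that relies only on the mm_test_ prefix named above; the function names are hypothetical.

```python
def is_test_token(token: str) -> bool:
    """True when the token carries the mm_test_ prefix."""
    return token.startswith("mm_test_")


def require_test_token(token: str) -> str:
    """Return the token unchanged, or raise if it is not a test token."""
    if not is_test_token(token):
        raise RuntimeError("refusing to run integration tests without an mm_test_ token")
    return token
```

Calling require_test_token in a test fixture fails fast before any request is made with an mm_live_ token.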

Deploy a pilot

Create a dedicated mailbox for the pilot agent via POST /v1/mailboxes, set oversight_mode to gated_send, and designate a human approver. Route a narrow slice of agent email volume through that mailbox. Monitor the approval queue via GET /v1/oversight/pending and check the audit log via GET /v1/audit-log to verify the agent is operating within expected parameters.
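
The pilot steps can be sketched as a short call plan. The paths are the ones named above; the "address" body field and the example address are assumptions, and approver assignment is left out because the page does not say how it is configured.

```python
import json


def create_pilot_mailbox_call(address: str) -> tuple:
    """Describe the mailbox-creation call as (method, path, JSON body)."""
    # "address" is a hypothetical field name; oversight_mode is named in the text.
    body = {"address": address, "oversight_mode": "gated_send"}
    return ("POST", "/v1/mailboxes", json.dumps(body, sort_keys=True))


# Read-only calls to poll while the pilot runs.
MONITORING_CALLS = [
    ("GET", "/v1/oversight/pending"),
    ("GET", "/v1/audit-log"),
]

method, path, body = create_pilot_mailbox_call("pilot-agent@example.com")
```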

Expand usage

Once the pilot establishes a behavioral baseline, create additional mailboxes per agent role, adjust oversight_mode per mailbox based on observed trust levels, and wire up webhooks for inbound email, delivery status, and approval events. The MCP server exposes all 51 tools for MCP-compatible clients (Claude Desktop, Cursor, Windsurf) without requiring direct API integration code.
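
A webhook receiver for the three event families above might route on an event type field. In this sketch the event type strings and the payload keys ("type", "mailbox_id", "status", "decision") are all hypothetical, since the page does not document the webhook schema.

```python
from typing import Callable, Dict


def on_inbound(event: dict) -> str:
    return f"inbound email for {event.get('mailbox_id', 'unknown')}"


def on_delivery_status(event: dict) -> str:
    return f"delivery status: {event.get('status', 'unknown')}"


def on_approval(event: dict) -> str:
    return f"approval decision: {event.get('decision', 'unknown')}"


# Hypothetical event type names, one per event family mentioned in the text.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "email.received": on_inbound,
    "email.delivery_status": on_delivery_status,
    "oversight.decision": on_approval,
}


def dispatch(event: dict) -> str:
    """Route a decoded webhook payload to its handler; ignore unknown types."""
    handler = HANDLERS.get(event.get("type", ""))
    if handler is None:
        return "ignored"
    return handler(event)
```

Returning a string keeps the sketch easy to test; a real receiver would forward the event to a queue or observability pipeline instead.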

Try it with your agent

Pick your platform, copy the prompt, and paste it to your AI agent — it sets up MultiMail and builds the whole flow. Nothing to fill in.

1. Get MultiMail ready: read https://multimail.dev/llms.txt, connect the MCP server, create a free inbox, and set up a verified sender.
2. In GitHub, use repository webhooks for issues, pull_request, and issue_comment events, and watch for labels such as ai-email-eval, security-review, or rollout-ready on the repositories where agent email infrastructure is being evaluated.
3. When an event matches, draft a CTO-ready evaluation email that explains the proposed agent workflow, identity boundaries, required approval checkpoints, audit expectations, rollout stage, and any unresolved security or compliance questions, using the GitHub issue, pull request, comments, and linked files as source material.
4. Use a gradual cadence: send an initial evaluation brief, schedule a follow-up after engineering review, and send a rollout-readiness note only when the GitHub thread shows the required owners have approved.
5. Run MultiMail in gated_send mode for every outbound message, and ask me only for GitHub credentials plus the company name, sender identity, recipient list, and brand voice needed to go live.

What you get

Identity is scoped per agent, not per account

Each agent mailbox has its own credentials and authorization scope. An agent with access to one mailbox cannot read or send from another. Token compromise is contained to a single mailbox.

Oversight is enforced server-side

Approval checkpoints are implemented in the API, not in client code the agent controls. An agent cannot bypass gated_send by modifying its own requests — the server rejects unapproved outbound messages regardless of what the agent sends.

Formal verification of the security model

The identity, oversight, and authorization models are proven correct in Lean 4. The proofs cover core invariants: a gated agent cannot deliver without approval, identity cannot be forged across mailboxes, and oversight mode changes require explicit re-authorization. Proofs are published and checked in CI on every commit.
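
To give a feel for what the first invariant looks like as a theorem, here is a toy Lean 4 model. It is an illustration written for this page under simplifying assumptions (a message is just an approval flag, and delivery depends only on the mode), not an excerpt from MultiMail's published proofs; every name in it is hypothetical.

```lean
-- Toy model only: names and structure are illustrative, not MultiMail's actual proofs.
inductive Mode where
  | gatedSend
  | gatedAll
  | monitored
  | autonomous

structure Outbound where
  approved : Bool

/-- Whether the modeled server releases a message for delivery. -/
def canDeliver : Mode → Outbound → Bool
  | .gatedSend, msg => msg.approved
  | .gatedAll, msg => msg.approved
  | .monitored, _ => true
  | .autonomous, _ => true

/-- In gated_send mode, delivery implies the message was approved. -/
theorem gated_send_requires_approval (msg : Outbound)
    (h : canDeliver .gatedSend msg = true) : msg.approved = true := by
  -- `canDeliver .gatedSend msg` unfolds to `msg.approved` by definition.
  exact h
```

In a model this small the proof is trivial; the value of a real proof comes from modeling the actual state transitions, which this sketch does not attempt.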

Graduated rollout without re-architecting

Start a pilot agent on gated_send with one human approver. Move it to monitored once you have behavioral confidence. Expand to autonomous when your team is ready. oversight_mode is a per-mailbox field — changing it does not require code changes or redeployment.
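
The progression above can be written down as an ordered ladder. This is a minimal sketch of the promotion order only; how oversight_mode is actually updated (endpoint and credentials) is not shown on this page, so no API call is included.

```python
# Promotion order described in the text: gated_send, then monitored, then autonomous.
ROLLOUT_LADDER = ["gated_send", "monitored", "autonomous"]


def next_stage(current: str) -> str:
    """Return the next oversight_mode in the ladder, staying put at the top."""
    idx = ROLLOUT_LADDER.index(current)  # raises ValueError for modes outside the ladder
    return ROLLOUT_LADDER[min(idx + 1, len(ROLLOUT_LADDER) - 1)]
```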

Test tokens with full API fidelity

mm_test_... tokens run the complete API stack including approval flows, audit logging, and webhook delivery, but intercept outbound messages before delivery. Integration tests cover real code paths — not mocked behavior — without touching live inboxes.

No framework lock-in on the agent side

MultiMail exposes a REST API you can call directly from any HTTP client (requests, httpx, fetch) and an MCP server (npm @multimail/mcp-server). Agents built on LangChain, CrewAI, AutoGen, Semantic Kernel, or any MCP-compatible client integrate without framework-specific adapters.

Recommended oversight mode

gated_send

For initial evaluation and pilot deployments, gated_send gives agents autonomous read access to their inbox while requiring human approval before any outbound message is delivered. This lets you observe real agent behavior in a live email environment without risk of unintended external communications. Once the pilot establishes a behavioral baseline and your team has confidence in the agent's judgment, individual mailboxes can graduate to monitored or autonomous without changing the integration code.

Common questions

Where is the Lean 4 source for the formal proofs?

The proofs live in the Proofs/ directory of the MultiMail repository and are compiled and verified in CI on every commit. The proofs cover core security invariants: a mailbox in gated_send mode cannot deliver outbound messages without explicit approval, tokens cannot be used across mailbox boundaries, and oversight mode changes require account-level credentials. You can inspect the theorems directly rather than relying on a compliance attestation.

How does agent identity work when multiple agents share a domain?

Each agent gets its own mailbox address and its own API token. A token issued for one agent's mailbox cannot send from or read the inbox of another agent's mailbox, even when both are on the same domain. Authorization is enforced at the API layer on every request; there is no shared session or ambient identity that spans mailboxes.

Can an agent modify its own oversight mode?

No. oversight_mode is a mailbox configuration field controlled by account-level credentials, not by the agent's scoped token. An agent token authorizes send and receive operations on its assigned mailbox only — it cannot call mailbox configuration endpoints. Oversight mode changes require an account owner credential.

What happens to a pending message if no one approves it?

Pending messages expire after a configurable TTL (default 72 hours). You can also cancel a pending message programmatically via POST /v1/mailboxes/{mailbox_id}/emails/{email_id}/cancel using the id returned by the original send call. The agent can query the pending queue via GET /v1/oversight/pending at any time to check current status.
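
To reason about when a pending message lapses, the TTL arithmetic can be sketched as follows. It assumes the clock starts at submission time and that a message counts as expired exactly at the TTL boundary; neither detail is stated on this page.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(hours=72)  # default stated in the text


def expires_at(submitted_at: datetime, ttl: timedelta = DEFAULT_TTL) -> datetime:
    """Time at which a pending message would expire."""
    return submitted_at + ttl


def is_expired(submitted_at: datetime, now: datetime, ttl: timedelta = DEFAULT_TTL) -> bool:
    """True once `now` reaches the expiry time (boundary treated as expired)."""
    return now >= expires_at(submitted_at, ttl)


submitted = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
deadline = expires_at(submitted)  # 72 hours after submission
```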

How do we audit what the agent has done?

Every API call — inbound reads, outbound sends, approval events, cancellations — is recorded in the audit log with timestamp, mailbox, action type, and outcome. The log is queryable via the REST API. Webhooks deliver real-time events for inbound email, delivery status changes, and approval or rejection events, so you can route events into your existing observability stack.
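
Once audit entries are exported, a first-pass review can be as simple as counting actions by outcome. The dictionary keys below ("timestamp", "mailbox", "action", "outcome") mirror the fields listed above, but the exact key names, values, and response shape are assumptions.

```python
from collections import Counter
from typing import Iterable


def summarize(entries: Iterable[dict]) -> Counter:
    """Count audit entries by (action, outcome) pair."""
    return Counter((e["action"], e["outcome"]) for e in entries)


# Hypothetical sample entries, for illustration only.
sample = [
    {"timestamp": "2025-01-01T09:00:00Z", "mailbox": "mbx_1", "action": "send", "outcome": "pending"},
    {"timestamp": "2025-01-01T09:05:00Z", "mailbox": "mbx_1", "action": "send", "outcome": "approved"},
    {"timestamp": "2025-01-01T09:06:00Z", "mailbox": "mbx_1", "action": "read", "outcome": "ok"},
    {"timestamp": "2025-01-01T09:07:00Z", "mailbox": "mbx_1", "action": "read", "outcome": "ok"},
]
counts = summarize(sample)
```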

Does the MCP server apply the same oversight controls as the REST API?

Yes. The MCP server is an adapter over the same REST API. When an MCP-connected agent calls the send_email tool, it goes through the same server-side oversight checkpoint as a direct API call. The oversight_mode configured on the mailbox applies regardless of whether the agent is using the REST API or the MCP server.

How do we handle data residency or retention requirements?

MultiMail stores email data on Cloudflare's global network with configurable region pinning. Retention policies are configurable per mailbox. Audit logs are append-only and exportable. If your compliance requirements mandate specific certifications such as SOC 2 or a HIPAA BAA, contact the MultiMail team to discuss your requirements before making an architectural commitment.
