Start restricted, build confidence, increase autonomy — without ever losing control of what your agent sends.
Most teams deploy AI email agents at one of two extremes: fully autonomous (where mistakes go out unreviewed) or fully gated (where human bottlenecks kill the automation value). Neither is right. The real challenge is building a path from zero trust to earned autonomy — where the agent proves its judgment on low-risk tasks before you give it control over high-stakes communication. Without a structured framework, teams either over-restrict their agents indefinitely or flip to autonomous too early and ship an embarrassing or non-compliant email to a real customer.
MultiMail exposes five oversight modes — read_only, gated_all, gated_send, monitored, and autonomous — that map directly to your agent's demonstrated reliability. You configure the mode per mailbox, not globally, so a single agent can operate autonomously on internal routing while staying gated on customer-facing sends. Every mode produces an audit log. Approvals flow through the list_pending and decide_email endpoints so your review UI integrates with existing tooling. When an agent's approval rate hits your threshold, promote it by updating the mailbox oversight_mode — no code changes, no restart.
Classify each email workflow by blast radius. Customer-facing sends, billing notifications, and compliance-triggered messages are high-risk. Internal routing, status digests, and read-only monitoring are low-risk. Assign a starting oversight mode based on that classification — not on how capable you think the model is.
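As a rough illustration of that classification step, the sketch below maps workflows to a risk level and then to a starting mode. The workflow labels and the choice of starting mode per risk level are assumptions made for this example, not values defined by MultiMail; only the mode names come from the text above.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical workflow labels; the high/low split follows the paragraph above.
WORKFLOW_RISK = {
    "customer_send": Risk.HIGH,
    "billing_notification": Risk.HIGH,
    "compliance_message": Risk.HIGH,
    "internal_routing": Risk.LOW,
    "status_digest": Risk.LOW,
}

# Assumed starting modes for illustration; pick your own per risk level.
STARTING_MODE = {Risk.HIGH: "gated_all", Risk.LOW: "gated_send"}


def starting_mode(workflow: str) -> str:
    """Return the starting oversight mode for a workflow label."""
    # Unclassified workflows default to high-risk so they start restricted.
    risk = WORKFLOW_RISK.get(workflow, Risk.HIGH)
    return STARTING_MODE[risk]
```

Defaulting unknown workflows to the high-risk bucket keeps the decision tied to blast radius rather than to how capable the model seems.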
Call POST /create_mailbox with oversight_mode set to your chosen level. Each mailbox gets its own mode, so you can give the same agent restrictive settings on [email protected] and looser settings on [email protected]. The mode is enforced server-side — your agent code needs no conditional logic per environment.
Let the agent operate normally. In gated_send mode, outbound messages queue for human review via list_pending while reads remain unrestricted. In gated_all mode, both reads and sends require approval. The audit log records every action — what the agent attempted, what was approved, what was rejected — giving you the behavioral data to make a promotion decision.
Fetch pending messages with GET /list_pending and inspect each drafted message. Use POST /decide_email with action: 'approve' or 'reject' to release or block delivery. Track the approval rate across decisions. An approval rate of at least 95% over 50 or more decisions is a reasonable threshold for promotion to monitored mode.
When the agent's track record supports it, update the mailbox oversight_mode. Move gated_send to monitored, then monitored to autonomous as confidence grows. Each promotion takes effect immediately on the next API call. If behavior degrades after a promotion, demote the mailbox in one API call and the gate reactivates instantly.
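The promotion and demotion steps can be pictured as moves along a ladder of modes. This is a minimal sketch that assumes the five modes are ordered from most to least restrictive in the order they are listed earlier; the helper names `promote` and `demote` are hypothetical and only compute the next mode string, which you would then send in your own update call.

```python
# Assumed ordering, most restrictive first, taken from the order the modes are listed.
MODES = ["read_only", "gated_all", "gated_send", "monitored", "autonomous"]


def promote(mode: str) -> str:
    """Return the next less restrictive mode, staying at 'autonomous' at the top."""
    index = MODES.index(mode)  # raises ValueError for an unknown mode
    return MODES[min(index + 1, len(MODES) - 1)]


def demote(mode: str) -> str:
    """Return the next more restrictive mode, staying at 'read_only' at the bottom."""
    index = MODES.index(mode)
    return MODES[max(index - 1, 0)]
```

Moving one step at a time means a demotion after a bad promotion lands back on the mode the agent last held.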
The audit log supports the human oversight that EU AI Act Article 14 requires for high-risk AI systems, and it records the delivery decisions you need when demonstrating CAN-SPAM opt-out handling for bulk sends. Every decide_email action is timestamped and attributed to the reviewer's API token. Export per-mailbox logs for compliance reviews.
import os
import requests

API_BASE = "https://api.multimail.dev"
headers = {
    # Read the key from the environment; Python does not expand "$MULTIMAIL_API_KEY" inside a string.
    "Authorization": f"Bearer {os.environ['MULTIMAIL_API_KEY']}",
    "Content-Type": "application/json"
}

# Create the mailbox with outbound sends gated behind human approval.
response = requests.post(
    f"{API_BASE}/create_mailbox",
    headers=headers,
    json={
        "name": "support-agent",
        "domain": "multimail.dev",
        "oversight_mode": "gated_send",
        "display_name": "Support Agent"
    }
)
mailbox = response.json()
print(f"Mailbox: {mailbox['address']}")
print(f"Oversight mode: {mailbox['oversight_mode']}")
# Output:
# Mailbox: [email protected]
# Oversight mode: gated_send

Start a new agent mailbox in gated_send mode: reads are unrestricted, outbound messages queue for human approval before delivery.
import os
import requests

API_BASE = "https://api.multimail.dev"
headers = {
    # Read the key from the environment; Python does not expand "$MULTIMAIL_API_KEY" inside a string.
    "Authorization": f"Bearer {os.environ['MULTIMAIL_API_KEY']}",
    "Content-Type": "application/json"
}

pending = requests.get(
    f"{API_BASE}/list_pending",
    headers=headers,
    params={"mailbox": "[email protected]"}
).json()

for message in pending["messages"]:
    print(f"To: {message['to']}")
    print(f"Subject: {message['subject']}")
    print(f"Preview: {message['body_preview']}")

    # Auto-approve internal recipients; ask a human about everything else.
    is_internal = message["to"].endswith("@your-company.com")
    action = "approve" if is_internal else input("approve/reject? ").strip().lower()
    if action not in ("approve", "reject"):
        print("Skipped: enter 'approve' or 'reject'")
        continue

    requests.post(
        f"{API_BASE}/decide_email",
        headers=headers,
        json={
            "message_id": message["id"],
            "action": action,
            "reason": "Auto-approved: internal recipient" if is_internal else "Manual review"
        }
    )
    print(f"Decision recorded: {action}")

Fetch the gated queue, inspect each drafted message, and approve or reject using decide_email. Auto-approve internal recipients, route external ones to a human reviewer.
import os
import requests

API_BASE = "https://api.multimail.dev"
headers = {
    # Read the key from the environment; Python does not expand "$MULTIMAIL_API_KEY" inside a string.
    "Authorization": f"Bearer {os.environ['MULTIMAIL_API_KEY']}",
    "Content-Type": "application/json"
}

# Fetch decided messages to calculate approval rate
history = requests.get(
    f"{API_BASE}/list_pending",
    headers=headers,
    params={
        "mailbox": "[email protected]",
        "status": "decided",
        "limit": 200
    }
).json()

approved = [m for m in history["messages"] if m.get("decision") == "approve"]
rejected = [m for m in history["messages"] if m.get("decision") == "reject"]
total = len(approved) + len(rejected)
approval_rate = len(approved) / total if total > 0 else 0
print(f"Decisions reviewed: {total}, approval rate: {approval_rate:.1%}")

# Promote only when there are enough decisions and the approval rate meets the threshold.
if total >= 50 and approval_rate >= 0.95:
    resp = requests.patch(
        f"{API_BASE}/create_mailbox",
        headers=headers,
        json={
            "mailbox": "[email protected]",
            "oversight_mode": "monitored"
        }
    )
    print(f"Promoted to: {resp.json()['oversight_mode']}")
else:
    print("Not ready. Need 50+ decisions at ≥95% approval rate.")

Check the decision history from list_pending, calculate approval rate, and update oversight_mode when the threshold is met. The change applies to all subsequent API calls immediately.

// Create a high-risk mailbox with gated_all — reads and sends both require approval
{
  "tool": "create_mailbox",
  "parameters": {
    "name": "billing-agent",
    "domain": "multimail.dev",
    "oversight_mode": "gated_all",
    "display_name": "Billing Agent"
  }
}
// Fetch pending queue for that mailbox
{
  "tool": "list_pending",
  "parameters": {
    "mailbox": "[email protected]"
  }
}
// Approve a specific queued message
{
  "tool": "decide_email",
  "parameters": {
    "message_id": "msg_01J8K2M4N6P8Q0R2S4T6V8W0",
    "action": "approve",
    "reason": "Reviewed and approved: correct recipient, accurate amount"
  }
}

If your agent connects via MultiMail's MCP server, configure oversight modes and review the approval queue using MCP tool calls directly in your MCP client.
Oversight modes apply to individual mailboxes, not globally per agent. One agent can operate autonomously on [email protected] while staying fully gated on [email protected] — without any conditional logic in the agent code.
Every action — send attempt, approval, rejection, read — is logged with timestamp and API token attribution. EU AI Act Article 14 requires human oversight mechanisms for high-risk AI systems; the audit log provides the required paper trail. CAN-SPAM requires honoring opt-outs; the log records all delivery decisions.
Updating oversight_mode via the API takes effect on the next request. If an agent starts drafting inappropriate messages after a promotion, demote the mailbox in one API call and the gate reactivates immediately — no deployment, no rollback, no restart required.
list_pending returns structured JSON you can render in an existing admin dashboard, Slack app, or ticketing workflow. decide_email accepts the approval or rejection from any HTTP client. You are not required to use a MultiMail UI — the endpoints are the oversight layer.
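To make the "render it anywhere" point concrete, here is a small sketch that turns a list_pending payload into one review line per message, suitable for posting to a chat channel or ticket. It assumes the response shape used in the examples on this page (a `messages` list whose items carry `id`, `to`, `subject`, and `body_preview`); the function name `format_for_review` is hypothetical.

```python
def format_for_review(pending: dict) -> list:
    """Build one human-readable review line per pending message."""
    lines = []
    # A payload without a "messages" key is treated as an empty queue.
    for message in pending.get("messages", []):
        lines.append(
            f"[{message['id']}] To: {message['to']} | "
            f"Subject: {message['subject']} | {message['body_preview']}"
        )
    return lines
```

Each line leads with the message ID so a reviewer's reply can be passed straight to decide_email.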
That gated_send blocks delivery until a decide_email approval is not just a policy: it is a formally proven property of the system, verified in Lean 4 proofs published with each release. An agent calling send_email on a gated mailbox receives a 202 Accepted with a pending message ID; the message is queued, not delivered, and no API parameter can bypass that.
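On the agent side, the 202 response described above could be handled along these lines. This is a sketch under assumptions: the text only says the response carries a pending message ID, so the `message_id` field name is a guess, and `SendOutcome` and `interpret_send` are hypothetical helpers, not part of any MultiMail client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SendOutcome:
    queued: bool               # True when the message is waiting for review
    pending_id: Optional[str]  # ID of the queued message, if one was returned


def interpret_send(status_code: int, body: dict) -> SendOutcome:
    """Classify a send_email response as queued for review or not."""
    if status_code == 202:
        # "message_id" is an assumed field name for the pending message ID.
        return SendOutcome(queued=True, pending_id=body.get("message_id"))
    # Any other status is not a queued send; handle successes and errors separately.
    return SendOutcome(queued=False, pending_id=None)
```

Treating 202 as "queued, not sent" keeps the agent from reporting delivery before a reviewer has approved the message.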