Turn Inbound Email Into Structured Data

MultiMail delivers inbound email to your AI agent as clean markdown, with webhook triggers and full API access. Extract orders, invoices, and support requests without running a mail server.


Why this matters

Business-critical data arrives in email every day—purchase orders, invoices, support tickets, form submissions—formatted however the sender chose. Extracting that data means either paying staff to copy it manually or building fragile parsers that break on the first formatting variation. Neither approach scales. An AI agent that reads email the way a trained human would—understanding context and intent rather than pattern-matching on text—is the only approach that handles real-world variation at volume.


How MultiMail solves this

MultiMail receives inbound email on any mailbox—your domain or @multimail.dev—and immediately delivers it to your AI agent via webhook. The email body is stored and served as clean markdown, stripping HTML scaffolding and preserving the structure that matters: tables, lists, quoted replies. Your agent calls read_email to get the full content, runs extraction logic, tags the message for tracking, and optionally replies with a confirmation. The full pipeline runs in seconds, at scale, without you managing SMTP infrastructure or parsing MIME multipart manually.

1. Receive via webhook

Configure a mailbox on your domain or @multimail.dev. When email arrives, MultiMail fires a webhook to your endpoint with the email_id, mailbox, sender, subject, and timestamp. No polling required.

2. Read content as markdown

Call read_email with the email_id to retrieve the full message body as clean markdown. Attachments are listed with signed URLs and MIME types. HTML is stripped to preserve only semantic content—tables, lists, quoted text.

3. Extract structured data

Pass the markdown body to your AI agent with a schema-aware prompt. The agent returns structured JSON—invoice line items, order details, support ticket fields—without brittle regex or format assumptions.

4. Route to downstream systems

Write extracted data to your database, trigger a workflow, or pass it to another service. Use get_thread to access prior messages for context when an email references an earlier conversation, and manage_contacts to look up sender history.

5. Tag and confirm

Call tag_email to mark the message as processed, preventing double-handling if both webhook and batch sweep are in use. Optionally call reply_email to send a confirmation receipt to the sender using the same email_id.
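Step 3 can be sketched in plain Python as two small helpers: one that builds a schema-aware prompt and one that checks the model's JSON reply before anything is written downstream. This is a minimal sketch, not MultiMail's API. The names `INVOICE_FIELDS`, `build_extraction_prompt`, and `validate_extraction` are hypothetical, and the call to the model itself is left out because it depends on the agent you use.

```python
import json

# Hypothetical schema for illustration: field name -> accepted Python type(s).
INVOICE_FIELDS = {
    "invoice_number": str,
    "vendor": str,
    "line_items": list,
    "total_due": (int, float),
    "due_date": str,
}


def build_extraction_prompt(body_markdown: str, fields: dict) -> str:
    """Build a prompt that names every field the model is expected to return."""
    field_list = "\n".join(f"- {name}" for name in fields)
    return (
        "Extract the following fields from the email below and reply with a "
        "single JSON object. Use null for any field that is absent.\n"
        f"{field_list}\n\nEmail:\n{body_markdown}"
    )


def validate_extraction(raw_reply: str, fields: dict) -> dict:
    """Parse the model reply and reject missing or wrongly typed fields."""
    data = json.loads(raw_reply)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    errors = []
    for name, expected in fields.items():
        if name not in data:
            errors.append(f"missing field: {name}")
            continue
        value = data[name]
        if value is None:
            continue  # null marks a field the email did not contain
        # bool is a subclass of int, so reject it explicitly; this sketch
        # has no boolean fields.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"wrong type for {name}")
    if errors:
        raise ValueError("; ".join(errors))
    return data
```

Validating the reply before step 4 means a malformed extraction raises an error in the handler instead of reaching your database.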


Implementation

Webhook handler with extraction
python
from multimail_sdk import MultiMailClient
from flask import Flask, request, jsonify

app = Flask(__name__)
client = MultiMailClient(api_key="mm_live_...")

@app.route("/webhooks/inbound", methods=["POST"])
def handle_inbound():
    payload = request.json
    email_id = payload["email_id"]

    # Fetch full content as markdown
    email = client.read_email(email_id=email_id)

    # Run your extraction logic
    extracted = extract_invoice_data(email.body_markdown)

    # Tag as parsed and confirm receipt
    client.tag_email(email_id=email_id, tags=["parsed", "invoice"])
    client.reply_email(
        email_id=email_id,
        body=f"Invoice #{extracted['invoice_number']} received. Processing time: 1–2 business days."
    )

    return jsonify({"status": "ok", "extracted": extracted})

Minimal webhook endpoint that receives an inbound email trigger and runs extraction via the Python SDK.
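The handler above indexes `payload["email_id"]` directly, so a malformed request would surface as an unhandled exception. A minimal sketch of validating the payload first is shown below. Only the `email_id` key is confirmed by the handler; the other key names are assumptions taken from the field list in step 1 (mailbox, sender, subject, timestamp) and may differ in the real payload. `InboundEvent` and `parse_inbound_event` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed key names; only "email_id" appears in the handler above.
OPTIONAL_KEYS = ("mailbox", "sender", "subject", "timestamp")


@dataclass(frozen=True)
class InboundEvent:
    email_id: str
    mailbox: str = ""
    sender: str = ""
    subject: str = ""
    timestamp: str = ""


def parse_inbound_event(payload: object) -> InboundEvent:
    """Validate a decoded webhook body and return a typed event."""
    if not isinstance(payload, dict):
        raise ValueError("webhook payload must be a JSON object")
    email_id = payload.get("email_id")
    if not isinstance(email_id, str) or not email_id:
        raise ValueError("webhook payload needs a non-empty string email_id")
    # Missing or null optional fields become empty strings.
    optional = {key: str(payload.get(key) or "") for key in OPTIONAL_KEYS}
    return InboundEvent(email_id=email_id, **optional)
```

In a Flask handler this could be paired with `request.get_json(silent=True)`, which returns `None` for a body that is not valid JSON, and a 400 response whenever `parse_inbound_event` raises `ValueError`.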

Batch inbox sweep for missed messages
python
from multimail_sdk import MultiMailClient

client = MultiMailClient(api_key="mm_live_...")

def sweep_inbox(mailbox: str) -> list[dict]:
    inbox = client.check_inbox(
        mailbox=mailbox,
        filter={"tags": {"exclude": ["parsed"]}},
        limit=100
    )

    results = []
    for summary in inbox.emails:
        email = client.read_email(email_id=summary.id)
        data = extract_order_data(email.body_markdown, subject=email.subject)

        client.tag_email(email_id=summary.id, tags=["parsed"])
        results.append({"email_id": summary.id, "data": data})

    return results

if __name__ == "__main__":
    records = sweep_inbox("[email protected]")
    print(f"Processed {len(records)} emails")

Use check_inbox to process any emails that missed the webhook, filtering to untagged messages only.
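When the webhook handler and the sweep both run, each should check the tag state before doing work. The sketch below shows that guard using an in-memory stand-in for the tags, so it runs without the SDK. `InMemoryTagStore` and `process_once` are hypothetical names; in a real pipeline the tag lookup and update would go through MultiMail instead.

```python
from typing import Callable, Dict, Iterable, Set


class InMemoryTagStore:
    """Stand-in for tag state that would normally live in MultiMail."""

    def __init__(self) -> None:
        self._tags: Dict[str, Set[str]] = {}

    def tags_for(self, email_id: str) -> Set[str]:
        return set(self._tags.get(email_id, set()))

    def add_tags(self, email_id: str, tags: Iterable[str]) -> None:
        self._tags.setdefault(email_id, set()).update(tags)


def process_once(
    email_id: str,
    store: InMemoryTagStore,
    extract: Callable[[str], object],
) -> bool:
    """Run extract unless the message is already tagged "parsed".

    Returns True if extraction ran, False if the message was skipped.
    """
    if "parsed" in store.tags_for(email_id):
        return False
    try:
        extract(email_id)
    except Exception:
        # Tag failures separately so a later sweep can still retry them.
        store.add_tags(email_id, ["errored"])
        raise
    store.add_tags(email_id, ["parsed"])
    return True
```

The check and the tag write are two separate steps, so a webhook call and a sweep that run at the same instant could both pass the check. If that matters for your workload, add a lock or a unique constraint in your own datastore.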

Thread-aware extraction
python
from multimail_sdk import MultiMailClient

client = MultiMailClient(api_key="mm_live_...")

INVOICE_SCHEMA = {
    "invoice_number": "string",
    "vendor": "string",
    "line_items": "array",
    "total_due": "number",
    "due_date": "string (ISO 8601)"
}

def extract_with_context(email_id: str) -> dict:
    email = client.read_email(email_id=email_id)

    # Retrieve full thread for context
    thread = client.get_thread(thread_id=email.thread_id)
    prior_messages = [
        {"subject": m.subject, "body": m.body_markdown}
        for m in thread.messages
        if m.id != email_id
    ]

    # Pass current email + thread history to agent
    extracted = agent_extract(
        current_body=email.body_markdown,
        thread_context=prior_messages,
        schema=INVOICE_SCHEMA
    )

    client.tag_email(email_id=email_id, tags=["parsed", "has-context"])
    return extracted

Pull the full thread before extracting—critical when emails reference prior context (amendments, revisions, follow-ups).
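Long threads can exceed what an extraction prompt can reasonably carry. One way to prepare the prior messages is to keep the most recent ones within a character budget and present them oldest first. The sketch below assumes the input list is ordered oldest to newest, which the source does not state; `ThreadMessage` and `build_thread_context` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ThreadMessage:
    subject: str
    body: str


def build_thread_context(prior: List[ThreadMessage], max_chars: int = 4000) -> str:
    """Join prior messages oldest-first, dropping the oldest when over budget.

    The budget counts message text only, not the separators, and the newest
    message is always kept even if it alone exceeds max_chars.
    """
    kept: List[str] = []
    used = 0
    for msg in reversed(prior):  # walk from newest to oldest
        block = f"Subject: {msg.subject}\n{msg.body.strip()}"
        if kept and used + len(block) > max_chars:
            break
        kept.append(block)
        used += len(block)
    kept.reverse()  # restore chronological order for the prompt
    return "\n\n---\n\n".join(kept)
```

Keeping the newest messages favors the latest amendment or revision, which is usually the one the current email refers to.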

MCP tool sequence (Claude Desktop / Cursor / Windsurf)
text
# 1. Check for unprocessed inbound emails
check_inbox(
  mailbox="[email protected]",
  filter={"tags": {"exclude": ["parsed"]}},
  limit=20
)

# 2. Read the full content of a specific email
read_email(email_id="em_01HXYZ...")

# 3. Tag it once extraction is complete
tag_email(email_id="em_01HXYZ...", tags=["parsed", "support-ticket"])

# 4. Reply to confirm receipt
reply_email(
  email_id="em_01HXYZ...",
  body="Thanks — we've logged your request as ticket #4821. Expect a response within 4 hours."
)

Use MultiMail's MCP server tools directly from an agent session without writing server code.


What you get

No SMTP infrastructure to operate

MultiMail handles MX records, DKIM verification, MIME parsing, and storage. Your agent gets a webhook and a clean API—not a mail server to maintain.

Email body delivered as clean markdown

HTML email is noisy. MultiMail strips tracking pixels, inline styles, and markup scaffolding, leaving the semantic content your agent needs to extract data reliably.

Full thread context on demand

Extraction accuracy improves when the agent can see prior messages. get_thread returns the complete conversation that an email belongs to (the thread-aware example looks it up through the email's thread_id), so amendments and revisions are read in context rather than misidentified as new records.

Idempotent processing via tagging

tag_email lets you mark messages as parsed, routed, or errored. Your webhook handler and batch sweep can both run without double-processing the same email.

Scales to high volume without polling overhead

Webhook delivery means your agent activates when email arrives, not on a timer. At high inbound volume, this eliminates the latency and API cost of continuous polling.


Recommended oversight mode

Recommended: monitored
Inbound parsing is read-heavy with low-stakes outbound actions—confirmation receipts, not decisions with downstream consequences. The agent reads email, extracts data, tags messages, and sends brief acknowledgments, none of which require human pre-approval to be safe. Monitored mode lets the pipeline run at full throughput while giving operators visibility into what the agent classified and how. Extraction errors surface in the notification log before they propagate to downstream systems, without creating a bottleneck on every message.

Common questions

How does MultiMail receive inbound email?
MultiMail uses MX records to accept email for mailboxes on your domain or @multimail.dev addresses. When an email arrives, it's stored, converted to markdown, and a webhook fires to your configured endpoint with the email_id. You don't run SMTP servers or handle MIME parsing.
What happens to email attachments?
Attachments are stored in R2 and listed in the read_email response with signed URLs and MIME types. Your agent can download them, pass them to a document extraction pipeline, or ignore them depending on the use case. PDFs and images are supported.
Can I route different senders or subjects to different extraction pipelines?
Yes. Create separate mailboxes for different workflows—orders@, invoices@, support@—and configure per-mailbox webhooks pointing to different handlers. Alternatively, use a single inbox and classify by sender or subject before routing. tag_email makes it easy to label messages for downstream filtering.
How do I handle emails that arrive before my webhook is ready?
Use check_inbox with a tag exclusion filter to sweep for unprocessed messages: check_inbox(filter={"tags": {"exclude": ["parsed"]}}). This returns any messages your webhook handler missed, so you can process them on recovery without gaps in your extraction pipeline.
Does MultiMail verify that inbound email is legitimate?
MultiMail records SPF, DKIM, and DMARC authentication results for each inbound message, available in the read_email response. Your agent can check authentication status before trusting the sender identity—important when parsing instructions or approvals from known business partners.
What's the latency from email arrival to webhook delivery?
Typical webhook delivery is under 5 seconds from SMTP receipt. For latency-sensitive extraction pipelines—real-time order intake, time-bound approvals—use dedicated mailboxes to avoid queue contention with other high-volume inboxes.
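For the routing question above, the single-inbox option can be sketched as a small dispatcher: route on the mailbox local part when it matches a known workflow, and fall back to a subject check otherwise. The handler functions, the `ROUTES_BY_LOCAL_PART` table, and the `mailbox` and `subject` keys on the event are all assumptions for illustration.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


def handle_order(event: dict) -> str:
    return "orders"


def handle_invoice(event: dict) -> str:
    return "invoices"


def handle_support(event: dict) -> str:
    return "support"


# Hypothetical routing table keyed by the part of the address before "@".
ROUTES_BY_LOCAL_PART: Dict[str, Handler] = {
    "orders": handle_order,
    "invoices": handle_invoice,
    "support": handle_support,
}


def route(event: dict) -> str:
    """Pick a handler by mailbox first, then by a simple subject check."""
    mailbox = str(event.get("mailbox") or "")
    local_part = mailbox.split("@", 1)[0].lower()
    handler = ROUTES_BY_LOCAL_PART.get(local_part)
    if handler is None:
        subject = str(event.get("subject") or "").lower()
        # Shared inbox: classify on the subject, defaulting to support.
        handler = handle_invoice if "invoice" in subject else handle_support
    return handler(event)
```

A keyword check is only a placeholder for the classification step. In practice the agent itself could classify the message, with this table deciding which extraction pipeline receives it.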
