Classify Every Inbound Email Before a Human Sees It

AI agents read incoming emails, apply structured tags, and route them to the right handler — before urgent requests get buried in a general queue.


Why this matters

At high inbound volume, manual classification stops working. Support queues mix billing complaints with outage reports. Sales inboxes conflate renewals with churn signals. Urgent requests sit for hours because no one triaged them. The problem isn't that your team is slow — it's that classification is a repetitive, high-frequency task that humans aren't built to do at scale. A single missed priority email can cost more than the entire classification system.


How MultiMail solves this

MultiMail's inbound processing pipeline delivers each arriving email to your agent via webhook before it enters any human queue. The agent calls check_inbox or receives the push event, reads the full content with read_email, runs its classification logic, and writes structured tags back with tag_email. Downstream systems — ticket routers, Slack integrations, on-call pagers — subscribe to those tags via webhook and act immediately. The agent never sends email in this flow, so no approval gates slow it down. Classification latency is bounded by your inference time, not by human availability.

1. Receive inbound email via webhook

MultiMail delivers a POST to your configured webhook endpoint the moment an email arrives at your mailbox. The payload includes the email ID, sender, subject, and a preview. Your agent service handles this event rather than polling.

2. Read full email content

Your agent calls read_email with the email ID from the webhook payload to retrieve the full body, headers, and any attachments. This gives the classifier access to the complete signal — not just the subject line.

3. Run classification logic

Pass the email content to your classification model or LLM. Extract structured fields: intent (support, sales, billing, abuse), urgency (critical, high, normal, low), sentiment (negative, neutral, positive), and topic tags. This step is entirely within your agent — MultiMail places no constraints on how you classify.

4. Apply tags to the email

Call tag_email with the email ID and your classification results as structured tags. Tags are queryable and filterable across the MultiMail API, making them the source of truth for downstream routing decisions.

5. Trigger downstream routing

Webhook listeners subscribed to tag events receive the classification immediately. Route critical outages to PagerDuty, billing disputes to your finance queue, and positive responses to your CRM — all without human intervention in the classification step.
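
As a rough sketch of step 5, a downstream listener could map tags to destinations with a small ordered rule table. The destination names (`pagerduty`, `finance_queue`, `crm`, `general_queue`), the rule format, and the `route` helper are hypothetical and only mirror the examples above. The shape of MultiMail's tag-event payload is not shown in this guide, so the function takes a plain list of tag strings.

```python
from typing import Iterable

# Hypothetical routing rules, checked in order; the first match wins.
# Each rule is (tags that must all be present, destination name).
ROUTING_RULES: list[tuple[frozenset[str], str]] = [
    (frozenset({"urgency:critical"}), "pagerduty"),
    (frozenset({"intent:billing"}), "finance_queue"),
    (frozenset({"sentiment:positive"}), "crm"),
]

# Assumed catch-all so that no email is left unrouted.
DEFAULT_DESTINATION = "general_queue"


def route(tags: Iterable[str]) -> str:
    """Return the destination for an email given its classification tags."""
    tag_set = set(tags)
    for required, destination in ROUTING_RULES:
        # A rule matches when every tag it requires is on the email.
        if required <= tag_set:
            return destination
    return DEFAULT_DESTINATION
```

Ordering the rules puts urgency ahead of intent, so a critical billing email pages on-call instead of waiting in the finance queue.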


Implementation

Webhook handler and classification entry point
typescript
import express from 'express';
import { classifyAndTag } from './classifier';

const app = express();
app.use(express.json());

app.post('/webhooks/multimail', async (req, res) => {
  const { event, email_id, mailbox } = req.body;

  if (event !== 'email.received') {
    return res.sendStatus(200);
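  }

  // Acknowledge immediately so the webhook does not time out
  res.sendStatus(200);

  // Classify in the background; catch failures so a rejected promise
  // does not go unhandled after the response has been sent
  classifyAndTag(email_id).catch((err) => {
    console.error(`Classification failed for ${email_id}:`, err);
  });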
});

app.listen(3000);

Express handler that receives the MultiMail inbound webhook and dispatches to the classifier agent.

Read, classify, and tag
typescript
import Anthropic from '@anthropic-ai/sdk';

const MM_API = 'https://api.multimail.dev';
const MM_TOKEN = process.env.MM_API_KEY!;
const anthropic = new Anthropic();

export async function classifyAndTag(emailId: string): Promise<void> {
  // 1. Read full email
  const emailRes = await fetch(`${MM_API}/v1/emails/${emailId}`, {
    headers: { Authorization: `Bearer ${MM_TOKEN}` },
  });
  if (!emailRes.ok) {
    throw new Error(`Failed to read email ${emailId}: HTTP ${emailRes.status}`);
  }
  const email = await emailRes.json();

  // 2. Classify
  const message = await anthropic.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 256,
    system: 'You are an email classifier. Return JSON with fields: intent (support|sales|billing|abuse|other), urgency (critical|high|normal|low), sentiment (negative|neutral|positive), topics (string[]).',
    messages: [{
      role: 'user',
      content: `Subject: ${email.subject}\n\n${(email.body_text ?? '').slice(0, 2000)}`,
    }],
  });

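  // The SDK returns a list of content blocks; only text blocks carry `.text`
  const block = message.content[0];
  if (!block || block.type !== 'text') {
    throw new Error(`Classifier returned no text block for ${emailId}`);
  }
  const classification = JSON.parse(block.text);

  // 3. Write tags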
  await fetch(`${MM_API}/v1/emails/${emailId}/tags`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${MM_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      tags: [
        `intent:${classification.intent}`,
        `urgency:${classification.urgency}`,
        `sentiment:${classification.sentiment}`,
        ...classification.topics.map((t: string) => `topic:${t}`),
      ],
    }),
  });

  console.log(`Classified ${emailId}: ${JSON.stringify(classification)}`);
}

Fetches full email content, runs LLM classification, and writes structured tags back via the MultiMail API.

Python SDK — classify and tag
python
import os
import json
import anthropic
from multimail_sdk import MultimailClient

mm = MultimailClient(api_key=os.environ["MM_API_KEY"])
ai = anthropic.Anthropic()

def classify_and_tag(email_id: str) -> dict:
    # Read full email
    email = mm.emails.get(email_id)

    # Classify
    response = ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=256,
        system=(
            "You are an email classifier. Return JSON with: "
            "intent (support|sales|billing|abuse|other), "
            "urgency (critical|high|normal|low), "
            "sentiment (negative|neutral|positive), "
            "topics (list of strings)."
        ),
        messages=[{
            "role": "user",
            "content": f"Subject: {email.subject}\n\n{(email.body_text or '')[:2000]}",
        }],
    )

    classification = json.loads(response.content[0].text)

    # Apply tags
    tags = [
        f"intent:{classification['intent']}",
        f"urgency:{classification['urgency']}",
        f"sentiment:{classification['sentiment']}",
        *[f"topic:{t}" for t in classification.get("topics", [])],
    ]
    mm.emails.tag(email_id, tags=tags)

    return classification

Same flow using the multimail-sdk Python package. Suitable for agent frameworks that run in Python.

Query emails by classification tag
typescript
const res = await fetch(
  'https://api.multimail.dev/v1/emails?tag=urgency%3Acritical&since=24h',
  {
    headers: { Authorization: `Bearer ${process.env.MM_API_KEY}` },
  }
);

const { emails } = await res.json();

for (const email of emails) {
  console.log(`[CRITICAL] ${email.subject} — from ${email.from} at ${email.received_at}`);
}

Retrieve all critical-urgency emails from the last 24 hours for a dashboard or escalation sweep.

MCP tool usage — classify via Claude Desktop
text
// In your MCP-connected client (Claude Desktop, Cursor, Windsurf):

// Step 1 — read the email
tool: read_email
params: { email_id: "em_01J8KXQZ4N2P3R5T7V9W" }

// Step 2 — after reviewing content, apply classification tags
tool: tag_email
params: {
  email_id: "em_01J8KXQZ4N2P3R5T7V9W",
  tags: ["intent:support", "urgency:critical", "sentiment:negative", "topic:database-outage"]
}

// Step 3 — check inbox for other critical items
tool: check_inbox
params: { tag: "urgency:critical", limit: 20 }

If you're operating through an MCP client, use the tag_email tool directly after reading the email. No code deployment required.


What you get

Triage in seconds

Classification runs the moment email arrives, not when a human opens the inbox. Critical issues get tagged within seconds of delivery regardless of time zone or staffing level.

Consistent taxonomy at scale

An LLM classifier applies the same intent and urgency taxonomy to the ten-thousandth email as it does to the first. Human classifiers drift over time; agents don't.

Tags as routing primitives

MultiMail tags are queryable via API and filterable in webhook subscriptions. Every downstream system — ticketing, alerting, CRM — can subscribe to exactly the classification signals it needs without coupling to your agent's internal logic.

No approval overhead on read paths

Classification is a read-and-tag operation. The monitored oversight mode lets the agent act immediately without waiting for human approval, while still giving your team full visibility into every classification decision via the audit log.

Handles volume spikes without degradation

Volume spikes from product launches, outages, and billing cycles are classified the same way as steady-state traffic. The constraints are queue depth and inference capacity, not human bandwidth.


Recommended oversight mode

Recommended: monitored

Email classification is a read-and-tag operation with no outbound sends and no irreversible side effects. The agent reads content, applies structured labels, and triggers downstream routing — none of these actions require pre-approval. Monitored mode lets the agent run at inbound velocity while surfacing every classification decision in your audit log. If a misclassification occurs (e.g., a critical outage tagged as normal), your team can retag the email and adjust the classifier prompt without having approved every decision upfront. Gated modes would introduce latency that defeats the purpose of automated triage.

Common questions

How do I receive inbound emails in real time rather than polling?
Configure a webhook endpoint in your MultiMail dashboard under Settings → Webhooks. Set the trigger to email.received for your target mailbox. MultiMail will POST the event payload — including email_id — to your endpoint within seconds of delivery. Your handler should respond with 200 immediately and process asynchronously to avoid webhook timeouts.
Can I classify emails into a custom taxonomy, not just the example fields?
Yes. The tags you write via tag_email are arbitrary strings in the format key:value or plain strings. You define the taxonomy in your classifier prompt. Common schemes include intent:billing, priority:p1, team:infrastructure, or product:checkout. Tags are filterable and queryable, so design them around how your downstream systems route.
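
A minimal sketch of keeping a custom taxonomy consistent, assuming you want every tag normalized before it is written. The helpers `make_tag` and `parse_tag` are hypothetical, and the lowercase-with-hyphens normalization is an illustrative choice, not a MultiMail requirement.

```python
import re
from typing import Optional


def make_tag(key: str, value: str) -> str:
    """Build a normalized key:value tag, e.g. ("Team", "Core Infra") -> "team:core-infra"."""
    def norm(part: str) -> str:
        # Lowercase, then collapse every run of other characters into "-".
        return re.sub(r"[^a-z0-9]+", "-", part.strip().lower()).strip("-")
    return f"{norm(key)}:{norm(value)}"


def parse_tag(tag: str) -> tuple[Optional[str], str]:
    """Split a tag into (key, value); plain tags without ":" get key None."""
    key, sep, value = tag.partition(":")
    if not sep:
        return None, tag
    return key, value
```

Normalizing at write time means downstream filters can match on exact strings such as `team:core-infra` without worrying about case or spacing variants coming out of the model.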
What happens if the classifier returns malformed JSON?
Your agent code is responsible for parsing and validating the LLM response before calling tag_email. Wrap the JSON.parse call in a try/catch and, on failure, fall back to calling tag_email with the tags intent:unknown and urgency:unknown. That safe default still routes the email somewhere rather than dropping it.
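
The Python equivalent of that fallback might look like the sketch below. The helper `tags_from_llm_output` is hypothetical, and the allowed values are copied from the example classifier prompt. As an assumption beyond the answer above, it also treats out-of-vocabulary values and a missing sentiment as `unknown`.

```python
import json

# Vocabulary from the example classifier prompt in this guide.
ALLOWED = {
    "intent": {"support", "sales", "billing", "abuse", "other"},
    "urgency": {"critical", "high", "normal", "low"},
    "sentiment": {"negative", "neutral", "positive"},
}


def tags_from_llm_output(raw: str) -> list[str]:
    """Turn raw classifier output into tags, degrading to "unknown" on bad input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        # Valid JSON that is not an object (a list, a number) is also unusable.
        data = {}

    tags = []
    for field, allowed in ALLOWED.items():
        value = data.get(field)
        # Anything missing or outside the expected vocabulary becomes "unknown".
        if not (isinstance(value, str) and value in allowed):
            value = "unknown"
        tags.append(f"{field}:{value}")

    topics = data.get("topics", [])
    if isinstance(topics, list):
        tags.extend(f"topic:{t}" for t in topics if isinstance(t, str))
    return tags
```

The returned list can be passed to tag_email as-is, so a malformed response still produces a routable email.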
Does MultiMail store email body content, and for how long?
Email bodies are stored in your account's R2-backed storage. Retention policy is configurable per mailbox. For support or enterprise deployments processing sensitive content, set retention to the minimum your workflows require. MultiMail does not train on stored email content.
Can I use this with an existing support ticketing system?
Yes. The standard pattern is: classify with tag_email, then subscribe a second webhook handler to tag events that creates tickets in your system (Zendesk, Linear, Jira, etc.) via their respective APIs. Your classifier and your ticketing integration are decoupled — each subscribes to the MultiMail webhook stream independently.
How do I handle classification errors or low-confidence results?
Include a confidence field in your LLM response schema and write it as a tag (confidence:low). Configure your downstream router to send low-confidence emails to a human review queue rather than auto-routing them. Over time, review the human corrections to improve your classifier prompt.
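
One way to sketch this, assuming the classifier returns a numeric confidence between 0 and 1. The thresholds (0.5 and 0.8), the three-level scale, and both helper names are illustrative assumptions. Treating emails with no confidence tag as needing review is also a choice made for this example.

```python
def confidence_tag(score: float, low: float = 0.5, high: float = 0.8) -> str:
    """Bucket a 0..1 confidence score into a coarse tag."""
    if score < low:
        return "confidence:low"
    if score < high:
        return "confidence:medium"
    return "confidence:high"


def needs_human_review(tags: list[str]) -> bool:
    """Send low-confidence emails, and emails with no confidence tag, to a human queue."""
    confidence = [t for t in tags if t.startswith("confidence:")]
    return not confidence or "confidence:low" in confidence
```

A downstream router can call `needs_human_review` before applying its normal rules, so uncertain classifications never trigger automated escalation.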
What email volume can this handle?
MultiMail's inbound pipeline is designed for high-volume workloads. Your throughput ceiling is roughly the concurrency of your webhook handler divided by your agent's per-email inference latency: 10 concurrent workers at 2 seconds per classification sustain about 5 emails per second. For burst workloads, run multiple handler replicas and process email IDs from a queue rather than inline in the webhook handler.
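
A minimal sketch of the queue-based pattern using only the Python standard library. The names `max_throughput` and `start_workers` are hypothetical, and a production deployment would more likely use a durable external queue than an in-process one.

```python
import queue
import threading
from typing import Callable


def max_throughput(concurrency: int, latency_seconds: float) -> float:
    """Upper bound on emails classified per second: workers / seconds per email."""
    return concurrency / latency_seconds


def start_workers(
    jobs: "queue.Queue[str]",
    classify: Callable[[str], None],
    concurrency: int,
) -> list[threading.Thread]:
    """Start daemon threads that pull email IDs off the queue and classify them."""
    def worker() -> None:
        while True:
            email_id = jobs.get()
            try:
                classify(email_id)
            except Exception as exc:
                # Keep the worker alive when one email fails.
                print(f"classification failed for {email_id}: {exc}")
            finally:
                jobs.task_done()

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(concurrency)]
    for thread in threads:
        thread.start()
    return threads
```

With this in place the webhook handler only calls `jobs.put(email_id)` and returns 200, so a burst of inbound mail lengthens the queue instead of timing out webhook deliveries.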
