AI agents analyze every inbound email for sentiment, tag urgent messages, and route escalations — before a frustrated customer becomes a lost account.
Support teams process hundreds of emails daily. A frustrated customer who waited three weeks for a refund sits in the same queue as a routine status check. By the time a human spots the tone, the customer may already have posted a negative review or initiated a chargeback. The sentiment signals are present in every email; the hard part is reading them at scale and in real time.
MultiMail delivers inbound emails as structured markdown via webhook and stores them so they can be queried through the API. An AI agent subscribes to inbound webhooks, reads the email body via read_email, scores sentiment using your model of choice, and calls tag_email to mark the urgency level. Strongly negative emails can trigger an immediate reply draft or be routed to a priority queue, with no human intervention needed for the detection step.
MultiMail fires an inbound webhook to your endpoint when a message arrives. The payload includes the email ID, sender, subject, and a markdown-rendered body. Your agent receives this event and kicks off the analysis pipeline immediately — no polling required.
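As a rough illustration of handling that event, the sketch below validates a payload and turns it into a typed object. Only the `email_id` key appears in the handler code in this guide; the keys `from`, `subject`, and `body`, and the names `InboundEvent` and `parse_inbound_event`, are assumptions made for the example, so check them against the actual webhook schema.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class InboundEvent:
    """Typed view of an inbound webhook payload (field names partly assumed)."""
    email_id: str
    sender: str
    subject: str
    body_markdown: str


def parse_inbound_event(payload: dict[str, Any]) -> InboundEvent:
    """Validate the payload and fail loudly if an expected field is missing."""
    # `from`, `subject`, and `body` are hypothetical key names for this sketch.
    required = ("email_id", "from", "subject", "body")
    missing = [key for key in required if key not in payload]
    if missing:
        raise ValueError(f"webhook payload missing fields: {', '.join(missing)}")
    return InboundEvent(
        email_id=str(payload["email_id"]),
        sender=str(payload["from"]),
        subject=str(payload["subject"]),
        body_markdown=str(payload["body"]),
    )
```

Rejecting malformed payloads at the edge keeps the rest of the pipeline free of defensive key checks.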
Use read_email to retrieve the full message body. MultiMail stores emails as clean markdown, stripping HTML noise and normalizing quoted reply chains — so your model sees the customer's actual words, not a soup of style tags and duplicate quoted text.
Pass the subject and body to your LLM or classification model. Classify tone as positive, neutral, negative, or hostile. Also detect escalation language: mentions of refunds, cancellations, legal threats, or explicit time pressure. The normalized markdown makes token usage predictable.
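Escalation language can also be checked with a cheap keyword pass alongside the model call. The sketch below is one possible approach, not part of MultiMail: the pattern list, category names, and the `detect_escalation` helper are all illustrative assumptions and would need tuning on real support traffic.

```python
import re

# Hypothetical keyword patterns for the four escalation categories named above.
ESCALATION_PATTERNS = {
    "refund": re.compile(r"\b(refund|money back|chargeback)\b", re.IGNORECASE),
    "cancellation": re.compile(r"\bcancel(?:l?ing|l?ed|lations?)?\b", re.IGNORECASE),
    "legal": re.compile(r"\b(lawyer|attorney|legal action|sue|lawsuit)\b", re.IGNORECASE),
    "time_pressure": re.compile(
        r"\b(immediately|asap|by (?:today|tomorrow)|within \d+ (?:hours?|days?))\b",
        re.IGNORECASE,
    ),
}


def detect_escalation(text: str) -> list[str]:
    """Return the names of every escalation category whose pattern matches."""
    return [name for name, pattern in ESCALATION_PATTERNS.items() if pattern.search(text)]
```

A keyword pass like this will miss paraphrases, so it works best as a supplement to the model's classification rather than a replacement for it.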
Call tag_email with structured labels like sentiment:negative and urgency:critical. Tags are queryable via the API, so your support tooling, dashboards, and routing rules can filter on sentiment without any schema changes or separate data stores on your end.
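Because the labels follow a `namespace:value` pattern, client-side tooling can split them into key/value pairs. The helpers below are a minimal sketch; `parse_tags` and `filter_by_tag` are hypothetical names, and the email dictionaries are assumed to carry a `tags` list as in the polling example in this guide.

```python
def parse_tags(tags: list[str]) -> dict[str, str]:
    """Split 'namespace:value' tags into a dict; tags without both parts are skipped."""
    parsed: dict[str, str] = {}
    for tag in tags:
        namespace, sep, value = tag.partition(":")
        if sep and namespace and value:
            parsed[namespace] = value
    return parsed


def filter_by_tag(emails: list[dict], namespace: str, value: str) -> list[dict]:
    """Keep only the emails whose tag in `namespace` equals `value`."""
    return [
        email for email in emails
        if parse_tags(email.get("tags", [])).get(namespace) == value
    ]
```

For example, `filter_by_tag(emails, "urgency", "critical")` would return just the emails a dashboard should surface first.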
For emails classified as high-urgency or hostile, trigger downstream actions: draft a holding reply via reply_email for human review, push a Slack alert, or move the thread to a priority queue. The agent handles detection autonomously; humans handle resolution language.
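One way to keep routing decisions testable is to separate the policy from the side effects. The sketch below maps tags to a list of planned actions; the `Action` enum, the `plan_actions` function, and the specific policy (which tags trigger which actions) are assumptions for illustration, not MultiMail behavior.

```python
from enum import Enum


class Action(Enum):
    DRAFT_HOLDING_REPLY = "draft_holding_reply"  # draft only; a human approves the wording
    SLACK_ALERT = "slack_alert"
    PRIORITY_QUEUE = "priority_queue"


def plan_actions(tags: list[str]) -> list[Action]:
    """Decide which downstream actions to take for an email, based on its tags."""
    tag_set = set(tags)
    if "sentiment:hostile" in tag_set or "urgency:critical" in tag_set:
        # Assumed policy: critical or hostile emails get all three actions.
        return [Action.DRAFT_HOLDING_REPLY, Action.SLACK_ALERT, Action.PRIORITY_QUEUE]
    if "urgency:high" in tag_set:
        # Assumed policy: high urgency is queued without paging anyone.
        return [Action.PRIORITY_QUEUE]
    return []
```

The caller can then execute each planned action (for instance, drafting via reply_email) while the policy itself stays a pure function that is easy to unit test.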
import express from 'express';
import Anthropic from '@anthropic-ai/sdk';

const app = express();
app.use(express.json());

const MM_API_KEY = process.env.MM_API_KEY!;
const BASE_URL = 'https://api.multimail.dev';
const anthropic = new Anthropic();

// Fetch the stored email (subject, markdown body, sender) by ID.
async function readEmail(emailId: string): Promise<{ subject: string; body: string; from: string }> {
  const res = await fetch(`${BASE_URL}/emails/${emailId}`, {
    headers: { Authorization: `Bearer ${MM_API_KEY}` },
  });
  if (!res.ok) throw new Error(`read_email failed: ${res.status}`);
  return res.json();
}

// Attach tags to an email; throw on a non-2xx response so the failure is logged.
async function tagEmail(emailId: string, tags: string[]): Promise<void> {
  const res = await fetch(`${BASE_URL}/emails/${emailId}/tags`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${MM_API_KEY}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({ tags }),
  });
  if (!res.ok) throw new Error(`tag_email failed: ${res.status}`);
}

// Ask the model for a sentiment label and score, returned as JSON.
async function scoreSentiment(text: string): Promise<{ label: string; score: number }> {
  const msg = await anthropic.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 64,
    messages: [{
      role: 'user',
      content: `Classify the sentiment of this email. Reply with JSON only, no prose: {"label": "positive|neutral|negative|hostile", "score": 0.0-1.0}\n\n${text}`,
    }],
  });
  const raw = msg.content[0].type === 'text' ? msg.content[0].text : '{}';
  return JSON.parse(raw);
}

app.post('/webhooks/inbound', async (req, res) => {
  const { email_id } = req.body;
  res.sendStatus(200); // ack immediately before processing
  try {
    const email = await readEmail(email_id);
    const sentiment = await scoreSentiment(`Subject: ${email.subject}\n\n${email.body}`);
    const tags: string[] = [`sentiment:${sentiment.label}`];
    if (sentiment.label === 'hostile' || sentiment.score > 0.8) {
      tags.push('urgency:critical');
    } else if (sentiment.label === 'negative') {
      tags.push('urgency:high');
    }
    await tagEmail(email_id, tags);
    console.log(`[${email_id}] from=${email.from} label=${sentiment.label} score=${sentiment.score} tags=${tags.join(',')}`);
  } catch (err) {
    console.error(`[${email_id}] sentiment pipeline failed:`, err);
  }
});

app.listen(3000, () => console.log('webhook listener up on :3000'));

Express handler that receives MultiMail inbound webhooks, fetches the full email, scores sentiment, and tags urgency in real time.
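The scoring helpers in this guide parse the model's reply with a bare JSON parse, which fails if the model wraps its answer in a code fence or adds a stray sentence. A more defensive parser might look like the sketch below; `parse_sentiment_reply` is a hypothetical helper, and clamping the score to the 0.0 to 1.0 range is a design choice made for this example.

```python
import json
import re

VALID_LABELS = {"positive", "neutral", "negative", "hostile"}


def parse_sentiment_reply(raw: str) -> dict:
    """Extract and validate the {"label", "score"} object from a model reply.

    Raises ValueError if no usable JSON object is found.
    """
    # Grab the span from the first '{' to the last '}' so that surrounding
    # prose or code fences are ignored.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(match.group(0))  # JSONDecodeError is a ValueError subclass
    label = data.get("label")
    if label not in VALID_LABELS:
        raise ValueError(f"unexpected sentiment label: {label!r}")
    try:
        score = float(data.get("score", 0.0))
    except (TypeError, ValueError) as exc:
        raise ValueError("score is not numeric") from exc
    # Clamp out-of-range scores instead of rejecting the whole reply.
    return {"label": label, "score": min(max(score, 0.0), 1.0)}
```

Raising on an invalid reply lets the caller decide whether to retry the model call or leave the email untagged for the polling job to pick up later.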
import os
import json
import requests
import anthropic

MM_API_KEY = os.environ['MM_API_KEY']
BASE = 'https://api.multimail.dev'
HEADERS = {'Authorization': f'Bearer {MM_API_KEY}'}
client = anthropic.Anthropic()


def check_inbox(mailbox: str, limit: int = 50) -> list[dict]:
    """List the most recent emails in a mailbox."""
    r = requests.get(
        f'{BASE}/mailboxes/{mailbox}/inbox',
        headers=HEADERS,
        params={'limit': limit}
    )
    r.raise_for_status()
    return r.json()['emails']


def read_email(email_id: str) -> dict:
    """Fetch the full email, including the markdown body."""
    r = requests.get(f'{BASE}/emails/{email_id}', headers=HEADERS)
    r.raise_for_status()
    return r.json()


def tag_email(email_id: str, tags: list[str]) -> None:
    """Attach tags to an email; raise on HTTP errors instead of failing silently."""
    r = requests.post(
        f'{BASE}/emails/{email_id}/tags',
        headers=HEADERS,
        json={'tags': tags}
    )
    r.raise_for_status()


def score_sentiment(text: str) -> dict:
    """Ask the model for a sentiment label and score, returned as JSON."""
    msg = client.messages.create(
        model='claude-haiku-4-5-20251001',
        max_tokens=64,
        messages=[{
            'role': 'user',
            'content': f'Classify sentiment. Reply JSON only: {{"label": "positive|neutral|negative|hostile", "score": 0.0-1.0}}\n\n{text}'
        }]
    )
    return json.loads(msg.content[0].text)


def process_untagged(mailbox: str) -> None:
    emails = check_inbox(mailbox)
    # Skip anything that already carries a sentiment tag.
    untagged = [
        e for e in emails
        if not any(t.startswith('sentiment:') for t in e.get('tags', []))
    ]
    print(f'{len(untagged)} untagged emails in {mailbox}')
    for email in untagged:
        full = read_email(email['id'])
        text = f"Subject: {full['subject']}\n\n{full['body']}"
        sentiment = score_sentiment(text)
        tags = [f"sentiment:{sentiment['label']}"]
        if sentiment['label'] == 'hostile' or sentiment['score'] > 0.85:
            tags.append('urgency:critical')
        elif sentiment['label'] == 'negative':
            tags.append('urgency:high')
        tag_email(email['id'], tags)
        print(f"  {email['id']} → {tags}")


if __name__ == '__main__':
    process_untagged('[email protected]')

Poll the inbox for emails missing a sentiment tag and run scoring on them. This is useful for backfilling history or recovering from webhook gaps.
import os
import anthropic

client = anthropic.Anthropic()

SYSTEM = """
You are a support triage agent. For each inbound email you process:
1. Use read_email to fetch the full message body.
2. Assess sentiment: positive, neutral, negative, or hostile.
   Pay attention to the subject line — it often carries strong signal.
3. Use tag_email to apply two tags: 'sentiment:<label>' and 'urgency:<level>'.
   urgency levels: low (positive/neutral), high (negative), critical (hostile or score >0.85).
4. For any email tagged urgency:critical, use reply_email to draft a short holding reply
   that acknowledges the issue without making promises. Set requires_approval=true.
5. After each email, log: email ID, sender, sentiment label, score, and tags applied.
"""

USER_PROMPT = """
Triage the latest 10 emails in [email protected].
Apply sentiment and urgency tags to all of them.
Draft holding replies for any emails classified as hostile or urgency:critical.
"""

response = client.beta.messages.create(
    model='claude-sonnet-4-6',
    max_tokens=4096,
    system=SYSTEM,
    messages=[{'role': 'user', 'content': USER_PROMPT}],
    # With the mcp-client-2025-04-04 beta, remote MCP servers are passed via
    # `mcp_servers` (type 'url'), not as entries in `tools`.
    mcp_servers=[
        {
            'type': 'url',
            'name': 'multimail',
            'url': 'https://mcp.multimail.dev/mcp',
            'authorization_token': os.environ['MM_API_KEY'],
            'tool_configuration': {
                'enabled': True,
                'allowed_tools': ['check_inbox', 'read_email', 'tag_email', 'reply_email'],
            },
        }
    ],
    betas=['mcp-client-2025-04-04'],
)

# Print only the text blocks; tool-use and tool-result blocks are skipped.
for block in response.content:
    if hasattr(block, 'text'):
        print(block.text)

Using MultiMail's MCP server in an agent loop to read emails, apply sentiment tags, and draft holding replies for hostile messages, all in one agent turn.
import os
import json
import requests
from collections import defaultdict
from datetime import datetime, timedelta, timezone

MM_API_KEY = os.environ['MM_API_KEY']
BASE = 'https://api.multimail.dev'
HEADERS = {'Authorization': f'Bearer {MM_API_KEY}'}


def get_emails_by_tag(mailbox: str, tag: str, since: str) -> list[dict]:
    """List emails in a mailbox that carry `tag` and arrived after `since`."""
    r = requests.get(
        f'{BASE}/mailboxes/{mailbox}/inbox',
        headers=HEADERS,
        params={'tag': tag, 'since': since, 'limit': 500}
    )
    r.raise_for_status()
    return r.json()['emails']


def sentiment_report(mailbox: str, days: int = 7) -> dict:
    since = (
        datetime.now(timezone.utc) - timedelta(days=days)
    ).strftime('%Y-%m-%dT%H:%M:%SZ')
    counts: dict[str, int] = {}
    sender_negatives: dict[str, int] = defaultdict(int)
    for label in ('positive', 'neutral', 'negative', 'hostile'):
        emails = get_emails_by_tag(mailbox, f'sentiment:{label}', since)
        counts[label] = len(emails)
        if label in ('negative', 'hostile'):
            for e in emails:
                sender_negatives[e['from']] += 1
    total = sum(counts.values())
    denominator = total or 1  # avoid division by zero without misreporting the total
    return {
        'mailbox': mailbox,
        'period_days': days,
        'total_emails': total,
        'sentiment_breakdown': {
            k: {'count': v, 'pct': round(v / denominator * 100, 1)}
            for k, v in counts.items()
        },
        # Senders with two or more negative or hostile emails in the window.
        'at_risk_senders': [
            {'sender': s, 'negative_count': c}
            for s, c in sorted(sender_negatives.items(), key=lambda x: -x[1])
            if c >= 2
        ],
    }


if __name__ == '__main__':
    report = sentiment_report('[email protected]', days=7)
    print(json.dumps(report, indent=2))

Query tagged emails over a rolling time window to compute sentiment distribution and surface at-risk senders for proactive CSM outreach.
Sentiment scoring runs on every inbound email via webhook — no batch delays. A frustrated customer gets flagged within seconds of sending, not hours later when a human eventually scans the queue.
tag_email writes labels like sentiment:negative and urgency:critical that your existing tooling can filter on. No ETL pipeline, no separate sentiment database — the signal lives with the email and is accessible via the same API.
MultiMail normalizes emails to clean markdown before your agent reads them, stripping HTML noise and deduplicating quoted reply chains. You're not burning tokens on boilerplate — just the customer's actual words.
Aggregate sentiment tags over a rolling window to surface accounts sending multiple negative emails in a week. That pattern is a churn signal your CSM team can act on before a cancellation request lands.
Monitored mode lets the agent tag and route autonomously. For holding replies to hostile emails, gated_send keeps a human in the loop on the specific language used — detection is automated, escalation tone is not.