Email Infrastructure for Hugging Face Agents

Connect any open model — Llama, Mistral, Qwen, or your own fine-tune — to a production email API with graduated oversight controls built in.


Hugging Face provides the foundational layer for a large share of production AI stacks: open model weights, inference endpoints, Transformers pipelines, and increasingly, agent tooling through smolagents. When those agents need to touch email — reading a support inbox, drafting a reply, triaging inbound leads — they need infrastructure that enforces safe behavior regardless of which model is generating the output.

MultiMail is a REST API designed for exactly this. It exposes email primitives (send, reply, read, classify, approve) behind a policy layer that runs independently of the model. Whether your pipeline is running a 7B model on a local GPU or hitting a Hugging Face Inference Endpoint, MultiMail applies the same oversight rules: gated sends, human-in-the-loop approval queues, and webhook-driven delivery confirmation.

The integration pattern is straightforward: your Hugging Face pipeline generates text or a structured action, and your application code calls the MultiMail API to execute it. No special SDK required — standard HTTP calls or the Python `requests` library are sufficient. This keeps your model layer decoupled from your email layer, which matters when you're swapping models or running A/B tests across checkpoints.

Built for Hugging Face developers

Model-agnostic policy enforcement

MultiMail's oversight controls apply at the API layer, not the model layer. A fine-tuned classifier and a 70B instruction model are subject to identical send policies. You don't need to re-implement safety logic when you swap models or update weights.

Formal verification for authorization logic

MultiMail's oversight and identity models are proven correct in Lean 4. For teams using open models — where output behavior is harder to guarantee — having a formally verified authorization boundary on the email side reduces the attack surface significantly.

Graduated oversight modes

Start with gated_all to require human approval for every action while you validate your pipeline's behavior. Relax to gated_send or monitored once you have confidence in the model's outputs. The mode is set per mailbox, not per request.
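Since the mode is a per-mailbox setting, switching it is a single API call. The sketch below assumes a hypothetical update_mailbox endpoint and payload shape; check the MultiMail API reference for the actual mailbox-update route before relying on it.

```python
import requests

BASE_URL = "https://api.multimail.dev"
OVERSIGHT_MODES = {"gated_all", "gated_send", "monitored"}

def build_mode_update(mailbox: str, mode: str) -> dict:
    """Build the payload for changing a mailbox's oversight mode."""
    if mode not in OVERSIGHT_MODES:
        raise ValueError(f"unknown oversight mode: {mode!r}")
    return {"address": mailbox, "oversight_mode": mode}

def set_oversight_mode(api_key: str, mailbox: str, mode: str) -> dict:
    # "update_mailbox" is a hypothetical endpoint name; confirm it
    # against the MultiMail API reference before using it.
    resp = requests.post(
        f"{BASE_URL}/update_mailbox",
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_mode_update(mailbox, mode),
    )
    resp.raise_for_status()
    return resp.json()
```

Validating the mode client-side means a typo fails fast in your pipeline rather than surfacing as an API error mid-run.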

Webhook-driven pipeline triggers

MultiMail fires webhooks on inbound email, delivery status, and approval events. Use these to trigger Hugging Face Inference Endpoint calls or local pipeline runs — the email event becomes the entry point for your agent workflow.

Structured email data for classification pipelines

The read_email and check_inbox endpoints return clean, structured JSON — sender, subject, body, thread metadata — that maps directly to classifier inputs. No parsing raw MIME, no attachment handling boilerplate.
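As a sketch, a response of that shape maps onto a classifier input in a couple of lines. The field names below mirror the description above but are illustrative; confirm them against an actual read_email response for your account.

```python
# An illustrative email record of the shape described above; confirm the
# exact field names against a real read_email response.
sample_email = {
    "id": "em_123",
    "sender": "[email protected]",
    "subject": "Billing question",
    "body": "Hi, I was charged twice this month and would like a refund.",
    "thread": {"id": "th_456", "message_count": 3},
}

def to_classifier_input(email: dict, max_chars: int = 512) -> str:
    """Join subject and a truncated body into one classifier input string."""
    return f"{email['subject']}\n\n{email['body'][:max_chars]}"
```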

CAN-SPAM and GDPR compliance built in

MultiMail handles unsubscribe mechanics (CAN-SPAM) and provides audit logs with full attribution (GDPR Article 30 record-keeping). Your Hugging Face pipeline doesn't need to implement compliance logic — the API enforces it.


Get started in minutes

Email triage pipeline with zero-shot classification
python
import requests
from transformers import pipeline

MULTIMAIL_API_KEY = "mm_live_your_key_here"
BASE_URL = "https://api.multimail.dev"
MAILBOX = "[email protected]"

classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli"
)

CANDIDATE_LABELS = ["billing", "technical-support", "feature-request", "spam", "urgent"]

def fetch_unread():
    resp = requests.get(
        f"{BASE_URL}/check_inbox",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"},
        params={"mailbox": MAILBOX, "unread": True, "limit": 20}
    )
    resp.raise_for_status()
    return resp.json()["emails"]

def read_email(email_id):
    resp = requests.get(
        f"{BASE_URL}/read_email/{email_id}",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"}
    )
    resp.raise_for_status()
    return resp.json()

def tag_email(email_id, tag):
    resp = requests.post(
        f"{BASE_URL}/tag_email",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"},
        json={"email_id": email_id, "tag": tag}
    )
    resp.raise_for_status()

def triage_inbox():
    emails = fetch_unread()
    for summary in emails:
        email = read_email(summary["id"])
        text = f"{email['subject']}\n\n{email['body'][:512]}"

        result = classifier(text, CANDIDATE_LABELS)
        top_label = result["labels"][0]
        top_score = result["scores"][0]

        if top_score > 0.6:
            tag_email(email["id"], top_label)
            print(f"Tagged {email['id']} as '{top_label}' ({top_score:.2f})")
        else:
            tag_email(email["id"], "needs-review")
            print(f"Low confidence for {email['id']} — tagged needs-review")

if __name__ == "__main__":
    triage_inbox()

Use a Hugging Face zero-shot classifier to categorize inbound emails and tag them via the MultiMail API. This runs against a live mailbox and requires no fine-tuning.

Draft reply generation with gated send
python
import requests

MULTIMAIL_API_KEY = "mm_live_your_key_here"
HF_API_TOKEN = "hf_your_token_here"
HF_ENDPOINT = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3"
BASE_URL = "https://api.multimail.dev"

def generate_reply(original_email: dict) -> str:
    prompt = f"""<s>[INST] You are a professional customer support agent. Write a helpful, concise reply to the following email.

From: {original_email['sender']}
Subject: {original_email['subject']}

{original_email['body'][:1000]}

Reply: [/INST]"""

    resp = requests.post(
        HF_ENDPOINT,
        headers={"Authorization": f"Bearer {HF_API_TOKEN}"},
        json={
            "inputs": prompt,
            "parameters": {"max_new_tokens": 300, "temperature": 0.4}
        }
    )
    resp.raise_for_status()
    generated = resp.json()[0]["generated_text"]
    # Strip the prompt from the output
    return generated[len(prompt):].strip()

def send_gated_reply(email_id: str, reply_body: str):
    # With gated_send mode on the mailbox, this enters the approval queue
    resp = requests.post(
        f"{BASE_URL}/reply_email",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"},
        json={
            "email_id": email_id,
            "body": reply_body,
            "oversight_mode": "gated_send"
        }
    )
    resp.raise_for_status()
    return resp.json()

def process_email(email_id: str):
    # Read the email
    resp = requests.get(
        f"{BASE_URL}/read_email/{email_id}",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"}
    )
    resp.raise_for_status()
    email = resp.json()

    draft = generate_reply(email)
    result = send_gated_reply(email_id, draft)

    print(f"Draft queued for approval: {result['pending_id']}")
    print(f"Approve at: https://app.multimail.dev/pending/{result['pending_id']}")
    return result

Generate a draft reply using a Hugging Face Inference Endpoint, then submit it through MultiMail's gated_send mode. The reply enters the human approval queue before delivery.

Webhook handler for inbound email → pipeline trigger
python
from fastapi import FastAPI, Request, HTTPException
from transformers import pipeline
import requests
import hmac
import hashlib
import os

app = FastAPI()

MULTIMAIL_WEBHOOK_SECRET = os.environ["MULTIMAIL_WEBHOOK_SECRET"]
MULTIMAIL_API_KEY = os.environ["MULTIMAIL_API_KEY"]
BASE_URL = "https://api.multimail.dev"

sentiment = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest"
)

def verify_signature(payload: bytes, signature: str) -> bool:
    expected = hmac.new(
        MULTIMAIL_WEBHOOK_SECRET.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)

@app.post("/webhook/inbound")
async def handle_inbound(request: Request):
    payload = await request.body()
    sig = request.headers.get("X-MultiMail-Signature", "")

    if not verify_signature(payload, sig):
        raise HTTPException(status_code=401, detail="Invalid signature")

    event = await request.json()
    if event.get("type") != "email.received":
        return {"status": "ignored"}

    email_id = event["data"]["email_id"]

    # Fetch full email content
    email = requests.get(
        f"{BASE_URL}/read_email/{email_id}",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"}
    ).json()

    # Run sentiment analysis
    result = sentiment(email["body"][:512])[0]
    label = result["label"].lower()  # positive, negative, neutral
    score = result["score"]

    # Tag based on sentiment
    tag = "negative-feedback" if label == "negative" and score > 0.8 else "processed"
    requests.post(
        f"{BASE_URL}/tag_email",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"},
        json={"email_id": email_id, "tag": tag}
    )

    print(f"Email {email_id}: sentiment={label} ({score:.2f}), tagged={tag}")
    return {"status": "ok", "sentiment": label, "score": score}

A FastAPI webhook endpoint that receives MultiMail inbound events and dispatches a Hugging Face sentiment analysis pipeline on the email body.

Retrieval-augmented email search with sentence transformers
python
import requests
import numpy as np
from sentence_transformers import SentenceTransformer

MULTIMAIL_API_KEY = "mm_live_your_key_here"
BASE_URL = "https://api.multimail.dev"

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def get_thread_context(thread_id: str) -> list[dict]:
    resp = requests.get(
        f"{BASE_URL}/get_thread/{thread_id}",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"}
    )
    resp.raise_for_status()
    return resp.json()["messages"]

def find_similar_contacts(query: str, limit: int = 5) -> list[dict]:
    resp = requests.get(
        f"{BASE_URL}/search_contacts",
        headers={"Authorization": f"Bearer {MULTIMAIL_API_KEY}"},
        params={"query": query, "limit": limit}
    )
    resp.raise_for_status()
    return resp.json()["contacts"]

def rank_messages_by_similarity(
    query: str,
    messages: list[dict],
    top_k: int = 3
) -> list[dict]:
    if not messages:
        return []

    query_embedding = model.encode(query)
    message_embeddings = model.encode([m["body"][:512] for m in messages])

    scores = np.dot(message_embeddings, query_embedding) / (
        np.linalg.norm(message_embeddings, axis=1) * np.linalg.norm(query_embedding)
    )

    ranked_indices = np.argsort(scores)[::-1][:top_k]
    return [
        {**messages[i], "similarity_score": float(scores[i])}
        for i in ranked_indices
    ]

def analyze_thread(thread_id: str, user_query: str) -> dict:
    messages = get_thread_context(thread_id)
    relevant = rank_messages_by_similarity(user_query, messages)

    return {
        "thread_id": thread_id,
        "query": user_query,
        "relevant_messages": relevant,
        "message_count": len(messages)
    }

Fetch a thread via MultiMail's get_thread endpoint and use sentence-transformers embeddings to rank its messages by semantic similarity to a query. The search_contacts helper adds keyword-based contact lookup for the same workflow.


Step by step

1

Install dependencies

Install the Transformers library and the requests library for making MultiMail API calls.

bash
pip install transformers torch requests
# For sentence embeddings:
pip install sentence-transformers
# For serving a webhook endpoint:
pip install fastapi uvicorn
2

Create a MultiMail account and mailbox

Sign up at multimail.dev, copy your API key from the dashboard (it starts with mm_live_), and create a mailbox. For testing, use the mm_test_ key — it records actions without delivering email.

bash
curl -X POST https://api.multimail.dev/create_mailbox \
  -H "Authorization: Bearer $MULTIMAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"address": "[email protected]", "oversight_mode": "gated_send"}'
3

Verify your Hugging Face pipeline can read an email

Pull an email from the inbox and pass its body through a Transformers pipeline to confirm the data flow works end to end before building automation logic.

python
import requests
from transformers import pipeline

API_KEY = "mm_live_your_key_here"

# Fetch the inbox
emails = requests.get(
    "https://api.multimail.dev/check_inbox",
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"mailbox": "[email protected]", "limit": 1}
).json()["emails"]

if emails:
    email = requests.get(
        f"https://api.multimail.dev/read_email/{emails[0]['id']}",
        headers={"Authorization": f"Bearer {API_KEY}"}
    ).json()

    classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    result = classifier(email["body"][:512], ["urgent", "routine", "spam"])
    print(result["labels"][0], result["scores"][0])
4

Set up a webhook for inbound email events

Register a webhook URL so MultiMail calls your endpoint when new email arrives. Use ngrok or a staging server URL during development. Register the webhook from the dashboard or via API.

bash
curl -X POST https://api.multimail.dev/webhooks \
  -H "Authorization: Bearer $MULTIMAIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-app.multimail.dev/webhook/inbound",
    "events": ["email.received", "email.approved", "email.rejected"],
    "mailbox": "[email protected]"
  }'
5

Test the approval queue flow

Send a test email through the gated_send path and confirm it appears in the pending queue before delivery. This validates that your oversight mode is correctly configured.

python
import requests

API_KEY = "mm_live_your_key_here"

# Send via gated_send — this queues for approval, not immediate delivery
resp = requests.post(
    "https://api.multimail.dev/send_email",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "from": "[email protected]",
        "to": "[email protected]",
        "subject": "Test from Hugging Face pipeline",
        "body": "This email was drafted by a Transformers model and is pending approval.",
        "oversight_mode": "gated_send"
    }
)
print(resp.json())  # {"status": "pending", "pending_id": "pend_..."}

# List pending approvals
pending = requests.get(
    "https://api.multimail.dev/list_pending",
    headers={"Authorization": f"Bearer {API_KEY}"}
).json()
print(f"{len(pending['items'])} items awaiting approval")

Common questions

Do I need the MultiMail Python SDK to use this with Hugging Face?
No. MultiMail exposes a standard REST API, and the Hugging Face Transformers library has no special SDK requirement. Use the requests library or httpx to call MultiMail endpoints directly. The Python SDK (multimail-sdk) is optional and provides convenience wrappers, but the raw API is sufficient and preferred if you want to keep your dependency footprint small.
Can I use Hugging Face Inference Endpoints instead of running models locally?
Yes. Hugging Face Inference Endpoints expose a standard HTTP API that you can call from anywhere. The integration pattern is the same: your application code calls the Inference Endpoint to generate text, then calls MultiMail to send or act on the result. There is no coupling between the inference provider and the email API.
How does MultiMail handle cases where the model generates a harmful or incorrect email?
MultiMail does not inspect email content for harm — that is the responsibility of your application layer. What MultiMail does provide is the gated_send and gated_all oversight modes, which route all outbound email through a human approval queue before delivery. A human reviewer sees the draft before it reaches the recipient. For automated pipelines where human review is not feasible, you should implement content validation before calling the MultiMail API.
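A minimal pre-send validation sketch along those lines; the specific checks are illustrative and should be tuned to your own risk profile.

```python
import re

MAX_BODY_CHARS = 5000

def validate_draft(body: str) -> list[str]:
    """Return a list of problems found in a drafted email body.

    These checks are illustrative examples, not a complete safety layer.
    """
    issues = []
    if not body.strip():
        issues.append("empty body")
    if len(body) > MAX_BODY_CHARS:
        issues.append("body exceeds length limit")
    # Crude check for leaked 16-digit card-like numbers
    if re.search(r"\b\d{16}\b", body):
        issues.append("possible card number in body")
    # Catch leftover prompt/template markers from the model
    if "[INST]" in body or "</s>" in body:
        issues.append("prompt template residue in body")
    return issues
```

Run the draft through validate_draft before calling reply_email; if the list is non-empty, route the email to the needs-review path instead of sending.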
Can I use smolagents with MultiMail?
Yes. smolagents is Hugging Face's agent framework and supports tool use. You can wrap MultiMail API calls as smolagents tools using the @tool decorator. The MultiMail MCP server (44 tools) is the most complete integration path if your smolagents setup supports MCP. Otherwise, define individual Python functions for send_email, check_inbox, and reply_email and register them as tools directly.
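A minimal sketch of one such tool, using the check_inbox endpoint shown earlier. The smolagents registration is kept in a comment so the function also works standalone.

```python
import requests

BASE_URL = "https://api.multimail.dev"
API_KEY = "mm_live_your_key_here"

def check_inbox(mailbox: str, limit: int = 10) -> list[dict]:
    """List recent emails in a mailbox.

    Args:
        mailbox: The mailbox address to check.
        limit: Maximum number of emails to return.
    """
    resp = requests.get(
        f"{BASE_URL}/check_inbox",
        headers={"Authorization": f"Bearer {API_KEY}"},
        params={"mailbox": mailbox, "limit": limit},
    )
    resp.raise_for_status()
    return resp.json()["emails"]

# With smolagents installed, register it as a tool:
#
#   from smolagents import tool, CodeAgent
#
#   check_inbox_tool = tool(check_inbox)
#   agent = CodeAgent(tools=[check_inbox_tool], model=...)
```

smolagents reads the docstring and type hints to describe the tool to the model, so keep both accurate.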
What oversight mode should I start with when I'm still evaluating my pipeline?
Use gated_all during evaluation. This requires human approval for every action — reads, sends, and replies — which gives you full visibility into what the model is doing without any automated delivery. Once you have confidence in the model's outputs for read operations, switch the mailbox to gated_send, which makes reads autonomous but keeps sends in the approval queue.
Does MultiMail log which model generated each email for compliance purposes?
MultiMail logs the API call chain — which API key, which endpoint, which parameters, and which approval events occurred. It does not automatically record which model generated the content, because MultiMail is not in the inference path. If you need model provenance for GDPR or internal audit purposes, pass it as metadata in your API call using the metadata field on send_email or reply_email. That metadata is stored with the email record and included in audit logs.
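A sketch of attaching model provenance via the metadata field. The metadata keys below are our own naming convention, not fields MultiMail interprets; POST the resulting payload to send_email as in the earlier examples.

```python
def build_send_payload(
    to: str,
    subject: str,
    body: str,
    model_id: str,
    revision: str,
) -> dict:
    """Build a send_email payload that records which model drafted the body.

    The metadata keys are an illustrative convention; MultiMail stores
    them with the email record and surfaces them in audit logs.
    """
    return {
        "to": to,
        "subject": subject,
        "body": body,
        "metadata": {
            "model_id": model_id,        # e.g. "mistralai/Mistral-7B-Instruct-v0.3"
            "model_revision": revision,  # pin the exact checkpoint for audits
        },
    }
```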
Can I run this integration entirely on-premises or in a private cloud?
Hugging Face models can run entirely on your own infrastructure using the Transformers library locally or on-premises Inference Endpoints. MultiMail is a cloud API (api.multimail.dev), so email actions will always go through MultiMail's hosted service. If you require fully on-premises email handling, MultiMail is not the right fit — it is a hosted API, not a self-hosted library.

Explore more

The only agent email with a verifiable sender

Email infrastructure built for AI agents. Verifiable identity, graduated oversight, and a 38-tool MCP server. Formally verified in Lean 4.