Local Email Agents with Ollama

Run open-source LLMs locally with Ollama and connect them to MultiMail for email capabilities — with human oversight as a safety net for local models.


Ollama lets you run open-source LLMs like Llama 3, Mistral, and Qwen locally with an OpenAI-compatible API. MultiMail provides the email infrastructure layer that turns local models into functional email agents capable of sending, receiving, and managing email.

Local LLMs via Ollama may have weaker instruction following than cloud models, making human oversight especially important. MultiMail's default gated_send mode ensures every email drafted by a local model requires human approval before delivery, protecting against lower-quality model outputs.

Connect Ollama to MultiMail using the OpenAI-compatible API with tool calling support. Your data stays local during inference while MultiMail handles the email transport, creating a privacy-friendly architecture for sensitive email workflows.

Built for Ollama developers

Safety Net for Local Models

Local LLMs may produce lower-quality outputs than cloud models. MultiMail's oversight modes are especially critical here — gated_send ensures every email is human-reviewed before delivery, catching issues that local models are more prone to.

Privacy-Friendly Architecture

Your prompts and reasoning stay on your machine with Ollama. Only the final email content is sent through MultiMail's API, creating a hybrid architecture that minimizes data exposure while providing full email capabilities.

OpenAI-Compatible API

Ollama provides an OpenAI-compatible API, so you can use the same tool calling patterns as cloud providers. Define MultiMail email tools once and they work with any local model that supports function calling.

Zero Inference Cost

Running models locally with Ollama means no per-token costs. Combined with MultiMail's free Starter tier (200 emails/month), you can prototype email agents with zero running costs.

Graduated Trust via Oversight Modes

Start with gated_all (human approves every action) for untested local models, then move to gated_send as you validate quality. MultiMail's five oversight modes let you safely experiment with different models.
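As a sketch of graduated trust in code: the endpoint shape and the `oversight_mode` field name below are assumptions, not confirmed MultiMail API details — check the API reference before relying on them.

```python
import requests

MULTIMAIL_API = "https://api.multimail.dev/v1"
HEADERS = {"Authorization": "Bearer mm_live_your_api_key"}

def build_mode_update(mailbox_id, mode):
    # URL and payload for switching a mailbox's oversight mode.
    # PATCH /mailboxes/{id} with an "oversight_mode" field is an assumption.
    return f"{MULTIMAIL_API}/mailboxes/{mailbox_id}", {"oversight_mode": mode}

def set_oversight_mode(mailbox_id, mode):
    url, payload = build_mode_update(mailbox_id, mode)
    resp = requests.patch(url, headers=HEADERS, json=payload)
    resp.raise_for_status()
    return resp.json()

# Tighten while the model is unproven, relax once it earns trust:
# set_oversight_mode("mbx_abc123", "gated_all")
# set_oversight_mode("mbx_abc123", "gated_send")
```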


Get started in minutes

Define MultiMail Tools for Ollama
python
import ollama
import requests
import json

MULTIMAIL_API = "https://api.multimail.dev/v1"
MM_HEADERS = {"Authorization": "Bearer mm_live_your_api_key"}

email_tools = [
    {
        "type": "function",
        "function": {
            "name": "send_email",
            "description": "Send an email through MultiMail. In gated_send mode, queues for human approval.",
            "parameters": {
                "type": "object",
                "properties": {
                    "mailbox_id": {"type": "string", "description": "Mailbox to send from"},
                    "to": {"type": "string", "description": "Recipient email"},
                    "subject": {"type": "string", "description": "Subject line"},
                    "body": {"type": "string", "description": "Email body"}
                },
                "required": ["mailbox_id", "to", "subject", "body"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "check_inbox",
            "description": "Check inbox for recent messages.",
            "parameters": {
                "type": "object",
                "properties": {
                    "mailbox_id": {"type": "string", "description": "Mailbox to check"},
                    "limit": {"type": "integer", "description": "Max messages"}
                },
                "required": ["mailbox_id"]
            }
        }
    }
]

Create tool definitions using Ollama's OpenAI-compatible function calling format.

Build a Local Email Agent with Ollama
python
def execute_tool(name, args):
    if name == "send_email":
        resp = requests.post(f"{MULTIMAIL_API}/send", headers=MM_HEADERS, json=args)
    elif name == "check_inbox":
        resp = requests.get(
            f"{MULTIMAIL_API}/mailboxes/{args['mailbox_id']}/inbox",
            headers=MM_HEADERS, params={"limit": args.get("limit", 10)}
        )
    elif name == "reply_email":
        # Note: add a matching reply_email schema to email_tools before the model can call this.
        resp = requests.post(f"{MULTIMAIL_API}/reply", headers=MM_HEADERS, json=args)
    else:
        return {"error": f"Unknown tool: {name}"}
    return resp.json()

def run_local_email_agent(user_message, mailbox_id):
    messages = [
        {"role": "system", "content": f"You are an email assistant for mailbox {mailbox_id}. "
         f"Emails use gated_send mode and queue for human approval."},
        {"role": "user", "content": user_message}
    ]
    while True:
        response = ollama.chat(
            model="llama3.3",
            messages=messages,
            tools=email_tools
        )
        msg = response["message"]
        if msg.get("tool_calls"):
            messages.append(msg)
            for tc in msg["tool_calls"]:
                result = execute_tool(
                    tc["function"]["name"],
                    tc["function"]["arguments"]
                )
                messages.append({
                    "role": "tool",
                    "content": json.dumps(result)
                })
        else:
            return msg["content"]

print(run_local_email_agent("Check my inbox", "mbx_abc123"))

Create an agentic loop using Ollama's chat API with MultiMail tools.

OpenAI-Compatible Client with Ollama
python
from openai import OpenAI
import json

# Point OpenAI client at Ollama's local server
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama"  # Ollama doesn't require a real key
)

email_functions = [
    {
        "type": "function",
        "function": {
            "name": "send_email",
            "description": "Send an email. Queues for human approval in gated_send mode.",
            "parameters": {
                "type": "object",
                "properties": {
                    "mailbox_id": {"type": "string"},
                    "to": {"type": "string"},
                    "subject": {"type": "string"},
                    "body": {"type": "string"}
                },
                "required": ["mailbox_id", "to", "subject", "body"]
            }
        }
    }
]

# Same code as OpenAI, running locally
response = client.chat.completions.create(
    model="llama3.3",
    messages=[
        {"role": "system", "content": "Email assistant. gated_send mode."},
        {"role": "user", "content": "Draft an email to [email protected] about the meeting"}
    ],
    tools=email_functions
)

if response.choices[0].message.tool_calls:
    tc = response.choices[0].message.tool_calls[0]
    args = json.loads(tc.function.arguments)
    print(f"Would send to: {args['to']}")
    print(f"Subject: {args['subject']}")

Use the OpenAI Python SDK pointed at Ollama's local server for a familiar API experience.


Step by step

1. Create a MultiMail Account and API Key

Sign up at multimail.dev, create a mailbox, and generate an API key from your dashboard. Your key will start with mm_live_.
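To keep the key out of source code, you can read it from an environment variable. The variable name `MULTIMAIL_API_KEY` here is a suggestion, not part of any MultiMail SDK:

```python
import os

def get_api_headers():
    # Read the API key from the environment instead of hard-coding it.
    # MULTIMAIL_API_KEY is a suggested name, not a MultiMail requirement.
    key = os.environ.get("MULTIMAIL_API_KEY", "")
    if not key.startswith("mm_live_"):
        raise RuntimeError("Set MULTIMAIL_API_KEY to a key beginning with mm_live_")
    return {"Authorization": f"Bearer {key}"}
```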

2. Install Ollama and Pull a Model

Install Ollama and download a model with tool calling support like Llama 3.3.

bash
brew install ollama && ollama pull llama3.3
3. Install Python Dependencies

Install the Ollama Python library and requests for calling the MultiMail API.

bash
pip install ollama requests
4. Build the Agent Loop

Define email tools and implement the agent loop using Ollama's chat API with tool calling.

python
response = ollama.chat(
    model="llama3.3",
    messages=messages,
    tools=email_tools
)
5. Approve Pending Emails

Review and approve pending emails in the MultiMail dashboard. This step is especially important with local models that may produce lower-quality outputs.
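The dashboard is the documented place to review gated emails; as a hedged sketch, an approval queue could also be scanned programmatically. The `/pending` endpoint below is an assumption, not a confirmed MultiMail route:

```python
import requests

MULTIMAIL_API = "https://api.multimail.dev/v1"
HEADERS = {"Authorization": "Bearer mm_live_your_api_key"}

def summarize_pending(emails):
    # One-line summary per queued email, for a quick human scan.
    return [f"{e['to']} | {e['subject']}" for e in emails]

def review_queue(mailbox_id):
    # Hypothetical endpoint listing emails held by gated_send.
    resp = requests.get(
        f"{MULTIMAIL_API}/mailboxes/{mailbox_id}/pending", headers=HEADERS
    )
    resp.raise_for_status()
    for line in summarize_pending(resp.json()):
        print(line)
```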


Common questions

Which Ollama models support tool calling for email agents?
Llama 3.3, Mistral, and Qwen 2.5 models support tool calling in Ollama. Larger models (70B+) generally produce better email content, but 8B models work for simple triage tasks. Check Ollama's model library for the latest tool-calling support.
Why is oversight more important with local models?
Local models may have weaker instruction following than cloud models like GPT-4 or Claude. They are more likely to generate inappropriate email content or malformed tool calls. MultiMail's gated_send mode catches these issues before emails are delivered.
Does my email data stay private with Ollama?
Your prompts and model reasoning run entirely on your local machine. Only the final email content (to, subject, body) is sent to MultiMail's API for delivery. Inbox reads also go through MultiMail's API, but the model's analysis of that content stays local.
Can I use Ollama's API with the OpenAI Python SDK?
Yes. Ollama exposes an OpenAI-compatible endpoint at localhost:11434/v1. Point the OpenAI Python SDK at this URL and use the same tool calling code you would use with OpenAI's cloud API. This makes it easy to switch between local and cloud models.
Is there rate limiting on the MultiMail API?
Rate limits depend on your plan tier. The Starter (free) plan allows 200 emails per month, while paid plans range from 5,000 to 150,000. Combined with Ollama's zero inference cost, the free tier is a great starting point for local email agents.
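The exact rate-limit response isn't specified here; as a minimal sketch, assuming the API returns HTTP 429 when a limit is hit, a small exponential backoff keeps an agent resilient near its quota:

```python
import time
import requests

def backoff_delays(retries, base=1.0):
    # Exponential schedule: base, 2*base, 4*base, ...
    return [base * (2 ** i) for i in range(retries)]

def post_with_retry(url, headers, payload, retries=3):
    # Retry on HTTP 429 (rate limited); any other status returns immediately.
    # The 429 status code is an assumption about MultiMail's behavior.
    resp = None
    for delay in backoff_delays(retries):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code != 429:
            return resp
        time.sleep(delay)
    return resp
```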

Explore more

The only agent email with a verifiable sender

Email infrastructure built for AI agents. Verifiable identity, graduated oversight, and a 38-tool MCP server. Formally verified in Lean 4.