Structured Email Extraction with Instructor

Use Instructor's Pydantic-powered structured outputs to extract email data from natural language, then send through MultiMail with configurable human oversight.


Instructor is a library for getting structured outputs from LLMs using Pydantic models. It patches LLM client libraries to return validated, typed objects instead of raw text, with automatic retry on validation failure. MultiMail provides the email delivery infrastructure that turns Instructor's structured email data into actual sent messages.

The combination is powerful: Instructor extracts structured email fields (recipients, subject, body, urgency) from natural language requests, validates them with Pydantic, and MultiMail handles the delivery with human oversight. This ensures emails are both well-structured and appropriately reviewed before sending.

Integrate Instructor with MultiMail by defining Pydantic models for email data, extracting structured content with Instructor, and calling the MultiMail REST API to send. No special SDK required — just Pydantic models and HTTP requests.

Built for Instructor developers

Validated Email Structure

Instructor ensures the LLM produces valid email data — proper email addresses, non-empty subjects, appropriate body length — through Pydantic validation with automatic retries on failure.

Natural Language to Email

Users describe what they want to send in natural language. Instructor extracts the structured email fields, validates them, and MultiMail delivers the result. The entire pipeline is type-safe and validated.

Automatic Retry on Validation Failure

If the LLM produces an invalid email structure (missing fields, bad format), Instructor automatically retries with the validation error as feedback. This self-healing loop produces reliable email data without manual intervention.
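The retry loop described above can be pictured with a small, dependency-free sketch. This is not Instructor's implementation: `validate_draft`, `extract_with_retries`, and `fake_llm` are hypothetical names, and a scripted function stands in for the model so the example runs offline. It only illustrates the idea of feeding the validation error back into the next attempt.

```python
from typing import Callable, Optional


def validate_draft(draft: dict) -> None:
    """Raise ValueError if the draft breaks the structural rules."""
    if "@" not in draft.get("to", ""):
        raise ValueError("'to' must be an email address")
    if not draft.get("subject", "").strip():
        raise ValueError("Subject cannot be empty")
    if len(draft.get("body", "")) < 20:
        raise ValueError("Body must be at least 20 characters")


def extract_with_retries(
    llm: Callable[[str, Optional[str]], dict],
    request: str,
    max_retries: int = 3,
) -> dict:
    """Call the model, re-prompting with the last validation error on failure."""
    feedback: Optional[str] = None
    for _ in range(max_retries):
        draft = llm(request, feedback)
        try:
            validate_draft(draft)
            return draft
        except ValueError as err:
            feedback = str(err)  # handed to the next attempt as a correction hint
    raise ValueError(f"No valid draft after {max_retries} attempts: {feedback}")


def fake_llm(request: str, feedback: Optional[str]) -> dict:
    """Scripted stand-in for a model: wrong at first, corrected once it sees feedback."""
    subject = "Deploy moved to Thursday" if feedback else ""
    return {
        "to": "team@example.com",
        "subject": subject,
        "body": "Hi team, the deploy has been pushed to Thursday.",
    }


draft = extract_with_retries(fake_llm, "Tell the team the deploy moved to Thursday")
print(draft["subject"])
```

In the scripted run, the first attempt fails the subject check and the second attempt succeeds once the error message is passed back in.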

Human Oversight After Extraction

Instructor ensures emails are well-structured, but cannot judge appropriateness. MultiMail's gated_send mode adds human review after extraction, catching issues that structural validation misses.

Multi-Provider Support

Instructor patches multiple LLM providers (OpenAI, Anthropic, Cohere, Mistral). Use any supported provider for email extraction while MultiMail handles delivery consistently across all of them.


Get started in minutes

Define Email Models and Extract
python
import instructor
import requests
from openai import OpenAI
from pydantic import BaseModel, EmailStr, field_validator

MULTIMAIL_API = "https://api.multimail.dev/v1"
HEADERS = {"Authorization": "Bearer mm_live_your_api_key"}

class EmailDraft(BaseModel):
    to: EmailStr  # validated address; needs the email-validator package (pip install "pydantic[email]")
    subject: str
    body: str
    urgency: str = "normal"

    @field_validator("subject")
    @classmethod
    def subject_not_empty(cls, v):
        if not v.strip():
            raise ValueError("Subject cannot be empty")
        return v

    @field_validator("body")
    @classmethod
    def body_min_length(cls, v):
        if len(v) < 20:
            raise ValueError("Body must be at least 20 characters")
        return v

client = instructor.from_openai(OpenAI())

draft = client.chat.completions.create(
    model="gpt-4o",
    response_model=EmailDraft,
    messages=[{
        "role": "user",
        "content": "Send a follow-up to [email protected] about the proposal we discussed yesterday. Mention the $50k budget and Q3 timeline."
    }]
)
print(f"To: {draft.to}, Subject: {draft.subject}")

Create Pydantic models for email data and use Instructor to extract structured emails from natural language.

Send Extracted Email via MultiMail
python
def send_via_multimail(draft: EmailDraft, mailbox_id: str) -> dict:
    """Send a validated EmailDraft through MultiMail."""
    resp = requests.post(
        f"{MULTIMAIL_API}/send",
        headers=HEADERS,
        json={
            "mailbox_id": mailbox_id,
            "to": draft.to,
            "subject": draft.subject,
            "body": draft.body,
        },
        timeout=30,
    )
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return resp.json()

# Extract and send in one flow
def compose_and_send(user_request: str, mailbox_id: str) -> dict:
    """Extract structured email from natural language and send via MultiMail."""
    draft = client.chat.completions.create(
        model="gpt-4o",
        response_model=EmailDraft,
        max_retries=3,  # Auto-retry on validation failure
        messages=[{
            "role": "system",
            "content": "Extract a professional email from the user's request. "
            "Include proper greeting and sign-off."
        }, {
            "role": "user",
            "content": user_request
        }]
    )
    # In gated_send mode, this queues for human approval
    return send_via_multimail(draft, mailbox_id)

result = compose_and_send(
    "Email the team at [email protected] that the deploy is pushed to Thursday",
    "your_mailbox_id"
)

Take the structured email from Instructor and send it through MultiMail's API with oversight.
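One way to exercise the send step without network access is to keep payload construction separate from the HTTP call. The sketch below is an illustration under assumptions: `Draft` is a plain dataclass standing in for `EmailDraft`, `build_send_payload` and `send_draft` are hypothetical helpers, and the fake response is a placeholder rather than MultiMail's real response shape. The payload fields are the same four used in the request above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Draft:
    """Plain stand-in for the EmailDraft model so the sketch runs without Pydantic."""
    to: str
    subject: str
    body: str


def build_send_payload(draft: Draft, mailbox_id: str) -> dict:
    """Build the JSON body with the same fields as the send request above."""
    return {
        "mailbox_id": mailbox_id,
        "to": draft.to,
        "subject": draft.subject,
        "body": draft.body,
    }


def send_draft(post: Callable[[str, dict], dict], draft: Draft, mailbox_id: str) -> dict:
    """Send through an injected post(path, payload) callable (hypothetical helper)."""
    return post("/send", build_send_payload(draft, mailbox_id))


# Offline check: a recording fake replaces the real HTTP call.
sent: List[Tuple[str, dict]] = []


def fake_post(path: str, payload: dict) -> dict:
    sent.append((path, payload))
    return {"ok": True}  # placeholder, not MultiMail's real response shape


result = send_draft(
    fake_post,
    Draft("team@example.com", "Deploy update", "The deploy moves to Thursday."),
    "your_mailbox_id",
)
print(sent[0][1]["to"])
```

In real use, `post` would be a thin wrapper around `requests.post` that adds the base URL and the authorization header.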

Classify and Route Inbound Emails
python
from enum import Enum
from typing import Optional

class EmailCategory(str, Enum):
    SUPPORT = "support"
    SALES = "sales"
    BILLING = "billing"
    SPAM = "spam"
    OTHER = "other"

class EmailClassification(BaseModel):
    category: EmailCategory
    urgency: str
    summary: str
    suggested_action: str
    auto_reply_appropriate: bool

def classify_email(email_subject: str, email_body: str) -> EmailClassification:
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=EmailClassification,
        messages=[{
            "role": "user",
            "content": f"Classify this email:\nSubject: {email_subject}\n\n{email_body}"
        }]
    )

# Fetch and classify inbox
resp = requests.get(
    f"{MULTIMAIL_API}/mailboxes/your_mailbox_id/inbox",
    headers=HEADERS, params={"limit": 10}
)
for email in resp.json().get("emails", []):
    classification = classify_email(email["subject"], email["body"])
    print(f"{email['subject']}: {classification.category} ({classification.urgency})")
    if classification.auto_reply_appropriate:
        # Generate and send auto-reply through MultiMail
        pass

Use Instructor to classify inbound emails from MultiMail and route them to appropriate handlers.
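The routing half of this step can be sketched as a dispatch table keyed by category. The handler functions and `route_email` below are hypothetical names chosen for illustration, and the enum is repeated only so the block runs on its own. Unmapped categories fall through to a review handler rather than being dropped.

```python
from enum import Enum
from typing import Callable, Dict


class EmailCategory(str, Enum):
    """Mirrors the categories used by the classifier above."""
    SUPPORT = "support"
    SALES = "sales"
    BILLING = "billing"
    SPAM = "spam"
    OTHER = "other"


# Hypothetical handlers: each takes the raw email dict and returns a short status string.
def open_ticket(email: dict) -> str:
    return f"ticket opened: {email['subject']}"


def notify_sales(email: dict) -> str:
    return f"sales notified: {email['subject']}"


def forward_to_billing(email: dict) -> str:
    return f"billing notified: {email['subject']}"


def discard(email: dict) -> str:
    return "discarded"


def hold_for_review(email: dict) -> str:
    return f"held for review: {email['subject']}"


ROUTES: Dict[EmailCategory, Callable[[dict], str]] = {
    EmailCategory.SUPPORT: open_ticket,
    EmailCategory.SALES: notify_sales,
    EmailCategory.BILLING: forward_to_billing,
    EmailCategory.SPAM: discard,
}


def route_email(email: dict, category: EmailCategory) -> str:
    """Dispatch to the handler for this category; unmapped categories go to review."""
    return ROUTES.get(category, hold_for_review)(email)


print(route_email({"subject": "Invoice question"}, EmailCategory.BILLING))
```

Inside the inbox loop, the call would look like `route_email(email, classification.category)`.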

Batch Email Extraction
python
from typing import Iterable

def extract_multiple_emails(request: str) -> list[EmailDraft]:
    """Extract multiple email drafts from a batch request."""
    drafts = client.chat.completions.create(
        model="gpt-4o",
        response_model=Iterable[EmailDraft],
        messages=[{
            "role": "system",
            "content": "Extract each individual email from the user's request. "
            "Each email should have a distinct recipient, subject, and body."
        }, {
            "role": "user",
            "content": request
        }]
    )
    # Materialize the iterable so the return value matches the list annotation
    return list(drafts)

# Extract and send batch
drafts = extract_multiple_emails(
    "Send updates to the team: tell [email protected] the design is approved, "
    "tell [email protected] we need the API docs by Friday, "
    "and tell [email protected] the budget is confirmed."
)

for draft in drafts:
    result = send_via_multimail(draft, "your_mailbox_id")
    print(f"Sent to {draft.to}: {result}")

Extract multiple emails from a single natural language request using Instructor's iterable support.
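When sending a batch, it can help to isolate failures so that one rejected send does not stop the remaining drafts. The sketch below assumes a hypothetical `send_batch` helper, uses a plain `Draft` dataclass in place of `EmailDraft`, and replaces the real send call with an offline fake whose return value is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Draft:
    """Plain stand-in for EmailDraft so the sketch runs without Pydantic."""
    to: str
    subject: str
    body: str


def send_batch(
    drafts: Iterable[Draft],
    send: Callable[[Draft], dict],
) -> Tuple[List[Tuple[str, dict]], List[Tuple[str, str]]]:
    """Send each draft in turn, recording failures instead of aborting the batch."""
    sent, failed = [], []
    for draft in drafts:
        try:
            sent.append((draft.to, send(draft)))
        except Exception as err:  # broad on purpose: one failed send should not stop the rest
            failed.append((draft.to, str(err)))
    return sent, failed


def fake_send(draft: Draft) -> dict:
    """Offline stand-in for the real send call; rejects one recipient to show the failure path."""
    if draft.to == "bounce@example.com":
        raise RuntimeError("simulated delivery error")
    return {"ok": True}  # placeholder, not MultiMail's real response shape


drafts = [
    Draft("design@example.com", "Design approved", "The design has been approved."),
    Draft("bounce@example.com", "API docs", "We need the API docs by Friday."),
    Draft("finance@example.com", "Budget", "The budget is confirmed."),
]
sent, failed = send_batch(drafts, fake_send)
print(f"{len(sent)} sent, {len(failed)} failed")
```

With the real pipeline, `send` could be `lambda d: send_via_multimail(d, "your_mailbox_id")`, and the `failed` list gives the application something concrete to report or retry.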


Step by step

1

Create a MultiMail Account and API Key

Sign up at multimail.dev, create a mailbox, and generate an API key. Your key will start with mm_live_.

2

Install Dependencies

Install Instructor, an LLM client library, and requests for calling the MultiMail API.

bash
pip install instructor openai requests

3

Define Email Models

Create Pydantic models for email data with field validators for content quality checks.

4

Extract and Send

Use Instructor to extract structured email data from natural language, then send through MultiMail's REST API.

python
draft = client.chat.completions.create(
    model="gpt-4o",
    response_model=EmailDraft,
    messages=[{"role": "user", "content": "Send a meeting follow-up to..."}]
)

5

Review Pending Emails

If using gated_send mode (the default), approve or reject pending emails in the MultiMail dashboard before delivery.


Common questions

How does Instructor's validation help with email quality?
Instructor validates the LLM's output against your Pydantic model, so fields like email addresses, subject lines, and body content must meet the criteria you define. If validation fails, Instructor retries with the error as feedback so the LLM can correct its output. Every email that reaches MultiMail has therefore passed your structural checks.

Can I use Instructor with models other than OpenAI?
Yes. Instructor supports multiple LLM providers including Anthropic, Cohere, Mistral, and others. The email extraction pattern works the same regardless of provider: define your Pydantic models, patch the client with Instructor, and extract structured data. MultiMail's REST API is provider-agnostic.

What's the difference between Instructor validation and MultiMail oversight?
Instructor validation checks that emails are structurally correct: valid addresses, non-empty fields, appropriate length. MultiMail's oversight covers whether emails are contextually appropriate: right recipients, suitable tone, correct timing. Validation is automatic and instant; oversight involves human judgment.

Can I extract and send multiple emails in one request?
Yes. Use Instructor's Iterable response type to extract multiple EmailDraft objects from a single natural language request. Then iterate and send each through MultiMail's API. In gated_send mode, each email queues separately for individual approval.

How do I handle extraction failures?
Set max_retries on Instructor's create call. When the LLM produces output that fails Pydantic validation, Instructor sends the validation error back to the LLM and retries. Once retries are exhausted, it raises an exception (InstructorRetryException in recent Instructor releases, which wraps the last validation error). Your application can catch this and tell the user that the request couldn't be parsed into a valid email.
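
Handling an extraction failure at the application level might look like the sketch below. `ExtractionFailed` is a hypothetical stand-in for whatever exception your Instructor version raises once retries are exhausted, and `try_extract` and `failing_extract` are illustrative names; the extraction call is injected so the example runs offline.

```python
from typing import Callable, Optional, Tuple


class ExtractionFailed(Exception):
    """Hypothetical stand-in for the exception raised once retries are exhausted."""


def try_extract(
    extract: Callable[[str], dict],
    request: str,
) -> Tuple[Optional[dict], Optional[str]]:
    """Return (draft, None) on success, or (None, user-facing message) on failure."""
    try:
        return extract(request), None
    except ExtractionFailed as err:
        return None, f"Could not turn that request into a valid email: {err}"


def failing_extract(request: str) -> dict:
    """Offline stand-in for an extraction call that never validates."""
    raise ExtractionFailed("Body must be at least 20 characters")


draft, message = try_extract(failing_extract, "email bob")
print(message)
```

Returning a message instead of re-raising keeps the failure visible to the user without sending anything to MultiMail.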
