Learn Agentic AI

Instructor Library: The Easiest Way to Get Typed Outputs from LLMs

Get started with the Instructor library to extract typed, validated Python objects from LLM responses. Covers setup, OpenAI patching, automatic retry logic, and validation-based error correction.

What Is Instructor?

Instructor is an open-source Python library that patches LLM client libraries (OpenAI, Anthropic, Mistral, and others) to return typed Pydantic objects instead of raw strings. It handles the entire structured output lifecycle: schema generation, prompt injection, response parsing, validation, and automatic retries when validation fails.

Instead of manually crafting JSON schemas, parsing responses, and writing retry loops, you define a Pydantic model and call the LLM. Instructor handles everything else.
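For context, here is roughly the boilerplate Instructor replaces. This is a hand-rolled sketch, not Instructor's actual internals; the `call_llm` function is a stand-in for a real chat-completions call, and the ad-hoc validation is deliberately minimal:

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for a real API call; returns raw JSON text from the model."""
    return '{"name": "John Smith", "age": 32, "email": "john@example.com"}'

def extract_user(text: str, max_retries: int = 3) -> dict:
    prompt = f"Return JSON with keys name, age, email.\n\n{text}"
    last_error = None
    for _ in range(max_retries):
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)                    # parse the response
            if not isinstance(data.get("age"), int):  # validate by hand
                raise ValueError("age must be an integer")
            return data
        except (json.JSONDecodeError, ValueError) as e:
            # Feed the error back into the prompt and retry
            last_error = e
            prompt += f"\n\nPrevious attempt failed: {e}. Please fix."
    raise RuntimeError(f"Extraction failed: {last_error}")

user = extract_user("John Smith is 32 years old.")
```

Every project ends up reimplementing some version of this loop; Instructor folds the whole thing behind a single `response_model` argument.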

Installation and Setup

pip install instructor

Instructor works by wrapping your existing OpenAI client:

import instructor
from openai import OpenAI
from pydantic import BaseModel

# Patch the OpenAI client
client = instructor.from_openai(OpenAI())

That single line transforms the client so that every call can accept a response_model parameter and return typed objects.

Your First Typed Extraction

Define a Pydantic model and use it as the response_model:

class UserInfo(BaseModel):
    name: str
    age: int
    email: str

user = client.chat.completions.create(
    model="gpt-4o",
    response_model=UserInfo,
    messages=[
        {"role": "user", "content": "John Smith is 32 years old. His email is [email protected]"}
    ],
)

print(user)
# UserInfo(name='John Smith', age=32, email='[email protected]')
print(type(user))
# <class '__main__.UserInfo'>

No JSON parsing. No schema construction. No validation code. The return value is a fully typed Python object.

How It Works Under the Hood

When you call create() with a response_model, Instructor:

  1. Generates a JSON schema from your Pydantic model
  2. Injects the schema into the API call (via response_format or tools)
  3. Parses the raw JSON response into your Pydantic model
  4. If validation fails, it retries with the validation error included in the prompt

This retry-with-error-feedback loop is what makes Instructor especially powerful. The model learns from its own mistakes within the same conversation.
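Step 1 of that lifecycle is plain Pydantic: you can inspect the schema Instructor derives by calling `model_json_schema()` yourself on the `UserInfo` model from the earlier example:

```python
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int
    email: str

# The same JSON schema Instructor injects into the API call (step 1)
schema = UserInfo.model_json_schema()
print(schema["properties"])  # field names and JSON types the model must emit
print(schema["required"])    # all three fields are required
```

Because the schema is generated from the model, changing a field type or adding a field updates the contract sent to the LLM automatically.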

Automatic Retry with Validation

Add Pydantic validators and let Instructor auto-correct:

from pydantic import field_validator

class MovieReview(BaseModel):
    title: str
    rating: float
    genre: str
    summary: str

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: float) -> float:
        if not 1.0 <= v <= 10.0:
            raise ValueError(f"Rating must be between 1 and 10, got {v}")
        return v

    @field_validator("genre")
    @classmethod
    def valid_genre(cls, v: str) -> str:
        allowed = {"action", "comedy", "drama", "horror", "sci-fi", "thriller"}
        if v.lower() not in allowed:
            raise ValueError(f"Genre must be one of {allowed}, got '{v}'")
        return v.lower()

review = client.chat.completions.create(
    model="gpt-4o",
    response_model=MovieReview,
    max_retries=3,
    messages=[
        {"role": "user", "content": "Review: Inception is a mind-bending masterpiece rated 9.2/10"}
    ],
)

print(review.genre)   # "sci-fi"
print(review.rating)  # 9.2

If the model returns a rating of 15 on the first attempt, Instructor catches the validation error and resends the request with a message like "Rating must be between 1 and 10, got 15. Please correct this." The model almost always self-corrects on the retry.
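You can see the error text that gets fed back by triggering the validator directly, with no API call involved (this sketch repeats a trimmed-down `MovieReview` so it runs standalone):

```python
from pydantic import BaseModel, ValidationError, field_validator

class MovieReview(BaseModel):
    title: str
    rating: float

    @field_validator("rating")
    @classmethod
    def rating_in_range(cls, v: float) -> float:
        if not 1.0 <= v <= 10.0:
            raise ValueError(f"Rating must be between 1 and 10, got {v}")
        return v

try:
    # Simulate a bad first attempt from the model
    MovieReview(title="Inception", rating=15)
except ValidationError as e:
    feedback = str(e)  # this is the kind of text appended to the retry prompt
    print(feedback)
```

Writing validator messages as clear instructions ("must be between 1 and 10") matters, because the model reads them verbatim on the retry.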

Controlling the Mode

Instructor supports multiple extraction modes depending on your provider:

import instructor

# Use function calling (default, most compatible)
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.TOOLS)

# Use JSON mode (simpler, slightly less reliable)
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.JSON)

# Use structured outputs with strict schema (most reliable)
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.JSON_SCHEMA)

JSON_SCHEMA mode uses OpenAI's constrained decoding and is the most reliable option for supported models. TOOLS mode works across more models and providers.


Extracting Lists of Objects

Use Iterable to extract multiple objects from a single prompt:

from typing import Iterable

class Contact(BaseModel):
    name: str
    role: str
    company: str

contacts = client.chat.completions.create(
    model="gpt-4o",
    response_model=Iterable[Contact],
    messages=[
        {
            "role": "user",
            "content": (
                "Meeting attendees: Sarah Chen (CTO at TechCorp), "
                "Mike Ross (Sales Lead at DataInc), "
                "Lisa Park (Founder of AIStart)"
            )
        }
    ],
)

for contact in contacts:
    print(f"{contact.name} - {contact.role} at {contact.company}")

Working with Other Providers

Instructor is not limited to OpenAI. It supports Anthropic, Mistral, and any OpenAI-compatible API:

from anthropic import Anthropic

# Patch Anthropic client
client = instructor.from_anthropic(Anthropic())

result = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    response_model=UserInfo,
    messages=[
        {"role": "user", "content": "Extract: Alice is 28, [email protected]"}
    ],
)

The same Pydantic models work across all providers. Switch providers without changing your extraction logic.

FAQ

How does Instructor differ from using OpenAI's response_format directly?

Instructor adds automatic retries with validation feedback, multi-provider support, streaming of partial objects, and a simpler API surface. With raw response_format, you handle parsing, validation, and retries yourself. Instructor is a convenience layer that eliminates that boilerplate.

Does Instructor work with local models like Ollama or vLLM?

Yes. Use instructor.from_openai() with any OpenAI-compatible API by setting the base_url parameter on the OpenAI client. Local models may be less reliable at following schemas, so set max_retries=5 and use simpler schemas.
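As a sketch, pointing the patched client at a local Ollama server might look like this. The `base_url`, placeholder `api_key`, and model name are assumptions about your local setup, and `UserInfo` is the model defined earlier:

```python
import instructor
from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint under /v1. The api_key is
# ignored by Ollama, but the OpenAI client requires a non-empty value.
client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON,
)

user = client.chat.completions.create(
    model="llama3.1",         # whatever model you have pulled locally
    response_model=UserInfo,  # same Pydantic model as before
    max_retries=5,            # local models benefit from extra retries
    messages=[{"role": "user", "content": "Extract: Alice is 28, alice@example.com"}],
)
```

The extraction logic is unchanged; only the client construction differs, which is the point of Instructor's provider abstraction.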

What is the performance overhead of using Instructor?

The Python-side overhead is negligible — under 1ms per call. The real cost comes from retries: if your validators are too strict, you may burn extra API calls. Design validators that reject truly invalid data but accept reasonable variations.


#Instructor #Pydantic #StructuredOutputs #OpenAI #Python #AgenticAI #LearnAI #AIEngineering
