
Structured Output Parsing

Advanced · 18 min read

The JSON You Can Actually Trust

Here's the deal with LLMs: they're incredible at generating text, but text is chaos. You ask for JSON and sometimes you get valid JSON, sometimes you get JSON wrapped in markdown code fences, sometimes you get a friendly explanation followed by JSON, and sometimes you get something that looks like JSON but has trailing commas that blow up JSON.parse. Building production UIs on "maybe-JSON" is a nightmare.
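To make that concrete, here is a sketch of the kind of brittle extraction code teams end up writing without structured output (illustrative only, not a recommendation):

```typescript
// The "maybe-JSON" workaround stack: every step below exists only
// because the model's output format isn't guaranteed.
function extractMaybeJson(text: string): unknown {
  // 1. Strip markdown code fences, if the model added them.
  const fence = "`".repeat(3);
  const fenceRe = new RegExp(fence + "(?:json)?\\s*([\\s\\S]*?)" + fence);
  const fenced = text.match(fenceRe);
  const candidate = fenced ? fenced[1] : text;

  // 2. Grab the outermost {...} span, in case the model wrapped the
  //    JSON in a friendly explanation.
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end <= start) throw new Error("no JSON found");
  let body = candidate.slice(start, end + 1);

  // 3. Remove trailing commas that blow up JSON.parse.
  body = body.replace(/,\s*([}\]])/g, "$1");

  return JSON.parse(body);
}
```

Each step has known failure modes (the comma regex in step 3, for instance, will corrupt a string value that happens to contain `, }`). Structured output makes all three steps unnecessary.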

Structured output solves this completely. Instead of hoping the model returns valid JSON, you force it to. You define a schema upfront — "I want an object with these exact fields, these exact types, nothing else" — and the model's output is constrained to match that schema. Every single time. No parsing hacks, no regex extraction, no retry loops.

This is the difference between a demo and a product.

Mental Model

Think of structured output like a form with strict validation. Without structured output, you're handing someone a blank piece of paper and saying "write your address somewhere on here." With structured output, you're handing them a form with labeled fields, character limits, and required markers. They can only write in the boxes, and they can't submit until every required field is filled correctly. The LLM is the person filling out the form — the schema is the form itself.

Why This Matters for Frontend Engineers

You might think structured output is a backend concern. It's not. As a frontend engineer working with AI, you need structured output for:

  • Rendering AI responses as UI components — not raw text, but cards, tables, forms, charts
  • Type-safe AI data — your TypeScript types match exactly what the model returns
  • Streaming partial UI — rendering fields as they arrive, not waiting for the full response
  • Form generation — the model outputs a form schema, your app renders it
  • Data extraction — pull structured data from unstructured user input

Without structured output, you're writing fragile parsing code that breaks on edge cases. With it, you get a typed object you can pass straight to your components.

How It Actually Works Under the Hood

When you send a schema to an LLM provider with structured output enabled, the model doesn't just "try harder" to output valid JSON. The provider modifies the token sampling process itself. At each step of generation, tokens that would make the output invalid according to the schema are masked out — their probability is set to zero. The model literally cannot produce invalid output.

This is called constrained decoding or grammar-guided generation. The provider maintains a state machine that tracks where in the JSON structure the model currently is. Writing an object? Only " (to start a key) or } (to close) are valid next tokens. Just wrote a string key? Only : is valid. Just wrote the value for the last required field? The object must close.
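To see what "cannot produce invalid output" means mechanically, here is a toy version for the single-field schema { answer: string }. It works over characters instead of tokens and ignores escape sequences, but the masking idea is the same (an illustration, not how any provider actually implements it):

```typescript
// Toy constrained decoder for the schema { "answer": string }.
// Every valid output has the shape: {"answer": "<text>"}
// so at each step we can compute exactly what is allowed next.
// Real providers do this over tokens, zeroing the probability of
// every disallowed token.
const SKELETON = '{"answer": "';

function allowedNext(emitted: string): string {
  // Inside the fixed structural prefix: exactly one character is legal.
  if (emitted.length < SKELETON.length) return SKELETON[emitted.length];
  // Object already closed: generation is complete, nothing is legal.
  if (emitted.endsWith('"}')) return "";
  // String value just closed: the object must close next.
  if (emitted.endsWith('"') && emitted.length > SKELETON.length) return "}";
  // Otherwise we are inside the string value: any character, or a
  // closing quote (escape sequences ignored for simplicity).
  return "<any character or closing quote>";
}
```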

The result: 100% schema conformance. Not 99.9%. Not "usually works." One hundred percent.

Quiz
How does structured output achieve 100% schema conformance?

The Provider Landscape

Each major provider handles structured output differently. Let's break down the three approaches you'll encounter in production.

OpenAI Strict Mode
  • Mechanism: json_schema in response_format with strict: true
  • Schema format: JSON Schema (auto-converted from Zod)
  • Conformance: 100% guaranteed
  • Streaming: partial JSON via stream
  • Refusal handling: first-class refusal field in the response
  • Best for: direct OpenAI API usage

Anthropic tool_use
  • Mechanism: single tool definition where input_schema is the desired output
  • Schema format: JSON Schema via the tool's input_schema
  • Conformance: very high, but not formally 100%
  • Streaming: partial tool input via stream
  • Refusal handling: stop-reason check
  • Best for: direct Anthropic API usage

Vercel AI SDK
  • Mechanism: generateObject() / streamObject() with a Zod schema
  • Schema format: Zod schema (converted to JSON Schema internally)
  • Conformance: depends on the underlying provider
  • Streaming: streamObject() with partial Zod parsing
  • Refusal handling: built-in error types
  • Best for: provider-agnostic apps, Next.js

OpenAI: Strict Mode

OpenAI's structured output uses response_format with type: "json_schema" and strict: true. This is the production approach — don't confuse it with legacy JSON mode, which only guarantees valid JSON but not schema adherence.

import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const QuizSchema = z.object({
  question: z.string(),
  options: z.array(z.string()).length(4),
  correctIndex: z.number().int().min(0).max(3),
  explanation: z.string(),
});

const client = new OpenAI();

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "system",
      content: "Generate a quiz question about JavaScript closures.",
    },
  ],
  response_format: zodResponseFormat(QuizSchema, "quiz"),
});

const quiz = JSON.parse(response.choices[0].message.content!);
// quiz is guaranteed to match QuizSchema

A few things to notice here. The zodResponseFormat helper converts your Zod schema into JSON Schema automatically. The "quiz" string is just a name for the schema — it doesn't affect behavior. And strict: true is set internally by the helper.

What Strict Mode Supports

Not every JSON Schema feature works with strict mode. Here's what you can use:

  • String, number, integer, boolean, null
  • Objects with properties and required (all properties must be required)
  • Arrays with items
  • Enums (z.enum(["a", "b", "c"]))
  • Union types via anyOf
  • Recursive schemas (up to a depth limit)

And what you can't use:

  • Optional properties — every field must be required (use a union with null instead)
  • additionalProperties: true — every object must explicitly set additionalProperties: false; extra keys can't be allowed
  • minItems, maxItems on arrays — not enforced structurally
  • Complex oneOf / allOf patterns

Common Trap

OpenAI strict mode requires every object property to be marked as required. If you have an optional field, you can't just use z.optional(). Instead, make it required but nullable: z.string().nullable(). The model will output null when the field isn't applicable. This catches a lot of people — your Zod schema works fine for regular validation but fails when converted to strict mode JSON Schema if it has optional fields.
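This trap is easy to lint for before you ship. Here is a hypothetical helper (not part of any SDK) that flags the two structural requirements on a plain JSON Schema object:

```typescript
// Hypothetical lint helper (not part of any SDK): report the parts of
// an object schema that OpenAI strict mode will reject.
interface ObjectSchema {
  type: "object";
  properties: Record<string, unknown>;
  required?: string[];
  additionalProperties?: boolean;
}

function strictModeProblems(schema: ObjectSchema): string[] {
  const problems: string[] = [];
  const required = new Set(schema.required ?? []);
  for (const key of Object.keys(schema.properties)) {
    if (!required.has(key)) {
      problems.push(
        `property "${key}" is optional; make it required but nullable instead`,
      );
    }
  }
  if (schema.additionalProperties !== false) {
    problems.push("additionalProperties must be explicitly set to false");
  }
  return problems;
}
```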

Handling Refusals

Sometimes the model refuses to generate content that matches your schema — maybe you asked for something that violates safety guidelines. OpenAI handles this with a dedicated refusal field:

const message = response.choices[0].message;

if (message.refusal) {
  console.log("Model refused:", message.refusal);
} else {
  const data = JSON.parse(message.content!);
}

Always check for refusals before parsing. If the model refuses, content will be null and JSON.parse will throw.

Quiz
In OpenAI strict mode, how do you handle optional fields in your schema?

Anthropic: The Tool-Use Trick

Anthropic doesn't have a dedicated structured output mode (yet). But there's a clever technique the community discovered: define a single tool whose input_schema is exactly the JSON structure you want. When the model "calls" that tool, its arguments are your structured data.

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  tools: [
    {
      name: "output_quiz",
      description:
        "Output a structured quiz question. Always use this tool to respond.",
      input_schema: {
        type: "object" as const,
        properties: {
          question: { type: "string", description: "The quiz question" },
          options: {
            type: "array",
            items: { type: "string" },
            description: "Exactly 4 answer options",
          },
          correctIndex: {
            type: "integer",
            minimum: 0,
            maximum: 3,
            description: "0-indexed correct answer",
          },
          explanation: { type: "string", description: "Why the answer is correct" },
        },
        required: ["question", "options", "correctIndex", "explanation"],
      },
    },
  ],
  tool_choice: { type: "tool", name: "output_quiz" },
  messages: [
    {
      role: "user",
      content: "Generate a quiz question about JavaScript closures.",
    },
  ],
});

const toolBlock = response.content.find((block) => block.type === "tool_use");
if (toolBlock && toolBlock.type === "tool_use") {
  const quiz = toolBlock.input;
  // quiz matches your schema
}

The key trick is tool_choice: { type: "tool", name: "output_quiz" }. This forces the model to call that specific tool — it can't respond with plain text. The tool's input_schema acts as your structured output schema.

Why This Works

The model's tool calling mechanism already constrains output to match the tool's input schema. By defining a "tool" that doesn't actually do anything (you never execute it — you just read the arguments), you're hijacking that constraint mechanism for structured output. It's hacky but effective, and it's the recommended approach until Anthropic ships native structured output.

The Anthropic tool-use approach has a subtle advantage: tool input schemas support richer JSON Schema features than OpenAI's strict mode. You can use optional properties, minItems/maxItems, pattern validation, and other constraints. The trade-off is that conformance isn't formally 100% guaranteed — the model might occasionally produce output that doesn't perfectly match complex constraints. In practice, with Claude models, conformance is extremely high. But if you need absolute guarantees, validate with Zod on the client side.
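Here is what that client-side safety net checks for the quiz example above. In a real app you would just run QuizSchema.safeParse(toolBlock.input) with Zod; the hand-rolled guard below only spells out what such a check verifies:

```typescript
// Hand-rolled runtime guard for the quiz tool input. Prefer Zod's
// safeParse in practice; this makes the checks explicit.
interface Quiz {
  question: string;
  options: string[];
  correctIndex: number;
  explanation: string;
}

function isQuiz(input: unknown): input is Quiz {
  if (typeof input !== "object" || input === null) return false;
  const q = input as Record<string, unknown>;
  return (
    typeof q.question === "string" &&
    Array.isArray(q.options) &&
    q.options.length === 4 &&
    q.options.every((o) => typeof o === "string") &&
    typeof q.correctIndex === "number" &&
    Number.isInteger(q.correctIndex) &&
    q.correctIndex >= 0 &&
    q.correctIndex <= 3 &&
    typeof q.explanation === "string"
  );
}
```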

Vercel AI SDK: The Frontend-First Approach

If you're building with Next.js (and you probably are), the Vercel AI SDK is the best way to work with structured output. It gives you generateObject() for one-shot generation and streamObject() for streaming — both with first-class Zod integration.

import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const QuizSchema = z.object({
  question: z.string().describe("The quiz question"),
  options: z.array(z.string()).length(4).describe("4 answer options"),
  correctIndex: z.number().int().min(0).max(3),
  explanation: z.string(),
});

const { object } = await generateObject({
  model: openai("gpt-4o"),
  schema: QuizSchema,
  prompt: "Generate a quiz about JavaScript closures.",
});

// object is fully typed as z.infer<typeof QuizSchema>
// TypeScript knows: object.question is string, object.options is string[], etc.

Notice what happened: you wrote a Zod schema, passed it to generateObject, and got back a fully typed object. No JSON.parse. No type assertions. No validation step. The SDK handles schema conversion, API calls, response parsing, and type inference in one clean function call.

Streaming Structured Output

This is where the AI SDK really shines. streamObject() lets you render fields as they arrive — the user sees the UI building up in real time instead of staring at a loading spinner:

import { streamObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const AnalysisSchema = z.object({
  summary: z.string(),
  sentiment: z.enum(["positive", "negative", "neutral"]),
  keyTopics: z.array(z.string()),
  score: z.number().min(0).max(100),
});

const { partialObjectStream } = streamObject({
  model: openai("gpt-4o"),
  schema: AnalysisSchema,
  prompt: `Analyze this customer review: "${review}"`,
});

for await (const partial of partialObjectStream) {
  // partial is a DeepPartial<Analysis>
  // Fields appear as the model generates them:
  // First iteration: { summary: "The cust..." }
  // Later: { summary: "The customer is...", sentiment: "positive" }
  // Later: { summary: "...", sentiment: "positive", keyTopics: ["speed"] }
  renderPartialUI(partial);
}

The partialObjectStream yields increasingly complete objects. Each field appears as the model generates it. Your UI can render available fields immediately and show skeletons for fields that haven't arrived yet.

Using streamObject in a Next.js Route Handler

Here's the real-world pattern for a Next.js API route:

// app/api/analyze/route.ts
import { streamObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const AnalysisSchema = z.object({
  summary: z.string(),
  sentiment: z.enum(["positive", "negative", "neutral"]),
  keyTopics: z.array(z.string()),
  confidence: z.number(),
});

export async function POST(req: Request) {
  const { text } = await req.json();

  const result = streamObject({
    model: openai("gpt-4o"),
    schema: AnalysisSchema,
    prompt: `Analyze: ${text}`,
  });

  return result.toTextStreamResponse();
}

And the client-side React component:

// components/AnalysisCard.tsx
"use client";

import { experimental_useObject as useObject } from "ai/react";
import { z } from "zod";

const AnalysisSchema = z.object({
  summary: z.string(),
  sentiment: z.enum(["positive", "negative", "neutral"]),
  keyTopics: z.array(z.string()),
  confidence: z.number(),
});

export function AnalysisCard() {
  const { object, submit, isLoading, error } = useObject({
    api: "/api/analyze",
    schema: AnalysisSchema,
  });

  return (
    <div>
      <button onClick={() => submit({ text: "Great product!" })}>
        Analyze
      </button>

      {isLoading && !object && <Skeleton />}

      {object && (
        <div>
          {object.summary && <p>{object.summary}</p>}
          {object.sentiment && <Badge>{object.sentiment}</Badge>}
          {object.keyTopics?.map((topic) => (
            <Tag key={topic}>{topic}</Tag>
          ))}
          {object.confidence != null && (
            <Progress value={object.confidence} />
          )}
        </div>
      )}

      {error && <ErrorMessage>{error.message}</ErrorMessage>}
    </div>
  );
}

The useObject hook manages the streaming connection, parses partial JSON, and gives you a reactive object that updates as fields arrive. Each field is optional in the partial state (it's DeepPartial), so you conditionally render based on what's available. The UI fills in progressively — summary first, then sentiment, then topics, then the confidence score.
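DeepPartial is worth pausing on. A minimal version of the type, plus the mid-stream snapshots it admits (a sketch; the SDK's actual type handles arrays and a few edge cases more carefully):

```typescript
// Minimal DeepPartial: every field, at every nesting level, may be
// absent while the stream is still in flight.
type DeepPartial<T> = T extends object
  ? { [K in keyof T]?: DeepPartial<T[K]> }
  : T;

interface Analysis {
  summary: string;
  sentiment: "positive" | "negative" | "neutral";
  keyTopics: string[];
}

// Each snapshot below is a valid mid-stream state of the same object,
// in the order a stream might yield them.
const snapshots: DeepPartial<Analysis>[] = [
  {},
  { summary: "The cust" },
  { summary: "The customer is happy", sentiment: "positive" },
  {
    summary: "The customer is happy",
    sentiment: "positive",
    keyTopics: ["speed"],
  },
];
```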

Quiz
What type does the partial object have during streaming with streamObject?

Zod: The Universal Schema Language

Here's something beautiful about this whole ecosystem: Zod is the bridge between frontend validation and AI structured output. The same schema you use to validate a form submission is the same schema you send to the LLM. One source of truth.

// schemas/quiz.ts — shared between frontend and AI
import { z } from "zod";

export const QuizSchema = z.object({
  question: z.string().min(10).describe("Clear, specific question"),
  options: z
    .array(z.string().min(1))
    .length(4)
    .describe("Exactly 4 plausible options"),
  correctIndex: z
    .number()
    .int()
    .min(0)
    .max(3)
    .describe("0-indexed correct answer"),
  explanation: z
    .string()
    .min(20)
    .describe("Explains why the correct answer is right"),
});

export type Quiz = z.infer<typeof QuizSchema>;

// Used for AI generation:
const { object } = await generateObject({
  model: openai("gpt-4o"),
  schema: QuizSchema,
  prompt: "Generate a JavaScript quiz",
});

// Used for form validation:
const result = QuizSchema.safeParse(formData);
if (!result.success) {
  console.log(result.error.issues);
}

// Used for API validation:
export async function POST(req: Request) {
  const body = await req.json();
  const quiz = QuizSchema.parse(body); // throws on invalid
}

The .describe() calls are important — they become part of the JSON Schema sent to the model and help it understand what each field should contain. Think of descriptions as prompt engineering at the schema level.

Schema Design Tips for AI

Not all Zod schemas work equally well with LLMs. Here are patterns that produce better results:

// Use enums instead of open strings when you have a fixed set
const sentiment = z.enum(["positive", "negative", "neutral"]);

// Use .describe() liberally — it guides the model
const score = z
  .number()
  .min(0)
  .max(100)
  .describe("Confidence score from 0 to 100");

// Use discriminated unions for polymorphic output
const ContentBlock = z.discriminatedUnion("type", [
  z.object({
    type: z.literal("text"),
    content: z.string(),
  }),
  z.object({
    type: z.literal("code"),
    language: z.string(),
    code: z.string(),
  }),
  z.object({
    type: z.literal("quiz"),
    question: z.string(),
    options: z.array(z.string()),
    answer: z.number(),
  }),
]);

// Nullable over optional for OpenAI strict mode
const profile = z.object({
  name: z.string(),
  bio: z.string().nullable(), // model outputs null if not applicable
});
Quiz
Why should you use .describe() on Zod schema fields when generating structured output?

Error Handling in Production

Structured output isn't magic — things can still go wrong. Here's a production-grade error handling strategy:

import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const ResultSchema = z.object({
  answer: z.string(),
  confidence: z.number(),
  sources: z.array(z.string()),
});

type GenerationResult =
  | { success: true; data: z.infer<typeof ResultSchema> }
  | { success: false; error: string; partial?: unknown };

async function generateWithFallback(
  prompt: string,
): Promise<GenerationResult> {
  try {
    const { object } = await generateObject({
      model: openai("gpt-4o"),
      schema: ResultSchema,
      prompt,
    });

    return { success: true, data: object };
  } catch (error) {
    if (error instanceof Error) {
      if (error.message.includes("refusal")) {
        return {
          success: false,
          error: "The model declined to generate this content.",
        };
      }

      if (error.message.includes("rate_limit")) {
        return {
          success: false,
          error: "Rate limited. Please try again shortly.",
        };
      }
    }

    return {
      success: false,
      error: "Failed to generate structured output.",
    };
  }
}

Schema Validation as a Safety Net

Even with structured output guarantees, add a Zod validation step. Why? Because you might switch providers, and not all providers guarantee 100% conformance. Defense in depth:

async function safeGenerate<T extends z.ZodType>(
  schema: T,
  prompt: string,
): Promise<z.infer<T>> {
  const { object } = await generateObject({
    model: openai("gpt-4o"),
    schema,
    prompt,
  });

  const result = schema.safeParse(object);
  if (!result.success) {
    throw new Error(
      `Schema validation failed: ${result.error.issues.map((i) => i.message).join(", ")}`,
    );
  }

  return result.data;
}

This pattern costs almost nothing at runtime but saves you from silent data corruption when your AI provider changes behavior.

Real-World Patterns

Let's look at patterns you'll actually use in production.

Pattern 1: AI-Generated Form Schemas

The model generates a form definition, your app renders it dynamically:

const FormFieldSchema = z.object({
  id: z.string(),
  label: z.string(),
  type: z.enum(["text", "email", "number", "select", "textarea"]),
  placeholder: z.string().nullable(),
  required: z.boolean(),
  options: z.array(z.string()).nullable(),
  validation: z
    .object({
      min: z.number().nullable(),
      max: z.number().nullable(),
      pattern: z.string().nullable(),
    })
    .nullable(),
});

const FormSchema = z.object({
  title: z.string(),
  description: z.string(),
  fields: z.array(FormFieldSchema),
  submitLabel: z.string(),
});

const { object: form } = await generateObject({
  model: openai("gpt-4o"),
  schema: FormSchema,
  prompt: "Create a job application form for a frontend engineer role.",
});

// form.fields is a typed array you can map over to render inputs

Pattern 2: Structured Data Extraction

Pull structured data from unstructured text — invoices, emails, support tickets:

const InvoiceSchema = z.object({
  vendor: z.string(),
  invoiceNumber: z.string(),
  date: z.string().describe("ISO 8601 date format"),
  lineItems: z.array(
    z.object({
      description: z.string(),
      quantity: z.number(),
      unitPrice: z.number(),
      total: z.number(),
    }),
  ),
  subtotal: z.number(),
  tax: z.number(),
  total: z.number(),
  currency: z.string().describe("3-letter ISO currency code"),
});

const { object: invoice } = await generateObject({
  model: openai("gpt-4o"),
  schema: InvoiceSchema,
  prompt: `Extract invoice data from this text:\n\n${rawText}`,
});

Pattern 3: UI Component Generation

The model outputs a component definition that your app renders:

const ChartSchema = z.object({
  type: z.enum(["bar", "line", "pie", "scatter"]),
  title: z.string(),
  xAxis: z.object({
    label: z.string(),
    values: z.array(z.string()),
  }),
  yAxis: z.object({
    label: z.string(),
    values: z.array(z.number()),
  }),
  color: z.string().describe("CSS color value"),
});

const { object: chart } = await generateObject({
  model: openai("gpt-4o"),
  schema: ChartSchema,
  prompt: `Create a chart showing monthly revenue for 2025: ${data}`,
});

// Pass chart directly to your charting component
// <Chart type={chart.type} data={chart} />
Quiz
You're building an AI feature that extracts structured data from customer emails. Which approach is most production-ready?

Streaming Structured Data Deep Dive

Streaming structured output is where the UX becomes genuinely impressive. Instead of a loading spinner followed by a wall of content, users see data materialize field by field. Here's how partial JSON parsing works:

When the model streams {"summary": "The product, you don't have valid JSON yet. But a partial JSON parser can extract that summary has started and its current value is "The product". As more tokens arrive — {"summary": "The product is excellent", "sent — the parser knows summary is complete and sentiment has started.
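A naive version of such a parser fits in a few lines. The sketch below tracks open strings, objects, and arrays, appends whatever closers are needed, and parses the result; it ignores escape sequences, partial numbers, and dangling keys, which is exactly the kind of edge case real partial-JSON parsers exist to handle:

```typescript
// Naive partial-JSON reader: close whatever is still open, then parse.
// Real implementations handle far more edge cases; this shows the idea.
function parsePartial(fragment: string): unknown {
  const closers: string[] = [];
  let inString = false;
  for (const ch of fragment) {
    if (inString) {
      if (ch === '"') inString = false; // NOTE: escapes not handled
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  let repaired = fragment;
  if (inString) repaired += '"'; // terminate an unfinished string value
  repaired = repaired.replace(/,\s*$/, ""); // drop a trailing comma
  while (closers.length > 0) repaired += closers.pop();
  return JSON.parse(repaired);
}
```

A dangling key (a fragment ending right after a colon) still breaks this sketch; the parser has to decide whether to drop the key or wait for more tokens, which is one reason to reach for a library instead.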

The Vercel AI SDK handles this internally with streamObject(). But understanding the mechanics helps you build better UIs:

// Progressive rendering based on field availability
function StreamingAnalysis({ partial }: { partial: DeepPartial<Analysis> }) {
  return (
    <div>
      <div style={{ minHeight: "3rem" }}>
        {partial.summary ? (
          <p>{partial.summary}</p>
        ) : (
          <TextSkeleton lines={2} />
        )}
      </div>

      <div style={{ minHeight: "2rem" }}>
        {partial.sentiment ? (
          <SentimentBadge value={partial.sentiment} />
        ) : (
          <PillSkeleton />
        )}
      </div>

      <div style={{ minHeight: "2rem" }}>
        {partial.keyTopics && partial.keyTopics.length > 0 ? (
          <TopicList topics={partial.keyTopics.filter(Boolean) as string[]} />
        ) : (
          <TagsSkeleton count={3} />
        )}
      </div>

      <div style={{ minHeight: "1.5rem" }}>
        {partial.confidence != null ? (
          <ConfidenceBar value={partial.confidence} />
        ) : (
          <BarSkeleton />
        )}
      </div>
    </div>
  );
}

The fixed minHeight values prevent layout shifts as content appears — a CLS optimization that matters for Core Web Vitals even on AI-generated content.

Partial JSON parsing is trickier than it sounds. Consider this mid-stream state: {"items": [{"name": "Wid. Is items an array with one object whose name starts with "Wid"? Or is the stream about to produce "Widget" as the full value? The parser has to handle both cases — it gives you the partial value "Wid" for now and updates it as more tokens arrive. Libraries like partial-json and the AI SDK's built-in parser handle these edge cases, including nested objects, arrays mid-element, and escaped characters mid-string. You almost never need to implement this yourself, but knowing it exists helps you debug weird streaming behavior.

Common Patterns and Gotchas

Common mistakes, and what to do instead:

  • Using legacy JSON mode (response_format: { type: 'json_object' }) and assuming it matches your schema. JSON mode only guarantees valid JSON — it could return any valid JSON object; you might get { "answer": 42 } when you expected { "question": "...", "options": [...] }. Instead: use strict mode with json_schema (response_format: zodResponseFormat(schema, name)), which constrains every field to match your schema.

  • Using z.optional() for fields the model might not fill. OpenAI strict mode requires all properties to be in the required array; optional fields are not supported. Instead: use z.string().nullable() — make the field required but nullable, so the model outputs null when a value is not applicable.

  • Not handling refusals — assuming content is always present. When a model refuses to generate content (safety filters, policy violations), the content field is null, and calling JSON.parse(null) throws. Instead: check for refusal before parsing — if (message.refusal), handle it; otherwise parse content. A refusal is a normal, expected response type.

  • Defining huge monolithic schemas with 20+ fields. Large schemas increase latency, reduce quality (the model has more constraints to satisfy simultaneously), and make streaming less useful (more fields means a longer wait for each one). Instead: break them into smaller focused schemas and compose them, or make separate calls.

Key Rules
  1. Always validate with Zod even after structured output — defense in depth against provider changes
  2. Use .describe() on schema fields to guide the model — it reads these as instructions
  3. Prefer nullable over optional for OpenAI strict mode compatibility
  4. Set fixed dimensions on streaming UI containers to prevent layout shifts
  5. Handle refusals as a first-class error type, not an edge case
  6. Share Zod schemas between frontend validation and AI generation — single source of truth
  7. Keep schemas focused and small — large schemas degrade quality and increase latency

Putting It All Together

Here's the mental framework for structured output in production:

  1. Define your schema with Zod — use .describe(), prefer enums over open strings, make fields nullable not optional
  2. Choose your generation method — generateObject for one-shot, streamObject for progressive UI
  3. Handle errors at every level — refusals, rate limits, schema validation failures, network errors
  4. Build streaming UIs — render available fields immediately, show skeletons for pending fields, prevent layout shifts
  5. Validate the output — even with structured output guarantees, run safeParse as a safety net
  6. Share schemas — the same Zod schema validates forms, API inputs, and AI outputs

Structured output turns LLMs from unpredictable text generators into reliable data sources. Once you start thinking of AI responses as typed objects instead of strings, everything clicks — your components get cleaner, your error handling gets simpler, and your users get a polished experience that feels like magic.

Quiz
What is the main advantage of using the Vercel AI SDK's useObject hook for streaming structured output in React?