Tutorial: Extracting Structured Data

Why Use Structured Output?

LLMs are great at generating text, but sometimes you need data in a machine-readable format like JSON. Structured output lets you define a schema (using Zod in TypeScript) and have the LLM return data that is validated against that schema.

This is essential for:

Data extraction from unstructured text
Building API integrations
Populating databases
Ensuring reliable downstream processing

Getting Started

Let's build an example that extracts user information from a natural language description.

1. Define Your Data Schema

We use Zod to define the structure we want.

typescript

import { z } from 'zod';

const UserInfoSchema = z.object({
  name: z.string(),
  age: z.number(),
  interests: z.array(z.string())
});

type UserInfo = z.infer<typeof UserInfoSchema>;

2. Initialize the Broker

typescript

import { LlmBroker, OllamaGateway } from 'mojentic';

const gateway = new OllamaGateway();
const broker = new LlmBroker('qwen3:32b', gateway);

3. Generate Structured Data

Use broker.generateStructured to request the data.

typescript

const text = "John Doe is a 30-year-old software engineer who loves hiking and reading.";

const userInfo = await broker.generateStructured(text, UserInfoSchema);

console.log(userInfo);
// {
//   name: "John Doe",
//   age: 30,
//   interests: ["hiking", "reading"]
// }

How It Works

1. Schema Definition: Mojentic converts your Zod schema into a JSON schema that the LLM can understand.
2. Prompt Engineering: The broker automatically appends instructions to the prompt, telling the LLM to output JSON matching the schema.
3. Validation: When the response comes back, Mojentic parses the JSON and validates it against your Zod schema.
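
To make the flow above concrete, here is a minimal, dependency-free sketch of the same idea. It is not Mojentic's implementation: the `SimpleSchema` type and the `buildPrompt`, `matchesKind`, and `parseAndValidate` helpers are hypothetical names invented for this illustration, and a canned string stands in for the model's response.

```typescript
// Hypothetical, simplified stand-in for the flow described above.
// None of these names come from Mojentic or Zod.

// A tiny schema description: field name -> expected kind.
type FieldKind = "string" | "number" | "string[]";
type SimpleSchema = Record<string, FieldKind>;

const userInfoSchema: SimpleSchema = {
  name: "string",
  age: "number",
  interests: "string[]",
};

// Prompt engineering: append formatting instructions to the user's text.
function buildPrompt(text: string, schema: SimpleSchema): string {
  return `${text}\n\nRespond only with JSON matching this schema:\n${JSON.stringify(schema)}`;
}

// Check a single value against its expected kind.
function matchesKind(value: unknown, kind: FieldKind): boolean {
  if (kind === "string[]") {
    return Array.isArray(value) && value.every((item) => typeof item === "string");
  }
  return typeof value === kind;
}

// Validation: parse the raw response, then check every field in the schema.
function parseAndValidate(raw: string, schema: SimpleSchema): Record<string, unknown> {
  const parsed: unknown = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Expected a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  for (const [field, kind] of Object.entries(schema)) {
    if (!matchesKind(record[field], kind)) {
      throw new Error(`Field "${field}" is missing or is not of type ${kind}`);
    }
  }
  return record;
}

// A canned response standing in for what a model might return.
const fakeResponse = '{"name": "John Doe", "age": 30, "interests": ["hiking", "reading"]}';
const prompt = buildPrompt("John Doe is a 30-year-old software engineer.", userInfoSchema);
const validated = parseAndValidate(fakeResponse, userInfoSchema);
```

The point of the sketch is the ordering: the schema shapes the prompt before the call, and the same schema is the gatekeeper after the call, so malformed or mistyped responses fail loudly instead of leaking into downstream code.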

Advanced: Nested Schemas

You can also use nested schemas for more complex data.

typescript

const AddressSchema = z.object({
  street: z.string(),
  city: z.string()
});

const UserProfileSchema = z.object({
  name: z.string(),
  address: AddressSchema
});
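
For intuition about what a nested schema asks of the model, the sketch below shows roughly what a JSON Schema equivalent of UserProfileSchema could look like. It is hand-written for illustration: the exact output of Mojentic's conversion is an assumption and may include additional keywords, and the sample profile is made-up data.

```typescript
// Hand-written approximation of a JSON Schema for UserProfileSchema.
// Assumption: the real converter output may differ in detail.
const userProfileJsonSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    // The nested object schema appears inline under its property name.
    address: {
      type: "object",
      properties: {
        street: { type: "string" },
        city: { type: "string" },
      },
      required: ["street", "city"],
    },
  },
  required: ["name", "address"],
};

// Made-up example of a value with the matching nested shape.
const exampleProfile = {
  name: "John Doe",
  address: { street: "1 Main St", city: "Springfield" },
};
```

Because Zod object fields are required unless marked optional, every field of both the outer and the nested object shows up in a `required` list.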

Summary

Structured output turns unstructured text into reliable data structures. By defining Zod schemas, you can integrate LLM outputs directly into your application's logic with type safety and validation.
