
Mojentic for Rust

Mojentic is an agentic framework with an asynchronous pub/sub architecture designed to orchestrate LLM-powered agents. This Rust crate focuses on high performance, strong typing, and reliability.

  • Fast, async-first event processing
  • Agents compose into complex systems
  • Clear separation between Core (LLM primitives) and Agents (behaviors)

If you’re familiar with the Python project, this book mirrors its structure and concepts, adapted to Rust idioms.

Getting Started

This section helps you install and begin using Mojentic in Rust.

Install

Add the dependency (placeholder until published):

[dependencies]
mojentic = { path = "../" }

Minimal Example

use mojentic::{Broker, Message};

fn main() {
    let mut broker = Broker::default();
    broker.emit(Message::text("hello world"));
}

Using LLMs (Broker)

Mojentic’s LLM broker routes completion requests to pluggable gateways (e.g., Ollama). It provides a unified API for chat and text completions.

Quick chat example

use mojentic::llm::{Broker, CompletionConfig, Message};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let broker = Broker::new()?;
    let cfg = CompletionConfig::default();

    let resp = broker.chat(
        cfg,
        [
            Message::system("You are a helpful assistant"),
            Message::user("Say hi in one sentence"),
        ],
    ).await?;

    println!("{}", resp.text());
    Ok(())
}

Gateways

  • Ollama: local models for fast iteration.
  • HTTP-based gateways: add your own by implementing the Gateway trait.

Structured output

Use typed schemas to parse the model output into structs. See Structured Output.

Configuration

Use CompletionConfig to control generation parameters:

#![allow(unused)]
fn main() {
use mojentic::llm::gateway::{CompletionConfig, ReasoningEffort};

let config = CompletionConfig {
    temperature: 0.3,
    reasoning_effort: Some(ReasoningEffort::High),
    ..Default::default()
};
}

Available Parameters

  • temperature (f64): Controls randomness. Default: 1.0
  • num_ctx (u32): Context window size in tokens. Default: 32768
  • max_tokens (u32): Maximum tokens to generate. Default: 16384
  • num_predict (Option<i32>): Tokens to predict (-1 = no limit). Default: None
  • reasoning_effort (Option<ReasoningEffort>): Extended thinking level — Low, Medium, High, or None. Default: None

Reasoning Effort

Control how much the model thinks before responding:

#![allow(unused)]
fn main() {
use mojentic::llm::gateway::{CompletionConfig, ReasoningEffort};

// Deep reasoning for complex problems
let config = CompletionConfig {
    reasoning_effort: Some(ReasoningEffort::High),
    temperature: 0.1,
    ..Default::default()
};

// Quick responses
let config = CompletionConfig {
    reasoning_effort: Some(ReasoningEffort::Low),
    ..Default::default()
};
}

  • Ollama: Maps to the think: true parameter for extended thinking. The model’s reasoning trace is available in LlmGatewayResponse.thinking.
  • OpenAI: Maps to reasoning_effort API parameter for reasoning models (o1, o3 series). Ignored with a warning for non-reasoning models.

For full details, see Reasoning Effort Control.

Quick Reference

Installation

[dependencies]
mojentic = { path = "./mojentic-rust" }
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
schemars = "0.8"  # For structured output

Basic Usage

Import the prelude

#![allow(unused)]
fn main() {
use mojentic::prelude::*;
use std::sync::Arc;
}

Create a broker

#![allow(unused)]
fn main() {
let gateway = OllamaGateway::new();
let broker = LlmBroker::new("qwen3:32b", Arc::new(gateway));
}

Simple text generation

#![allow(unused)]
fn main() {
let messages = vec![LlmMessage::user("Hello, how are you?")];
let response = broker.generate(&messages, None, None).await?;
}

Message Types

User message

#![allow(unused)]
fn main() {
LlmMessage::user("Your message here")
}

System message

#![allow(unused)]
fn main() {
LlmMessage::system("You are a helpful assistant")
}

Assistant message

#![allow(unused)]
fn main() {
LlmMessage::assistant("Response from assistant")
}

Message with images

#![allow(unused)]
fn main() {
LlmMessage::user("Describe this image")
    .with_images(vec!["path/to/image.jpg".to_string()])
}

Structured Output

Define your type

#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};
use schemars::JsonSchema;

#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct Person {
    name: String,
    age: u32,
    occupation: String,
}
}

Generate typed response

#![allow(unused)]
fn main() {
let messages = vec![
    LlmMessage::user("Extract person info: John Doe, 30, software engineer")
];

let person: Person = broker.generate_object(&messages, None).await?;
}

Configuration

Default config

#![allow(unused)]
fn main() {
let config = CompletionConfig::default();
// temperature: 1.0
// num_ctx: 32768
// max_tokens: 16384
// num_predict: None
// top_p: None
// top_k: None
// response_format: None
}

Custom config

#![allow(unused)]
fn main() {
let config = CompletionConfig {
    temperature: 0.7,
    num_ctx: 16384,
    max_tokens: 8192,
    num_predict: Some(1000),
    top_p: Some(0.9),
    top_k: Some(40),
    ..Default::default()
};

let response = broker.generate(&messages, None, Some(config)).await?;
}

JSON response format

#![allow(unused)]
fn main() {
use mojentic::llm::gateway::ResponseFormat;

// Request JSON without schema
let config = CompletionConfig {
    temperature: 0.7,
    num_ctx: 16384,
    max_tokens: 8192,
    response_format: Some(ResponseFormat::JsonObject { schema: None }),
    ..Default::default()
};

// Request JSON with schema
let schema = serde_json::json!({
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"}
    }
});

let config = CompletionConfig {
    temperature: 0.7,
    num_ctx: 16384,
    max_tokens: 8192,
    response_format: Some(ResponseFormat::JsonObject {
        schema: Some(schema)
    }),
    ..Default::default()
};
}

Gateway Configuration

Ollama with custom host

#![allow(unused)]
fn main() {
let gateway = OllamaGateway::with_host("http://localhost:11434");
}

Ollama with environment variable

#![allow(unused)]
fn main() {
// Set OLLAMA_HOST environment variable
std::env::set_var("OLLAMA_HOST", "http://custom-host:11434");
let gateway = OllamaGateway::new();
}

Error Handling

#![allow(unused)]
fn main() {
use mojentic::Result;

async fn my_function() -> Result<String> {
    let gateway = OllamaGateway::new();
    let broker = LlmBroker::new("qwen3:32b", Arc::new(gateway));

    let messages = vec![LlmMessage::user("Hello")];

    match broker.generate(&messages, None, None).await {
        Ok(response) => Ok(response),
        Err(e) => {
            eprintln!("Error: {}", e);
            Err(e)
        }
    }
}
}

Common Patterns

Multi-turn conversation

#![allow(unused)]
fn main() {
let mut messages = vec![
    LlmMessage::system("You are a helpful assistant"),
];

// First turn
messages.push(LlmMessage::user("What is Rust?"));
let response1 = broker.generate(&messages, None, None).await?;
messages.push(LlmMessage::assistant(response1));

// Second turn
messages.push(LlmMessage::user("What are its benefits?"));
let response2 = broker.generate(&messages, None, None).await?;
}

Async parallel requests

#![allow(unused)]
fn main() {
use tokio::try_join;

let broker = Arc::new(broker);
let broker1 = broker.clone();
let broker2 = broker.clone();

let (response1, response2) = try_join!(
    async move {
        broker1.generate(
            &[LlmMessage::user("Question 1")],
            None,
            None
        ).await
    },
    async move {
        broker2.generate(
            &[LlmMessage::user("Question 2")],
            None,
            None
        ).await
    }
)?;
}

Model management (Ollama)

#![allow(unused)]
fn main() {
let gateway = OllamaGateway::new();

// List available models
let models = gateway.get_available_models().await?;

// Pull a model
gateway.pull_model("qwen3:32b").await?;

// Calculate embeddings
let embedding = gateway.calculate_embeddings(
    "Some text to embed",
    Some("mxbai-embed-large")
).await?;
}

Logging

Enable logging with tracing:

// In main.rs
use tracing_subscriber;

#[tokio::main]
async fn main() {
    // Initialize logging
    tracing_subscriber::fmt::init();

    // Your code here
}

Set log level via environment variable:

RUST_LOG=debug cargo run
RUST_LOG=info cargo run
RUST_LOG=mojentic=debug cargo run

Use Cases

Tutorial: Building Chatbots

Why Use Chat Sessions?

When working with Large Language Models (LLMs), simple text generation is useful for one-off interactions, but many applications require ongoing conversations where the model remembers previous exchanges. This is where chat sessions come in.

Chat sessions are essential when you need to:

  • Build conversational agents or chatbots
  • Maintain context across multiple user interactions
  • Create applications where the LLM needs to remember previous information
  • Develop more natural and coherent conversational experiences

When to Apply This Approach

Use chat sessions when:

  • Your application requires multi-turn conversations
  • You need the LLM to reference information from earlier in the conversation
  • You want to create a more interactive and engaging user experience

The Key Difference: Expanding Context

The fundamental difference between simple text generation and chat sessions is the expanding context. With each new message in a chat session:

  1. The message is added to the conversation history
  2. All previous messages (within token limits) are sent to the LLM with each new query
  3. The LLM can reference and build upon earlier parts of the conversation
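
The steps above can be sketched in plain Rust (no LLM calls; Turn is a stand-in for the crate's message type):

```rust
// A dependency-free sketch of the expanding context described above:
// each turn appends to the history, and the entire history is what a
// chat session would send to the model on the next request.
#[derive(Debug, Clone)]
struct Turn {
    role: &'static str,
    content: String,
}

// Build the message list that would be sent on the *second* user turn.
fn messages_for_second_turn() -> Vec<Turn> {
    let mut history = vec![Turn { role: "system", content: "You are helpful".into() }];

    // Turn 1: the user's question and the model's reply are both recorded
    history.push(Turn { role: "user", content: "What is Rust?".into() });
    history.push(Turn { role: "assistant", content: "A systems language.".into() });

    // Turn 2: the new question rides along with everything that came before
    history.push(Turn { role: "user", content: "What are its benefits?".into() });
    history
}

fn main() {
    let payload = messages_for_second_turn();
    // All four messages are sent, not just the latest question
    println!("messages sent this turn: {}", payload.len());
}
```

A ChatSession does this bookkeeping for you, including trimming when the history would exceed the model's context window.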

Getting Started

Let’s walk through a simple example of building a chatbot using Mojentic’s ChatSession.

Basic Implementation

Here’s the simplest way to create a chat session with Mojentic:

use mojentic::prelude::*;
use std::io::{self, Write};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // 1. Create an LLM broker
    let gateway = Arc::new(OllamaGateway::new());
    let broker = LlmBroker::new("qwen3:32b", gateway, None);

    // 2. Initialize a chat session
    let mut session = ChatSession::new(broker);

    // 3. Simple interactive loop
    println!("Chatbot started. Type 'exit' to quit.");

    loop {
        print!("Query: ");
        io::stdout().flush()?;

        let mut query = String::new();
        io::stdin().read_line(&mut query)?;
        let query = query.trim();

        if query == "exit" {
            break;
        }

        let response = session.send_message(query).await?;
        println!("{}", response);
    }
    
    Ok(())
}

This code creates an interactive chatbot that maintains context across multiple exchanges.

Step-by-Step Explanation

1. Initialize the Broker

#![allow(unused)]
fn main() {
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
}

The LlmBroker is the central component that handles communication with the LLM provider (in this case, Ollama).

2. Start the Session

#![allow(unused)]
fn main() {
let mut session = ChatSession::new(broker);
}

ChatSession holds the state of the conversation. By default, it manages the message history and ensures it fits within the model’s context window.

3. Send Messages

#![allow(unused)]
fn main() {
let response = session.send_message(query).await?;
}

When you send a message:

  1. It’s added to the history.
  2. The full history is sent to the LLM.
  3. The LLM’s response is added to the history.
  4. The response text is returned.

Customizing Your Chat Session

You can customize the session with a system prompt or tools.

System Prompt

The system prompt sets the behavior of the assistant.

#![allow(unused)]
fn main() {
let mut session = ChatSession::new(broker)
    .with_system_prompt("You are a helpful AI assistant specialized in Rust programming.");
}

Adding Tools

You can enhance your chatbot by providing tools that the LLM can use.

#![allow(unused)]
fn main() {
use mojentic::tools::DateResolver;

let tools: Vec<Arc<dyn LlmTool>> = vec![Arc::new(DateResolver::new())];

let mut session = ChatSession::new(broker)
    .with_tools(tools);

// The LLM can now use the date tool in conversations
let response = session.send_message("What day of the week is July 4th, 2025?").await?;
println!("{}", response);
}

Streaming Responses

For a better user experience with longer responses, you can stream the LLM’s reply as it’s generated:

#![allow(unused)]
fn main() {
use futures::stream::StreamExt;

let mut stream = session.send_stream("Tell me a story");
while let Some(result) = stream.next().await {
    print!("{}", result?);
}
println!(); // Newline after streaming completes
}

The send_stream() method works just like send_message() for conversation management: it adds the user message to history before streaming, and records the full assembled response after the stream is consumed. Tools are handled transparently through the broker’s recursive streaming.

Summary

In this tutorial, we’ve learned how to:

  1. Initialize a ChatSession with an LlmBroker.
  2. Create an interactive loop to chat with the model.
  3. Customize the session with system prompts and tools.

By leveraging chat sessions, you can create engaging conversational experiences that maintain context across multiple interactions.

Tutorial: Extracting Structured Data

Why Use Structured Output?

LLMs are great at generating text, but sometimes you need data in a machine-readable format like JSON. Structured output allows you to define a schema (using Rust structs with Serde) and force the LLM to return data that matches that schema.

This is essential for:

  • Data extraction from unstructured text
  • Building API integrations
  • Populating databases
  • Ensuring reliable downstream processing

Getting Started

Let’s build an example that extracts user information from a natural language description.

1. Define Your Data Schema

We use serde to define the structure we want.

#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};
use schemars::JsonSchema;

#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct UserInfo {
    name: String,
    age: u32,
    interests: Vec<String>,
}
}

2. Initialize the Broker

#![allow(unused)]
fn main() {
use mojentic::prelude::*;
use std::sync::Arc;

let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
}

3. Generate Structured Data

Use broker.generate_structured to request the data.

#![allow(unused)]
fn main() {
let text = "John Doe is a 30-year-old software engineer who loves hiking and reading.";

let user_info: UserInfo = broker.generate_structured(text).await?;

println!("{:?}", user_info);
// UserInfo {
//   name: "John Doe",
//   age: 30,
//   interests: ["hiking", "reading"]
// }
}

How It Works

  1. Schema Definition: Mojentic uses schemars to convert your Rust struct into a JSON schema that the LLM can understand.
  2. Prompt Engineering: The broker automatically appends instructions to the prompt, telling the LLM to output JSON matching the schema.
  3. Validation: When the response comes back, Mojentic parses the JSON and deserializes it into your struct, performing validation (e.g., ensuring age is a number).
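
For illustration, the schema derived for the UserInfo struct above looks roughly like this (the exact shape varies by schemars version):

```json
{
  "title": "UserInfo",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer", "format": "uint32", "minimum": 0 },
    "interests": { "type": "array", "items": { "type": "string" } }
  },
  "required": ["age", "interests", "name"]
}
```

This is the contract the LLM is instructed to satisfy, and the same struct definition is then used to deserialize and validate the response.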

Advanced: Nested Schemas

You can also use nested structs for more complex data.

#![allow(unused)]
fn main() {
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct Address {
    street: String,
    city: String,
}

#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct UserProfile {
    name: String,
    address: Address,
}
}

Summary

Structured output turns unstructured text into reliable data structures. By defining Rust structs, you can integrate LLM outputs directly into your application’s logic with type safety and validation.

Tutorial: Building Autonomous Agents

What is an Autonomous Agent?

An autonomous agent is a system that can perceive its environment, reason about how to achieve a goal, and take actions (use tools) to accomplish that goal. Unlike a simple chatbot, an agent has a loop of “Thought -> Action -> Observation”.

The Simple Recursive Agent (SRA)

Mojentic provides a SimpleRecursiveAgent pattern. This agent:

  1. Receives a goal.
  2. Thinks about the next step.
  3. Selects a tool to use.
  4. Executes the tool.
  5. Observes the result.
  6. Repeats until the goal is met.
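
The loop can be sketched in dependency-free Rust; the decision logic here is a stub standing in for the LLM, not the crate's implementation:

```rust
// A toy version of the Thought -> Action -> Observation loop above.
enum Step {
    UseTool(String),
    Finish(String),
}

fn run_agent(goal: &str, max_iterations: usize) -> String {
    let mut observations: Vec<String> = vec![format!("goal: {goal}")];
    for _ in 0..max_iterations {
        // 1-3. Think about the next step and select a tool (stubbed here:
        // search once, then answer)
        let step = if observations.len() == 1 {
            Step::UseTool("search".into())
        } else {
            Step::Finish(format!("answered using {} observations", observations.len()))
        };
        match step {
            // 4-5. Execute the tool and record the observation
            Step::UseTool(name) => observations.push(format!("{name} returned a result")),
            // 6. Goal met
            Step::Finish(answer) => return answer,
        }
    }
    "gave up: max iterations reached".into()
}

fn main() {
    println!("{}", run_agent("Who won the Super Bowl?", 10));
}
```

The iteration cap is what a max_iterations setting guards: without it, a model that never decides it is done would loop forever.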

Building an Agent

Let’s build an agent that can answer questions using a web search tool.

1. Setup

You’ll need the WebSearchTool (or any other tool) and a Broker.

use mojentic::prelude::*;
use mojentic::tools::WebSearchTool;
use mojentic::agents::SimpleRecursiveAgent;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // Initialize broker
    let gateway = Arc::new(OllamaGateway::new());
    let broker = LlmBroker::new("qwen3:32b", gateway, None);

    // Configure tools
    let tools: Vec<Arc<dyn LlmTool>> = vec![
        Arc::new(WebSearchTool::new(SearchProvider::Tavily)),
    ];
    
    // ...
    Ok(())
}

2. Run the Agent

#![allow(unused)]
fn main() {
let goal = "Find out who won the latest Super Bowl and tell me the score.";

let result = SimpleRecursiveAgent::run(&broker, goal, tools, None).await?;

println!("Final Answer: {}", result);
}

Step-by-Step Execution

When you run this, the agent enters a loop:

  1. Thought: “I need to search for the latest Super Bowl winner.”
  2. Action: Calls WebSearchTool with query “latest Super Bowl winner score”.
  3. Observation: Receives search results (e.g., “Kansas City Chiefs defeated San Francisco 49ers 25-22…”).
  4. Thought: “I have the information. I can now answer the user.”
  5. Final Answer: “The Kansas City Chiefs won the latest Super Bowl with a score of 25-22.”

Customizing the Agent

You can customize the agent’s behavior by:

  • Adding more tools: Give it file access, calculation abilities, etc.
  • System Prompt: Adjust its personality or constraints.
  • Max Iterations: Limit how many steps it can take to prevent infinite loops.

#![allow(unused)]
fn main() {
let config = AgentConfig {
    max_iterations: 10,
    ..Default::default()
};

SimpleRecursiveAgent::run(&broker, goal, tools, Some(config)).await?;
}

Summary

Autonomous agents allow you to solve complex, multi-step problems. By combining a reasoning loop with a set of tools, you can build systems that can interact with the world to achieve user goals.

Image Analysis

Vision-capable models can analyze images by passing image file paths along with text prompts. The framework automatically handles reading image files and encoding them as Base64 for transmission to the LLM.

Basic Usage

Attach images to a message using the with_images() method:

#![allow(unused)]
fn main() {
use mojentic::prelude::*;

let message = LlmMessage::user("Describe this image")
    .with_images(vec!["/path/to/image.jpg".to_string()]);
}

You can attach multiple images to a single message:

#![allow(unused)]
fn main() {
let message = LlmMessage::user("Compare these images")
    .with_images(vec![
        "/path/to/image1.jpg".to_string(),
        "/path/to/image2.jpg".to_string(),
    ]);
}

Complete Example

use mojentic::prelude::*;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // Create gateway and broker with a vision-capable model
    let gateway = OllamaGateway::new();
    let broker = LlmBroker::new("llava:latest", Arc::new(gateway), None);

    // Create a message with image
    let message = LlmMessage::user("What's in this image?")
        .with_images(vec!["path/to/image.jpg".to_string()]);

    // Generate response
    let response = broker.generate(&[message], None, None, None).await?;
    println!("{}", response);

    Ok(())
}

Vision-Capable Models

Common vision-capable models available through Ollama:

  • llava:latest - General-purpose vision model
  • bakllava:latest - BakLLaVA vision model
  • qwen3-vl:30b - Qwen3 vision-language model
  • gemma3:27b - Gemma 3 with vision support

Pull a model before using:

ollama pull llava:latest

How It Works

When you attach images to a message:

  1. File Reading: The gateway reads the image file from the specified path
  2. Base64 Encoding: The image bytes are encoded as Base64 using the base64 crate
  3. API Transmission: The encoded image is included in the images field of the Ollama API request
  4. Model Processing: The vision-capable model analyzes both the text prompt and image(s)
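
Step 2 in miniature: a toy Base64 encoder in plain std Rust, purely to illustrate what happens to the image bytes (the framework itself uses the base64 crate):

```rust
// Base64 maps every 3 input bytes to 4 characters from a 64-symbol
// alphabet, padding the final group with '=' as needed.
fn base64_encode(data: &[u8]) -> String {
    const TABLE: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::new();
    for chunk in data.chunks(3) {
        // Pack up to 3 bytes into a 24-bit group
        let b = [chunk[0], *chunk.get(1).unwrap_or(&0), *chunk.get(2).unwrap_or(&0)];
        let n = ((b[0] as u32) << 16) | ((b[1] as u32) << 8) | b[2] as u32;
        // Emit four 6-bit symbols, padding where input bytes were missing
        out.push(TABLE[(n >> 18) as usize & 63] as char);
        out.push(TABLE[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { TABLE[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { TABLE[n as usize & 63] as char } else { '=' });
    }
    out
}

fn main() {
    println!("{}", base64_encode(b"Man")); // "TWFu"
}
```

The encoded string for a real image is large (roughly 4/3 the file size), which is why large images increase request latency.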

Error Handling

Image processing can fail if:

  • The image file doesn’t exist or isn’t readable
  • The file path is invalid
  • The model doesn’t support vision

Always handle errors when working with images:

#![allow(unused)]
fn main() {
match broker.generate(&[message], None, None, None).await {
    Ok(response) => println!("Response: {}", response),
    Err(e) => eprintln!("Error analyzing image: {}", e),
}
}

Supported Image Formats

The framework reads raw image bytes and passes them to the model. Supported formats depend on the specific model being used. Most vision models support common formats like JPEG and PNG.

Examples

Example: File Tools

This chapter demonstrates how to build tools for interacting with the file system using the Mojentic framework. The included FileTool and CodingFileTool serve as reference implementations that you can use directly or adapt for your specific needs.

Available Tools

FileTool

The FileTool provides basic file operations:

  • read_file: Read content of a file
  • write_file: Write content to a file
  • list_dir: List directory contents
  • file_exists: Check if a file exists

CodingFileTool

The CodingFileTool extends FileTool with features specifically for coding tasks:

  • apply_patch: Apply a unified diff patch to a file
  • replace_text: Replace specific text in a file
  • search_files: Search for patterns in files

Usage

use mojentic::prelude::*;
use mojentic::tools::FileTool;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // Initialize broker
    let gateway = Arc::new(OllamaGateway::new());
    let broker = LlmBroker::new("qwen3:32b", gateway, None);

    // Register tools
    let tools: Vec<Arc<dyn LlmTool>> = vec![
        Arc::new(FileTool::default()),
    ];

    // Ask the agent to perform file operations
    let messages = vec![
        LlmMessage::user("Create a file named 'hello.txt' with the content 'Hello, World!'"),
    ];

    let response = broker.generate(&messages, Some(tools), None).await?;
    println!("{}", response);
    
    Ok(())
}

Security

By default, file tools are restricted to the current working directory. You can configure allowed paths to restrict access further.
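
For illustration, here is a dependency-free sketch of the kind of path check such a restriction implies; the crate's actual configuration API may differ:

```rust
use std::path::{Path, PathBuf};

// Reject any requested path that escapes the allowed root, including
// escapes via `..` segments.
fn is_allowed(requested: &Path, allowed_root: &Path) -> bool {
    // Normalize lexically (a production implementation should also
    // canonicalize to defeat symlinks; skipped here to stay dependency-free).
    let mut normalized = PathBuf::new();
    for comp in requested.components() {
        match comp {
            std::path::Component::ParentDir => {
                normalized.pop();
            }
            std::path::Component::CurDir => {}
            other => normalized.push(other),
        }
    }
    normalized.starts_with(allowed_root)
}

fn main() {
    let root = Path::new("/work");
    println!("{}", is_allowed(Path::new("/work/notes.txt"), root)); // true
    println!("{}", is_allowed(Path::new("/work/../etc/passwd"), root)); // false
}
```

This kind of guard matters because the LLM, not the developer, chooses the paths a file tool is called with.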

Example: Task Management

The TaskManager is an example of how to build stateful tools that allow agents to manage ephemeral tasks. This reference implementation shows how to maintain state across tool calls.

Features

  • Create Tasks: Add new tasks to the list
  • List Tasks: View all current tasks and their status
  • Complete Tasks: Mark tasks as done
  • Prioritize: Agents can determine the order of execution

Usage

use mojentic::prelude::*;
use mojentic::tools::TaskManager;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // Initialize broker
    let gateway = Arc::new(OllamaGateway::new());
    let broker = LlmBroker::new("qwen3:32b", gateway, None);

    // Register the tool
    let tools: Vec<Arc<dyn LlmTool>> = vec![
        Arc::new(TaskManager::new()),
    ];

    // The agent can now manage its own tasks
    let messages = vec![
        LlmMessage::system("You are a helpful assistant. Use the task manager to track your work."),
        LlmMessage::user("Plan a party for 10 people."),
    ];

    let response = broker.generate(&messages, Some(tools), None).await?;
    println!("{}", response);
    
    Ok(())
}

Integration with Agents

The Task Manager is particularly powerful when combined with the IterativeProblemSolver agent, allowing it to maintain state across multiple reasoning steps.

Example: Web Search

The WebSearchTool demonstrates how to integrate external APIs into your agent’s toolset. This example implementation supports multiple providers like Tavily and Serper.

Configuration

The web search tool typically requires an API key for a search provider (e.g., Tavily, Serper).

#![allow(unused)]
fn main() {
use std::env;

env::set_var("TAVILY_API_KEY", "your-api-key");
}

Usage

use mojentic::prelude::*;
use mojentic::tools::WebSearchTool;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // Initialize broker
    let gateway = Arc::new(OllamaGateway::new());
    let broker = LlmBroker::new("qwen3:32b", gateway, None);

    // Register the tool
    let tools: Vec<Arc<dyn LlmTool>> = vec![
        Arc::new(WebSearchTool::new(SearchProvider::Tavily)),
    ];

    // Ask a question requiring up-to-date info
    let messages = vec![
        LlmMessage::user("What is the current stock price of Apple?"),
    ];

    let response = broker.generate(&messages, Some(tools), None).await?;
    println!("{}", response);
    
    Ok(())
}

Supported Providers

  • Tavily: Optimized for LLM agents
  • Serper: Google Search API
  • DuckDuckGo: Privacy-focused search (no API key required for basic usage)

Streaming Responses

Streaming allows you to receive LLM responses chunk-by-chunk as they are generated, improving perceived latency for users.

Basic Streaming

Use broker.generate_stream to get a stream of chunks:

use mojentic::prelude::*;
use futures::StreamExt;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    let gateway = Arc::new(OllamaGateway::new());
    let broker = LlmBroker::new("qwen3:32b", gateway, None);
    
    let messages = vec![LlmMessage::user("Tell me a story.")];

    let mut stream = broker.generate_stream(&messages, None, None).await?;

    while let Some(result) = stream.next().await {
        match result {
            Ok(chunk) => print!("{}", chunk),
            Err(e) => eprintln!("Error: {}", e),
        }
    }
    
    Ok(())
}

Streaming with Tools

Mojentic supports streaming even when tools are involved. The broker will pause streaming to execute tools and then resume streaming the final response.

#![allow(unused)]
fn main() {
use mojentic::tools::DateResolver;

let tools: Vec<Arc<dyn LlmTool>> = vec![Arc::new(DateResolver::new())];
let mut stream = broker.generate_stream(&messages, Some(tools), None).await?;

// The stream will contain text chunks.
// Tool execution happens transparently in the background.
while let Some(result) = stream.next().await {
    // ...
}
}

Embeddings

Embeddings allow you to convert text into vector representations, which are useful for semantic search, clustering, and similarity comparisons.

Setup

You need an embedding model. Ollama supports models like mxbai-embed-large or nomic-embed-text.

use mojentic::prelude::*;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    // Initialize gateway
    let gateway = Arc::new(EmbeddingsGateway::new("mxbai-embed-large"));
    
    Ok(())
}

Generating Embeddings

#![allow(unused)]
fn main() {
let text = "The quick brown fox jumps over the lazy dog.";
let vector = gateway.embed(text).await?;

println!("{:?}", &vector[0..5]);
// => [0.123, -0.456, ...]
}

Batch Processing

You can embed multiple texts at once:

#![allow(unused)]
fn main() {
let texts = vec!["Hello", "World"];
let vectors = gateway.embed_batch(texts).await?;
}

Cosine Similarity

Mojentic provides utilities to calculate similarity between vectors:

#![allow(unused)]
fn main() {
use mojentic::utils::math::cosine_similarity;

let similarity = cosine_similarity(&vector1, &vector2);
}
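
Cosine similarity is the dot product of the two vectors divided by the product of their magnitudes, yielding 1.0 for identical directions and 0.0 for orthogonal ones. A dependency-free sketch of the standard formula (not necessarily the crate's implementation):

```rust
// cos(theta) = dot(a, b) / (|a| * |b|)
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let v1 = [1.0, 0.0];
    let v2 = [0.0, 1.0];
    println!("{}", cosine_similarity(&v1, &v2)); // orthogonal -> 0
    println!("{}", cosine_similarity(&v1, &v1)); // identical -> 1
}
```

For semantic search, you typically embed a query, compute its similarity against a corpus of stored vectors, and rank by the result.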

Layer 1 - Core

Layer 1 provides primitives for interacting with LLMs and composing messages, tools, and structured outputs.

Components

  • Messages: typed containers for role + content.
  • Completion Config: tuning parameters (temperature, max tokens, etc.).
  • Tools: callable functions surfaced to the LLM.
  • Gateway: abstraction over model providers.

Proceed through the subsections for practical usage.

The Basics

A quick orientation to the core types and flows.

  • Create messages (system, user, assistant)
  • Configure a completion
  • Send to a gateway via the broker
use mojentic::llm::{Broker, CompletionConfig, Message};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let broker = Broker::new()?;
    let cfg = CompletionConfig::default();
    let msgs = vec![
        Message::system("You are concise"),
        Message::user("Explain Rust lifetimes in one line"),
    ];
    let out = broker.chat(cfg, msgs).await?;
    println!("{}", out.text());
    Ok(())
}

Message Composers

Helpers for building prompts and role-structured messages.

  • System prompts set behavior
  • User messages provide tasks
  • Assistant messages chain context

Tip: keep messages small and focused; prefer function/tool calls for larger payloads.

Simple Text Generation

The minimal path from prompt to text.

use mojentic::llm::{Broker, CompletionConfig, Message};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let resp = Broker::new()?
        .chat(CompletionConfig::default(), [Message::user("Haiku about Rust")])
        .await?;
    println!("{}", resp.text());
    Ok(())
}

Reasoning Effort Control

Some LLM models support extended thinking modes that allow them to reason more deeply about complex problems before responding. Mojentic provides reasoning effort control through the CompletionConfig.

Overview

The reasoning_effort field allows you to control how much computational effort the model should spend on reasoning:

  • Low: Quick responses with minimal reasoning
  • Medium: Balanced approach (default reasoning depth)
  • High: Extended thinking for complex problems

Usage

use mojentic::llm::gateway::{CompletionConfig, ReasoningEffort};
use mojentic::llm::broker::LlmBroker;
use mojentic::llm::models::LlmMessage;
use mojentic::llm::gateways::ollama::OllamaGateway;
use std::sync::Arc;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let gateway = Arc::new(OllamaGateway::new());
    let broker = LlmBroker::new("qwen3:32b", gateway, None);

    // Configure with high reasoning effort
    let config = CompletionConfig {
        reasoning_effort: Some(ReasoningEffort::High),
        ..Default::default()
    };

    let messages = vec![
        LlmMessage::system("You are a helpful assistant."),
        LlmMessage::user("Explain quantum entanglement in simple terms."),
    ];

    let response = broker.generate(&messages, None, Some(config), None).await?;
    println!("Response: {}", response);

    Ok(())
}

Provider-Specific Behavior

Ollama

When reasoning_effort is set (any value), Ollama receives the think: true parameter, which enables extended thinking mode. The model’s internal reasoning is captured in the thinking field of the response:

#![allow(unused)]
fn main() {
let response = broker.generate(&messages, None, Some(config), None).await?;

// For Ollama, check the gateway response directly for thinking field
// The thinking content shows the model's internal reasoning process
}

OpenAI

For OpenAI reasoning models (like o1), the reasoning_effort parameter maps directly to the API’s reasoning_effort field:

  • ReasoningEffort::Low → "low"
  • ReasoningEffort::Medium → "medium"
  • ReasoningEffort::High → "high"

For non-reasoning OpenAI models, the parameter is ignored and a warning is logged.
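
The mapping is simple enough to sketch as a match; to_openai_param is a hypothetical helper for illustration, not part of the crate's public API:

```rust
// Self-contained illustration of the Low/Medium/High -> string mapping
// described above (the enum here shadows the crate's own type).
#[derive(Debug, Clone, Copy)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
}

fn to_openai_param(effort: ReasoningEffort) -> &'static str {
    match effort {
        ReasoningEffort::Low => "low",
        ReasoningEffort::Medium => "medium",
        ReasoningEffort::High => "high",
    }
}

fn main() {
    println!("{}", to_openai_param(ReasoningEffort::High)); // "high"
}
```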

When to Use Reasoning Effort

Use higher reasoning effort for:

  • Complex mathematical problems
  • Multi-step logical reasoning
  • Code generation requiring architectural decisions
  • Nuanced ethical or philosophical questions
  • Tasks requiring careful consideration of trade-offs

Use lower reasoning effort for:

  • Simple factual questions
  • Quick classifications
  • Routine formatting tasks
  • When response speed is critical

Performance Considerations

Higher reasoning effort typically means:

  • Longer response times
  • Increased token usage
  • More thoughtful, accurate responses
  • Better handling of edge cases

The trade-off between speed and quality should guide your choice of reasoning effort level.

Building Tools

Creating Custom Tools

Tools extend LLM capabilities by providing functions they can call. All tools must implement the LlmTool trait.

Basic Tool Structure

#![allow(unused)]
fn main() {
use mojentic::prelude::*;
use serde_json::{json, Value};
use std::collections::HashMap;

struct CalculatorTool;

impl LlmTool for CalculatorTool {
    fn run(&self, args: &HashMap<String, Value>) -> mojentic::Result<Value> {
        let operation = args.get("operation")
            .and_then(|v| v.as_str())
            .ok_or_else(|| mojentic::MojenticError::ToolError(
                "Missing operation parameter".to_string()
            ))?;

        let a = args.get("a")
            .and_then(|v| v.as_f64())
            .ok_or_else(|| mojentic::MojenticError::ToolError(
                "Missing 'a' parameter".to_string()
            ))?;

        let b = args.get("b")
            .and_then(|v| v.as_f64())
            .ok_or_else(|| mojentic::MojenticError::ToolError(
                "Missing 'b' parameter".to_string()
            ))?;

        let result = match operation {
            "add" => a + b,
            "subtract" => a - b,
            "multiply" => a * b,
            "divide" => {
                if b == 0.0 {
                    return Err(mojentic::MojenticError::ToolError(
                        "Division by zero".to_string()
                    ));
                }
                a / b
            }
            _ => return Err(mojentic::MojenticError::ToolError(
                format!("Unknown operation: {}", operation)
            )),
        };

        Ok(json!({
            "result": result,
            "operation": operation,
            "a": a,
            "b": b
        }))
    }

    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor {
            r#type: "function".to_string(),
            function: FunctionDescriptor {
                name: "calculator".to_string(),
                description: "Perform basic arithmetic operations".to_string(),
                parameters: json!({
                    "type": "object",
                    "properties": {
                        "operation": {
                            "type": "string",
                            "enum": ["add", "subtract", "multiply", "divide"],
                            "description": "The operation to perform"
                        },
                        "a": {
                            "type": "number",
                            "description": "First operand"
                        },
                        "b": {
                            "type": "number",
                            "description": "Second operand"
                        }
                    },
                    "required": ["operation", "a", "b"]
                }),
            },
        }
    }
}
}

Using Custom Tools

use mojentic::llm::{LlmBroker, LlmTool};
use mojentic::llm::models::LlmMessage;
use mojentic::llm::gateways::OllamaGateway;
use std::sync::Arc;

#[tokio::main]
async fn main() -> mojentic::Result<()> {
    let gateway = OllamaGateway::new();
    let broker = LlmBroker::new("qwen3:32b", Arc::new(gateway));

    let tools: Vec<Box<dyn LlmTool>> = vec![
        Box::new(CalculatorTool),
    ];

    let messages = vec![
        LlmMessage::user("What is 42 multiplied by 17?")
    ];

    let response = broker.generate(&messages, Some(&tools), None).await?;
    println!("{}", response);

    Ok(())
}

Testing Custom Tools

Create tests for your tools:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_calculator_add() {
        let tool = CalculatorTool;
        let mut args = HashMap::new();
        args.insert("operation".to_string(), json!("add"));
        args.insert("a".to_string(), json!(5.0));
        args.insert("b".to_string(), json!(3.0));

        let result = tool.run(&args).unwrap();
        assert_eq!(result["result"], 8.0);
    }

    #[test]
    fn test_calculator_divide_by_zero() {
        let tool = CalculatorTool;
        let mut args = HashMap::new();
        args.insert("operation".to_string(), json!("divide"));
        args.insert("a".to_string(), json!(10.0));
        args.insert("b".to_string(), json!(0.0));

        assert!(tool.run(&args).is_err());
    }
}
}

Best Practices

  1. Error Handling: Always use proper error types, never panic in library code
  2. Documentation: Add rustdoc comments to your tool implementations
  3. Testing: Write comprehensive tests for your tools
  4. Type Safety: Validate all input parameters
  5. JSON Schema: Provide clear, detailed parameter descriptions

Tool Usage

Let models invoke your tools to retrieve data or perform actions.

Patterns:

  • Stateless RPC-like tools
  • Stateful tools with access to context
  • Safety: validate inputs, timeouts, retries
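The safety bullet above mentions retries. A minimal, std-only retry helper might look like this; it is a hypothetical sketch, not a crate API:

```rust
/// Retry a fallible operation up to `max_attempts` times, returning the
/// first success or the last error. Panics if `max_attempts` is 0.
/// Hypothetical helper for illustration only.
fn with_retries<T, E>(
    max_attempts: usize,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut last_err = None;
    for _ in 0..max_attempts {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.expect("max_attempts must be > 0"))
}
```

A production version would typically add backoff between attempts and a per-attempt timeout.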

Agent Delegation with ToolWrapper

The ToolWrapper enables agent delegation patterns by wrapping an agent (broker + tools + behavior) as a tool that can be used by other agents. This allows you to build hierarchical agent systems where coordinator agents can delegate specialized tasks to expert agents.

Overview

ToolWrapper wraps three core components:

  • Broker: The LLM interface for the agent
  • Tools: The tools available to the agent
  • Behavior: The system message defining the agent’s personality and capabilities

The wrapped agent appears as a standard tool with a single input parameter, making it easy for other agents to delegate work.

Basic Usage

#![allow(unused)]
fn main() {
use mojentic::llm::{LlmBroker, LlmTool, ToolWrapper};
use mojentic::llm::gateways::OllamaGateway;
use std::sync::Arc;

// Create a specialist agent
let gateway = Arc::new(OllamaGateway::default());
let specialist_broker = Arc::new(LlmBroker::new("qwen2.5:7b", gateway.clone()));

let specialist_tools: Vec<Box<dyn LlmTool>> = vec![
    // Add specialist's tools here
];

let specialist_behaviour =
    "You are a specialist in temporal reasoning and date calculations.";

// Wrap the specialist as a tool
let specialist_tool = ToolWrapper::new(
    specialist_broker,
    specialist_tools,
    specialist_behaviour,
    "temporal_specialist",
    "A specialist that handles date and time-related queries."
);

// Use the specialist in a coordinator agent
let coordinator_broker = LlmBroker::new("qwen2.5:14b", gateway);
let coordinator_tools: Vec<Box<dyn LlmTool>> = vec![
    Box::new(specialist_tool)
];

// The coordinator can now delegate to the specialist
}

How It Works

  1. Tool Descriptor: The ToolWrapper generates a tool descriptor with:

    • Function name (e.g., “temporal_specialist”)
    • Description (what the agent does)
    • Single input parameter for instructions
  2. Execution Flow: When called:

    • Extracts the input parameter
    • Creates initial messages with the agent’s behavior (system message)
    • Appends the input as a user message
    • Calls the agent’s broker with its tools
    • Returns the agent’s response
  3. Delegation Pattern: The coordinator agent sees specialist agents as tools and decides when to delegate based on the task requirements.
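The message assembly in step 2 can be illustrated with a simplified, std-only sketch; the real ToolWrapper builds LlmMessage values and calls the broker asynchronously:

```rust
/// Simplified stand-in for the messages ToolWrapper assembles per call:
/// the wrapped agent's behavior as the system message, then the
/// coordinator-supplied input as the user message. Illustrative only.
fn build_delegation_messages(behavior: &str, input: &str) -> Vec<(String, String)> {
    vec![
        ("system".to_string(), behavior.to_string()),
        ("user".to_string(), input.to_string()),
    ]
}
```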

Example: Multi-Agent System

See examples/broker_as_tool.rs for a complete example demonstrating:

  • Creating specialist agents with specific tools and behaviors
  • Wrapping specialists as tools
  • Building a coordinator that delegates appropriately
  • Testing different query types

cargo run --example broker_as_tool

Design Considerations

Agent Ownership

ToolWrapper uses Arc<LlmBroker> to handle shared ownership of the broker. This allows multiple references to the same broker if needed.

Async Execution

The run method is synchronous (required by the LlmTool trait) but internally handles async operations using tokio::task::block_in_place. This requires tests to use the multi-threaded runtime:

#![allow(unused)]
fn main() {
#[tokio::test(flavor = "multi_thread")]
async fn test_tool_wrapper() {
    // Test code here
}
}

Tool Isolation

Each wrapped agent maintains its own:

  • Tool set
  • Behavior/personality
  • Conversation context (per invocation)

This ensures clean separation between agents and prevents context bleeding.

Common Patterns

Specialist Agent Pattern

Create focused agents with specific expertise:

#![allow(unused)]
fn main() {
// Data analysis specialist
let data_analyst = ToolWrapper::new(
    analyst_broker,
    vec![Box::new(DataQueryTool), Box::new(VisualizationTool)],
    "You are a data analyst specializing in statistical analysis.",
    "data_analyst",
    "Analyzes data and provides statistical insights."
);

// Writing specialist
let writer = ToolWrapper::new(
    writer_broker,
    vec![Box::new(GrammarCheckTool), Box::new(StyleGuideTool)],
    "You are a professional writer and editor.",
    "writer",
    "Improves and edits written content."
);
}

Coordinator Pattern

Build a coordinator that orchestrates multiple specialists:

#![allow(unused)]
fn main() {
let coordinator = LlmBroker::new("qwen2.5:32b", gateway);
let tools: Vec<Box<dyn LlmTool>> = vec![
    Box::new(data_analyst),
    Box::new(writer),
    Box::new(researcher),
];

// Coordinator decides which specialist to use based on the task
}

Hierarchical Agents

Create multi-level hierarchies:

#![allow(unused)]
fn main() {
// Level 1: Base specialists
let math_specialist = ToolWrapper::new(...);
let physics_specialist = ToolWrapper::new(...);

// Level 2: Domain coordinator
let science_coordinator = ToolWrapper::new(
    coordinator_broker,
    vec![Box::new(math_specialist), Box::new(physics_specialist)],
    "You coordinate math and physics specialists.",
    "science_coordinator",
    "Handles scientific queries."
);

// Level 3: Top-level coordinator
let main_coordinator = LlmBroker::new("qwen2.5:32b", gateway);
let main_tools: Vec<Box<dyn LlmTool>> = vec![
    Box::new(science_coordinator),
    // Other domain coordinators...
];
}

Best Practices

  1. Clear Specialization: Give each agent a well-defined area of expertise
  2. Descriptive Names: Use clear, descriptive names for tools (e.g., “temporal_specialist”, not “agent1”)
  3. Comprehensive Descriptions: Write detailed descriptions so coordinators understand when to delegate
  4. Model Selection: Use appropriate model sizes for each role:
    • Specialists: Smaller, faster models (7B-14B)
    • Coordinators: Larger, more capable models (32B+)
  5. Tool Composition: Specialists should have tools relevant to their domain
  6. Testing: Test both individual specialists and the full delegation chain

Limitations

  • Synchronous Interface: The LlmTool trait requires synchronous run methods, though async operations happen internally
  • No Streaming: Tool calls don’t support streaming responses (limitation of current tool architecture)
  • Context: Each tool invocation is independent; no automatic context sharing between calls
  • Error Handling: Tool errors propagate up to the coordinator; implement appropriate error handling

See Also

Chat Sessions with Tools

Combine conversational context with tool calling for grounded, actionable responses. ChatSession seamlessly integrates with Mojentic’s tool system to enable function calling within conversations.

Overview

When tools are provided to a ChatSession, the LLM can call them during the conversation. The broker handles:

  1. Detecting tool call requests
  2. Executing the appropriate tools
  3. Adding tool results to the conversation
  4. Continuing the conversation with the LLM

This happens transparently: you just call send() as normal.

Basic Tool Integration

Add tools when building the session:

use mojentic::llm::{ChatSession, LlmBroker};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let gateway = Arc::new(OllamaGateway::default());
    let broker = LlmBroker::new("qwen3:32b", gateway);

    // Create tools
    let tools: Vec<Box<dyn mojentic::llm::LlmTool>> = vec![
        Box::new(SimpleDateTool),
    ];

    // Build session with tools
    let mut session = ChatSession::builder(broker)
        .system_prompt(
            "You are a helpful assistant. When users ask about dates, \
             use the resolve_date tool to convert relative dates to absolute dates."
        )
        .tools(tools)
        .build();

    // The LLM will automatically use the tool when needed
    let response = session.send("What is tomorrow's date?").await?;
    println!("{}", response);

    Ok(())
}

How It Works

  1. User sends a message: session.send("What is tomorrow's date?")
  2. LLM recognizes need for tool: Generates a tool call request
  3. Broker executes tool: Calls SimpleDateTool with appropriate arguments
  4. Tool result added to history: As a tool response message
  5. LLM generates final response: Using the tool’s output
  6. Response returned to user: “Tomorrow’s date is 2025-11-14”

All of this happens in a single send() call.
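The loop above can be sketched in simplified, std-only form. The Reply enum, drive function, and string-based history below are hypothetical stand-ins; the real broker is async and works with LlmMessage and LlmTool types:

```rust
// Hypothetical, simplified sketch of the broker's tool-call loop.
enum Reply {
    ToolCall { name: String, arg: String },
    Text(String),
}

fn drive(
    mut llm: impl FnMut(&[String]) -> Reply,
    tool: impl Fn(&str) -> String,
    user: &str,
) -> String {
    // Step 1: the user message starts the history.
    let mut history = vec![format!("user: {user}")];
    loop {
        match llm(&history) {
            // Steps 2-4: the model requests a tool; run it and append
            // the result so the model sees it on the next turn.
            Reply::ToolCall { name, arg } => {
                let result = tool(&arg);
                history.push(format!("tool {name}: {result}"));
            }
            // Steps 5-6: a plain text reply ends the loop.
            Reply::Text(answer) => return answer,
        }
    }
}
```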

Multiple Tools

Provide multiple tools for different capabilities:

#![allow(unused)]
fn main() {
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
// Import other tools...

let tools: Vec<Box<dyn mojentic::llm::LlmTool>> = vec![
    Box::new(SimpleDateTool),
    Box::new(CalculatorTool),
    Box::new(WeatherTool),
];

let mut session = ChatSession::builder(broker)
    .system_prompt(
        "You are a helpful assistant with access to date resolution, \
         calculation, and weather information tools. Use them when appropriate."
    )
    .tools(tools)
    .build();
}

The LLM will choose which tool to use based on the user’s question.

Tool Call History

Tool calls and results are part of the conversation history:

#![allow(unused)]
fn main() {
// After a tool call
for msg in session.messages() {
    match msg.role() {
        MessageRole::User => println!("User: {}", msg.content().unwrap_or("")),
        MessageRole::Assistant => {
            if let Some(tool_calls) = &msg.message.tool_calls {
                println!("Assistant called tool: {:?}", tool_calls);
            } else {
                println!("Assistant: {}", msg.content().unwrap_or(""));
            }
        }
        MessageRole::Tool => println!("Tool result: {}", msg.content().unwrap_or("")),
        _ => {}
    }
}
}

Context Management with Tools

Tool calls and results consume tokens. The session manages this automatically:

#![allow(unused)]
fn main() {
let mut session = ChatSession::builder(broker)
    .max_context(8192)
    .tools(tools)
    .build();

// Even with tool calls, context trimming works
session.send("What's tomorrow?").await?;  // May trigger tool call
session.send("And the day after?").await?;  // May trigger another tool call

// Old messages (including old tool calls) are trimmed when needed
println!("Total tokens: {}", session.total_tokens());
}

Creating Custom Tools

Build your own tools for specific needs:

#![allow(unused)]
fn main() {
use mojentic::llm::tools::{LlmTool, ToolDescriptor, FunctionDescriptor};
use serde_json::{json, Value};
use std::collections::HashMap;

struct WeatherTool;

impl LlmTool for WeatherTool {
    fn run(&self, args: &HashMap<String, Value>) -> mojentic::error::Result<Value> {
        let city = args.get("city")
            .and_then(|v| v.as_str())
            .ok_or_else(|| mojentic::error::MojenticError::ToolError(
                "Missing city argument".to_string()
            ))?;

        // In reality, you'd call a weather API here
        Ok(json!({
            "city": city,
            "temperature": "72°F",
            "conditions": "Sunny"
        }))
    }

    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor {
            r#type: "function".to_string(),
            function: FunctionDescriptor {
                name: "get_weather".to_string(),
                description: "Get current weather for a city".to_string(),
                parameters: json!({
                    "type": "object",
                    "properties": {
                        "city": {
                            "type": "string",
                            "description": "The city name"
                        }
                    },
                    "required": ["city"]
                }),
            },
        }
    }
}
}

Then use it:

#![allow(unused)]
fn main() {
let tools: Vec<Box<dyn mojentic::llm::LlmTool>> = vec![
    Box::new(WeatherTool),
];

let mut session = ChatSession::builder(broker)
    .system_prompt("You can check weather using the get_weather tool.")
    .tools(tools)
    .build();

let response = session.send("What's the weather in San Francisco?").await?;
}

Interactive Example

Run the complete example:

cargo run --example chat_session_with_tool

This demonstrates:

  • Tool integration in conversation
  • Automatic tool invocation
  • Natural multi-turn dialogue with tool results

Best Practices

  1. Clear tool descriptions: Help the LLM understand when to use each tool
  2. Descriptive system prompts: Explain what tools are available
  3. Error handling in tools: Return meaningful errors for invalid inputs
  4. Tool result formatting: Return structured JSON that’s easy for the LLM to interpret
  5. Monitor token usage: Tools can add significant tokens to the conversation

Debugging Tool Calls

Enable tracing to see tool execution:

#![allow(unused)]
fn main() {
// Initialize tracing
tracing_subscriber::fmt::init();

// Tool calls will be logged
let response = session.send("What's tomorrow?").await?;
}

You’ll see logs like:

INFO Tool calls requested: 1
INFO Executing tool: resolve_date

Limitations

  • Tool calls count toward the context window
  • Very long tool results may need truncation
  • Not all models support tool calling equally well
  • Tool execution is synchronous (async tools not yet supported)

Next Steps

OpenAI Model Registry

Overview

The OpenAI Model Registry is a centralized system for managing model-specific configurations, capabilities, and parameter requirements for OpenAI models. It provides a compile-time type-safe registry that informs the gateway about which parameters and features each model supports.

This registry is essential for:

  • Determining which models support tools, streaming, or vision capabilities
  • Selecting the correct token limit parameter (max_tokens vs max_completion_tokens)
  • Understanding temperature support for reasoning models
  • Tracking which API endpoints each model supports
  • Providing sensible defaults when working with unknown models

Model Types

The registry categorizes models into four distinct types:

Reasoning Models

Models that use internal reasoning steps before generating responses. These models use max_completion_tokens instead of max_tokens.

#![allow(unused)]
fn main() {
ModelType::Reasoning
}

Examples: o1, o1-mini, o3, o3-mini, gpt-5, gpt-5.1

Chat Models

Standard chat completion models that use max_tokens for output control.

#![allow(unused)]
fn main() {
ModelType::Chat
}

Examples: gpt-4o, gpt-4-turbo, gpt-3.5-turbo

Embedding Models

Models designed for generating text embeddings.

#![allow(unused)]
fn main() {
ModelType::Embedding
}

Examples: text-embedding-3-small, text-embedding-3-large

Moderation Models

Models for content moderation and safety classification.

#![allow(unused)]
fn main() {
ModelType::Moderation
}

Examples: omni-moderation-latest, text-moderation-latest

Model Capabilities

The ModelCapabilities struct defines what each model can do. Here are all the fields:

Core Capabilities

  • model_type: ModelType - The type of model (Reasoning, Chat, Embedding, or Moderation)
  • supports_tools: bool - Whether the model supports function/tool calling
  • supports_streaming: bool - Whether the model supports streaming responses
  • supports_vision: bool - Whether the model can process image inputs

Token Limits

  • max_context_tokens: Option<u32> - Maximum input context window size in tokens
  • max_output_tokens: Option<u32> - Maximum number of output tokens the model can generate

Temperature Support

  • supported_temperatures: Option<Vec<f32>> - Temperature constraints:
    • None - Accepts any temperature value (most chat models)
    • Some(vec![]) - Does not accept temperature parameter (not exposed to users)
    • Some(vec![1.0]) - Only accepts temperature 1.0 (reasoning models like o1, o3)

API Endpoint Support

  • supports_chat_api: bool - Supports /v1/chat/completions endpoint
  • supports_completions_api: bool - Supports /v1/completions endpoint
  • supports_responses_api: bool - Supports /v1/responses endpoint

Note: The OpenAI gateway currently only calls the Chat API. These flags are informational and available for future gateway enhancements.

API Endpoint Categories

OpenAI models support different API endpoints based on their design and capabilities:

Chat API (Most Common)

The /v1/chat/completions endpoint is supported by most modern models:

  • All GPT-4 variants (except legacy completions models)
  • Reasoning models (o1, o3, o4-mini)
  • Base GPT-5 models

Completions API (Legacy)

The /v1/completions endpoint is supported by older models:

  • babbage-002
  • davinci-002
  • gpt-3.5-turbo-instruct

These models do not support the chat API format.

Both APIs

Some models support both chat and completions:

  • gpt-4o-mini
  • gpt-4.1-nano
  • gpt-5.1

Responses API (Newer Models)

The /v1/responses endpoint is supported by specialized models:

  • gpt-5-pro
  • codex-mini-latest

Using the Global Registry

The registry provides a global instance for convenience:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::{
    get_model_registry, ModelType
};

// Access the global registry
let registry = get_model_registry();

// Look up model capabilities
let caps = registry.get_model_capabilities("gpt-4o");
assert_eq!(caps.model_type, ModelType::Chat);
assert!(caps.supports_tools);
assert!(caps.supports_streaming);
assert!(caps.supports_vision);
}

Token Limit Parameters

Different model types use different parameter names for controlling output length:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;

let registry = get_model_registry();

// Chat models use "max_tokens"
let gpt4_caps = registry.get_model_capabilities("gpt-4o");
assert_eq!(gpt4_caps.get_token_limit_param(), "max_tokens");

// Reasoning models use "max_completion_tokens"
let o1_caps = registry.get_model_capabilities("o1");
assert_eq!(o1_caps.get_token_limit_param(), "max_completion_tokens");
}

Temperature Support

Reasoning models have restricted temperature support:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;

let registry = get_model_registry();

// Chat models support arbitrary temperatures
let gpt4_caps = registry.get_model_capabilities("gpt-4o");
assert!(gpt4_caps.supports_temperature(0.7));
assert!(gpt4_caps.supports_temperature(1.5));

// Reasoning models only support temperature 1.0
let o1_caps = registry.get_model_capabilities("o1");
assert!(o1_caps.supports_temperature(1.0));
assert!(!o1_caps.supports_temperature(0.7));
}

Checking Endpoint Support

You can check which API endpoints a model supports:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;

let registry = get_model_registry();

// Standard chat model - chat API only
let caps = registry.get_model_capabilities("gpt-4o");
assert!(caps.supports_chat_api);
assert!(!caps.supports_completions_api);
assert!(!caps.supports_responses_api);

// Dual-endpoint model
let mini_caps = registry.get_model_capabilities("gpt-4o-mini");
assert!(mini_caps.supports_chat_api);
assert!(mini_caps.supports_completions_api);
assert!(!mini_caps.supports_responses_api);

// Completions-only model
let instruct_caps = registry.get_model_capabilities("gpt-3.5-turbo-instruct");
assert!(!instruct_caps.supports_chat_api);
assert!(instruct_caps.supports_completions_api);
assert!(!instruct_caps.supports_responses_api);

// Responses-only model
let pro_caps = registry.get_model_capabilities("gpt-5-pro");
assert!(!pro_caps.supports_chat_api);
assert!(!pro_caps.supports_completions_api);
assert!(pro_caps.supports_responses_api);
}

Creating Custom Registries

While the global registry is convenient, you can create custom registries for testing or specialized configurations:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::{
    OpenAIModelRegistry, ModelCapabilities, ModelType
};

// Create a new empty registry
let mut custom_registry = OpenAIModelRegistry::new();

// Register a custom model
custom_registry.register_model("custom-model", ModelCapabilities {
    model_type: ModelType::Chat,
    supports_tools: true,
    supports_streaming: true,
    supports_vision: false,
    max_context_tokens: Some(8192),
    max_output_tokens: Some(4096),
    supported_temperatures: None, // Accepts any temperature
    supports_chat_api: true,
    supports_completions_api: false,
    supports_responses_api: false,
});

// Use the custom registry
let caps = custom_registry.get_model_capabilities("custom-model");
assert_eq!(caps.model_type, ModelType::Chat);
assert!(caps.supports_tools);
}

Pattern Matching for Unknown Models

When a model is not explicitly registered, the registry infers capabilities from the model name:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;

let registry = get_model_registry();

// Unknown model with "gpt-4" prefix inherits GPT-4 capabilities
let unknown_caps = registry.get_model_capabilities("gpt-4-future-model");
assert!(unknown_caps.supports_tools);
assert!(unknown_caps.supports_streaming);

// Unknown model with "o" prefix is treated as a reasoning model
let unknown_o_caps = registry.get_model_capabilities("o5-experimental");
assert_eq!(unknown_o_caps.get_token_limit_param(), "max_completion_tokens");
assert!(unknown_o_caps.supports_temperature(1.0));
assert!(!unknown_o_caps.supports_temperature(0.7));
}
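The kind of prefix rule such inference might use can be sketched as follows; this is a hypothetical illustration, and the registry's actual heuristics may differ:

```rust
/// Illustrative prefix-based fallback, not the registry's actual code:
/// names starting with "o" (o1, o3, o5-experimental, ...) are treated as
/// reasoning models, which use `max_completion_tokens`; everything else
/// falls back to the chat-model default of `max_tokens`.
fn infer_token_limit_param(model: &str) -> &'static str {
    if model.starts_with('o') {
        "max_completion_tokens"
    } else {
        "max_tokens"
    }
}
```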

Common Model Categories

Chat-Only Models

Most GPT-4, reasoning, and base GPT-5 models:

  • gpt-4o, gpt-4-turbo, gpt-4
  • o1, o1-mini, o3, o3-mini, o4-mini
  • gpt-5, gpt-5.1

Completions-Only Models

Legacy models that don’t support chat format:

  • babbage-002
  • davinci-002
  • gpt-3.5-turbo-instruct

Dual-Endpoint Models

Models supporting both chat and completions:

  • gpt-4o-mini
  • gpt-4.1-nano
  • gpt-5.1

Responses-Only Models

Specialized newer endpoint models:

  • gpt-5-pro
  • codex-mini-latest

Summary

The OpenAI Model Registry provides:

  • Type-safe model capability definitions
  • Automatic parameter selection based on model type
  • Temperature validation for reasoning models
  • API endpoint support tracking
  • Pattern matching for unknown models
  • A global registry for convenience
  • Custom registries for specialized use cases

By consulting the registry, the OpenAI gateway ensures it uses the correct parameters and features for each model, providing a robust and maintainable integration layer.

Layer 2 - Agents

Agents are the smallest unit of computation in Mojentic. They process events and emit new events, composing into larger systems.

This section mirrors the Python project’s agent layer but uses Rust’s async and type system.

Asynchronous Capabilities

Mojentic (Rust) embraces async end-to-end.

  • Concurrency for overlapping LLM calls
  • Channels/streams for event propagation
  • Timeouts and cancellation for robustness

Iterative Problem Solver

The IterativeProblemSolver is an agent that iteratively attempts to solve problems using available tools. It employs a chat-based approach and continues working until it succeeds, fails explicitly, or reaches the maximum number of iterations.

Overview

The Iterative Problem Solver follows a simple but powerful pattern:

  1. Plan - Analyze the problem and identify what needs to be done
  2. Act - Execute actions using available tools
  3. Observe - Review the results
  4. Refine - Adjust the approach based on observations
  5. Terminate - Stop when the goal is met or the iteration budget is exhausted

Key Features

  • Tool Integration: Seamlessly integrates with any LlmTool implementations
  • Automatic Termination: Stops when the LLM responds with “DONE” or “FAIL”
  • Iteration Control: Configurable maximum iteration count prevents infinite loops
  • Chat-Based Context: Maintains conversation history for context-aware problem solving
  • Summary Generation: Provides a clean summary of the final result

Usage

Basic Example

use mojentic::agents::IterativeProblemSolver;
use mojentic::llm::{LlmBroker, LlmTool};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the LLM broker
    let gateway = Arc::new(OllamaGateway::default());
    let broker = LlmBroker::new("qwen3:32b", gateway, None);

    // Define available tools
    let tools: Vec<Box<dyn LlmTool>> = vec![
        Box::new(SimpleDateTool),
    ];

    // Create the solver
    let mut solver = IterativeProblemSolver::builder(broker)
        .tools(tools)
        .max_iterations(5)
        .build();

    // Solve a problem
    let result = solver.solve("What's the date next Friday?").await?;
    println!("Result: {}", result);

    Ok(())
}

Custom System Prompt

You can customize the system prompt to guide the solver’s behavior:

#![allow(unused)]
fn main() {
let mut solver = IterativeProblemSolver::builder(broker)
    .tools(tools)
    .max_iterations(10)
    .system_prompt(
        "You are a specialized data analysis assistant. \
         Break down complex queries into clear steps and use tools methodically."
    )
    .build();
}

With Multiple Tools

The solver works best when given appropriate tools for the problem domain:

#![allow(unused)]
fn main() {
use mojentic::llm::tools::ask_user_tool::AskUserTool;
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;

let tools: Vec<Box<dyn LlmTool>> = vec![
    Box::new(AskUserTool::new()),
    Box::new(SimpleDateTool),
];

let mut solver = IterativeProblemSolver::builder(broker)
    .tools(tools)
    .max_iterations(5)
    .build();
}

How It Works

Step-by-Step Process

  1. Initialization: The solver creates a ChatSession with the provided system prompt and tools
  2. Iteration Loop: For each iteration:
    • Sends the problem description with instructions to use tools
    • Checks the response for “DONE” (success) or “FAIL” (failure)
    • Continues if neither keyword is present and iterations remain
  3. Summary: After termination, requests a concise summary of the result
  4. Return: Returns the summary as the final result

Termination Conditions

The solver terminates when one of these conditions is met:

  • Success: The LLM’s response contains “DONE” (case-insensitive)
  • Failure: The LLM’s response contains “FAIL” (case-insensitive)
  • Exhaustion: The maximum number of iterations is reached
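The termination check itself can be sketched as a small, std-only predicate. This is illustrative; the solver's actual parsing may differ, and note that in this sketch "DONE" wins if a response happens to contain both keywords:

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    Done,
    Failed,
    Continue,
}

/// Case-insensitive scan of the LLM's reply for the termination keywords.
fn classify(response: &str) -> Outcome {
    let upper = response.to_uppercase();
    if upper.contains("DONE") {
        Outcome::Done
    } else if upper.contains("FAIL") {
        Outcome::Failed
    } else {
        Outcome::Continue
    }
}
```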

Logging

The solver uses the tracing crate to log important events:

  • info: Logged when a task completes successfully or fails
  • warn: Logged when maximum iterations are reached

Configuration Options

Builder Pattern

The IterativeProblemSolver uses the builder pattern for configuration:

#![allow(unused)]
fn main() {
IterativeProblemSolver::builder(broker)
    .tools(tools)              // Set available tools
    .max_iterations(10)        // Set max iterations (default: 3)
    .system_prompt("...")      // Set custom system prompt
    .build()
}

Default Values

  • max_iterations: 3
  • system_prompt:

    “You are a problem-solving assistant that can solve complex problems step by step. You analyze problems, break them down into smaller parts, and solve them systematically. If you cannot solve a problem completely in one step, you make progress and identify what to do next.”

Best Practices

1. Choose Appropriate Tools

Select tools that are relevant to your problem domain:

#![allow(unused)]
fn main() {
// For date/time problems
let tools = vec![Box::new(SimpleDateTool)];

// For user interaction
let tools = vec![Box::new(AskUserTool::new())];

// For data analysis
let tools = vec![
    Box::new(CalculatorTool),
    Box::new(DataRetrievalTool),
];
}

2. Set Reasonable Iteration Limits

Balance between giving the solver enough attempts and preventing excessive computation:

  • Simple queries: 3-5 iterations
  • Complex analyses: 10-15 iterations
  • Open-ended exploration: 20+ iterations

3. Provide Context in Problem Description

The more context you provide, the better the solver can work:

#![allow(unused)]
fn main() {
// Less effective
solver.solve("Analyze the data").await?;

// More effective
solver.solve(
    "Analyze the sales data from Q1 2024. \
     Focus on trends in the technology sector. \
     Provide insights on growth patterns."
).await?;
}

4. Monitor Logs

Enable tracing to understand solver behavior:

#![allow(unused)]
fn main() {
tracing_subscriber::fmt()
    .with_max_level(tracing::Level::INFO)
    .init();
}

Common Patterns

Retry Logic

For operations that might fail transiently:

#![allow(unused)]
fn main() {
let mut attempts = 0;
let max_attempts = 3;

let result = loop {
    attempts += 1;
    match solver.solve(problem).await {
        Ok(result) if !result.contains("FAIL") => break result,
        Ok(_) if attempts < max_attempts => continue,
        Ok(result) => break result,
        Err(e) => return Err(e),
    }
};
}

Multi-Stage Problems

For problems that require multiple phases:

#![allow(unused)]
fn main() {
// Phase 1: Data gathering
let mut solver = IterativeProblemSolver::builder(broker.clone())
    .tools(data_tools)
    .max_iterations(5)
    .build();
let data = solver.solve("Gather all relevant data").await?;

// Phase 2: Analysis
let mut solver = IterativeProblemSolver::builder(broker)
    .tools(analysis_tools)
    .max_iterations(10)
    .build();
let analysis = solver.solve(&format!("Analyze: {}", data)).await?;
}

Examples

See the complete examples at:

  • examples/iterative_solver.rs - Basic usage with date and user interaction tools
  • examples/solver_chat_session.rs - Interactive chat session with solver delegation pattern

Error Handling

The solver returns Result<String, MojenticError>:

#![allow(unused)]
fn main() {
match solver.solve(problem).await {
    Ok(result) => println!("Solution: {}", result),
    Err(MojenticError::GatewayError(msg)) => {
        eprintln!("Gateway error: {}", msg);
    }
    Err(MojenticError::ToolError(msg)) => {
        eprintln!("Tool error: {}", msg);
    }
    Err(e) => {
        eprintln!("Unexpected error: {}", e);
    }
}
}

Advanced: Solver as a Tool

The IterativeProblemSolver can be wrapped as a tool and used within a ChatSession, enabling powerful delegation patterns where a chat assistant can offload complex problems to a specialized solver agent.

Creating a Solver Tool

#![allow(unused)]
fn main() {
use mojentic::llm::LlmBroker;
use mojentic::llm::tools::{FunctionDescriptor, LlmTool, ToolDescriptor};
use mojentic::agents::IterativeProblemSolver;
use serde_json::{json, Value};
use std::collections::HashMap;
use std::sync::Arc;

struct IterativeProblemSolverTool {
    broker: Arc<LlmBroker>,
    tools: Vec<Box<dyn LlmTool>>,
}

impl IterativeProblemSolverTool {
    fn new(broker: Arc<LlmBroker>, tools: Vec<Box<dyn LlmTool>>) -> Self {
        Self { broker, tools }
    }
}

impl Clone for IterativeProblemSolverTool {
    fn clone(&self) -> Self {
        Self {
            broker: self.broker.clone(),
            tools: self.tools.iter().map(|t| t.clone_box()).collect(),
        }
    }
}

impl LlmTool for IterativeProblemSolverTool {
    fn run(&self, args: &HashMap<String, Value>) -> mojentic::error::Result<Value> {
        let problem_to_solve = args
            .get("problem_to_solve")
            .and_then(|v| v.as_str())
            .ok_or_else(|| {
                mojentic::error::MojenticError::ToolError(
                    "Missing required argument: problem_to_solve".to_string(),
                )
            })?;

        let solver_tools: Vec<Box<dyn LlmTool>> =
            self.tools.iter().map(|t| t.clone_box()).collect();

        let broker_clone = (*self.broker).clone();

        // Handle::block_on panics when called from within an async context,
        // so run the solver via block_in_place (multi-thread runtime only).
        let result = tokio::task::block_in_place(|| {
            tokio::runtime::Handle::current().block_on(async move {
                let mut solver = IterativeProblemSolver::builder(broker_clone)
                    .tools(solver_tools)
                    .max_iterations(5)
                    .build();

                solver.solve(problem_to_solve).await
            })
        })?;

        Ok(json!({"solution": result}))
    }

    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor {
            r#type: "function".to_string(),
            function: FunctionDescriptor {
                name: "iterative_problem_solver".to_string(),
                description: "Iteratively solve a complex multi-step problem using available tools.".to_string(),
                parameters: json!({
                    "type": "object",
                    "properties": {
                        "problem_to_solve": {
                            "type": "string",
                            "description": "The problem or request to be solved."
                        }
                    },
                    "required": ["problem_to_solve"],
                    "additionalProperties": false
                }),
            },
        }
    }

    fn clone_box(&self) -> Box<dyn LlmTool> {
        Box::new(self.clone())
    }
}
}

Using the Solver Tool in a Chat Session

use mojentic::llm::{ChatSession, LlmBroker};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
use mojentic::llm::tools::LlmTool;
use std::io::{self, Write};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let gateway = Arc::new(OllamaGateway::default());
    let broker = Arc::new(LlmBroker::new("qwq", gateway, None));

    // Create the solver tool with SimpleDateTool as the inner tool
    let solver_tools: Vec<Box<dyn LlmTool>> = vec![Box::new(SimpleDateTool)];
    let solver_tool = IterativeProblemSolverTool::new(broker.clone(), solver_tools);

    // Create chat session with the solver tool
    let mut session = ChatSession::builder((*broker).clone())
        .system_prompt(
            "You are a helpful assistant with access to an iterative problem solver. \
             When faced with complex multi-step problems or questions that require \
             reasoning and tool usage, use the iterative_problem_solver tool."
        )
        .tools(vec![Box::new(solver_tool)])
        .build();

    // Interactive loop
    loop {
        let mut query = String::new();
        print!("Query: ");
        io::stdout().flush()?;
        io::stdin().read_line(&mut query)?;

        if query.trim().is_empty() {
            break;
        }

        let response = session.send(query.trim()).await?;
        println!("{}\n", response);
    }

    Ok(())
}

Benefits of This Pattern

  1. Delegation: The chat assistant can offload complex problems to a specialized solver
  2. Composability: Mix solver capabilities with other tools in the same session
  3. Context Preservation: The chat session maintains conversation history
  4. Flexible Interaction: Users can ask simple questions directly or pose complex problems that trigger the solver

See the complete example at:

  • examples/solver_chat_session.rs - Interactive chat session with solver delegation

Limitations

  • LLM Dependency: Quality of results depends on the underlying LLM’s capabilities
  • Tool Design: Effectiveness relies on well-designed tools with clear descriptions
  • Token Limits: Long iterations may hit context window limits
  • Cost: Multiple LLM calls per problem can increase API costs

See Also

Working Memory Pattern

The working memory pattern enables agents to maintain and share context across multiple interactions. This guide shows how to use SharedWorkingMemory and build memory-aware agents in Rust.

Overview

Working memory provides:

  • Shared Context: Multiple agents can read from and write to the same memory
  • Continuous Learning: Agents automatically learn and remember new information
  • State Persistence: Knowledge is maintained across interactions
  • Thread Safety: Safe concurrent access using Arc<Mutex<T>>

Quick Start

Basic Usage

#![allow(unused)]
fn main() {
use mojentic::context::SharedWorkingMemory;
use serde_json::json;

// Create memory with initial data
let memory = SharedWorkingMemory::new(json!({
    "User": {
        "name": "Alice",
        "age": 30
    }
}));

// Retrieve current state
let current = memory.get_working_memory();

// Update memory (deep merge)
memory.merge_to_working_memory(&json!({
    "User": {
        "city": "NYC",
        "preferences": {
            "theme": "dark"
        }
    }
}));

// Result: {"User": {"name": "Alice", "age": 30, "city": "NYC", "preferences": {...}}}
}

Memory-Aware Agent Pattern

use mojentic::llm::{LlmBroker, LlmMessage};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::context::SharedWorkingMemory;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize
    let gateway = OllamaGateway::default();
    let broker = LlmBroker::new("qwen2.5:7b", gateway);
    let memory = SharedWorkingMemory::new(json!({
        "User": {"name": "Alice"}
    }));

    // Create prompt with memory context
    let memory_context = memory.get_working_memory();
    let prompt = format!(
        "This is what you remember: {}\n\nRemember anything new you learn.\n\nUser: I love pizza",
        serde_json::to_string_pretty(&memory_context)?
    );

    // Generate response with schema that includes memory field
    let schema = json!({
        "type": "object",
        "required": ["answer"],
        "properties": {
            "answer": {"type": "string"},
            "memory": {
                "type": "object",
                "description": "Add anything new you learned here."
            }
        }
    });

    let response = broker.generate(
        vec![LlmMessage::user(&prompt)],
        Some(schema),
        None,
        None,
        None,
        None
    ).await?;

    // Parse response and update memory
    let response_json: serde_json::Value = serde_json::from_str(&response.content)?;
    if let Some(learned) = response_json.get("memory") {
        memory.merge_to_working_memory(learned);
    }

    Ok(())
}

Core Concepts

SharedWorkingMemory

A thread-safe, mutable key-value store that agents use to share context:

#![allow(unused)]
fn main() {
pub struct SharedWorkingMemory {
    memory: Arc<Mutex<HashMap<String, Value>>>,
}

impl SharedWorkingMemory {
    pub fn new(initial: Value) -> Self
    pub fn get_working_memory(&self) -> Value
    pub fn merge_to_working_memory(&self, updates: &Value)
}
}

Key features:

  • Thread-Safe: Uses Arc<Mutex<T>> for concurrent access
  • Deep Merge: Nested JSON objects are recursively merged
  • Simple API: Just 3 methods to learn

Deep Merge Behavior

Memory updates use deep merge to preserve existing data:

#![allow(unused)]
fn main() {
use serde_json::json;

// Initial memory
let memory = SharedWorkingMemory::new(json!({
    "User": {
        "name": "Alice",
        "age": 30,
        "address": {
            "city": "NYC",
            "state": "NY"
        }
    }
}));

// Update with nested data
memory.merge_to_working_memory(&json!({
    "User": {
        "age": 31,
        "address": {
            "zip": "10001"
        }
    }
}));

// Result: All fields preserved, nested objects merged
// {
//   "User": {
//     "name": "Alice",      // Preserved
//     "age": 31,            // Updated
//     "address": {
//       "city": "NYC",      // Preserved
//       "state": "NY",      // Preserved
//       "zip": "10001"      // Added
//     }
//   }
// }
}
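
The merge rule illustrated above can be written as a short recursion. The sketch below is stdlib-only and uses a tiny JSON-like enum of my own (the real SharedWorkingMemory operates on serde_json values), so take the names as illustrative:

```rust
use std::collections::BTreeMap;

// Minimal JSON-like value for demonstration purposes only.
#[derive(Clone, Debug, PartialEq)]
enum Json {
    Num(i64),
    Str(String),
    Obj(BTreeMap<String, Json>),
}

// Deep merge: objects merge recursively key by key; anything else is
// replaced by the update.
fn deep_merge(base: &mut Json, update: &Json) {
    match (base, update) {
        (Json::Obj(b), Json::Obj(u)) => {
            for (k, v) in u {
                match b.get_mut(k) {
                    Some(slot) => deep_merge(slot, v),
                    None => {
                        b.insert(k.clone(), v.clone());
                    }
                }
            }
        }
        // Scalars (and mismatched kinds) are overwritten by the update.
        (slot, v) => *slot = v.clone(),
    }
}

fn main() {
    let mut user = BTreeMap::new();
    user.insert("name".to_string(), Json::Str("Alice".into()));
    user.insert("age".to_string(), Json::Num(30));
    let mut mem = Json::Obj(BTreeMap::from([("User".to_string(), Json::Obj(user))]));

    let update = Json::Obj(BTreeMap::from([(
        "User".to_string(),
        Json::Obj(BTreeMap::from([("age".to_string(), Json::Num(31))])),
    )]));

    deep_merge(&mut mem, &update);

    // "name" is preserved while "age" is updated.
    if let Json::Obj(root) = &mem {
        if let Some(Json::Obj(u)) = root.get("User") {
            assert_eq!(u.get("age"), Some(&Json::Num(31)));
            assert_eq!(u.get("name"), Some(&Json::Str("Alice".into())));
        }
    }
}
```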

Building Memory-Aware Agents

Here’s a complete pattern for memory-aware agents:

#![allow(unused)]
fn main() {
use mojentic::llm::{LlmBroker, LlmMessage};
use mojentic::context::SharedWorkingMemory;
use mojentic::agents::{Event, BaseAsyncAgent};
use serde::{Deserialize, Serialize};
use serde_json::json;
use std::sync::Arc;
use async_trait::async_trait;

#[derive(Serialize, Deserialize)]
struct ResponseModel {
    text: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    memory: Option<serde_json::Value>,
}

struct MemoryAgent {
    broker: Arc<LlmBroker>,
    memory: SharedWorkingMemory,
    behaviour: String,
    instructions: String,
}

impl MemoryAgent {
    fn new(
        broker: Arc<LlmBroker>,
        memory: SharedWorkingMemory,
        behaviour: String,
        instructions: String,
    ) -> Self {
        Self { broker, memory, behaviour, instructions }
    }

    async fn generate_with_memory(
        &self,
        user_input: &str,
    ) -> Result<ResponseModel, Box<dyn std::error::Error>> {
        // Build prompt with memory context
        let memory_context = self.memory.get_working_memory();
        let messages = vec![
            LlmMessage::system(&self.behaviour),
            LlmMessage::user(&format!(
                "This is what you remember:\n{}\n\n{}",
                serde_json::to_string_pretty(&memory_context)?,
                self.instructions
            )),
            LlmMessage::user(user_input),
        ];

        // Schema with memory field
        let schema = json!({
            "type": "object",
            "required": ["text"],
            "properties": {
                "text": {"type": "string"},
                "memory": {
                    "type": "object",
                    "description": "Add new information here."
                }
            }
        });

        // Generate response
        let response = self.broker.generate(
            messages,
            Some(schema),
            None,
            None,
            None,
            None
        ).await?;

        // Parse and update memory
        let result: ResponseModel = serde_json::from_str(&response.content)?;
        if let Some(ref learned) = result.memory {
            self.memory.merge_to_working_memory(learned);
        }

        Ok(result)
    }
}
}

Multi-Agent Coordination

Multiple agents can share the same memory instance:

#![allow(unused)]
fn main() {
use std::sync::Arc;

// Shared memory: SharedWorkingMemory is internally Arc-backed, so clones
// share the same underlying store.
let memory = SharedWorkingMemory::new(json!({
    "context": {}
}));

// Multiple agents
let researcher = MemoryAgent::new(
    Arc::clone(&broker),
    memory.clone(),
    "You are a research assistant.".to_string(),
    "Research topics thoroughly.".to_string(),
);

let writer = MemoryAgent::new(
    Arc::clone(&broker),
    memory.clone(),
    "You are a technical writer.".to_string(),
    "Write clear documentation.".to_string(),
);

// Researcher updates memory
let research = researcher
    .generate_with_memory("Research Rust async patterns")
    .await?;

// Writer uses updated memory (already shared)
let article = writer
    .generate_with_memory("Write an article about what was researched")
    .await?;
}

Use Cases

1. Conversational Chatbots

#![allow(unused)]
fn main() {
let memory = SharedWorkingMemory::new(json!({
    "conversation_history": [],
    "user_preferences": {}
}));
}

2. Workflow Automation

#![allow(unused)]
fn main() {
let memory = SharedWorkingMemory::new(json!({
    "workflow_state": "started",
    "completed_steps": [],
    "pending_tasks": []
}));
}

3. Knowledge Base Building

#![allow(unused)]
fn main() {
let memory = SharedWorkingMemory::new(json!({
    "entities": {},
    "relationships": [],
    "facts": []
}));
}

Best Practices

1. Structure Your Memory

Use clear, hierarchical keys:

#![allow(unused)]
fn main() {
json!({
    "User": {...},
    "Conversation": {...},
    "SystemState": {...}
})
}

2. Use Arc for Sharing

Share memory across threads/agents:

#![allow(unused)]
fn main() {
let memory = Arc::new(SharedWorkingMemory::new(initial));
let memory_clone = Arc::clone(&memory);
}

3. Validate Memory Updates

Check memory quality before accepting:

#![allow(unused)]
fn main() {
if let Some(ref learned) = result.memory {
    if is_valid_update(learned) {
        memory.merge_to_working_memory(learned);
    }
}
}

4. Handle Errors Gracefully

Memory operations can fail:

#![allow(unused)]
fn main() {
match agent.generate_with_memory(input).await {
    Ok(response) => {
        // Process response
    }
    Err(e) => {
        eprintln!("Failed to generate response: {}", e);
        // Don't update memory on error
    }
}
}

Example Application

See the complete working memory example:

cd mojentic-ru
cargo run --example working_memory

The example demonstrates:

  • Initializing memory with user data
  • RequestAgent that learns from conversation
  • Event-driven coordination with AsyncDispatcher
  • Memory persistence across interactions

API Reference

SharedWorkingMemory

#![allow(unused)]
fn main() {
impl SharedWorkingMemory {
    /// Create new memory with initial data
    pub fn new(initial: Value) -> Self

    /// Get current memory snapshot
    pub fn get_working_memory(&self) -> Value

    /// Deep merge updates into memory
    pub fn merge_to_working_memory(&self, updates: &Value)
}
}

See src/context/shared_working_memory.rs for full documentation.

Thread Safety

SharedWorkingMemory is thread-safe and can be shared across async tasks:

#![allow(unused)]
fn main() {
use tokio::task;

let memory = Arc::new(SharedWorkingMemory::new(initial));

let task1 = {
    let memory = Arc::clone(&memory);
    task::spawn(async move {
        memory.merge_to_working_memory(&updates1);
    })
};

let task2 = {
    let memory = Arc::clone(&memory);
    task::spawn(async move {
        memory.merge_to_working_memory(&updates2);
    })
};

task1.await?;
task2.await?;
}

Observability

Tracing and introspection help debug agentic flows, measure latency, and optimize tool usage.

Components:

  • Tracer API for spans/events
  • Instrumented broker calls
  • Future: metrics export

Tracer System

The tracer system provides comprehensive observability into LLM interactions, tool executions, and agent communications. It records events with timestamps and correlation IDs, enabling detailed debugging and monitoring.

Goals:

  • Performance insight
  • Failure localization
  • Reproducibility of agent flows

Architecture

The tracer system consists of several key components:

Core Components

  • TracerEvent: Base trait for all event types with timestamps, correlation IDs, and printable summaries
  • EventStore: Thread-safe storage for events with callbacks and filtering capabilities
  • TracerSystem: Coordination layer providing convenience methods for recording events
  • NullTracer: Null object pattern implementation for when tracing is disabled

Event Types

The system supports four main event types:

  1. LlmCallTracerEvent: Records LLM calls with model, messages, temperature, and available tools
  2. LlmResponseTracerEvent: Records LLM responses with content, tool calls, and duration
  3. ToolCallTracerEvent: Records tool executions with arguments, results, and duration
  4. AgentInteractionTracerEvent: Records agent-to-agent communications
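
As a mental model, the four event types plus the shared TracerEvent metadata can be pictured as an enum with a common envelope. The field names below are assumptions for illustration, not the crate’s actual API:

```rust
// Illustrative taxonomy of the four event kinds described above.
#[derive(Debug)]
enum TracerEventKind {
    LlmCall { model: String, temperature: f64 },
    LlmResponse { content: String, duration_ms: Option<f64> },
    ToolCall { tool_name: String, duration_ms: Option<f64> },
    AgentInteraction { from_agent: String, to_agent: String },
}

// Metadata every event carries, per the TracerEvent trait description.
struct Recorded {
    timestamp_secs: f64,
    correlation_id: String,
    kind: TracerEventKind,
}

impl Recorded {
    fn printable_summary(&self) -> String {
        match &self.kind {
            TracerEventKind::LlmCall { model, .. } => format!("LlmCall[{model}]"),
            TracerEventKind::LlmResponse { .. } => "LlmResponse".to_string(),
            TracerEventKind::ToolCall { tool_name, .. } => format!("ToolCall[{tool_name}]"),
            TracerEventKind::AgentInteraction { from_agent, to_agent } => {
                format!("AgentInteraction[{from_agent} -> {to_agent}]")
            }
        }
    }
}

fn main() {
    let evt = Recorded {
        timestamp_secs: 0.0,
        correlation_id: "corr-1".to_string(),
        kind: TracerEventKind::ToolCall {
            tool_name: "date_tool".to_string(),
            duration_ms: Some(25.0),
        },
    };
    assert_eq!(evt.printable_summary(), "ToolCall[date_tool]");
}
```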

Usage

Basic Setup

#![allow(unused)]
fn main() {
use mojentic::tracer::TracerSystem;
use std::sync::Arc;

// Create a tracer system
let tracer = Arc::new(TracerSystem::default());

// Enable/disable tracing
tracer.enable();
tracer.disable();
}

Recording Events

LLM Calls

#![allow(unused)]
fn main() {
use std::collections::HashMap;

tracer.record_llm_call(
    "llama3.2",                    // model
    vec![],                        // messages (simplified)
    0.7,                           // temperature
    None,                          // tools
    "my_broker",                   // source
    "correlation-123"              // correlation_id
);
}

LLM Responses

#![allow(unused)]
fn main() {
tracer.record_llm_response(
    "llama3.2",                    // model
    "Response text",               // content
    None,                          // tool_calls
    Some(150.5),                   // call_duration_ms
    "my_broker",                   // source
    "correlation-123"              // correlation_id
);
}

Tool Calls

#![allow(unused)]
fn main() {
use serde_json::json;

let mut arguments = HashMap::new();
arguments.insert("input".to_string(), json!("test data"));

tracer.record_tool_call(
    "date_tool",                   // tool_name
    arguments,                     // arguments
    json!({"result": "2024-01-15"}), // result
    Some("agent1".to_string()),    // caller
    Some(25.0),                    // call_duration_ms
    "tool_executor",               // source
    "correlation-123"              // correlation_id
);
}

Agent Interactions

#![allow(unused)]
fn main() {
tracer.record_agent_interaction(
    "agent1",                      // from_agent
    "agent2",                      // to_agent
    "request",                     // event_type
    Some("evt-456".to_string()),   // event_id
    "dispatcher",                  // source
    "correlation-123"              // correlation_id
);
}

Querying Events

Get All Events

#![allow(unused)]
fn main() {
let summaries = tracer.get_event_summaries(None, None, None);
for summary in summaries {
    println!("{}", summary);
}
}

Filter by Time Range

#![allow(unused)]
fn main() {
use std::time::{SystemTime, UNIX_EPOCH};

let now = SystemTime::now()
    .duration_since(UNIX_EPOCH)
    .unwrap()
    .as_secs_f64();

let one_hour_ago = now - 3600.0;

let recent_events = tracer.get_event_summaries(
    Some(one_hour_ago),  // start_time
    Some(now),           // end_time
    None                 // filter_func
);
}

Filter by Custom Predicate

#![allow(unused)]
fn main() {
// Find all events from a specific correlation ID
let correlation_id = "correlation-123";
let related_events = tracer.get_event_summaries(
    None,
    None,
    Some(&|event| event.correlation_id() == correlation_id)
);

// Count tool call events
let tool_call_count = tracer.count_events(
    None,
    None,
    Some(&|event| {
        event.printable_summary().contains("ToolCallTracerEvent")
    })
);
}

Get Last N Events

#![allow(unused)]
fn main() {
// Get last 10 events
let last_events = tracer.get_last_n_summaries(10, None);

// Get last 5 LLM-related events
let last_llm_events = tracer.get_last_n_summaries(
    5,
    Some(&|event| {
        let summary = event.printable_summary();
        summary.contains("LlmCallTracerEvent") ||
        summary.contains("LlmResponseTracerEvent")
    })
);
}

Managing Events

#![allow(unused)]
fn main() {
// Get event count
let total_events = tracer.len();
println!("Total events: {}", total_events);

// Check if empty
if tracer.is_empty() {
    println!("No events recorded");
}

// Clear all events
tracer.clear();
}

Correlation IDs

Correlation IDs are UUIDs that track related events across the system. They enable you to trace all events related to a single request or operation, creating a complete audit trail.

Best Practices

  1. Generate Once: Create a correlation ID at the start of a request
  2. Pass Through: Copy the correlation ID to all downstream operations
  3. Query Together: Use correlation IDs to filter related events

Example flow:

User Request → Generate correlation_id
  ↓
LLM Call (correlation_id: "abc-123")
  ↓
LLM Response (correlation_id: "abc-123")
  ↓
Tool Call (correlation_id: "abc-123")
  ↓
LLM Call with tool result (correlation_id: "abc-123")
  ↓
Final LLM Response (correlation_id: "abc-123")
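
The generate-once / pass-through / query-together flow above can be sketched without the tracer at all. This stdlib-only sketch uses a timestamp-based id (the real system uses UUIDs and the tracer’s EventStore):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Generate the correlation id once at the start of a request.
fn new_correlation_id() -> String {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_nanos();
    format!("corr-{nanos}")
}

fn main() {
    let corr = new_correlation_id();
    let mut trail: Vec<(String, &str)> = Vec::new(); // (correlation_id, summary)

    // Every downstream step of the request reuses the same id.
    trail.push((corr.clone(), "LlmCall"));
    trail.push((corr.clone(), "ToolCall"));
    trail.push((corr.clone(), "LlmResponse"));
    // An unrelated request gets its own id.
    trail.push(("corr-other".to_string(), "LlmCall"));

    // Query together: filter the trail by the correlation id.
    let related = trail.iter().filter(|(id, _)| *id == corr).count();
    assert_eq!(related, 3);
}
```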

Event Store Callbacks

You can register a callback function that’s called whenever an event is stored:

#![allow(unused)]
fn main() {
use mojentic::tracer::{EventStore, EventCallback, TracerSystem};
use std::sync::{Arc, atomic::{AtomicUsize, Ordering}};

// Create a counter
let event_count = Arc::new(AtomicUsize::new(0));
let count_clone = Arc::clone(&event_count);

// Create callback
let callback: EventCallback = Arc::new(move |event| {
    count_clone.fetch_add(1, Ordering::SeqCst);
    println!("Event stored: {}", event.printable_summary());
});

// Create event store with callback
let event_store = Arc::new(EventStore::new(Some(callback)));

// Create tracer with custom event store
let tracer = Arc::new(TracerSystem::new(Some(event_store), true));

// Events will trigger the callback
tracer.record_llm_call("llama3.2", vec![], 1.0, None, "test", "corr-1");

println!("Total events: {}", event_count.load(Ordering::SeqCst));
}

Null Tracer

For environments where tracing should be completely disabled without conditional checks:

#![allow(unused)]
fn main() {
use mojentic::tracer::NullTracer;
use std::collections::HashMap;
use serde_json::json;

let tracer = NullTracer::new();

// All operations are no-ops
tracer.record_llm_call("model", vec![], 1.0, None, "source", "id");
tracer.record_tool_call("tool", HashMap::new(), json!({}), None, None, "source", "id");

// Queries return empty results
assert!(tracer.is_empty());
assert_eq!(tracer.len(), 0);
assert_eq!(tracer.get_event_summaries(None, None, None).len(), 0);
}

Performance Considerations

  1. Thread Safety: EventStore uses Arc<Mutex<>> for thread-safe access
  2. Memory: Events are stored in memory; clear periodically for long-running processes
  3. Overhead: Minimal when disabled; consider NullTracer for zero overhead
  4. Callbacks: Keep callback functions fast to avoid blocking event recording
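
Point 2 (memory growth) can be handled with a periodic size check before recording. A stdlib-only sketch with a hypothetical BoundedLog type; since the real EventStore exposes len() and clear(), the same check applies there:

```rust
// Bound in-memory events by clearing once a threshold is reached.
struct BoundedLog {
    events: Vec<String>,
    max_events: usize,
}

impl BoundedLog {
    fn record(&mut self, event: String) {
        if self.events.len() >= self.max_events {
            // A real deployment might persist the batch before dropping it.
            self.events.clear();
        }
        self.events.push(event);
    }
}

fn main() {
    let mut log = BoundedLog { events: Vec::new(), max_events: 3 };
    for i in 0..7 {
        log.record(format!("event-{i}"));
    }
    // The store never grows past its configured bound.
    assert!(log.events.len() <= 3);
}
```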

Integration with LlmBroker

The tracer integrates with LlmBroker to automatically record LLM interactions (implementation in progress):

use mojentic::llm::{LlmBroker, LlmMessage};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::tracer::TracerSystem;
use std::sync::Arc;

#[tokio::main]
async fn main() -> mojentic::Result<()> {
    // Create tracer
    let tracer = Arc::new(TracerSystem::default());

    // Create broker with tracer
    let gateway = Arc::new(OllamaGateway::default());
    let broker = LlmBroker::builder("llama3.2", gateway)
        .with_tracer(tracer.clone())
        .build();

    // Broker will automatically record events
    let _response = broker.generate(
        &[LlmMessage::user("Hello!")],
        None,
        None
    ).await?;

    // Query tracer for events
    let events = tracer.get_event_summaries(None, None, None);
    for event in events {
        println!("{}", event);
    }

    Ok(())
}

Testing

The tracer system includes comprehensive unit tests covering:

  • Event creation and formatting
  • Event storage and retrieval
  • Callbacks and filtering
  • TracerSystem coordination
  • NullTracer behavior

Run tests with:

cargo test tracer

Implementation Status

✅ Layer 2 Tracer System - Core Complete

Current implementation:

  • ✅ TracerEvent trait with 4 event types
  • ✅ EventStore with callbacks and filtering
  • ✅ TracerSystem coordination layer
  • ✅ NullTracer for zero-overhead tracing
  • ✅ 24 comprehensive unit tests (all passing)
  • ✅ Correlation ID support
  • ✅ Documentation complete

Pending integration:

  • ⏳ LlmBroker integration (add tracer parameter)
  • ⏳ Tool system integration (add tracer parameter)
  • ⏳ Example application (tracer_demo.rs)

See Also

Comprehensive Guide

End-to-end walkthrough of observing a multi-agent run.

(Placeholder for full example once agents crate stabilized.)

API Documentation

The full Rust API reference is generated with cargo doc and published under /api/ of the documentation site.

  • Visit: /api/
  • Crate landing page: /api/mojentic/

Note: During local development, run:

# Build API docs
cargo doc --no-deps --all-features
# Build the mdBook
mdbook build book
# Preview: serve book/book/ and open the API docs built into target/doc/ (published under /api/)

Contributing

We welcome issues and PRs. Please follow Rust style (rustfmt, clippy) and add tests for new behavior.

Extending Mojentic

This guide shows how to extend the Mojentic framework with new gateways and custom functionality.

Adding a New LLM Gateway

To add support for a new LLM provider, implement the LlmGateway trait:

1. Create the Gateway File

#![allow(unused)]
fn main() {
// src/llm/gateways/openai.rs
use crate::error::{MojenticError, Result};
use crate::llm::gateway::{CompletionConfig, LlmGateway};
use crate::llm::models::{LlmGatewayResponse, LlmMessage};
use crate::llm::tools::LlmTool;
use async_trait::async_trait;
use serde_json::Value;

pub struct OpenAiGateway {
    client: reqwest::Client,
    api_key: String,
    base_url: String,
}

impl OpenAiGateway {
    pub fn new(api_key: String) -> Self {
        Self {
            client: reqwest::Client::new(),
            api_key,
            base_url: "https://api.openai.com/v1".to_string(),
        }
    }
}

#[async_trait]
impl LlmGateway for OpenAiGateway {
    async fn complete(
        &self,
        model: &str,
        messages: &[LlmMessage],
        tools: Option<&[Box<dyn LlmTool>]>,
        config: &CompletionConfig,
    ) -> Result<LlmGatewayResponse> {
        // Implement OpenAI completion API call
        todo!()
    }

    async fn complete_json(
        &self,
        model: &str,
        messages: &[LlmMessage],
        schema: Value,
        config: &CompletionConfig,
    ) -> Result<Value> {
        // Implement OpenAI structured output
        todo!()
    }

    async fn get_available_models(&self) -> Result<Vec<String>> {
        // Implement model listing
        todo!()
    }

    async fn calculate_embeddings(
        &self,
        text: &str,
        model: Option<&str>,
    ) -> Result<Vec<f32>> {
        // Implement embeddings API
        todo!()
    }
}
}

2. Export the Gateway

#![allow(unused)]
fn main() {
// src/llm/gateways/mod.rs
pub mod ollama;
pub mod openai;  // Add this line

pub use ollama::{OllamaConfig, OllamaGateway};
pub use openai::OpenAiGateway;  // Add this line
}

3. Use the Gateway

use mojentic::prelude::*;
use mojentic::llm::gateways::OpenAiGateway;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    let gateway = OpenAiGateway::new("your-api-key".to_string());
    let broker = LlmBroker::new("gpt-4", Arc::new(gateway));

    // Use as normal
    let messages = vec![LlmMessage::user("Hello!")];
    let response = broker.generate(&messages, None, None).await?;

    Ok(())
}

Creating Custom Message Types

You can add helper constructors for specific message types. Rust’s coherence rules forbid inherent impls on a type defined in another crate, so define them on an extension trait instead:

#![allow(unused)]
fn main() {
use mojentic::llm::models::{LlmMessage, MessageRole};

trait LlmMessageExt {
    /// Create a user message that references an image
    fn user_with_image(content: impl Into<String>, image_path: impl Into<String>) -> Self;

    /// Create a system message with specific formatting
    fn system_instruction(instruction: impl Into<String>) -> Self;
}

impl LlmMessageExt for LlmMessage {
    fn user_with_image(content: impl Into<String>, image_path: impl Into<String>) -> Self {
        Self {
            role: MessageRole::User,
            content: Some(content.into()),
            tool_calls: None,
            image_paths: Some(vec![image_path.into()]),
        }
    }

    fn system_instruction(instruction: impl Into<String>) -> Self {
        let content = format!("SYSTEM INSTRUCTION: {}", instruction.into());
        Self::system(content)
    }
}
}

Implementing Structured Output Types

Define your own structured output types with JSON Schema:

use serde::{Deserialize, Serialize};
use schemars::JsonSchema;

#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct ProductReview {
    rating: u8,
    pros: Vec<String>,
    cons: Vec<String>,
    recommendation: bool,
    summary: String,
}

#[tokio::main]
async fn main() -> mojentic::Result<()> {
    let gateway = OllamaGateway::new();
    let broker = LlmBroker::new("qwen3:32b", Arc::new(gateway));

    let messages = vec![
        LlmMessage::user(
            "Review this product: A wireless mouse with RGB lighting, \
             3200 DPI, 6 buttons, and 40-hour battery life."
        )
    ];

    let review: ProductReview = broker.generate_object(&messages, None).await?;

    println!("Rating: {}/5", review.rating);
    println!("Pros: {:?}", review.pros);
    println!("Cons: {:?}", review.cons);
    println!("Recommended: {}", review.recommendation);

    Ok(())
}

Advanced: Custom Gateway Configuration

Create configuration structs for your gateways:

#![allow(unused)]
fn main() {
use std::collections::HashMap;

#[derive(Debug, Clone)]
pub struct CustomGatewayConfig {
    pub endpoint: String,
    pub timeout: std::time::Duration,
    pub retry_attempts: u32,
    pub custom_headers: HashMap<String, String>,
}

impl Default for CustomGatewayConfig {
    fn default() -> Self {
        Self {
            endpoint: std::env::var("CUSTOM_ENDPOINT")
                .unwrap_or_else(|_| "https://api.example.com".to_string()),
            timeout: std::time::Duration::from_secs(30),
            retry_attempts: 3,
            custom_headers: HashMap::new(),
        }
    }
}

pub struct CustomGateway {
    client: reqwest::Client,
    config: CustomGatewayConfig,
}

impl CustomGateway {
    pub fn new() -> Self {
        Self::with_config(CustomGatewayConfig::default())
    }

    pub fn with_config(config: CustomGatewayConfig) -> Self {
        // Building a client from a static configuration should only fail on
        // invalid settings; in library code, prefer returning a Result from
        // `with_config` rather than panicking here.
        let client = reqwest::Client::builder()
            .timeout(config.timeout)
            .build()
            .expect("reqwest client builder failed");

        Self { client, config }
    }
}
}

Testing Your Extensions

Create tests for your gateways and tools:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    #[ignore = "requires a running Ollama instance"]
    async fn test_gateway_models() {
        let gateway = OllamaGateway::new();
        let models = gateway.get_available_models().await.unwrap();
        assert!(!models.is_empty());
    }
}
}

Best Practices

  1. Error Handling: Always return dedicated error types; never panic in library code
  2. Async: All I/O operations should be async
  3. Documentation: Add rustdoc comments to all public APIs
  4. Testing: Write tests for your implementations
  5. Configuration: Support environment variables for API keys and endpoints
  6. Type Safety: Leverage Rust’s type system for compile-time guarantees

Need Help?

  • Check the existing gateway implementation in src/llm/gateways/ollama.rs
  • Review the tool example in examples/tool_usage.rs
  • See the Building Tools guide for tool development

Testing

  • Unit tests: cargo test --all-features
  • Lints: cargo clippy -- -D warnings
  • Formatting: cargo fmt -- --check

User Personas

  • Framework user: wants ready-to-run examples
  • Library integrator: needs stable APIs and performance
  • Contributor: deep dives into internals and design