Mojentic for Rust
Mojentic is an agentic framework with an asynchronous pub/sub architecture designed to orchestrate LLM-powered agents. This Rust crate focuses on high performance, strong typing, and reliability.
- Fast, async-first event processing
- Agents compose into complex systems
- Clear separation between Core (LLM primitives) and Agents (behaviors)
If you’re familiar with the Python project, this book mirrors its structure and concepts, adapted to Rust idioms.
Quick links
- Getting Started → Get set up
- LLM usage → Broker and gateways
- Core concepts → Layer 1
- Agents → Layer 2
- Observability → Tracer
- API Docs → Rust API reference
Getting Started
This section helps you install and begin using Mojentic in Rust.
Install
Add the dependency (placeholder until published):
[dependencies]
mojentic = { path = "../" }
Minimal Example
use mojentic::{Broker, Message};
fn main() {
let mut broker = Broker::default();
broker.emit(Message::text("hello world"));
}
Next Steps
- Learn the LLM broker → Broker
- Dive into Core concepts → Layer 1 overview
Using LLMs (Broker)
Mojentic’s LLM broker routes completion requests to pluggable gateways (e.g., Ollama). It provides a unified API for chat and text completions.
Quick chat example
use mojentic::llm::{Broker, CompletionConfig, Message};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let broker = Broker::new()?;
let cfg = CompletionConfig::default();
let resp = broker.chat(
cfg,
[
Message::system("You are a helpful assistant"),
Message::user("Say hi in one sentence"),
],
).await?;
println!("{}", resp.text());
Ok(())
}
Gateways
- Ollama: local models for fast iteration.
- HTTP-based gateways: add your own by implementing the Gateway trait.
Structured output
Use typed schemas to parse the model output into structs. See Structured Output.
Configuration
Use CompletionConfig to control generation parameters:
#![allow(unused)]
fn main() {
use mojentic::llm::gateway::{CompletionConfig, ReasoningEffort};
let config = CompletionConfig {
temperature: 0.3,
reasoning_effort: Some(ReasoningEffort::High),
..Default::default()
};
}
Available Parameters
- temperature (f64): Controls randomness. Default: 1.0
- num_ctx (u32): Context window size in tokens. Default: 32768
- max_tokens (u32): Maximum tokens to generate. Default: 16384
- num_predict (i32): Tokens to predict (-1 = no limit). Default: -1
- reasoning_effort (Option<ReasoningEffort>): Extended thinking level (Low, Medium, High, or None). Default: None
Reasoning Effort
Control how much the model thinks before responding:
#![allow(unused)]
fn main() {
use mojentic::llm::gateway::{CompletionConfig, ReasoningEffort};
// Deep reasoning for complex problems
let config = CompletionConfig {
reasoning_effort: Some(ReasoningEffort::High),
temperature: 0.1,
..Default::default()
};
// Quick responses
let config = CompletionConfig {
reasoning_effort: Some(ReasoningEffort::Low),
..Default::default()
};
}
- Ollama: Maps to the think: true parameter for extended thinking. The model's reasoning trace is available in LlmGatewayResponse.thinking.
- OpenAI: Maps to the reasoning_effort API parameter for reasoning models (o1, o3 series). Ignored with a warning for non-reasoning models.
For full details, see Reasoning Effort Control.
Quick Reference
Installation
[dependencies]
mojentic = { path = "./mojentic-rust" }
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
schemars = "0.8" # For structured output
Basic Usage
Import the prelude
#![allow(unused)]
fn main() {
use mojentic::prelude::*;
use std::sync::Arc;
}
Create a broker
#![allow(unused)]
fn main() {
let gateway = OllamaGateway::new();
let broker = LlmBroker::new("qwen3:32b", Arc::new(gateway), None);
}
Simple text generation
#![allow(unused)]
fn main() {
let messages = vec![LlmMessage::user("Hello, how are you?")];
let response = broker.generate(&messages, None, None).await?;
}
Message Types
User message
#![allow(unused)]
fn main() {
LlmMessage::user("Your message here")
}
System message
#![allow(unused)]
fn main() {
LlmMessage::system("You are a helpful assistant")
}
Assistant message
#![allow(unused)]
fn main() {
LlmMessage::assistant("Response from assistant")
}
Message with images
#![allow(unused)]
fn main() {
LlmMessage::user("Describe this image")
.with_images(vec!["path/to/image.jpg".to_string()])
}
Structured Output
Define your type
#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};
use schemars::JsonSchema;
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct Person {
name: String,
age: u32,
occupation: String,
}
}
Generate typed response
#![allow(unused)]
fn main() {
let messages = vec![
LlmMessage::user("Extract person info: John Doe, 30, software engineer")
];
let person: Person = broker.generate_object(&messages, None).await?;
}
Configuration
Default config
#![allow(unused)]
fn main() {
let config = CompletionConfig::default();
// temperature: 1.0
// num_ctx: 32768
// max_tokens: 16384
// num_predict: None
// top_p: None
// top_k: None
// response_format: None
}
Custom config
#![allow(unused)]
fn main() {
let config = CompletionConfig {
temperature: 0.7,
num_ctx: 16384,
max_tokens: 8192,
num_predict: Some(1000),
top_p: Some(0.9),
top_k: Some(40),
response_format: None,
};
let response = broker.generate(&messages, None, Some(config)).await?;
}
JSON response format
#![allow(unused)]
fn main() {
use mojentic::llm::gateway::ResponseFormat;
// Request JSON without schema
let config = CompletionConfig {
temperature: 0.7,
num_ctx: 16384,
max_tokens: 8192,
num_predict: None,
top_p: None,
top_k: None,
response_format: Some(ResponseFormat::JsonObject { schema: None }),
};
// Request JSON with schema
let schema = serde_json::json!({
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "number"}
}
});
let config = CompletionConfig {
temperature: 0.7,
num_ctx: 16384,
max_tokens: 8192,
num_predict: None,
top_p: None,
top_k: None,
response_format: Some(ResponseFormat::JsonObject {
schema: Some(schema)
}),
};
}
Gateway Configuration
Ollama with custom host
#![allow(unused)]
fn main() {
let gateway = OllamaGateway::with_host("http://localhost:11434");
}
Ollama with environment variable
#![allow(unused)]
fn main() {
// Set OLLAMA_HOST environment variable
std::env::set_var("OLLAMA_HOST", "http://custom-host:11434");
let gateway = OllamaGateway::new();
}
Error Handling
#![allow(unused)]
fn main() {
use mojentic::Result;
async fn my_function() -> Result<String> {
let gateway = OllamaGateway::new();
let broker = LlmBroker::new("qwen3:32b", Arc::new(gateway), None);
let messages = vec![LlmMessage::user("Hello")];
match broker.generate(&messages, None, None).await {
Ok(response) => Ok(response),
Err(e) => {
eprintln!("Error: {}", e);
Err(e)
}
}
}
}
Common Patterns
Multi-turn conversation
#![allow(unused)]
fn main() {
let mut messages = vec![
LlmMessage::system("You are a helpful assistant"),
];
// First turn
messages.push(LlmMessage::user("What is Rust?"));
let response1 = broker.generate(&messages, None, None).await?;
messages.push(LlmMessage::assistant(response1));
// Second turn
messages.push(LlmMessage::user("What are its benefits?"));
let response2 = broker.generate(&messages, None, None).await?;
}
Async parallel requests
#![allow(unused)]
fn main() {
use tokio::try_join;
let broker = Arc::new(broker);
let broker1 = broker.clone();
let broker2 = broker.clone();
let (response1, response2) = try_join!(
async move {
broker1.generate(
&[LlmMessage::user("Question 1")],
None,
None
).await
},
async move {
broker2.generate(
&[LlmMessage::user("Question 2")],
None,
None
).await
}
)?;
}
Model management (Ollama)
#![allow(unused)]
fn main() {
let gateway = OllamaGateway::new();
// List available models
let models = gateway.get_available_models().await?;
// Pull a model
gateway.pull_model("qwen3:32b").await?;
// Calculate embeddings
let embedding = gateway.calculate_embeddings(
"Some text to embed",
Some("mxbai-embed-large")
).await?;
}
Logging
Enable logging with tracing:
// In main.rs
use tracing_subscriber;
#[tokio::main]
async fn main() {
// Initialize logging
tracing_subscriber::fmt::init();
// Your code here
}
Set log level via environment variable:
RUST_LOG=debug cargo run
RUST_LOG=info cargo run
RUST_LOG=mojentic=debug cargo run
Use Cases
Tutorial: Building Chatbots
Why Use Chat Sessions?
When working with Large Language Models (LLMs), simple text generation is useful for one-off interactions, but many applications require ongoing conversations where the model remembers previous exchanges. This is where chat sessions come in.
Chat sessions are essential when you need to:
- Build conversational agents or chatbots
- Maintain context across multiple user interactions
- Create applications where the LLM needs to remember previous information
- Develop more natural and coherent conversational experiences
When to Apply This Approach
Use chat sessions when:
- Your application requires multi-turn conversations
- You need the LLM to reference information from earlier in the conversation
- You want to create a more interactive and engaging user experience
The Key Difference: Expanding Context
The fundamental difference between simple text generation and chat sessions is the expanding context. With each new message in a chat session:
- The message is added to the conversation history
- All previous messages (within token limits) are sent to the LLM with each new query
- The LLM can reference and build upon earlier parts of the conversation
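The expanding context can be sketched as a growing message list. `ChatHistory` below is a hypothetical stand-in for the bookkeeping a chat session does internally, not the crate's API:

```rust
// Minimal sketch of how a chat session accumulates context.
struct ChatHistory {
    messages: Vec<(String, String)>, // (role, content)
}

impl ChatHistory {
    fn new(system_prompt: &str) -> Self {
        Self {
            messages: vec![("system".into(), system_prompt.into())],
        }
    }

    // Each turn appends the user message and the assistant reply,
    // so every later request carries the whole conversation.
    fn record_turn(&mut self, user: &str, assistant: &str) {
        self.messages.push(("user".into(), user.into()));
        self.messages.push(("assistant".into(), assistant.into()));
    }

    fn len(&self) -> usize {
        self.messages.len()
    }
}

fn main() {
    let mut history = ChatHistory::new("You are concise");
    history.record_turn("What is Rust?", "A systems language.");
    history.record_turn("Who makes it?", "The Rust project.");
    // 1 system + 2 * (user + assistant) = 5 messages sent with the next query
    println!("messages in context: {}", history.len());
}
```

Each call sends the entire `messages` list, which is why long conversations eventually need trimming to fit the context window.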
Getting Started
Let’s walk through a simple example of building a chatbot using Mojentic’s ChatSession.
Basic Implementation
Here’s the simplest way to create a chat session with Mojentic:
use mojentic::prelude::*;
use std::io::{self, Write};
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
// 1. Create an LLM broker
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
// 2. Initialize a chat session
let mut session = ChatSession::new(broker);
// 3. Simple interactive loop
println!("Chatbot started. Type 'exit' to quit.");
loop {
print!("Query: ");
io::stdout().flush()?;
let mut query = String::new();
io::stdin().read_line(&mut query)?;
let query = query.trim();
if query == "exit" {
break;
}
let response = session.send_message(query).await?;
println!("{}", response);
}
Ok(())
}
This code creates an interactive chatbot that maintains context across multiple exchanges.
Step-by-Step Explanation
1. Initialize the Broker
#![allow(unused)]
fn main() {
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
}
The LlmBroker is the central component that handles communication with the LLM provider (in this case, Ollama).
2. Start the Session
#![allow(unused)]
fn main() {
let mut session = ChatSession::new(broker);
}
ChatSession holds the state of the conversation. By default, it manages the message history and ensures it fits within the model’s context window.
3. Send Messages
#![allow(unused)]
fn main() {
let response = session.send_message(query).await?;
}
When you send a message:
- It’s added to the history.
- The full history is sent to the LLM.
- The LLM’s response is added to the history.
- The response text is returned.
Customizing Your Chat Session
You can customize the session with a system prompt or tools.
System Prompt
The system prompt sets the behavior of the assistant.
#![allow(unused)]
fn main() {
let mut session = ChatSession::new(broker)
.with_system_prompt("You are a helpful AI assistant specialized in Rust programming.");
}
Adding Tools
You can enhance your chatbot by providing tools that the LLM can use.
#![allow(unused)]
fn main() {
use mojentic::tools::DateResolver;
let tools: Vec<Arc<dyn LlmTool>> = vec![Arc::new(DateResolver::new())];
let mut session = ChatSession::new(broker)
.with_tools(tools);
// The LLM can now use the date tool in conversations
let response = session.send_message("What day of the week is July 4th, 2025?").await?;
println!("{}", response);
}
Streaming Responses
For a better user experience with longer responses, you can stream the LLM’s reply as it’s generated:
#![allow(unused)]
fn main() {
use futures::stream::StreamExt;
let mut stream = session.send_stream("Tell me a story");
while let Some(result) = stream.next().await {
print!("{}", result?);
}
println!(); // Newline after streaming completes
}
The send_stream() method works just like send_message() for conversation management: it adds the user message to history before streaming, and records the full assembled response after the stream is consumed. Tools are handled transparently through the broker's recursive streaming.
Summary
In this tutorial, we’ve learned how to:
- Initialize a ChatSession with an LlmBroker.
- Create an interactive loop to chat with the model.
- Customize the session with system prompts and tools.
By leveraging chat sessions, you can create engaging conversational experiences that maintain context across multiple interactions.
Tutorial: Extracting Structured Data
Why Use Structured Output?
LLMs are great at generating text, but sometimes you need data in a machine-readable format like JSON. Structured output allows you to define a schema (using Rust structs with Serde) and force the LLM to return data that matches that schema.
This is essential for:
- Data extraction from unstructured text
- Building API integrations
- Populating databases
- Ensuring reliable downstream processing
Getting Started
Let’s build an example that extracts user information from a natural language description.
1. Define Your Data Schema
We use serde to define the structure we want.
#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};
use schemars::JsonSchema;
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct UserInfo {
name: String,
age: u32,
interests: Vec<String>,
}
}
2. Initialize the Broker
#![allow(unused)]
fn main() {
use mojentic::prelude::*;
use std::sync::Arc;
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
}
3. Generate Structured Data
Use broker.generate_structured to request the data.
#![allow(unused)]
fn main() {
let text = "John Doe is a 30-year-old software engineer who loves hiking and reading.";
let user_info: UserInfo = broker.generate_structured(text).await?;
println!("{:?}", user_info);
// UserInfo {
// name: "John Doe",
// age: 30,
// interests: ["hiking", "reading"]
// }
}
How It Works
- Schema Definition: Mojentic uses schemars to convert your Rust struct into a JSON schema that the LLM can understand.
- Prompt Engineering: The broker automatically appends instructions to the prompt, telling the LLM to output JSON matching the schema.
- Validation: When the response comes back, Mojentic parses the JSON and deserializes it into your struct, performing validation (e.g., ensuring age is a number).
Advanced: Nested Schemas
You can also use nested structs for more complex data.
#![allow(unused)]
fn main() {
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct Address {
street: String,
city: String,
}
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct UserProfile {
name: String,
address: Address,
}
}
Summary
Structured output turns unstructured text into reliable data structures. By defining Rust structs, you can integrate LLM outputs directly into your application’s logic with type safety and validation.
Tutorial: Building Autonomous Agents
What is an Autonomous Agent?
An autonomous agent is a system that can perceive its environment, reason about how to achieve a goal, and take actions (use tools) to accomplish that goal. Unlike a simple chatbot, an agent has a loop of “Thought -> Action -> Observation”.
The Simple Recursive Agent (SRA)
Mojentic provides a SimpleRecursiveAgent pattern. This agent:
- Receives a goal.
- Thinks about the next step.
- Selects a tool to use.
- Executes the tool.
- Observes the result.
- Repeats until the goal is met.
Building an Agent
Let’s build an agent that can answer questions using a web search tool.
1. Setup
You’ll need the WebSearchTool (or any other tool) and a Broker.
use mojentic::prelude::*;
use mojentic::tools::WebSearchTool;
use mojentic::agents::SimpleRecursiveAgent;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
// Initialize broker
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
// Configure tools
let tools: Vec<Arc<dyn LlmTool>> = vec![
Arc::new(WebSearchTool::new(SearchProvider::Tavily)),
];
// ...
Ok(())
}
2. Run the Agent
#![allow(unused)]
fn main() {
let goal = "Find out who won the latest Super Bowl and tell me the score.";
let result = SimpleRecursiveAgent::run(&broker, goal, tools, None).await?;
println!("Final Answer: {}", result);
}
Step-by-Step Execution
When you run this, the agent enters a loop:
- Thought: “I need to search for the latest Super Bowl winner.”
- Action: Calls WebSearchTool with the query "latest Super Bowl winner score".
- Observation: Receives search results (e.g., "Kansas City Chiefs defeated San Francisco 49ers 25-22…").
- Thought: “I have the information. I can now answer the user.”
- Final Answer: “The Kansas City Chiefs won the latest Super Bowl with a score of 25-22.”
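The loop above can be sketched in plain Rust with a stubbed model and a stubbed tool. The names here are illustrative only, not the SimpleRecursiveAgent API:

```rust
// Thought -> Action -> Observation loop with stubbed reasoning and tools.
enum Step {
    Act { tool: &'static str, query: &'static str },
    Finish(String),
}

// Stub "model": decides the next step from what it has observed so far.
fn think(observations: &[String]) -> Step {
    if observations.is_empty() {
        Step::Act {
            tool: "web_search",
            query: "latest Super Bowl winner score",
        }
    } else {
        Step::Finish(format!("Answer based on: {}", observations[0]))
    }
}

// Stub tool execution.
fn run_tool(tool: &str, query: &str) -> String {
    format!("[{} results for '{}']", tool, query)
}

// The iteration budget prevents infinite loops, mirroring max_iterations.
fn run_agent(max_iterations: usize) -> Option<String> {
    let mut observations = Vec::new();
    for _ in 0..max_iterations {
        match think(&observations) {
            Step::Act { tool, query } => observations.push(run_tool(tool, query)),
            Step::Finish(answer) => return Some(answer),
        }
    }
    None // budget exhausted before reaching a final answer
}

fn main() {
    // Reaches Finish on the second iteration.
    println!("{:?}", run_agent(10));
}
```

A real agent replaces `think` with an LLM call and `run_tool` with registered tool implementations, but the control flow is the same.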
Customizing the Agent
You can customize the agent’s behavior by:
- Adding more tools: Give it file access, calculation abilities, etc.
- System Prompt: Adjust its personality or constraints.
- Max Iterations: Limit how many steps it can take to prevent infinite loops.
#![allow(unused)]
fn main() {
let config = AgentConfig {
max_iterations: 10,
..Default::default()
};
SimpleRecursiveAgent::run(&broker, goal, tools, Some(config)).await?;
}
Summary
Autonomous agents allow you to solve complex, multi-step problems. By combining a reasoning loop with a set of tools, you can build systems that can interact with the world to achieve user goals.
Image Analysis
Vision-capable models can analyze images by passing image file paths along with text prompts. The framework automatically handles reading image files and encoding them as Base64 for transmission to the LLM.
Basic Usage
Attach images to a message using the with_images() method:
#![allow(unused)]
fn main() {
use mojentic::prelude::*;
let message = LlmMessage::user("Describe this image")
.with_images(vec!["/path/to/image.jpg".to_string()]);
}
You can attach multiple images to a single message:
#![allow(unused)]
fn main() {
let message = LlmMessage::user("Compare these images")
.with_images(vec![
"/path/to/image1.jpg".to_string(),
"/path/to/image2.jpg".to_string(),
]);
}
Complete Example
use mojentic::prelude::*;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
// Create gateway and broker with a vision-capable model
let gateway = OllamaGateway::new();
let broker = LlmBroker::new("llava:latest", Arc::new(gateway), None);
// Create a message with image
let message = LlmMessage::user("What's in this image?")
.with_images(vec!["path/to/image.jpg".to_string()]);
// Generate response
let response = broker.generate(&[message], None, None, None).await?;
println!("{}", response);
Ok(())
}
Vision-Capable Models
Common vision-capable models available through Ollama:
- llava:latest - General-purpose vision model
- bakllava:latest - BakLLaVA vision model
- qwen3-vl:30b - Qwen3 vision-language model
- gemma3:27b - Gemma 3 with vision support
Pull a model before using:
ollama pull llava:latest
How It Works
When you attach images to a message:
- File Reading: The gateway reads the image file from the specified path
- Base64 Encoding: The image bytes are encoded as Base64 using the base64 crate
- API Transmission: The encoded image is included in the images field of the Ollama API request
- Model Processing: The vision-capable model analyzes both the text prompt and image(s)
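To make the Base64 step concrete, here is a minimal hand-rolled encoder in plain Rust. The real gateway uses the base64 crate; this sketch only illustrates the 3-bytes-to-4-characters transformation:

```rust
// Illustrative Base64 encoder: every 3 input bytes become 4 output
// characters; missing bytes at the end are padded with '='.
const TABLE: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

fn base64_encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() + 2) / 3 * 4);
    for chunk in data.chunks(3) {
        // Pack up to 3 bytes into a 24-bit group.
        let b1 = chunk[0] as u32;
        let b2 = *chunk.get(1).unwrap_or(&0) as u32;
        let b3 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b1 << 16) | (b2 << 8) | b3;
        // Emit four 6-bit indices into the alphabet, padding short chunks.
        out.push(TABLE[(n >> 18) as usize & 63] as char);
        out.push(TABLE[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { TABLE[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { TABLE[n as usize & 63] as char } else { '=' });
    }
    out
}

fn main() {
    println!("{}", base64_encode(b"Man")); // prints "TWFu"
}
```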
Error Handling
Image processing can fail if:
- The image file doesn’t exist or isn’t readable
- The file path is invalid
- The model doesn’t support vision
Always handle errors when working with images:
#![allow(unused)]
fn main() {
match broker.generate(&[message], None, None, None).await {
Ok(response) => println!("Response: {}", response),
Err(e) => eprintln!("Error analyzing image: {}", e),
}
}
Supported Image Formats
The framework reads raw image bytes and passes them to the model. Supported formats depend on the specific model being used. Most vision models support common formats like JPEG and PNG.
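If you want to verify a file's format before sending it to a model, common formats can be recognized by their leading magic bytes. This helper is an illustration only; the framework itself passes raw bytes through without such a check:

```rust
// Identify common image formats by their magic-byte signatures.
fn sniff_format(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some("jpeg") // JPEG streams begin with the SOI marker
    } else if bytes.starts_with(&[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A]) {
        Some("png") // fixed 8-byte PNG signature
    } else if bytes.starts_with(b"GIF87a") || bytes.starts_with(b"GIF89a") {
        Some("gif")
    } else if bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
        Some("webp") // WebP is a RIFF container with a WEBP fourcc
    } else {
        None
    }
}

fn main() {
    let jpeg_header = [0xFF, 0xD8, 0xFF, 0xE0];
    println!("{:?}", sniff_format(&jpeg_header)); // Some("jpeg")
}
```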
See Also
- Examples - See image_analysis.rs for a working example
- LlmMessage API - Full message construction API
- Ollama Gateway - Gateway-specific details
Examples
Example: File Tools
This chapter demonstrates how to build tools for interacting with the file system using the Mojentic framework. The included FileTool and CodingFileTool serve as reference implementations that you can use directly or adapt for your specific needs.
Available Tools
FileTool
The FileTool provides basic file operations:
- read_file: Read content of a file
- write_file: Write content to a file
- list_dir: List directory contents
- file_exists: Check if a file exists
CodingFileTool
The CodingFileTool extends FileTool with features specifically for coding tasks:
- apply_patch: Apply a unified diff patch to a file
- replace_text: Replace specific text in a file
- search_files: Search for patterns in files
Usage
use mojentic::prelude::*;
use mojentic::tools::FileTool;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
// Initialize broker
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
// Register tools
let tools: Vec<Arc<dyn LlmTool>> = vec![
Arc::new(FileTool::default()),
];
// Ask the agent to perform file operations
let messages = vec![
LlmMessage::user("Create a file named 'hello.txt' with the content 'Hello, World!'"),
];
let response = broker.generate(&messages, Some(tools), None).await?;
println!("{}", response);
Ok(())
}
Security
By default, file tools are restricted to the current working directory. You can configure allowed paths to restrict access further.
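One way such a restriction can be enforced is a lexical containment check that resolves the requested path against the allowed root and rejects anything that escapes it. This sketch is illustrative and not the FileTool implementation:

```rust
use std::path::{Component, Path, PathBuf};

// Lexically resolve `requested` against `base` and verify it stays
// inside `base`. No filesystem access is needed, so it also guards
// paths that do not exist yet (e.g. files about to be written).
fn is_allowed(base: &Path, requested: &Path) -> bool {
    let mut resolved = PathBuf::from(base);
    for comp in requested.components() {
        match comp {
            Component::Normal(part) => resolved.push(part),
            Component::CurDir => {}
            Component::ParentDir => {
                // Refuse to climb above the sandbox root.
                if !resolved.pop() || !resolved.starts_with(base) {
                    return false;
                }
            }
            // Absolute paths would bypass the sandbox entirely.
            Component::RootDir | Component::Prefix(_) => return false,
        }
    }
    resolved.starts_with(base)
}

fn main() {
    let base = Path::new("/srv/agent-workdir");
    println!("{}", is_allowed(base, Path::new("notes.txt")));     // true
    println!("{}", is_allowed(base, Path::new("../etc/passwd"))); // false
}
```

Note that symlinks inside the root can still point elsewhere; a production implementation would also canonicalize existing paths before the check.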
Example: Task Management
The TaskManager is an example of how to build stateful tools that allow agents to manage ephemeral tasks. This reference implementation shows how to maintain state across tool calls.
Features
- Create Tasks: Add new tasks to the list
- List Tasks: View all current tasks and their status
- Complete Tasks: Mark tasks as done
- Prioritize: Agents can determine the order of execution
Usage
use mojentic::prelude::*;
use mojentic::tools::TaskManager;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
// Initialize broker
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
// Register the tool
let tools: Vec<Arc<dyn LlmTool>> = vec![
Arc::new(TaskManager::new()),
];
// The agent can now manage its own tasks
let messages = vec![
LlmMessage::system("You are a helpful assistant. Use the task manager to track your work."),
LlmMessage::user("Plan a party for 10 people."),
];
let response = broker.generate(&messages, Some(tools), None).await?;
println!("{}", response);
Ok(())
}
Integration with Agents
The Task Manager is particularly powerful when combined with the IterativeProblemSolver agent, allowing it to maintain state across multiple reasoning steps.
Example: Web Search
The WebSearchTool demonstrates how to integrate external APIs into your agent’s toolset. This example implementation supports multiple providers like Tavily and Serper.
Configuration
The web search tool typically requires an API key for a search provider (e.g., Tavily, Serper).
#![allow(unused)]
fn main() {
use std::env;
env::set_var("TAVILY_API_KEY", "your-api-key");
}
Usage
use mojentic::prelude::*;
use mojentic::tools::WebSearchTool;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
// Initialize broker
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
// Register the tool
let tools: Vec<Arc<dyn LlmTool>> = vec![
Arc::new(WebSearchTool::new(SearchProvider::Tavily)),
];
// Ask a question requiring up-to-date info
let messages = vec![
LlmMessage::user("What is the current stock price of Apple?"),
];
let response = broker.generate(&messages, Some(tools), None).await?;
println!("{}", response);
Ok(())
}
Supported Providers
- Tavily: Optimized for LLM agents
- Serper: Google Search API
- DuckDuckGo: Privacy-focused search (no API key required for basic usage)
Streaming Responses
Streaming allows you to receive LLM responses chunk-by-chunk as they are generated, improving perceived latency for users.
Basic Streaming
Use broker.generate_stream to get a stream of chunks:
use mojentic::prelude::*;
use futures::StreamExt;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
let messages = vec![LlmMessage::user("Tell me a story.")];
let mut stream = broker.generate_stream(&messages, None, None).await?;
while let Some(result) = stream.next().await {
match result {
Ok(chunk) => print!("{}", chunk),
Err(e) => eprintln!("Error: {}", e),
}
}
Ok(())
}
Streaming with Tools
Mojentic supports streaming even when tools are involved. The broker will pause streaming to execute tools and then resume streaming the final response.
#![allow(unused)]
fn main() {
use mojentic::tools::DateResolver;
let tools: Vec<Arc<dyn LlmTool>> = vec![Arc::new(DateResolver::new())];
let mut stream = broker.generate_stream(&messages, Some(tools), None).await?;
// The stream will contain text chunks.
// Tool execution happens transparently in the background.
while let Some(result) = stream.next().await {
// ...
}
}
Embeddings
Embeddings allow you to convert text into vector representations, which are useful for semantic search, clustering, and similarity comparisons.
Setup
You need an embedding model. Ollama supports models like mxbai-embed-large or nomic-embed-text.
use mojentic::prelude::*;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
// Initialize gateway
let gateway = Arc::new(EmbeddingsGateway::new("mxbai-embed-large"));
Ok(())
}
Generating Embeddings
#![allow(unused)]
fn main() {
let text = "The quick brown fox jumps over the lazy dog.";
let vector = gateway.embed(text).await?;
println!("{:?}", &vector[0..5]);
// => [0.123, -0.456, ...]
}
Batch Processing
You can embed multiple texts at once:
#![allow(unused)]
fn main() {
let texts = vec!["Hello", "World"];
let vectors = gateway.embed_batch(texts).await?;
}
Cosine Similarity
Mojentic provides utilities to calculate similarity between vectors:
#![allow(unused)]
fn main() {
use mojentic::utils::math::cosine_similarity;
let similarity = cosine_similarity(&vector1, &vector2);
}
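Under the hood, cosine similarity is dot(a, b) divided by the product of the vector norms. A standalone version, equivalent in spirit to the utility above (assuming equal-length, non-zero vectors):

```rust
// cosine_similarity = dot(a, b) / (|a| * |b|)
// Returns 1.0 for identical directions, 0.0 for orthogonal vectors.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    println!("{}", cosine_similarity(&a, &a)); // identical -> 1
    println!("{}", cosine_similarity(&a, &b)); // orthogonal -> 0
}
```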
Layer 1 - Core
Layer 1 provides primitives for interacting with LLMs and composing messages, tools, and structured outputs.
Components
- Messages: typed containers for role + content.
- Completion Config: tuning parameters (temperature, max tokens, etc.).
- Tools: callable functions surfaced to the LLM.
- Gateway: abstraction over model providers.
Proceed through the subsections for practical usage.
The Basics
A quick orientation to the core types and flows.
- Create messages (system, user, assistant)
- Configure a completion
- Send to a gateway via the broker
use mojentic::llm::{Broker, CompletionConfig, Message};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let broker = Broker::new()?;
let cfg = CompletionConfig::default();
let msgs = vec![
Message::system("You are concise"),
Message::user("Explain Rust lifetimes in one line"),
];
let out = broker.chat(cfg, msgs).await?;
println!("{}", out.text());
Ok(())
}
Message Composers
Helpers for building prompts and role-structured messages.
- System prompts set behavior
- User messages provide tasks
- Assistant messages chain context
Tip: keep messages small and focused; prefer function/tool calls for larger payloads.
Simple Text Generation
The minimal path from prompt to text.
use mojentic::llm::{Broker, CompletionConfig, Message};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let resp = Broker::new()?
.chat(CompletionConfig::default(), [Message::user("Haiku about Rust")])
.await?;
println!("{}", resp.text());
Ok(())
}
Reasoning Effort Control
Some LLM models support extended thinking modes that allow them to reason more deeply about complex problems before responding. Mojentic provides reasoning effort control through the CompletionConfig.
Overview
The reasoning_effort field allows you to control how much computational effort the model should spend on reasoning:
- Low: Quick responses with minimal reasoning
- Medium: Balanced approach (default reasoning depth)
- High: Extended thinking for complex problems
Usage
use mojentic::llm::gateway::{CompletionConfig, ReasoningEffort};
use mojentic::llm::broker::LlmBroker;
use mojentic::llm::models::LlmMessage;
use mojentic::llm::gateways::ollama::OllamaGateway;
use std::sync::Arc;
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let gateway = Arc::new(OllamaGateway::new());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
// Configure with high reasoning effort
let config = CompletionConfig {
reasoning_effort: Some(ReasoningEffort::High),
..Default::default()
};
let messages = vec![
LlmMessage::system("You are a helpful assistant."),
LlmMessage::user("Explain quantum entanglement in simple terms."),
];
let response = broker.generate(&messages, None, Some(config), None).await?;
println!("Response: {}", response);
Ok(())
}
Provider-Specific Behavior
Ollama
When reasoning_effort is set (any value), Ollama receives the think: true parameter, which enables extended thinking mode. The model’s internal reasoning is captured in the thinking field of the response:
#![allow(unused)]
fn main() {
let response = broker.generate(&messages, None, Some(config), None).await?;
// For Ollama, check the gateway response directly for thinking field
// The thinking content shows the model's internal reasoning process
}
OpenAI
For OpenAI reasoning models (like o1), the reasoning_effort parameter maps directly to the API’s reasoning_effort field:
- ReasoningEffort::Low → "low"
- ReasoningEffort::Medium → "medium"
- ReasoningEffort::High → "high"
For non-reasoning OpenAI models, the parameter is ignored and a warning is logged.
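The mapping is simple enough to sketch. The ReasoningEffort enum below mirrors the variants described above, but to_openai_value is an illustrative name, not the crate's API:

```rust
// Illustrative mapping from the effort enum to the OpenAI API string.
#[derive(Clone, Copy, Debug)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
}

fn to_openai_value(effort: ReasoningEffort) -> &'static str {
    match effort {
        ReasoningEffort::Low => "low",
        ReasoningEffort::Medium => "medium",
        ReasoningEffort::High => "high",
    }
}

fn main() {
    println!("{}", to_openai_value(ReasoningEffort::High)); // prints "high"
}
```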
When to Use Reasoning Effort
Use higher reasoning effort for:
- Complex mathematical problems
- Multi-step logical reasoning
- Code generation requiring architectural decisions
- Nuanced ethical or philosophical questions
- Tasks requiring careful consideration of trade-offs
Use lower reasoning effort for:
- Simple factual questions
- Quick classifications
- Routine formatting tasks
- When response speed is critical
Performance Considerations
Higher reasoning effort typically means:
- Longer response times
- Increased token usage
- More thoughtful, accurate responses
- Better handling of edge cases
The trade-off between speed and quality should guide your choice of reasoning effort level.
Building Tools
Creating Custom Tools
Tools extend LLM capabilities by providing functions they can call. All tools must implement the LlmTool trait.
Basic Tool Structure
#![allow(unused)]
fn main() {
use mojentic::prelude::*;
use serde_json::{json, Value};
use std::collections::HashMap;
struct CalculatorTool;
impl LlmTool for CalculatorTool {
fn run(&self, args: &HashMap<String, Value>) -> mojentic::Result<Value> {
let operation = args.get("operation")
.and_then(|v| v.as_str())
.ok_or_else(|| mojentic::MojenticError::ToolError(
"Missing operation parameter".to_string()
))?;
let a = args.get("a")
.and_then(|v| v.as_f64())
.ok_or_else(|| mojentic::MojenticError::ToolError(
"Missing 'a' parameter".to_string()
))?;
let b = args.get("b")
.and_then(|v| v.as_f64())
.ok_or_else(|| mojentic::MojenticError::ToolError(
"Missing 'b' parameter".to_string()
))?;
let result = match operation {
"add" => a + b,
"subtract" => a - b,
"multiply" => a * b,
"divide" => {
if b == 0.0 {
return Err(mojentic::MojenticError::ToolError(
"Division by zero".to_string()
));
}
a / b
}
_ => return Err(mojentic::MojenticError::ToolError(
format!("Unknown operation: {}", operation)
)),
};
Ok(json!({
"result": result,
"operation": operation,
"a": a,
"b": b
}))
}
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor {
r#type: "function".to_string(),
function: FunctionDescriptor {
name: "calculator".to_string(),
description: "Perform basic arithmetic operations".to_string(),
parameters: json!({
"type": "object",
"properties": {
"operation": {
"type": "string",
"enum": ["add", "subtract", "multiply", "divide"],
"description": "The operation to perform"
},
"a": {
"type": "number",
"description": "First operand"
},
"b": {
"type": "number",
"description": "Second operand"
}
},
"required": ["operation", "a", "b"]
}),
},
}
}
}
}
Using Custom Tools
use mojentic::llm::{LlmBroker, LlmMessage, LlmTool};
use mojentic::llm::gateways::OllamaGateway;
use std::sync::Arc;
#[tokio::main]
async fn main() -> mojentic::Result<()> {
let gateway = OllamaGateway::new();
let broker = LlmBroker::new("qwen3:32b", Arc::new(gateway));
let tools: Vec<Box<dyn LlmTool>> = vec![
Box::new(CalculatorTool),
];
let messages = vec![
LlmMessage::user("What is 42 multiplied by 17?")
];
let response = broker.generate(&messages, Some(&tools), None).await?;
println!("{}", response);
Ok(())
}
Testing Custom Tools
Create tests for your tools:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_calculator_add() {
let tool = CalculatorTool;
let mut args = HashMap::new();
args.insert("operation".to_string(), json!("add"));
args.insert("a".to_string(), json!(5.0));
args.insert("b".to_string(), json!(3.0));
let result = tool.run(&args).unwrap();
assert_eq!(result["result"], 8.0);
}
#[test]
fn test_calculator_divide_by_zero() {
let tool = CalculatorTool;
let mut args = HashMap::new();
args.insert("operation".to_string(), json!("divide"));
args.insert("a".to_string(), json!(10.0));
args.insert("b".to_string(), json!(0.0));
assert!(tool.run(&args).is_err());
}
}
}
Best Practices
- Error Handling: Always use proper error types, never panic in library code
- Documentation: Add rustdoc comments to your tool implementations
- Testing: Write comprehensive tests for your tools
- Type Safety: Validate all input parameters
- JSON Schema: Provide clear, detailed parameter descriptions
Tool Usage
Let models invoke your tools to retrieve data or perform actions.
Patterns:
- Stateless RPC-like tools
- Stateful tools with access to context
- Safety: validate inputs, timeouts, retries
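The retry part of the safety pattern can be sketched with a small std-only helper. This is a hypothetical illustration, not part of the mojentic API; the helper name and signature are invented for this example.

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical retry helper for flaky tool calls (not a mojentic API).
// Retries the operation up to `max_attempts` times, sleeping between tries.
fn with_retries<T, E>(
    max_attempts: u32,
    backoff: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => sleep(backoff), // back off before the next attempt
        }
    }
}

fn main() {
    let mut calls = 0;
    let result: Result<u32, &str> = with_retries(3, Duration::from_millis(1), || {
        calls += 1;
        if calls < 3 { Err("transient failure") } else { Ok(42) }
    });
    assert_eq!(result, Ok(42));
    assert_eq!(calls, 3);
}
```

Input validation and timeouts follow the same shape: check preconditions before calling out, and bound each attempt so a stuck tool cannot stall the agent loop.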
Agent Delegation with ToolWrapper
The ToolWrapper enables agent delegation patterns by wrapping an agent (broker + tools + behavior) as a tool that can be used by other agents. This allows you to build hierarchical agent systems where coordinator agents can delegate specialized tasks to expert agents.
Overview
ToolWrapper wraps three core components:
- Broker: The LLM interface for the agent
- Tools: The tools available to the agent
- Behavior: The system message defining the agent’s personality and capabilities
The wrapped agent appears as a standard tool with a single input parameter, making it easy for other agents to delegate work.
Basic Usage
#![allow(unused)]
fn main() {
use mojentic::llm::{LlmBroker, LlmTool, ToolWrapper};
use mojentic::llm::gateways::OllamaGateway;
use std::sync::Arc;
// Create a specialist agent
let gateway = Arc::new(OllamaGateway::default());
let specialist_broker = Arc::new(LlmBroker::new("qwen2.5:7b", gateway.clone()));
let specialist_tools: Vec<Box<dyn LlmTool>> = vec![
// Add specialist's tools here
];
let specialist_behaviour =
"You are a specialist in temporal reasoning and date calculations.";
// Wrap the specialist as a tool
let specialist_tool = ToolWrapper::new(
specialist_broker,
specialist_tools,
specialist_behaviour,
"temporal_specialist",
"A specialist that handles date and time-related queries."
);
// Use the specialist in a coordinator agent
let coordinator_broker = LlmBroker::new("qwen2.5:14b", gateway);
let coordinator_tools: Vec<Box<dyn LlmTool>> = vec![
Box::new(specialist_tool)
];
// The coordinator can now delegate to the specialist
}
How It Works
- Tool Descriptor: The ToolWrapper generates a tool descriptor with:
  - Function name (e.g., "temporal_specialist")
  - Description (what the agent does)
  - A single input parameter for instructions
- Execution Flow: When called, the wrapper:
  - Extracts the input parameter
  - Creates initial messages with the agent's behavior (system message)
  - Appends the input as a user message
  - Calls the agent's broker with its tools
  - Returns the agent's response
- Delegation Pattern: The coordinator agent sees specialist agents as tools and decides when to delegate based on the task requirements.
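The message-building part of the execution flow can be sketched in isolation. This is purely illustrative: the (role, content) tuples stand in for mojentic's real message type.

```rust
// Minimal sketch of how a wrapped agent's call turns into messages:
// the behaviour string becomes the system message, the tool's `input`
// argument becomes the user message. Illustrative only.
fn delegation_messages(behaviour: &str, input: &str) -> Vec<(String, String)> {
    vec![
        ("system".to_string(), behaviour.to_string()),
        ("user".to_string(), input.to_string()),
    ]
}

fn main() {
    let msgs = delegation_messages(
        "You are a specialist in temporal reasoning and date calculations.",
        "What date is next Friday?",
    );
    assert_eq!(msgs.len(), 2);
    assert_eq!(msgs[0].0, "system");
    assert_eq!(msgs[1].0, "user");
}
```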
Example: Multi-Agent System
See examples/broker_as_tool.rs for a complete example demonstrating:
- Creating specialist agents with specific tools and behaviors
- Wrapping specialists as tools
- Building a coordinator that delegates appropriately
- Testing different query types
cargo run --example broker_as_tool
Design Considerations
Agent Ownership
ToolWrapper uses Arc<LlmBroker> to handle shared ownership of the broker. This allows multiple references to the same broker if needed.
Async Execution
The run method is synchronous (required by the LlmTool trait) but internally handles async operations using tokio::task::block_in_place. This requires tests to use the multi-threaded runtime:
#![allow(unused)]
fn main() {
#[tokio::test(flavor = "multi_thread")]
async fn test_tool_wrapper() {
// Test code here
}
}
Tool Isolation
Each wrapped agent maintains its own:
- Tool set
- Behavior/personality
- Conversation context (per invocation)
This ensures clean separation between agents and prevents context bleeding.
Common Patterns
Specialist Agent Pattern
Create focused agents with specific expertise:
#![allow(unused)]
fn main() {
// Data analysis specialist
let data_analyst = ToolWrapper::new(
analyst_broker,
vec![Box::new(DataQueryTool), Box::new(VisualizationTool)],
"You are a data analyst specializing in statistical analysis.",
"data_analyst",
"Analyzes data and provides statistical insights."
);
// Writing specialist
let writer = ToolWrapper::new(
writer_broker,
vec![Box::new(GrammarCheckTool), Box::new(StyleGuideTool)],
"You are a professional writer and editor.",
"writer",
"Improves and edits written content."
);
}
Coordinator Pattern
Build a coordinator that orchestrates multiple specialists:
#![allow(unused)]
fn main() {
let coordinator = LlmBroker::new("qwen2.5:32b", gateway);
let tools: Vec<Box<dyn LlmTool>> = vec![
Box::new(data_analyst),
Box::new(writer),
Box::new(researcher),
];
// Coordinator decides which specialist to use based on the task
}
Hierarchical Agents
Create multi-level hierarchies:
#![allow(unused)]
fn main() {
// Level 1: Base specialists
let math_specialist = ToolWrapper::new(...);
let physics_specialist = ToolWrapper::new(...);
// Level 2: Domain coordinator
let science_coordinator = ToolWrapper::new(
coordinator_broker,
vec![Box::new(math_specialist), Box::new(physics_specialist)],
"You coordinate math and physics specialists.",
"science_coordinator",
"Handles scientific queries."
);
// Level 3: Top-level coordinator
let main_coordinator = LlmBroker::new("qwen2.5:32b", gateway);
let main_tools: Vec<Box<dyn LlmTool>> = vec![
Box::new(science_coordinator),
// Other domain coordinators...
];
}
Best Practices
- Clear Specialization: Give each agent a well-defined area of expertise
- Descriptive Names: Use clear, descriptive names for tools (e.g., “temporal_specialist”, not “agent1”)
- Comprehensive Descriptions: Write detailed descriptions so coordinators understand when to delegate
- Model Selection: Use appropriate model sizes for each role:
- Specialists: Smaller, faster models (7B-14B)
- Coordinators: Larger, more capable models (32B+)
- Tool Composition: Specialists should have tools relevant to their domain
- Testing: Test both individual specialists and the full delegation chain
Limitations
- Synchronous Interface: The LlmTool trait requires a synchronous run method, though async operations happen internally
- No Streaming: Tool calls don't support streaming responses (a limitation of the current tool architecture)
- Context: Each tool invocation is independent; there is no automatic context sharing between calls
- Error Handling: Tool errors propagate up to the coordinator; implement appropriate error handling
See Also
- Tool Usage - General tool usage patterns
- Building Tools - Creating custom tools
- Chat Sessions with Tools - Using tools in chat sessions
Chat Sessions with Tools
Combine conversational context with tool calling for grounded, actionable responses. ChatSession seamlessly integrates with Mojentic’s tool system to enable function calling within conversations.
Overview
When tools are provided to a ChatSession, the LLM can call them during the conversation. The broker handles:
- Detecting tool call requests
- Executing the appropriate tools
- Adding tool results to the conversation
- Continuing the conversation with the LLM
This happens transparently - you just call send() as normal.
Basic Tool Integration
Add tools when building the session:
use mojentic::llm::{ChatSession, LlmBroker};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let gateway = Arc::new(OllamaGateway::default());
let broker = LlmBroker::new("qwen3:32b", gateway);
// Create tools
let tools: Vec<Box<dyn mojentic::llm::LlmTool>> = vec![
Box::new(SimpleDateTool),
];
// Build session with tools
let mut session = ChatSession::builder(broker)
.system_prompt(
"You are a helpful assistant. When users ask about dates, \
use the resolve_date tool to convert relative dates to absolute dates."
)
.tools(tools)
.build();
// The LLM will automatically use the tool when needed
let response = session.send("What is tomorrow's date?").await?;
println!("{}", response);
Ok(())
}
How It Works
- User sends a message: session.send("What is tomorrow's date?")
- LLM recognizes the need for a tool: Generates a tool call request
- Broker executes the tool: Calls SimpleDateTool with appropriate arguments
- Tool result added to history: As a tool response message
- LLM generates the final response: Using the tool's output
- Response returned to the user: "Tomorrow's date is 2025-11-14"
All of this happens in a single send() call.
Multiple Tools
Provide multiple tools for different capabilities:
#![allow(unused)]
fn main() {
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
// Import other tools...
let tools: Vec<Box<dyn mojentic::llm::LlmTool>> = vec![
Box::new(SimpleDateTool),
Box::new(CalculatorTool),
Box::new(WeatherTool),
];
let mut session = ChatSession::builder(broker)
.system_prompt(
"You are a helpful assistant with access to date resolution, \
calculation, and weather information tools. Use them when appropriate."
)
.tools(tools)
.build();
}
The LLM will choose which tool to use based on the user’s question.
Tool Call History
Tool calls and results are part of the conversation history:
#![allow(unused)]
fn main() {
// After a tool call
for msg in session.messages() {
match msg.role() {
MessageRole::User => println!("User: {}", msg.content().unwrap_or("")),
MessageRole::Assistant => {
if let Some(tool_calls) = &msg.message.tool_calls {
println!("Assistant called tool: {:?}", tool_calls);
} else {
println!("Assistant: {}", msg.content().unwrap_or(""));
}
}
MessageRole::Tool => println!("Tool result: {}", msg.content().unwrap_or("")),
_ => {}
}
}
}
Context Management with Tools
Tool calls and results consume tokens. The session manages this automatically:
#![allow(unused)]
fn main() {
let mut session = ChatSession::builder(broker)
.max_context(8192)
.tools(tools)
.build();
// Even with tool calls, context trimming works
session.send("What's tomorrow?").await?; // May trigger tool call
session.send("And the day after?").await?; // May trigger another tool call
// Old messages (including old tool calls) are trimmed when needed
println!("Total tokens: {}", session.total_tokens());
}
Creating Custom Tools
Build your own tools for specific needs:
#![allow(unused)]
fn main() {
use mojentic::llm::tools::{LlmTool, ToolDescriptor, FunctionDescriptor};
use serde_json::{json, Value};
use std::collections::HashMap;
struct WeatherTool;
impl LlmTool for WeatherTool {
fn run(&self, args: &HashMap<String, Value>) -> mojentic::error::Result<Value> {
let city = args.get("city")
.and_then(|v| v.as_str())
.ok_or_else(|| mojentic::error::MojenticError::ToolError(
"Missing city argument".to_string()
))?;
// In reality, you'd call a weather API here
Ok(json!({
"city": city,
"temperature": "72°F",
"conditions": "Sunny"
}))
}
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor {
r#type: "function".to_string(),
function: FunctionDescriptor {
name: "get_weather".to_string(),
description: "Get current weather for a city".to_string(),
parameters: json!({
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city name"
}
},
"required": ["city"]
}),
},
}
}
}
}
Then use it:
#![allow(unused)]
fn main() {
let tools: Vec<Box<dyn mojentic::llm::LlmTool>> = vec![
Box::new(WeatherTool),
];
let mut session = ChatSession::builder(broker)
.system_prompt("You can check weather using the get_weather tool.")
.tools(tools)
.build();
let response = session.send("What's the weather in San Francisco?").await?;
}
Interactive Example
Run the complete example:
cargo run --example chat_session_with_tool
This demonstrates:
- Tool integration in conversation
- Automatic tool invocation
- Natural multi-turn dialogue with tool results
Best Practices
- Clear tool descriptions: Help the LLM understand when to use each tool
- Descriptive system prompts: Explain what tools are available
- Error handling in tools: Return meaningful errors for invalid inputs
- Tool result formatting: Return structured JSON that’s easy for the LLM to interpret
- Monitor token usage: Tools can add significant tokens to the conversation
Debugging Tool Calls
Enable tracing to see tool execution:
#![allow(unused)]
fn main() {
// Initialize tracing
tracing_subscriber::fmt::init();
// Tool calls will be logged
let response = session.send("What's tomorrow?").await?;
}
You’ll see logs like:
INFO Tool calls requested: 1
INFO Executing tool: resolve_date
Limitations
- Tool calls count toward the context window
- Very long tool results may need truncation
- Not all models support tool calling equally well
- Tool execution is synchronous (async tools not yet supported)
Next Steps
- Learn about Building Tools
- Explore Tool Usage patterns
- Read about Simple Text Generation
OpenAI Model Registry
Overview
The OpenAI Model Registry is a centralized system for managing model-specific configurations, capabilities, and parameter requirements for OpenAI models. It provides a compile-time type-safe registry that informs the gateway about which parameters and features each model supports.
This registry is essential for:
- Determining which models support tools, streaming, or vision capabilities
- Selecting the correct token limit parameter (max_tokens vs max_completion_tokens)
- Understanding temperature support for reasoning models
- Tracking which API endpoints each model supports
- Providing sensible defaults when working with unknown models
Model Types
The registry categorizes models into four distinct types:
Reasoning Models
Models that use internal reasoning steps before generating responses. These models use max_completion_tokens instead of max_tokens.
#![allow(unused)]
fn main() {
ModelType::Reasoning
}
Examples: o1, o1-mini, o3, o3-mini, gpt-5, gpt-5.1
Chat Models
Standard chat completion models that use max_tokens for output control.
#![allow(unused)]
fn main() {
ModelType::Chat
}
Examples: gpt-4o, gpt-4-turbo, gpt-3.5-turbo
Embedding Models
Models designed for generating text embeddings.
#![allow(unused)]
fn main() {
ModelType::Embedding
}
Examples: text-embedding-3-small, text-embedding-3-large
Moderation Models
Models for content moderation and safety classification.
#![allow(unused)]
fn main() {
ModelType::Moderation
}
Examples: omni-moderation-latest, text-moderation-latest
Model Capabilities
The ModelCapabilities struct defines what each model can do. Here are all the fields:
Core Capabilities
- model_type: ModelType - The type of model (Reasoning, Chat, Embedding, or Moderation)
- supports_tools: bool - Whether the model supports function/tool calling
- supports_streaming: bool - Whether the model supports streaming responses
- supports_vision: bool - Whether the model can process image inputs
Token Limits
- max_context_tokens: Option<u32> - Maximum input context window size in tokens
- max_output_tokens: Option<u32> - Maximum number of output tokens the model can generate
Temperature Support
- supported_temperatures: Option<Vec<f32>> - Temperature constraints:
  - None - Accepts any temperature value (most chat models)
  - Some(vec![]) - Does not accept a temperature parameter (not exposed to users)
  - Some(vec![1.0]) - Only accepts temperature 1.0 (reasoning models like o1, o3)
API Endpoint Support
- supports_chat_api: bool - Supports the /v1/chat/completions endpoint
- supports_completions_api: bool - Supports the /v1/completions endpoint
- supports_responses_api: bool - Supports the /v1/responses endpoint
Note: The OpenAI gateway currently only calls the Chat API. These flags are informational and available for future gateway enhancements.
API Endpoint Categories
OpenAI models support different API endpoints based on their design and capabilities:
Chat API (Most Common)
The /v1/chat/completions endpoint is supported by most modern models:
- All GPT-4 variants (except legacy completions models)
- Reasoning models (o1, o3, o4-mini)
- Base GPT-5 models
Completions API (Legacy)
The /v1/completions endpoint is supported by older models:
- babbage-002
- davinci-002
- gpt-3.5-turbo-instruct
These models do not support the chat API format.
Both APIs
Some models support both chat and completions:
- gpt-4o-mini
- gpt-4.1-nano
- gpt-5.1
Responses API (Newer Models)
The /v1/responses endpoint is supported by specialized models:
- gpt-5-pro
- codex-mini-latest
Using the Global Registry
The registry provides a global instance for convenience:
#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::{
get_model_registry, ModelType
};
// Access the global registry
let registry = get_model_registry();
// Look up model capabilities
let caps = registry.get_model_capabilities("gpt-4o");
assert_eq!(caps.model_type, ModelType::Chat);
assert!(caps.supports_tools);
assert!(caps.supports_streaming);
assert!(caps.supports_vision);
}
Token Limit Parameters
Different model types use different parameter names for controlling output length:
#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;
let registry = get_model_registry();
// Chat models use "max_tokens"
let gpt4_caps = registry.get_model_capabilities("gpt-4o");
assert_eq!(gpt4_caps.get_token_limit_param(), "max_tokens");
// Reasoning models use "max_completion_tokens"
let o1_caps = registry.get_model_capabilities("o1");
assert_eq!(o1_caps.get_token_limit_param(), "max_completion_tokens");
}
Temperature Support
Reasoning models have restricted temperature support:
#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;
let registry = get_model_registry();
// Chat models support arbitrary temperatures
let gpt4_caps = registry.get_model_capabilities("gpt-4o");
assert!(gpt4_caps.supports_temperature(0.7));
assert!(gpt4_caps.supports_temperature(1.5));
// Reasoning models only support temperature 1.0
let o1_caps = registry.get_model_capabilities("o1");
assert!(o1_caps.supports_temperature(1.0));
assert!(!o1_caps.supports_temperature(0.7));
}
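The semantics of the temperature check can be sketched with a standalone function. This mirrors the behavior described above, but is not the crate's actual ModelCapabilities implementation.

```rust
// Standalone sketch of the temperature check: None means any value is
// accepted; Some(list) means only listed values (an empty list = none).
fn supports_temperature(supported: &Option<Vec<f32>>, t: f32) -> bool {
    match supported {
        None => true,
        Some(allowed) => allowed.contains(&t),
    }
}

fn main() {
    let chat_model: Option<Vec<f32>> = None;   // e.g. gpt-4o
    let reasoning_model = Some(vec![1.0_f32]); // e.g. o1
    assert!(supports_temperature(&chat_model, 0.7));
    assert!(supports_temperature(&reasoning_model, 1.0));
    assert!(!supports_temperature(&reasoning_model, 0.7));
}
```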
Checking Endpoint Support
You can check which API endpoints a model supports:
#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;
let registry = get_model_registry();
// Standard chat model - chat API only
let caps = registry.get_model_capabilities("gpt-4o");
assert!(caps.supports_chat_api);
assert!(!caps.supports_completions_api);
assert!(!caps.supports_responses_api);
// Dual-endpoint model
let mini_caps = registry.get_model_capabilities("gpt-4o-mini");
assert!(mini_caps.supports_chat_api);
assert!(mini_caps.supports_completions_api);
assert!(!mini_caps.supports_responses_api);
// Completions-only model
let instruct_caps = registry.get_model_capabilities("gpt-3.5-turbo-instruct");
assert!(!instruct_caps.supports_chat_api);
assert!(instruct_caps.supports_completions_api);
assert!(!instruct_caps.supports_responses_api);
// Responses-only model
let pro_caps = registry.get_model_capabilities("gpt-5-pro");
assert!(!pro_caps.supports_chat_api);
assert!(!pro_caps.supports_completions_api);
assert!(pro_caps.supports_responses_api);
}
Creating Custom Registries
While the global registry is convenient, you can create custom registries for testing or specialized configurations:
#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::{
OpenAIModelRegistry, ModelCapabilities, ModelType
};
// Create a new empty registry
let mut custom_registry = OpenAIModelRegistry::new();
// Register a custom model
custom_registry.register_model("custom-model", ModelCapabilities {
model_type: ModelType::Chat,
supports_tools: true,
supports_streaming: true,
supports_vision: false,
max_context_tokens: Some(8192),
max_output_tokens: Some(4096),
supported_temperatures: None, // Accepts any temperature
supports_chat_api: true,
supports_completions_api: false,
supports_responses_api: false,
});
// Use the custom registry
let caps = custom_registry.get_model_capabilities("custom-model");
assert_eq!(caps.model_type, ModelType::Chat);
assert!(caps.supports_tools);
}
Pattern Matching for Unknown Models
When a model is not explicitly registered, the registry infers capabilities from the model name:
#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;
let registry = get_model_registry();
// Unknown model with "gpt-4" prefix inherits GPT-4 capabilities
let unknown_caps = registry.get_model_capabilities("gpt-4-future-model");
assert!(unknown_caps.supports_tools);
assert!(unknown_caps.supports_streaming);
// Unknown model with "o" prefix is treated as a reasoning model
let unknown_o_caps = registry.get_model_capabilities("o5-experimental");
assert_eq!(unknown_o_caps.get_token_limit_param(), "max_completion_tokens");
assert!(unknown_o_caps.supports_temperature(1.0));
assert!(!unknown_o_caps.supports_temperature(0.7));
}
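The "o" prefix heuristic can be sketched as follows. This is an illustrative approximation of the inference described above; the registry's real rules may cover more prefixes and fields.

```rust
// Illustrative sketch of prefix-based inference for unknown models:
// names starting with "o" are treated as reasoning models and use
// max_completion_tokens; everything else falls back to max_tokens.
fn infer_token_limit_param(model: &str) -> &'static str {
    if model.starts_with('o') {
        "max_completion_tokens"
    } else {
        "max_tokens" // chat-model default, e.g. unknown "gpt-4" variants
    }
}

fn main() {
    assert_eq!(infer_token_limit_param("o5-experimental"), "max_completion_tokens");
    assert_eq!(infer_token_limit_param("gpt-4-future-model"), "max_tokens");
}
```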
Common Model Categories
Chat-Only Models
Most GPT-4, reasoning, and base GPT-5 models:
- gpt-4o, gpt-4-turbo, gpt-4
- o1, o1-mini, o3, o3-mini, o4-mini
- gpt-5, gpt-5.1
Completions-Only Models
Legacy models that don’t support chat format:
- babbage-002
- davinci-002
- gpt-3.5-turbo-instruct
Dual-Endpoint Models
Models supporting both chat and completions:
- gpt-4o-mini
- gpt-4.1-nano
- gpt-5.1
Responses-Only Models
Specialized newer endpoint models:
- gpt-5-pro
- codex-mini-latest
Summary
The OpenAI Model Registry provides:
- Type-safe model capability definitions
- Automatic parameter selection based on model type
- Temperature validation for reasoning models
- API endpoint support tracking
- Pattern matching for unknown models
- A global registry for convenience
- Custom registries for specialized use cases
By consulting the registry, the OpenAI gateway ensures it uses the correct parameters and features for each model, providing a robust and maintainable integration layer.
Layer 2 - Agents
Agents are the smallest unit of computation in Mojentic. They process events and emit new events, composing into larger systems.
This section mirrors the Python project’s agent layer but uses Rust’s async and type system.
Asynchronous Capabilities
Mojentic (Rust) embraces async end-to-end.
- Concurrency for overlapping LLM calls
- Channels/streams for event propagation
- Timeouts and cancellation for robustness
Iterative Problem Solver
The IterativeProblemSolver is an agent that iteratively attempts to solve problems using available tools. It employs a chat-based approach and continues working until it succeeds, fails explicitly, or reaches the maximum number of iterations.
Overview
The Iterative Problem Solver follows a simple but powerful pattern:
- Plan - Analyze the problem and identify what needs to be done
- Act - Execute actions using available tools
- Observe - Review the results
- Refine - Adjust the approach based on observations
- Terminate - Stop when the goal is met or the iteration budget is exhausted
Key Features
- Tool Integration: Seamlessly integrates with any LlmTool implementation
- Automatic Termination: Stops when the LLM responds with "DONE" or "FAIL"
- Iteration Control: Configurable maximum iteration count prevents infinite loops
- Chat-Based Context: Maintains conversation history for context-aware problem solving
- Summary Generation: Provides a clean summary of the final result
Usage
Basic Example
use mojentic::agents::IterativeProblemSolver;
use mojentic::llm::{LlmBroker, LlmTool};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Initialize the LLM broker
let gateway = Arc::new(OllamaGateway::default());
let broker = LlmBroker::new("qwen3:32b", gateway, None);
// Define available tools
let tools: Vec<Box<dyn LlmTool>> = vec![
Box::new(SimpleDateTool),
];
// Create the solver
let mut solver = IterativeProblemSolver::builder(broker)
.tools(tools)
.max_iterations(5)
.build();
// Solve a problem
let result = solver.solve("What's the date next Friday?").await?;
println!("Result: {}", result);
Ok(())
}
Custom System Prompt
You can customize the system prompt to guide the solver’s behavior:
#![allow(unused)]
fn main() {
let mut solver = IterativeProblemSolver::builder(broker)
.tools(tools)
.max_iterations(10)
.system_prompt(
"You are a specialized data analysis assistant. \
Break down complex queries into clear steps and use tools methodically."
)
.build();
}
With Multiple Tools
The solver works best when given appropriate tools for the problem domain:
#![allow(unused)]
fn main() {
use mojentic::llm::tools::ask_user_tool::AskUserTool;
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
let tools: Vec<Box<dyn LlmTool>> = vec![
Box::new(AskUserTool::new()),
Box::new(SimpleDateTool),
];
let mut solver = IterativeProblemSolver::builder(broker)
.tools(tools)
.max_iterations(5)
.build();
}
How It Works
Step-by-Step Process
- Initialization: The solver creates a ChatSession with the provided system prompt and tools
- Iteration Loop: For each iteration, the solver:
- Sends the problem description with instructions to use tools
- Checks the response for “DONE” (success) or “FAIL” (failure)
- Continues if neither keyword is present and iterations remain
- Summary: After termination, requests a concise summary of the result
- Return: Returns the summary as the final result
Termination Conditions
The solver terminates when one of these conditions is met:
- Success: The LLM’s response contains “DONE” (case-insensitive)
- Failure: The LLM’s response contains “FAIL” (case-insensitive)
- Exhaustion: The maximum number of iterations is reached
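The keyword check can be sketched as a standalone function. This mirrors the case-insensitive "DONE"/"FAIL" rule described above; the check order when both keywords appear (success wins here) is an assumption of this sketch, not documented behavior.

```rust
// Standalone sketch of the solver's termination check.
#[derive(Debug, PartialEq)]
enum Outcome {
    Success,
    Failure,
    Continue,
}

fn classify_response(response: &str) -> Outcome {
    let upper = response.to_uppercase();
    if upper.contains("DONE") {
        Outcome::Success
    } else if upper.contains("FAIL") {
        Outcome::Failure
    } else {
        Outcome::Continue
    }
}

fn main() {
    assert_eq!(classify_response("All steps complete. done."), Outcome::Success);
    assert_eq!(classify_response("I must fail here"), Outcome::Failure);
    assert_eq!(classify_response("Still working on step 2"), Outcome::Continue);
}
```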
Logging
The solver uses the tracing crate to log important events:
- info: Logged when a task completes successfully or fails
- warn: Logged when the maximum number of iterations is reached
Configuration Options
Builder Pattern
The IterativeProblemSolver uses the builder pattern for configuration:
#![allow(unused)]
fn main() {
IterativeProblemSolver::builder(broker)
.tools(tools) // Set available tools
.max_iterations(10) // Set max iterations (default: 3)
.system_prompt("...") // Set custom system prompt
.build()
}
Default Values
- max_iterations: 3
- system_prompt:
“You are a problem-solving assistant that can solve complex problems step by step. You analyze problems, break them down into smaller parts, and solve them systematically. If you cannot solve a problem completely in one step, you make progress and identify what to do next.”
Best Practices
1. Choose Appropriate Tools
Select tools that are relevant to your problem domain:
#![allow(unused)]
fn main() {
// For date/time problems
let tools = vec![Box::new(SimpleDateTool)];
// For user interaction
let tools = vec![Box::new(AskUserTool::new())];
// For data analysis
let tools = vec![
Box::new(CalculatorTool),
Box::new(DataRetrievalTool),
];
}
2. Set Reasonable Iteration Limits
Balance between giving the solver enough attempts and preventing excessive computation:
- Simple queries: 3-5 iterations
- Complex analyses: 10-15 iterations
- Open-ended exploration: 20+ iterations
3. Provide Context in Problem Description
The more context you provide, the better the solver can work:
#![allow(unused)]
fn main() {
// Less effective
solver.solve("Analyze the data").await?;
// More effective
solver.solve(
"Analyze the sales data from Q1 2024. \
Focus on trends in the technology sector. \
Provide insights on growth patterns."
).await?;
}
4. Monitor Logs
Enable tracing to understand solver behavior:
#![allow(unused)]
fn main() {
tracing_subscriber::fmt()
.with_max_level(tracing::Level::INFO)
.init();
}
Common Patterns
Retry Logic
For operations that might fail transiently:
#![allow(unused)]
fn main() {
let mut attempts = 0;
let max_attempts = 3;
let result = loop {
attempts += 1;
match solver.solve(problem).await {
Ok(result) if !result.contains("FAIL") => break result,
Ok(_) if attempts < max_attempts => continue,
Ok(result) => break result,
Err(e) => return Err(e),
}
};
}
Multi-Stage Problems
For problems that require multiple phases:
#![allow(unused)]
fn main() {
// Phase 1: Data gathering
let mut solver = IterativeProblemSolver::builder(broker.clone())
.tools(data_tools)
.max_iterations(5)
.build();
let data = solver.solve("Gather all relevant data").await?;
// Phase 2: Analysis
let mut solver = IterativeProblemSolver::builder(broker)
.tools(analysis_tools)
.max_iterations(10)
.build();
let analysis = solver.solve(&format!("Analyze: {}", data)).await?;
}
Examples
See the complete examples at:
- examples/iterative_solver.rs - Basic usage with date and user interaction tools
- examples/solver_chat_session.rs - Interactive chat session with a solver delegation pattern
Error Handling
The solver returns Result<String, MojenticError>:
#![allow(unused)]
fn main() {
match solver.solve(problem).await {
Ok(result) => println!("Solution: {}", result),
Err(MojenticError::GatewayError(msg)) => {
eprintln!("Gateway error: {}", msg);
}
Err(MojenticError::ToolError(msg)) => {
eprintln!("Tool error: {}", msg);
}
Err(e) => {
eprintln!("Unexpected error: {}", e);
}
}
}
Advanced: Solver as a Tool
The IterativeProblemSolver can be wrapped as a tool and used within a ChatSession, enabling powerful delegation patterns where a chat assistant can offload complex problems to a specialized solver agent.
Creating a Solver Tool
#![allow(unused)]
fn main() {
use mojentic::llm::tools::{FunctionDescriptor, LlmTool, ToolDescriptor};
use mojentic::agents::IterativeProblemSolver;
use serde_json::{json, Value};
use std::collections::HashMap;
use std::sync::Arc;
struct IterativeProblemSolverTool {
broker: Arc<LlmBroker>,
tools: Vec<Box<dyn LlmTool>>,
}
impl IterativeProblemSolverTool {
fn new(broker: Arc<LlmBroker>, tools: Vec<Box<dyn LlmTool>>) -> Self {
Self { broker, tools }
}
}
impl Clone for IterativeProblemSolverTool {
fn clone(&self) -> Self {
Self {
broker: self.broker.clone(),
tools: self.tools.iter().map(|t| t.clone_box()).collect(),
}
}
}
impl LlmTool for IterativeProblemSolverTool {
fn run(&self, args: &HashMap<String, Value>) -> mojentic::error::Result<Value> {
let problem_to_solve = args
.get("problem_to_solve")
.and_then(|v| v.as_str())
.ok_or_else(|| {
mojentic::error::MojenticError::ToolError(
"Missing required argument: problem_to_solve".to_string(),
)
})?;
let solver_tools: Vec<Box<dyn LlmTool>> =
self.tools.iter().map(|t| t.clone_box()).collect();
let broker_clone = (*self.broker).clone();
// Calling `block_on` directly on an async worker thread panics;
// `block_in_place` (multi-threaded runtime only) moves the blocking
// call off the executor first.
let result = tokio::task::block_in_place(|| {
tokio::runtime::Handle::current().block_on(async move {
let mut solver = IterativeProblemSolver::builder(broker_clone)
.tools(solver_tools)
.max_iterations(5)
.build();
solver.solve(problem_to_solve).await
})
})?;
Ok(json!({"solution": result}))
}
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor {
r#type: "function".to_string(),
function: FunctionDescriptor {
name: "iterative_problem_solver".to_string(),
description: "Iteratively solve a complex multi-step problem using available tools.".to_string(),
parameters: json!({
"type": "object",
"properties": {
"problem_to_solve": {
"type": "string",
"description": "The problem or request to be solved."
}
},
"required": ["problem_to_solve"],
"additionalProperties": false
}),
},
}
}
fn clone_box(&self) -> Box<dyn LlmTool> {
Box::new(self.clone())
}
}
}
Using the Solver Tool in a Chat Session
use mojentic::llm::{ChatSession, LlmBroker};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::llm::tools::simple_date_tool::SimpleDateTool;
use mojentic::llm::tools::LlmTool;
use std::io::{self, Write};
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let gateway = Arc::new(OllamaGateway::default());
let broker = Arc::new(LlmBroker::new("qwq", gateway, None));
// Create the solver tool with SimpleDateTool as the inner tool
let solver_tools: Vec<Box<dyn LlmTool>> = vec![Box::new(SimpleDateTool)];
let solver_tool = IterativeProblemSolverTool::new(broker.clone(), solver_tools);
// Create chat session with the solver tool
let mut session = ChatSession::builder((*broker).clone())
.system_prompt(
"You are a helpful assistant with access to an iterative problem solver. \
When faced with complex multi-step problems or questions that require \
reasoning and tool usage, use the iterative_problem_solver tool."
)
.tools(vec![Box::new(solver_tool)])
.build();
// Interactive loop
loop {
let mut query = String::new();
print!("Query: ");
io::stdout().flush()?;
io::stdin().read_line(&mut query)?;
if query.trim().is_empty() {
break;
}
let response = session.send(query.trim()).await?;
println!("{}\n", response);
}
Ok(())
}
Benefits of This Pattern
- Delegation: The chat assistant can offload complex problems to a specialized solver
- Composability: Mix solver capabilities with other tools in the same session
- Context Preservation: The chat session maintains conversation history
- Flexible Interaction: Users can ask simple questions directly or complex problems that trigger the solver
See the complete example at:
- examples/solver_chat_session.rs - Interactive chat session with solver delegation
Limitations
- LLM Dependency: Quality of results depends on the underlying LLM’s capabilities
- Tool Design: Effectiveness relies on well-designed tools with clear descriptions
- Token Limits: Long iterations may hit context window limits
- Cost: Multiple LLM calls per problem can increase API costs
See Also
- Chat Sessions - Understanding the underlying chat mechanism
- Building Tools - Creating custom tools for the solver
- Simple Recursive Agent - Alternative problem-solving pattern
Working Memory Pattern
The working memory pattern enables agents to maintain and share context across multiple interactions. This guide shows how to use SharedWorkingMemory and build memory-aware agents in Rust.
Overview
Working memory provides:
- Shared Context: Multiple agents can read from and write to the same memory
- Continuous Learning: Agents automatically learn and remember new information
- State Persistence: Knowledge is maintained across interactions
- Thread Safety: Safe concurrent access via Arc<Mutex<T>>
Quick Start
Basic Usage
#![allow(unused)]
fn main() {
use mojentic::context::SharedWorkingMemory;
use serde_json::json;
// Create memory with initial data
let memory = SharedWorkingMemory::new(json!({
"User": {
"name": "Alice",
"age": 30
}
}));
// Retrieve current state
let current = memory.get_working_memory();
// Update memory (deep merge)
memory.merge_to_working_memory(&json!({
"User": {
"city": "NYC",
"preferences": {
"theme": "dark"
}
}
}));
// Result: {"User": {"name": "Alice", "age": 30, "city": "NYC", "preferences": {...}}}
}
Memory-Aware Agent Pattern
use mojentic::llm::{LlmBroker, LlmMessage};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::context::SharedWorkingMemory;
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Initialize
let gateway = OllamaGateway::default();
let broker = LlmBroker::new("qwen2.5:7b", gateway);
let memory = SharedWorkingMemory::new(json!({
"User": {"name": "Alice"}
}));
// Create prompt with memory context
let memory_context = memory.get_working_memory();
let prompt = format!(
"This is what you remember: {}\n\nRemember anything new you learn.\n\nUser: I love pizza",
serde_json::to_string_pretty(&memory_context)?
);
// Generate response with schema that includes memory field
let schema = json!({
"type": "object",
"required": ["answer"],
"properties": {
"answer": {"type": "string"},
"memory": {
"type": "object",
"description": "Add anything new you learned here."
}
}
});
let response = broker.generate(
vec![LlmMessage::user(&prompt)],
Some(schema),
None,
None,
None,
None
).await?;
// Parse response and update memory
let response_json: serde_json::Value = serde_json::from_str(&response.content)?;
if let Some(learned) = response_json.get("memory") {
memory.merge_to_working_memory(learned);
}
Ok(())
}
Core Concepts
SharedWorkingMemory
A thread-safe, mutable key-value store that agents use to share context:
#![allow(unused)]
fn main() {
pub struct SharedWorkingMemory {
memory: Arc<Mutex<HashMap<String, Value>>>,
}
impl SharedWorkingMemory {
pub fn new(initial: Value) -> Self
pub fn get_working_memory(&self) -> Value
pub fn merge_to_working_memory(&self, updates: &Value)
}
}
Key features:
- Thread-Safe: Uses Arc<Mutex<T>> for concurrent access
- Deep Merge: Nested JSON objects are recursively merged
- Simple API: Just three methods to learn
Deep Merge Behavior
Memory updates use deep merge to preserve existing data:
#![allow(unused)]
fn main() {
use serde_json::json;
// Initial memory
let memory = SharedWorkingMemory::new(json!({
"User": {
"name": "Alice",
"age": 30,
"address": {
"city": "NYC",
"state": "NY"
}
}
}));
// Update with nested data
memory.merge_to_working_memory(&json!({
"User": {
"age": 31,
"address": {
"zip": "10001"
}
}
}));
// Result: All fields preserved, nested objects merged
// {
// "User": {
// "name": "Alice", // Preserved
// "age": 31, // Updated
// "address": {
// "city": "NYC", // Preserved
// "state": "NY", // Preserved
// "zip": "10001" // Added
// }
// }
// }
}
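The merge behavior above can be sketched with a small recursive function. This is an illustrative, std-only stand-in (using a nested enum in place of serde_json::Value), not the crate's actual implementation:

```rust
use std::collections::HashMap;

// Minimal stand-in for a JSON-like value.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Leaf(String),
    Map(HashMap<String, Node>),
}

// Recursively merge `updates` into `base`: nested maps are merged
// key-by-key; any other collision overwrites the existing value.
fn deep_merge(base: &mut Node, updates: &Node) {
    match (base, updates) {
        (Node::Map(b), Node::Map(u)) => {
            for (k, v) in u {
                match b.get_mut(k) {
                    Some(existing) => deep_merge(existing, v),
                    None => {
                        b.insert(k.clone(), v.clone());
                    }
                }
            }
        }
        (b, u) => *b = u.clone(),
    }
}
```

The key property is that a partial update never discards sibling keys: merging {"age": "31"} into a user record leaves "name" untouched.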
Building Memory-Aware Agents
Here’s a complete pattern for memory-aware agents:
#![allow(unused)]
fn main() {
use mojentic::llm::{LlmBroker, LlmMessage};
use mojentic::context::SharedWorkingMemory;
use mojentic::agents::{Event, BaseAsyncAgent};
use serde::{Deserialize, Serialize};
use serde_json::json;
use async_trait::async_trait;
use std::sync::Arc;
#[derive(Serialize, Deserialize)]
struct ResponseModel {
text: String,
#[serde(skip_serializing_if = "Option::is_none")]
memory: Option<serde_json::Value>,
}
struct MemoryAgent {
broker: Arc<LlmBroker>,
memory: SharedWorkingMemory,
behaviour: String,
instructions: String,
}
impl MemoryAgent {
fn new(
broker: Arc<LlmBroker>,
memory: SharedWorkingMemory,
behaviour: String,
instructions: String,
) -> Self {
Self { broker, memory, behaviour, instructions }
}
async fn generate_with_memory(
&self,
user_input: &str,
) -> Result<ResponseModel, Box<dyn std::error::Error>> {
// Build prompt with memory context
let memory_context = self.memory.get_working_memory();
let messages = vec![
LlmMessage::system(&self.behaviour),
LlmMessage::user(&format!(
"This is what you remember:\n{}\n\n{}",
serde_json::to_string_pretty(&memory_context)?,
self.instructions
)),
LlmMessage::user(user_input),
];
// Schema with memory field
let schema = json!({
"type": "object",
"required": ["text"],
"properties": {
"text": {"type": "string"},
"memory": {
"type": "object",
"description": "Add new information here."
}
}
});
// Generate response
let response = self.broker.generate(
messages,
Some(schema),
None,
None,
None,
None
).await?;
// Parse and update memory
let result: ResponseModel = serde_json::from_str(&response.content)?;
if let Some(ref learned) = result.memory {
self.memory.merge_to_working_memory(learned);
}
Ok(result)
}
}
}
Multi-Agent Coordination
Multiple agents can share the same memory instance:
#![allow(unused)]
fn main() {
use std::sync::Arc;
// Shared memory: SharedWorkingMemory wraps Arc<Mutex<..>>, so cloning it
// shares the same underlying store (no outer Arc needed here, assuming
// it implements Clone)
let memory = SharedWorkingMemory::new(json!({
"context": {}
}));
// Multiple agents
let researcher = MemoryAgent::new(
Arc::clone(&broker),
memory.clone(),
"You are a research assistant.".to_string(),
"Research topics thoroughly.".to_string(),
);
let writer = MemoryAgent::new(
Arc::clone(&broker),
memory.clone(),
"You are a technical writer.".to_string(),
"Write clear documentation.".to_string(),
);
// Researcher updates memory
let research = researcher
.generate_with_memory("Research Rust async patterns")
.await?;
// Writer uses updated memory (already shared)
let article = writer
.generate_with_memory("Write an article about what was researched")
.await?;
}
Use Cases
1. Conversational Chatbots
#![allow(unused)]
fn main() {
let memory = SharedWorkingMemory::new(json!({
"conversation_history": [],
"user_preferences": {}
}));
}
2. Workflow Automation
#![allow(unused)]
fn main() {
let memory = SharedWorkingMemory::new(json!({
"workflow_state": "started",
"completed_steps": [],
"pending_tasks": []
}));
}
3. Knowledge Base Building
#![allow(unused)]
fn main() {
let memory = SharedWorkingMemory::new(json!({
"entities": {},
"relationships": [],
"facts": []
}));
}
Best Practices
1. Structure Your Memory
Use clear, hierarchical keys:
#![allow(unused)]
fn main() {
json!({
"User": {...},
"Conversation": {...},
"SystemState": {...}
})
}
2. Use Arc for Sharing
Share memory across threads/agents:
#![allow(unused)]
fn main() {
let memory = Arc::new(SharedWorkingMemory::new(initial));
let memory_clone = Arc::clone(&memory);
}
3. Validate Memory Updates
Check memory quality before accepting:
#![allow(unused)]
fn main() {
if let Some(ref learned) = result.memory {
if is_valid_update(learned) {
memory.merge_to_working_memory(learned);
}
}
}
4. Handle Errors Gracefully
Memory operations can fail:
#![allow(unused)]
fn main() {
match agent.generate_with_memory(input).await {
Ok(response) => {
// Process response
}
Err(e) => {
eprintln!("Failed to generate response: {}", e);
// Don't update memory on error
}
}
}
Example Application
See the complete working memory example:
cd mojentic-ru
cargo run --example working_memory
The example demonstrates:
- Initializing memory with user data
- RequestAgent that learns from conversation
- Event-driven coordination with AsyncDispatcher
- Memory persistence across interactions
API Reference
SharedWorkingMemory
#![allow(unused)]
fn main() {
impl SharedWorkingMemory {
/// Create new memory with initial data
pub fn new(initial: Value) -> Self
/// Get current memory snapshot
pub fn get_working_memory(&self) -> Value
/// Deep merge updates into memory
pub fn merge_to_working_memory(&self, updates: &Value)
}
}
See src/context/shared_working_memory.rs for full documentation.
Thread Safety
SharedWorkingMemory is thread-safe and can be shared across async tasks:
#![allow(unused)]
fn main() {
use serde_json::json;
use std::sync::Arc;
use tokio::task;
let initial = json!({});
let updates1 = json!({"task1": "done"});
let updates2 = json!({"task2": "done"});
let memory = Arc::new(SharedWorkingMemory::new(initial));
let task1 = {
let memory = Arc::clone(&memory);
task::spawn(async move {
memory.merge_to_working_memory(&updates1);
})
};
let task2 = {
let memory = Arc::clone(&memory);
task::spawn(async move {
memory.merge_to_working_memory(&updates2);
})
};
task1.await?;
task2.await?;
}
Observability
Tracing and introspection help debug agentic flows, measure latency, and optimize tool usage.
Components:
- Tracer API for spans/events
- Instrumented broker calls
- Future: metrics export
Tracer System
The tracer system provides comprehensive observability into LLM interactions, tool executions, and agent communications. It records events with timestamps and correlation IDs, enabling detailed debugging and monitoring.
Goals:
- Performance insight
- Failure localization
- Reproducibility of agent flows
Architecture
The tracer system consists of several key components:
Core Components
- TracerEvent: Base trait for all event types with timestamps, correlation IDs, and printable summaries
- EventStore: Thread-safe storage for events with callbacks and filtering capabilities
- TracerSystem: Coordination layer providing convenience methods for recording events
- NullTracer: Null object pattern implementation for when tracing is disabled
Event Types
The system supports four main event types:
- LlmCallTracerEvent: Records LLM calls with model, messages, temperature, and available tools
- LlmResponseTracerEvent: Records LLM responses with content, tool calls, and duration
- ToolCallTracerEvent: Records tool executions with arguments, results, and duration
- AgentInteractionTracerEvent: Records agent-to-agent communications
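As a rough sketch of the shape these components take, a tracer event reduces to a trait carrying a timestamp, a correlation ID, and a printable summary. The field and method names below are assumptions for illustration, not the crate's exact API:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Illustrative stand-in for the TracerEvent trait described above.
trait TracerEvent {
    fn timestamp(&self) -> f64;
    fn correlation_id(&self) -> &str;
    fn printable_summary(&self) -> String;
}

// One concrete event type, mirroring ToolCallTracerEvent in spirit.
struct ToolCallEvent {
    timestamp: f64,
    correlation_id: String,
    tool_name: String,
}

impl ToolCallEvent {
    fn new(tool_name: &str, correlation_id: &str) -> Self {
        // Seconds since the Unix epoch, matching the query API's f64 times.
        let timestamp = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("clock before epoch")
            .as_secs_f64();
        Self {
            timestamp,
            correlation_id: correlation_id.to_string(),
            tool_name: tool_name.to_string(),
        }
    }
}

impl TracerEvent for ToolCallEvent {
    fn timestamp(&self) -> f64 {
        self.timestamp
    }
    fn correlation_id(&self) -> &str {
        &self.correlation_id
    }
    fn printable_summary(&self) -> String {
        format!(
            "ToolCallTracerEvent[{}] tool={}",
            self.correlation_id, self.tool_name
        )
    }
}
```

Because every event type exposes the same three accessors, the store can filter and summarize them uniformly without knowing the concrete type.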
Usage
Basic Setup
#![allow(unused)]
fn main() {
use mojentic::tracer::TracerSystem;
use std::sync::Arc;
// Create a tracer system
let tracer = Arc::new(TracerSystem::default());
// Enable/disable tracing
tracer.enable();
tracer.disable();
}
Recording Events
LLM Calls
#![allow(unused)]
fn main() {
use std::collections::HashMap;
tracer.record_llm_call(
"llama3.2", // model
vec![], // messages (simplified)
0.7, // temperature
None, // tools
"my_broker", // source
"correlation-123" // correlation_id
);
}
LLM Responses
#![allow(unused)]
fn main() {
tracer.record_llm_response(
"llama3.2", // model
"Response text", // content
None, // tool_calls
Some(150.5), // call_duration_ms
"my_broker", // source
"correlation-123" // correlation_id
);
}
Tool Calls
#![allow(unused)]
fn main() {
use serde_json::json;
use std::collections::HashMap;
let mut arguments = HashMap::new();
arguments.insert("input".to_string(), json!("test data"));
tracer.record_tool_call(
"date_tool", // tool_name
arguments, // arguments
json!({"result": "2024-01-15"}), // result
Some("agent1".to_string()), // caller
Some(25.0), // call_duration_ms
"tool_executor", // source
"correlation-123" // correlation_id
);
}
Agent Interactions
#![allow(unused)]
fn main() {
tracer.record_agent_interaction(
"agent1", // from_agent
"agent2", // to_agent
"request", // event_type
Some("evt-456".to_string()), // event_id
"dispatcher", // source
"correlation-123" // correlation_id
);
}
Querying Events
Get All Events
#![allow(unused)]
fn main() {
let summaries = tracer.get_event_summaries(None, None, None);
for summary in summaries {
println!("{}", summary);
}
}
Filter by Time Range
#![allow(unused)]
fn main() {
use std::time::{SystemTime, UNIX_EPOCH};
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs_f64();
let one_hour_ago = now - 3600.0;
let recent_events = tracer.get_event_summaries(
Some(one_hour_ago), // start_time
Some(now), // end_time
None // filter_func
);
}
Filter by Custom Predicate
#![allow(unused)]
fn main() {
// Find all events from a specific correlation ID
let correlation_id = "correlation-123";
let related_events = tracer.get_event_summaries(
None,
None,
Some(&|event| event.correlation_id() == correlation_id)
);
// Count tool call events
let tool_call_count = tracer.count_events(
None,
None,
Some(&|event| {
event.printable_summary().contains("ToolCallTracerEvent")
})
);
}
Get Last N Events
#![allow(unused)]
fn main() {
// Get last 10 events
let last_events = tracer.get_last_n_summaries(10, None);
// Get last 5 LLM-related events
let last_llm_events = tracer.get_last_n_summaries(
5,
Some(&|event| {
let summary = event.printable_summary();
summary.contains("LlmCallTracerEvent") ||
summary.contains("LlmResponseTracerEvent")
})
);
}
Managing Events
#![allow(unused)]
fn main() {
// Get event count
let total_events = tracer.len();
println!("Total events: {}", total_events);
// Check if empty
if tracer.is_empty() {
println!("No events recorded");
}
// Clear all events
tracer.clear();
}
Correlation IDs
Correlation IDs are UUIDs that track related events across the system. They enable you to trace all events related to a single request or operation, creating a complete audit trail.
Best Practices
- Generate Once: Create a correlation ID at the start of a request
- Pass Through: Copy the correlation ID to all downstream operations
- Query Together: Use correlation IDs to filter related events
Example flow:
User Request → Generate correlation_id
↓
LLM Call (correlation_id: "abc-123")
↓
LLM Response (correlation_id: "abc-123")
↓
Tool Call (correlation_id: "abc-123")
↓
LLM Call with tool result (correlation_id: "abc-123")
↓
Final LLM Response (correlation_id: "abc-123")
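The generate-once, pass-through discipline can be sketched as below. Production code would typically use the uuid crate (Uuid::new_v4()); this std-only stand-in derives an id from the clock plus a counter purely for illustration:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

static COUNTER: AtomicU64 = AtomicU64::new(0);

// Stand-in for Uuid::new_v4(): unique enough for illustration only.
fn new_correlation_id() -> String {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_nanos();
    let n = COUNTER.fetch_add(1, Ordering::SeqCst);
    format!("corr-{:x}-{}", nanos, n)
}

// Generate the id once at the top of the request, then hand the same
// id to every downstream operation; never mint a fresh one mid-flow.
fn handle_request(log: &mut Vec<String>) -> String {
    let correlation_id = new_correlation_id();
    llm_call(&correlation_id, log);
    tool_call(&correlation_id, log);
    correlation_id
}

fn llm_call(correlation_id: &str, log: &mut Vec<String>) {
    log.push(format!("LlmCall[{}]", correlation_id));
}

fn tool_call(correlation_id: &str, log: &mut Vec<String>) {
    log.push(format!("ToolCall[{}]", correlation_id));
}
```

Filtering the event log by that one id then reconstructs the full audit trail for a single request, which is exactly what the predicate-based queries above do.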
Event Store Callbacks
You can register a callback function that’s called whenever an event is stored:
#![allow(unused)]
fn main() {
use mojentic::tracer::{EventStore, EventCallback, TracerSystem};
use std::sync::{Arc, atomic::{AtomicUsize, Ordering}};
// Create a counter
let event_count = Arc::new(AtomicUsize::new(0));
let count_clone = Arc::clone(&event_count);
// Create callback
let callback: EventCallback = Arc::new(move |event| {
count_clone.fetch_add(1, Ordering::SeqCst);
println!("Event stored: {}", event.printable_summary());
});
// Create event store with callback
let event_store = Arc::new(EventStore::new(Some(callback)));
// Create tracer with custom event store
let tracer = Arc::new(TracerSystem::new(Some(event_store), true));
// Events will trigger the callback
tracer.record_llm_call("llama3.2", vec![], 1.0, None, "test", "corr-1");
println!("Total events: {}", event_count.load(Ordering::SeqCst));
}
Null Tracer
For environments where tracing should be completely disabled without conditional checks:
#![allow(unused)]
fn main() {
use mojentic::tracer::NullTracer;
use std::collections::HashMap;
use serde_json::json;
let tracer = NullTracer::new();
// All operations are no-ops
tracer.record_llm_call("model", vec![], 1.0, None, "source", "id");
tracer.record_tool_call("tool", HashMap::new(), json!({}), None, None, "source", "id");
// Queries return empty results
assert!(tracer.is_empty());
assert_eq!(tracer.len(), 0);
assert_eq!(tracer.get_event_summaries(None, None, None).len(), 0);
}
Performance Considerations
- Thread Safety: EventStore uses Arc<Mutex<T>> for thread-safe access
- Memory: Events are stored in memory; clear periodically for long-running processes
- Overhead: Minimal when disabled; consider NullTracer for zero overhead
- Callbacks: Keep callback functions fast to avoid blocking event recording
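One way to act on the memory advice above is to bound the store instead of clearing it manually. A minimal sketch (illustrative, not the crate's EventStore) that retains only the most recent N summaries:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Bounded event store: the oldest event is evicted once capacity is
// reached, so a long-running process cannot grow without limit.
struct CappedEventStore {
    events: Arc<Mutex<VecDeque<String>>>,
    capacity: usize,
}

impl CappedEventStore {
    fn new(capacity: usize) -> Self {
        Self {
            events: Arc::new(Mutex::new(VecDeque::new())),
            capacity,
        }
    }

    fn record(&self, summary: String) {
        let mut events = self.events.lock().expect("store poisoned");
        if events.len() == self.capacity {
            events.pop_front(); // evict the oldest event
        }
        events.push_back(summary);
    }

    fn len(&self) -> usize {
        self.events.lock().expect("store poisoned").len()
    }
}
```

The trade-off: old events become unqueryable, so pick a capacity large enough to cover the correlation windows you care about.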
Integration with LlmBroker
The tracer integrates with LlmBroker to automatically record LLM interactions (implementation in progress):
use mojentic::llm::{LlmBroker, LlmMessage};
use mojentic::llm::gateways::OllamaGateway;
use mojentic::tracer::TracerSystem;
use std::sync::Arc;
// Create tracer
let tracer = Arc::new(TracerSystem::default());
// Create broker with tracer
let gateway = Arc::new(OllamaGateway::default());
let broker = LlmBroker::builder("llama3.2", gateway)
.with_tracer(tracer.clone())
.build();
// Broker will automatically record events
let response = broker.generate(
&[LlmMessage::user("Hello!")],
None,
None
).await?;
// Query tracer for events
let events = tracer.get_event_summaries(None, None, None);
for event in events {
println!("{}", event);
}
Testing
The tracer system includes comprehensive unit tests covering:
- Event creation and formatting
- Event storage and retrieval
- Callbacks and filtering
- TracerSystem coordination
- NullTracer behavior
Run tests with:
cargo test tracer
Implementation Status
✅ Layer 2 Tracer System - Core Complete
Current implementation:
- ✅ TracerEvent trait with 4 event types
- ✅ EventStore with callbacks and filtering
- ✅ TracerSystem coordination layer
- ✅ NullTracer for zero-overhead tracing
- ✅ 24 comprehensive unit tests (all passing)
- ✅ Correlation ID support
- ✅ Documentation complete
Pending integration:
- ⏳ LlmBroker integration (add tracer parameter)
- ⏳ Tool system integration (add tracer parameter)
- ⏳ Example application (tracer_demo.rs)
See Also
Comprehensive Guide
End-to-end walkthrough of observing a multi-agent run.
(Placeholder for a full example once the agents crate stabilizes.)
API Documentation
The full Rust API reference is generated with cargo doc and published under /api/ of the documentation site.
- Visit: /api/
- Crate landing page: /api/mojentic/
Note: During local development, run:
# Build API docs
cargo doc --no-deps --all-features
# Build the mdBook
mdbook build book
# Preview: serve book/book/ for the mdBook and target/doc/ for the API docs
Contributing
We welcome issues and PRs. Please follow Rust style (rustfmt, clippy) and add tests for new behavior.
Extending Mojentic
This guide shows how to extend the Mojentic framework with new gateways and custom functionality.
Adding a New LLM Gateway
To add support for a new LLM provider, implement the LlmGateway trait:
1. Create the Gateway File
#![allow(unused)]
fn main() {
// src/llm/gateways/openai.rs
use crate::error::{MojenticError, Result};
use crate::llm::gateway::{CompletionConfig, LlmGateway};
use crate::llm::models::{LlmGatewayResponse, LlmMessage};
use crate::llm::tools::LlmTool;
use async_trait::async_trait;
use serde_json::Value;
pub struct OpenAiGateway {
client: reqwest::Client,
api_key: String,
base_url: String,
}
impl OpenAiGateway {
pub fn new(api_key: String) -> Self {
Self {
client: reqwest::Client::new(),
api_key,
base_url: "https://api.openai.com/v1".to_string(),
}
}
}
#[async_trait]
impl LlmGateway for OpenAiGateway {
async fn complete(
&self,
model: &str,
messages: &[LlmMessage],
tools: Option<&[Box<dyn LlmTool>]>,
config: &CompletionConfig,
) -> Result<LlmGatewayResponse> {
// Implement OpenAI completion API call
todo!()
}
async fn complete_json(
&self,
model: &str,
messages: &[LlmMessage],
schema: Value,
config: &CompletionConfig,
) -> Result<Value> {
// Implement OpenAI structured output
todo!()
}
async fn get_available_models(&self) -> Result<Vec<String>> {
// Implement model listing
todo!()
}
async fn calculate_embeddings(
&self,
text: &str,
model: Option<&str>,
) -> Result<Vec<f32>> {
// Implement embeddings API
todo!()
}
}
}
2. Export the Gateway
#![allow(unused)]
fn main() {
// src/llm/gateways/mod.rs
pub mod ollama;
pub mod openai; // Add this line
pub use ollama::{OllamaConfig, OllamaGateway};
pub use openai::OpenAiGateway; // Add this line
}
3. Use the Gateway
use mojentic::prelude::*;
use mojentic::llm::gateways::OpenAiGateway;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<()> {
let gateway = OpenAiGateway::new("your-api-key".to_string());
let broker = LlmBroker::new("gpt-4", Arc::new(gateway));
// Use as normal
let messages = vec![LlmMessage::user("Hello!")];
let response = broker.generate(&messages, None, None).await?;
Ok(())
}
Creating Custom Message Types
You can create helper functions for specific message types:
#![allow(unused)]
fn main() {
use mojentic::llm::models::{LlmMessage, MessageRole};
impl LlmMessage {
/// Create a message with an image
pub fn user_with_image(content: impl Into<String>, image_path: impl Into<String>) -> Self {
Self {
role: MessageRole::User,
content: Some(content.into()),
tool_calls: None,
image_paths: Some(vec![image_path.into()]),
}
}
/// Create a system message with specific formatting
pub fn system_instruction(instruction: impl Into<String>) -> Self {
let content = format!("SYSTEM INSTRUCTION: {}", instruction.into());
Self::system(content)
}
}
}
Implementing Structured Output Types
Define your own structured output types with JSON Schema:
use mojentic::llm::{LlmBroker, LlmMessage};
use mojentic::llm::gateways::OllamaGateway;
use serde::{Deserialize, Serialize};
use schemars::JsonSchema;
use std::sync::Arc;
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct ProductReview {
rating: u8,
pros: Vec<String>,
cons: Vec<String>,
recommendation: bool,
summary: String,
}
#[tokio::main]
async fn main() -> mojentic::Result<()> {
let gateway = OllamaGateway::new();
let broker = LlmBroker::new("qwen3:32b", Arc::new(gateway));
let messages = vec![
LlmMessage::user(
"Review this product: A wireless mouse with RGB lighting, \
3200 DPI, 6 buttons, and 40-hour battery life."
)
];
let review: ProductReview = broker.generate_object(&messages, None).await?;
println!("Rating: {}/5", review.rating);
println!("Pros: {:?}", review.pros);
println!("Cons: {:?}", review.cons);
println!("Recommended: {}", review.recommendation);
Ok(())
}
Advanced: Custom Gateway Configuration
Create configuration structs for your gateways:
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct CustomGatewayConfig {
pub endpoint: String,
pub timeout: std::time::Duration,
pub retry_attempts: u32,
pub custom_headers: HashMap<String, String>,
}
impl Default for CustomGatewayConfig {
fn default() -> Self {
Self {
endpoint: std::env::var("CUSTOM_ENDPOINT")
.unwrap_or_else(|_| "https://api.example.com".to_string()),
timeout: std::time::Duration::from_secs(30),
retry_attempts: 3,
custom_headers: HashMap::new(),
}
}
}
pub struct CustomGateway {
client: reqwest::Client,
config: CustomGatewayConfig,
}
impl CustomGateway {
pub fn new() -> Self {
Self::with_config(CustomGatewayConfig::default())
}
pub fn with_config(config: CustomGatewayConfig) -> Self {
let client = reqwest::Client::builder()
.timeout(config.timeout)
.build()
.unwrap();
Self { client, config }
}
}
}
Testing Your Extensions
Create tests for your gateways and tools:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn test_gateway_models() {
let gateway = OllamaGateway::new();
let models = gateway.get_available_models().await.unwrap();
assert!(!models.is_empty());
}
}
}
Best Practices
- Error Handling: Always use proper error types, never panic in library code
- Async: All I/O operations should be async
- Documentation: Add rustdoc comments to all public APIs
- Testing: Write tests for your implementations
- Configuration: Support environment variables for API keys and endpoints
- Type Safety: Leverage Rust’s type system for compile-time guarantees
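For the error-handling point, a gateway extension would typically define its own error type and convert it into the framework's error rather than panicking. A std-only sketch; the variant names and check_status helper are illustrative, not part of Mojentic's API:

```rust
use std::fmt;

// Illustrative error type for a custom gateway; in practice you would
// map this into MojenticError instead of calling unwrap/panic.
#[derive(Debug)]
enum GatewayError {
    Http(u16),               // non-success HTTP status code
    Timeout,                 // request exceeded its deadline
    InvalidResponse(String), // body could not be parsed
}

impl fmt::Display for GatewayError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GatewayError::Http(code) => write!(f, "HTTP error: status {}", code),
            GatewayError::Timeout => write!(f, "request timed out"),
            GatewayError::InvalidResponse(msg) => write!(f, "invalid response: {}", msg),
        }
    }
}

impl std::error::Error for GatewayError {}

// Callers receive a Result they can match on instead of a panic.
fn check_status(status: u16) -> Result<(), GatewayError> {
    if (200..300).contains(&status) {
        Ok(())
    } else {
        Err(GatewayError::Http(status))
    }
}
```

Because the type implements std::error::Error, it composes with Box<dyn Error> returns and with ? conversion once a From impl into the framework error is added.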
Need Help?
- Check the existing implementations in src/llm/gateways/ollama.rs
- Review the tool example in examples/tool_usage.rs
- See the Building Tools guide for tool development
Testing
- Unit tests: cargo test --all-features
- Lints: cargo clippy -- -D warnings
- Formatting: cargo fmt -- --check
User Personas
- Framework user: wants ready-to-run examples
- Library integrator: needs stable APIs and performance
- Contributor: deep dives into internals and design