Using LLMs (Broker)
Mojentic’s LLM broker routes completion requests to pluggable gateways (e.g., Ollama). It provides a unified API for chat and text completions.
Quick chat example
use mojentic::llm::{Broker, CompletionConfig, Message};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let broker = Broker::new()?;
    let cfg = CompletionConfig::default();
    let resp = broker.chat(
        cfg,
        [
            Message::system("You are a helpful assistant"),
            Message::user("Say hi in one sentence"),
        ],
    ).await?;
    println!("{}", resp.text());
    Ok(())
}
Gateways
- Ollama: local models for fast iteration.
- HTTP-based gateways: add your own by implementing the Gateway trait.
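To make the idea of pluggable gateways concrete, here is a minimal, std-only sketch of a broker that routes a request to a gateway selected by name. Every name in it (the `Gateway` trait's methods, `EchoGateway`, `ToyBroker`) is a hypothetical stand-in written for illustration; Mojentic's actual trait is likely async and will have a different signature, so consult the crate's API docs before implementing a real gateway.

```rust
// Hypothetical stand-ins: these are NOT Mojentic's real types or signatures.

/// A minimal gateway abstraction: turn a prompt into a completion.
pub trait Gateway {
    /// Name the broker uses to select this gateway.
    fn name(&self) -> &str;
    /// Produce a completion for the given prompt.
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// A toy gateway that echoes the prompt back; handy for tests.
pub struct EchoGateway;

impl Gateway for EchoGateway {
    fn name(&self) -> &str {
        "echo"
    }

    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

/// A toy broker that routes each request to a registered gateway by name.
#[derive(Default)]
pub struct ToyBroker {
    gateways: Vec<Box<dyn Gateway>>,
}

impl ToyBroker {
    /// Register a gateway so later requests can be routed to it.
    pub fn register(&mut self, gateway: Box<dyn Gateway>) {
        self.gateways.push(gateway);
    }

    /// Look up the named gateway and forward the prompt to it.
    pub fn complete(&self, gateway: &str, prompt: &str) -> Result<String, String> {
        self.gateways
            .iter()
            .find(|g| g.name() == gateway)
            .ok_or_else(|| format!("unknown gateway: {gateway}"))?
            .complete(prompt)
    }
}
```

Because callers only ever talk to the broker, swapping a local gateway for an HTTP-based one in this sketch means registering a different implementation; the calling code does not change.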
Structured output
Use typed schemas to parse the model output into structs. See Structured Output.
Configuration
Use CompletionConfig to control generation parameters:
use mojentic::llm::gateway::{CompletionConfig, ReasoningEffort};

let config = CompletionConfig {
    temperature: 0.3,
    reasoning_effort: Some(ReasoningEffort::High),
    ..Default::default()
};
Available Parameters
- temperature (f64): Controls randomness. Default: 1.0
- num_ctx (u32): Context window size in tokens. Default: 32768
- max_tokens (u32): Maximum tokens to generate. Default: 16384
- num_predict (i32): Tokens to predict (-1 = no limit). Default: -1
- reasoning_effort (Option<ReasoningEffort>): Extended thinking level: Low, Medium, High, or None. Default: None
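The defaults listed above combine naturally with Rust's struct update syntax: set only the fields you care about and let `..Default::default()` fill in the rest. The sketch below uses a local stand-in struct that mirrors the documented fields and defaults so it compiles without the crate; the real `CompletionConfig` may carry additional fields or derives, and `long_context_config` is a hypothetical helper invented for this example.

```rust
// Local stand-ins mirroring the documented fields and defaults.
// The real mojentic types may differ; this is only an illustration.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ReasoningEffort {
    Low,
    Medium,
    High,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CompletionConfig {
    pub temperature: f64,
    pub num_ctx: u32,
    pub max_tokens: u32,
    pub num_predict: i32,
    pub reasoning_effort: Option<ReasoningEffort>,
}

impl Default for CompletionConfig {
    fn default() -> Self {
        Self {
            temperature: 1.0,
            num_ctx: 32768,
            max_tokens: 16384,
            num_predict: -1, // -1 means no limit
            reasoning_effort: None,
        }
    }
}

/// Hypothetical helper: widen the context window and cap the output length,
/// leaving every other parameter at its default.
pub fn long_context_config() -> CompletionConfig {
    CompletionConfig {
        num_ctx: 65536,
        max_tokens: 4096,
        ..Default::default()
    }
}
```

Here `long_context_config()` changes two fields, while `temperature`, `num_predict`, and `reasoning_effort` keep the defaults from the list.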
Reasoning Effort
Control how much the model thinks before responding:
use mojentic::llm::gateway::{CompletionConfig, ReasoningEffort};

// Deep reasoning for complex problems
let config = CompletionConfig {
    reasoning_effort: Some(ReasoningEffort::High),
    temperature: 0.1,
    ..Default::default()
};

// Quick responses
let config = CompletionConfig {
    reasoning_effort: Some(ReasoningEffort::Low),
    ..Default::default()
};
- Ollama: Maps to the think: true parameter for extended thinking. The model’s reasoning trace is available in LlmGatewayResponse.thinking.
- OpenAI: Maps to the reasoning_effort API parameter for reasoning models (o1, o3 series). Ignored with a warning for non-reasoning models.
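The per-gateway behavior described above could be sketched roughly as follows. This is an illustrative reconstruction, not Mojentic's implementation: the function names, the prefix-based check for reasoning models, and the lowercase string values are all assumptions made for the example.

```rust
// Illustrative only: names and matching logic are assumptions,
// not the library's actual code.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ReasoningEffort {
    Low,
    Medium,
    High,
}

/// Ollama-style mapping: any requested effort level turns thinking on.
pub fn ollama_think(effort: Option<ReasoningEffort>) -> bool {
    effort.is_some()
}

/// Assumed heuristic: treat model names starting with "o1" or "o3"
/// as reasoning models.
fn is_reasoning_model(model: &str) -> bool {
    model.starts_with("o1") || model.starts_with("o3")
}

/// OpenAI-style mapping: return the value to send as `reasoning_effort`,
/// or `None` when nothing should be sent.
pub fn openai_reasoning_effort(
    model: &str,
    effort: Option<ReasoningEffort>,
) -> Option<&'static str> {
    // No effort requested: nothing to send.
    let level = effort?;
    if !is_reasoning_model(model) {
        // Ignore the setting, but tell the caller why.
        eprintln!("warning: reasoning_effort ignored for non-reasoning model {model}");
        return None;
    }
    Some(match level {
        ReasoningEffort::Low => "low",
        ReasoningEffort::Medium => "medium",
        ReasoningEffort::High => "high",
    })
}
```

In this sketch the Ollama path collapses all three levels into a single boolean, while the OpenAI path preserves the level but drops it for models that cannot use it.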
For full details, see Reasoning Effort Control.