API Reference - Gateways

Gateways provide the interface between Mojentic and LLM providers.

LlmGateway Interface

typescript

interface LlmGateway {
  generate(
    messages: LlmMessage[],
    modelId: string,
    config?: CompletionConfig
  ): Promise<Result<GatewayResponse, Error>>;

  generateStream(
    messages: LlmMessage[],
    modelId: string,
    config?: CompletionConfig
  ): Promise<Result<AsyncGenerator<StreamChunk, void, unknown>, Error>>;

  listModels(): Promise<Result<string[], Error>>;
}

Base interface for all LLM gateways.
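
Because every gateway shares this interface, a test double can stand in for a real provider. The sketch below shows one way that might look. The `Result`, `LlmMessage`, `CompletionConfig`, `GatewayResponse`, and `StreamChunk` types declared here are simplified local stand-ins so the snippet is self-contained; they are assumptions, not Mojentic's actual definitions, and `EchoGateway` is a hypothetical name.

```typescript
// Simplified local stand-ins (assumptions); real code would import these from 'mojentic'.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
interface LlmMessage { role: string; content: string }
interface CompletionConfig { temperature?: number; maxTokens?: number }
interface GatewayResponse { content: string }
interface StreamChunk { content: string; isComplete: boolean }

interface LlmGateway {
  generate(messages: LlmMessage[], modelId: string, config?: CompletionConfig):
    Promise<Result<GatewayResponse, Error>>;
  generateStream(messages: LlmMessage[], modelId: string, config?: CompletionConfig):
    Promise<Result<AsyncGenerator<StreamChunk, void, unknown>, Error>>;
  listModels(): Promise<Result<string[], Error>>;
}

// Hypothetical test double: echoes the last message instead of calling a provider.
class EchoGateway implements LlmGateway {
  async generate(
    messages: LlmMessage[],
    _modelId: string,
    _config?: CompletionConfig
  ): Promise<Result<GatewayResponse, Error>> {
    const last = messages[messages.length - 1];
    if (!last) return { ok: false, error: new Error('No messages supplied') };
    return { ok: true, value: { content: `echo: ${last.content}` } };
  }

  async generateStream(
    messages: LlmMessage[],
    _modelId: string,
    _config?: CompletionConfig
  ): Promise<Result<AsyncGenerator<StreamChunk, void, unknown>, Error>> {
    const words = (messages[messages.length - 1]?.content ?? '').split(' ');
    // Emit one chunk per word; the last chunk is flagged as complete.
    async function* chunks(): AsyncGenerator<StreamChunk, void, unknown> {
      for (let i = 0; i < words.length; i++) {
        yield { content: words[i] ?? '', isComplete: i === words.length - 1 };
      }
    }
    return { ok: true, value: chunks() };
  }

  async listModels(): Promise<Result<string[], Error>> {
    return { ok: true, value: ['echo'] };
  }
}
```

Code written against `LlmGateway` rather than a concrete class can then be exercised without a running server.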

OllamaGateway

Gateway for a local Ollama LLM server.

Constructor

typescript

class OllamaGateway implements LlmGateway {
  constructor(baseUrl?: string)
}

Parameters:

baseUrl: Ollama server URL (default: 'http://localhost:11434')

Example:

typescript

import { OllamaGateway } from 'mojentic';

// Default local server
const gateway = new OllamaGateway();

// Custom URL
const remoteGateway = new OllamaGateway('http://192.168.1.100:11434');

generate

typescript

async generate(
  messages: LlmMessage[],
  modelId: string,
  config?: CompletionConfig
): Promise<Result<GatewayResponse, Error>>

Generate a complete response.

Parameters:

messages: Conversation messages
modelId: Ollama model name (e.g., 'qwen3:32b', 'llama2', 'mistral')
config: Optional configuration

Returns:

Result<GatewayResponse, Error>: Ok with response or Err with error

Example:

typescript

const messages = [
  Message.user('Hello!')
];

const result = await gateway.generate(messages, 'qwen3:32b');

if (isOk(result)) {
  const response = result.value;
  console.log(response.content);
  console.log('Tokens used:', response.usage?.totalTokens);
}

With Configuration:

typescript

const result = await gateway.generate(messages, 'qwen3:32b', {
  temperature: 0.8,
  maxTokens: 1000
});

generateStream

typescript

async generateStream(
  messages: LlmMessage[],
  modelId: string,
  config?: CompletionConfig
): Promise<Result<AsyncGenerator<StreamChunk, void, unknown>, Error>>

Generate a streaming response.

Parameters:

messages: Conversation messages
modelId: Ollama model name
config: Configuration; optional in the signature, but it must include stream: true

Returns:

Result<AsyncGenerator<StreamChunk>, Error>: Ok with async generator or Err with error

Example:

typescript

const result = await gateway.generateStream(
  messages,
  'qwen3:32b',
  { stream: true }
);

if (isOk(result)) {
  for await (const chunk of result.value) {
    process.stdout.write(chunk.content);

    if (chunk.isComplete) {
      console.log('\n---Done---');
    }
  }
}

listModels

typescript

async listModels(): Promise<Result<string[], Error>>

Get the list of available models.

Returns:

Result<string[], Error>: Ok with model names or Err with error

Example:

typescript

const result = await gateway.listModels();

if (isOk(result)) {
  console.log('Available models:');
  result.value.forEach(model => console.log(`  - ${model}`));
}

Supported Models

Ollama supports many models. Popular ones include:

Qwen: qwen3:32b, qwen3:14b, qwen3:8b
Llama: llama2, llama2:13b, llama2:70b
Mistral: mistral, mistral:7b
CodeLlama: codellama, codellama:13b
Phi: phi, phi3:medium

Check available models:

bash

ollama list

Pull new models:

bash

ollama pull qwen3:32b

Message Format

Mojentic to Ollama

Mojentic messages are converted to Ollama format:

typescript

// Mojentic
{
  role: MessageRole.User,
  content: "Hello!"
}

// Ollama API
{
  role: "user",
  content: "Hello!"
}

Tool Calls

Tool calls are converted to Ollama's format:

typescript

// Mojentic
{
  role: MessageRole.Assistant,
  content: "",
  toolCalls: [{
    id: "call_1",
    type: "function",
    function: {
      name: "get_weather",
      arguments: '{"location": "Paris"}'
    }
  }]
}

// Ollama API
{
  role: "assistant",
  content: "",
  tool_calls: [{
    id: "call_1",
    type: "function",
    function: {
      name: "get_weather",
      arguments: {"location": "Paris"}
    }
  }]
}
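
The visible difference between the two shapes is that Mojentic carries `arguments` as a JSON string while Ollama expects a parsed object (and `toolCalls` becomes `tool_calls`). A minimal sketch of that conversion follows. The type names and the `toOllamaToolCall` helper are hypothetical and only mirror the shapes shown above; the gateway's internal conversion may differ.

```typescript
// Local stand-in types (assumptions) mirroring the shapes shown above.
interface MojenticToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string }; // JSON-encoded string
}

interface OllamaToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: Record<string, unknown> }; // parsed object
}

// Hypothetical helper: parse the JSON argument string into an object.
function toOllamaToolCall(call: MojenticToolCall): OllamaToolCall {
  // An empty argument string is treated as "no arguments".
  const parsed: unknown = JSON.parse(call.function.arguments || '{}');
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error(`Tool call ${call.id}: arguments must be a JSON object`);
  }
  return {
    id: call.id,
    type: call.type,
    function: {
      name: call.function.name,
      arguments: parsed as Record<string, unknown>,
    },
  };
}

const converted = toOllamaToolCall({
  id: 'call_1',
  type: 'function',
  function: { name: 'get_weather', arguments: '{"location": "Paris"}' },
});
console.log(converted.function.arguments); // { location: 'Paris' }
```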

Configuration Options

Ollama-Specific

typescript

interface OllamaConfig extends CompletionConfig {
  temperature?: number;      // 0.0-2.0
  maxTokens?: number;        // Cross-provider max tokens to generate
  numPredict?: number;       // Ollama-specific max tokens (takes precedence)
  topP?: number;             // 0.0-1.0, nucleus sampling
  topK?: number;             // Top-K sampling (limits token choices)
  numCtx?: number;           // Context window size in tokens
  frequencyPenalty?: number; // Penalty for token frequency
  presencePenalty?: number;  // Penalty for token presence
  stop?: string[];           // Stop sequences
  stream?: boolean;          // Enable streaming
  responseFormat?: {         // Structured output
    type: 'json_object' | 'text';
    schema?: Record<string, unknown>;
  };
}

Example:

typescript

const config = {
  temperature: 0.7,
  numPredict: 2000,  // Ollama-specific, preferred over maxTokens
  topP: 0.9,
  topK: 40,
  numCtx: 8192,      // Context window size
  stop: ['END', 'STOP']
};

const result = await gateway.generate(messages, 'qwen3:32b', config);
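
To make the `numPredict` over `maxTokens` precedence concrete, here is a sketch of how camelCase config fields might be mapped to the snake_case option names used by the Ollama REST API (`num_predict`, `top_p`, `top_k`, `num_ctx`). `OllamaConfigSketch` and `toOllamaOptions` are hypothetical names covering only a subset of the fields; the gateway's actual mapping is not shown in this reference.

```typescript
// Subset of the config fields documented above (local stand-in type).
interface OllamaConfigSketch {
  temperature?: number;
  maxTokens?: number;
  numPredict?: number;
  topP?: number;
  topK?: number;
  numCtx?: number;
  stop?: string[];
}

// Hypothetical mapping to Ollama's snake_case request options.
function toOllamaOptions(config: OllamaConfigSketch): Record<string, unknown> {
  const candidates: Record<string, unknown> = {
    temperature: config.temperature,
    // numPredict wins when both it and maxTokens are set.
    num_predict: config.numPredict ?? config.maxTokens,
    top_p: config.topP,
    top_k: config.topK,
    num_ctx: config.numCtx,
    stop: config.stop,
  };

  // Drop unset fields so the server can fall back to its own defaults.
  const options: Record<string, unknown> = {};
  for (const key of Object.keys(candidates)) {
    if (candidates[key] !== undefined) {
      options[key] = candidates[key];
    }
  }
  return options;
}

console.log(toOllamaOptions({ maxTokens: 1000, numPredict: 2000, topP: 0.9 }));
// { num_predict: 2000, top_p: 0.9 }
```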

Temperature Guide

0.0-0.3: Focused, deterministic
0.4-0.7: Balanced
0.8-1.2: Creative
1.3-2.0: Very creative, less coherent
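
The bands above can be expressed as a small lookup, for example to log which regime a request is using. This is a sketch only: `temperatureBand` is a hypothetical helper, and values that fall between two listed ranges (such as 0.35) are assigned to the next band up by assumption.

```typescript
type TemperatureBand = 'focused' | 'balanced' | 'creative' | 'very creative';

// Hypothetical helper: classify a temperature by the upper bound of each band.
function temperatureBand(temperature: number): TemperatureBand {
  // The negated range check also rejects NaN.
  if (!(temperature >= 0 && temperature <= 2)) {
    throw new RangeError(`temperature ${temperature} is outside 0.0-2.0`);
  }
  if (temperature <= 0.3) return 'focused';
  if (temperature <= 0.7) return 'balanced';
  if (temperature <= 1.2) return 'creative';
  return 'very creative';
}

console.log(temperatureBand(0.2)); // focused
console.log(temperatureBand(0.7)); // balanced
console.log(temperatureBand(1.5)); // very creative
```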

Error Handling

Gateway Errors

typescript

const result = await gateway.generate(messages, 'qwen3:32b');

if (isErr(result)) {
  const error = result.error;

  if (error instanceof GatewayError) {
    console.error('Ollama error:', error.message);
    console.error('Status code:', error.statusCode);

    if (error.statusCode === 404) {
      console.error('Model not found. Install with: ollama pull qwen3:32b');
    } else if (error.statusCode === 503) {
      console.error('Ollama server not responding');
    }
  }
}

Connection Errors

typescript

try {
  const result = await gateway.generate(messages, 'qwen3:32b');

  if (isErr(result)) {
    if (result.error.message.includes('ECONNREFUSED')) {
      console.error('Cannot connect to Ollama. Is it running?');
      console.error('Start with: ollama serve');
    }
  }
} catch (error) {
  console.error('Unexpected error:', error);
}
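
The status-code and connection checks above can be folded into one helper so that call sites stay short. The sketch below assumes only what the examples show: a `statusCode` property on gateway errors and `ECONNREFUSED` appearing in the message of a connection failure. `describeFailure` is a hypothetical name, not part of Mojentic.

```typescript
// Hypothetical helper combining both checks. `statusCode` is read defensively
// because only GatewayError instances are documented as carrying it.
function describeFailure(
  error: Error & { statusCode?: number },
  modelId: string
): string {
  if (error.message.includes('ECONNREFUSED')) {
    return 'Cannot connect to Ollama. Start it with: ollama serve';
  }
  if (error.statusCode === 404) {
    return `Model not found. Install with: ollama pull ${modelId}`;
  }
  if (error.statusCode === 503) {
    return 'Ollama server not responding';
  }
  return `Unexpected error: ${error.message}`;
}

// Simulated errors for demonstration only.
const refused = new Error('connect ECONNREFUSED 127.0.0.1:11434');
const missing = Object.assign(new Error('not found'), { statusCode: 404 });

console.log(describeFailure(refused, 'qwen3:32b'));
console.log(describeFailure(missing, 'qwen3:32b'));
```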

Streaming Details

Chunk Structure

typescript

interface StreamChunk {
  content: string;        // Partial text
  isComplete: boolean;    // Is this the last chunk?
  toolCalls?: ToolCall[]; // Tool calls (only in final chunk)
  finishReason?: string;  // Why generation stopped
}

Processing Chunks

typescript

const result = await gateway.generateStream(messages, 'qwen3:32b', {
  stream: true
});

if (isOk(result)) {
  let fullResponse = '';

  for await (const chunk of result.value) {
    fullResponse += chunk.content;
    process.stdout.write(chunk.content);

    if (chunk.isComplete) {
      console.log('\n---Complete---');
      console.log('Full response length:', fullResponse.length);

      if (chunk.toolCalls) {
        console.log('Tool calls requested:', chunk.toolCalls.length);
      }
    }
  }
}
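
When the caller only needs the final result, the loop above can be wrapped in a helper that drains the stream into a single value. A minimal sketch, using a local stand-in for `StreamChunk` with a simplified tool-call shape and a fake stream in place of a real gateway; `collectStream` and `fakeStream` are hypothetical names.

```typescript
// Local stand-in for the StreamChunk shape documented above (tool calls simplified).
interface StreamChunk {
  content: string;
  isComplete: boolean;
  toolCalls?: { id: string }[];
  finishReason?: string;
}

interface CollectedStream {
  text: string;
  toolCalls: { id: string }[];
  finishReason?: string;
}

// Hypothetical helper: drain a chunk stream into one aggregate value.
async function collectStream(
  chunks: AsyncIterable<StreamChunk>
): Promise<CollectedStream> {
  const collected: CollectedStream = { text: '', toolCalls: [] };
  for await (const chunk of chunks) {
    collected.text += chunk.content;
    // Tool calls and the finish reason are read from the final chunk.
    if (chunk.isComplete) {
      collected.toolCalls = chunk.toolCalls ?? [];
      collected.finishReason = chunk.finishReason;
    }
  }
  return collected;
}

// Fake stream for demonstration only.
async function* fakeStream(): AsyncGenerator<StreamChunk, void, unknown> {
  yield { content: 'Hello, ', isComplete: false };
  yield { content: 'world!', isComplete: true, finishReason: 'stop' };
}

collectStream(fakeStream()).then((r) => console.log(r.text, r.finishReason));
// Hello, world! stop
```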

Stream Error Handling

typescript

const result = await gateway.generateStream(messages, 'qwen3:32b', {
  stream: true
});

if (isErr(result)) {
  console.error('Failed to start stream:', result.error);
  return;
}

try {
  for await (const chunk of result.value) {
    // Process chunk
  }
} catch (error) {
  console.error('Stream interrupted:', error);
}
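
If text received before an interruption is still useful, the `try`/`catch` can return it alongside the error instead of discarding it. The following sketch assumes a minimal chunk shape and simulates the failure with a throwing generator; `readWithPartial` and `flakyStream` are hypothetical names.

```typescript
interface Chunk { content: string } // minimal stand-in for StreamChunk

// Hypothetical helper: keep whatever text arrived before the stream failed.
async function readWithPartial(
  chunks: AsyncIterable<Chunk>
): Promise<{ text: string; error?: Error }> {
  let text = '';
  try {
    for await (const chunk of chunks) {
      text += chunk.content;
    }
    return { text };
  } catch (caught) {
    // Normalize non-Error throwables so callers always get an Error.
    const error = caught instanceof Error ? caught : new Error(String(caught));
    return { text, error };
  }
}

// Simulated stream that fails after one chunk.
async function* flakyStream(): AsyncGenerator<Chunk, void, unknown> {
  yield { content: 'partial ' };
  throw new Error('connection reset');
}

readWithPartial(flakyStream()).then(({ text, error }) => {
  console.log(JSON.stringify(text), error?.message);
});
```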

Best Practices

1. Connection Management

typescript

// Good: Reuse gateway instance
const gateway = new OllamaGateway();

async function chat1() {
  return await gateway.generate(messages, 'qwen3:32b');
}

async function chat2() {
  return await gateway.generate(messages, 'qwen3:32b');
}

2. Check Server Availability

typescript

async function ensureOllamaRunning(gateway: OllamaGateway): Promise<boolean> {
  const result = await gateway.listModels();
  return isOk(result);
}

if (!await ensureOllamaRunning(gateway)) {
  console.error('Ollama server not available');
  process.exit(1);
}

3. Model Validation

typescript

async function checkModel(
  gateway: OllamaGateway,
  modelId: string
): Promise<boolean> {
  const result = await gateway.listModels();

  if (isOk(result)) {
    return result.value.includes(modelId);
  }

  return false;
}

const modelExists = await checkModel(gateway, 'qwen3:32b');
if (!modelExists) {
  console.error('Model not installed. Run: ollama pull qwen3:32b');
}
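
One caveat with an exact `includes` check: Ollama treats an untagged name such as `llama2` as `llama2:latest`, and it reports installed models with their tag. Assuming `listModels` returns names as Ollama reports them, a comparison that makes the implicit tag explicit avoids false negatives. `withTag` and `hasModel` are hypothetical helpers.

```typescript
// Hypothetical helper: make the implicit ':latest' tag explicit.
function withTag(modelId: string): string {
  return modelId.includes(':') ? modelId : `${modelId}:latest`;
}

// Compare installed and requested names after normalizing both.
function hasModel(installed: string[], modelId: string): boolean {
  const wanted = withTag(modelId);
  return installed.some((name) => withTag(name) === wanted);
}

const installed = ['llama2:latest', 'qwen3:32b'];
console.log(hasModel(installed, 'llama2'));    // true
console.log(hasModel(installed, 'qwen3:32b')); // true
console.log(hasModel(installed, 'qwen3:14b')); // false
```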

4. Streaming for Long Responses

typescript

// Use streaming for better UX with long responses
const config = {
  stream: true,
  maxTokens: 4000
};

const result = await gateway.generateStream(messages, 'qwen3:32b', config);

Complete Example

typescript

import {
  OllamaGateway,
  Message,
  isOk,
  isErr,
  GatewayError
} from 'mojentic';

// Setup
const gateway = new OllamaGateway();

// Check server
const modelsResult = await gateway.listModels();
if (isErr(modelsResult)) {
  console.error('Cannot connect to Ollama server');
  process.exit(1);
}

console.log('Available models:', modelsResult.value);

// Check specific model
const modelId = 'qwen3:32b';
if (!modelsResult.value.includes(modelId)) {
  console.error(`Model ${modelId} not found`);
  console.error(`Install with: ollama pull ${modelId}`);
  process.exit(1);
}

// Generate
const messages = [
  Message.system('You are a helpful assistant'),
  Message.user('Explain async/await in TypeScript')
];

const config = {
  temperature: 0.7,
  maxTokens: 1000
};

const result = await gateway.generate(messages, modelId, config);

if (isOk(result)) {
  const response = result.value;
  console.log('Response:', response.content);
  console.log('\nUsage:');
  console.log('  Prompt tokens:', response.usage?.promptTokens);
  console.log('  Completion tokens:', response.usage?.completionTokens);
  console.log('  Total tokens:', response.usage?.totalTokens);
} else {
  const error = result.error;

  if (error instanceof GatewayError) {
    console.error('Gateway error:', error.message);
    console.error('Status:', error.statusCode);
  } else {
    console.error('Error:', error.message);
  }
}

Future Gateways

Planned gateway implementations:

OpenAI: ChatGPT models (GPT-4, GPT-3.5-turbo)
Anthropic: Claude models
Google: Gemini models
Groq: Fast inference

All gateways implement the same LlmGateway interface for consistency.
