OpenAI Model Registry

Overview

The OpenAI Model Registry is a centralized system for managing model-specific configurations, capabilities, and parameter requirements for OpenAI models. It provides a compile-time type-safe registry that informs the gateway about which parameters and features each model supports.

This registry is essential for:

  • Determining which models support tools, streaming, or vision capabilities
  • Selecting the correct token limit parameter (max_tokens vs max_completion_tokens)
  • Understanding temperature support for reasoning models
  • Tracking which API endpoints each model supports
  • Providing sensible defaults when working with unknown models

Model Types

The registry categorizes models into four distinct types:

Reasoning Models

Models that use internal reasoning steps before generating responses. These models use max_completion_tokens instead of max_tokens.

ModelType::Reasoning

Examples: o1, o1-mini, o3, o3-mini, gpt-5, gpt-5.1

Chat Models

Standard chat completion models that use max_tokens for output control.

ModelType::Chat

Examples: gpt-4o, gpt-4-turbo, gpt-3.5-turbo

Embedding Models

Models designed for generating text embeddings.

ModelType::Embedding

Examples: text-embedding-3-small, text-embedding-3-large

Moderation Models

Models for content moderation and safety classification.

ModelType::Moderation

Examples: omni-moderation-latest, text-moderation-latest
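The link between model type and token-limit parameter can be sketched as a single match. This is a minimal, self-contained illustration: the enum is a local stand-in for the registry's ModelType, and token_limit_param is a hypothetical helper, not the crate's API (the crate exposes get_token_limit_param on the capabilities struct instead). Returning None for embedding and moderation models is an assumption of this sketch.

```rust
// Local stand-in for the registry's ModelType, redefined so the sketch compiles alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelType {
    Reasoning,
    Chat,
    Embedding,
    Moderation,
}

/// Hypothetical helper (not part of the crate): names the output-limit
/// parameter for a model type. Returning None for embedding and moderation
/// models is a choice made by this sketch.
pub fn token_limit_param(model_type: ModelType) -> Option<&'static str> {
    match model_type {
        ModelType::Reasoning => Some("max_completion_tokens"),
        ModelType::Chat => Some("max_tokens"),
        ModelType::Embedding | ModelType::Moderation => None,
    }
}
```

Because the match is exhaustive, adding a fifth variant to the enum would fail to compile until the new case is handled.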

Model Capabilities

The ModelCapabilities struct defines what each model can do. Here are all the fields:

Core Capabilities

  • model_type: ModelType - The type of model (Reasoning, Chat, Embedding, or Moderation)
  • supports_tools: bool - Whether the model supports function/tool calling
  • supports_streaming: bool - Whether the model supports streaming responses
  • supports_vision: bool - Whether the model can process image inputs

Token Limits

  • max_context_tokens: Option<u32> - Maximum input context window size in tokens
  • max_output_tokens: Option<u32> - Maximum number of output tokens the model can generate

Temperature Support

  • supported_temperatures: Option<Vec<f32>> - Temperature constraints:
    • None - Accepts any temperature value (most chat models)
    • Some(vec![]) - Does not accept a temperature parameter at all (the parameter is not exposed to users)
    • Some(vec![1.0]) - Only accepts temperature 1.0 (reasoning models like o1, o3)
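The three states above can be read as one small decision function. The following is a sketch under stated assumptions, not the crate's implementation: supports_temperature here is a free function written for illustration, it takes the field borrowed as a slice, and the 1e-6 comparison tolerance is a choice made for this example.

```rust
/// Hypothetical free function mirroring the three states of
/// `supported_temperatures`. `supported` is the field borrowed as a slice,
/// for example via `caps.supported_temperatures.as_deref()`.
pub fn supports_temperature(supported: Option<&[f32]>, temperature: f32) -> bool {
    match supported {
        // None: no constraint, so any value is accepted.
        None => true,
        // Some(list): only listed values are accepted; an empty list accepts nothing.
        // The 1e-6 tolerance for comparing floats is an assumption of this sketch.
        Some(allowed) => allowed.iter().any(|t| (*t - temperature).abs() < 1e-6),
    }
}
```

Note that the empty list needs no special case: any over an empty iterator is false, which matches the "does not accept temperature" state.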

API Endpoint Support

  • supports_chat_api: bool - Supports /v1/chat/completions endpoint
  • supports_completions_api: bool - Supports /v1/completions endpoint
  • supports_responses_api: bool - Supports /v1/responses endpoint

Note: The OpenAI gateway currently calls only the Chat API. The endpoint flags are informational and are available for future gateway enhancements.

API Endpoint Categories

OpenAI models support different API endpoints based on their design and capabilities:

Chat API (Most Common)

The /v1/chat/completions endpoint is supported by most modern models:

  • All GPT-4 variants (except legacy completions models)
  • Reasoning models (o1, o3, o4-mini)
  • Base GPT-5 models

Completions API (Legacy)

The /v1/completions endpoint is supported by older models:

  • babbage-002
  • davinci-002
  • gpt-3.5-turbo-instruct

These models do not support the chat API format.

Both APIs

Some models support both chat and completions:

  • gpt-4o-mini
  • gpt-4.1-nano
  • gpt-5.1

Responses API (Newer Models)

The /v1/responses endpoint is supported by specialized models:

  • gpt-5-pro
  • codex-mini-latest

Using the Global Registry

The registry provides a global instance for convenience:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::{
    get_model_registry, ModelType
};

// Access the global registry
let registry = get_model_registry();

// Look up model capabilities
let caps = registry.get_model_capabilities("gpt-4o");
assert_eq!(caps.model_type, ModelType::Chat);
assert!(caps.supports_tools);
assert!(caps.supports_streaming);
assert!(caps.supports_vision);
}

Token Limit Parameters

Different model types use different parameter names for controlling output length:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;

let registry = get_model_registry();

// Chat models use "max_tokens"
let gpt4_caps = registry.get_model_capabilities("gpt-4o");
assert_eq!(gpt4_caps.get_token_limit_param(), "max_tokens");

// Reasoning models use "max_completion_tokens"
let o1_caps = registry.get_model_capabilities("o1");
assert_eq!(o1_caps.get_token_limit_param(), "max_completion_tokens");
}
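To show how the parameter name might be used when assembling a request, here is a minimal sketch. The function apply_token_limit and the map-based parameter set are hypothetical and exist only for this example; the gateway's real request-building code is not shown in this chapter. Clamping the requested value to max_output_tokens is likewise an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: stores the output limit under whichever parameter
/// name the registry reported ("max_tokens" or "max_completion_tokens").
/// Clamping to the model's max_output_tokens is an assumption of this sketch.
pub fn apply_token_limit(
    params: &mut BTreeMap<String, u32>,
    param_name: &str,
    requested: u32,
    max_output_tokens: Option<u32>,
) {
    let limit = match max_output_tokens {
        // A known ceiling caps the requested value.
        Some(cap) => requested.min(cap),
        // No ceiling recorded: pass the request through unchanged.
        None => requested,
    };
    params.insert(param_name.to_string(), limit);
}
```

A caller would pass the result of get_token_limit_param() as param_name, so only one of the two keys ever appears in the request.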

Temperature Support

Chat models accept any temperature, while reasoning models accept only 1.0:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;

let registry = get_model_registry();

// Chat models support arbitrary temperatures
let gpt4_caps = registry.get_model_capabilities("gpt-4o");
assert!(gpt4_caps.supports_temperature(0.7));
assert!(gpt4_caps.supports_temperature(1.5));

// Reasoning models only support temperature 1.0
let o1_caps = registry.get_model_capabilities("o1");
assert!(o1_caps.supports_temperature(1.0));
assert!(!o1_caps.supports_temperature(0.7));
}
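A caller that receives an unsupported temperature has to decide what to send instead. One possible policy is sketched below. The function resolve_temperature is hypothetical, and falling back to the first allowed value is an assumption of this example rather than documented gateway behavior.

```rust
/// Hypothetical policy: decide which temperature, if any, to send.
/// Returns None when the parameter should be omitted from the request.
pub fn resolve_temperature(supported: Option<&[f32]>, requested: f32) -> Option<f32> {
    match supported {
        // Unconstrained model: send what the caller asked for.
        None => Some(requested),
        // Empty list: the model takes no temperature parameter at all.
        Some([]) => None,
        Some(allowed) => {
            let is_allowed = allowed.iter().any(|t| (*t - requested).abs() < 1e-6);
            if is_allowed {
                Some(requested)
            } else {
                // Assumed fallback: substitute the first allowed value.
                // `allowed` is non-empty here because the empty case matched above.
                Some(allowed[0])
            }
        }
    }
}
```

With this policy, a request for 0.7 against a model that only allows 1.0 is sent with 1.0 rather than rejected. Returning an error instead would be an equally reasonable design.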

Checking Endpoint Support

You can check which API endpoints a model supports:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;

let registry = get_model_registry();

// Standard chat model - chat API only
let caps = registry.get_model_capabilities("gpt-4o");
assert!(caps.supports_chat_api);
assert!(!caps.supports_completions_api);
assert!(!caps.supports_responses_api);

// Dual-endpoint model
let mini_caps = registry.get_model_capabilities("gpt-4o-mini");
assert!(mini_caps.supports_chat_api);
assert!(mini_caps.supports_completions_api);
assert!(!mini_caps.supports_responses_api);

// Completions-only model
let instruct_caps = registry.get_model_capabilities("gpt-3.5-turbo-instruct");
assert!(!instruct_caps.supports_chat_api);
assert!(instruct_caps.supports_completions_api);
assert!(!instruct_caps.supports_responses_api);

// Responses-only model
let pro_caps = registry.get_model_capabilities("gpt-5-pro");
assert!(!pro_caps.supports_chat_api);
assert!(!pro_caps.supports_completions_api);
assert!(pro_caps.supports_responses_api);
}
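The three flags can also drive endpoint selection. The sketch below is illustrative only: EndpointSupport and preferred_endpoint are hypothetical names, and the preference order (chat, then responses, then completions) is an assumption. As noted earlier, the gateway itself currently calls only the Chat API.

```rust
/// Local stand-in carrying just the three endpoint flags.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EndpointSupport {
    pub chat: bool,
    pub completions: bool,
    pub responses: bool,
}

/// Hypothetical selector: returns the path of the first supported endpoint.
/// The order chat > responses > completions is an assumption of this sketch.
pub fn preferred_endpoint(support: EndpointSupport) -> Option<&'static str> {
    if support.chat {
        Some("/v1/chat/completions")
    } else if support.responses {
        Some("/v1/responses")
    } else if support.completions {
        Some("/v1/completions")
    } else {
        // No generation endpoint is flagged for this model.
        None
    }
}
```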

Creating Custom Registries

While the global registry is convenient, you can create custom registries for testing or specialized configurations:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::{
    OpenAIModelRegistry, ModelCapabilities, ModelType
};

// Create a new empty registry
let mut custom_registry = OpenAIModelRegistry::new();

// Register a custom model
custom_registry.register_model("custom-model", ModelCapabilities {
    model_type: ModelType::Chat,
    supports_tools: true,
    supports_streaming: true,
    supports_vision: false,
    max_context_tokens: Some(8192),
    max_output_tokens: Some(4096),
    supported_temperatures: None, // Accepts any temperature
    supports_chat_api: true,
    supports_completions_api: false,
    supports_responses_api: false,
});

// Use the custom registry
let caps = custom_registry.get_model_capabilities("custom-model");
assert_eq!(caps.model_type, ModelType::Chat);
assert!(caps.supports_tools);
}
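Custom registries are most useful in tests when the code under test receives the registry as an argument instead of reaching for the global one. The sketch below shows that pattern with local stand-in types. Registry, Caps, and tools_allowed are hypothetical and much smaller than the crate's OpenAIModelRegistry and ModelCapabilities, whose internals may differ.

```rust
use std::collections::HashMap;

/// Trimmed-down stand-in for ModelCapabilities (two fields only).
#[derive(Debug, Clone, PartialEq)]
pub struct Caps {
    pub supports_tools: bool,
    pub max_output_tokens: Option<u32>,
}

/// Hypothetical map-backed registry; the crate's own registry may be
/// implemented differently.
#[derive(Debug, Default)]
pub struct Registry {
    models: HashMap<String, Caps>,
}

impl Registry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Registering the same name again overwrites the earlier entry.
    pub fn register_model(&mut self, name: &str, caps: Caps) {
        self.models.insert(name.to_string(), caps);
    }

    pub fn get(&self, name: &str) -> Option<&Caps> {
        self.models.get(name)
    }
}

/// Code under test takes the registry by reference, so a test can pass its own.
/// Unregistered models are treated as not supporting tools in this sketch.
pub fn tools_allowed(registry: &Registry, model: &str) -> bool {
    registry.get(model).map_or(false, |caps| caps.supports_tools)
}
```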

Pattern Matching for Unknown Models

When a model is not explicitly registered, the registry infers capabilities from the model name:

#![allow(unused)]
fn main() {
use mojentic::llm::gateways::openai_model_registry::get_model_registry;

let registry = get_model_registry();

// Unknown model with "gpt-4" prefix inherits GPT-4 capabilities
let unknown_caps = registry.get_model_capabilities("gpt-4-future-model");
assert!(unknown_caps.supports_tools);
assert!(unknown_caps.supports_streaming);

// Unknown model with "o" prefix is treated as a reasoning model
let unknown_o_caps = registry.get_model_capabilities("o5-experimental");
assert_eq!(unknown_o_caps.get_token_limit_param(), "max_completion_tokens");
assert!(unknown_o_caps.supports_temperature(1.0));
assert!(!unknown_o_caps.supports_temperature(0.7));
}
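Name-based inference of this kind can be sketched as a chain of prefix and substring checks. The rules below are assumptions chosen to agree with the examples in this chapter; the registry's actual matching rules are not documented here and may differ. ModelType is redefined locally and infer_model_type is a hypothetical function.

```rust
// Local stand-in for the registry's ModelType.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelType {
    Reasoning,
    Chat,
    Embedding,
    Moderation,
}

/// Hypothetical inference rules, assumed for illustration only.
pub fn infer_model_type(name: &str) -> ModelType {
    let lower = name.to_ascii_lowercase();
    // "o" followed by a digit, e.g. "o1" or "o5-experimental". Requiring the
    // digit keeps names such as "omni-moderation-latest" out of this branch.
    let is_o_series = lower
        .strip_prefix('o')
        .and_then(|rest| rest.chars().next())
        .map_or(false, |c| c.is_ascii_digit());

    if is_o_series || lower.starts_with("gpt-5") {
        ModelType::Reasoning
    } else if lower.contains("embedding") {
        ModelType::Embedding
    } else if lower.contains("moderation") {
        ModelType::Moderation
    } else {
        // Assumed default: treat anything else as a chat model.
        ModelType::Chat
    }
}
```

Checking for a digit after the "o" is the design choice worth noting: a bare starts_with("o") test would misclassify moderation models whose names begin with "omni".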

Common Model Categories

Chat-Only Models

Most GPT-4, reasoning, and base GPT-5 models:

  • gpt-4o, gpt-4-turbo, gpt-4
  • o1, o1-mini, o3, o3-mini, o4-mini
  • gpt-5

Completions-Only Models

Legacy models that don’t support chat format:

  • babbage-002
  • davinci-002
  • gpt-3.5-turbo-instruct

Dual-Endpoint Models

Models supporting both chat and completions:

  • gpt-4o-mini
  • gpt-4.1-nano
  • gpt-5.1

Responses-Only Models

Specialized models available only through the newer Responses endpoint:

  • gpt-5-pro
  • codex-mini-latest

Summary

The OpenAI Model Registry provides:

  • Type-safe model capability definitions
  • Automatic parameter selection based on model type
  • Temperature validation for reasoning models
  • API endpoint support tracking
  • Pattern matching for unknown models
  • A global registry for convenience
  • Custom registries for specialized use cases

By consulting the registry, the OpenAI gateway selects the correct parameters and features for each model instead of hard-coding per-model rules.