Layer 1 - LLM Abstraction
This layer abstracts the function of an LLM so that you can think about prompting, output, and tool use in a way that does not tie you to a specific LLM, its calling conventions, or the quirks of its client library.
At this layer we have:
- LLMBroker: This is the main entry point to the layer. It leverages an LLM-specific gateway and is the primary interface for interacting with the LLM on the other side. The LLMBroker handles text generation, structured output, and tool use.
- ChatSession: This is a simple class that wraps the LLMBroker and provides a conversational interface to the LLM with context-size management. It is a good starting point for building a chatbot.
- OllamaGateway, OpenAIGateway: These are out-of-the-box adapters that interact with models available through Ollama and OpenAI.
- LLMGateway: This is the abstract class that all LLM adapters must inherit from. It provides a common interface and isolation point for interacting with LLMs.
- MessageBuilder: This is a utility class for constructing messages with text, images, and file contents using a fluent interface. A quick-start sketch follows this list.
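To make the pieces concrete, here is a minimal quick-start sketch that wires a ChatSession to an LLMBroker backed by the default local Ollama gateway. The model name is an assumption; substitute any model you have pulled locally.

from mojentic.llm import LLMBroker, ChatSession

# The broker defaults to a local Ollama gateway when none is provided
llm = LLMBroker(model="qwen2.5:7b")  # assumed model name

# ChatSession wraps the broker and manages conversational context
session = ChatSession(llm)
print(session.send("In one sentence, what is an LLM gateway?"))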
Architecture Overview
The following diagram illustrates how the key classes in Layer 1 relate to each other:
classDiagram
    class LLMBroker {
        +model: str
        +adapter: LLMGateway
        +tokenizer: TokenizerGateway
        +tracer: TracerSystem
        +generate(messages, tools, temperature) str
        +generate_object(messages, object_model) BaseModel
    }
    class ChatSession {
        +broker: LLMBroker
        +messages: List[LLMMessage]
        +chat(message) str
        +clear_history()
    }
    class LLMGateway {
        <<abstract>>
        +complete(model, messages, tools) LLMGatewayResponse
        +calculate_embeddings(text, model) List[float]
    }
    class OllamaGateway {
        +complete(model, messages, tools) LLMGatewayResponse
        +calculate_embeddings(text, model) List[float]
    }
    class OpenAIGateway {
        +complete(model, messages, tools) LLMGatewayResponse
        +calculate_embeddings(text, model) List[float]
    }
    class TokenizerGateway {
        +encode(text) List
        +decode(tokens) str
    }
    class TracerSystem {
        <<interface>>
        +record_llm_call()
        +record_llm_response()
        +record_tool_call()
    }
    class LLMGatewayResponse {
        +content: str
        +tool_calls: List[LLMToolCall]
        +object: BaseModel
    }
    class LLMMessage {
        +role: MessageRole
        +content: str
        +tool_calls: List[LLMToolCall]
    }
    class MessageBuilder {
        +add_text(text) MessageBuilder
        +add_image(path) MessageBuilder
        +add_file(path) MessageBuilder
        +build() LLMMessage
    }
    ChatSession --> LLMBroker : wraps
    LLMBroker --> LLMGateway : uses via adapter
    LLMBroker --> TokenizerGateway : uses
    LLMBroker --> TracerSystem : uses (optional)
    OllamaGateway --|> LLMGateway : extends
    OpenAIGateway --|> LLMGateway : extends
    LLMGateway --> LLMGatewayResponse : returns
    LLMBroker --> LLMMessage : sends/receives
    MessageBuilder --> LLMMessage : builds
    LLMGatewayResponse --> LLMMessage : contains
Working with Embeddings
Mojentic provides embeddings functionality through the calculate_embeddings method in both the OllamaGateway and OpenAIGateway classes. Embeddings are vector representations of text that capture semantic meaning, making them useful for similarity comparisons, clustering, and other NLP tasks.
Usage Example
from mojentic.llm.gateways import OllamaGateway, OpenAIGateway

# Initialize the gateways
ollama_gateway = OllamaGateway()
openai_gateway = OpenAIGateway(api_key="your-api-key")

# Calculate embeddings using Ollama
text = "This is a sample text for embeddings."
ollama_embeddings = ollama_gateway.calculate_embeddings(
    text=text,
    model="mxbai-embed-large"  # Default model for Ollama
)

# Calculate embeddings using OpenAI
openai_embeddings = openai_gateway.calculate_embeddings(
    text=text,
    model="text-embedding-3-large"  # Default model for OpenAI
)

# Use the embeddings for similarity comparison, clustering, etc.
print(f"Ollama embeddings dimension: {len(ollama_embeddings)}")
print(f"OpenAI embeddings dimension: {len(openai_embeddings)}")
Important Notes
- Available Models:
  - For Ollama: models like "mxbai-embed-large" (default) and "nomic-embed-text" are commonly used
  - For OpenAI: models like "text-embedding-3-large" (default), "text-embedding-3-small", and "text-embedding-ada-002" are available
- Embedding Dimensions:
  - Different models produce embeddings with different dimensions
  - Ollama's "mxbai-embed-large" typically produces 1024-dimensional embeddings
  - OpenAI's "text-embedding-3-large" typically produces 3072-dimensional embeddings
- Performance Considerations:
  - Embedding generation is generally faster and less resource-intensive than text generation
  - Local embedding models (via Ollama) may be more cost-effective for high-volume applications
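As a quick illustration of the similarity use case mentioned above, the following sketch compares two texts by the cosine similarity of their embeddings. It assumes a local Ollama server with the default "mxbai-embed-large" model available.

import math
from mojentic.llm.gateways import OllamaGateway

gateway = OllamaGateway()
a = gateway.calculate_embeddings(text="The cat sat on the mat.")
b = gateway.calculate_embeddings(text="A cat is sitting on a rug.")

# Cosine similarity = dot(a, b) / (||a|| * ||b||)
dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(y * y for y in b))
print(f"Cosine similarity: {dot / (norm_a * norm_b):.3f}")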
Working with Images
Mojentic supports sending images to LLMs using the MessageBuilder class. This allows you to perform image analysis, OCR, and other vision-based tasks with a clean, fluent interface.
Usage Example
from mojentic.llm import LLMBroker
from mojentic.llm import MessageBuilder
from pathlib import Path

# Initialize the LLM broker
llm = LLMBroker(model="gemma3:27b")  # Use an image-capable model

# Build a message with an image
message = MessageBuilder("Describe what you see in this image.") \
    .add_image(Path.cwd() / "images" / "example.jpg") \
    .build()

# Generate a response
result = llm.generate(messages=[message])
print(result)
Important Notes
- Image-Capable Models: You must use an image-capable model to process images. Not all models support image analysis.
  - For Ollama: models like "gemma3", "llava", and "bakllava" support image analysis
  - For OpenAI: models like "gpt-4-vision-preview" and "gpt-4o" support image analysis
- Image Formats: Supported image formats include JPEG, PNG, GIF, and WebP.
- Implementation Details:
  - The MessageBuilder handles the appropriate encoding of images for different LLM providers
  - For Ollama: images are passed as file paths (handled internally by MessageBuilder)
  - For OpenAI: images are base64-encoded and included in the message content (handled internally by MessageBuilder)
- Performance Considerations: Image analysis may require more tokens and processing time than text-only requests.
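The same fluent interface covers plain files via add_file, which is useful for code review or document summarization prompts. A minimal sketch, assuming a hypothetical file at docs/notes.txt and an assumed model name:

from pathlib import Path
from mojentic.llm import LLMBroker, MessageBuilder

llm = LLMBroker(model="qwen2.5:7b")  # assumed model name

# Attach the file's contents to the prompt
message = MessageBuilder("Summarize the following notes.") \
    .add_file(Path("docs") / "notes.txt") \
    .build()

print(llm.generate(messages=[message]))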
Building Blocks
mojentic.llm.LLMBroker
This class is responsible for managing interaction with a Large Language Model. It abstracts the user from the specific mechanics of the LLM and provides a common interface for generating responses.
Source code in src/mojentic/llm/llm_broker.py
__init__(model, gateway=None, tokenizer=None, tracer=None)
Create an instance of the LLMBroker.
Parameters:

Name | Type | Description | Default
---|---|---|---
model | str | The name of the model to use. | required
gateway | Optional[LLMGateway] | The gateway to use for communication with the LLM. If None, a gateway is created that will utilize a local Ollama server. | None
tokenizer | Optional[TokenizerGateway] | The gateway to use for tokenization, used to log approximate token counts for the LLM calls. If None, a tiktoken-based tokenizer is used. | None
tracer | Optional[TracerSystem] | Optional tracer system to record LLM calls and responses. | None
Source code in src/mojentic/llm/llm_broker.py
generate(messages, tools=None, temperature=1.0, num_ctx=32768, num_predict=-1, max_tokens=16384, correlation_id=None)
Generate a text response from the LLM.
Parameters:

Name | Type | Description | Default
---|---|---|---
messages | List[LLMMessage] | A list of messages to send to the LLM. | required
tools | List[Tool] | A list of tools to use with the LLM. If a tool call is requested, the tool will be called and the output will be included in the response. | None
temperature | float | The temperature to use for the response. Defaults to 1.0. | 1.0
num_ctx | int | The number of context tokens to use. Defaults to 32768. | 32768
num_predict | int | The number of tokens to predict. Defaults to no limit. | -1
correlation_id | str | UUID string that is copied from cause to effect for tracing events. | None

Returns:

Type | Description
---|---
str | The response from the LLM.
Source code in src/mojentic/llm/llm_broker.py
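A minimal sketch of a direct generate call, assuming a local Ollama server; the model name is an assumption:

from mojentic.llm import LLMBroker
from mojentic.llm.gateways.models import LLMMessage

llm = LLMBroker(model="qwen2.5:7b")  # assumed model name

# Messages default to the user role
answer = llm.generate(
    messages=[LLMMessage(content="Name three uses for embeddings.")],
    temperature=0.7)
print(answer)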
generate_object(messages, object_model, temperature=1.0, num_ctx=32768, num_predict=-1, max_tokens=16384, correlation_id=None)
Generate a structured response from the LLM and return it as an object.
Parameters:

Name | Type | Description | Default
---|---|---|---
messages | List[LLMMessage] | A list of messages to send to the LLM. | required
object_model | BaseModel | The class of the model to use for the structured response data. | required
temperature | float | The temperature to use for the response. Defaults to 1.0. | 1.0
num_ctx | int | The number of context tokens to use. Defaults to 32768. | 32768
num_predict | int | The number of tokens to predict. Defaults to no limit. | -1
correlation_id | str | UUID string that is copied from cause to effect for tracing events. | None

Returns:

Type | Description
---|---
BaseModel | An instance of the model class provided containing the structured response data.
Source code in src/mojentic/llm/llm_broker.py
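A minimal sketch of structured output with generate_object, assuming a local Ollama server; the model name and the CityInfo schema are illustrative:

from pydantic import BaseModel
from mojentic.llm import LLMBroker
from mojentic.llm.gateways.models import LLMMessage

class CityInfo(BaseModel):
    name: str
    country: str
    population: int

llm = LLMBroker(model="qwen2.5:7b")  # assumed model name
city = llm.generate_object(
    messages=[LLMMessage(content="Give me basic facts about Paris.")],
    object_model=CityInfo)
print(city.name, city.country, city.population)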
mojentic.llm.ChatSession
This class is responsible for managing the state of a conversation with the LLM.
Source code in src/mojentic/llm/chat_session.py
__init__(llm, system_prompt='You are a helpful assistant.', tools=None, max_context=32768, tokenizer_gateway=None, temperature=1.0)
Create an instance of the ChatSession.
Parameters:

Name | Type | Description | Default
---|---|---|---
llm | LLMBroker | The broker to use for generating responses. | required
system_prompt | str | The prompt to use for the system messages. Defaults to "You are a helpful assistant." | 'You are a helpful assistant.'
tools | List[LLMTool] | The tools you want to make available to the LLM. Defaults to None. | None
max_context | int | The maximum number of tokens to keep in the context. Defaults to 32768. | 32768
tokenizer_gateway | TokenizerGateway | The gateway to use for tokenization. If None, a tiktoken-based tokenizer is used. | None
temperature | float | The temperature to use for the response. Defaults to 1.0. | 1.0
Source code in src/mojentic/llm/chat_session.py
insert_message(message)
Add a message onto the end of the chat session. If the total token count exceeds the max context, the oldest messages are removed.
Parameters:

Name | Type | Description | Default
---|---|---|---
message | LLMMessage | The message to add to the chat session. | required
Source code in src/mojentic/llm/chat_session.py
send(query)
Send a query to the LLM and return the response. Also records the query and response in the ongoing chat session.
Parameters:

Name | Type | Description | Default
---|---|---|---
query | str | The query to send to the LLM. | required

Returns:

Type | Description
---|---
str | The response from the LLM.
Source code in src/mojentic/llm/chat_session.py
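A minimal ChatSession sketch showing the system prompt and context-size settings described above; the model name is an assumption:

from mojentic.llm import LLMBroker, ChatSession

llm = LLMBroker(model="qwen2.5:7b")  # assumed model name
session = ChatSession(
    llm,
    system_prompt="You are a concise technical assistant.",
    max_context=8192)

print(session.send("What is a tokenizer?"))
# History is retained and trimmed to max_context as the conversation grows
print(session.send("And why does context size matter?"))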
mojentic.llm.gateways.LLMGateway
This is an abstract class from which specific LLM gateways are derived.
To create a new gateway, inherit from this class and implement the complete method (a minimal sketch follows this class's method reference below).
Source code in src/mojentic/llm/gateways/llm_gateway.py
calculate_embeddings(text, model=None)
Calculate embeddings for the given text using the specified model.
Parameters:

Name | Type | Description | Default
---|---|---|---
text | str | The text to calculate embeddings for. | required
model | str | The name of the model to use for embeddings. Default value depends on the implementation. | None

Returns:

Type | Description
---|---
List[Any] | The embeddings for the text.
Source code in src/mojentic/llm/gateways/llm_gateway.py
complete(model, messages, object_model=None, tools=None, temperature=1.0, num_ctx=32768, max_tokens=16384, num_predict=-1)
Complete the LLM request.
Parameters:

Name | Type | Description | Default
---|---|---|---
model | str | The name of the model to use, as it appears in the provider's model list. | required
messages | List[LLMMessage] | A list of messages to send to the LLM. | required
object_model | Optional[BaseModel] | The model to use for validating the response. | None
tools | Optional[List[LLMTool]] | A list of tools to use with the LLM. If a tool call is requested, the tool will be called and the output will be included in the response. | None
temperature | float | The temperature to use for the response. Defaults to 1.0. | 1.0
num_ctx | int | The number of context tokens to use. Defaults to 32768. | 32768
max_tokens | int | The maximum number of tokens to generate. Defaults to 16384. | 16384
num_predict | int | The number of tokens to predict. Defaults to no limit. | -1

Returns:

Type | Description
---|---
LLMGatewayResponse | The response from the LLM service.
Source code in src/mojentic/llm/gateways/llm_gateway.py
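As noted above, a new adapter only needs to inherit from LLMGateway and implement complete. A minimal sketch of a custom gateway that echoes the last message back, useful as a test stub; the EchoGateway name is hypothetical:

from mojentic.llm.gateways.llm_gateway import LLMGateway
from mojentic.llm.gateways.models import LLMGatewayResponse

class EchoGateway(LLMGateway):  # hypothetical adapter, for illustration only
    def complete(self, model, messages, object_model=None, tools=None,
                 temperature=1.0, num_ctx=32768, max_tokens=16384, num_predict=-1):
        # A real adapter would call the provider's API here and adapt its reply
        return LLMGatewayResponse(content=messages[-1].content, tool_calls=[], object=None)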
mojentic.llm.gateways.OllamaGateway
Bases: LLMGateway
This class is a gateway to the Ollama LLM service.
Parameters:

Name | Type | Description | Default
---|---|---|---
host | str | The Ollama host to connect to. Defaults to "http://localhost:11434". | 'http://localhost:11434'
headers | dict | The headers to send with the request. Defaults to an empty dict. | {}
Source code in src/mojentic/llm/gateways/ollama.py
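A minimal construction sketch, pointing the gateway at a non-default Ollama host and handing it to an LLMBroker; the host address and model name are assumptions:

from mojentic.llm import LLMBroker
from mojentic.llm.gateways import OllamaGateway

gateway = OllamaGateway(host="http://192.168.1.50:11434")  # assumed host
llm = LLMBroker(model="qwen2.5:7b", gateway=gateway)  # assumed model name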
calculate_embeddings(text, model='mxbai-embed-large')
Calculate embeddings for the given text using the specified model.
Parameters:

Name | Type | Description | Default
---|---|---|---
text | str | The text to calculate embeddings for. | required
model | str | The name of the model to use for embeddings. Defaults to "mxbai-embed-large". | 'mxbai-embed-large'

Returns:

Type | Description
---|---
list | The embeddings for the text.
Source code in src/mojentic/llm/gateways/ollama.py
complete(**args)
Complete the LLM request by delegating to the Ollama service.
Keyword Arguments
model : str
    The name of the model to use, as appears in ollama list.
messages : List[LLMMessage]
    A list of messages to send to the LLM.
object_model : Optional[BaseModel]
    The model to use for validating the response.
tools : Optional[List[LLMTool]]
    A list of tools to use with the LLM. If a tool call is requested, the tool will be called and the output will be included in the response.
temperature : float, optional
    The temperature to use for the response. Defaults to 1.0.
num_ctx : int, optional
    The number of context tokens to use. Defaults to 32768.
max_tokens : int, optional
    The maximum number of tokens to generate. Defaults to 16384.
num_predict : int, optional
    The number of tokens to predict. Defaults to no limit.

Returns:

Type | Description
---|---
LLMGatewayResponse | The response from the Ollama service.
Source code in src/mojentic/llm/gateways/ollama.py
complete_stream(**args)
Stream the LLM response from Ollama service.
Keyword Arguments
model : str
    The name of the model to use, as appears in ollama list.
messages : List[LLMMessage]
    A list of messages to send to the LLM.
tools : Optional[List[LLMTool]]
    A list of tools to use with the LLM. If a tool call is requested, the tool will be called and the output will be included in the response.
temperature : float, optional
    The temperature to use for the response. Defaults to 1.0.
num_ctx : int, optional
    The number of context tokens to use. Defaults to 32768.
max_tokens : int, optional
    The maximum number of tokens to generate. Defaults to 16384.
num_predict : int, optional
    The number of tokens to predict. Defaults to no limit.

Returns:

Type | Description
---|---
Iterator[StreamingResponse] | An iterator of StreamingResponse objects containing response chunks.
Source code in src/mojentic/llm/gateways/ollama.py
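A minimal streaming sketch, assuming a local Ollama server; the model name is an assumption, and the chunk field name is an assumption about the StreamingResponse shape:

from mojentic.llm.gateways import OllamaGateway
from mojentic.llm.gateways.models import LLMMessage

gateway = OllamaGateway()
for chunk in gateway.complete_stream(
        model="qwen2.5:7b",  # assumed model name
        messages=[LLMMessage(content="Tell me a short story.")]):
    # Assumes StreamingResponse exposes the text of each chunk as .content
    print(chunk.content, end="", flush=True)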
get_available_models()
pull_model(model)
Pull the model from the Ollama service.
Parameters:

Name | Type | Description | Default
---|---|---|---
model | str | The name of the model to pull. | required
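A small sketch combining the two methods above to ensure a model is present before use; the exact shape of get_available_models' return value (a list of model names) is an assumption:

from mojentic.llm.gateways import OllamaGateway

gateway = OllamaGateway()
if "mxbai-embed-large" not in gateway.get_available_models():  # assumed return shape
    gateway.pull_model("mxbai-embed-large")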
mojentic.llm.gateways.OpenAIGateway
Bases: LLMGateway
This class is a gateway to the OpenAI LLM service.
Parameters:

Name | Type | Description | Default
---|---|---|---
api_key | str | The OpenAI API key to use. | required
Source code in src/mojentic/llm/gateways/openai.py
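A minimal sketch of routing an LLMBroker through OpenAI instead of the default Ollama gateway; reading the key from an environment variable is a common pattern, not a library requirement:

import os
from mojentic.llm import LLMBroker
from mojentic.llm.gateways import OpenAIGateway
from mojentic.llm.gateways.models import LLMMessage

gateway = OpenAIGateway(api_key=os.environ["OPENAI_API_KEY"])
llm = LLMBroker(model="gpt-4o", gateway=gateway)
print(llm.generate(messages=[LLMMessage(content="Say hello in five words.")]))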
calculate_embeddings(text, model='text-embedding-3-large')
Calculate embeddings for the given text using the specified OpenAI model.
Parameters:

Name | Type | Description | Default
---|---|---|---
text | str | The text to calculate embeddings for. | required
model | str | The name of the OpenAI embeddings model to use. Defaults to "text-embedding-3-large". | 'text-embedding-3-large'

Returns:

Type | Description
---|---
list | The embeddings for the text.
Source code in src/mojentic/llm/gateways/openai.py
complete(**kwargs)
Complete the LLM request by delegating to the OpenAI service.
Keyword Arguments
model : str
    The name of the model to use.
messages : List[LLMMessage]
    A list of messages to send to the LLM.
object_model : Optional[Type[BaseModel]]
    The model to use for validating the response.
tools : Optional[List[LLMTool]]
    A list of tools to use with the LLM. If a tool call is requested, the tool will be called and the output will be included in the response.
temperature : float, optional
    The temperature to use for the response. Defaults to 1.0.
num_ctx : int, optional
    The number of context tokens to use. Defaults to 32768.
max_tokens : int, optional
    The maximum number of tokens to generate. Defaults to 16384.
num_predict : int, optional
    The number of tokens to predict. Defaults to no limit.

Returns:

Type | Description
---|---
LLMGatewayResponse | The response from the OpenAI service.
Source code in src/mojentic/llm/gateways/openai.py
mojentic.llm.gateways.models.LLMMessage
Bases: BaseModel
A message to be sent to the LLM. These would accumulate during a chat session with an LLM.
Parameters:

Name | Type | Description | Default
---|---|---|---
role | MessageRole |  | <MessageRole.User: 'user'>
content | str or None |  | None
object | BaseModel or None |  | None
tool_calls | List[LLMToolCall] or None |  | None
image_paths | List[str] or None |  | None
Source code in src/mojentic/llm/gateways/models.py
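A short sketch of building messages directly with this model; the availability of a MessageRole.System member alongside the User role shown above is an assumption:

from mojentic.llm.gateways.models import LLMMessage, MessageRole

messages = [
    LLMMessage(role=MessageRole.System, content="You are terse."),  # assumes a System role member
    LLMMessage(content="Define 'gateway' in one line."),  # role defaults to MessageRole.User
]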
mojentic.llm.gateways.models.LLMToolCall
Bases: BaseModel
A tool call to be made available to the LLM.
Parameters:

Name | Type | Description | Default
---|---|---|---
id | str or None |  | None
name | str |  | required
arguments | dict[str, str] |  | required
Source code in src/mojentic/llm/gateways/models.py
mojentic.llm.gateways.models.LLMGatewayResponse
Bases: BaseModel
The response from the LLM gateway, abstracting you from the quirks of a specific LLM.
Parameters:

Name | Type | Description | Default
---|---|---|---
content | str, dict[str, str], or None | The content of the response. | None
object | BaseModel or None | Parsed response object. | None
tool_calls | List[LLMToolCall] | List of requested tool calls from the LLM. | <dynamic>
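To close the loop, a sketch of inspecting a gateway response's fields when calling a gateway directly; the model name is an assumption:

from mojentic.llm.gateways import OllamaGateway
from mojentic.llm.gateways.models import LLMMessage

gateway = OllamaGateway()
response = gateway.complete(
    model="qwen2.5:7b",  # assumed model name
    messages=[LLMMessage(content="What's 2 + 2?")])

print(response.content)           # plain text reply
for call in response.tool_calls:  # empty unless the LLM requested tools
    print(call.name, call.arguments)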