TokenizerGateway

Struct TokenizerGateway 

pub struct TokenizerGateway { /* private fields */ }

Gateway for tokenizing and detokenizing text using tiktoken.

The tokenizer gateway encodes text into token IDs and decodes token IDs back into text. Model context windows are measured in tokens, so this is the basis for tracking token usage and keeping prompts within a context window.

§Examples

use mojentic::llm::gateways::TokenizerGateway;

let tokenizer = TokenizerGateway::new("cl100k_base").unwrap();
let text = "Hello, world!";
let tokens = tokenizer.encode(text);
let decoded = tokenizer.decode(&tokens);
assert_eq!(text, decoded);

Implementations§

impl TokenizerGateway

pub fn new(model: &str) -> Result<Self, Box<dyn Error>>

Creates a new TokenizerGateway for the named tiktoken encoding.

§Arguments
  • model - The name of the encoding to use. Common options:
    • “cl100k_base” - Used by GPT-4 and GPT-3.5-turbo (default)
    • “p50k_base” - Used by older GPT-3 models
    • “r50k_base” - Used by even older models
§Errors

Returns an error if the specified encoding is not available.

§Examples
use mojentic::llm::gateways::TokenizerGateway;

let tokenizer = TokenizerGateway::new("cl100k_base").unwrap();
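
Outside of examples, it is usually preferable to handle the returned Result rather than call unwrap. The sketch below shows one way to fall back to “cl100k_base” when a requested encoding is rejected. So that it compiles without the crate, it uses a hypothetical stand-in, StubGateway, whose constructor has the same shape as TokenizerGateway::new. The stub's list of accepted names and the helper gateway_or_fallback are assumptions made for illustration, not part of the crate.

```rust
use std::error::Error;

// Hypothetical stand-in with the same constructor shape as
// `TokenizerGateway::new`; it is not the real implementation.
#[derive(Debug)]
struct StubGateway {
    encoding: String,
}

impl StubGateway {
    fn new(model: &str) -> Result<Self, Box<dyn Error>> {
        match model {
            // Assumption: the stub accepts only the three names listed above.
            "cl100k_base" | "p50k_base" | "r50k_base" => Ok(Self {
                encoding: model.to_string(),
            }),
            other => Err(format!("unknown encoding: {other}").into()),
        }
    }
}

// Try the requested encoding; if it is rejected, log the error and
// fall back to "cl100k_base".
fn gateway_or_fallback(requested: &str) -> Result<StubGateway, Box<dyn Error>> {
    match StubGateway::new(requested) {
        Ok(gateway) => Ok(gateway),
        Err(err) => {
            eprintln!("{err}; falling back to cl100k_base");
            StubGateway::new("cl100k_base")
        }
    }
}
```

With the real type, the same pattern would apply by replacing StubGateway::new with TokenizerGateway::new.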

pub fn encode(&self, text: &str) -> Vec<usize>

Encodes text into tokens.

§Arguments
  • text - The text to encode
§Returns

A vector of token IDs representing the encoded text.

§Examples
use mojentic::llm::gateways::TokenizerGateway;

let tokenizer = TokenizerGateway::default();
let tokens = tokenizer.encode("Hello, world!");
println!("Token count: {}", tokens.len());
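
A common use of the encoded token vector is splitting long text into pieces that each fit a token limit. The following is a minimal sketch of that idea, not code from the crate. The Tokenizer trait mirrors the encode and decode signatures documented here, and ByteTokenizer is a toy stand-in (one token per byte) so the block runs on its own. Both names, and chunk_by_tokens, are hypothetical.

```rust
// Minimal trait mirroring the documented `encode`/`decode` signatures.
trait Tokenizer {
    fn encode(&self, text: &str) -> Vec<usize>;
    fn decode(&self, tokens: &[usize]) -> String;
}

// Toy stand-in (an assumption for this sketch): one token per byte.
struct ByteTokenizer;

impl Tokenizer for ByteTokenizer {
    fn encode(&self, text: &str) -> Vec<usize> {
        text.bytes().map(|b| b as usize).collect()
    }

    fn decode(&self, tokens: &[usize]) -> String {
        let bytes: Vec<u8> = tokens.iter().map(|&t| t as u8).collect();
        String::from_utf8_lossy(&bytes).into_owned()
    }
}

// Split `text` into pieces of at most `max_tokens` tokens each by
// encoding once, slicing the token vector, and decoding each slice.
fn chunk_by_tokens<T: Tokenizer>(tokenizer: &T, text: &str, max_tokens: usize) -> Vec<String> {
    assert!(max_tokens > 0, "max_tokens must be positive");
    let tokens = tokenizer.encode(text);
    tokens
        .chunks(max_tokens)
        .map(|chunk| tokenizer.decode(chunk))
        .collect()
}
```

Cutting at arbitrary token boundaries can split a multi-byte character between two chunks, so a production version would likely prefer to break on sentence or paragraph boundaries and use token counts only as the size check.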

pub fn decode(&self, tokens: &[usize]) -> String

Decodes tokens back into text.

§Arguments
  • tokens - The slice of token IDs to decode
§Returns

The decoded text.

§Examples
use mojentic::llm::gateways::TokenizerGateway;

let tokenizer = TokenizerGateway::default();
let tokens = vec![9906, 11, 1917, 0];
let text = tokenizer.decode(&tokens);
println!("Decoded: {}", text);
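
Combining encode and decode gives a simple way to truncate text to a token limit: encode, keep a prefix of the tokens, and decode that prefix. The sketch below illustrates this under stated assumptions. The Tokenizer trait copies the signatures documented on this page, ByteTokenizer is a toy one-token-per-byte stand-in so the block is self-contained, and truncate_to_tokens is a hypothetical helper rather than a crate function.

```rust
// Minimal trait mirroring the documented `encode`/`decode` signatures.
trait Tokenizer {
    fn encode(&self, text: &str) -> Vec<usize>;
    fn decode(&self, tokens: &[usize]) -> String;
}

// Toy stand-in (an assumption for this sketch): one token per byte.
struct ByteTokenizer;

impl Tokenizer for ByteTokenizer {
    fn encode(&self, text: &str) -> Vec<usize> {
        text.bytes().map(|b| b as usize).collect()
    }

    fn decode(&self, tokens: &[usize]) -> String {
        let bytes: Vec<u8> = tokens.iter().map(|&t| t as u8).collect();
        String::from_utf8_lossy(&bytes).into_owned()
    }
}

// Return `text` unchanged if it fits in `max_tokens`; otherwise decode
// only the first `max_tokens` tokens.
fn truncate_to_tokens<T: Tokenizer>(tokenizer: &T, text: &str, max_tokens: usize) -> String {
    let tokens = tokenizer.encode(text);
    if tokens.len() <= max_tokens {
        return text.to_string();
    }
    tokenizer.decode(&tokens[..max_tokens])
}
```

The early return avoids a decode round trip when the text already fits, which also sidesteps any question of whether decoding reproduces the input exactly.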

pub fn count_tokens(&self, text: &str) -> usize

Counts the number of tokens in a text string.

This is a convenience method for when only the count is needed: it returns the number of tokens the text encodes to, so the caller does not have to keep the token vector.

§Arguments
  • text - The text to count tokens for
§Returns

The number of tokens in the text.

§Examples
use mojentic::llm::gateways::TokenizerGateway;

let tokenizer = TokenizerGateway::default();
let count = tokenizer.count_tokens("Hello, world!");
println!("Token count: {}", count);
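
Token counts are typically used to keep a conversation within a context window. The sketch below keeps the most recent messages whose combined count fits a budget. It takes the counting function as a parameter so it stays self-contained. fit_recent and the word-based word_count are hypothetical names used only here, and word_count is a crude placeholder rather than a real tokenizer.

```rust
// Keep the newest messages whose total token count fits within `budget`,
// returning them in their original order.
fn fit_recent<'a, F>(messages: &[&'a str], budget: usize, count_tokens: F) -> Vec<&'a str>
where
    F: Fn(&str) -> usize,
{
    let mut used = 0;
    let mut kept = Vec::new();
    // Walk from newest to oldest and stop at the first message that does not fit.
    for &message in messages.iter().rev() {
        let cost = count_tokens(message);
        if used + cost > budget {
            break;
        }
        used += cost;
        kept.push(message);
    }
    kept.reverse();
    kept
}

// Toy counter (an assumption for this sketch): one token per
// whitespace-separated word.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}
```

With the real gateway, the counter argument could be a closure such as `|m| tokenizer.count_tokens(m)`, assuming a `tokenizer` value is in scope.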

Trait Implementations§

impl Default for TokenizerGateway

fn default() -> Self

Returns the “default value” for a type.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Creates a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Creates a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.