Explore ChatBotKit's diverse range of conversational AI models from multiple providers, with details on token costs and custom model settings.

ChatBotKit supports a wide range of models to create engaging conversational AI experiences. These include models from OpenAI, Anthropic, Google, Mistral, Perplexity, and other providers, along with ChatBotKit's own in-house models. We regularly add new models as they become available.

For the complete, up-to-date list of supported models - including descriptions, token ratios, and pricing details - visit the platform models page. You can also retrieve model information programmatically via the API.

Understanding Token Costs

Most models have separate input token ratio and output token ratio values that reflect the different costs of processing input versus generating output. The per-model input and output ratios are available on the platform models page and via the API.

The formula for calculating CBK token consumption is:

CBK Tokens = (inputTokens x inputTokenRatio) + (outputTokens x outputTokenRatio)

For example, if a model has an input token ratio of 0.0893 and an output token ratio of 0.5556, and a request consumes 400 input tokens and 600 output tokens:

CBK Tokens = (400 x 0.0893) + (600 x 0.5556) = 35.72 + 333.36 = 369.08
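The calculation described above can be sketched in a few lines of Python (the helper name is illustrative; ChatBotKit performs this computation server-side from the provider-reported token counts):

```python
def cbk_tokens(input_tokens, output_tokens, input_ratio, output_ratio):
    """Compute CBK token consumption from raw provider token counts."""
    return input_tokens * input_ratio + output_tokens * output_ratio

# Worked example: 400 input tokens and 600 output tokens
total = cbk_tokens(400, 600, input_ratio=0.0893, output_ratio=0.5556)
print(round(total, 2))  # 369.08
```

The ratios used here are example values; look up the actual ratios for your chosen model on the platform models page or via the API.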

A few important details about how token usage is recorded:

  • Upstream provider usage: ChatBotKit records the actual token counts reported by the upstream provider (e.g. OpenAI, Anthropic). If the provider reports 400 input tokens, that is what gets recorded and billed.
  • Cached tokens are not charged: When a provider supports prompt caching, only non-cached input tokens are counted. You are not charged for cached tokens.
  • Usage log breakdown: Each usage record contains a detailed line-item breakdown showing input tokens, output tokens, and other components. Adding up all line items will match the total recorded consumption.

The input and output token ratios are derived from the market price of each model relative to one base token. A higher ratio corresponds to a more expensive model. You can retrieve the exact ratios for every model via the API or on the platform models page.

The context size refers to the maximum number of tokens (word-like units of text) the model can consider when generating a response. A larger context size allows more information to be taken into account, potentially leading to more accurate and relevant responses.

When choosing a model, it's essential to evaluate not just its capabilities, but also its cost and context size. Larger and more expensive models aren't always the best choice for every task. Often, a smaller model can perform equally well or even better for your specific use case. Consider starting with a cost-efficient model and scaling up only if needed.

FAQ

Can I get regional access to some models?

Yes. Some models such as Claude can be accessed within your own designated region. Please contact us for more information.

Can I bring my own model?

Yes. Our built-in models are designed to scale no matter the circumstances; however, customers who wish to bring their own model can do so on some of our higher-tier plans, such as Pro, Pro Plus, and Team.

How is token usage calculated?

Each model has an input token ratio and an output token ratio. When a request completes, ChatBotKit records the actual token counts reported by the upstream provider and applies the formula:

CBK Tokens = (inputTokens x inputTokenRatio) + (outputTokens x outputTokenRatio)

Cached tokens are excluded from the input count, so you are only charged for tokens the provider actually processes. The usage log for each conversation provides a detailed line-item breakdown of all token consumption. You can find the exact input and output ratios for every model on the platform models page or via the API. Other factors such as the number of datasets, skillsets, and their types may also affect overall usage.

Bring Your Own Model

ChatBotKit lets you bring your own model and keys to the platform. This feature is designed for those who want more control over their models and costs. If you have a model that you've trained and refined for your specific use case, you can bring it to our platform and use your own keys, paying for model usage directly with the provider. This is especially useful if you have particular budget constraints or cost strategies. In short, you're not limited to our pre-built models: you can also introduce custom models, giving you more flexibility and control to meet your specific needs.

Here is an outline of the steps required to create your own custom model.

  1. Navigate to the Bot Configuration Screen
    • From the main dashboard, click on the "Bots" section in the left-hand menu.
    • Select the bot you want to configure or create a new bot.
  2. Choose the Model
    • Under the "Model" section, select "custom" from the dropdown menu.
    • Press the "Settings" button.
  3. Model Configuration Window
    • Enter the name of the model in the "Name" field (e.g. the model identifier from the provider).
    • Choose the provider from the "Provider" dropdown menu (e.g. OpenAI, Anthropic, etc.).
    • Provide the necessary credentials for accessing the custom model. Click on the credentials field and enter the required information.
    • Define the maximum number of tokens the chatbot will use for each interaction in the "Max Tokens" field.
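Conceptually, the values entered in this window form a small configuration record. The sketch below is illustrative only; the field names are assumptions and do not reflect the exact ChatBotKit API schema:

```python
# Illustrative custom model configuration. Field names and values are
# assumptions for demonstration, not the exact ChatBotKit API schema.
custom_model = {
    "name": "gpt-4o-mini",   # model identifier from the provider
    "provider": "openai",    # e.g. "openai", "anthropic"
    "credentials": "sk-...", # your own provider API key (BYOK)
    "maxTokens": 4096,       # per-interaction token budget
}

def validate(config):
    """Minimal sanity check: all four fields must be present."""
    required = {"name", "provider", "credentials", "maxTokens"}
    missing = required - config.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return True
```

All four fields mirror the dashboard inputs above: Name, Provider, Credentials, and Max Tokens.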

BYOK Caveats

When you opt to use your own key (BYOK) for model access, you assume full responsibility for the model's availability and operational limits. This shift occurs because you are no longer utilizing the default ChatBotKit service tiers, which may offer different capabilities and restrictions.

Customizing Model Settings

To customize a model, click the settings icon next to the selected model.

Core options

  • Max Tokens (context window): Maximum tokens available to each interaction. Lower values reduce cost and context depth. Higher values preserve more context.
  • Temperature: Controls randomness. Lower values are more deterministic. Higher values are more creative.
  • Interaction Max Messages: Maximum number of messages sent to the model for each interaction.
  • Threshold Strategy: Controls history reduction when thresholds are reached.
    • Truncate keeps the latest conversation turns.
    • Compact summarizes prior turns into checkpoint-style context.
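The two threshold strategies can be sketched as follows. This is a simplified illustration: in practice the compacted checkpoint is a model-generated summary, stubbed out here:

```python
def truncate(history, max_messages):
    """Truncate strategy: keep only the latest conversation turns."""
    return history[-max_messages:]

def compact(history, max_messages,
            summarize=lambda msgs: "summary of earlier turns"):
    """Compact strategy: fold older turns into a checkpoint-style summary.

    `summarize` stands in for a model-generated summary; here it is a stub.
    """
    if len(history) <= max_messages:
        return list(history)
    older, recent = history[:-max_messages], history[-max_messages:]
    return [f"[checkpoint] {summarize(older)}"] + recent

turns = ["turn 1", "turn 2", "turn 3", "turn 4", "turn 5"]
print(truncate(turns, 2))  # ['turn 4', 'turn 5']
print(compact(turns, 2))   # ['[checkpoint] summary of earlier turns', 'turn 4', 'turn 5']
```

Truncate is cheaper but forgets earlier context entirely; compact preserves a condensed memory of prior turns at the cost of a summarization step.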

Model and provider options

  • Region: Selects where the model runs when the provider supports regions.
  • Force Function: Forces the model to call a specific function.

For custom models (custom), these additional options are available:

  • Name: Provider model identifier.
  • Provider: Language model provider (for example OpenAI or Anthropic).
  • Credentials: Provider credentials used for requests.
  • Endpoint: Optional custom endpoint URL.

Advanced options

  • Frequency Penalty: Reduces repeated phrases by penalizing frequent tokens.
  • Presence Penalty: Encourages topic exploration by penalizing already-used tokens.
  • Interpreter: Enables native code interpreter for supported models.
  • Image / Audio / Video / File: Enables native multimodal input capabilities when supported by the selected model.
  • Reasoning Effort: Adjusts reasoning intensity for models that expose reasoning effort controls.
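The frequency and presence penalties above can be illustrated with the common logit-adjustment scheme used by OpenAI-style APIs. This is a simplified sketch; individual providers may differ in the details:

```python
def apply_penalties(logits, counts, frequency_penalty=0.0, presence_penalty=0.0):
    """Adjust token logits before sampling.

    counts[token] is how often the token already appears in the output so far.
    The frequency penalty scales with that count; the presence penalty is a
    flat deduction for any token that has appeared at least once.
    """
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        adjusted[token] = (
            logit
            - count * frequency_penalty
            - (presence_penalty if count > 0 else 0.0)
        )
    return adjusted

logits = {"the": 2.0, "a": 1.5, "novel": 1.0}
counts = {"the": 3, "a": 1}  # "novel" has not been generated yet
print(apply_penalties(logits, counts, frequency_penalty=0.2, presence_penalty=0.5))
```

Tokens already used often ("the") are penalized most, lightly used tokens ("a") less, and unused tokens ("novel") not at all, nudging the model toward fresher wording and new topics.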

These settings let teams tune quality, cost, latency, and memory behavior per bot or conversation.