LLMOps Product Handbook

# Variants

A Variant is a specific model configuration that can be attached to configs. Variants contain the actual LLM settings: provider, model, parameters, and system prompt.

## Overview

Each variant:

- Has a name for easy identification
- Contains model configuration (provider, model, parameters)
- Automatically creates versions when edited
- Can be attached to multiple configs
- Supports Nunjucks/Jinja2 template syntax in system prompts

## Creating a Variant

When creating or editing a variant, you configure:

| Property | Description |
| --- | --- |
| Name | Human-readable name (e.g., "GPT-4 Turbo - Helpful") |
| Provider | The LLM provider (e.g., OpenAI, Anthropic) |
| Model | The model to use (e.g., `gpt-4-turbo`, `claude-3-sonnet`) |
| System Prompt | Instructions that define the assistant's behavior |
| Parameters | Model parameters like temperature, max tokens, etc. |

## Model Parameters

| Parameter | Description | Range |
| --- | --- | --- |
| Temperature | Controls randomness. Lower = more focused, higher = more creative | 0 to 2 |
| Max Tokens | Maximum length of the response | Varies by model |
| Top P | Nucleus sampling; an alternative to temperature | 0 to 1 |
| Frequency Penalty | Reduces repetition of tokens | -2 to 2 |
| Presence Penalty | Encourages new topics | -2 to 2 |

## Template Variables

System prompts support Nunjucks/Jinja2 template syntax, allowing you to create dynamic prompts with variables that are filled in at runtime.

### Supported Syntax

| Syntax | Description | Example |
| --- | --- | --- |
| `{{ variable }}` | Variable interpolation | `{{ user_name }}` |
| `{% for item in items %}` | Loop through arrays | `{% for tag in tags %}{{ tag }}{% endfor %}` |
| `{% if condition %}` | Conditional blocks | `{% if premium %}Premium user{% endif %}` |
| `{# comment #}` | Comments (not rendered) | `{# TODO: improve this #}` |

### Example System Prompt with Templates

```
You are a {{ role }} assistant for {{ company_name }}.

User Profile:
- Name: {{ user.name }}
- Plan: {{ user.plan }}

{% if user.preferences %}
User Preferences:
{% for pref in user.preferences %}
- {{ pref }}
{% endfor %}
{% endif %}

Please respond in a {{ tone }} tone.
```
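You can preview how a template like this renders before deploying it. Here is a minimal sketch using Python's `jinja2` library (the platform renders with Nunjucks, but the syntax shown here is compatible with Jinja2); the prompt is an abbreviated version of the example above and the variable values are illustrative:

```python
from jinja2 import Template

# Abbreviated version of the example system prompt above, covering
# interpolation, a conditional block, and a loop.
prompt = (
    "You are a {{ role }} assistant for {{ company_name }}.\n"
    "{% if user.preferences %}"
    "User Preferences:\n"
    "{% for pref in user.preferences %}- {{ pref }}\n{% endfor %}"
    "{% endif %}"
    "Please respond in a {{ tone }} tone."
)

rendered = Template(prompt).render(
    role="customer support",
    company_name="Acme Inc",
    user={"preferences": ["concise responses", "technical details"]},
    tone="friendly",
)
print(rendered)
```

The rendered string begins "You are a customer support assistant for Acme Inc." and lists each preference on its own line.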

### Passing Input Variables

When calling the API, pass template variables in the `input_variables` field of your request body. This field is used for template rendering and is stripped before the request is forwarded to the LLM provider.

```bash
curl -X POST https://your-api.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-llmops-prompt: your-config-id" \
  -H "Authorization: Bearer your-env-secret" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}],
    "input_variables": {
      "role": "customer support",
      "company_name": "Acme Inc",
      "user": {
        "name": "John Doe",
        "plan": "premium",
        "preferences": ["concise responses", "technical details"]
      },
      "tone": "friendly"
    }
  }'
```

The `input_variables` field works with all OpenAI-compatible endpoints, including `/chat/completions` and `/responses`.
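The same request can be built from Python using only the standard library. The endpoint URL, config ID, and secret below mirror the placeholders in the curl example and are not real values:

```python
import json
import urllib.request

# Placeholder values, mirroring the curl example above.
API_URL = "https://your-api.com/v1/chat/completions"
CONFIG_ID = "your-config-id"
ENV_SECRET = "your-env-secret"

# input_variables is used for template rendering and is stripped
# before the request is forwarded to the LLM provider.
payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    "input_variables": {
        "role": "customer support",
        "company_name": "Acme Inc",
        "user": {
            "name": "John Doe",
            "plan": "premium",
            "preferences": ["concise responses", "technical details"],
        },
        "tone": "friendly",
    },
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "x-llmops-prompt": CONFIG_ID,
        "Authorization": f"Bearer {ENV_SECRET}",
    },
    method="POST",
)
# response = urllib.request.urlopen(request)  # send once real credentials are in place
```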

### Behavior Notes

- **Missing variables**: If a variable is not provided in `input_variables`, it is rendered as an empty string (no error is thrown)
- **Nested objects**: Access nested properties with dot notation: `{{ user.name }}`
- **Arrays**: Use `{% for %}` loops to iterate over arrays
- **Filters**: Nunjucks filters are supported: `{{ name | upper }}`, `{{ items | join(", ") }}`
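These behaviors can be checked in isolation with Jinja2, whose defaults match the notes above (Nunjucks filter names like `upper` and `join` are shared with Jinja2):

```python
from jinja2 import Template

# Missing variables render as empty strings rather than raising an error.
empty = Template("Hello {{ missing }}!").render()
print(repr(empty))  # 'Hello !'

# Dot notation reaches into nested objects, and filters transform values.
upper = Template("{{ user.name | upper }}").render(user={"name": "Ada"})
joined = Template("{{ items | join(', ') }}").render(items=["a", "b", "c"])
print(upper)   # ADA
print(joined)  # a, b, c
```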

## Versioning

Every time you edit a variant, a new version is created. This provides:

- **Audit Trail**: See exactly what configuration was used at any point in time
- **Safe Rollbacks**: Revert to a previous version if issues arise
- **Version Pinning**: Production can use a specific version while staging tests newer versions

### Version Pinning vs Latest

When targeting a variant to an environment:

| Mode | Behavior |
| --- | --- |
| Pinned Version | Always serves the specified version (recommended for production) |
| Latest | Automatically serves the newest version (useful for staging/development) |
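Version resolution happens server-side; conceptually it works like the following sketch, where `VariantVersion` and `resolve_version` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VariantVersion:
    number: int
    system_prompt: str

def resolve_version(versions: list[VariantVersion],
                    pinned: Optional[int] = None) -> VariantVersion:
    """Pinned mode serves the exact version; Latest serves the newest."""
    if pinned is not None:
        return next(v for v in versions if v.number == pinned)
    return max(versions, key=lambda v: v.number)

versions = [
    VariantVersion(1, "You are a helpful assistant..."),
    VariantVersion(2, "You are a concise assistant..."),
]
print(resolve_version(versions, pinned=1).number)  # production (pinned): 1
print(resolve_version(versions).number)            # staging (Latest): 2
```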

### Example Workflow

1. Create a variant with your initial configuration
   - Provider: OpenAI
   - Model: `gpt-4-turbo`
   - System Prompt: "You are a helpful assistant..."
   - Temperature: 0.7
   - Creates Version 1
2. Attach to a config and set up targeting rules
   - Production → Version 1 (pinned)
   - Staging → Latest
3. Iterate on the variant
   - Update the system prompt
   - Creates Version 2
   - Staging automatically uses Version 2
   - Production still uses Version 1
4. Promote to production
   - After testing in staging, update production targeting to Version 2

## Multiple Variants per Config

A single config can have multiple variants attached. This enables:

- **A/B Testing**: Compare different model configurations
- **Gradual Rollouts**: Slowly shift traffic to new configurations
- **Fallback Options**: Use a backup variant if the primary fails
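A gradual rollout amounts to weighted random selection between variants. The platform applies targeting rules server-side; the sketch below only illustrates the idea, and the variant names and weights are invented for the example:

```python
import random

def pick_variant(weights: dict[str, float], rng: random.Random) -> str:
    """Pick a variant name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# 90% of traffic stays on the current variant; 10% shifts to the new one.
rollout = {"gpt4-turbo-helpful": 0.9, "claude3-concise": 0.1}
rng = random.Random(42)  # seeded so the demo is reproducible
picks = [pick_variant(rollout, rng) for _ in range(1000)]
print(picks.count("claude3-concise"))  # roughly 100 of 1000
```

To widen the rollout, you would only adjust the weights; no variant configuration changes.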

### Example Variants

| Variant Name | Provider | Model | Use Case |
| --- | --- | --- | --- |
| GPT-4 Turbo - Helpful | OpenAI | `gpt-4-turbo` | General assistance |
| Claude 3 - Concise | Anthropic | `claude-3-sonnet` | Brief responses |
| Llama 3 - Local | Ollama | `llama3` | Development/testing |
| GPT-4o Mini - Fast | OpenAI | `gpt-4o-mini` | Quick, cost-effective responses |
