Gateway
The Gateway is a unified API endpoint that routes requests to LLM providers. It provides an OpenAI-compatible interface, allowing you to switch providers without changing your application code.
Overview
The Gateway provides:
- Unified API: OpenAI-compatible endpoints for all providers
- Provider Routing: Route requests via configs or direct provider access
- Resilience: Built-in retries and fallback support
- Observability: Automatic request logging and cost tracking
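The retry and fallback behavior listed above runs server-side in the Gateway, so your client code stays simple. Purely as an illustration of the idea, a client-side fallback loop over several model references might look like the following sketch (`withFallback` and `callModel` are hypothetical helpers, not part of the Gateway or the OpenAI SDK):

```typescript
// Sketch only: the Gateway applies retries and fallbacks for you.
// This client-side loop just illustrates the concept: try each model
// reference in order and return the first successful result.
async function withFallback<T>(
  models: string[],
  callModel: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await callModel(model); // first success wins
    } catch (err) {
      lastError = err; // remember the failure, move on to the next model
    }
  }
  throw lastError; // every model failed
}
```

For example, `withFallback(['@openai-prod/gpt-4-turbo', '@openai-dev/gpt-4o-mini'], m => client.chat.completions.create({ model: m, messages }))` would fall back to the development configuration if production fails.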
Supported Endpoints
The Gateway exposes OpenAI-compatible endpoints:
| Endpoint | Description |
|---|---|
| /v1/chat/completions | Chat completions API |
| /v1/completions | Text completions API |
| /v1/embeddings | Text embeddings API |
| /v1/images/generations | Image generation API |
| /v1/audio/speech | Text-to-speech API |
| /v1/audio/transcriptions | Audio transcription API |
| /v1/models | List available models |
Using the Gateway
There are two ways to route requests through the Gateway:
1. Config-Based Routing
Use configs for managed routing with environment-specific variants:
```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'http://localhost:3000/llmops/api/genai/v1',
  apiKey: 'your-environment-secret',
  defaultHeaders: {
    'x-llmops-config': 'your-config-id',
  },
});

const response = await openai.chat.completions.create({
  model: 'gpt-4o-mini', // Can be overridden by variant
  messages: [{ role: 'user', content: 'Hello!' }],
});
```

This approach:
- Routes based on your config's targeting rules
- Applies variant settings (model, parameters, system prompt)
- Enables A/B testing and gradual rollouts
2. Direct Provider Access
Access providers directly using the @slug/model format:
```typescript
const response = await openai.chat.completions.create({
  model: '@openai-prod/gpt-4-turbo',
  messages: [{ role: 'user', content: 'Hello!' }],
});
```

This bypasses config routing and sends requests directly to the specified provider configuration.
Provider Configuration
Configure providers in the Gateway > Providers section of the dashboard.
Adding a Provider
Each provider configuration includes:
| Field | Description |
|---|---|
| Provider | The LLM provider (OpenAI, Anthropic, Azure, etc.) |
| Name | Human-readable name for this configuration |
| Slug | Optional short ID for direct access (e.g., openai-prod) |
| Credentials | Provider-specific authentication credentials |
Provider Credentials
Different providers require different credentials:
| Provider | Required Credentials |
|---|---|
| OpenAI | API Key, Organization (optional), Project (optional) |
| Anthropic | API Key |
| Azure | Resource Name, Deployment ID, API Version, Auth Method |
| AWS Bedrock | Access Key ID, Secret Access Key, Region |
| Google Vertex | Project ID, Region, Service Account JSON |
| Groq | API Key |
| Mistral | API Key |
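For reference, the credentials table above can be expressed as a TypeScript discriminated union. The field names below are illustrative assumptions derived from the table; the dashboard form defines the actual field names.

```typescript
// Sketch of per-provider credential shapes, mirroring the table above.
// Field names are assumptions for illustration, not the Gateway's schema.
type ProviderCredentials =
  | { provider: 'openai'; apiKey: string; organization?: string; project?: string }
  | { provider: 'anthropic'; apiKey: string }
  | { provider: 'azure'; resourceName: string; deploymentId: string; apiVersion: string; authMethod: string }
  | { provider: 'aws-bedrock'; accessKeyId: string; secretAccessKey: string; region: string }
  | { provider: 'google-vertex'; projectId: string; region: string; serviceAccountJson: string }
  | { provider: 'groq' | 'mistral'; apiKey: string };

// The union forces each provider to carry exactly its required fields:
const creds: ProviderCredentials = { provider: 'anthropic', apiKey: 'your-api-key' };
```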
Example: Multiple OpenAI Configurations
You might configure multiple instances of the same provider:
| Name | Slug | Use Case |
|---|---|---|
| OpenAI Production | openai-prod | Production workloads |
| OpenAI Development | openai-dev | Development and testing |
| OpenAI High-Limit | openai-hl | High-throughput operations |
Access them directly:
```typescript
// Production
model: '@openai-prod/gpt-4-turbo'

// Development
model: '@openai-dev/gpt-4o-mini'

// High-limit account
model: '@openai-hl/gpt-4-turbo'
```

Streaming
The Gateway supports streaming responses:
```typescript
const stream = await openai.chat.completions.create({
  model: '@openai-prod/gpt-4-turbo',
  messages: [{ role: 'user', content: 'Write a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```

Error Handling
The Gateway returns standard OpenAI-compatible error responses:
```typescript
try {
  const response = await openai.chat.completions.create({
    model: '@openai-prod/gpt-4-turbo',
    messages: [{ role: 'user', content: 'Hello!' }],
  });
} catch (error) {
  if (error instanceof OpenAI.APIError) {
    console.error('Status:', error.status);
    console.error('Message:', error.message);
  }
}
```

Request Headers
The Gateway accepts the following headers:
| Header | Description | Required |
|---|---|---|
| Authorization | Bearer token with environment secret | Yes |
| x-llmops-config | Config ID for routing | Optional |
| Content-Type | Must be application/json | Yes |
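Putting the table together, a small helper can assemble the headers for either routing mode. This `gatewayHeaders` function is a sketch for illustration, not part of any SDK:

```typescript
// Sketch: build Gateway request headers from the table above.
// Pass a configId only when using config-based routing.
function gatewayHeaders(secret: string, configId?: string): Record<string, string> {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${secret}`,
    'Content-Type': 'application/json',
  };
  if (configId) headers['x-llmops-config'] = configId; // optional routing header
  return headers;
}

const direct = gatewayHeaders('your-environment-secret');
const routed = gatewayHeaders('your-environment-secret', 'your-config-id');
```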
cURL Example
```bash
curl -X POST http://localhost:3000/llmops/api/genai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-environment-secret" \
  -d '{
    "model": "@openai-prod/gpt-4-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```