LLMOps
Product Handbook

Gateway

The Gateway is a unified API endpoint that routes requests to LLM providers. It provides an OpenAI-compatible interface, allowing you to switch providers without changing your application code.

Overview

The Gateway provides:

  • Unified API: OpenAI-compatible endpoints for all providers
  • Provider Routing: Route requests via configs or direct provider access
  • Resilience: Built-in retries and fallback support
  • Observability: Automatic request logging and cost tracking

Supported Endpoints

The Gateway exposes OpenAI-compatible endpoints:

Endpoint                    Description
/v1/chat/completions        Chat completions API
/v1/completions             Text completions API
/v1/embeddings              Text embeddings API
/v1/images/generations      Image generation API
/v1/audio/speech            Text-to-speech API
/v1/audio/transcriptions    Audio transcription API
/v1/models                  List available models

Using the Gateway

There are two ways to route requests through the Gateway:

1. Config-Based Routing

Use configs for managed routing with environment-specific variants:

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'http://localhost:3000/llmops/api/genai/v1',
  apiKey: 'your-environment-secret',
  defaultHeaders: {
    'x-llmops-config': 'your-config-id',
  },
});

const response = await openai.chat.completions.create({
  model: 'gpt-4o-mini', // Can be overridden by variant
  messages: [{ role: 'user', content: 'Hello!' }],
});

This approach:

  • Routes based on your config's targeting rules
  • Applies variant settings (model, parameters, system prompt)
  • Enables A/B testing and gradual rollouts

2. Direct Provider Access

Access providers directly using the @slug/model format:

const response = await openai.chat.completions.create({
  model: '@openai-prod/gpt-4-turbo',
  messages: [{ role: 'user', content: 'Hello!' }],
});

This bypasses config routing and sends requests directly to the specified provider configuration.
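For illustration, the @slug/model convention can be parsed as shown below. This is a minimal sketch of the format's structure; parseGatewayModel is a hypothetical helper, not part of any Gateway SDK:

```typescript
// Hypothetical helper: split a '@slug/model' string into its parts.
// Plain model names (no '@' prefix) fall through to config-based routing.
function parseGatewayModel(model: string): { slug: string | null; model: string } {
  const match = model.match(/^@([^/]+)\/(.+)$/);
  if (!match) {
    return { slug: null, model }; // no provider slug: config routing applies
  }
  return { slug: match[1], model: match[2] };
}
```

For example, parseGatewayModel('@openai-prod/gpt-4-turbo') yields slug 'openai-prod' and model 'gpt-4-turbo', while a bare 'gpt-4o-mini' has no slug.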

Provider Configuration

Configure providers in the Gateway > Providers section of the dashboard.

Adding a Provider

Each provider configuration includes:

Field         Description
Provider      The LLM provider (OpenAI, Anthropic, Azure, etc.)
Name          Human-readable name for this configuration
Slug          Optional short ID for direct access (e.g., openai-prod)
Credentials   Provider-specific authentication credentials

Provider Credentials

Different providers require different credentials:

Provider        Required Credentials
OpenAI          API Key, Organization (optional), Project (optional)
Anthropic       API Key
Azure           Resource Name, Deployment ID, API Version, Auth Method
AWS Bedrock     Access Key ID, Secret Access Key, Region
Google Vertex   Project ID, Region, Service Account JSON
Groq            API Key
Mistral         API Key

Example: Multiple OpenAI Configurations

You might configure multiple instances of the same provider:

Name                 Slug          Use Case
OpenAI Production    openai-prod   Production workloads
OpenAI Development   openai-dev    Development and testing
OpenAI High-Limit    openai-hl     High-throughput operations

Access them directly:

// Production
model: '@openai-prod/gpt-4-turbo'

// Development
model: '@openai-dev/gpt-4o-mini'

// High-limit account
model: '@openai-hl/gpt-4-turbo'
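To avoid hard-coding slugs throughout your application, you might derive the model string from the current deployment environment. This is an illustrative sketch, not a Gateway feature; the mapping and function names are assumptions:

```typescript
// Illustrative mapping from deployment environment to a provider slug.
const SLUG_BY_ENV: Record<string, string> = {
  production: 'openai-prod',
  development: 'openai-dev',
  'high-throughput': 'openai-hl',
};

// Build a '@slug/model' string for the given environment.
function modelForEnv(env: string, model: string): string {
  const slug = SLUG_BY_ENV[env];
  if (!slug) throw new Error(`No provider slug configured for env: ${env}`);
  return `@${slug}/${model}`;
}
```

You could then call modelForEnv(process.env.NODE_ENV ?? 'development', 'gpt-4-turbo') when constructing requests.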

Streaming

The Gateway supports streaming responses:

const stream = await openai.chat.completions.create({
  model: '@openai-prod/gpt-4-turbo',
  messages: [{ role: 'user', content: 'Write a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Error Handling

The Gateway returns standard OpenAI-compatible error responses:

try {
  const response = await openai.chat.completions.create({
    model: '@openai-prod/gpt-4-turbo',
    messages: [{ role: 'user', content: 'Hello!' }],
  });
} catch (error) {
  if (error instanceof OpenAI.APIError) {
    console.error('Status:', error.status);
    console.error('Message:', error.message);
  }
}
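Beyond the Gateway's built-in retries, you can layer an application-level fallback across provider configurations. The sketch below assumes a generic callModel parameter that stands in for whatever function issues the actual request:

```typescript
// Try each model in order; return the first successful result.
// 'callModel' stands in for a function that sends the actual request.
async function withFallback<T>(
  models: string[],
  callModel: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await callModel(model);
    } catch (err) {
      lastError = err; // remember the failure and try the next model
    }
  }
  throw lastError;
}
```

For example: withFallback(['@openai-prod/gpt-4-turbo', '@openai-dev/gpt-4o-mini'], (model) => openai.chat.completions.create({ model, messages })).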

Request Headers

The Gateway accepts the following headers:

Header            Description                            Required
Authorization     Bearer token with environment secret   Yes
x-llmops-config   Config ID for routing                  Optional
Content-Type      Must be application/json               Yes
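If you are issuing requests without the OpenAI SDK, the headers above can be assembled with a small helper. This is a sketch; buildGatewayHeaders is a hypothetical name, not part of the Gateway:

```typescript
// Build the headers the Gateway expects; 'configId' is only needed
// for config-based routing and is omitted otherwise.
function buildGatewayHeaders(secret: string, configId?: string): Record<string, string> {
  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${secret}`,
  };
  if (configId) {
    headers['x-llmops-config'] = configId;
  }
  return headers;
}
```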

cURL Example

curl -X POST http://localhost:3000/llmops/api/genai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-environment-secret" \
  -d '{
    "model": "@openai-prod/gpt-4-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
