Control Plane for Multi-region Architecture (Enterprise)

Learn how to deploy LiteLLM across multiple regions while keeping administration centralized and avoiding duplicated management infrastructure.

info

✨ This requires LiteLLM Enterprise features.

Enterprise Pricing

Get free 7-day trial key

Overview

When scaling LiteLLM for production use, you may want to deploy multiple instances across different regions or availability zones while maintaining a single point of administration. This guide covers how to set up a distributed LiteLLM deployment with:

  • Regional Worker Instances: Handle LLM requests for users in specific regions
  • Centralized Admin Instance: Manages configuration, users, keys, and monitoring

Architecture Pattern: Regional + Admin Instances

Typical Deployment Scenario

A typical deployment runs a single admin instance (e.g. admin.company.com) as the control plane, plus regional worker instances (e.g. us.company.com and eu.company.com) as the data plane, with all instances sharing the same Postgres database.

Benefits of This Architecture

  1. Reduced Management Overhead: Only one instance needs admin capabilities
  2. Regional Performance: Users get low-latency access from their region
  3. Centralized Control: All administration happens from a single interface
  4. Security: Limit admin access to designated instances only
  5. Cost Efficiency: Avoid duplicating admin infrastructure

Configuration

Admin Instance Configuration

The admin instance handles all management operations and provides the UI.

Environment Variables for Admin Instance:

# Keep admin capabilities enabled (default behavior)
# DISABLE_ADMIN_UI=false # Admin UI available
# DISABLE_ADMIN_ENDPOINTS=false # Management APIs available
DISABLE_LLM_API_ENDPOINTS=true # LLM APIs disabled
DATABASE_URL=postgresql://user:pass@global-db:5432/litellm
LITELLM_MASTER_KEY=your-master-key

# Configure API Reference page to show data plane URL
API_REFERENCE_BASE_URL=https://us.company.com # Data plane URL to display in API Reference
API_REFERENCE_MODEL=gpt-4 # Optional: Default model to show in examples

Worker Instance Configuration

Worker instances handle LLM requests but have admin capabilities disabled.

Environment Variables for Worker Instances:

# Disable admin capabilities
DISABLE_ADMIN_UI=true # No admin UI
DISABLE_ADMIN_ENDPOINTS=true # No management endpoints

DATABASE_URL=postgresql://user:pass@global-db:5432/litellm
LITELLM_MASTER_KEY=your-master-key

Environment Variables Reference

DISABLE_ADMIN_UI

Disables the LiteLLM Admin UI.

  • Default: false
  • Worker Instances: Set to true
  • Admin Instance: Leave as false (or don't set)
# Worker instances
DISABLE_ADMIN_UI=true

Effect: When enabled, the web UI at /ui becomes unavailable.
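
To confirm the flag took effect, you can probe /ui on a worker. A minimal sketch using requests (the worker URL is illustrative, and the exact status code returned for a disabled UI may vary by LiteLLM version):

import requests

# Probe the admin UI on a worker instance (URL is illustrative)
resp = requests.get("https://us.company.com/ui", allow_redirects=False)

# With DISABLE_ADMIN_UI=true we expect a non-2xx response;
# the exact status code may vary by LiteLLM version
print(resp.status_code)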

DISABLE_ADMIN_ENDPOINTS

info

✨ This is an Enterprise feature.

Enterprise Pricing

Get free 7-day trial key

Disables all management/admin API endpoints.

  • Default: false
  • Worker Instances: Set to true
  • Admin Instance: Leave as false (or don't set)
# Worker instances  
DISABLE_ADMIN_ENDPOINTS=true

Disabled Endpoints Include:

  • /key/* - Key management
  • /user/* - User management
  • /team/* - Team management
  • /config/* - Configuration updates
  • All other administrative endpoints

Available Endpoints (when DISABLE_ADMIN_ENDPOINTS=true):

  • /chat/completions - LLM requests
  • /v1/* - OpenAI-compatible APIs
  • /vertex_ai/* - Vertex AI pass-through APIs
  • /bedrock/* - Bedrock pass-through APIs
  • /health - Basic health check
  • /metrics - Prometheus metrics
  • All other LLM API endpoints
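
As a sanity check, you can verify that a worker still serves LLM traffic while rejecting management calls. The sketch below reuses the worker URL and key from earlier examples; the exact status code returned for a blocked admin endpoint may differ between versions:

import requests

WORKER = "https://us.company.com"  # illustrative worker URL
HEADERS = {"Authorization": "Bearer your-litellm-key"}

# LLM endpoint should succeed on a worker
llm = requests.post(
    f"{WORKER}/chat/completions",
    headers=HEADERS,
    json={"model": "gpt-4", "messages": [{"role": "user", "content": "ping"}]},
)
print("LLM endpoint:", llm.status_code)  # expect 200

# Management endpoint should be rejected when DISABLE_ADMIN_ENDPOINTS=true
admin = requests.post(f"{WORKER}/key/generate", headers=HEADERS, json={})
print("Admin endpoint:", admin.status_code)  # expect a non-2xx status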

DISABLE_LLM_API_ENDPOINTS

info

✨ This is an Enterprise feature.

Enterprise Pricing

Get free 7-day trial key

Disables all LLM API endpoints.

  • Default: false
  • Worker Instances: Leave as false (or don't set)
  • Admin Instance: Set to true
# Admin instance
DISABLE_LLM_API_ENDPOINTS=true

Disabled Endpoints Include:

  • /chat/completions - LLM requests
  • /v1/* - OpenAI-compatible APIs
  • /vertex_ai/* - Vertex AI pass-through APIs
  • /bedrock/* - Bedrock pass-through APIs
  • All other LLM API endpoints

Available Endpoints (when DISABLE_LLM_API_ENDPOINTS=true):

  • /key/* - Key management
  • /user/* - User management
  • /team/* - Team management
  • /config/* - Configuration updates
  • All other administrative endpoints
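
Conversely, on the admin instance a management call should succeed while an LLM call is rejected. A minimal sketch mirroring the worker check above (the admin URL and master key are placeholders):

import requests

ADMIN = "https://admin.company.com"  # illustrative admin URL
HEADERS = {"Authorization": "Bearer your-master-key"}

# Management endpoint should succeed on the admin instance
key_resp = requests.post(f"{ADMIN}/key/generate", headers=HEADERS, json={"duration": "30d"})
print("Admin endpoint:", key_resp.status_code)  # expect 200

# LLM endpoint should be rejected when DISABLE_LLM_API_ENDPOINTS=true
llm_resp = requests.post(
    f"{ADMIN}/chat/completions",
    headers=HEADERS,
    json={"model": "gpt-4", "messages": [{"role": "user", "content": "ping"}]},
)
print("LLM endpoint:", llm_resp.status_code)  # expect a non-2xx status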

API_REFERENCE_BASE_URL

info

✨ This is useful for Control Plane setups.

Overrides the URL displayed on the API Reference page in the admin UI.

  • Default: Uses PROXY_BASE_URL value
  • Control Plane Use Case: Set to your data plane URL
# Admin instance (control plane)
PROXY_BASE_URL=https://admin.company.com
API_REFERENCE_BASE_URL=https://us.company.com # Data plane URL for LLM requests

Effect: The API Reference page will show code examples using https://us.company.com instead of the control plane URL.

API_REFERENCE_MODEL

Overrides the model name displayed in API Reference code examples.

  • Default: gpt-3.5-turbo
  • Control Plane Use Case: Set to your preferred model name
API_REFERENCE_MODEL=gpt-4

Effect: The API Reference code examples will show model="gpt-4" instead of the default model.
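
With both overrides set, the API Reference page would render examples roughly like the sketch below (an illustrative reconstruction of the displayed snippet, not the page's exact output):

import openai

# base_url reflects API_REFERENCE_BASE_URL, model reflects API_REFERENCE_MODEL
client = openai.OpenAI(
    base_url="https://us.company.com",
    api_key="your-litellm-key",
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)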

Usage Patterns

Client Usage

For LLM Requests (use regional endpoints):

import openai

# US users hit the US regional worker instance
client_us = openai.OpenAI(
    base_url="https://us.company.com/v1",
    api_key="your-litellm-key",
)

# EU users hit the EU regional worker instance
client_eu = openai.OpenAI(
    base_url="https://eu.company.com/v1",
    api_key="your-litellm-key",
)

response = client_us.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
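
If clients choose their endpoint programmatically, a small routing helper keeps the region map in one place. A sketch (the region-to-URL map is illustrative):

import openai

# Illustrative map of region -> regional worker base URL
REGIONAL_ENDPOINTS = {
    "us": "https://us.company.com/v1",
    "eu": "https://eu.company.com/v1",
}

def client_for_region(region: str, api_key: str) -> openai.OpenAI:
    """Return an OpenAI client pointed at the worker for the given region."""
    return openai.OpenAI(base_url=REGIONAL_ENDPOINTS[region], api_key=api_key)

client = client_for_region("eu", "your-litellm-key")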

For Administration (use admin endpoint):

import requests

# Create a new API key via the admin (control plane) instance
response = requests.post(
    "https://admin.company.com/key/generate",
    headers={"Authorization": "Bearer sk-1234"},
    json={"duration": "30d"},
)
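
Because all instances share the same database, a key generated on the admin instance is immediately valid on the regional workers. An end-to-end sketch, assuming the /key/generate response carries the new key in a key field (the field name may vary by version):

import openai
import requests

# 1. Create a key on the admin (control plane) instance
resp = requests.post(
    "https://admin.company.com/key/generate",
    headers={"Authorization": "Bearer sk-1234"},
    json={"duration": "30d"},
)
new_key = resp.json()["key"]  # assumed response field; may vary by version

# 2. Use the new key against a regional worker (data plane)
client = openai.OpenAI(base_url="https://us.company.com/v1", api_key=new_key)
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello from the data plane!"}],
)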