One endpoint, all your models

The AI Gateway for developers.

Route to hundreds of AI models through a single endpoint. Built-in failovers, per-user auth, cost controls, and enclave-aware execution.

Use it with OpenAI, Anthropic, Google, xAI, Meta, Mistral, Cohere, DeepSeek, and 100+ more
openai/gpt-4o
import { streamText } from 'ai'

const result = streamText({
  model: 'openai/gpt-4o',
  prompt: 'Why is the sky blue?'
})
Models

One API key. Hundreds of models.

Unified billing and observability across your entire AI stack. Text, image, code, and embedding models from every major provider.

Text

Access the latest from every major model lab. Power your AI features through a single endpoint.

Image

Generate and edit with best-in-class image models. DALL-E, Imagen, Stable Diffusion, and more.

Code

Specialized code generation models. Codestral, Code Llama, and OpenAI code-optimized variants.

Embeddings

Vector embeddings for search, retrieval, and classification. OpenAI, Cohere, and Voyage models.

Supported providers (29+ models)

OpenAI: GPT-4o, GPT-4o mini, GPT-4.5, o3, o4-mini, DALL-E 3, Whisper
Anthropic: Claude 4 Opus, Claude 4 Sonnet, Claude 3.5 Haiku
Google: Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash, Imagen 3
Meta: Llama 4 Maverick, Llama 4 Scout, Llama 3.3 70B
DeepSeek: DeepSeek R1, DeepSeek V3, DeepSeek R1 0528
Mistral: Mistral Large, Mistral Medium, Codestral, Pixtral
xAI: Grok 3, Grok 3 mini
Cohere: Command A, Command R+, Embed v4

Any OpenAI-compatible endpoint works out of the box. Custom providers and self-hosted models are supported.

One endpoint, all your models

Unified billing and observability across your entire AI stack. Switch providers by changing a string, not your architecture.
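
To make the "changing a string" idea concrete, here is a minimal sketch of how a `provider/model` identifier such as `openai/gpt-4o` could be split into its parts. The `parseModelId` helper and the `ModelRef` type are hypothetical names for illustration, not part of the gateway or any SDK.

```typescript
// Sketch only: parseModelId is a hypothetical helper, not a gateway or SDK API.
interface ModelRef {
  provider: string;
  model: string;
}

// Split 'provider/model' at the first slash and reject malformed IDs.
function parseModelId(id: string): ModelRef {
  const slash = id.indexOf('/');
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected 'provider/model', got '${id}'`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}

// Switching providers changes only this string; the calling code stays the same.
const modelId = 'openai/gpt-4o';
const modelRef = parseModelId(modelId);
```

Under this assumption the provider is carried in the identifier itself, so no provider-specific client needs to be constructed at the call site.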

Primary: Claude 4 (down)
Fallback: GPT-4o (active)
Fallback: Gemini (standby)
Failover completed in 120ms

Built-in failovers, better uptime

Automatic fallbacks during provider outages. Your app stays up even when a model goes down. Configurable retry chains.
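
A retry chain can be pictured as an ordered list that is walked until an available entry is found. The sketch below is illustrative only: `ChainEntry`, `ProviderStatus`, and `pickRoute` are hypothetical names, and the model labels are shorthand for the scenario described here, not the gateway's configuration schema.

```typescript
// Sketch only: these types and pickRoute are hypothetical, not the gateway's config schema.
type ProviderStatus = 'healthy' | 'down';

interface ChainEntry {
  role: 'primary' | 'fallback';
  model: string;
}

// Walk the chain in order and return the first entry whose model is not marked down.
// Models with no recorded status are treated as available. Returns null when
// every entry is down.
function pickRoute(
  chain: ChainEntry[],
  status: { [model: string]: ProviderStatus }
): ChainEntry | null {
  for (const entry of chain) {
    if (status[entry.model] !== 'down') {
      return entry;
    }
  }
  return null;
}

// Primary is down, so the first fallback becomes active and the second stays on standby.
const retryChain: ChainEntry[] = [
  { role: 'primary', model: 'claude-4' },
  { role: 'fallback', model: 'gpt-4o' },
  { role: 'fallback', model: 'gemini' },
];
const activeRoute = pickRoute(retryChain, {
  'claude-4': 'down',
  'gpt-4o': 'healthy',
  gemini: 'healthy',
});
```

A real gateway would make this decision per request and would likely combine it with timeouts and health probes; the sketch covers only the ordering logic.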

Identity: Per-user + service auth
Policy: Model governance + limits
Enclaves: Isolated tool execution
Kill switch: Instant revocation

More than a proxy. A control plane.

Per-user auth, service-account auth, policy enforcement, kill switch, and deep enclave integration. The secure layer for AI traffic.

Request lifecycle

Every request. One governed path.

From inbound request to audit entry, the gateway controls identity, policy, routing, and observability at every stage.

Step 01

Client sends request

An app, agent, service, or enclave sends an AI model request to the gateway endpoint.

Gateway + Enclaves

The gateway governs. Enclaves isolate.

Simple prompts route to providers. Tool-using agents route to enclaves. Outbound requests from enclaves re-enter the policy path. Together, they form the control plane and trust boundary for agentic systems.

Straightforward model calls go directly to a provider through the gateway. Auth, policy, and audit still apply.

Source: Chat app
Celeris AI Gateway: Auth, Policy, Route
Provider: OpenAI

Unified audit, shared policy, one kill switch.
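
The split between provider routing and enclave routing could be sketched as a single decision function. Everything named here (`InboundRequest`, `RouteDecision`, `decideTarget`, and the field names) is hypothetical, and the rule that enclave-originated calls go to a provider rather than into another enclave is an assumption made for the example.

```typescript
// Sketch only: field names and decideTarget are hypothetical.
interface InboundRequest {
  prompt: string;
  toolNames: string[];   // tools the caller asks the model to use
  fromEnclave: boolean;  // true for outbound calls made from inside an enclave
}

interface RouteDecision {
  target: 'provider' | 'enclave';
  policyApplied: boolean;
}

function decideTarget(req: InboundRequest): RouteDecision {
  // Assumption: a request already running inside an enclave is not nested
  // into another enclave; it re-enters the policy path and goes to a provider.
  const needsEnclave = req.toolNames.length > 0 && !req.fromEnclave;
  // Policy is applied on every path, including plain model calls.
  return { target: needsEnclave ? 'enclave' : 'provider', policyApplied: true };
}

const simplePrompt = decideTarget({ prompt: 'Why is the sky blue?', toolNames: [], fromEnclave: false });
const agentCall = decideTarget({ prompt: 'Run the report', toolNames: ['shell'], fromEnclave: false });
```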
Identity-aware

Every request has an identity.

Per-user tokens, service accounts, and enclave identities. The gateway knows who is calling, what they can access, and how much they can spend.

Per-user auth

Every request carries user identity. Limits, model access, and audit are attributed to a real person.

Use case

Internal copilots, chat apps, user-facing AI products

Attached claims
user_id: u-3847
org_id: org-celeris
role: developer
spend_limit: $50/day
models: gpt-4o, claude-4
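
As an illustration of how claims like these might gate a request, the sketch below checks a model allowlist and a daily spend limit. The `UserClaims` shape mirrors the claims listed above, but the `authorize` function, its parameters, and the reason strings are assumptions, not the gateway's actual policy engine.

```typescript
// Sketch only: authorize and its inputs are hypothetical.
interface UserClaims {
  userId: string;
  orgId: string;
  role: string;
  spendLimitUsdPerDay: number;
  models: string[];
}

type Verdict = { allowed: true } | { allowed: false; reason: string };

// Deny when the model is not on the allowlist or when the estimated cost
// would push today's spend past the daily limit.
function authorize(
  claims: UserClaims,
  model: string,
  spentTodayUsd: number,
  estimatedCostUsd: number
): Verdict {
  if (claims.models.indexOf(model) === -1) {
    return { allowed: false, reason: `model ${model} is not in the allowlist` };
  }
  if (spentTodayUsd + estimatedCostUsd > claims.spendLimitUsdPerDay) {
    return { allowed: false, reason: 'daily spend limit exceeded' };
  }
  return { allowed: true };
}

const exampleClaims: UserClaims = {
  userId: 'u-3847',
  orgId: 'org-celeris',
  role: 'developer',
  spendLimitUsdPerDay: 50,
  models: ['gpt-4o', 'claude-4'],
};
const exampleVerdict = authorize(exampleClaims, 'gpt-4o', 10, 0.5);
```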

Service-account auth

Workloads and backends authenticate as services. Policies scope by service role.

Use case

Batch jobs, background agents, integrations, scheduled workflows

Attached claims
service_id: svc-batch-runner
team: ml-ops
env: production
rate_limit: 500 req/min
route_profile: cost-optimized
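
A limit such as 500 req/min can be enforced in several ways. The sketch below uses a fixed-window counter, chosen here only because it is the simplest to read; the page does not say which algorithm the gateway uses, and `FixedWindowLimiter` is a hypothetical class.

```typescript
// Sketch only: a fixed-window counter, one of several possible rate-limiting strategies.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private windowStart = Number.NEGATIVE_INFINITY;
  private count = 0;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the request fits in the current window, false when throttled.
  // The caller passes the current time so the logic stays deterministic.
  tryAcquire(nowMs: number): boolean {
    if (nowMs - this.windowStart >= this.windowMs) {
      // A full window has elapsed: start a new one at this request.
      this.windowStart = nowMs;
      this.count = 0;
    }
    if (this.count >= this.limit) {
      return false;
    }
    this.count += 1;
    return true;
  }
}

// One limiter per service identity, for example svc-batch-runner at 500 requests per minute.
const batchRunnerLimiter = new FixedWindowLimiter(500, 60000);
```

Fixed windows allow short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of a little more state.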

Enclave identity

Enclaves have execution identities. Requests are tagged with enclave, session, and workload metadata for full traceability.

Use case

Tool-using agents, isolated code execution, risky automation

Attached claims
enclave_id: enc-17
session_id: sess-4921
workspace_id: ws-92
policy_set: strict-tool-use
origin: agent-workflow
Routing + governance

Policy-driven model governance.

Route across providers, enforce per-tenant policy, manage fallback chains, cache responses, and control spend. All from one configuration layer.

Provider selection by policy

Route by region, model capability, cost, or workload type

Primary/secondary failover

Automatic fallback when primary provider is unavailable

Canary model rollout

Route a percentage of traffic to a new model for evaluation

Conditional routing

Route based on request metadata, user role, or tenant config

Internal model endpoints

Route to self-hosted or private models alongside external providers
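
Of the routing modes listed above, canary rollout is the easiest to show in a few lines. This is a sketch under stated assumptions: `CanaryRule`, `bucketOf`, and `chooseModel` are hypothetical names, the hash is a simple polynomial string hash picked for readability, and the model IDs in the example are placeholders.

```typescript
// Sketch only: deterministic percentage-based canary routing.
interface CanaryRule {
  stable: string;
  canary: string;
  canaryPercent: number; // 0 to 100
}

// Map a key to a bucket in [0, 100) with a simple polynomial string hash.
// The same key always lands in the same bucket, so a user or session
// sees a consistent model for the duration of the rollout.
function bucketOf(key: string): number {
  let h = 7;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep the value in unsigned 32-bit range
  }
  return h % 100;
}

function chooseModel(rule: CanaryRule, stickyKey: string): string {
  return bucketOf(stickyKey) < rule.canaryPercent ? rule.canary : rule.stable;
}

// Placeholder model IDs: send roughly 5% of users to the candidate model.
const rolloutRule: CanaryRule = { stable: 'openai/gpt-4o', canary: 'example/new-model', canaryPercent: 5 };
const chosenModel = chooseModel(rolloutRule, 'u-3847');
```

Hashing a stable key instead of drawing a random number per request keeps evaluation results comparable, because each caller stays on one side of the split.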

Observability

Cost, latency, failures. One dashboard.

See every request, every token, every dollar. Track provider health, cache hit rates, fallback events, and policy decisions in real time.

Gateway Dashboard (live)

Metrics: Requests (24h), Tokens (M), Spend (24h), Cache hit rate

Provider health
OpenAI: 142 ms, healthy
Anthropic: 198 ms, healthy
Google: 340 ms, degraded
Internal: 24 ms, healthy

Recent events
2s ago, fallback: claude-4-opus → gpt-4o (timeout)
14s ago, cache-hit: Semantic match, saved $0.012
31s ago, rate-limit: svc-analytics throttled (500/min)
1m ago, policy-deny: user u-9201 denied gpt-4o-mini
2m ago, circuit-break: Google vertex paused (error spike)
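
Dashboard figures such as spend, token volume, cache hit rate, and fallback count are aggregates over per-request records. The sketch below shows one way they could be derived; the `RequestLog` fields and the `summarize` function are assumptions for illustration, not the gateway's log schema.

```typescript
// Sketch only: the log record shape is hypothetical.
interface RequestLog {
  provider: string;
  tokens: number;
  costUsd: number;
  cacheHit: boolean;
  fellBack: boolean;
}

interface Summary {
  requests: number;
  tokens: number;
  spendUsd: number;
  cacheHitRate: number; // fraction between 0 and 1
  fallbacks: number;
}

// Fold a list of request records into the headline dashboard numbers.
function summarize(logs: RequestLog[]): Summary {
  let tokens = 0;
  let spendUsd = 0;
  let hits = 0;
  let fallbacks = 0;
  for (const log of logs) {
    tokens += log.tokens;
    spendUsd += log.costUsd;
    if (log.cacheHit) hits += 1;
    if (log.fellBack) fallbacks += 1;
  }
  const cacheHitRate = logs.length === 0 ? 0 : hits / logs.length;
  return { requests: logs.length, tokens, spendUsd, cacheHitRate, fallbacks };
}

const sampleSummary = summarize([
  { provider: 'openai', tokens: 1200, costUsd: 0.5, cacheHit: false, fellBack: false },
  { provider: 'anthropic', tokens: 800, costUsd: 0.25, cacheHit: false, fellBack: true },
  { provider: 'openai', tokens: 0, costUsd: 0, cacheHit: true, fellBack: false },
  { provider: 'internal', tokens: 400, costUsd: 0.25, cacheHit: false, fellBack: false },
]);
```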
Kill switch

One switch. Every route severed.

Shut down all model traffic instantly. Disable specific routes, freeze tenants, revoke tokens, and force fallback to safe mode. Every revocation is logged.

Gateway Control: ALL ROUTES ACTIVE

Type      Route             Traffic      Status
provider  OpenAI GPT-4o     2,847 req/h  active
provider  Anthropic Claude  1,923 req/h  active
internal  Internal vLLM     892 req/h    active
enclave   Enclave Workflow  341 req/h    active

Active routes: 4
Total traffic: 6.0K/h

Granular control
All traffic

Sever every outbound route instantly

Specific provider

Disable one provider while others continue

Tenant / workspace

Freeze a single tenant without affecting others

Service account

Revoke a specific service identity
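
The four scopes above can be modeled as a small piece of state that is consulted before any request leaves the gateway. This is a minimal sketch: `KillSwitchState`, `RequestContext`, and `blockedBy` are hypothetical names, and the order in which scopes are checked is an assumption.

```typescript
// Sketch only: a scoped kill-switch check, evaluated before routing.
interface KillSwitchState {
  allTraffic: boolean;
  providers: string[];
  tenants: string[];
  serviceAccounts: string[];
}

interface RequestContext {
  provider: string;
  tenant: string;
  serviceAccount: string | null; // null for per-user requests
}

// Returns the scope that blocks the request, or null when it may proceed.
// The returned scope can be written to the audit log alongside the revocation.
function blockedBy(state: KillSwitchState, ctx: RequestContext): string | null {
  if (state.allTraffic) return 'all-traffic';
  if (state.providers.indexOf(ctx.provider) !== -1) return 'provider';
  if (state.tenants.indexOf(ctx.tenant) !== -1) return 'tenant';
  if (ctx.serviceAccount !== null && state.serviceAccounts.indexOf(ctx.serviceAccount) !== -1) {
    return 'service-account';
  }
  return null;
}

// Example: one provider is disabled while every other route continues.
const switchState: KillSwitchState = {
  allTraffic: false,
  providers: ['google'],
  tenants: [],
  serviceAccounts: [],
};
const googleBlock = blockedBy(switchState, { provider: 'google', tenant: 'org-celeris', serviceAccount: null });
const openaiBlock = blockedBy(switchState, { provider: 'openai', tenant: 'org-celeris', serviceAccount: null });
```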

Use cases

Built for every AI workload.

From internal copilots to agentic systems, the gateway adapts to your auth model, routing needs, and security requirements.

Internal enterprise copilots

Per-user identity and model access by role. Spend visibility by team. Full audit trail by user.

Caller: Employees via copilot UI
Auth: Per-user SSO tokens
Routing: Model access by role, cost-optimized routing
Controls: Per-user spend limits, model allowlists

Backend AI workloads

Service-account auth with retries, fallbacks, quotas, and budget enforcement. Region-aware provider routing.

Caller: Backend services and jobs
Auth: Service account credentials
Routing: Cost-optimized with region affinity
Controls: Rate limits, retries, budget caps

Agentic applications

Tool-using requests route through gateway into enclaves. Isolated execution. Outbound actions remain governed.

Caller: AI agents with tool access
Auth: Workspace + enclave context
Routing: Enclave-backed for tool use
Controls: Enclave isolation, tool restrictions

Multi-tenant AI platforms

Tenant-aware limits, policy packs per customer, usage metering, and model portfolio management.

Caller: SaaS platform customers
Auth: Tenant-scoped tokens
Routing: Per-tenant model portfolios
Controls: Tenant quotas, usage metering

Security-sensitive automation

Dangerous tasks forced into enclaves. Kill switch for incidents. Strict destination restrictions. Full logging.

Caller: Automated pipelines
Auth: Service + enclave identity
Routing: Enclave-required for risky ops
Controls: Kill switch, strict allowlists, audit
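
A strict destination allowlist could look something like the sketch below, which accepts exact hostnames and `*.suffix` wildcards and denies everything else. The pattern syntax, the function names, and the example hosts are all assumptions for illustration; the page does not describe how the gateway expresses allowlists.

```typescript
// Sketch only: deny-by-default host allowlist with optional '*.suffix' wildcards.
function matchesPattern(host: string, pattern: string): boolean {
  const h = host.toLowerCase();
  const p = pattern.toLowerCase();
  if (p.slice(0, 2) === '*.') {
    // '*.example.test' matches any subdomain but not the bare domain itself.
    const suffix = p.slice(1);
    return h.length > suffix.length && h.slice(h.length - suffix.length) === suffix;
  }
  return h === p;
}

// A destination is allowed only when at least one pattern matches.
function isDestinationAllowed(host: string, allowlist: string[]): boolean {
  for (const pattern of allowlist) {
    if (matchesPattern(host, pattern)) {
      return true;
    }
  }
  return false;
}

// Placeholder hosts for an automated pipeline running inside an enclave.
const pipelineAllowlist = ['api.example.test', '*.internal.example.test'];
const allowedCall = isDestinationAllowed('vllm.internal.example.test', pipelineAllowlist);
const deniedCall = isDestinationAllowed('unknown.example.org', pipelineAllowlist);
```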
Differentiation

More than a proxy. A control plane.

Many gateways offer routing and observability. Celeris adds enclave-aware execution, unified identity, security policy, and trust-boundary control.

Capabilities compared: Provider routing, Model fallbacks, Caching, Rate limiting, Per-user auth, Service-account auth, Unified audit trail, Kill switch, Enclave integration, Workload isolation, Policy-aware tool execution, Trust-boundary enforcement

Solutions compared: Direct provider SDKs, OpenAI-compatible proxy, Observability-only tools, Routing-only gateways, Enclave-only isolation, Celeris AI Gateway

Only Celeris combines routing, identity, policy, observability, and enclave integration in one gateway.

Legend: Supported / Not available
Get started

Build your AI control plane on Celeris.

Secure model traffic, govern usage, and connect AI workloads to enclaves in one unified layer.