One endpoint, all your models

The AI Gateway for developers.

Route to hundreds of AI models through a single endpoint. Built-in failovers, per-user auth, cost controls, and enclave-aware execution.


Use it with OpenAI, Anthropic, Google, xAI, Meta, Mistral, Cohere, DeepSeek, and 100+ more

openai/gpt-4o

import { streamText } from 'ai'

const result = streamText({
  model: 'openai/gpt-4o',
  prompt: 'Why is the sky blue?'
})

Models

One API key. Hundreds of models.

Unified billing and observability across your entire AI stack. Text, image, code, and embedding models from every major provider.

Text

Access the latest from every major model lab. Power your AI features through a single endpoint.

Image

Generate and edit with best-in-class image models. DALL-E, Imagen, Stable Diffusion, and more.

Code

Specialized code generation models. Codestral, Code Llama, and OpenAI code-optimized variants.

Embeddings

Vector embeddings for search, retrieval, and classification. OpenAI, Cohere, and Voyage models.

Supported providers (29+ models)

OpenAI

GPT-4o

GPT-4o mini

GPT-4.5

o4-mini

DALL-E 3

Whisper

Anthropic

Claude 4 Opus

Claude 4 Sonnet

Claude 3.5 Haiku

Google

Gemini 2.5 Pro

Gemini 2.5 Flash

Gemini 2.0 Flash

Imagen 3

One endpoint, all your models

Unified billing and observability across your entire AI stack. Switch providers by changing a string, not your architecture.

Primary: Claude 4 (down)

Fallback: GPT-4o (active)

Fallback: Gemini (standby)

Failover completed in 120 ms

Built-in failovers, better uptime

Automatic fallbacks during provider outages. Your app stays up even when a model goes down. Configurable retry chains.
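As a rough illustration of a fallback chain, the sketch below walks an ordered list of routes and picks the first one whose provider is healthy. The `Route` type, the `selectRoute` function, and the model identifiers other than `openai/gpt-4o` are hypothetical names used for illustration; they are not the Celeris configuration format.

```typescript
// Hypothetical types: these names are illustrative, not the Celeris API.
type ProviderStatus = "healthy" | "down";

interface Route {
  model: string; // e.g. "openai/gpt-4o"
}

// Walk the chain in priority order and return the first route whose
// provider is currently healthy. Returns undefined if every route is down.
function selectRoute(
  chain: Route[],
  status: Record<string, ProviderStatus>,
): Route | undefined {
  return chain.find((route) => status[route.model] === "healthy");
}

// Assumed chain: primary first, then fallbacks in order.
const chain: Route[] = [
  { model: "anthropic/claude-4" },
  { model: "openai/gpt-4o" },
  { model: "google/gemini" },
];

const active = selectRoute(chain, {
  "anthropic/claude-4": "down",
  "openai/gpt-4o": "healthy",
  "google/gemini": "healthy",
});

console.log(active?.model); // openai/gpt-4o
```

Keeping the chain as ordered data means the retry order can change without touching application code.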

Identity: Per-user + service auth

Policy: Model governance + limits

Enclaves: Isolated tool execution

Kill switch: Instant revocation

More than a proxy. A control plane.

Per-user auth, service-account auth, policy enforcement, kill switch, and deep enclave integration. The secure layer for AI traffic.

Request lifecycle

Every request. One governed path.

From inbound request to audit entry, the gateway controls identity, policy, routing, and observability at every stage.

Step 01

Client sends request

An app, agent, service, or enclave sends an AI model request to the gateway endpoint.

Gateway + Enclaves

The gateway governs. Enclaves isolate.

Simple prompts route to providers. Tool-using agents route to enclaves. Outbound requests from enclaves re-enter the policy path. Together, they form the control plane and trust boundary for agentic systems.

Straightforward model calls go directly to a provider through the gateway. Auth, policy, and audit still apply.

Source: Chat app

Celeris AI Gateway: Auth, Policy, Route

Provider: OpenAI

Unified audit. Shared policy. One kill switch.
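The split between provider routing and enclave routing could be sketched as a single decision function: requests that carry tools go to an enclave, everything else goes straight to a provider. The request shape and the `chooseTarget` function are assumptions made for this example, not the gateway's actual interface.

```typescript
// Hypothetical request shape; field names are assumptions.
interface GatewayRequest {
  model: string;
  prompt: string;
  tools?: string[]; // names of tools the agent is allowed to call
}

type Target =
  | { kind: "provider"; model: string }
  | { kind: "enclave"; model: string; tools: string[] };

// Simple prompts route to a provider; tool-using requests route to an enclave.
function chooseTarget(req: GatewayRequest): Target {
  const tools = req.tools ?? [];
  if (tools.length > 0) {
    return { kind: "enclave", model: req.model, tools };
  }
  return { kind: "provider", model: req.model };
}

const chat = chooseTarget({ model: "openai/gpt-4o", prompt: "Why is the sky blue?" });
console.log(chat.kind); // provider
```

In either case the request has already passed auth and policy by the time this decision is made, which is the point of putting both paths behind one gateway.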

Identity-aware

Every request has an identity.

Per-user tokens, service accounts, and enclave identities. The gateway knows who is calling, what they can access, and how much they can spend.

Per-user auth

Every request carries user identity. Limits, model access, and audit are attributed to a real person.

Use case

Internal copilots, chat apps, user-facing AI products

Attached claims

user_id: u-3847

org_id: org-celeris

role: developer

spend_limit: $50/day

models: gpt-4o, claude-4
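To show how claims like these might drive a policy decision, here is a minimal sketch that checks a model allowlist and a daily spend limit. The `UserClaims` fields mirror the claims listed above, but the TypeScript names, the `authorize` function, and the way spend is tracked are assumptions for illustration only.

```typescript
// Hypothetical claim shape modeled on the example claims above.
interface UserClaims {
  userId: string;
  orgId: string;
  role: string;
  spendLimitUsdPerDay: number;
  models: string[];
}

type Decision = { allowed: true } | { allowed: false; reason: string };

// Deny if the model is not on the allowlist or if the request would push
// the user's spend for the day past the limit.
function authorize(
  claims: UserClaims,
  model: string,
  spentTodayUsd: number,
  estimatedCostUsd: number,
): Decision {
  if (!claims.models.includes(model)) {
    return { allowed: false, reason: `model ${model} is not in the allowlist` };
  }
  if (spentTodayUsd + estimatedCostUsd > claims.spendLimitUsdPerDay) {
    return { allowed: false, reason: "daily spend limit exceeded" };
  }
  return { allowed: true };
}

const claims: UserClaims = {
  userId: "u-3847",
  orgId: "org-celeris",
  role: "developer",
  spendLimitUsdPerDay: 50,
  models: ["gpt-4o", "claude-4"],
};

console.log(authorize(claims, "gpt-4o", 10, 0.5).allowed); // true
console.log(authorize(claims, "gpt-4o-mini", 0, 0.01).allowed); // false
```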

Service-account auth

Workloads and backends authenticate as services. Policies scope by service role.

Use case

Batch jobs, background agents, integrations, scheduled workflows

Attached claims

service_id: svc-batch-runner

team: ml-ops

env: production

rate_limit: 500 req/min

route_profile: cost-optimized
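A per-service limit such as 500 req/min can be approximated with a fixed-window counter. The sketch below is one simple way to do it, assuming a caller-supplied clock so the behavior is easy to test; the `FixedWindowLimiter` class is hypothetical, and the gateway may well use a different algorithm (sliding window, token bucket).

```typescript
// Fixed-window rate limiter: at most `limit` requests per `windowMs`.
// Hypothetical class for illustration; not the gateway's implementation.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private windowStart = 0;
  private count = 0;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request at time `nowMs` is admitted.
  tryAcquire(nowMs: number): boolean {
    if (nowMs - this.windowStart >= this.windowMs) {
      // A new window begins: reset the counter.
      this.windowStart = nowMs;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count += 1;
    return true;
  }
}

// 600 requests arriving within the same window: only 500 are admitted.
const limiter = new FixedWindowLimiter(500, 60_000);
let accepted = 0;
for (let i = 0; i < 600; i++) {
  if (limiter.tryAcquire(1_000)) accepted++;
}
console.log(accepted); // 500
```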

Enclave identity

Enclaves have execution identities. Requests are tagged with enclave, session, and workload metadata for full traceability.

Use case

Tool-using agents, isolated code execution, risky automation

Attached claims

enclave_id: enc-17

session_id: sess-4921

workspace_id: ws-92

policy_set: strict-tool-use

origin: agent-workflow
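One way to picture the three identity types side by side is a discriminated union, with a helper that turns any caller into audit tags. The `Identity` type and `auditTags` function are illustrative assumptions; only the claim names come from the examples above.

```typescript
// Hypothetical union of the three caller identities described above.
type Identity =
  | { kind: "user"; userId: string; orgId: string }
  | { kind: "service"; serviceId: string; team: string }
  | { kind: "enclave"; enclaveId: string; sessionId: string; workspaceId: string };

// Build the tags an audit entry would carry for this caller.
function auditTags(id: Identity): Record<string, string> {
  switch (id.kind) {
    case "user":
      return { principal: id.userId, org_id: id.orgId };
    case "service":
      return { principal: id.serviceId, team: id.team };
    case "enclave":
      return {
        principal: id.enclaveId,
        session_id: id.sessionId,
        workspace_id: id.workspaceId,
      };
  }
}

const tags = auditTags({
  kind: "enclave",
  enclaveId: "enc-17",
  sessionId: "sess-4921",
  workspaceId: "ws-92",
});
console.log(tags.principal); // enc-17
```

Because every identity resolves to a `principal`, audit entries stay comparable across users, services, and enclaves.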

Routing + governance

Policy-driven model governance.

Route across providers, enforce per-tenant policy, manage fallback chains, cache responses, and control spend. All from one configuration layer.

Provider selection by policy

Route by region, model capability, cost, or workload type

Primary/secondary failover

Automatic fallback when primary provider is unavailable

Canary model rollout

Route a percentage of traffic to a new model for evaluation

Conditional routing

Route based on request metadata, user role, or tenant config

Internal model endpoints

Route to self-hosted or private models alongside external providers
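Canary rollout is commonly implemented by hashing a stable key into a bucket from 0 to 99 and comparing it with the rollout percentage. The sketch below assumes the key is a user ID, so a given user always lands on the same model; `bucketOf` and `pickModel` are hypothetical helpers, and the hash is an FNV-1a style hash over UTF-16 code units chosen only for brevity.

```typescript
// Map a key to a stable bucket in [0, 99] using an FNV-1a style hash.
function bucketOf(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

// Send roughly `canaryPercent` percent of keys to the canary model.
function pickModel(
  key: string,
  stable: string,
  canary: string,
  canaryPercent: number,
): string {
  return bucketOf(key) < canaryPercent ? canary : stable;
}

// The same user always gets the same answer for a given percentage.
const first = pickModel("u-3847", "openai/gpt-4o", "openai/gpt-4o-mini", 10);
const second = pickModel("u-3847", "openai/gpt-4o", "openai/gpt-4o-mini", 10);
console.log(first === second); // true
```

Raising `canaryPercent` only ever moves keys from the stable model to the canary, so users already on the canary stay there as the rollout widens.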

Observability

Cost, latency, failures. One dashboard.

See every request, every token, every dollar. Track provider health, cache hit rates, fallback events, and policy decisions in real time.

Gateway Dashboard

Live

Metrics: Requests (24h), Tokens (M), Spend (24h), Cache hit rate

Provider health

OpenAI: 142 ms, healthy

Anthropic: 198 ms, healthy

Google: 340 ms, degraded

Internal: 24 ms, healthy

Recent events

2s ago, fallback: claude-4-opus → gpt-4o (timeout)

14s ago, cache-hit: Semantic match, saved $0.012

31s ago, rate-limit: svc-analytics throttled (500/min)

1m ago, policy-deny: user u-9201 denied gpt-4o-mini

2m ago, circuit-break: Google Vertex paused (error spike)
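The circuit-break event above suggests a breaker that pauses a provider after repeated errors. As a sketch under stated assumptions, the class below opens after a fixed number of consecutive failures and closes again after a cooldown. The `CircuitBreaker` name, the thresholds, and the omission of a half-open probing state are simplifications for this example, not a description of how the gateway detects an error spike.

```typescript
// Simplified circuit breaker: opens after `threshold` consecutive failures,
// then allows traffic again once `cooldownMs` has elapsed.
class CircuitBreaker {
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private failures = 0;
  private openedAt: number | null = null;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // True if requests may be sent to the provider at time `nowMs`.
  canRequest(nowMs: number): boolean {
    if (this.openedAt === null) return true;
    if (nowMs - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: close the circuit and start counting again.
      this.openedAt = null;
      this.failures = 0;
      return true;
    }
    return false;
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = nowMs;
  }
}

const breaker = new CircuitBreaker(3, 30_000);
breaker.recordFailure(0);
breaker.recordFailure(1);
breaker.recordFailure(2);
console.log(breaker.canRequest(3)); // false
console.log(breaker.canRequest(30_002)); // true
```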

Kill switch

One switch. Every route severed.

Shut down all model traffic instantly. Disable specific routes, freeze tenants, revoke tokens, and force fallback to safe mode. Every revocation is logged.

Gateway Control

ALL ROUTES ACTIVE

Route	Traffic	Status
OpenAI GPT-4o (provider)	2,847 req/h	active
Anthropic Claude (provider)	1,923 req/h	active
Internal vLLM (internal)	892 req/h	active
Enclave Workflow (enclave)	341 req/h	active

4 active routes

6.0K/h total traffic


Granular control

All traffic

Sever every outbound route instantly

Specific provider

Disable one provider while others continue

Tenant / workspace

Freeze a single tenant without affecting others

Service account

Revoke a specific service identity
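The four scopes above can be sketched as a small in-memory kill switch that records every revocation. Everything here (the `Scope` union, the `KillSwitch` class, the `actor` argument, the log format) is a hypothetical illustration of the idea, not the Celeris control API.

```typescript
// Hypothetical scopes matching the granular controls listed above.
type Scope =
  | { kind: "all" }
  | { kind: "provider"; id: string }
  | { kind: "tenant"; id: string }
  | { kind: "service"; id: string };

interface RequestContext {
  provider: string;
  tenantId: string;
  serviceId?: string;
}

class KillSwitch {
  private allSevered = false;
  private readonly providers = new Set<string>();
  private readonly tenants = new Set<string>();
  private readonly services = new Set<string>();
  readonly auditLog: string[] = [];

  // Sever a scope and record who did it.
  sever(scope: Scope, actor: string): void {
    switch (scope.kind) {
      case "all":
        this.allSevered = true;
        break;
      case "provider":
        this.providers.add(scope.id);
        break;
      case "tenant":
        this.tenants.add(scope.id);
        break;
      case "service":
        this.services.add(scope.id);
        break;
    }
    const target = scope.kind === "all" ? "all" : `${scope.kind}:${scope.id}`;
    this.auditLog.push(`${actor} severed ${target}`);
  }

  // A request is blocked if any scope it falls under has been severed.
  isBlocked(ctx: RequestContext): boolean {
    return (
      this.allSevered ||
      this.providers.has(ctx.provider) ||
      this.tenants.has(ctx.tenantId) ||
      (ctx.serviceId !== undefined && this.services.has(ctx.serviceId))
    );
  }
}

const killSwitch = new KillSwitch();
killSwitch.sever({ kind: "provider", id: "openai" }, "admin");
console.log(killSwitch.isBlocked({ provider: "openai", tenantId: "t-1" })); // true
console.log(killSwitch.isBlocked({ provider: "anthropic", tenantId: "t-1" })); // false
```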

Use cases

Built for every AI workload.

From internal copilots to agentic systems, the gateway adapts to your auth model, routing needs, and security requirements.

Internal enterprise copilots

Per-user identity and model access by role. Spend visibility by team. Full audit trail by user.

Caller: Employees via copilot UI

Auth: Per-user SSO tokens

Routing: Model access by role, cost-optimized routing

Controls: Per-user spend limits, model allowlists

Backend AI workloads

Service-account auth with retries, fallbacks, quotas, and budget enforcement. Region-aware provider routing.

Caller: Backend services and jobs

Auth: Service account credentials

Routing: Cost-optimized with region affinity

Controls: Rate limits, retries, budget caps

Agentic applications

Tool-using requests route through gateway into enclaves. Isolated execution. Outbound actions remain governed.

Caller: AI agents with tool access

Auth: Workspace + enclave context

Routing: Enclave-backed for tool use

Controls: Enclave isolation, tool restrictions

Multi-tenant AI platforms

Tenant-aware limits, policy packs per customer, usage metering, and model portfolio management.

Caller: SaaS platform customers

Auth: Tenant-scoped tokens

Routing: Per-tenant model portfolios

Controls: Tenant quotas, usage metering

Security-sensitive automation

Dangerous tasks forced into enclaves. Kill switch for incidents. Strict destination restrictions. Full logging.

Caller: Automated pipelines

Auth: Service + enclave identity

Routing: Enclave-required for risky ops

Controls: Kill switch, strict allowlists, audit

Differentiation

More than a proxy. A control plane.

Many gateways offer routing and observability. Celeris adds enclave-aware execution, unified identity, security policy, and trust-boundary control.

Capabilities compared: Provider routing, Model fallbacks, Caching, Rate limiting, Per-user auth, Service-account auth, Unified audit trail, Kill switch, Enclave integration, Workload isolation, Policy-aware tool execution, Trust-boundary enforcement

Solutions compared: Direct provider SDKs, OpenAI-compatible proxy, Observability-only tools, Routing-only gateways, Enclave-only isolation, Celeris AI Gateway

Only Celeris combines routing, identity, policy, observability, and enclave integration in one gateway.


Get started

Build your AI control plane on Celeris.

Secure model traffic, govern usage, and connect AI workloads to enclaves in one unified layer.