
Routing

Routing is configured visually in the console. You build a route as a fallback chain: each tier can be a single backend, a load-balanced pool, a conditional branch, or another model group.
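Routes are built in the console rather than in code, but it can help to picture the fallback chain as a small data structure. The sketch below is only a mental model: every class and backend name in it (`Backend`, `Pool`, `Branch`, `ModelGroupRef`, `Route`, `plan-a`, and so on) is hypothetical and does not reflect how One AI Gateway stores routes.

```python
from dataclasses import dataclass, field
from typing import Callable, Union

# Hypothetical model of a route; not the gateway's actual storage format.

@dataclass
class Backend:
    """A single backend."""
    name: str

@dataclass
class Pool:
    """A load-balanced pool of backends."""
    members: list[Backend]

@dataclass
class Branch:
    """A conditional branch: pick a tier based on request metadata."""
    condition: Callable[[dict], bool]
    if_true: "Tier"
    if_false: "Tier"

@dataclass
class ModelGroupRef:
    """A reference to another model group."""
    group: str

Tier = Union[Backend, Pool, Branch, ModelGroupRef]

@dataclass
class Route:
    """An ordered fallback chain: earlier tiers are tried first."""
    tiers: list[Tier] = field(default_factory=list)

route = Route(tiers=[
    Pool([Backend("plan-a"), Backend("plan-b")]),  # tier 1: pooled plan backends
    Backend("metered-primary"),                    # tier 2: single backend
    ModelGroupRef("backup-group"),                 # tier 3: another model group
])
```

In this model, the position of a tier in `route.tiers` is its fallback priority, which mirrors the top-to-bottom order of tiers in a chain.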

Routing editor

What Routing Controls

Fallback Order

Choose the order in which One AI Gateway tries backends when a provider fails, times out, or reaches a configured threshold such as plan quota, budget, or concurrency.
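Conceptually, fallback order behaves like a loop that tries each tier in turn and moves on when one fails. The following is a minimal sketch of that idea, not the gateway's implementation; `BackendError`, `call_with_fallback`, and the two stand-in backends are hypothetical names used only for illustration.

```python
from typing import Callable

class BackendError(Exception):
    """Hypothetical stand-in for a provider failure or timeout."""

def call_with_fallback(
    tiers: list[tuple[str, Callable[[str], str]]],
    prompt: str,
) -> tuple[str, str]:
    """Try each (name, send) tier in order; return the first success."""
    errors = []
    for name, send in tiers:
        try:
            return name, send(prompt)
        except BackendError as exc:
            # Record the failure and fall through to the next tier.
            errors.append(f"{name}: {exc}")
    raise BackendError("all tiers failed: " + "; ".join(errors))

def flaky(prompt: str) -> str:
    raise BackendError("timeout")

def healthy(prompt: str) -> str:
    return f"echo: {prompt}"

used, reply = call_with_fallback([("primary", flaky), ("backup", healthy)], "hi")
```

Here the request lands on `backup` because `primary` raised an error, which is the behavior the fallback order is meant to guarantee.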

Load Balancing

Split traffic across several compatible backends to use plan capacity or distribute provider load.
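One common way to split traffic is weighted random selection. The sketch below assumes that approach purely for illustration; the documentation does not state which balancing strategy the gateway uses, and the backend names and weights are made up.

```python
import random

def pick_backend(weights: dict[str, float], rng: random.Random) -> str:
    """Pick one backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Hypothetical pool: plan-a should receive roughly three times the traffic of plan-b.
weights = {"plan-a": 3, "plan-b": 1}
rng = random.Random(0)  # seeded so the simulation is repeatable

counts = {name: 0 for name in weights}
for _ in range(10_000):
    counts[pick_backend(weights, rng)] += 1
```

Over many requests the share of traffic per backend approaches the ratio of the weights, which is how a pool can be tuned to match each plan's capacity.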

Conditional Branching

Send requests to different paths by user tier, workload type, model need, or product metadata.
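A conditional branch can be thought of as a function from request metadata to a path. In this sketch the metadata keys (`user_tier`, `workload`) and path names are assumptions chosen for the example, not fields defined by the gateway.

```python
def choose_path(metadata: dict) -> str:
    """Return the name of the path a request should follow."""
    # Paid users get the higher-capability path.
    if metadata.get("user_tier") == "paid":
        return "premium-pool"
    # Routine batch work goes to a cheaper path.
    if metadata.get("workload") == "batch":
        return "low-cost-pool"
    # Everything else uses the default path.
    return "default-pool"

paid_path = choose_path({"user_tier": "paid", "workload": "batch"})
batch_path = choose_path({"user_tier": "free", "workload": "batch"})
other_path = choose_path({})
```

Conditions are checked in order, so a paid user running a batch job still takes the premium path in this example; reorder the checks if workload type should win instead.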

Safety Triggers

Move to the next tier based on errors, plan quota, budget, or live concurrency.
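The four trigger types can be pictured as threshold checks against a tier's live status. The sketch below is illustrative only: the field names, default limits, and units are assumptions, and the real thresholds are set in the console.

```python
from dataclasses import dataclass

@dataclass
class TierStatus:
    """Hypothetical live measurements for one tier."""
    error_rate: float   # fraction of recent requests that failed
    quota_used: float   # fraction of plan quota consumed
    spend_usd: float    # spend in the current budget period
    in_flight: int      # requests currently being served

@dataclass
class TriggerLimits:
    """Hypothetical thresholds; the values are placeholders."""
    max_error_rate: float = 0.5
    max_quota_used: float = 0.95
    budget_usd: float = 100.0
    max_in_flight: int = 20

def tripped_triggers(status: TierStatus, limits: TriggerLimits) -> list[str]:
    """List which triggers would move traffic to the next tier."""
    reasons = []
    if status.error_rate > limits.max_error_rate:
        reasons.append("errors")
    if status.quota_used >= limits.max_quota_used:
        reasons.append("plan quota")
    if status.spend_usd >= limits.budget_usd:
        reasons.append("budget")
    if status.in_flight >= limits.max_in_flight:
        reasons.append("live concurrency")
    return reasons

status = TierStatus(error_rate=0.02, quota_used=0.97, spend_usd=12.5, in_flight=4)
reasons = tripped_triggers(status, TriggerLimits())
```

With these placeholder numbers only the plan quota trigger fires, so traffic would move to the next tier even though the backend itself is healthy.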

Common Routing Patterns

Plan First

Put coding-plan backends in the first tier, then add a metered backend as the next tier so users can keep working when plan quota is nearly exhausted.

Provider Resilience

Use a strong primary provider first, then add backup providers for rate limits, outages, or timeouts.

Cost Control

Send routine workloads to a lower-cost pool and reserve premium models for high-value or complex tasks.

Customer Tiering

Use conditional branches to give paid users higher-capability models while keeping free users on efficient defaults.

Review Before Publishing

After saving a route, open the model group detail page and confirm that fallback markers, quota meters, spend, error rate, and latency match your expectations.
