Routing
Routing is configured visually in the console. You build a route as a fallback chain: each tier can be a single backend, a load-balanced pool, a conditional branch, or another model group.
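Although routes are built in the console, it can help to picture the fallback chain as a small data structure. The sketch below is only an illustrative model, not the gateway's API or export format: the class names (Backend, Pool, Branch, GroupRef, Route) and the backend names are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Backend:
    """A single backend (hypothetical name only)."""
    name: str


@dataclass
class Pool:
    """A load-balanced pool of backends."""
    members: List[Backend]


@dataclass
class Branch:
    """A conditional branch; `condition` is a free-form label in this sketch."""
    condition: str
    then: "Tier"
    otherwise: "Tier"


@dataclass
class GroupRef:
    """A reference to another model group by name."""
    group: str


# A tier is any one of the four kinds described above.
Tier = Union[Backend, Pool, Branch, GroupRef]


@dataclass
class Route:
    """An ordered fallback chain: tiers are tried first to last."""
    tiers: List[Tier] = field(default_factory=list)


def describe(route: Route) -> List[str]:
    """Return the kind of each tier, in fallback order."""
    return [type(tier).__name__ for tier in route.tiers]


# Hypothetical route: a plan pool first, then a metered backend, then another group.
route = Route(tiers=[
    Pool([Backend("plan-a"), Backend("plan-b")]),
    Backend("metered-primary"),
    GroupRef("shared-fallback"),
])
```

Calling describe(route) lists the tier kinds in the order they would be tried.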
What Routing Controls
Fallback Order
Choose the order in which One AI Gateway tries backends when a provider fails, times out, or reaches a configured threshold such as plan quota, budget, or live concurrency.
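Conceptually, fallback order behaves like a loop that tries each tier in turn and moves on when one fails. The following is a minimal sketch of that idea, assuming each tier can be represented as a plain callable; BackendError, call_with_fallback, and the tier names are hypothetical and do not reflect how the gateway is implemented.

```python
class BackendError(Exception):
    """Raised by a hypothetical backend call on failure."""


def call_with_fallback(tiers, request):
    """Try each (name, call) tier in order; return (name, response) from the first success."""
    failed = []
    for name, call in tiers:
        try:
            return name, call(request)
        except (BackendError, TimeoutError):
            # This tier failed or timed out, so fall through to the next one.
            failed.append(name)
    raise BackendError(f"all tiers failed: {failed}")


def flaky(request):
    # Stand-in for a primary provider that times out.
    raise TimeoutError("primary timed out")


def healthy(request):
    # Stand-in for a backup provider that answers normally.
    return f"ok: {request}"


used, response = call_with_fallback(
    [("primary", flaky), ("backup", healthy)],
    "hello",
)
```

Here the primary tier times out, so the request is served by the backup tier.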
Load Balancing
Split traffic across several compatible backends to make full use of plan capacity or to distribute load across providers.
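One common way to split traffic is weighted random selection. The sketch below illustrates that general technique only; the source does not say which balancing strategy the gateway uses, and the weights, backend names, and pick_backend helper are assumptions for this example.

```python
import random
from collections import Counter


def pick_backend(weights, rng):
    """Pick one backend name at random, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical pool: plan-a should receive roughly three times the traffic of plan-b.
weights = {"plan-a": 3, "plan-b": 1}

rng = random.Random(0)  # seeded so the simulation is repeatable
counts = Counter(pick_backend(weights, rng) for _ in range(10_000))
```

Over many requests, the share of traffic each backend receives approaches its weight divided by the total weight.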
Conditional Branching
Send requests down different paths based on user tier, workload type, model requirements, or product metadata.
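A conditional branch can be thought of as a function from request metadata to a path. This is a minimal sketch under assumed names: the metadata keys (user_tier, workload) and the path names are hypothetical, and in practice the conditions are set in the console rather than written as code.

```python
def choose_path(metadata):
    """Return a hypothetical path name for a request's metadata."""
    if metadata.get("user_tier") == "paid":
        return "premium-path"
    if metadata.get("workload") == "batch":
        return "low-cost-path"
    # Anything that matches no condition takes the default path.
    return "default-path"


paid_path = choose_path({"user_tier": "paid", "workload": "chat"})
batch_path = choose_path({"user_tier": "free", "workload": "batch"})
other_path = choose_path({})
```

Conditions are checked in order, so a paid user running a batch workload still takes the premium path in this sketch.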
Safety Triggers
Move to the next tier based on errors, plan quota, budget, or live concurrency.
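The four trigger types can be pictured as checks against a tier's live state. The sketch below is illustrative only: the TierState fields, the default thresholds (5% error rate, 90% quota), and the tripped_triggers helper are assumptions, not values or names taken from the product.

```python
from dataclasses import dataclass


@dataclass
class TierState:
    """Hypothetical snapshot of one tier's live metrics."""
    error_rate: float      # fraction of recent requests that failed
    quota_used: float      # fraction of plan quota consumed
    spend: float           # spend so far in the budget period
    budget: float          # budget for the same period
    in_flight: int         # requests currently in progress
    max_concurrency: int   # allowed simultaneous requests


def tripped_triggers(state, max_error_rate=0.05, quota_limit=0.9):
    """Return the reasons, if any, to move on to the next tier."""
    reasons = []
    if state.error_rate > max_error_rate:
        reasons.append("errors")
    if state.quota_used >= quota_limit:
        reasons.append("plan quota")
    if state.spend >= state.budget:
        reasons.append("budget")
    if state.in_flight >= state.max_concurrency:
        reasons.append("concurrency")
    return reasons


# A tier that is healthy except for nearly exhausted plan quota.
near_quota = TierState(error_rate=0.01, quota_used=0.95, spend=10.0,
                       budget=50.0, in_flight=2, max_concurrency=8)
reasons = tripped_triggers(near_quota)
```

An empty list means the tier stays in use; any entry means the route would move to the next tier.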
Common Routing Patterns
Plan First
Put coding-plan backends in the first tier. Add a metered backend as the next tier so users can keep working when plan quota is nearly exhausted.
Provider Resilience
Use a strong primary provider first, then add backup providers for rate limits, outages, or timeouts.
Cost Control
Send routine workloads to a lower-cost pool and reserve premium models for high-value or complex tasks.
Customer Tiering
Use conditional branches to give paid users higher-capability models while keeping free users on efficient defaults.
Review Before Publishing
After saving a route, open the model group detail page and confirm that fallback markers, quota meters, spend, error rate, and latency match your expectations.