Billing and Plans

Monitor billing from the Usage and Billing areas of the console. Together, these areas break spend down by product workload, provider account, model group, and API key.

What You Can Track

  • Request volume.
  • Input, output, and total tokens.
  • Spend and credit consumption.
  • Error rate and latency.
  • Usage by API key, model group, backend, or model.
  • Coding-plan quota and reset windows where supported.

How to Read Usage

Use model-group views to understand what each product workload costs. Use backend views to spot pressure on a provider account, such as low balance, rate limits, or plan exhaustion. Use API-key views to attribute traffic to applications, environments, or customers.
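
The three views amount to grouping the same request data by a different attribute. The sketch below illustrates that idea offline in Python. It assumes a hypothetical record shape (`UsageRecord` and its field names are illustrative and are not the gateway's export schema), so treat it as a way to reason about the views rather than as a description of how the console computes them.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record shape: the field names are assumptions for illustration,
# not the gateway's actual export format.
@dataclass(frozen=True)
class UsageRecord:
    api_key: str
    model_group: str
    backend: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    is_error: bool


def summarize(records, dimension):
    """Group records by one attribute and total requests, tokens, spend, and errors."""
    totals = defaultdict(
        lambda: {"requests": 0, "tokens": 0, "cost_usd": 0.0, "errors": 0}
    )
    for record in records:
        row = totals[getattr(record, dimension)]
        row["requests"] += 1
        row["tokens"] += record.input_tokens + record.output_tokens
        row["cost_usd"] += record.cost_usd
        row["errors"] += int(record.is_error)
    # Error rate is errors divided by requests within each group.
    for row in totals.values():
        row["error_rate"] = row["errors"] / row["requests"]
    return dict(totals)


# Made-up sample data, only to exercise the function.
records = [
    UsageRecord("key-web", "chat-prod", "provider-a", 1000, 250, 0.5, False),
    UsageRecord("key-web", "chat-prod", "provider-b", 800, 200, 0.25, True),
    UsageRecord("key-batch", "summarize", "provider-a", 4000, 500, 1.0, False),
]

by_group = summarize(records, "model_group")  # product cost
by_backend = summarize(records, "backend")    # provider account pressure
by_key = summarize(records, "api_key")        # traffic attribution
```

Grouping by `model_group` answers "what does this product workload cost?", while grouping the same records by `backend` shows which provider account is absorbing the spend and the errors.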

Plans and Credits

One AI Gateway can show both gateway-level credits and provider-level plan or balance information. Credit-based backends show remaining spend balance when available. Coding-plan backends show quota windows and reset timing when the provider exposes them.
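
One way to picture the difference between the two backend kinds is as two small data shapes, each with fields that may be missing when the provider does not expose them. The following Python sketch is illustrative only: `CreditBackend`, `CodingPlanBackend`, their fields, and the generic quota "units" are all assumptions, not the gateway's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


# Hypothetical shapes: names and fields are assumptions for illustration.
@dataclass
class CreditBackend:
    name: str
    remaining_usd: Optional[float]  # None when the provider reports no balance


@dataclass
class CodingPlanBackend:
    name: str
    quota_used: Optional[int]       # None when the provider exposes no quota
    quota_limit: Optional[int]
    resets_at: Optional[datetime]   # None when reset timing is not exposed


def describe(backend, now):
    """Return a one-line status, degrading gracefully when data is missing."""
    if isinstance(backend, CreditBackend):
        if backend.remaining_usd is None:
            return f"{backend.name}: balance not reported"
        return f"{backend.name}: ${backend.remaining_usd:.2f} remaining"
    if backend.quota_used is None or backend.quota_limit is None:
        return f"{backend.name}: quota not reported"
    remaining = backend.quota_limit - backend.quota_used
    status = f"{backend.name}: {remaining} of {backend.quota_limit} units left"
    if backend.resets_at is None:
        return status
    # Whole minutes until the quota window resets, never negative.
    minutes = max(0, int((backend.resets_at - now).total_seconds() // 60))
    return f"{status}, resets in {minutes} min"


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
credit_line = describe(CreditBackend("provider-a", 12.5), now)
plan_line = describe(
    CodingPlanBackend("provider-b", 180, 200, now + timedelta(minutes=90)), now
)
hidden_line = describe(CodingPlanBackend("provider-c", None, None, None), now)
```

Keeping the optional fields explicit mirrors the "when available" and "when the provider exposes them" caveats: missing data is reported as missing rather than shown as zero.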

Operational Recommendations

  • Set a budget expectation for each production model group.
  • Review coding-plan quota after heavy coding-agent usage.
  • Watch backend-level errors before they become product-wide failures.
  • Use fallback routing for any provider account with limited quota.
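
The budget and fallback recommendations can be reasoned about with a small sketch. The Python below is an illustration under stated assumptions, not the gateway's routing logic: the `Backend` fields, the threshold defaults, and the helper names `pick_backend` and `near_budget` are all hypothetical.

```python
from dataclasses import dataclass


# Hypothetical backend snapshot: fields and thresholds are assumptions.
@dataclass
class Backend:
    name: str
    remaining_fraction: float  # share of quota or balance left, 0.0 to 1.0
    recent_error_rate: float   # share of recent requests that failed


def pick_backend(backends, min_remaining=0.05, max_error_rate=0.2):
    """Return the first backend, in priority order, that still has headroom.

    Returns None when every backend is exhausted or unhealthy.
    """
    for backend in backends:
        has_quota = backend.remaining_fraction > min_remaining
        is_healthy = backend.recent_error_rate < max_error_rate
        if has_quota and is_healthy:
            return backend
    return None


def near_budget(spend_usd, budget_usd, alert_ratio=0.8):
    """True once spend reaches the alert share of the expected budget."""
    return spend_usd >= budget_usd * alert_ratio


# Priority order: the limited-quota coding plan first, then a fallback.
backends = [
    Backend("coding-plan", remaining_fraction=0.02, recent_error_rate=0.0),
    Backend("credit-fallback", remaining_fraction=0.6, recent_error_rate=0.01),
]
chosen = pick_backend(backends)
```

In this example the coding-plan backend is nearly exhausted, so the selection falls through to `credit-fallback`. Checking the backend-level error rate in the same pass reflects the recommendation to catch backend problems before they affect the whole product.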
