
Plans & Limits

Every plan covers every chat, reasoning, code and embedding model in the catalog. Budget and rate limits are per-key and reset on their own windows.

| | Standard | Pro |
|---|---|---|
| Price | $30/mo | $60/mo |
| RPM (requests/min) | 200 | 400 |
| TPM (tokens/min) | 500,000 | 500,000 |
| Budget window | $0.22 per 8h | $0.67 per 8h |
| Concurrent requests per key | 1 | 1 |
| API keys per subscription | Unlimited | Unlimited |
| All models | Yes | Yes |

Annual plans use the monthly slug with -annual appended. The table below is illustrative; the authoritative source for current pricing is GET /api/plans.

| Slug | Example price | Effective monthly |
|---|---|---|
| standard-annual | ~$270/yr | ~$22.50/mo |
| pro-annual | ~$540/yr | ~$45/mo |

Rate limits on annual plans are identical to those on monthly plans. The annual flag only changes Stripe’s billing cycle and the upfront charge.
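The slug rule and the "effective monthly" column can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses the illustrative prices from the tables above; the function names are hypothetical, and live prices should still come from GET /api/plans.

```python
# Monthly prices from the plans table; annual prices are the illustrative
# examples from the annual table, not authoritative values.
MONTHLY_PRICE_USD = {"standard": 30.0, "pro": 60.0}
ANNUAL_EXAMPLE_USD = {"standard-annual": 270.0, "pro-annual": 540.0}


def annual_slug(monthly_slug: str) -> str:
    """Annual plans use the monthly slug with -annual appended."""
    return f"{monthly_slug}-annual"


def effective_monthly(annual_price_usd: float) -> float:
    """Spread the upfront annual charge over 12 months."""
    return annual_price_usd / 12


def annual_discount(monthly_slug: str) -> float:
    """Fraction saved versus paying the monthly price for 12 months."""
    full_year = MONTHLY_PRICE_USD[monthly_slug] * 12
    annual = ANNUAL_EXAMPLE_USD[annual_slug(monthly_slug)]
    return 1 - annual / full_year


for slug in MONTHLY_PRICE_USD:
    a = annual_slug(slug)
    print(a, f"${effective_monthly(ANNUAL_EXAMPLE_USD[a]):.2f}/mo",
          f"{annual_discount(slug):.0%} off")
```

With the example prices, both annual plans work out to a 25% discount over twelve monthly payments.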

All models are included in every plan. Use GET /v1/models for the live list.

| Provider | Model | Input $/M tokens | Output $/M tokens |
|---|---|---|---|
| Meta | Llama 3.3 70B | $0.120 | $0.300 |
| Meta | Llama 3.2 3B | $0.020 | $0.020 |
| DeepSeek | DeepSeek V3.2 | $0.260 | $0.380 |
| DeepSeek | DeepSeek R1 | $0.500 | $2.150 |
| Qwen | Qwen3 235B | $0.300 | $1.200 |
| Google | Gemma 3 27B | $0.100 | $0.200 |
| Moonshot | Kimi 2.5 | $0.350 | $0.700 |

Prices shown are our cost from the inference provider. With a subscription plan, you pay a flat monthly rate — not per-token charges.
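To get a feel for these numbers, the sketch below computes the provider cost of a single request from the listed per-million-token prices. The dictionary keys are display names from the table, not necessarily the IDs returned by GET /v1/models. The `requests_per_window` helper additionally assumes that the plan's budget window is consumed at these listed prices, which this page does not state explicitly.

```python
# (input, output) price in USD per million tokens, copied from the table above.
# Keys are display names, not necessarily the model IDs the API expects.
PRICES_PER_M = {
    "Llama 3.3 70B": (0.120, 0.300),
    "Llama 3.2 3B": (0.020, 0.020),
    "DeepSeek V3.2": (0.260, 0.380),
    "DeepSeek R1": (0.500, 2.150),
    "Qwen3 235B": (0.300, 1.200),
    "Gemma 3 27B": (0.100, 0.200),
    "Kimi 2.5": (0.350, 0.700),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Provider cost in USD for one request."""
    input_price, output_price = PRICES_PER_M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def requests_per_window(budget_usd: float, cost_per_request: float) -> int:
    """Whole requests that fit in one budget window.

    ASSUMPTION: the budget is drawn down at the listed per-token prices.
    """
    return int(budget_usd // cost_per_request)


cost = request_cost("DeepSeek R1", input_tokens=10_000, output_tokens=2_000)
print(f"${cost:.4f} per request")
print(requests_per_window(0.22, cost), "such requests per Standard 8h window")
```

Under that assumption, the gap between cheap and expensive models is large: the same budget window stretches much further on Llama 3.2 3B than on DeepSeek R1.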

Pick a specific model (DeepSeek V3.2, MiniMax M2.5, GLM 5.1, Gemma 4, etc.), reserve one or more 8-hour time blocks per day, and pay a flat monthly or annual rate. During your blocks the key runs at the full backend throughput and has no dollar budget cap; the only ceiling is 1 concurrent request per key. Outside your blocks the key’s rate limits drop to 0.

  • Annual billing is available per pool; each pool exposes its own discount on the list response. See the Unlimited Subscriptions API for the full subscribe flow and response shapes.
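Because a pool key's rate limits drop to 0 outside its reserved blocks, a client may want to check the clock before sending a request. The sketch below is one way to do that. It assumes blocks are described by their start hour in UTC and recur daily; this page does not specify how blocks are represented or which time zone applies, so treat `block_start_hours` as a hypothetical representation and check the Unlimited Subscriptions API for the real one.

```python
from datetime import datetime, timedelta, timezone

BLOCK_HOURS = 8  # block length stated in the text


def in_reserved_block(now: datetime, block_start_hours: list[int]) -> bool:
    """Return True if `now` falls inside any reserved 8-hour block.

    `now` must be timezone-aware. `block_start_hours` holds the UTC hour
    (0-23) at which each reserved block starts every day (an assumption).
    """
    now = now.astimezone(timezone.utc)
    for start_hour in block_start_hours:
        # Check today's block and yesterday's, since a block that starts
        # after 16:00 runs past midnight.
        for day_offset in (0, -1):
            start = now.replace(hour=start_hour, minute=0, second=0,
                                microsecond=0) + timedelta(days=day_offset)
            if start <= now < start + timedelta(hours=BLOCK_HOURS):
                return True
    return False


if __name__ == "__main__":
    # Hypothetical reservation: one block starting at 20:00 UTC.
    print(in_reserved_block(datetime.now(timezone.utc), [20]))
```

A check like this only avoids pointless requests; the server remains the authority on when a block is active.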

Don’t want a subscription? Top up credits starting at $10. Any amount accepted, no maximum.

Credit keys have a default limit of 600 RPM. Budget is consumed as you use the API and never resets; top up again when it is depleted. See the Credits guide.

| Method | For |
|---|---|
| Card (Stripe) | Monthly + annual subscriptions, pool subscriptions, credits |
| USDC on Base | Subscriptions (monthly) and credits |
| x402 | Agent subscriptions and credits (no human setup needed) |

All monthly subscriptions renew monthly; annual plans renew yearly. Cancel anytime — access continues to the end of the paid period.

Rate limits are enforced at the key level:

  • RPM: maximum requests per minute. Exceeding it returns HTTP 429.
  • TPM: maximum tokens per minute (input + output combined). Exceeding it returns HTTP 429.

Limits reset every minute. Different keys do not share limits — each key has its own independent counters.
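A client can reduce 429 responses by tracking the same two counters locally, one limiter per key. The following is a minimal sketch of a fixed one-minute window, not the server's actual algorithm: the class name is hypothetical, the local window starts when the limiter is created (the server's window alignment is not documented here), and the token count passed in has to be an estimate because output length is not known before the request.

```python
import time


class MinuteWindowLimiter:
    """Client-side RPM/TPM counters for a single API key."""

    def __init__(self, rpm: int, tpm: int, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self._clock = clock
        self._window_start = clock()
        self._requests = 0
        self._tokens = 0

    def _roll_window(self) -> None:
        # Reset both counters once 60 seconds have passed.
        now = self._clock()
        if now - self._window_start >= 60:
            self._window_start = now
            self._requests = 0
            self._tokens = 0

    def try_acquire(self, estimated_tokens: int) -> bool:
        """Reserve one request and its tokens; False means wait and retry."""
        self._roll_window()
        if (self._requests + 1 > self.rpm
                or self._tokens + estimated_tokens > self.tpm):
            return False
        self._requests += 1
        self._tokens += estimated_tokens
        return True


# Standard plan limits from the table: 200 RPM, 500,000 TPM.
limiter = MinuteWindowLimiter(rpm=200, tpm=500_000)
print(limiter.try_acquire(estimated_tokens=1_500))
```

Since each key has independent counters, create a separate limiter instance for every key rather than sharing one across keys. The client should still handle 429 responses, because local and server windows will not line up exactly.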