Plans & Limits
All-Models plans
Every plan covers every chat, reasoning, code and embedding model in the catalog. Budget and rate limits are per-key and reset on their own windows.
Monthly
| | Standard | Pro |
|---|---|---|
| Price | $30/mo | $60/mo |
| RPM (requests/min) | 200 | 400 |
| TPM (tokens/min) | 500,000 | 500,000 |
| Budget window | $0.22 per 8h | $0.67 per 8h |
| Concurrent requests per key | 1 | 1 |
| API keys per subscription | Unlimited | Unlimited |
| All models | Yes | Yes |
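Because each key allows only 1 concurrent request, a multi-threaded client may want to serialize calls that share a key. The following is a minimal sketch of one way to do that on the client side; `call_serialized` and the `send` callable are hypothetical names, not part of any official SDK.

```python
import threading
from collections import defaultdict

# One lock per API key, so different keys can still run in parallel.
_key_locks = defaultdict(threading.Lock)
_registry_lock = threading.Lock()


def call_serialized(api_key, send):
    """Run send() while holding the lock for api_key.

    `send` is any zero-argument callable that performs the HTTP request.
    Only one call per key is in flight at a time, matching the
    "Concurrent requests per key: 1" limit in the table above.
    """
    with _registry_lock:
        lock = _key_locks[api_key]  # create-or-fetch must itself be thread-safe
    with lock:
        return send()
```

Since unlimited keys are allowed per subscription, an alternative design is to give each worker its own key instead of sharing one behind a lock.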
Annual (save on commitment)
Annual plans use the monthly slug with -annual appended. The table below is illustrative; the authoritative source for current pricing is GET /api/plans.
| Slug | Example price | Effective monthly |
|---|---|---|
| standard-annual | ~$270/yr | ~$22.50/mo |
| pro-annual | ~$540/yr | ~$45/mo |
Rate limits on annual are identical to monthly — the annual flag only changes Stripe’s billing cycle and upfront charge.
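The slug rule and the "effective monthly" column can be reproduced with a few lines of arithmetic. This is a sketch using the illustrative prices from this page, not live pricing; the helper names are made up for the example.

```python
def annual_slug(monthly_slug: str) -> str:
    """Build the annual plan slug by appending "-annual" to the monthly slug."""
    return f"{monthly_slug}-annual"


def effective_monthly(annual_price: float) -> float:
    """Spread an annual price evenly over 12 months."""
    return round(annual_price / 12, 2)


# Example annual prices from the table above (illustrative only).
EXAMPLE_ANNUAL_PRICES = {"standard": 270.0, "pro": 540.0}

for slug, price in EXAMPLE_ANNUAL_PRICES.items():
    print(annual_slug(slug), f"${effective_monthly(price):.2f}/mo")
# prints:
# standard-annual $22.50/mo
# pro-annual $45.00/mo
```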
Available models
All models are included in every plan. Use GET /v1/models for the live list.
Chat models
| Provider | Model | Input $/M tokens | Output $/M tokens |
|---|---|---|---|
| Meta | Llama 3.3 70B | $0.120 | $0.300 |
| Meta | Llama 3.2 3B | $0.020 | $0.020 |
| DeepSeek | DeepSeek V3.2 | $0.260 | $0.380 |
| DeepSeek | DeepSeek R1 | $0.500 | $2.150 |
| Qwen | Qwen3 235B | $0.300 | $1.200 |
| Google | Gemma 3 27B | $0.100 | $0.200 |
| Moonshot | Kimi 2.5 | $0.350 | $0.700 |
Prices shown are our cost from the inference provider. With a subscription plan, you pay a flat monthly rate — not per-token charges.
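For a rough sense of scale, the sketch below estimates how many requests of a given size fit into one 8-hour budget window. It assumes the budget is debited at the per-token prices listed above, which this page implies but does not state outright; the function names and the 1,000-in / 500-out request size are illustrative.

```python
# Per-million-token prices (input, output) in USD, copied from the table above.
PRICES = {
    "Llama 3.3 70B": (0.120, 0.300),
    "DeepSeek R1": (0.500, 2.150),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request in USD at the listed per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def requests_per_window(model: str, input_tokens: int, output_tokens: int,
                        budget: float) -> int:
    """Whole requests that fit into one budget window (assumed debit model)."""
    return int(budget // request_cost(model, input_tokens, output_tokens))


# Standard plan: $0.22 per 8h window.
print(requests_per_window("Llama 3.3 70B", 1_000, 500, budget=0.22))  # 814
print(requests_per_window("DeepSeek R1", 1_000, 500, budget=0.22))    # 139
```

Under these assumptions the same window stretches much further on a cheaper model, so model choice matters as much as plan tier.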
Unlimited (dedicated model) plans
Pick a specific model (DeepSeek V3.2, MiniMax M2.5, GLM 5.1, Gemma 4, etc.), reserve one or more 8-hour time blocks per day, and pay a flat monthly or annual rate. During your blocks the key runs at the full backend throughput with no $ budget cap; the only ceiling is 1 concurrent request per key. Outside your blocks the key’s rate limits drop to 0.
- Annual billing is available per pool; each pool exposes its own discount on the list response. See the Unlimited Subscriptions API for the full subscribe flow and response shapes.
Credits (pay-as-you-go)
Don’t want a subscription? Top up credits starting at $10. Any amount from $10 up is accepted, with no maximum.
Credit keys have 600 RPM by default. Budget is consumed as you use the API and never resets — top up again when depleted. See Credits guide.
Payment methods
| Method | For |
|---|---|
| Card (Stripe) | Monthly + annual subscriptions, pool subscriptions, credits |
| USDC on Base | Subscriptions (monthly) and credits |
| x402 | Agent subscriptions and credits (no human setup needed) |
All monthly subscriptions renew monthly; annual plans renew yearly. Cancel anytime — access continues to the end of the paid period.
How rate limits work
Rate limits are enforced at the key level:
- RPM: Maximum requests per minute. Exceeding it returns 429.
- TPM: Maximum tokens per minute (input + output combined). Exceeding it returns 429.
Limits reset every minute. Different keys do not share limits — each key has its own independent counters.
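A client can mirror these per-key counters locally to avoid sending requests that would likely be rejected. The following is a minimal sketch, assuming a fixed one-minute window that starts at the key's first request; the server's actual window alignment is not specified here, and `ClientSideLimiter` is a hypothetical helper, not part of the API.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class KeyWindow:
    """Request and token counters for one API key within the current minute."""
    window_start: float
    requests: int = 0
    tokens: int = 0


@dataclass
class ClientSideLimiter:
    rpm: int  # max requests per minute, per key
    tpm: int  # max tokens per minute (input + output), per key
    windows: dict = field(default_factory=dict)

    def allow(self, key: str, tokens: int, now: Optional[float] = None) -> bool:
        """Return True and record the request if it fits this key's limits."""
        now = time.monotonic() if now is None else now
        window = self.windows.setdefault(key, KeyWindow(window_start=now))
        if now - window.window_start >= 60:
            # A minute has passed: reset this key's counters.
            window.window_start, window.requests, window.tokens = now, 0, 0
        if window.requests + 1 > self.rpm or window.tokens + tokens > self.tpm:
            return False  # the server would likely answer 429
        window.requests += 1
        window.tokens += tokens
        return True


# Standard plan limits from the table earlier on this page.
limiter = ClientSideLimiter(rpm=200, tpm=500_000)
if limiter.allow("key-a", tokens=1_500):
    pass  # safe to send the request for key-a
```

Output token counts are only known after a response arrives, so the `tokens` argument has to be an estimate; a real client should still handle 429 responses.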