Plans & Limits

| | Standard | Pro |
|---|---|---|
| Price | $20/mo | $60/mo |
| RPM (requests/min) | 60 | 200 |
| TPM (tokens/min) | 3,333 | 13,333 |
| Duration | 30 days | 30 days |
| All models | Yes | Yes |

Rate limits are enforced per key and reset every minute. Each API key has its own independent limits.
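To stay under these limits without hitting the server, a client can keep its own per-key counters. The sketch below is a minimal client-side guard, not the service's actual enforcement logic. It assumes fixed 60-second windows (the page only says limits reset every minute), and the class name `MinuteWindowLimiter` and its parameters are hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MinuteWindowLimiter:
    """Client-side guard for ONE API key: counts requests and tokens per window."""

    rpm: int                      # e.g. 60 on Standard, 200 on Pro
    tpm: int                      # e.g. 3333 on Standard, 13333 on Pro
    window_seconds: float = 60.0  # assumption: fixed one-minute windows
    _window_start: Optional[float] = field(default=None, init=False)
    _requests: int = field(default=0, init=False)
    _tokens: int = field(default=0, init=False)

    def _roll(self, now: float) -> None:
        # Start a fresh window on first use or once the current one has expired.
        if self._window_start is None or now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._requests = 0
            self._tokens = 0

    def try_acquire(self, tokens: int, now: Optional[float] = None) -> bool:
        """Reserve one request and `tokens` tokens; return False if over budget."""
        now = time.monotonic() if now is None else now
        self._roll(now)
        if self._requests + 1 > self.rpm or self._tokens + tokens > self.tpm:
            return False
        self._requests += 1
        self._tokens += tokens
        return True
```

Because each key has independent limits, a program using several keys would hold one limiter instance per key. The token count passed in is an estimate (input plus expected output), so the server may still reject a request the guard allowed.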

All models are included in every plan. Use GET /v1/models for the live list.
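As a rough illustration of calling that endpoint, the sketch below builds and sends a `GET /v1/models` request with the Python standard library. Only the path comes from this page. The host `https://api.example.com` is a placeholder, and Bearer-token authentication and a JSON response body are assumptions, so check the API reference before relying on them.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host; replace with the real one


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /v1/models request (Bearer auth is an assumption)."""
    return urllib.request.Request(
        url=f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_models(api_key: str, base_url: str = BASE_URL):
    """Send the request and return the decoded JSON body unchanged."""
    request = build_models_request(api_key, base_url)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

The response is returned as-is because its shape is not documented on this page.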

| Provider | Model | Input $/M tokens | Output $/M tokens |
|---|---|---|---|
| Meta | Llama 3.2 3B | $0.020 | $0.020 |
| Meta | Llama 3.1 8B | $0.020 | $0.050 |
| DeepSeek | DeepSeek V3.2 | $0.260 | $0.380 |
| DeepSeek | DeepSeek R1 | $0.500 | $2.150 |
| Google | Gemini 2.5 Flash | $0.300 | $2.500 |
| Google | Gemini 2.5 Pro | $1.250 | $10.000 |
| Anthropic | Claude 4 Sonnet | $3.300 | $16.500 |
| Anthropic | Claude 4 Opus | $16.500 | $82.500 |

Prices shown are our cost from the inference provider. With a subscription plan, you pay a flat monthly rate and receive a usage budget on a rolling 5-hour window; you are not charged per token.
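To get a feel for how the per-million-token prices in the table translate into the cost of a single request, the sketch below does the arithmetic for a few models. It assumes cost is linear in token counts. The function name `estimate_cost_usd` is hypothetical, and the dictionary keys are the display names from the table, not necessarily the model IDs the API expects.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from the price table.
PRICES_PER_MILLION = {
    "Llama 3.1 8B": (Decimal("0.020"), Decimal("0.050")),
    "DeepSeek V3.2": (Decimal("0.260"), Decimal("0.380")),
    "Claude 4 Sonnet": (Decimal("3.300"), Decimal("16.500")),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate provider cost: tokens times the per-million price, divided by 1e6."""
    in_price, out_price = PRICES_PER_MILLION[model]
    total = in_price * input_tokens + out_price * output_tokens
    return total / Decimal(1_000_000)
```

For example, 2,000 input tokens and 500 output tokens on Claude 4 Sonnet come to (3.30 × 2,000 + 16.50 × 500) / 1,000,000 = $0.01485.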

Don’t want a subscription? Top up credits instead, starting at $10. Any amount from $10 up is accepted, with no maximum.

Credit keys are limited to 60 RPM. The credit budget is consumed as you use the API and never resets; top up again when it is depleted. See the Credits guide.

| Method | For |
|---|---|
| Card (Stripe) | Subscriptions and credits |
| USDC on Base | Subscriptions and credits |
| x402 | Pay-per-request (no account needed) |

All subscriptions last 30 days with no auto-renewal.

Rate limits are enforced at the key level:

  • RPM: Maximum requests per minute. Exceeding it returns HTTP 429.
  • TPM: Maximum tokens per minute (input + output combined). Exceeding it returns HTTP 429.

Limits reset every minute. Different keys do not share limits — each key has its own independent counters.
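Since an over-limit request returns 429 and the counters reset every minute, a client can simply wait and retry. The sketch below shows one way to do that with exponential backoff capped at 60 seconds. The function name `send_with_retry`, its parameters, and the backoff schedule are illustrative choices, not behavior prescribed by the API. This page does not say whether the server sends a `Retry-After` header, so the sketch does not rely on one.

```python
import time
import urllib.error
import urllib.request


def send_with_retry(request, max_attempts=5, base_delay=1.0,
                    sleep=time.sleep, opener=urllib.request.urlopen):
    """Send `request`, retrying only on HTTP 429; return the response body bytes."""
    for attempt in range(max_attempts):
        try:
            with opener(request) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate-limit error, or the final failure.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            # Back off 1s, 2s, 4s, ... but never longer than one limit window.
            sleep(min(base_delay * 2 ** attempt, 60.0))
```

The `sleep` and `opener` parameters exist only so the function can be exercised without real network calls or real waiting.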