Plans & Limits

Plans

Rate limits are enforced per key and reset every minute. Each API key has its own independent limits.

All models are included in every plan. Use GET /v1/models for the live list.
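
As a minimal sketch, the live list could be fetched as shown below. Only the `GET /v1/models` path comes from this page; the base URL `https://api.example.com` and the Bearer authorization header are assumptions for illustration, so substitute the real host and auth scheme.

```python
import json
import urllib.request

# Hypothetical base URL (not from this page); replace with the real API host.
BASE_URL = "https://api.example.com"


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET /v1/models request. Bearer auth is an assumption."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_models(api_key: str, base_url: str = BASE_URL):
    """Send the request and return the decoded JSON body."""
    request = build_models_request(api_key, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

The response schema is not described here, so `list_models` returns the parsed JSON unchanged rather than assuming any field names.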

Provider	Model	Input ($ per 1M tokens)	Output ($ per 1M tokens)
Meta	Llama 3.3 70B	$0.120	$0.300
Meta	Llama 3.2 3B	$0.020	$0.020
DeepSeek	DeepSeek V3.2	$0.260	$0.380
DeepSeek	DeepSeek R1	$0.500	$2.150
Qwen	Qwen3 235B	$0.300	$1.200
Google	Gemma 3 27B	$0.100	$0.200
Moonshot	Kimi 2.5	$0.350	$0.700

Prices shown are our cost from the inference provider. With a subscription plan, you pay a flat monthly rate instead of per-token charges.
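
To make the per-million-token pricing concrete, here is a rough sketch that computes the listed provider cost of a single request. The dictionary keys are the display names from the table, not confirmed API model IDs, and the function name is hypothetical. This page does not state how credits are debited, so treat the result as the listed provider cost only.

```python
# USD per 1M tokens, copied from the pricing table: (input, output).
# Keys are display names from the table, not confirmed API model IDs.
PRICES = {
    "Llama 3.3 70B": (0.120, 0.300),
    "Llama 3.2 3B": (0.020, 0.020),
    "DeepSeek V3.2": (0.260, 0.380),
    "DeepSeek R1": (0.500, 2.150),
    "Qwen3 235B": (0.300, 1.200),
    "Gemma 3 27B": (0.100, 0.200),
    "Kimi 2.5": (0.350, 0.700),
}


def provider_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the listed provider cost in USD for one request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 200,000 input tokens and 50,000 output tokens on DeepSeek R1:
# 0.2 * $0.500 + 0.05 * $2.150 = $0.2075
cost = provider_cost("DeepSeek R1", 200_000, 50_000)
```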

Don’t want a subscription? Top up credits instead, starting at $10. Any amount from $10 up is accepted, with no maximum.

Credit keys are limited to 60 RPM. Your credit budget is consumed as you use the API and never resets; top up again when it is depleted. See the Credits guide.

Method	For
Card (Stripe)	Subscriptions and credits
USDC on Base	Subscriptions and credits
x402	Agent subscriptions and credits (no human setup needed)

All subscriptions last 30 days with no auto-renewal.
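
Because there is no auto-renewal, a client may want to track when its subscription lapses. The sketch below assumes the 30 days are counted as 30 × 24 hours from the purchase timestamp; this page does not specify the exact boundary, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SUBSCRIPTION_LENGTH = timedelta(days=30)


def subscription_expiry(started_at: datetime) -> datetime:
    """Return the assumed expiry time: start plus 30 days."""
    return started_at + SUBSCRIPTION_LENGTH


def is_active(started_at: datetime, now: datetime) -> bool:
    """True while `now` falls before the assumed expiry time."""
    return now < subscription_expiry(started_at)


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
expiry = subscription_expiry(start)  # 2025-01-31 00:00 UTC
```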

Rate limits are enforced at the key level:

RPM: Maximum requests per minute. Exceeding it returns HTTP 429.
TPM: Maximum tokens per minute (input + output combined). Exceeding it returns HTTP 429.
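
One way a client might react to a 429 is to wait and retry. The sketch below is illustrative only: `send` is a hypothetical callable returning `(status_code, body)`, and the 60-second default wait is a conservative choice based on limits resetting every minute. This page does not say whether the response includes a hint such as a Retry-After header.

```python
import time
from typing import Callable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send(); on HTTP 429, wait and retry up to max_attempts in total."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return on any non-429 response, or when attempts are exhausted.
        if status != 429 or attempt == max_attempts:
            return status, body
        sleep(wait_seconds)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real delays.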

Limits reset every minute. Keys do not share limits: each key has its own independent counters.
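
To stay under the limits proactively, a client could keep its own per-key counters. The sketch below assumes a fixed one-minute window that starts at a key's first request; the server's actual window alignment is not documented here. The class name and the limit values in the usage example are hypothetical. Since output tokens are not known before a request, `tokens` would be the caller's estimate.

```python
import time
from typing import Callable, Dict, Tuple


class MinuteBudget:
    """Client-side tracker of requests and tokens per key, per minute window."""

    def __init__(self, rpm: int, tpm: int, clock: Callable[[], float] = time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        # key -> (window_start, requests_used, tokens_used)
        self._state: Dict[str, Tuple[float, int, int]] = {}

    def try_acquire(self, key: str, tokens: int) -> bool:
        """Reserve one request and `tokens` for `key`; False if over budget."""
        now = self.clock()
        window_start, requests, used = self._state.get(key, (now, 0, 0))
        if now - window_start >= 60:
            # A new window has begun: reset this key's counters only.
            window_start, requests, used = now, 0, 0
        allowed = requests + 1 <= self.rpm and used + tokens <= self.tpm
        if allowed:
            requests, used = requests + 1, used + tokens
        self._state[key] = (window_start, requests, used)
        return allowed


# Hypothetical limits for illustration.
budget = MinuteBudget(rpm=60, tpm=100_000)
ok = budget.try_acquire("key-a", tokens=1_200)
```

Each key gets its own entry in the state dictionary, which mirrors the independent per-key counters described above.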