GPU Boosts

GPU Boosts give you a reserved slice of a shared GPU pool — guaranteed tokens/sec and RPM at a fixed monthly price. Each pool runs a single model; you pick which time blocks you want and how many seats.

All endpoints require authentication. Send your management key (prefix mk_) as a bearer token:

Authorization: Bearer mk_your_key
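As a minimal sketch of the header above in client code, the following builds an unsent request with Python's standard library. The helper name `build_request` and the `mk_` prefix check are illustrative assumptions, not part of the API.

```python
import json
import urllib.request

BASE_URL = "https://api.cheapestinference.com"


def build_request(path, management_key, method="GET", body=None):
    """Return an unsent urllib Request carrying the management key (hypothetical helper)."""
    if not management_key.startswith("mk_"):
        raise ValueError("expected a management key starting with 'mk_'")
    headers = {"Authorization": f"Bearer {management_key}"}
    data = None
    if body is not None:
        # JSON bodies are only needed for POST endpoints such as /subscribe.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


request = build_request("/api/pools", "mk_your_key")
# To send it: urllib.request.urlopen(request), then json.load() the response.
```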

GET /api/pools

Returns all pools that are not in draft status.

curl https://api.cheapestinference.com/api/pools \
  -H "Authorization: Bearer mk_your_key"

{
  "success": true,
  "data": [
    {
      "id": "pool_uuid",
      "slug": "qwen3-397b-a17b",
      "modelId": "Qwen/Qwen3.5-397B-A17B",
      "modelName": "Qwen3.5-397B-A17B",
      "description": "Dedicated Qwen3.5 inference at guaranteed throughput.",
      "infraSpec": "8× H100 SXM5 · 50 tok/s guaranteed",
      "status": "funding",
      "totalSlots": 10,
      "pledgedSlots": 3,
      "minPricePerDay": "27.00"
    }
  ]
}

| Field | Description |
| --- | --- |
| status | funding (accepting pledges), activating, active, paused |
| totalSlots | Maximum seats available across all blocks |
| pledgedSlots | Number of confirmed reservations |
| minPricePerDay | Lowest monthly price across all blocks (USD) |
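The fields above might be used to shortlist pools that are still accepting pledges. This sketch assumes that unreserved seats can be estimated as `totalSlots - pledgedSlots`, which the reference does not state explicitly; both function names are hypothetical.

```python
def open_seats(pool):
    # Assumption: unreserved seats = totalSlots - pledgedSlots.
    return max(pool["totalSlots"] - pool["pledgedSlots"], 0)


def fundable_pools(pools):
    """Keep pools in 'funding' status that still appear to have room."""
    return [p for p in pools if p["status"] == "funding" and open_seats(p) > 0]


# Values copied from the sample response above.
sample = [
    {"slug": "qwen3-397b-a17b", "status": "funding", "totalSlots": 10, "pledgedSlots": 3}
]
for pool in fundable_pools(sample):
    print(pool["slug"], open_seats(pool))
```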

GET /api/pools/:id

Returns full pool info including all hour slots. :id can be the UUID or slug.

curl https://api.cheapestinference.com/api/pools/pool_uuid \
  -H "Authorization: Bearer mk_your_key"

{
  "success": true,
  "data": {
    "id": "pool_uuid",
    "slug": "qwen3-397b-a17b",
    "modelId": "Qwen/Qwen3.5-397B-A17B",
    "status": "funding",
    "hourSlots": [
      { "id": "slot_uuid", "hour": 0, "slotIndex": 0, "pricePerDay": "3.38", "status": "available" },
      { "id": "slot_uuid", "hour": 1, "slotIndex": 0, "pricePerDay": "3.37", "status": "available" }
    ]
  }
}

Each entry in hourSlots represents one hour of one seat. The pricePerDay values are monthly amounts: a block's monthly price is distributed across the 8 hours in that block.
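Because slot prices are returned as strings, summing them with `Decimal` avoids floating-point rounding. The sketch below totals the slots of one seat; the function name `seat_total` is hypothetical, and the input is limited to the two slots shown in the sample response.

```python
from decimal import Decimal


def seat_total(hour_slots, slot_index=0):
    """Sum the slot prices that belong to one seat (identified by slotIndex)."""
    return sum(
        (Decimal(s["pricePerDay"]) for s in hour_slots if s["slotIndex"] == slot_index),
        Decimal("0"),
    )


# The two slots from the sample response above.
slots = [
    {"hour": 0, "slotIndex": 0, "pricePerDay": "3.38"},
    {"hour": 1, "slotIndex": 0, "pricePerDay": "3.37"},
]
print(seat_total(slots))  # 6.75
```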


POST /api/pools/:id/subscribe

Reserve seats using block names instead of raw slot IDs. This is the preferred API — it handles slot selection automatically and enforces block-level granularity.

Each pool day is divided into three fixed UTC blocks:

| Block | Hours (UTC) | Typical coverage |
| --- | --- | --- |
| night | 00:00–07:59 | Asia-Pacific |
| europe | 08:00–15:59 | Europe / Middle East |
| americas | 16:00–23:59 | Americas |

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| blocks | string[] | Yes | One or more block names: "night", "europe", "americas" |
| quantity | integer | No | Number of seats per block (1–20, default 1) |
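The request body described by the parameters above could be validated client-side before sending. This is a sketch only: `subscribe_payload` is a hypothetical helper, and the server's own validation and error messages may differ.

```python
VALID_BLOCKS = ("night", "europe", "americas")


def subscribe_payload(blocks, quantity=1):
    """Build the JSON body for POST /api/pools/:id/subscribe."""
    if not blocks:
        raise ValueError("blocks must contain at least one block name")
    unknown = [b for b in blocks if b not in VALID_BLOCKS]
    if unknown:
        raise ValueError(f"unknown block name(s): {unknown}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        raise ValueError("quantity must be an integer")
    if not 1 <= quantity <= 20:
        raise ValueError("quantity must be between 1 and 20")
    return {"blocks": list(blocks), "quantity": quantity}


print(subscribe_payload(["night", "europe", "americas"], quantity=2))
```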
Reserve one seat in the americas block:

curl -X POST https://api.cheapestinference.com/api/pools/pool_uuid/subscribe \
  -H "Authorization: Bearer mk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"blocks": ["americas"]}'

Reserve two seats in every block:

curl -X POST https://api.cheapestinference.com/api/pools/pool_uuid/subscribe \
  -H "Authorization: Bearer mk_your_key" \
  -H "Content-Type: application/json" \
  -d '{"blocks": ["night", "europe", "americas"], "quantity": 2}'

Example response (the values shown match the single-block, one-seat request, which covers 8 hourly slots):

{
  "success": true,
  "data": {
    "id": "pledge_uuid",
    "poolId": "pool_uuid",
    "status": "pledged",
    "monthlyPrice": "27.00",
    "slotCount": 8
  }
}

If the pool is already active, the subscription activates immediately: status is active and the response includes a key field.

monthlyPrice = sum of prices for all reserved slots. Each block has a fixed monthly price set by the pool operator. With quantity: 2 and two blocks selected, you pay 2 × (block_a_price + block_b_price).
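The pricing rule above can be checked with a short calculation. In this sketch the block prices are hypothetical placeholders (real prices are set by the pool operator), and `monthly_price` is an illustrative helper rather than part of the API.

```python
from decimal import Decimal


def monthly_price(block_prices, blocks, quantity=1):
    """quantity x (sum of the monthly prices of the selected blocks)."""
    per_seat = sum((Decimal(block_prices[b]) for b in blocks), Decimal("0"))
    return quantity * per_seat


# Hypothetical monthly block prices in USD, for illustration only.
prices = {"night": "20.00", "europe": "27.00", "americas": "30.00"}
print(monthly_price(prices, ["night", "europe"], quantity=2))  # 94.00
```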

  • A $1 non-refundable reservation fee is charged to your card on file to confirm your payment method.
  • The subscription activates when the pool reaches its minimum seat threshold (funding → active).
  • Once active, your monthly charge is billed automatically and you receive a dedicated API key.

GET /api/pools/:id/my-pledge

Returns your reservation for the pool, including its status, monthly price, reserved hours, and the key object when one has been issued.

curl https://api.cheapestinference.com/api/pools/pool_uuid/my-pledge \
  -H "Authorization: Bearer mk_your_key"

{
  "success": true,
  "data": {
    "id": "pledge_uuid",
    "status": "active",
    "monthlyPrice": "27.00",
    "currentPeriodEnd": "2026-05-04T00:00:00.000Z",
    "hours": [
      { "hour": 16, "slotIndex": 0, "pricePerDay": "3.38" }
    ],
    "key": {
      "id": "key_uuid",
      "apiKey": "sk_pool_abc123...",
      "rpmLimit": 60,
      "tpmLimit": 100000,
      "isActive": true
    }
  }
}

| status | Meaning |
| --- | --- |
| pledged | Reserved, waiting for pool to activate |
| active | Subscription running, key is live |
| past_due | Payment failed, key suspended |
| canceled | Subscription ended |
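A client might combine the status and hours fields to decide whether the guaranteed limits apply at a given moment. The sketch below assumes that each entry in `hours` is a UTC hour of the day, consistent with the UTC block definitions; the reference does not say how the key behaves outside reserved hours, so the function only answers whether the current hour is reserved. The name `guarantee_applies` is hypothetical.

```python
from datetime import datetime, timezone


def guarantee_applies(pledge, now=None):
    """True if the pledge is active, its key is live, and the UTC hour is reserved."""
    if now is None:
        now = datetime.now(timezone.utc)
    if pledge["status"] != "active":
        return False
    key = pledge.get("key")
    if not key or not key.get("isActive"):
        return False
    # Assumption: "hour" values are hours of the day in UTC (0-23).
    reserved_hours = {entry["hour"] for entry in pledge["hours"]}
    return now.astimezone(timezone.utc).hour in reserved_hours


# Trimmed copy of the sample response above.
pledge = {
    "status": "active",
    "hours": [{"hour": 16, "slotIndex": 0, "pricePerDay": "3.38"}],
    "key": {"apiKey": "sk_pool_abc123...", "isActive": True},
}
print(guarantee_applies(pledge, datetime(2026, 4, 10, 16, 30, tzinfo=timezone.utc)))
```

Pass a timezone-aware `datetime`; a naive value would be interpreted as local time by `astimezone`.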

GET /api/pools/:id/my-key

Returns only the key object. Useful if you just need the API key without the full pledge details.

curl https://api.cheapestinference.com/api/pools/pool_uuid/my-key \
  -H "Authorization: Bearer mk_your_key"

{
  "success": true,
  "data": {
    "id": "key_uuid",
    "apiKey": "sk_pool_abc123...",
    "rpmLimit": 60,
    "tpmLimit": 100000,
    "isActive": true
  }
}

Use apiKey as the Authorization: Bearer token on any /v1/* or /anthropic/* inference endpoint.


DELETE /api/pools/:id/pledge

Cancels a pledged (not yet active) reservation and frees your slots. Once a subscription is active, cancellation goes through the billing portal instead.

curl -X DELETE https://api.cheapestinference.com/api/pools/pool_uuid/pledge \
  -H "Authorization: Bearer mk_your_key"

{
  "success": true,
  "data": { "canceled": true }
}

  1. Reserve: pick a pool, select time blocks, and pay the $1 reservation fee. You're committed to the first billing cycle.
  2. Fund: the pool collects pledges until it reaches its minimum seat count.
  3. Activate: once the minimum seat count is reached, the pool goes live. All pledges are charged their monthly fee and receive a dedicated key.
  4. Use: your key gets guaranteed RPM and tok/s during your reserved blocks. Use it exactly like any other API key.
  5. Renew: subscriptions recur monthly automatically. Cancel anytime; access continues to the end of the paid period.
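The wait between reserving and activation in the steps above could be handled by polling GET /api/pools/:id/my-pledge until the status changes. This is a sketch under assumptions: `fetch_pledge` is a hypothetical callable that returns the `data` object of that endpoint, and the polling interval and attempt count are arbitrary choices, not documented limits.

```python
import time


def wait_for_key(fetch_pledge, attempts=10, delay_seconds=60.0, sleep=time.sleep):
    """Poll until the pledge is active, then return its API key."""
    for attempt in range(attempts):
        pledge = fetch_pledge()
        status = pledge["status"]
        if status == "active":
            return pledge["key"]["apiKey"]
        if status in ("canceled", "past_due"):
            # Per the status table, neither state has a usable key.
            raise RuntimeError(f"pledge is in status {status!r}")
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError("pool did not activate within the polling budget")


# Demo with canned responses instead of real HTTP calls.
canned = iter([
    {"status": "pledged"},
    {"status": "active", "key": {"apiKey": "sk_pool_abc123..."}},
])
print(wait_for_key(lambda: next(canned), sleep=lambda seconds: None))
```

Injecting `sleep` keeps the function testable without real delays.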