Add LLM access to your SaaS in 15 minutes

This guide walks you through adding LLM access to your SaaS platform using Unlimited subscriptions — dedicated model access with guaranteed throughput and no token counting. By the end, your users will each have their own API key with unlimited usage during subscribed hours.

What you'll set up:

  • A management key to control your platform’s subscriptions
  • Unlimited subscriptions for your users (Asia, Europe, or Americas time blocks)
  • API keys assigned to each user
  • Your users calling our API with their own keys

What you'll need:

  • A CheapestInference account (create one)
  • A saved payment method (add a card from the dashboard or via your first checkout)

Log in to your dashboard and navigate to Keys. Create a Management Key; it authenticates all platform operations.

mk_your_management_key_here

Keep this key secure. It can create subscriptions and keys, and it can manage billing.

Step 2: List available Unlimited plans (2 min)

Each Unlimited plan is a dedicated model with guaranteed capacity during specific time blocks:

curl https://api.cheapestinference.com/api/pools \
  -H "Authorization: Bearer mk_your_management_key"

Response:

{
  "success": true,
  "data": [
    {
      "id": "pool_uuid",
      "slug": "kimi26",
      "modelName": "Kimi 2.6, GLM 5.1, MiniMax 2.5",
      "minPricePerDay": "39.00",
      "annualDiscount": 0.15,
      "totalSlots": 100,
      "pledgedSlots": 3
    }
  ]
}
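A minimal sketch of reading this response in Python. The field names come from the sample above. The helper name `summarize_pools` is hypothetical, and treating `totalSlots` minus `pledgedSlots` as the number of open slots is an assumption, not documented behavior.

```python
import json
from decimal import Decimal

# Sample body copied from the response shown above.
SAMPLE = """
{
  "success": true,
  "data": [
    {
      "id": "pool_uuid",
      "slug": "kimi26",
      "modelName": "Kimi 2.6, GLM 5.1, MiniMax 2.5",
      "minPricePerDay": "39.00",
      "annualDiscount": 0.15,
      "totalSlots": 100,
      "pledgedSlots": 3
    }
  ]
}
"""


def summarize_pools(body):
    """Return one small dict per pool (hypothetical helper)."""
    if not body.get("success"):
        raise RuntimeError("pool listing did not report success")
    summary = []
    for pool in body["data"]:
        summary.append({
            "slug": pool["slug"],
            # Prices arrive as strings; Decimal avoids float rounding.
            "min_price_per_day": Decimal(pool["minPricePerDay"]),
            # Assumption: pledgedSlots counts slots that are already taken.
            "open_slots": pool["totalSlots"] - pool["pledgedSlots"],
        })
    return summary


pools = summarize_pools(json.loads(SAMPLE))
print(pools[0]["slug"], pools[0]["open_slots"])
```

The `slug` value is what the subscribe endpoint takes in its URL path.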

Choose which 8-hour UTC block your user needs:

Block      Hours (UTC)    Best for
asia       00:00–07:59    Asia-Pacific users
europe     08:00–15:59    Europe / Middle East
americas   16:00–23:59    Americas users
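If your platform needs to pick a block programmatically, the table can be expressed as a small lookup. This is a sketch only: the block names and hours come from the table, while `block_for_hour` and `is_block_active` are hypothetical helper names, not part of the API.

```python
from datetime import datetime, timezone
from typing import Optional

# Block boundaries from the table (UTC hours, end exclusive).
BLOCKS = {
    "asia": (0, 8),
    "europe": (8, 16),
    "americas": (16, 24),
}


def block_for_hour(hour: int) -> str:
    """Return the name of the block that contains the given UTC hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0-23, got {hour}")
    for name, (start, end) in BLOCKS.items():
        if start <= hour < end:
            return name
    raise AssertionError("unreachable: the blocks cover all 24 hours")


def is_block_active(block: str, now: Optional[datetime] = None) -> bool:
    """True if `now` falls inside `block`. Pass a timezone-aware datetime."""
    if block not in BLOCKS:
        raise ValueError(f"unknown block: {block}")
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC so callers may pass any timezone-aware datetime.
    return block_for_hour(now.astimezone(timezone.utc).hour) == block
```

For example, `block_for_hour(17)` returns `"americas"`, so a user who mostly works around 17:00 UTC would need that block.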

Subscribe your user to a block:

curl -X POST https://api.cheapestinference.com/api/pools/kimi26/subscribe \
  -H "Authorization: Bearer mk_your_management_key" \
  -H "Content-Type: application/json" \
  -d '{
    "blocks": ["americas"],
    "quantity": 1,
    "billingCycle": "month"
  }'

Response:

{
  "success": true,
  "data": {
    "id": "subscription_uuid",
    "poolId": "pool_uuid",
    "status": "active",
    "monthlyPrice": "39.00",
    "hours": [
      { "hour": 16, "slotIndex": 1, "pricePerDay": "4.87" },
      { "hour": 17, "slotIndex": 1, "pricePerDay": "4.87" }
    ],
    "key": null
  }
}

No API key is created automatically — you’ll create it in the next step.

Create an API key for your user’s subscription:

curl -X POST https://api.cheapestinference.com/api/keys/subscription \
  -H "Authorization: Bearer mk_your_management_key" \
  -H "Content-Type: application/json" \
  -d '{"subscriptionId": "subscription_uuid"}'

Response:

{
  "success": true,
  "data": {
    "id": "key_uuid",
    "apiKey": "sk_pool_abc123xxxxxxxx",
    "isActive": true
  }
}

Throughput is unlimited: there are no RPM or TPM caps. The only limit is 1 concurrent request per key. Your user’s key works every day, but only during the hours of their subscribed block.
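Because each key allows one concurrent request, a multi-threaded client may want to serialize calls that share a key. The sketch below shows one way to do that within a single process. `PerKeyGate` is a hypothetical helper, not part of any SDK, and it does not coordinate across processes or machines.

```python
import threading
from collections import defaultdict
from typing import Callable, TypeVar

T = TypeVar("T")


class PerKeyGate:
    """Allow at most one in-flight call per API key within this process."""

    def __init__(self) -> None:
        self._guard = threading.Lock()             # protects the lock table
        self._locks = defaultdict(threading.Lock)  # one lock per API key

    def run(self, api_key: str, call: Callable[[], T]) -> T:
        # Look up (or create) the lock for this key under the guard.
        with self._guard:
            lock = self._locks[api_key]
        # Hold the key's lock for the duration of the request.
        with lock:
            return call()
```

Usage would look like `gate.run(api_key, lambda: client.chat.completions.create(...))`, with one shared `PerKeyGate` instance per process.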

Your user hits our API with their key. It’s a standard OpenAI-compatible endpoint:

Python:

from openai import OpenAI

client = OpenAI(
    api_key="sk_pool_abc123xxxxxxxx",
    base_url="https://api.cheapestinference.com/v1",
)

response = client.chat.completions.create(
    model="moonshot/kimi-k2.6",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Node.js:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk_pool_abc123xxxxxxxx',
  baseURL: 'https://api.cheapestinference.com/v1',
});

const response = await client.chat.completions.create({
  model: 'moonshot/kimi-k2.6',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);

List all your subscriptions and their keys:

# List all subscriptions for a pool
curl https://api.cheapestinference.com/api/pools/kimi26/my-subscriptions \
  -H "Authorization: Bearer mk_your_management_key"

# List all your keys on this pool
curl https://api.cheapestinference.com/api/pools/kimi26/my-keys \
  -H "Authorization: Bearer mk_your_management_key"

To add a new user on another time block, subscribe to that block and then create a key for the new subscription:

# Subscribe a new user to the europe block
SUB=$(curl -s -X POST https://api.cheapestinference.com/api/pools/kimi26/subscribe \
  -H "Authorization: Bearer mk_your_management_key" \
  -H "Content-Type: application/json" \
  -d '{"blocks": ["europe"], "quantity": 1, "billingCycle": "month"}')

# Extract the subscription ID from the JSON response
SUB_ID=$(echo "$SUB" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['id'])")

# Create an API key for the new subscription
curl -X POST https://api.cheapestinference.com/api/keys/subscription \
  -H "Authorization: Bearer mk_your_management_key" \
  -H "Content-Type: application/json" \
  -d "{\"subscriptionId\": \"$SUB_ID\"}"
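The same subscribe-then-create-key flow can be sketched in Python for use in a backend service. The endpoints, headers, and request bodies are the ones shown in this guide. The function names `http_post` and `provision_user` are hypothetical, and treating a false `success` field as the failure signal is an assumption, since error responses are not shown here.

```python
import json
import urllib.request

API_BASE = "https://api.cheapestinference.com"


def http_post(url, headers, body):
    """Send a JSON POST with the standard library and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def provision_user(management_key, pool_slug, block, post=http_post):
    """Subscribe to one block, then create an API key for that subscription.

    `post` is injectable so the flow can be exercised without network access.
    """
    headers = {
        "Authorization": f"Bearer {management_key}",
        "Content-Type": "application/json",
    }

    # 1) Subscribe to the block (same body as the curl example).
    sub = post(
        f"{API_BASE}/api/pools/{pool_slug}/subscribe",
        headers,
        {"blocks": [block], "quantity": 1, "billingCycle": "month"},
    )
    # Assumption: a false "success" field signals failure.
    if not sub.get("success"):
        raise RuntimeError(f"subscribe failed: {sub}")
    subscription_id = sub["data"]["id"]

    # 2) Create the API key for the new subscription.
    key = post(
        f"{API_BASE}/api/keys/subscription",
        headers,
        {"subscriptionId": subscription_id},
    )
    if not key.get("success"):
        raise RuntimeError(f"key creation failed: {key}")

    return {"subscriptionId": subscription_id, "apiKey": key["data"]["apiKey"]}
```

Calling `provision_user("mk_your_management_key", "kimi26", "europe")` would perform both requests and return the subscription ID together with the new key, which you can then store against your user's record.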

Questions? Contact [email protected].