
Add LLM access to your SaaS in 15 minutes

This guide walks you through adding LLM access to your SaaS platform. By the end, your users will each have their own API key with independent rate limits and usage tracking.

What you’ll build:

  • A management key to control your platform’s keys
  • Per-user API keys with individual plans
  • Your users calling our API with their own keys
  • Usage monitoring per user

Prerequisites:

  • A CheapestInference account (create one)
  • An active subscription (Standard or Pro)

Log in to your dashboard and navigate to Keys. Create a Management Key; it authenticates all platform operations.

mgmt_your_management_key_here

Keep this key secure. It can create, delete, and manage all your consumption keys.
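One common way to keep the management key out of source control is to read it from the environment at startup. The sketch below assumes an environment variable named `CHEAPESTINFERENCE_MGMT_KEY`; that name and the `load_management_key` helper are illustrative choices, not something the platform defines.

```python
import os
from typing import Mapping

# Hypothetical variable name; use whatever fits your deployment.
ENV_VAR = "CHEAPESTINFERENCE_MGMT_KEY"


def load_management_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the management key from the environment, failing fast if it is unset."""
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {ENV_VAR} before starting the service")
    return key
```

Failing at startup rather than on the first request makes a missing key obvious during deployment.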

Use the management API to create a consumption key for one of your users:

curl -X POST https://api.cheapestinference.com/api/keys \
  -H "Authorization: Bearer mgmt_your_management_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "user-alice",
    "plan": "pro"
  }'

Response:

{
  "key": "sk-alice-a8f3e2...",
  "name": "user-alice",
  "plan": "pro",
  "rpm": 200,
  "tpm": 100000
}

Each key gets its own rate limits based on the plan you assign.
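In a SaaS backend, this call usually happens in code, for example when a user signs up. Here is a minimal Python sketch using only the standard library. It assumes nothing beyond what the curl example shows: the `POST /api/keys` endpoint, Bearer authentication, and the `name` and `plan` fields. The helper names `build_create_key_request` and `create_key` are illustrative and not part of any official SDK.

```python
import json
import urllib.request

API_BASE = "https://api.cheapestinference.com"


def build_create_key_request(mgmt_key: str, name: str, plan: str) -> urllib.request.Request:
    """Build the same POST /api/keys request as the curl example."""
    body = json.dumps({"name": name, "plan": plan}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/keys",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {mgmt_key}",
            "Content-Type": "application/json",
        },
    )


def create_key(mgmt_key: str, name: str, plan: str) -> dict:
    """Send the request and return the parsed JSON response."""
    req = build_create_key_request(mgmt_key, name, plan)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Calling `create_key(mgmt_key, "user-alice", "pro")` should return a dictionary shaped like the response above, with the new key under `"key"`.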

Your user calls our API with their key. The endpoint is OpenAI-compatible, so existing client code only needs a new base URL:

Python:

from openai import OpenAI

client = OpenAI(
    api_key="sk-alice-a8f3e2...",
    base_url="https://api.cheapestinference.com/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-235B-A22B-Instruct-2507",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Node.js:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-alice-a8f3e2...',
  baseURL: 'https://api.cheapestinference.com/v1',
});

const response = await client.chat.completions.create({
  model: 'Qwen/Qwen3-235B-A22B-Instruct-2507',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);

Any OpenAI-compatible SDK works, whether in Python, Node.js, Go, Rust, or Java: point its base URL at https://api.cheapestinference.com/v1.

Check how much each key is consuming:

curl https://api.cheapestinference.com/api/keys/sk-alice-a8f3e2.../usage \
  -H "Authorization: Bearer mgmt_your_management_key"

You can also see per-key usage in your dashboard.
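The same check can run from backend code, for instance in a nightly billing job. The sketch below mirrors the curl call above using the standard library. The response fields are not documented in this guide, so it returns the parsed JSON as-is. `usage_url` and `get_usage` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.cheapestinference.com"


def usage_url(consumption_key: str) -> str:
    """Build the per-key usage URL, percent-encoding the key for use in a path."""
    return f"{API_BASE}/api/keys/{urllib.parse.quote(consumption_key, safe='')}/usage"


def get_usage(mgmt_key: str, consumption_key: str) -> dict:
    """Fetch usage for one consumption key, authenticated with the management key."""
    req = urllib.request.Request(
        usage_url(consumption_key),
        headers={"Authorization": f"Bearer {mgmt_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        # The response schema is an unknown here; inspect it before relying on fields.
        return json.load(resp)
```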

Repeat step 2 (creating a consumption key) for each user. There’s no limit on how many keys you can create.

# Create keys for your whole team
for user in alice bob charlie dana; do
  curl -X POST https://api.cheapestinference.com/api/keys \
    -H "Authorization: Bearer mgmt_your_management_key" \
    -H "Content-Type: application/json" \
    -d "{\"name\": \"user-$user\", \"plan\": \"standard\"}"
done

Each key has:

  • Independent rate limits (RPM and TPM, set by its plan)
  • Independent usage tracking (tokens, requests, cost)
  • A budget drawn from your shared subscription
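Because every user ends up with a separate key, your backend needs to remember which key belongs to whom. The sketch below shows one way to provision keys in bulk and keep that mapping. It assumes the create-key response contains a `"key"` field, as in the example response earlier. `provision_keys` is a hypothetical helper, and the `create_key` argument stands for whatever function in your code issues the `POST /api/keys` request.

```python
from typing import Callable, Dict, Iterable


def provision_keys(
    users: Iterable[str],
    plan: str,
    create_key: Callable[[str, str], dict],
) -> Dict[str, str]:
    """Create one consumption key per user and return a user -> key mapping."""
    keys: Dict[str, str] = {}
    for user in users:
        response = create_key(f"user-{user}", plan)
        keys[user] = response["key"]
    return keys


def fake_create_key(name: str, plan: str) -> dict:
    """Offline stand-in for the real API call, for illustration only."""
    return {"key": f"sk-{name}", "name": name, "plan": plan}


if __name__ == "__main__":
    print(provision_keys(["alice", "bob"], "standard", fake_create_key))
```

In production you would likely persist the mapping in your database instead of holding it in memory, and treat the stored keys as secrets.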

Questions? Contact [email protected].