Chat Completions
Endpoint
POST /v1/chat/completions

Fully compatible with the OpenAI Chat Completions API.
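A minimal Python sketch of how a client might address this endpoint, using only the standard library. The host and the bearer-token header are taken from the curl example on this page; the helper name `build_chat_request` is hypothetical, and the function only constructs the request without sending it.

```python
import json
import urllib.request

# Host used in this page's curl examples.
BASE_URL = "https://api.cheapestinference.com"


def build_chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/chat/completions.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),  # JSON request body as bytes
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; keeping construction separate makes the URL, headers, and body easy to inspect first.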
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g. gpt-4o-mini, deepseek-chat, kimi-2.5) |
| messages | array | Yes | Array of message objects |
| stream | boolean | No | Enable SSE streaming (default: false) |
| temperature | number | No | Sampling temperature (0–2) |
| max_tokens | integer | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling parameter |
| tools | array | No | Tool/function definitions for function calling |
| tool_choice | string/object | No | Control tool usage (auto, none, or specific tool) |
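The parameters in the table above can be assembled into a request body along these lines. This is a sketch with client-side checks only: the helper name `make_body` is hypothetical, the non-empty `messages` check is an assumption, and the server's own validation may differ.

```python
from typing import Any

# Optional parameters listed in the request body table.
OPTIONAL_PARAMS = {"stream", "temperature", "max_tokens", "top_p", "tools", "tool_choice"}


def make_body(model: str, messages: list, **options: Any) -> dict:
    """Assemble a chat completion request body (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    # Assumption: an empty messages array is treated as an error.
    if not isinstance(messages, list) or not messages:
        raise ValueError("messages is required and must be a non-empty list")

    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")

    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    body = {"model": model, "messages": messages}
    # Omit unset options so the server applies its defaults (e.g. stream: false).
    body.update({key: value for key, value in options.items() if value is not None})
    return body
```

Leaving optional parameters out of the body, rather than sending explicit nulls, keeps the request closest to the curl examples on this page.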
Message format
```json
{
  "role": "user | assistant | system",
  "content": "Message text"
}
```

Example
```bash
curl https://api.cheapestinference.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7
  }'
```

Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
```

Using non-OpenAI models
All models are served through the same OpenAI-compatible endpoint. The API translates the request format automatically:
```bash
# DeepSeek
curl https://api.cheapestinference.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-chat", "messages": [{"role": "user", "content": "Hello"}]}'

# Claude (via OpenAI format)
curl https://api.cheapestinference.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-20250514", "messages": [{"role": "user", "content": "Hello"}]}'

# Kimi
curl https://api.cheapestinference.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "kimi-2.5", "messages": [{"role": "user", "content": "Hello"}]}'
```

Available models
Any model returned by GET /v1/models can be used. See Models for the full list.
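As a rough sketch of how a client might check a model ID before sending a chat request, the Python below fetches GET /v1/models and collects the IDs. It assumes the response follows the OpenAI-style list shape (`{"object": "list", "data": [{"id": ...}]}`) and that the endpoint accepts the same bearer token as the chat endpoint; neither is stated on this page. The helper names are hypothetical.

```python
import json
import urllib.request

# Host used in this page's curl examples.
BASE_URL = "https://api.cheapestinference.com"


def fetch_models(api_key: str) -> dict:
    """Call GET /v1/models and return the decoded JSON (performs a network request)."""
    request = urllib.request.Request(
        f"{BASE_URL}/v1/models",
        # Assumption: same bearer-token auth as the chat endpoint.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


def model_ids(models_response: dict) -> list:
    """Collect model IDs, assuming an OpenAI-style {"data": [{"id": ...}]} payload."""
    return [entry["id"] for entry in models_response.get("data", [])]


def is_available(model: str, models_response: dict) -> bool:
    """Return True if the given model ID appears in the models listing."""
    return model in model_ids(models_response)
```

A caller might run `is_available("deepseek-chat", fetch_models(api_key))` once at startup and fall back to another model ID if it returns False.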