Webhooks

Set webhook_url in the chat completions request body to receive completion notifications at that URL. For details on the parameter, see Chat Completions – webhook_url in the API Reference.
curl -X POST "https://api.cheapestinference.ai/v1/chat/completions" \
  -H "Authorization: Bearer $CHEAPESTINFERENCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
    "messages": [{"role": "user", "content": "Hello"}],
    "webhook_url": "https://your-app.com/webhook"
  }'
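
On the receiving side, the endpoint behind webhook_url has to accept an HTTP POST. The sketch below is a minimal receiver built only on the Python standard library. It assumes the notification arrives as a JSON object in the POST body; the payload schema is not described here, so the handler only logs the top-level keys. The names parse_notification, WebhookHandler, and serve are illustrative and not part of any SDK.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(body: bytes) -> dict:
    """Decode a webhook body as a JSON object; raise ValueError otherwise."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_notification(self.rfile.read(length))
        except ValueError:
            # Malformed or non-object JSON: reject the delivery.
            self.send_response(400)
            self.end_headers()
            return
        # Assumption: the schema is unknown, so only the keys are logged.
        print("webhook received, keys:", sorted(payload))
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen on all interfaces and handle webhook deliveries until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve(8000) starts the listener. The URL passed as webhook_url must be publicly reachable and route to this handler.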

Server-Sent Events

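Subscribe to a resource's events to receive real-time updates as they occur:
# Subscribe to events for one batch (assumes `client` is an already-initialized API client)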
events = client.events.subscribe(
    resource="batch",
    resource_id="batch_123"
)

for event in events:
    print(f"{event.type}: {event.data}")
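
For readers consuming the stream without an SDK, the sketch below parses the standard Server-Sent Events wire format (event: and data: fields, with a blank line ending each event). It assumes the service emits standard SSE framing, which is not confirmed here; SSEEvent and parse_sse are illustrative names, not part of the client library.

```python
from typing import Iterable, Iterator, NamedTuple


class SSEEvent(NamedTuple):
    type: str
    data: str


def parse_sse(lines: Iterable[str]) -> Iterator[SSEEvent]:
    """Yield events from an iterable of raw SSE lines."""
    event_type, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if any data was seen.
            if data:
                yield SSEEvent(event_type, "\n".join(data))
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event_type = value or "message"
        elif field == "data":
            data.append(value)
        # Other fields (id, retry) are ignored in this sketch.
```

Any iterable of text lines works as input, for example the lines of a streaming HTTP response decoded as UTF-8.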

Rate Limits

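Check your remaining request and token allowance to monitor usage:
# Check rate limits (assumes `client` is an already-initialized API client)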
limits = client.account.rate_limits()
print(f"Requests remaining: {limits.requests_remaining}")
print(f"Tokens remaining: {limits.tokens_remaining}")
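
One way to act on these values is to pause before sending when the remaining allowance is too small. The sketch below is an assumption-laden illustration: RateLimits is a stand-in for whatever object client.account.rate_limits() returns, mirroring only the two fields shown above, and has_budget and wait_for_budget are hypothetical helpers, not SDK functions.

```python
import time
from dataclasses import dataclass


@dataclass
class RateLimits:
    # Stand-in for the rate-limit object; only the two fields shown above.
    requests_remaining: int
    tokens_remaining: int


def has_budget(limits, tokens_needed: int) -> bool:
    """True if at least one request and tokens_needed tokens remain."""
    return limits.requests_remaining >= 1 and limits.tokens_remaining >= tokens_needed


def wait_for_budget(fetch_limits, tokens_needed, poll_seconds=1.0,
                    max_polls=10, sleep=time.sleep) -> bool:
    """Poll fetch_limits() until there is budget; give up after max_polls checks."""
    for attempt in range(max_polls):
        if has_budget(fetch_limits(), tokens_needed):
            return True
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # no sleep after the final check
    return False
```

With a real client, fetch_limits could be client.account.rate_limits, passed without calling it, so that each poll fetches fresh values.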