
General

CheapestInference is an AI inference platform that provides access to multiple open-source AI models at the lowest prices in the market. We’re fully compatible with OpenAI’s API, making it easy to switch.
  • Lower costs: Up to 90% cheaper than major providers
  • More models: Many open-source models
  • No vendor lock-in: Standard OpenAI-compatible API
  • Transparent pricing: No hidden fees
  • Better privacy: We never train on your data (see Privacy & Security)
You can use the official OpenAI SDK as-is: just change the base URL and API key. See our OpenAI Compatibility guide.
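For example, a minimal chat completion with the official OpenAI Python SDK might look like the sketch below. The endpoint URL is the one shown in the Migration section; the model name is a placeholder, not a real entry from our catalog.

    from openai import OpenAI

    # Point the official SDK at CheapestInference instead of OpenAI.
    client = OpenAI(
        base_url="https://api.cheapestinference.ai/v1",
        api_key="YOUR_CHEAPESTINFERENCE_API_KEY",  # replace with your real key
    )

    # "example-model" is a placeholder; use any model from our catalog.
    response = client.chat.completions.create(
        model="example-model",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)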

Pricing & Billing

We charge based on usage: per token, counting both input and output tokens.
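As a rough sketch of how per-token billing adds up, the snippet below multiplies the token counts returned in the standard OpenAI-compatible usage block by per-million-token rates. The rates shown are placeholders, not our actual prices; check the pricing page for real numbers.

    # Placeholder rates in USD per 1M tokens -- NOT actual CheapestInference prices.
    INPUT_RATE_PER_M = 0.10
    OUTPUT_RATE_PER_M = 0.30

    def estimate_cost(response):
        # Non-streaming chat completions include a usage block with token counts.
        usage = response.usage
        return (usage.prompt_tokens * INPUT_RATE_PER_M
                + usage.completion_tokens * OUTPUT_RATE_PER_M) / 1_000_000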
We accept credit cards and debit cards, and we can invoice enterprise customers.

Technical

Rate limits:
  • Pro tier: 100 requests/minute
  • Enterprise: Custom limits
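If you exceed your limit, the official OpenAI Python SDK raises openai.RateLimitError (this assumes we return standard HTTP 429 responses, which the list above does not state). A simple retry with exponential backoff might look like this sketch:

    import time
    import openai  # official OpenAI Python SDK

    def chat_with_retry(client, max_retries=5, **kwargs):
        """Retry a chat completion with exponential backoff on rate-limit errors."""
        for attempt in range(max_retries):
            try:
                return client.chat.completions.create(**kwargs)
            except openai.RateLimitError:
                time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ...
        raise RuntimeError("Still rate limited after retries")

Call it as chat_with_retry(client, model="example-model", messages=[...]) with a client configured as in the General section.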
We have data centers in:
  • US East (Virginia)
  • US West (California)
  • Europe (Frankfurt)
  • Asia (Tokyo)
We may also use other third-party data centers.
All chat models support streaming responses. Streaming requests are billed at the same per-token rates as non-streaming requests.
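Streaming works the same way it does with the OpenAI SDK against OpenAI. A minimal, self-contained sketch; the model name is a placeholder and the environment variable name is just a convention chosen for this example:

    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.cheapestinference.ai/v1",
        api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],  # assumed variable name
    )

    stream = client.chat.completions.create(
        model="example-model",  # placeholder model name
        messages=[{"role": "user", "content": "Write a haiku about latency."}],
        stream=True,
    )
    for chunk in stream:
        # Each chunk carries an incremental text delta; the final chunk may be empty.
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()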

Privacy & Security

  • We never train on your data
  • Data is encrypted in transit and at rest
  • Request and response data is deleted after 30 days
  • GDPR and SOC 2 compliant
API keys are:
  • Encrypted in our database
  • Never logged or exposed
  • Rotatable at any time
  • Scoped to specific permissions

Support

Support channels:
  • Documentation: Check our comprehensive docs
  • Discord: Join our community
  • Email: [email protected]
  • Enterprise: Dedicated Slack channel
SLAs:
  • Enterprise: 99.9% uptime SLA
  • Dedicated: Custom SLA available
For enterprise customers, we also offer implementation support, architecture review, and optimization consulting.

Migration

Migrating from OpenAI takes just two changes:
  1. Set base_url="https://api.cheapestinference.ai/v1"
  2. Use your CheapestInference API key
See our Migration Guide.
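In code, the change is limited to how the client is constructed; the rest of your program stays the same. A sketch under the assumption that you keep keys in environment variables (the variable names here are just conventions for this example):

    import os
    from openai import OpenAI

    # Before (OpenAI):
    # client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    # After (CheapestInference):
    client = OpenAI(
        base_url="https://api.cheapestinference.ai/v1",
        api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],  # assumed variable name
    )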
Beyond those two lines, no code changes are needed: we’re fully compatible with OpenAI’s API.
Many customers run us alongside OpenAI, for example CheapestInference for development/testing and OpenAI for production, or vice versa.
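One way to run both providers side by side is to pick the base URL and key from the environment. A hedged sketch; the environment variable names and the selection scheme are invented for this example, not part of our API:

    import os
    from openai import OpenAI

    # LLM_PROVIDER, OPENAI_API_KEY and CHEAPESTINFERENCE_API_KEY are names
    # chosen for this sketch; use whatever your stack prefers.
    if os.environ.get("LLM_PROVIDER") == "openai":
        client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    else:
        client = OpenAI(
            base_url="https://api.cheapestinference.ai/v1",
            api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],
        )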

Still have questions? Email [email protected] or ask in our Discord community.