What You Can Do
Our Models
CheapestInference hosts many popular open-source models. You're charged based on the tokens you use and the size of the model. Models are not quantized unless specified.
Chat models
DeepSeek R1
DeepSeek V3.1
GPT-OSS-120B
Llama 4 Maverick
Qwen 3 Next 80B
Kimi K2 0905
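To show how these chat models are typically called, here is a minimal sketch assuming CheapestInference exposes an OpenAI-compatible chat completions API; the base URL, API-key environment variable, and model ID below are placeholders rather than documented values.

```python
# Minimal sketch, assuming an OpenAI-compatible chat completions endpoint.
# Base URL, env var, and model ID are placeholders -- check the model catalog.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cheapestinference.example/v1",  # hypothetical endpoint
    api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],      # hypothetical env var
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.1",  # placeholder ID for DeepSeek V3.1
    messages=[{"role": "user", "content": "Explain token-based pricing in one sentence."}],
)
print(response.choices[0].message.content)
```

Because billing is per token and scales with model size, shorter prompts, tighter output limits, and smaller models all translate directly into lower cost.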
Embedding models
Powerful embedding models for semantic search and retrieval.
View all models →
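As a quick sketch of how an embedding model is used for retrieval, assuming the same OpenAI-compatible client; the embedding model ID is a placeholder, not a confirmed catalog entry.

```python
# Minimal sketch, assuming an OpenAI-compatible /embeddings endpoint.
# Base URL, env var, and embedding model ID are placeholders.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cheapestinference.example/v1",  # hypothetical endpoint
    api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],
)

result = client.embeddings.create(
    model="BAAI/bge-large-en-v1.5",  # placeholder embedding model ID
    input=["how do refunds work?", "Refunds require a receipt."],
)
vectors = [item.embedding for item in result.data]
print(len(vectors), len(vectors[0]))  # number of inputs, embedding dimension
```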
Build AI apps and agents with CheapestInference
Build an agent
Build agent workflows to solve real use cases with CheapestInference.
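A minimal agent-loop sketch, assuming OpenAI-style tool calling is supported; the endpoint, model ID, and the get_weather tool are illustrative assumptions, not part of the platform's documented API.

```python
# Minimal agent sketch, assuming OpenAI-style tool calling is supported.
# Endpoint, env var, model ID, and the get_weather tool are illustrative only.
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cheapestinference.example/v1",  # hypothetical endpoint
    api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"Sunny, 22 C in {city}"  # stub tool so the sketch is self-contained

messages = [{"role": "user", "content": "What's the weather in Lisbon right now?"}]
first = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct-0905",  # placeholder ID for Kimi K2 0905
    messages=messages,
    tools=tools,
)
msg = first.choices[0].message

if msg.tool_calls:
    # Run each requested tool and feed the results back to the model.
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
    final = client.chat.completions.create(
        model="moonshotai/Kimi-K2-Instruct-0905",
        messages=messages,
        tools=tools,
    )
    print(final.choices[0].message.content)
else:
    print(msg.content)
```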
Build a Next.js chatbot
Spin up a production-ready chatbot using CheapestInference + Next.js.
Build RAG apps
Combine retrieval and generation to build grounded RAG apps.
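For orientation, a minimal RAG sketch under the same OpenAI-compatible assumptions: embed a tiny in-memory corpus, retrieve the closest document by cosine similarity, and ground the answer on it. The endpoint and model IDs are placeholders.

```python
# Minimal RAG sketch: embed documents, retrieve by cosine similarity, then answer
# grounded on the retrieved context. Endpoint and model IDs are placeholders.
import math
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cheapestinference.example/v1",  # hypothetical endpoint
    api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],
)

corpus = [
    "Invoices are processed within 5 business days.",
    "Refunds require a receipt and the original payment card.",
    "Support is available Monday through Friday, 9am-6pm CET.",
]

def embed(texts):
    result = client.embeddings.create(model="BAAI/bge-large-en-v1.5", input=texts)  # placeholder model
    return [item.embedding for item in result.data]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

question = "What do I need to get a refund?"
doc_vectors = embed(corpus)
query_vector = embed([question])[0]

# Retrieve the single most similar document and use it as grounding context.
best_doc = max(zip(corpus, doc_vectors), key=lambda pair: cosine(query_vector, pair[1]))[0]

answer = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.1",  # placeholder model ID
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{best_doc}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```

In a real app the in-memory list would be replaced by a vector database, but the retrieve-then-generate flow stays the same.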
Build a real-time image app
Stream real-time image generations with Flux Schnell on CheapestInference.
Build a text → app workflow
Turn natural language into interactive apps with CheapestInference + CodeSandbox.
Build an AI search engine
Ship a simplified Perplexity-style search using CheapestInference models.
Use structured outputs with LLMs
Get reliable JSON by defining schemas and using structured outputs.
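A minimal sketch of requesting schema-constrained JSON, assuming the provider supports OpenAI-style structured outputs via response_format (the exact parameter shape may differ); the model ID and schema are illustrative.

```python
# Minimal sketch, assuming OpenAI-style JSON-schema structured outputs are
# supported via response_format. Endpoint, model ID, and schema are illustrative.
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cheapestinference.example/v1",  # hypothetical endpoint
    api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],
)

schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "tags"],
}

response = client.chat.completions.create(
    model="Qwen/Qwen3-Next-80B-A3B-Instruct",  # placeholder ID for Qwen 3 Next 80B
    messages=[{"role": "user", "content": "Extract a title and tags from: 'Fast open-source inference for RAG apps'."}],
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "article_metadata", "schema": schema},
    },
)
print(json.loads(response.choices[0].message.content))
```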
Work with reasoning models
Use open reasoning models (e.g., DeepSeek-R1) for logic-heavy, multi-step tasks.
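A minimal sketch of calling an open reasoning model such as DeepSeek-R1 through the assumed OpenAI-compatible endpoint; whether intermediate reasoning is returned, and under which field, varies by provider, so the reasoning_content attribute below is an assumption.

```python
# Minimal sketch for a reasoning model, assuming an OpenAI-compatible endpoint.
# The reasoning_content field is an assumption; some providers return only the
# final answer, others expose the chain of thought separately.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cheapestinference.example/v1",  # hypothetical endpoint
    api_key=os.environ["CHEAPESTINFERENCE_API_KEY"],
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",  # placeholder ID for DeepSeek R1
    messages=[{"role": "user", "content": "A train leaves at 9:40 and arrives at 11:05. How long is the trip?"}],
)

message = response.choices[0].message
print(getattr(message, "reasoning_content", None))  # intermediate reasoning, if exposed
print(message.content)                              # final answer
```

Reasoning models trade extra output tokens (and therefore cost and latency) for better performance on logic-heavy, multi-step tasks, so reserve them for problems that need that depth.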