
GLM 4.7 API — unlimited & flat-rate access

GLM 4.7 is a strong open-source coding model from Zhipu AI (Z.ai). CheapestInference serves it through an OpenAI- and Anthropic-compatible API on flat-rate monthly plans and a truly unlimited pool, so your cost does not scale with tokens.

Model: GLM 4.7
Provider: Zhipu AI (Z.ai)
Model ID: glm-4.7
Context window: 198K tokens
Cost basis: $0.40 / $1.75 per 1M tokens (input / output)
Endpoints: /v1/chat/completions (OpenAI), /anthropic/v1/messages (Anthropic)
Pricing: From $39/mo (reserve an 8-hour daily time block, up to full 24/7)
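from openai import OpenAI

# Point the standard OpenAI client at the CheapestInference endpoint.
client = OpenAI(
    base_url="https://api.cheapestinference.com/v1",
    api_key="sk-...",  # your subscriber key
)

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Write a unit test for..."}],
)
print(response.choices[0].message.content)  # the model's reply text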
curl https://api.cheapestinference.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "glm-4.7", "messages": [{"role": "user", "content": "Hello"}]}'
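
The examples above use the OpenAI-style endpoint. For the Anthropic-style endpoint, a minimal sketch might look like the following. It relies on several assumptions not confirmed on this page: that /anthropic/v1/messages lives on the same host as the OpenAI base URL, that the key is accepted in an x-api-key header, and that the usual anthropic-version header applies. The helper names build_messages_request and send are illustrative only.

```python
import json
import urllib.request

# Assumption: same host as the OpenAI-compatible base URL, combined with the
# Anthropic path listed under "Endpoints".
ANTHROPIC_URL = "https://api.cheapestinference.com/anthropic/v1/messages"


def build_messages_request(api_key: str, prompt: str,
                           max_tokens: int = 512) -> urllib.request.Request:
    """Build (but do not send) an Anthropic Messages-style request for glm-4.7."""
    body = {
        "model": "glm-4.7",
        "max_tokens": max_tokens,  # required field in the Anthropic Messages format
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "Content-Type": "application/json",
        "x-api-key": api_key,               # assumption: Anthropic-style key header
        "anthropic-version": "2023-06-01",  # assumption: standard version header
    }
    return urllib.request.Request(
        ANTHROPIC_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)
```

Calling send(build_messages_request("sk-...", "Hello")) would perform the request. In the Anthropic Messages format the reply text is normally found at content[0]["text"] in the response, though that should be verified against this service.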

GLM 4.7 is a capable, cost-efficient coding model. On CheapestInference you pay a fixed monthly fee rather than per token, so heavy coding and agent workloads have a predictable cost. It is part of the frontier failover set alongside Kimi K2.6 and MiniMax M2.5, and works in any OpenAI-compatible client (Cline, Roo Code, Continue, and similar).
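
To see where a flat fee starts to pay off, here is a rough arithmetic sketch. It assumes the "Cost basis" figures on this page ($0.40 input / $1.75 output per 1M tokens) are the per-token rates you would otherwise pay, and it compares them with the $39/month entry price. It ignores the time-block restriction, and the function names are illustrative.

```python
FLAT_MONTHLY_USD = 39.00   # entry plan price listed on this page
INPUT_USD_PER_M = 0.40     # "Cost basis" input rate, per 1M tokens
OUTPUT_USD_PER_M = 1.75    # "Cost basis" output rate, per 1M tokens


def per_token_cost(input_tokens: int, output_tokens: int) -> float:
    """Monthly cost in USD if the same traffic were billed per token."""
    return (input_tokens / 1e6) * INPUT_USD_PER_M + (output_tokens / 1e6) * OUTPUT_USD_PER_M


def break_even_tokens(output_share: float) -> float:
    """Total monthly tokens at which per-token billing equals the flat fee.

    output_share is the fraction of all tokens that are output tokens (0.0 to 1.0).
    """
    blended_rate = (1 - output_share) * INPUT_USD_PER_M + output_share * OUTPUT_USD_PER_M
    return FLAT_MONTHLY_USD / blended_rate * 1e6


# 50M input + 10M output tokens in a month: 20.00 + 17.50 = 37.50 USD per token,
# which is just under the flat fee.
monthly = per_token_cost(50_000_000, 10_000_000)
```

Under these assumptions, traffic made up only of input tokens breaks even at about 97.5M tokens per month, and traffic made up only of output tokens at about 22.3M. Real workloads fall somewhere in between, depending on their input/output mix.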

Is there a GLM 4.7 API? Yes. Use model id glm-4.7 against https://api.cheapestinference.com/v1. The API is OpenAI- and Anthropic-SDK compatible.

How much does GLM 4.7 cost? From $39/month. You reserve one or more 8-hour daily time blocks (up to full 24/7) and use GLM 4.7 with no usage cap — billed at a flat monthly fee, not per token.

Which provider is GLM 4.7 from? GLM 4.7 is made by Zhipu AI (Z.ai). CheapestInference serves it with automatic failover across direct providers.