z.ai OpenAI-compatible · Anthropic-compatible
Beijing lab behind the GLM family of coding models.
⚠️Heads up: Aggressive 2026 price hikes. The Lite plan launched in February 2026 around $3/mo and has been moved up several times since — currently $18/mo (≈$30/quarter). That puts it within a few dollars of Claude Pro ($20/mo). Marketing still claims '3× Claude Pro usage', but that figure is vendor-supplied and based on z.ai's own quota model, not an apples-to-apples measurement. Verify the latest pricing on z.ai/subscribe before subscribing.
z.ai (Zhipu) is the Chinese lab behind the GLM family of language models. Its GLM Coding Plan is a flat-rate subscription tuned for coding agents — out of the box it works with Claude Code, Cursor, Cline and 20+ other coding tools, and the GLM-4.5 / 4.7 / 5 / 5.1 lineup is built specifically for software engineering tasks.
Strengths
- Predictable monthly cost
- Accepts both OpenAI and Anthropic wire formats
- Integrated MCPs: vision, web search, web reader, repo access
When to use it
- Daily coding agents where flat-rate beats PAYG
- Claude Code users wanting an alternative provider
- Workloads inside or outside China
Subscription plans
| Plan | Price | Quota | Available |
|---|---|---|---|
| Lite | $18/mo | 400 prompts / 5h, 2,000 / week | yes |
| Pro | $36/mo | 2,000 prompts / 5h, unlimited weekly | yes |
| Max | $96/mo | No practical cap, peak-hour SLA | yes |
Notes: Quarterly billing only — month numbers above are quarterly ÷ 3. GLM-4.5-Flash stays free for registered users (PAYG). Verify pricing on z.ai/subscribe; z.ai has changed prices several times in 2026.
Referral: GLM Coding Plan runs a referral system: invitees get 10% off their first subscription, referrers earn credits once payment clears (24-48h review). Press 'Invite Now' on z.ai/subscribe and drop your link into `signup_url`.
Models tested on z.ai
Speed numbers below are specific to z.ai's routing and hardware. The same model may appear on other providers' pages with different throughput.
| Model | Best tok/s | Avg tok/s | Runs | Success | Longest output (chars) |
|---|---|---|---|---|---|
| glm-4.5-air | 100.9 | 75.8 | 3 | 100% | 5,214 |
| glm-5-turbo | 77.9 | 44.8 | 3 | 100% | 3,245 |
| glm-4.7 | 74.2 | 56.1 | 3 | 100% | 2,734 |
| glm-5.1 | 49.9 | 40.3 | 3 | 100% | 2,680 |