Alibaba OpenAI-compatible
Qwen3-Coder, Kimi, GLM, MiniMax — new Token Plan from $30/seat/mo (legacy Coding Plan unobtainable).
⚠️ Heads up: The Token Plan ToS restrict use to interactive AI coding/agent tools only. Alibaba's FAQ forbids automated scripts, application backends, benchmarking and research scripts, with API key revocation as the stated penalty. Long agentic sessions inside a compatible tool (Claude Code, opencode, Qwen Code) with a human supervising are still allowed: the test is "is there a person driving a compatible tool?", not how long the session lasts. Our weekly benchmark of Alibaba runs on a legacy Coding Plan subscription, whose terms did not include this clause.
Alibaba Model Studio sells two distinct subscriptions for the same model catalogue (Qwen3-Coder, Qwen3.6-Plus, Kimi-K2.5, GLM-5, MiniMax-M2.5, DeepSeek-V3.2): the older Coding Plan charges by request count, while the newer Token Plan (Team Edition) charges by Credits derived from input + cached + output tokens. Qwen3-Coder was trained specifically for agentic coding; the other models are reseller offerings on Alibaba's infrastructure.
Strengths
- Token Plan reopened the front door — first realistic way for new users to subscribe in months
- Multiple top open-source coding models on one key
- OpenAI-compatible — drop-in for opencode and others
When to use it
- Daily interactive coding with one of the compatible agents
- Workloads needing very long context (Qwen3.5-Plus: 1M tokens)
- Mixing several open-source coding models without switching keys
Subscription plans
| Plan | Price | Quota | Available |
|---|---|---|---|
| Token Plan · Standard Seat | $30/mo | 25,000 credits/mo · text, vision, image gen — interactive use only | yes |
| Token Plan · Pro Seat | $100/mo | 100,000 credits/mo (4× Standard) — Alibaba's recommended tier | yes |
| Token Plan · Max Seat | $200/mo | 250,000 credits/mo (10× Standard) | yes |
| Coding Plan · Lite (legacy) | $10/mo | 18,000 requests/mo — closed to new subs since 2026-03-20 | closed / sold out |
| Coding Plan · Pro (legacy) | $50/mo | 90,000 requests/mo — effectively impossible to buy | closed / sold out |
Notes: These are two parallel products; do not confuse them. The legacy **Coding Plan** ($10/$50) bills by request count (5-30 requests per query depending on complexity, per Alibaba's FAQ) and has been effectively unobtainable since rollout: checkout races sell out in under a second. The new **Token Plan (Team Edition)** bills by Credits derived from input + cached + output tokens, with no public credit-to-token conversion table; actual throughput varies by model, thinking mode and tool calls. Pricing went up sharply: the closest equivalent to the legacy $50 Pro is now the Token Plan Pro Seat at $100/mo. Each seat is for a single subscriber, with no sharing, no refunds and no cancellations.
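To make the two billing models easier to compare, the plan figures can be reduced to a unit price and, for the legacy plan, a rough monthly query budget. The following is a minimal sketch that only does arithmetic on the numbers quoted in the table and notes above. The function names are illustrative, and because Alibaba publishes no credit-to-token conversion, the per-credit price says nothing about how many tokens a credit actually buys.

```python
# Figures copied from the plan table above; nothing here calls an Alibaba API.
TOKEN_PLAN = {  # seat name -> (USD per month, credits per month)
    "Standard Seat": (30, 25_000),
    "Pro Seat": (100, 100_000),
    "Max Seat": (200, 250_000),
}
CODING_PLAN = {  # legacy tier -> (USD per month, requests per month)
    "Lite (legacy)": (10, 18_000),
    "Pro (legacy)": (50, 90_000),
}
# The FAQ range quoted above: one query consumes 5 to 30 requests.
REQUESTS_PER_QUERY = (5, 30)


def usd_per_thousand(price_usd, units):
    """Price of 1,000 billing units (credits or requests), rounded to cents."""
    return round(price_usd / units * 1000, 2)


def query_range(requests):
    """(worst case, best case) whole queries per month for a request quota."""
    cheapest, costliest = REQUESTS_PER_QUERY
    return requests // costliest, requests // cheapest


for seat, (price, credits) in TOKEN_PLAN.items():
    print(f"Token Plan {seat}: ${usd_per_thousand(price, credits)} per 1,000 credits")

for tier, (price, requests) in CODING_PLAN.items():
    low, high = query_range(requests)
    print(f"Coding Plan {tier}: ${usd_per_thousand(price, requests)} per 1,000 requests, "
          f"roughly {low:,} to {high:,} queries/mo")
```

On these figures the Token Plan gets cheaper per credit as the seat size grows, while both legacy tiers cost the same per request.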
Models tested on Alibaba
Speed numbers below are specific to Alibaba's routing and hardware. The same model may appear on other providers' pages with different throughput.
| Model | Best tok/s | Avg tok/s | Runs | Success | Longest output (chars) |
|---|---|---|---|---|---|
| qwen3-coder-next | 146.6 | 143.2 | 4 | 75% | 4,629 |
| glm-4.7 | 74.1 | 62.0 | 4 | 75% | 4,782 |
| qwen3.5-plus | 64.1 | 57.1 | 4 | 75% | 4,138 |
| MiniMax-M2.5 | 62.2 | 50.6 | 4 | 75% | 3,169 |
| qwen3-coder-plus | 59.5 | 40.8 | 4 | 75% | 2,627 |
| qwen3.6-plus | 53.4 | 52.8 | 4 | 75% | 3,951 |
| kimi-k2.5 | 48.8 | 41.7 | 4 | 75% | 2,572 |
| glm-5 | 40.6 | 39.8 | 4 | 75% | 4,788 |
| qwen3-max-2026-01-23 | 32.1 | 31.6 | 4 | 75% | 3,614 |
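One way to read the throughput columns is to convert them into a wait time for a response of a given size, and to check how far the average falls below the best run. The sketch below does this offline using the table values. The 2,000-token response size is an arbitrary assumption, the helper names are illustrative, and the estimate ignores time to first token, thinking tokens and queueing. With only 4 runs per model, treat the results as rough.

```python
# (best tok/s, avg tok/s) copied from the table above.
BENCH = {
    "qwen3-coder-next": (146.6, 143.2),
    "glm-4.7": (74.1, 62.0),
    "qwen3.5-plus": (64.1, 57.1),
    "MiniMax-M2.5": (62.2, 50.6),
    "qwen3-coder-plus": (59.5, 40.8),
    "qwen3.6-plus": (53.4, 52.8),
    "kimi-k2.5": (48.8, 41.7),
    "glm-5": (40.6, 39.8),
    "qwen3-max-2026-01-23": (32.1, 31.6),
}

# Assumed response size for illustration only; not a figure from the benchmark.
OUTPUT_TOKENS = 2_000


def seconds_for(tokens, tok_per_s):
    """Streaming time in seconds for `tokens` at a steady decode rate."""
    return round(tokens / tok_per_s, 1)


def avg_to_best_ratio(best, avg):
    """Values near 1.0 mean consistent runs; lower values mean more spread."""
    return round(avg / best, 2)


for model, (best, avg) in BENCH.items():
    print(f"{model}: ~{seconds_for(OUTPUT_TOKENS, avg)} s at avg speed, "
          f"avg/best = {avg_to_best_ratio(best, avg)}")
```

A low avg/best ratio, as with qwen3-coder-plus, suggests the best figure came from an unusually fast run, so the average is the safer number to plan around.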