Alibaba OpenAI-compatible

Qwen3-Coder, Kimi, GLM, MiniMax — new Token Plan from $30/seat/mo (legacy Coding Plan unobtainable).

⚠️ Heads up: Retired from the weekly bench after 2026-W22 because of a ToS incompatibility. Both the Token Plan and the legacy Coding Plan FAQs restrict use to interactive AI coding/agent tools driven by a human, explicitly forbidding automated scripts, application backends and benchmarking. MSA's bench is exactly the script-driven workload the clause prohibits, so we will stop running it against Alibaba from 2026-W23 onwards. Historical data through 2026-W22 stays on this page for context. Thanks to forkline.dev for the API key that let us cover Alibaba up to this point.

Alibaba Model Studio sells two distinct subscriptions for the same model catalogue (Qwen3-Coder, Qwen3.6-Plus, Kimi-K2.5, GLM-5, MiniMax-M2.5, DeepSeek-V3.2): the older Coding Plan charges by request count, while the newer Token Plan (Team Edition) charges by Credits derived from input + cached + output tokens. Qwen3-Coder was trained specifically for agentic coding; the other models are reseller offerings on Alibaba's infrastructure.

Strengths

Token Plan reopened the front door — first realistic way for new users to subscribe in months
Multiple top open-source coding models on one key
OpenAI-compatible — drop-in for opencode and others

When to use it

Daily interactive coding with one of the compatible agents
Workloads needing very long context (Qwen3.5-Plus: 1M tokens)
Mixing several open-source coding models without switching keys

Subscription plans

Plan	Price	Quota	Available
Token Plan · Standard Seat	$30/mo	25,000 credits/mo · text, vision, image gen — interactive use only	yes
Token Plan · Pro Seat	$100/mo	100,000 credits/mo (4× Standard) — Alibaba's recommended tier	yes
Token Plan · Max Seat	$200/mo	250,000 credits/mo (10× Standard)	yes
Coding Plan · Lite (legacy)	$10/mo	18,000 requests/mo — closed to new subs since 2026-03-20	closed / sold out
Coding Plan · Pro (legacy)	$50/mo	90,000 requests/mo — effectively impossible to buy	closed / sold out

Notes: These are two parallel products; do not confuse them. The legacy Coding Plan ($10/$50) bills by request count (5-30 requests per query depending on complexity, per their FAQ) and has been effectively unobtainable since rollout: checkout races sell out in under a second. The new Token Plan (Team Edition) bills by Credits derived from input + cached + output tokens, with no public credit-to-token conversion table; actual throughput varies by model, thinking mode and tool calls. Pricing went up sharply: the closest equivalent to the legacy $50 Pro is now Token Plan Pro Seat at $100/mo. Each seat is limited to a single subscriber, with no sharing, no refunds and no cancellations.
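
The plan arithmetic can be sketched in a few lines of Python. This is a rough sketch that uses only the prices and quotas from the table and the 5-30 requests-per-query range quoted from the FAQ. The names `TOKEN_PLAN`, `CODING_PLAN`, `usd_per_1k_credits` and `queries_per_month` are illustrative, and since no credit-to-token table is published, nothing here tries to convert credits into tokens.

```python
# Prices and quotas copied from the subscription table.
TOKEN_PLAN = {  # seat -> (USD per month, credits per month)
    "Standard": (30, 25_000),
    "Pro": (100, 100_000),
    "Max": (200, 250_000),
}
CODING_PLAN = {  # legacy tier -> (USD per month, requests per month)
    "Lite": (10, 18_000),
    "Pro": (50, 90_000),
}
# Per the FAQ, one query consumes 5-30 requests depending on complexity.
REQUESTS_PER_QUERY = (5, 30)


def usd_per_1k_credits(seat: str) -> float:
    """List price of 1,000 credits on a Token Plan seat."""
    price, credits = TOKEN_PLAN[seat]
    return price / credits * 1000


def queries_per_month(tier: str) -> tuple[int, int]:
    """(worst case, best case) number of queries a legacy quota covers."""
    _, requests = CODING_PLAN[tier]
    cheapest, costliest = REQUESTS_PER_QUERY
    return requests // costliest, requests // cheapest


if __name__ == "__main__":
    for seat in TOKEN_PLAN:
        print(f"Token Plan {seat}: ${usd_per_1k_credits(seat):.2f} per 1,000 credits")
    for tier in CODING_PLAN:
        low, high = queries_per_month(tier)
        print(f"Coding Plan {tier}: roughly {low:,}-{high:,} queries/mo")
```

At list price, 1,000 credits cost about $1.20 on Standard, $1.00 on Pro and $0.80 on Max, while the legacy Lite quota covers roughly 600 to 3,600 queries a month depending on query complexity.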

Models tested on Alibaba

Speed numbers below are specific to Alibaba's routing and hardware. The same model may appear on other providers' pages with different throughput.

Chart: best tok/s observed on Alibaba per weekly snapshot (2 data points).

Model	Best tok/s	Avg tok/s	Runs	Success	Longest output (chars)
qwen3-coder-next	146.6	143.2	3	100%	4,629
glm-4.7	74.1	62.0	3	100%	4,782
qwen3.5-plus	64.1	57.1	3	100%	4,138
MiniMax-M2.5	62.2	50.6	3	100%	3,169
qwen3-coder-plus	59.5	40.8	3	100%	2,627
qwen3.6-plus	53.4	52.8	3	100%	3,951
kimi-k2.5	48.8	41.7	3	100%	2,572
glm-5	40.6	39.8	3	100%	4,788
qwen3-max-2026-01-23	32.1	31.6	3	100%	3,614
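
To turn the table into something more tangible, the sketch below estimates how long a reply of a given length would take to stream at each model's measured rate. It assumes tok/s means output tokens per second of generation and ignores time to first token and network latency, neither of which the table reports. The names `SPEEDS`, `seconds_to_generate` and `avg_to_best_ratio` are illustrative.

```python
# (best tok/s, avg tok/s) copied from the table above.
SPEEDS = {
    "qwen3-coder-next": (146.6, 143.2),
    "glm-4.7": (74.1, 62.0),
    "qwen3.5-plus": (64.1, 57.1),
    "MiniMax-M2.5": (62.2, 50.6),
    "qwen3-coder-plus": (59.5, 40.8),
    "qwen3.6-plus": (53.4, 52.8),
    "kimi-k2.5": (48.8, 41.7),
    "glm-5": (40.6, 39.8),
    "qwen3-max-2026-01-23": (32.1, 31.6),
}


def seconds_to_generate(model: str, output_tokens: int, use_best: bool = False) -> float:
    """Estimated streaming time for output_tokens, excluding any startup latency."""
    best, avg = SPEEDS[model]
    return output_tokens / (best if use_best else avg)


def avg_to_best_ratio(model: str) -> float:
    """Average rate as a fraction of the best rate; closer to 1.0 means steadier runs."""
    best, avg = SPEEDS[model]
    return avg / best


if __name__ == "__main__":
    for model in SPEEDS:
        secs = seconds_to_generate(model, 1000)
        ratio = avg_to_best_ratio(model)
        print(f"{model}: ~{secs:.1f}s per 1,000 tokens, avg/best = {ratio:.2f}")
```

At average rates, a 1,000-token reply would take about 7 seconds on qwen3-coder-next and about 25 seconds on glm-5. With only 3 runs per model, the avg/best ratio is a rough indicator of consistency at most.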