Alibaba OpenAI-compatible

Qwen3-Coder, Kimi, GLM, MiniMax — new Token Plan from $30/seat/mo (legacy Coding Plan unobtainable).

⚠️Heads up: Retired from the weekly bench after 2026-W22 — ToS incompatibility. Both the Token Plan and the legacy Coding Plan FAQs restrict use to interactive AI coding/agent tools driven by a human, explicitly forbidding automated scripts, application backends and benchmarking. MSA's bench is exactly the script-driven workload the clause prohibits, so we will stop running it against Alibaba from 2026-W23 onwards. Historical data through 2026-W22 stays on this page for context. Thanks to forkline.dev for the API key that let us cover Alibaba up to this point.

Alibaba Model Studio sells two distinct subscriptions for the same model catalogue (Qwen3-Coder, Qwen3.6-Plus, Kimi-K2.5, GLM-5, MiniMax-M2.5, DeepSeek-V3.2): the older Coding Plan charges by request count, the newer Token Plan (Team Edition) charges by Credits derived from input + cached + output tokens. Qwen3-Coder was trained specifically for agentic coding; the other models are reseller offerings on Alibaba's infrastructure.

Strengths

  • Token Plan reopened the front door — first realistic way for new users to subscribe in months
  • Multiple top open-source coding models on one key
  • OpenAI-compatible — drop-in for opencode and others

When to use it

  • Daily interactive coding with one of the compatible agents
  • Workloads needing very long context (Qwen3.5-Plus: 1M tokens)
  • Mixing several open-source coding models without switching keys

Subscription plans

PlanPriceQuotaAvailable
Token Plan · Standard Seat$30/mo25,000 credits/mo · text, vision, image gen — interactive use onlyyes
Token Plan · Pro Seat$100/mo100,000 credits/mo (4× Standard) — Alibaba's recommended tieryes
Token Plan · Max Seat$200/mo250,000 credits/mo (10× Standard)yes
Coding Plan · Lite (legacy)$10/mo18,000 requests/mo — closed to new subs since 2026-03-20closed / sold out
Coding Plan · Pro (legacy)$50/mo90,000 requests/mo — effectively impossible to buyclosed / sold out
Notes: Two parallel products, do not confuse them. The legacy <strong>Coding Plan</strong> ($10/$50) bills by request count (5-30 requests per query depending on complexity, per their FAQ) and has been effectively unobtainable since rollout — checkout races sell out in under a second. The new <strong>Token Plan (Team Edition)</strong> bills by Credits derived from input + cached + output tokens, with no public credit-to-token conversion table — actual throughput varies by model, thinking mode and tool calls. Pricing went up sharply: the closest equivalent to the legacy $50 Pro is now Token Plan Pro Seat at $100/mo. Single subscriber per seat, no sharing, no refunds, no cancellations.

Models tested on Alibaba

Speed numbers below are specific to Alibaba's routing and hardware. The same model may appear on other providers' pages with different throughput.

2026-04-26 2026-05-10 peak 147 tok/s
Best tok/s observed on Alibaba per weekly snapshot (2 points).
Model Best tok/sAvg tok/s RunsSuccess Longest output (chars)
qwen3-coder-next146.6143.23100%4,629
glm-4.774.162.03100%4,782
qwen3.5-plus64.157.13100%4,138
MiniMax-M2.562.250.63100%3,169
qwen3-coder-plus59.540.83100%2,627
qwen3.6-plus53.452.83100%3,951
kimi-k2.548.841.73100%2,572
glm-540.639.83100%4,788
qwen3-max-2026-01-2332.131.63100%3,614