DeepSeek (OpenAI-compatible API)
Chinese AI lab known for open-weight MoE models that punch above their price.
DeepSeek is a Chinese research lab whose Mixture-of-Experts models (V3.2 chat, V3.2 Reasoner) regularly land near frontier performance on coding benchmarks at a fraction of GPT/Claude pricing. The weights are open on Hugging Face, so you can self-host if you have the GPUs.
Strengths
- Open weights — runnable locally on a beefy machine
- Reasoner variant for math and complex coding
- Free tier: ~5M tokens for new accounts
When to use it
- Coding workloads where Reasoner thinking helps
- Prototyping against the hosted API with the option to self-host the open weights in production later
- Rapid prototyping
Notes: Pay-as-you-go API. New accounts get ~5M tokens free for 30 days.
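Because the API is OpenAI-compatible, a plain chat-completions POST is all you need. A minimal stdlib-only sketch, assuming the standard `https://api.deepseek.com/chat/completions` endpoint and a `DEEPSEEK_API_KEY` environment variable (check DeepSeek's docs for the current endpoint and model names):

```python
# Minimal sketch of a chat call against DeepSeek's OpenAI-compatible API
# using only the Python standard library. Endpoint and payload shape follow
# the usual OpenAI chat-completions convention; verify against DeepSeek's docs.
import json
import os
import urllib.request

API_URL = "https://api.deepseek.com/chat/completions"  # assumed endpoint


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request; kept separate so it can be inspected/tested offline."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def ask(model: str, prompt: str) -> str:
    """Send one prompt and return the assistant's reply text."""
    req = build_request(model, prompt, os.environ["DEEPSEEK_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]
```

Any OpenAI SDK also works if you point its base URL at `https://api.deepseek.com` instead of OpenAI's.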
Models tested on DeepSeek
Speed numbers below are specific to DeepSeek's routing and hardware. The same model may appear on other providers' pages with different throughput.
| Model | Best tok/s | Avg tok/s | Runs | Success | Longest output (chars) |
|---|---|---|---|---|---|
| deepseek-v4-flash | 7394.8 | 6432.0 | 4 | 100% | 5,640 |
| deepseek-chat | 7329.9 | 4407.4 | 4 | 100% | 6,293 |
| deepseek-v4-pro | 7056.0 | 6055.7 | 4 | 100% | 5,006 |
| deepseek-reasoner | 6136.8 | 5604.2 | 4 | 100% | 5,116 |
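For reference, the Best/Avg columns above reduce to simple per-run rates. A small helper (hypothetical name, not part of any benchmark harness) showing the arithmetic, where each run is a `(completion_tokens, elapsed_seconds)` pair:

```python
def summarize_runs(runs: list[tuple[int, float]]) -> tuple[float, float]:
    """Compute (best tok/s, average tok/s) across runs.

    Each run is (completion_tokens, elapsed_seconds); the rate for a run
    is tokens divided by wall-clock time, and "average" here is the mean
    of the per-run rates, not total tokens over total time.
    """
    rates = [tokens / seconds for tokens, seconds in runs]
    return max(rates), sum(rates) / len(rates)


# Example: two runs at 100 tok/s and 50 tok/s
best, avg = summarize_runs([(1000, 10.0), (500, 10.0)])
print(best, avg)  # 100.0 75.0
```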