DeepSeek (OpenAI-compatible API)
Chinese AI lab known for open-weight MoE models that punch above their price.
DeepSeek is a Chinese research lab whose Mixture-of-Experts models (V3.2 chat, V3.2 Reasoner) regularly land near frontier performance on coding benchmarks at a fraction of GPT/Claude pricing. The weights are open on Hugging Face, so you can self-host if you have the GPUs.
Strengths
- Open weights — runnable locally on a beefy machine
- Reasoner variant for math and complex coding
- Free tier: ~5M tokens for new accounts
When to use it
- Coding workloads where Reasoner thinking helps
- Prototyping against the hosted API with the option to self-host the open weights in production later
- Rapid prototyping
Notes: Pay-as-you-go API. New accounts get ~5M tokens free for 30 days.
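Because the API is OpenAI-compatible, a plain chat-completions POST is all you need. A minimal stdlib-only sketch, assuming the standard `https://api.deepseek.com/chat/completions` endpoint and a `DEEPSEEK_API_KEY` environment variable (check DeepSeek's docs for the current endpoint and model names):

```python
# Minimal sketch of a chat call against DeepSeek's OpenAI-compatible API
# using only the Python standard library. Endpoint and payload shape follow
# the usual OpenAI chat-completions convention; verify against DeepSeek's docs.
import json
import os
import urllib.request

API_URL = "https://api.deepseek.com/chat/completions"  # assumed endpoint


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request; kept separate so it can be inspected/tested offline."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def ask(model: str, prompt: str) -> str:
    """Send one prompt and return the assistant's reply text."""
    req = build_request(model, prompt, os.environ["DEEPSEEK_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]
```

Any OpenAI SDK also works if you point its base URL at `https://api.deepseek.com` instead of OpenAI's.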
Models tested on DeepSeek
Speed numbers below are specific to DeepSeek's routing and hardware. The same model may appear on other providers' pages with different throughput.
| Model | Best tok/s | Avg tok/s | Runs | Success | Longest output (chars) |
|---|---|---|---|---|---|
| deepseek-v4-flash | 7394.8 | 6432.0 | 4 | 100% | 5,640 |
| deepseek-chat | 7329.9 | 4407.4 | 4 | 100% | 6,293 |
| deepseek-v4-pro | 7056.0 | 6055.7 | 4 | 100% | 5,006 |
| deepseek-reasoner | 6136.8 | 5604.2 | 4 | 100% | 5,116 |
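For reference, the Best/Avg columns above reduce to simple per-run rates. A small helper (hypothetical name, not part of any benchmark harness) showing the arithmetic, where each run is a `(completion_tokens, elapsed_seconds)` pair:

```python
def summarize_runs(runs: list[tuple[int, float]]) -> tuple[float, float]:
    """Compute (best tok/s, average tok/s) across runs.

    Each run is (completion_tokens, elapsed_seconds); the rate for a run
    is tokens divided by wall-clock time, and "average" here is the mean
    of the per-run rates, not total tokens over total time.
    """
    rates = [tokens / seconds for tokens, seconds in runs]
    return max(rates), sum(rates) / len(rates)


# Example: two runs at 100 tok/s and 50 tok/s
best, avg = summarize_runs([(1000, 10.0), (500, 10.0)])
print(best, avg)  # 100.0 75.0
```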