Synthetic (OpenAI-compatible)
Run open-source AI models privately — flat $30/mo or pay-per-token.
Synthetic (Synthetic Lab) runs open-source AI models in private, secure datacenters; it never trains on user data and does not store API prompts or completions. The catalogue spans DeepSeek, GLM, Kimi, Llama, MiniMax, Qwen, and more: any model vLLM supports can be served. A flat-rate subscription at $30/mo covers all always-on models; usage-based pay-as-you-go (PAYG) is also available.
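Because the API is OpenAI-compatible, any OpenAI-style client works once the base URL and key are swapped. A minimal stdlib-only sketch follows; the base URL here is an assumption for illustration (check Synthetic's docs for the real endpoint), and note that model ids carry an `hf:` prefix as in the tables below:

```python
import json
import urllib.request

# Assumed endpoint for illustration; confirm against Synthetic's documentation.
BASE_URL = "https://api.synthetic.new/v1"

def build_chat_request(model: str, prompt: str, api_key: str):
    """Build an OpenAI-style chat.completions request: (url, body, headers)."""
    payload = {
        # Synthetic model ids use an hf: prefix, e.g. "hf:zai-org/GLM-4.7".
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return f"{BASE_URL}/chat/completions", json.dumps(payload).encode(), headers

def chat(model: str, prompt: str, api_key: str) -> str:
    """POST the request and return the assistant's reply text."""
    url, body, headers = build_chat_request(model, prompt, api_key)
    req = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Switching models is just a matter of changing the `model` string; the request shape stays identical across the whole catalogue.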
Strengths
- Private datacenters — no data stored, no training on prompts
- Flat $30/mo includes all always-on models
- Broad open-source catalogue (DeepSeek, GLM, Kimi, Llama, MiniMax, Qwen)
When to use it
- Coding agents that need privacy guarantees
- Running many open-source models under one subscription
- Switching between models without per-token billing surprises
Subscription plans
| Plan | Price | Quota | Available |
|---|---|---|---|
| Subscription | $30/mo | 500 messages / 5h · all models included · 1 concurrent req/model | yes |
| Usage-based | $0/mo | Pay-per-token · all models | yes |
Notes: the subscription works out to $1/day ($30/mo). Each pack adds one concurrent request per model; buy more packs to scale concurrency. All always-on models are included in the subscription with no per-token charges. Usage-based PAYG is also available for enterprise.
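Since the concurrency cap is per model (one request per model per pack), a client can enforce it locally rather than eating rejected requests. A sketch using per-model semaphores; the pack count is an assumption you would set to what you actually own:

```python
import threading

# Assumed: you own one pack, so the cap is 1 concurrent request per model.
PACKS_OWNED = 1

_limits: dict[str, threading.Semaphore] = {}
_lock = threading.Lock()

def limiter_for(model: str) -> threading.Semaphore:
    """Return the per-model semaphore matching the subscription's cap."""
    with _lock:
        if model not in _limits:
            _limits[model] = threading.Semaphore(PACKS_OWNED)
        return _limits[model]

def call_with_limit(model: str, fn, *args, **kwargs):
    """Run fn, blocking until a concurrency slot for this model frees up."""
    with limiter_for(model):
        return fn(*args, **kwargs)
```

Because the cap is per model, fanning work out across several models (easy under the flat subscription) multiplies your effective concurrency without extra packs.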
Referral: Synthetic runs a referral program; sign up via the link above and both you and the referrer earn bonus API credits.
Models tested on Synthetic
Speed numbers below are specific to Synthetic's routing and hardware. The same model may appear on other providers' pages with different throughput.
| Model | Best tok/s | Avg tok/s | Runs | Success | Longest output (chars) |
|---|---|---|---|---|---|
| hf:moonshotai/Kimi-K2.6 | 224.2 | 172.2 | 2 | 100% | 4,093 |
| hf:Qwen/Qwen3.5-397B-A17B | 190.7 | 152.1 | 4 | 100% | 3,393 |
| hf:zai-org/GLM-4.7 | 173.7 | 167.0 | 4 | 100% | 2,894 |
| hf:zai-org/GLM-5.1 | 164.1 | 124.6 | 4 | 100% | 2,572 |
| hf:zai-org/GLM-4.7-Flash | 150.3 | 117.0 | 4 | 100% | 2,656 |
| hf:nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 | 142.2 | 136.2 | 4 | 100% | 3,959 |
| hf:deepseek-ai/DeepSeek-R1-0528 | 135.2 | 121.6 | 4 | 100% | 5,105 |
| hf:openai/gpt-oss-120b | 132.6 | 100.5 | 4 | 100% | 4,966 |
| hf:deepseek-ai/DeepSeek-R1 | 131.2 | 118.3 | 4 | 100% | 4,946 |
| hf:MiniMaxAI/MiniMax-M2.5 | 129.5 | 117.6 | 4 | 100% | 3,304 |
| hf:zai-org/GLM-5 | 110.8 | 108.2 | 4 | 100% | 3,894 |
| hf:meta-llama/Llama-3.3-70B-Instruct | 93.9 | 70.8 | 4 | 100% | 2,619 |
| hf:Qwen/Qwen3-Coder-480B-A35B-Instruct | 90.8 | 53.9 | 4 | 100% | 2,452 |
| hf:deepseek-ai/DeepSeek-V3 | 88.0 | 52.9 | 4 | 100% | 4,684 |
| hf:deepseek-ai/DeepSeek-V3.2 | 80.6 | 75.8 | 4 | 100% | 4,032 |
| hf:nvidia/Kimi-K2.5-NVFP4 | 26.7 | 18.1 | 4 | 50% | 3,541 |
| hf:moonshotai/Kimi-K2.5 | 25.6 | 22.8 | 4 | 50% | 3,949 |
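Throughput figures like those above can be reproduced with a stopwatch over a streaming response. A sketch of the measurement, assuming a generic iterable of generated tokens (how you obtain the stream is up to your client; the helper itself is illustrative):

```python
import time

def measure_throughput(stream) -> float:
    """Compute tokens/sec from an iterable yielding generated tokens.

    The clock starts at the first token, so connection and prefill
    latency are excluded; only decode speed is measured.
    """
    count = 0
    start = None
    for _tok in stream:
        if start is None:
            start = time.perf_counter()
        count += 1
    if start is None or count < 2:
        return 0.0  # need at least two tokens to measure an interval
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (count - 1) / elapsed  # first token only anchors the clock
```

Averaging several runs, as in the table (see the Runs column), smooths out routing and load variance on the provider side.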