Synthetic (OpenAI-compatible)
Run open-source AI models privately — flat $30/mo or pay-per-token.
Synthetic (Synthetic Lab) runs open-source AI models in private, secure datacenters; it never trains on user data and does not store API prompts or completions. The catalogue spans DeepSeek, GLM, Kimi, Llama, MiniMax, Qwen, and more: any model vLLM supports can be served. A flat-rate subscription at $30/mo covers all always-on models; usage-based pay-as-you-go (PAYG) is also available.
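Because the API is OpenAI-compatible, any OpenAI-style client works once the base URL and key are swapped. A minimal stdlib-only sketch follows; the base URL here is an assumption for illustration (check Synthetic's docs for the real endpoint), and note that model ids carry an `hf:` prefix as in the tables below:

```python
import json
import urllib.request

# Assumed endpoint for illustration; confirm against Synthetic's documentation.
BASE_URL = "https://api.synthetic.new/v1"

def build_chat_request(model: str, prompt: str, api_key: str):
    """Build an OpenAI-style chat.completions request: (url, body, headers)."""
    payload = {
        # Synthetic model ids use an hf: prefix, e.g. "hf:zai-org/GLM-4.7".
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return f"{BASE_URL}/chat/completions", json.dumps(payload).encode(), headers

def chat(model: str, prompt: str, api_key: str) -> str:
    """POST the request and return the assistant's reply text."""
    url, body, headers = build_chat_request(model, prompt, api_key)
    req = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Switching models is just a matter of changing the `model` string; the request shape stays identical across the whole catalogue.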
Strengths
- Private datacenters — no data stored, no training on prompts
- Flat $30/mo includes all always-on models
- Broad open-source catalogue (DeepSeek, GLM, Kimi, Llama, MiniMax, Qwen)
When to use it
- Coding agents that need privacy guarantees
- Running many open-source models under one subscription
- Switching between models without per-token billing surprises
Subscription plans
| Plan | Price | Quota | Available |
|---|---|---|---|
| Subscription | $30/mo | 500 messages / 5h · all models included · 1 concurrent req/model | yes |
| Usage-based | $0/mo | Pay-per-token · all models | yes |
Notes: the subscription works out to $1/day ($30/mo). Each pack adds one concurrent request per model; buy more packs to scale concurrency. All always-on models are included in the subscription with no per-token charges. Usage-based PAYG is also available for enterprise.
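Since the concurrency cap is per model (one request per model per pack), a client can enforce it locally rather than eating rejected requests. A sketch using per-model semaphores; the pack count is an assumption you would set to what you actually own:

```python
import threading

# Assumed: you own one pack, so the cap is 1 concurrent request per model.
PACKS_OWNED = 1

_limits: dict[str, threading.Semaphore] = {}
_lock = threading.Lock()

def limiter_for(model: str) -> threading.Semaphore:
    """Return the per-model semaphore matching the subscription's cap."""
    with _lock:
        if model not in _limits:
            _limits[model] = threading.Semaphore(PACKS_OWNED)
        return _limits[model]

def call_with_limit(model: str, fn, *args, **kwargs):
    """Run fn, blocking until a concurrency slot for this model frees up."""
    with limiter_for(model):
        return fn(*args, **kwargs)
```

Because the cap is per model, fanning work out across several models (easy under the flat subscription) multiplies your effective concurrency without extra packs.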
Referral: Synthetic runs a referral program; sign up via the link above and both you and the referrer earn bonus API credits.
Models tested on Synthetic
Speed numbers below are specific to Synthetic's routing and hardware. The same model may appear on other providers' pages with different throughput.
| Model | Best tok/s | Avg tok/s | Runs | Success | Longest output (chars) |
|---|---|---|---|---|---|
| hf:moonshotai/Kimi-K2.6 | 224.2 | 172.2 | 2 | 100% | 4,093 |
| hf:Qwen/Qwen3.5-397B-A17B | 190.7 | 152.1 | 4 | 100% | 3,393 |
| hf:zai-org/GLM-4.7 | 173.7 | 167.0 | 4 | 100% | 2,894 |
| hf:zai-org/GLM-5.1 | 164.1 | 124.6 | 4 | 100% | 2,572 |
| hf:zai-org/GLM-4.7-Flash | 150.3 | 117.0 | 4 | 100% | 2,656 |
| hf:nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 | 142.2 | 136.2 | 4 | 100% | 3,959 |
| hf:deepseek-ai/DeepSeek-R1-0528 | 135.2 | 121.6 | 4 | 100% | 5,105 |
| hf:openai/gpt-oss-120b | 132.6 | 100.5 | 4 | 100% | 4,966 |
| hf:deepseek-ai/DeepSeek-R1 | 131.2 | 118.3 | 4 | 100% | 4,946 |
| hf:MiniMaxAI/MiniMax-M2.5 | 129.5 | 117.6 | 4 | 100% | 3,304 |
| hf:zai-org/GLM-5 | 110.8 | 108.2 | 4 | 100% | 3,894 |
| hf:meta-llama/Llama-3.3-70B-Instruct | 93.9 | 70.8 | 4 | 100% | 2,619 |
| hf:Qwen/Qwen3-Coder-480B-A35B-Instruct | 90.8 | 53.9 | 4 | 100% | 2,452 |
| hf:deepseek-ai/DeepSeek-V3 | 88.0 | 52.9 | 4 | 100% | 4,684 |
| hf:deepseek-ai/DeepSeek-V3.2 | 80.6 | 75.8 | 4 | 100% | 4,032 |
| hf:nvidia/Kimi-K2.5-NVFP4 | 26.7 | 18.1 | 4 | 50% | 3,541 |
| hf:moonshotai/Kimi-K2.5 | 25.6 | 22.8 | 4 | 50% | 3,949 |
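Throughput figures like those above can be reproduced with a stopwatch over a streaming response. A sketch of the measurement, assuming a generic iterable of generated tokens (how you obtain the stream is up to your client; the helper itself is illustrative):

```python
import time

def measure_throughput(stream) -> float:
    """Compute tokens/sec from an iterable yielding generated tokens.

    The clock starts at the first token, so connection and prefill
    latency are excluded; only decode speed is measured.
    """
    count = 0
    start = None
    for _tok in stream:
        if start is None:
            start = time.perf_counter()
        count += 1
    if start is None or count < 2:
        return 0.0  # need at least two tokens to measure an interval
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (count - 1) / elapsed  # first token only anchors the clock
```

Averaging several runs, as in the table (see the Runs column), smooths out routing and load variance on the provider side.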