Groq pricing (2026)
Inference platform built on custom LPU hardware delivering ultra-fast token generation for open-source models like Llama and Mixtral.
7 models · From $0.050/1M input tokens · Updated April 2026
| Model | Input / 1M tokens | Output / 1M tokens | Context window |
|---|---|---|---|
| Llama 3.1 8B Instant | $0.050 | $0.080 | 128K |
| Gemma 2 9B IT | $0.200 | $0.200 | 8K |
| Mixtral 8x7B | $0.240 | $0.240 | 32K |
| Qwen QwQ 32B | $0.290 | $0.390 | 128K |
| Llama 3.3 70B Versatile | $0.590 | $0.790 | 128K |
| DeepSeek R1 Distill 70B | $0.750 | $0.990 | 128K |
| Llama 3.2 90B Vision | $0.900 | $0.900 | 128K |
About Groq
Prices shown are on-demand rates in USD per 1 million tokens and may vary by region. Check the official pricing page for the latest rates before production use.
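Because the rates are quoted per 1 million tokens, the cost of a single request is (input tokens × input rate + output tokens × output rate) ÷ 1,000,000. The sketch below illustrates that arithmetic with the rates copied from the table above. The names `PRICES_PER_MILLION` and `estimate_cost` are hypothetical helpers for this example, not part of any Groq SDK, and the figures should be re-checked against the official pricing page before use.

```python
from decimal import Decimal

# (input rate, output rate) in USD per 1M tokens, copied from the table above.
# These are a snapshot for illustration; verify them before relying on them.
PRICES_PER_MILLION = {
    "Llama 3.1 8B Instant": (Decimal("0.050"), Decimal("0.080")),
    "Gemma 2 9B IT": (Decimal("0.200"), Decimal("0.200")),
    "Mixtral 8x7B": (Decimal("0.240"), Decimal("0.240")),
    "Qwen QwQ 32B": (Decimal("0.290"), Decimal("0.390")),
    "Llama 3.3 70B Versatile": (Decimal("0.590"), Decimal("0.790")),
    "DeepSeek R1 Distill 70B": (Decimal("0.750"), Decimal("0.990")),
    "Llama 3.2 90B Vision": (Decimal("0.900"), Decimal("0.900")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request for the given model."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_rate, output_rate = PRICES_PER_MILLION[model]
    # Decimal keeps the sub-cent amounts exact instead of accumulating float error.
    return (input_tokens * input_rate + output_tokens * output_rate) / TOKENS_PER_UNIT


if __name__ == "__main__":
    cost = estimate_cost("Llama 3.3 70B Versatile", input_tokens=2_000, output_tokens=500)
    print(f"Estimated cost: ${cost}")
```

For example, a request with 2,000 input tokens and 500 output tokens on Llama 3.3 70B Versatile works out to (2,000 × $0.590 + 500 × $0.790) ÷ 1,000,000 = $0.001575.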