How much does Groq API cost?

Groq API starts at $0.0000 per 1M input tokens. See the full pricing table above for all models.

Does Groq offer batch pricing?

Batch pricing is not currently listed for Groq. Check the official pricing page for updates.

Does Groq offer cached input pricing?

Yes, Groq offers cached input pricing for repeated context. See the cached input column in the table above.

Where can I find official Groq pricing?

Official Groq pricing is available at https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json

Tickerr / Pricing / Groq

Groq pricing - 2026

Operational

Groq API starts at $0.0000/1M input tokens. Prices effective since April 30, 2026.

Model	Input $/1M	Output $/1M	Cached	Notes
Qwen Qwen3.32b	$0.290	$0.590	—
Meta Llama Llama Guard 4.12b	$0.200	$0.200	—
Meta Llama Llama 4 Maverick 17b 128e Instruct	$0.200	$0.600	—
Meta Llama Llama 4 Scout 17b 16e Instruct	$0.110	$0.340	—
Moonshotai Kimi K2 Instruct 0905	$1.00	$3.00	$0.500
Openai Gpt Oss 120b	$0.150	$0.600	$0.0750
Openai Gpt Oss 20b	$0.0750	$0.300	$0.0375
Openai Gpt Oss Safeguard 20b	$0.0750	$0.300	$0.0370
Llama 3.1.8b Instant	$0.0500	$0.0800	—
Llama 3.3.70b Versatile	$0.590	$0.790	—
Gemma 7b It	$0.0500	$0.0800	—
Llama 3.1 8B (Groq)	$0.0000	$1.00	—
Llama 3.3 70B (Groq)	$0.0000	$1.00	—
Llama 3.1 8B Instant	$0.0500	$0.0800	—
Gemma 2 9B	$0.200	$0.200	—
DeepSeek R1 Distill Llama 70B	$0.750	$0.990	—	Distilled reasoning model
Mixtral 8x7B	$0.240	$0.240	—
Llama 3.3 70B Versatile	$0.590	$0.790	—

Prices effective since April 30, 2026. Verified July 11, 2026. Confirm at LiteLLM before billing.

Prices shown are sourced from official Groq documentation and updated automatically when changes are detected. Prices effective since April 30, 2026. Verified July 11, 2026. Always confirm current pricing at Groq's official pricing page before making billing decisions. Tickerr is not affiliated with Groq.

Cost calculator

Estimated monthly cost · 70% input / 30% output split

cheapestLlama 3.1.8b Instant$0.059

Gemma 7b It$0.059

Llama 3.1 8B Instant$0.059

Openai Gpt Oss 20b$0.142

Openai Gpt Oss Safeguard 20b$0.142

Meta Llama Llama 4 Scout 17b 16e Instruct$0.179

+12 more models not shown

Price history

Input price per 1M tokens - tracked from Jan 1, 2025

Prices scraped daily from official provider documentation. Chart shows input token pricing.

Groq usage limits by plan

FreeFree

Rpm	30 requests/min	Free tier
Tpm	6000 tokens/min	Llama 3.3 70B
Rpd	14400 requests/day	Free tier daily cap
Context Window	128000 tokens	Llama 3.3 70B
Concurrent Requests	5 requests	Simultaneous

Paid

Rpm	1000 requests/min	Paid tier
Tpm	500000 tokens/min	Higher throughput
Context Window	128000 tokens	All supported models
Concurrent Requests	50 requests	Simultaneous
Audio Hours Per Hour	7200 seconds/hour	Whisper transcription

Groq features and capabilities

Generation

Image generation	✕ No	Groq is an inference API — no image generation.
Web search	✕ No	No web search; fast inference only.
Code generation	✓ Yes	Supports Llama and Mixtral models for code.
Reasoning mode	✓ Yes	DeepSeek-R1 and Llama reasoning models available.

Input & Context

Audio input	✓ Yes	Whisper transcription via API.
Video input	✕ No	No video support.
Image input	✓ Yes	Vision models available (Llama 3.2 Vision).
File upload	✕ No	No file upload; text and audio only via API.

Integrations & API

API access	✓ Yes	OpenAI-compatible REST API — primary use case.
Plugins/tools	✓ Yes	Tool use and function calling supported.

Languages

English	✓ Yes	Full English support across all models.
Multilingual	✓ Yes	Supports 100+ languages via Llama models.

Memory

Memory	✕ No	No persistent memory; stateless API.
Custom instructions	✓ Yes	System prompt supported.

Privacy & Security

Training opt-out	✓ Yes	Inputs not used for training by default.
GDPR compliant	✓ Yes	GDPR-compliant data processing.

Groq live status and uptime

Real-time operational status

All AI tool pricing

Compare plans across 90+ tools

Token counter & cost calculator

Estimate your monthly API spend

About Groq API pricing

Groq API pricing is set by Groq and billed per million tokens processed. Input tokens (your prompt) and output tokens (the response) are priced separately. Cached input pricing applies when the same context is reused across requests, offering significant savings for repeated prompts.

Prices on this page are sourced from official Groq documentation and updated when Groq announces pricing changes. Check the official Groq pricing page for the most current rates.

Weekly AI pricing & uptime digest

Price drops, new model releases, and incident summaries - every Monday. Free.