
Groq rate limits, context window & usage caps (2026)

Groq API rate limits and usage caps by plan: requests per minute (RPM), tokens per minute (TPM), and requests per day (RPD), with the free and paid tiers compared. The free tier allows 30 RPM.


Context window: 128,000 tokens (~96K words)


Groq usage limits by plan

Free


Limit	Value	Notes
RPM	30 requests/min	Free tier
TPM	6,000 tokens/min	Llama 3.3 70B
RPD	14,400 requests/day	Free tier daily cap
Context window	128,000 tokens (~96K words)	Llama 3.3 70B
Concurrent requests	5 requests	Simultaneous
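The per-minute and per-day caps in the Free table interact: sending at the full 30 RPM all day would exceed the 14,400 RPD cap. The arithmetic below is a small sketch using only the figures from the table; the variable names are illustrative.

```python
FREE_RPM = 30        # requests per minute, from the Free table
FREE_RPD = 14_400    # requests per day, from the Free table
MINUTES_PER_DAY = 24 * 60

# Requests per day if only the per-minute limit applied.
rpm_only_daily = FREE_RPM * MINUTES_PER_DAY

# Average per-minute rate the daily cap allows when spread over a full day.
sustained_rpm = FREE_RPD / MINUTES_PER_DAY

# Minutes of continuous 30 RPM traffic before the daily cap is used up.
minutes_at_full_rate = FREE_RPD / FREE_RPM

print(rpm_only_daily)        # 43200
print(sustained_rpm)         # 10.0
print(minutes_at_full_rate)  # 480.0
```

In other words, a free-tier client can burst at 30 RPM for about 8 hours in total, or average roughly 10 requests per minute around the clock.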

Paid


Limit	Value	Notes
RPM	1,000 requests/min	Paid tier
TPM	500,000 tokens/min	Higher throughput
Context window	128,000 tokens (~96K words)	All supported models
Concurrent requests	50 requests	Simultaneous
Audio seconds per hour	7,200 seconds/hour (2 hours of audio)	Whisper transcription

Groq API rate limits by tier

API access uses a tiered rate limit system. Higher tiers unlock more requests per minute (RPM) and tokens per minute (TPM).

Tier	RPM	TPM	Notes
Free	30	6,000	llama-3.1-8b-instant
Dev	100	20,000	—
Prod	600	120,000	—

RPM = requests per minute · TPM = tokens per minute. Limits shown are approximate and may vary by model.
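Because RPM and TPM apply together, the limit that binds depends on how many tokens each request uses. The sketch below estimates the effective request rate for each tier in the table; `effective_rpm` is a hypothetical helper, and it assumes `tokens_per_request` covers both prompt and completion tokens.

```python
# Approximate tier limits copied from the table above.
TIERS = {
    "Free": {"rpm": 30, "tpm": 6_000},
    "Dev": {"rpm": 100, "tpm": 20_000},
    "Prod": {"rpm": 600, "tpm": 120_000},
}


def effective_rpm(tier: str, tokens_per_request: int) -> int:
    """Requests per minute that fit under both the RPM and the TPM limit."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    limits = TIERS[tier]
    by_tokens = limits["tpm"] // tokens_per_request  # requests the token budget allows
    return min(limits["rpm"], by_tokens)


for name in TIERS:
    print(name, effective_rpm(name, tokens_per_request=1_000))
```

With 1,000 tokens per request, the token budget is the tighter constraint on every tier listed: the Free tier, for example, fits about 6 requests per minute rather than 30.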

What happens when you hit Groq's limits?

⚠ Rate limit triggered: Groq returns HTTP 429 with x-ratelimit-remaining-requests and x-ratelimit-reset-requests headers. Free tier is 30 RPM / 14,400 RPD - upgrade to a paid plan for higher limits.
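The two headers named above can be read from any response to decide how long to pause. The sketch below is illustrative only: it assumes the reset header carries either a plain number of seconds or a duration string such as `2m59.56s`, which is an assumption about the value format, and the sample header values are made up.

```python
import re
from typing import Mapping, Optional

_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_WHOLE = re.compile(r"(?:\d+(?:\.\d+)?(?:ms|h|m|s))+")
_PLAIN = re.compile(r"\d+(?:\.\d+)?")


def parse_reset_seconds(value: str) -> float:
    """Convert '7.66s', '2m59.56s', or a bare number into seconds (assumed formats)."""
    value = value.strip()
    if _PLAIN.fullmatch(value):
        return float(value)
    if not _WHOLE.fullmatch(value):
        raise ValueError(f"unrecognized duration: {value!r}")
    return sum(float(n) * _UNIT_SECONDS[u] for n, u in _PART.findall(value))


def rate_limit_state(headers: Mapping[str, str]) -> dict:
    """Extract remaining requests and seconds until reset, if the headers are present."""
    lowered = {k.lower(): v for k, v in headers.items()}
    remaining: Optional[str] = lowered.get("x-ratelimit-remaining-requests")
    reset: Optional[str] = lowered.get("x-ratelimit-reset-requests")
    return {
        "remaining_requests": int(remaining) if remaining is not None else None,
        "reset_seconds": parse_reset_seconds(reset) if reset is not None else None,
    }


# Made-up header values, for illustration only.
sample = {
    "X-RateLimit-Remaining-Requests": "0",
    "X-RateLimit-Reset-Requests": "2m59.56s",
}
state = rate_limit_state(sample)
print(state["remaining_requests"], round(state["reset_seconds"], 2))
```

Checking `remaining_requests` before each call lets a client slow down ahead of a 429 instead of reacting to one.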

1. Wait: check the reset window. Most limits refresh within 1–60 minutes.

2. Retry: use exponential backoff (1s → 2s → 4s, up to a 60s maximum).

3. Upgrade: if you hit limits regularly, upgrade your plan to increase caps.

HTTP 429 · Retry-After header · exponential backoff · monitor x-ratelimit-remaining-requests
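The retry advice above (1s → 2s → 4s, capped at 60s, honoring Retry-After) can be sketched as follows. This is a minimal illustration, not a client library: `send` is a hypothetical callable returning `(status, headers, body)`, header keys are assumed to be lowercase, and only a numeric Retry-After value is handled.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list:
    """Delays of 1s, 2s, 4s, ... doubling each time and capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


def wait_seconds(headers: Mapping[str, str], fallback: float) -> float:
    """Prefer a numeric Retry-After value; otherwise use the backoff delay."""
    try:
        return max(0.0, float(headers.get("retry-after")))
    except (TypeError, ValueError):
        return fallback


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 8,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it stops returning 429 or the attempts run out."""
    for attempt, delay in enumerate(backoff_delays(max_attempts), start=1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break  # no point sleeping after the final attempt
        sleep(wait_seconds(headers, delay))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")


# Demo with a fake sender: two 429 responses, then success. No real sleeping.
calls = {"n": 0}

def fake_send() -> Response:
    calls["n"] += 1
    return (429, {}, "") if calls["n"] < 3 else (200, {}, "ok")

slept = []
result = call_with_backoff(fake_send, sleep=slept.append)
print(result[0], slept)
```

In production it is common to add random jitter to each delay so that many clients do not retry at the same instant; it is left out here to keep the sketch deterministic.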

Groq limit reset schedule

Per minute: API RPM limits, reset every 60 seconds

Per hour: short rolling windows for message quotas

Per 5 hours: common for consumer plan message limits

Per day / month: image gen credits and file storage caps

Exact reset period per limit type is shown in the "Notes" column of the plan table above. Groq uses rolling-window resets - quotas refresh continuously, not at a fixed midnight cutoff.
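A rolling window can be mirrored on the client so requests are held back before the server rejects them. The class below is a sketch of a sliding-window limiter; it models the rolling-window idea in general and is not a description of how Groq implements it server-side. The class name and the injectable `clock` parameter are illustrative.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `max_requests` in any trailing `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._stamps = deque()  # send times of requests still inside the window

    def _evict(self, now: float) -> None:
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def try_acquire(self) -> bool:
        """Record a request and return True if there is room, else return False."""
        now = self._clock()
        self._evict(now)
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False

    def seconds_until_free(self) -> float:
        """Time until the oldest recorded request leaves the window (0 if room now)."""
        now = self._clock()
        self._evict(now)
        if len(self._stamps) < self.max_requests:
            return 0.0
        return self._stamps[0] + self.window_seconds - now


# Demo with a fake clock: 2 requests per 60 seconds.
now = [0.0]
limiter = SlidingWindowLimiter(2, 60.0, clock=lambda: now[0])
first = limiter.try_acquire()        # t = 0
now[0] = 10.0
second = limiter.try_acquire()       # t = 10
now[0] = 20.0
third = limiter.try_acquire()        # window is full
wait = limiter.seconds_until_free()  # oldest request expires at t = 60
now[0] = 60.0
fourth = limiter.try_acquire()       # the t = 0 request has aged out
print(first, second, third, wait, fourth)
```

For the free tier's per-minute limit, this would be constructed as `SlidingWindowLimiter(30, 60.0)`; capacity returns gradually as individual requests age out rather than all at once.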


Limits sourced from Groq's official documentation. Updated when plan changes are announced.

Groq limits - frequently asked questions

What is the Groq message limit?

Groq message limits vary by plan - see the full breakdown by tier in the table above.

Does Groq have a file upload limit?

Yes, Groq enforces file upload limits that vary by plan. See the detailed breakdown above.

When do Groq limits reset?

Reset periods vary by limit type - many Groq limits reset on a rolling window (e.g., per 5 hours or per 24 hours). Check the notes column in the table above for specific reset schedules.

What happens when you hit Groq's rate limit?

Groq returns HTTP 429 with x-ratelimit-remaining-requests and x-ratelimit-reset-requests headers. Free tier is 30 RPM / 14,400 RPD - upgrade to a paid plan for higher limits.

What is Groq's context window?

Groq's context window is 128,000 tokens (~96K words). This is the maximum amount of text, including your conversation history, that the model can process in a single request.
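The token-to-word ratio quoted above (128,000 tokens for about 96K words, or 0.75 words per token) gives a rough way to check whether a conversation will fit. The sketch below uses that ratio as a heuristic rather than a real tokenizer count, and it assumes space for the reply should be reserved inside the same window; the function names are hypothetical.

```python
import math
from typing import Iterable

CONTEXT_WINDOW_TOKENS = 128_000
WORDS_PER_TOKEN = 96_000 / 128_000  # 0.75, from the figures quoted above


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (not a tokenizer)."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def fits_context(history: Iterable[str], max_output_tokens: int) -> bool:
    """True if the estimated history plus reserved output fits in the window."""
    used = sum(estimate_tokens(message) for message in history)
    return used + max_output_tokens <= CONTEXT_WINDOW_TOKENS


history = ["Summarize the attached report.", "word " * 30_000]
print(estimate_tokens(history[0]))
print(fits_context(history, max_output_tokens=1_024))
```

Real token counts depend on the model's tokenizer and the language of the text, so an estimate close to the limit should be treated as a likely overflow.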