
Groq rate limits, context window & usage caps (2026)

Groq API rate limits and usage caps: requests per minute (RPM), tokens per minute (TPM), and requests per day (RPD) by plan, with free and paid tiers compared. The free tier allows 30 RPM.

Context Window: 128,000 tokens (~96K words)
Plans: 2 tiers tracked
API Tiers: 3 rate limit tiers

Groq usage limits by plan

| Limit | Value | Notes |
|---|---|---|
| RPM | 30 requests/min | Free tier |
| TPM | 6,000 tokens/min | Llama 3.3 70B |
| RPD | 14,400 requests/day | Free tier daily cap |
| Context Window | 128,000 tokens (~96K words) | Llama 3.3 70B |
| Concurrent Requests | 5 requests | Simultaneous |
| RPM | 1,000 requests/min | Paid tier |
| TPM | 500,000 tokens/min | Higher throughput |
| Context Window | 128,000 tokens (~96K words) | All supported models |
| Concurrent Requests | 50 requests | Simultaneous |
| Audio Hours Per Hour | 7,200 seconds/hour | Whisper transcription |
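The free-tier per-minute and per-day caps interact: sending at the full 30 RPM uses up the 14,400 RPD allowance well before the day is over. A small worked calculation, using only the figures listed above (the function names are illustrative):

```python
# Free-tier figures taken from the plan table above.
FREE_RPM = 30        # requests per minute
FREE_RPD = 14_400    # requests per day

MINUTES_PER_DAY = 24 * 60


def sustained_rpm(rpd: int) -> float:
    """Average requests/min that can be held all day without exceeding rpd."""
    return rpd / MINUTES_PER_DAY


def minutes_until_daily_cap(rpm: int, rpd: int) -> float:
    """Minutes of continuous traffic at `rpm` before the daily cap is used up."""
    return rpd / rpm


if __name__ == "__main__":
    print(sustained_rpm(FREE_RPD))                      # 10.0
    print(minutes_until_daily_cap(FREE_RPM, FREE_RPD))  # 480.0
```

In other words, a free-tier client running flat out at 30 RPM would reach the daily cap after about 8 hours, while roughly 10 requests per minute can be sustained around the clock.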

Groq API rate limits by tier

API access uses a tiered rate limit system. Higher tiers unlock more requests per minute (RPM) and tokens per minute (TPM).

| Tier | RPM | TPM |
|---|---|---|
| Free | 30 | 6,000 |
| Dev | 100 | 20,000 |
| Prod | 600 | 120,000 |

RPM = requests per minute · TPM = tokens per minute. Limits shown are approximate and may vary by model.
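Because a workload has to fit under both the RPM and the TPM figure, it can help to check the two together. The following is a minimal sketch that picks the lowest tier covering a given workload. The tier values are copied from the table above (and are approximate, as noted); `Tier` and `smallest_tier` are hypothetical names, not part of any Groq SDK.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    rpm: int  # requests per minute
    tpm: int  # tokens per minute


# Values copied from the tier table above; the page lists them as approximate.
TIERS = [
    Tier("Free", 30, 6_000),
    Tier("Dev", 100, 20_000),
    Tier("Prod", 600, 120_000),
]


def smallest_tier(requests_per_min: int, avg_tokens_per_request: int) -> Optional[Tier]:
    """Return the first tier whose RPM and TPM both cover the workload, else None."""
    tokens_per_min = requests_per_min * avg_tokens_per_request
    for tier in TIERS:
        if requests_per_min <= tier.rpm and tokens_per_min <= tier.tpm:
            return tier
    return None
```

For example, 20 requests per minute at about 500 tokens each is within the Free tier's RPM but needs 10,000 TPM, so `smallest_tier(20, 500)` returns the Dev tier.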

What happens when you hit Groq's limits?

⚠ Rate limit triggered: Groq returns HTTP 429 with x-ratelimit-remaining-requests and x-ratelimit-reset-requests headers. Free tier is 30 RPM / 14,400 RPD - upgrade to a paid plan for higher limits.
1. Wait: Check the reset window - most limits refresh within 1–60 minutes.
2. Retry: Use exponential backoff (1s → 2s → 4s, up to a 60s maximum).
3. Upgrade: If you hit limits regularly, upgrade your plan to increase caps.

HTTP 429 · Retry-After header · exponential backoff · monitor x-ratelimit-remaining-requests
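The wait-and-retry steps above can be sketched as a small helper. This is an illustration under stated assumptions, not Groq's recommended client: `send` is a hypothetical callable that performs one request and returns `(status_code, headers)`, header keys are assumed to be lowercase, and `Retry-After` is assumed to be a number of seconds when present.

```python
import time
from typing import Callable, Iterator, Mapping, Tuple

# A response here is just (status_code, headers); adapt this to your HTTP client.
Response = Tuple[int, Mapping[str, str]]


def backoff_delays(base: float = 1.0, cap: float = 60.0, attempts: int = 8) -> Iterator[float]:
    """Yield 1s, 2s, 4s, ... doubling each time and never exceeding `cap`."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Response],
    sleep: Callable[[float], None] = time.sleep,
    attempts: int = 8,
) -> Response:
    """Call `send` until it stops returning 429 or the retries run out."""
    response = send()
    for delay in backoff_delays(attempts=attempts):
        status, headers = response
        if status != 429:
            return response
        # Wait at least as long as Retry-After when it is present and numeric
        # (assumption: the value is in seconds); otherwise use the backoff delay.
        try:
            delay = max(delay, float(headers.get("retry-after", "")))
        except ValueError:
            pass
        sleep(delay)
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. If every retry is still rate limited, the last 429 response is returned so the caller can decide what to do next.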

Groq limit reset schedule

Per minute: API RPM limits - reset every 60 seconds
Per hour: Short rolling windows for message quotas
Per 5 hours: Common for consumer plan message limits
Per day / month: Image gen credits and file storage caps

Exact reset period per limit type is shown in the "Notes" column of the plan table above. Groq uses rolling-window resets - quotas refresh continuously, not at a fixed midnight cutoff.
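To make the rolling-window idea concrete, here is a minimal client-side sketch: each request only counts against the limit for the trailing 60 seconds, so capacity comes back gradually rather than all at once. This mirrors the behaviour described above on the client; it is not a description of how Groq accounts for usage server-side, and the class name is hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque


class RollingWindowLimiter:
    """Allow at most `limit` events in any trailing `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps: Deque[float] = deque()

    def _expire(self, now: float) -> None:
        # Drop timestamps that have aged out of the trailing window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def try_acquire(self) -> bool:
        """Record one request if there is room; return False otherwise."""
        now = self.clock()
        self._expire(now)
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False

    def seconds_until_free(self) -> float:
        """Time until the oldest recorded request leaves the window."""
        now = self.clock()
        self._expire(now)
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])
```

A free-tier client could create `RollingWindowLimiter(limit=30)` and call `try_acquire()` before each request, sleeping for `seconds_until_free()` when it returns `False`.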

Limits sourced from Groq's official documentation. Updated when plan changes are announced.

Groq limits - frequently asked questions

What is the Groq message limit?

Groq message limits vary by plan - see the full breakdown by tier in the table above.

Does Groq have a file upload limit?

Yes, Groq enforces file upload limits that vary by plan. See the detailed breakdown above.

When do Groq limits reset?

Reset periods vary by limit type - many Groq limits reset on a rolling window (e.g., per 5 hours or per 24 hours). Check the notes column in the table above for specific reset schedules.

What happens when you hit Groq's rate limit?

Groq returns HTTP 429 with x-ratelimit-remaining-requests and x-ratelimit-reset-requests headers. Free tier is 30 RPM / 14,400 RPD - upgrade to a paid plan for higher limits.
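As a rough illustration, the two headers named above can be turned into a wait time. The header names come from this page; the value formats are an assumption. The sketch accepts either plain seconds or a duration string such as `2m59.56s`, so check an actual response before relying on it. `parse_reset` and `wait_hint` are hypothetical helper names.

```python
import re
from typing import Mapping

# "ms" is listed before "m" so that "500ms" is not read as minutes.
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_reset(value: str) -> float:
    """Convert '7.5', '7.5s' or '2m59.56s' to seconds; raise ValueError otherwise."""
    value = value.strip()
    try:
        return float(value)
    except ValueError:
        pass
    parts = _DURATION_PART.findall(value)
    # Reject strings that contain anything besides number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unrecognised duration: {value!r}")
    return sum(float(n) * _UNIT_SECONDS[u] for n, u in parts)


def wait_hint(headers: Mapping[str, str]) -> float:
    """Seconds to wait before the next request, or 0.0 if requests remain.

    Assumes lowercase header keys; a missing 'remaining' header is treated
    as "requests still available".
    """
    remaining = int(headers.get("x-ratelimit-remaining-requests", "1"))
    if remaining > 0:
        return 0.0
    return parse_reset(headers.get("x-ratelimit-reset-requests", "0"))
```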

What is Groq's context window?

Groq's context window is 128000 tokens (~96K words). This is the maximum amount of text - including your conversation history - the model can process in a single request.
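A quick pre-flight check can catch requests that would not fit. The sketch below uses the 128,000-token figure and the ~96K-word equivalence from this page (about 0.75 words per token). The word-based estimate is only a rough heuristic, not a tokenizer, and counting the reserved output against the window is an assumption.

```python
CONTEXT_WINDOW = 128_000                   # tokens, from the figure above
WORDS_PER_TOKEN = 96_000 / CONTEXT_WINDOW  # 0.75, implied by "~96K words"


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from the whitespace-separated word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the reserved output stays within the window.

    Assumption: output tokens count against the same window as the prompt.
    """
    return prompt_tokens + max_output_tokens <= window
```

For an exact count, use the tokenizer that matches the model instead of `estimate_tokens`.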