Groq rate limits, context window & usage caps by plan
Rate limits, context window, message limits, file upload caps and image generation limits — all plans compared.
Context window explained: Groq's context window is 128,000 tokens — roughly 96K words of text the model can "see" at once, including your conversation history.
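The ~96K-word figure comes from a common rule of thumb of about 0.75 English words per token. A minimal sketch of the arithmetic (the 0.75 ratio is an assumption and varies by model, language, and text):

```python
def tokens_to_words(tokens: int, words_per_token: float = 0.75) -> int:
    """Estimate how many English words fit in a given token budget,
    using the rough heuristic of ~0.75 words per token."""
    return int(tokens * words_per_token)

# Groq's 128,000-token context window in approximate words:
print(tokens_to_words(128_000))  # → 96000
```

The same conversion applies to the per-minute token caps below, e.g. a 6,000 TPM limit is roughly 4,500 words of combined input and output per minute.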
Free

| Limit | Value | Notes |
| --- | --- | --- |
| RPM | 30 requests/min | Free tier |
| TPM | 6,000 tokens/min | Llama 3.3 70B |
| RPD | 14,400 requests/day | Free tier daily cap |
| Context window | 128,000 tokens (~96K words) | Llama 3.3 70B |
| Concurrent requests | 5 requests | Simultaneous |
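One way to stay under a per-minute cap like the free tier's 30 RPM is a client-side sliding-window throttle that delays calls before the server has to reject them. A minimal sketch (the class name and defaults are illustrative, not part of any Groq SDK):

```python
import time
from collections import deque


class RateLimiter:
    """Client-side sliding-window throttle: allow at most
    `max_requests` calls per `window` seconds, sleeping when the
    window is full so the server-side cap is never exceeded."""

    def __init__(self, max_requests: int = 30, window: float = 60.0):
        self.max_requests = max_requests
        self.window = window
        self.calls = deque()  # timestamps of recent calls

    def acquire(self) -> None:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_requests:
            # Sleep until the oldest call leaves the window.
            time.sleep(self.window - (now - self.calls[0]))
            self.calls.popleft()
        self.calls.append(time.monotonic())
```

Calling `limiter.acquire()` before each API request keeps a burst of calls spread out to fit the quota instead of triggering rejections.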
Paid

| Limit | Value | Notes |
| --- | --- | --- |
| RPM | 1,000 requests/min | Paid tier |
| TPM | 500,000 tokens/min | Higher throughput |
| Context window | 128,000 tokens (~96K words) | All supported models |
| Concurrent requests | 50 requests | Simultaneous |
| Audio transcription | 7,200 seconds/hour (2 hours of audio per clock hour) | Whisper transcription |
About Groq limits
Groq enforces usage limits to manage server load and ensure fair access across all users. Limits vary significantly by plan tier — free plans are the most restricted while paid plans offer higher caps or unlimited access. Limits shown here are updated manually when Groq announces changes.
Check the Groq live status page and outage history to see whether current limits are affected by an ongoing outage.
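When a limit is exceeded, APIs like Groq's typically answer with HTTP 429 Too Many Requests, and a common client-side pattern is exponential backoff with jitter. A minimal sketch, assuming your request function raises a custom exception on 429 (the exception and function names here are illustrative):

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller's request function on HTTP 429."""


def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Run `call`, retrying on RateLimited with exponentially growing
    delays plus random jitter to avoid synchronized retry storms."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the 429 to the caller
            delay = base_delay * 2 ** attempt + random.uniform(0, base_delay)
            time.sleep(delay)
```

If the 429 response carries a `Retry-After` header, honoring that value is a better wait hint than a fixed backoff schedule.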