Is Groq down right now?
Authenticated API inference - 2 models monitored · How we classify outages
Groq is currently operational - 215ms HTTP response. 90-day uptime: 78.5%. Groq API: all 2 models responding - fastest TTFT 194ms.
| Metric | Value | Notes |
|---|---|---|
| HTTP uptime (90d) | 78.5% | |
| Incidents (90d) | 16 | |
| HTTP response now | 215ms | |
| HTTP p50 (7d) | 368ms | median ping response |
| HTTP p95 (7d) | 1119ms | tail ping response |
API Inference Monitoring
Live · every 5 min

| Metric | Value | Notes |
|---|---|---|
| Best TTFT (p50) | 194ms | time to first token |
| Best throughput | 1508 tok/s | output tokens/sec (24h avg) |
| Min success rate | 100% | worst model (24h) |
P50 = the median, i.e. typical speed. P95 = the value that 95% of checks come in at or under, so it reflects the slow tail. Measured by Tickerr's independent inference checks. A metric requires ≥10 checks to display.
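To make the P50/P95 definitions concrete, here is a minimal sketch of a nearest-rank percentile over a list of latency samples. The `percentile` helper, the `MIN_CHECKS` constant, and the sample values are illustrative assumptions; this page does not describe which percentile method Tickerr actually uses.

```python
import math

MIN_CHECKS = 10  # assumption: mirrors the ">=10 checks to display" rule


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that at least
    pct percent of all samples are less than or equal to."""
    if len(samples) < MIN_CHECKS:
        return None  # not enough checks to report a value
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical TTFT samples in milliseconds (not real Groq measurements).
ttft_ms = [180, 190, 194, 200, 205, 210, 220, 250, 400, 1100]

print("p50:", percentile(ttft_ms, 50))
print("p95:", percentile(ttft_ms, 95))
```

With a single slow outlier in the sample, the p50 barely moves while the p95 jumps, which is why the two numbers are reported side by side.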
TTFT over 24 hours
Authenticated streaming API calls via native fetch. TTFT = milliseconds from request start to first streamed token chunk. Throughput = output tokens ÷ generation time. Checks run from Vercel us-east-1. Independent of the provider's official status page.
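The two definitions above can be sketched as a small timing loop over a streamed response. This is an illustration only, not Tickerr's implementation: `measure_stream` is a hypothetical helper, the stream is modelled as an iterable of per-chunk token counts, and "generation time" is assumed to run from the first chunk to the last chunk, which the page does not specify.

```python
import time


def measure_stream(chunks, clock=time.perf_counter):
    """Return (ttft_ms, tokens_per_sec) for an iterable that yields the
    number of output tokens carried by each streamed chunk."""
    start = clock()  # request start
    first_at = None
    last_at = None
    total_tokens = 0
    for token_count in chunks:
        now = clock()
        if first_at is None:
            first_at = now  # first streamed chunk arrived
        last_at = now
        total_tokens += token_count
    if first_at is None:
        return None, None  # the stream produced no chunks
    ttft_ms = (first_at - start) * 1000
    generation_s = last_at - first_at  # assumption: first chunk to last chunk
    throughput = total_tokens / generation_s if generation_s > 0 else None
    return ttft_ms, throughput


# Deterministic demo: a fake clock stands in for real network timing.
ticks = iter([0.0, 0.2, 0.3, 0.4])
ttft, tps = measure_stream([5, 5, 5], clock=lambda: next(ticks))
print(f"TTFT {ttft:.0f} ms, throughput {tps:.0f} tok/s")
```

In a real check the iterable would wrap the provider's streaming response and `clock` would be left at its default; the injectable clock is there only so the arithmetic can be verified without a network call.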
Agent monitoring active · 13 agents reporting · Powered by Tickerr MCP
HTTP endpoint response time (7 days)
p50 368ms · p95 1119ms
HTTP response times to Groq's status endpoint - measures infrastructure availability, not API inference speed. For TTFT and model-level API status, see the API Inference Monitoring section above.
90-day uptime
Incident history
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 3.7× above the rolling p50 baseline (2255ms vs p50 603ms). The service is responding…
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 4.9× above the rolling p50 baseline (3116ms vs p50 635ms). The service is responding…
llama-3.3-70b-versatile API Latency Degraded
Independent monitoring detected elevated API latency for llama-3.3-70b-versatile. Current TTFT is 10.3× above the rolling p50 baseline (2508ms vs p50 243ms). The service is responding but slower than …
llama-3.3-70b-versatile API Latency Degraded
Independent monitoring detected elevated API latency for llama-3.3-70b-versatile. Current TTFT is 25.4× above the rolling p50 baseline (6589ms vs p50 259ms). The service is responding but slower than …
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 2.1× above the rolling p50 baseline (643ms vs p50 307ms). The service is responding …
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 2.4× above the rolling p50 baseline (633ms vs p50 269ms). The service is responding …
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 2.2× above the rolling p50 baseline (432ms vs p50 195ms). The service is responding …
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 7.7× above the rolling p50 baseline (1897ms vs p50 245ms). The service is responding…
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 7.7× above the rolling p50 baseline (3827ms vs p50 495ms). The service is responding…
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 2.5× above the rolling p50 baseline (1157ms vs p50 472ms). The service is responding…
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 3× above the rolling p50 baseline (2677ms vs p50 900ms). The service is responding b…
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 2.9× above the rolling p50 baseline (1679ms vs p50 582ms). The service is responding…
meta-llama/llama-4-scout-17b-16e-instruct API Latency Degraded
Independent monitoring detected elevated API latency for meta-llama/llama-4-scout-17b-16e-instruct. Current TTFT is 3.6× above the rolling p50 baseline (1642ms vs p50 455ms). The service is responding…
Status: Resolved The issues affecting openai/gpt-oss-120b have been resolved. The model is operating normally. Actions were taken to cancel billing plans and restrict verification status of organizati…
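The latency incidents in this history all report the current TTFT as a multiple of the rolling p50 baseline. A minimal sketch of that arithmetic follows. The `classify` helper and the 2.0× threshold are assumptions made for illustration (the smallest multiple listed above is 2.1×); the actual classification rule is not stated on this page.

```python
DEGRADED_THRESHOLD = 2.0  # assumption: not a documented Tickerr value


def classify(current_ttft_ms, baseline_p50_ms, threshold=DEGRADED_THRESHOLD):
    """Return (label, ratio), where ratio is current TTFT divided by the
    rolling p50 baseline, rounded to one decimal place."""
    ratio = current_ttft_ms / baseline_p50_ms
    label = "degraded" if ratio >= threshold else "ok"
    return label, round(ratio, 1)


# Figures taken from three of the incident entries above.
print(classify(2255, 603))
print(classify(6589, 259))
print(classify(643, 307))
```

Comparing against a rolling baseline rather than a fixed cutoff means a 643ms TTFT can count as degraded for a model that normally answers in about 300ms, while the same figure would be unremarkable for a slower model.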
Groq API not working? Common error codes
If Groq's API is returning errors, the table below explains what each code means and how to fix it. If errors are widespread, check the live status above - a service incident will appear there within minutes.
| Error | What it means & what to do |
|---|---|
| HTTP 429 | Rate limit hit - check the x-ratelimit-remaining-requests and x-ratelimit-remaining-tokens response headers; free tier is 30 RPM |
| HTTP 503 | Service overloaded - Groq queues fill fast under heavy load; retry with backoff |
Note: Tickerr monitors Groq's status endpoint and runs its own inference checks; it does not see your application's API calls. An HTTP 429 or 500 in your app may be specific to your account tier - check the rate limits page for plan-specific thresholds.
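For the 429 and 503 cases in the table, a common client-side pattern is to retry with exponential backoff and to honor a `retry-after` header when the response carries one. The sketch below assumes a hypothetical `send` callable that returns `(status_code, headers)` with lowercase header names, and assumes `retry-after` is expressed in seconds; none of these names come from Groq's SDK.

```python
import random
import time

RETRYABLE = {429, 503}


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or the
    attempts run out. Returns the last (status, headers) pair."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, headers
        retry_after = headers.get("retry-after")
        if retry_after is not None:
            delay = float(retry_after)  # server-suggested wait (assumed seconds)
        else:
            # Exponential backoff plus a little jitter to avoid retry bursts.
            delay = base_delay * 2 ** attempt + random.uniform(0, 0.5)
        sleep(delay)


# Demo with canned responses instead of real network calls.
responses = iter([(429, {"retry-after": "2"}), (503, {}), (200, {})])
waits = []
status, _ = call_with_retry(lambda: next(responses), sleep=waits.append)
print(status, waits)
```

Passing `sleep` as a parameter keeps the demo instant; in an application the default `time.sleep` applies and `send` would wrap the actual HTTP request.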
About Groq status
Groq is an AI inference provider offering very fast LLM inference on custom LPU hardware. Full downtime is rare, but latency degradations like those in the incident history above can affect API users building latency-sensitive apps. Groq offers free and paid tiers with separate rate limits.