AI Model Speed & Cost Comparison 2026

Live P50/P95 time-to-first-token (TTFT) benchmarks, API pricing, and uptime for Claude, GPT-4, Gemini, Groq, Mistral, and more, updated every 5 minutes from independent inference checks.

Live model performance & pricing

Fastest model right now: Llama 3.1 8B (Cerebras) · 127ms median TTFT (last 24h)

Model                    Provider   P50 TTFT  P95 TTFT  Input /1M  Output /1M  Status
Llama 3.1 8B (Cerebras)  Cerebras   127ms     224ms     Free       Free        Degraded
Llama 3.3 70B            Groq       224ms     562ms     Free       Free        Operational
Mistral Small            Mistral    356ms     507ms     $0.0600    $0.180      Operational
Llama 4 Scout (Groq)     Groq       400ms     824ms     Free       Free        Operational
Gemini 2.5 Flash Lite    Google     512ms     1556ms    $0.100     $0.400      Operational
Claude Haiku 4.5         Anthropic  520ms     1519ms    $1.00      $5.00       Operational
Gemini 2.5 Flash         Google     585ms     1215ms    $0.300     $2.50       Operational
Grok 3 Mini              xAI        972ms     2042ms    $0.300     $0.500      Operational
Mistral Large            Mistral    1040ms    10334ms   $0.500     $1.50       Operational
GPT-4.1 Mini             OpenAI     1047ms    4599ms    $0.400     $1.60       Operational
GPT-4o Mini              OpenAI     1063ms    2112ms    $0.150     $0.600      Operational
Grok 3                   xAI        1207ms    2368ms    $3.00      $15.00      Operational
Claude Opus 4.7          Anthropic  1232ms    1781ms    $5.00      $25.00      Operational
Claude Sonnet 4.6        Anthropic  1361ms    2623ms    $3.00      $15.00      Degraded
Claude Opus 4.6          Anthropic  1546ms    3120ms    $5.00      $25.00      Operational
Qwen3 235B (Cerebras)    Cerebras   -         -         Free       Free        Down
Command R+               Cohere     -         -         $2.50      $10.00      Operational

P50 = median TTFT, the typical speed. P95 = the TTFT that 95% of checks meet or beat, a near-worst case. Measured by Tickerr's independent inference checks, run every 5 minutes. Pricing is in USD per 1M tokens, taken from official provider docs. A model needs at least 10 checks before percentiles are computed.
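
As a rough illustration of how P50 and P95 could be derived from a batch of TTFT checks, here is a minimal Python sketch. It assumes the nearest-rank percentile definition, which may differ from the method Tickerr actually uses. The names `MIN_CHECKS`, `nearest_rank_percentile`, and `ttft_summary` are hypothetical; only the 10-check minimum comes from the note above.

```python
from typing import Dict, Optional, Sequence

# From the note above: percentiles are only reported with at least 10 checks.
MIN_CHECKS = 10


def nearest_rank_percentile(samples_ms: Sequence[int], pct: int) -> int:
    """Return the pct-th percentile using the nearest-rank method (an assumption)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # 1-based rank = ceil(pct / 100 * n), computed in integers to avoid float error.
    rank = max((pct * len(ordered) + 99) // 100, 1)
    return ordered[rank - 1]


def ttft_summary(samples_ms: Sequence[int]) -> Optional[Dict[str, int]]:
    """Return P50/P95 TTFT in milliseconds, or None if there are too few checks."""
    if len(samples_ms) < MIN_CHECKS:
        return None
    return {
        "p50": nearest_rank_percentile(samples_ms, 50),
        "p95": nearest_rank_percentile(samples_ms, 95),
    }


if __name__ == "__main__":
    # Made-up sample data for illustration: 20 checks from 100ms to 290ms.
    checks = [100 + 10 * i for i in range(20)]
    print(ttft_summary(checks))
    print(ttft_summary(checks[:9]))  # fewer than 10 checks
```

With fewer than 10 samples the sketch returns `None` instead of a percentile, mirroring the minimum-check rule stated above.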
