LLM model pricing (2026) - API cost comparison
Token costs for 249 large language models from 9 providers. Input, output, and cached pricing per 1M tokens. Live latency benchmarks for 8 models updated every 5 minutes.
Recently added / updated (last 90 days)
249 models
TTFT = time-to-first-token · HTTP = end-to-end response time · Tok/s = generation speed · Outages = last 7 days.
Frequently asked questions
How often is pricing updated?
Prices are updated within 24 hours of a provider announcement. We pull from official documentation and pricing pages.
What is TTFT p50?
TTFT (time-to-first-token) is how long the API takes to start streaming. p50 is the median - half of checks were faster, half slower. Measured over 7 days of live checks.
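As a rough illustration of what a p50 (and the p95 shown alongside it) means, the sketch below computes both from a handful of made-up TTFT samples. It assumes the nearest-rank percentile method; the function name `percentile_nearest_rank` and the sample values are hypothetical, and the actual aggregation method used for the table is not stated here.

```python
import math

def percentile_nearest_rank(samples, p):
    """Return the p-th percentile of samples using the nearest-rank method.

    Assumption: nearest-rank is used purely for illustration; other
    percentile definitions (e.g. linear interpolation) give slightly
    different values on small samples.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

# Hypothetical TTFT measurements in milliseconds (not real data).
ttft_ms = [310, 420, 280, 950, 390]

ttft_p50 = percentile_nearest_rank(ttft_ms, 50)  # median: half faster, half slower
ttft_p95 = percentile_nearest_rank(ttft_ms, 95)  # tail latency
```

The median is robust to the single slow outlier (950 ms), which is why p50 is a better "typical" figure than the mean, while p95 surfaces that outlier.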
What is HTTP p50?
The median end-to-end round-trip time for a complete API response, measured from our monitoring infrastructure. It is higher than TTFT because it includes the full generation time.
What does Tok/s mean?
Tokens per second - the 7-day median generation speed. Higher is faster. Only available for models we actively probe.
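One plausible way to derive generation speed for a single probe is to divide the number of output tokens by the time spent generating, that is, the end-to-end time minus TTFT. This is an assumption for illustration only; the exact formula behind the Tok/s column is not given here, and `tokens_per_second` and its parameters are hypothetical names.

```python
def tokens_per_second(output_tokens, http_seconds, ttft_seconds):
    """Estimate generation speed for one probe.

    Assumption: generation time = end-to-end time minus time-to-first-token.
    """
    generation_seconds = http_seconds - ttft_seconds
    if generation_seconds <= 0:
        raise ValueError("end-to-end time must exceed TTFT")
    return output_tokens / generation_seconds

# Hypothetical probe: 200 output tokens, 4.5 s end-to-end, 0.5 s TTFT.
speed = tokens_per_second(output_tokens=200, http_seconds=4.5, ttft_seconds=0.5)
```

Taking the median of such per-probe values over 7 days would give a figure comparable to the one described above.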
How are outages counted?
An outage is an incident lasting 15+ minutes where the API returns errors or is unreachable. The count and total downtime are shown for the last 7 days.
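The counting rule above can be sketched as follows. The incident list, the `summarize_outages` name, and the choice to include an incident when its start falls inside the 7-day window are all assumptions for illustration; the timestamps are made up.

```python
from datetime import datetime, timedelta

MIN_OUTAGE = timedelta(minutes=15)  # threshold taken from the definition above

def summarize_outages(incidents, now, window=timedelta(days=7)):
    """Return (count, total_downtime) for qualifying incidents.

    incidents: iterable of (start, end) datetime pairs.
    Assumption: an incident belongs to the window if it started inside it.
    """
    cutoff = now - window
    qualifying = [
        (start, end)
        for start, end in incidents
        if start >= cutoff and end - start >= MIN_OUTAGE
    ]
    total = sum((end - start for start, end in qualifying), timedelta())
    return len(qualifying), total

# Hypothetical incidents (not real data).
now = datetime(2026, 1, 8, 0, 0)
incidents = [
    (datetime(2026, 1, 2, 10, 0), datetime(2026, 1, 2, 10, 20)),     # 20 min: counted
    (datetime(2026, 1, 3, 12, 0), datetime(2026, 1, 3, 12, 5)),      # 5 min: too short
    (datetime(2026, 1, 5, 8, 0), datetime(2026, 1, 5, 9, 0)),        # 60 min: counted
    (datetime(2025, 12, 20, 8, 0), datetime(2025, 12, 20, 8, 30)),   # outside window
]

count, downtime = summarize_outages(incidents, now)
```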
What does "cached" pricing mean?
Some providers (Anthropic, OpenAI) offer discounted rates for prompt cache hits - repeated prefixes that the API has already processed. Not all models support this.
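To see how per-1M-token prices and a cached rate combine into the cost of one request, here is a minimal sketch. It assumes cached prompt tokens are billed at the cached rate and the remaining input tokens at the normal input rate; the prices and the `request_cost_usd` function are hypothetical and do not correspond to any specific model or provider billing rule.

```python
TOKENS_PER_UNIT = 1_000_000  # prices are quoted per 1M tokens

def request_cost_usd(input_tokens, cached_tokens, output_tokens,
                     input_price, cached_price, output_price):
    """Estimate the USD cost of one request from per-1M-token prices.

    cached_tokens is the portion of input_tokens served from the prompt cache.
    Assumption: cache hits are billed at cached_price, the rest at input_price.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * input_price
             + cached_tokens * cached_price
             + output_tokens * output_price)
    return total / TOKENS_PER_UNIT

# Hypothetical prices in USD per 1M tokens (not taken from any real model).
cost = request_cost_usd(
    input_tokens=1_000_000, cached_tokens=500_000, output_tokens=100_000,
    input_price=3.00, cached_price=0.30, output_price=15.00,
)
# Same request with no cache hits, for comparison.
cost_no_cache = request_cost_usd(1_000_000, 0, 100_000, 3.00, 0.30, 15.00)
```

With these made-up numbers the cached request comes to about 3.15 USD versus 4.50 USD uncached, which shows why a long, repeated prompt prefix can matter more than the headline input price.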
Prices are per 1M tokens in USD and are sourced from official provider documentation. Cached pricing applies to prompt cache hits where supported. Latency benchmarks come from automated API probes run every 5 minutes.